Error when running CPU benchmark ??

Message boards : Number crunching : Error when running CPU benchmark ??

[AF>Belgique]Mamouth
Joined: 18 Sep 05
Posts: 4
Credit: 580,683
RAC: 0
Message 489 - Posted: 25 Sep 2005, 20:31:41 UTC

24/09/2005 17:44:03||Suspending computation and network activity - running CPU benchmarks
24/09/2005 17:44:03|rosetta@home|Pausing result 1pvaA_abrelax_20533_0 (removed from memory)
24/09/2005 17:44:03|rosetta@home|Pausing result 1pvaA_abrelax_23518_0 (removed from memory)
24/09/2005 17:44:04|rosetta@home|Unrecoverable error for result 1pvaA_abrelax_20533_0 ( - exit code -1073741819 (0xc0000005))
24/09/2005 17:44:04|rosetta@home|Unrecoverable error for result 1pvaA_abrelax_23518_0 ( - exit code -1073741819 (0xc0000005))
24/09/2005 17:44:04||request_reschedule_cpus: process exited
24/09/2005 17:44:05||Running CPU benchmarks
24/09/2005 17:45:02||Benchmark results:
24/09/2005 17:45:02|| Number of CPUs: 2
24/09/2005 17:45:02|| 1451 double precision MIPS (Whetstone) per CPU
24/09/2005 17:45:02|| 2094 integer MIPS (Dhrystone) per CPU
24/09/2005 17:45:02||Finished CPU benchmarks
24/09/2005 17:45:02||Resuming computation and network activity
24/09/2005 17:45:02||request_reschedule_cpus: Resuming activities
24/09/2005 17:45:02|rosetta@home|Deferring communication with project for 2 seconds
24/09/2005 17:45:02|rosetta@home|Computation for result 1pvaA_abrelax_20533_0 finished
24/09/2005 17:45:02|rosetta@home|Computation for result 1pvaA_abrelax_23518_0 finished
24/09/2005 17:45:02|LHC@home|Restarting result wjun4B_v6s4hhpac_mqx__16__64.2764_59.2927__4_6__6__30_1_sixvf_boinc20913_3 using sixtrack version 4.67
24/09/2005 17:45:02|LHC@home|Restarting result wjun4B_v6s4hhpac_mqx__18__64.2784_59.2947__6_8__6__50_1_sixvf_boinc23820_5 using sixtrack version 4.67
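For readers decoding the exit code in the log above: the negative decimal value and the hexadecimal value are the same 32-bit number, and 0xC0000005 is the Windows status code for an access violation. A minimal sketch of the conversion follows; the helper name `to_unsigned32` is made up for illustration and is not part of BOINC.

```python
# 0xC0000005 is the Windows NTSTATUS value STATUS_ACCESS_VIOLATION.
STATUS_ACCESS_VIOLATION = 0xC0000005


def to_unsigned32(exit_code: int) -> int:
    """Reinterpret a signed 32-bit exit code as an unsigned 32-bit value.

    Hypothetical helper for illustration only.
    """
    return exit_code & 0xFFFFFFFF


code = to_unsigned32(-1073741819)
print(hex(code))                        # 0xc0000005
print(code == STATUS_ACCESS_VIOLATION)  # True
```

This only shows that the application crashed on an invalid memory access; it does not by itself say why.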


Has anyone else seen this kind of error?

[AF>france>pas-de-calais]symaski62
Joined: 19 Sep 05
Posts: 47
Credit: 33,871
RAC: 0
Message 493 - Posted: 25 Sep 2005, 21:20:54 UTC
Last modified: 25 Sep 2005, 21:23:47 UTC

Mamouth's host:
GenuineIntel
Intel(R) Pentium(R) 4 CPU 3.00GHz

Microsoft Windows XP
Professional Edition, Service Pack 2, (05.01.2600.00)

Nightbird
Joined: 17 Sep 05
Posts: 70
Credit: 32,418
RAC: 0
Message 496 - Posted: 25 Sep 2005, 21:29:57 UTC

@Mamouth: is this your first benchmark with Rosetta?


Ocean Archer
Joined: 22 Sep 05
Posts: 32
Credit: 49,302
RAC: 0
Message 497 - Posted: 25 Sep 2005, 21:55:47 UTC

Mamouth --

By any chance, did you get some type of Microsoft Windows message just prior to the error message you list in your thread? It might have had the format "(Program) has crashed do you want to report it to Microsoft?"

The reason I ask is that the form and format of your error are exceptionally similar to one seen from another BOINC process (ClimatePrediction), detailed in the BOINC-Wiki, indicating that the BOINC daemon was not properly exited.

J D K
Joined: 23 Sep 05
Posts: 168
Credit: 101,266
RAC: 0
Message 503 - Posted: 26 Sep 2005, 1:10:02 UTC

Christian Barrett
Joined: 17 Sep 05
Posts: 11
Credit: 14,933
RAC: 0
Message 512 - Posted: 26 Sep 2005, 4:20:44 UTC - in response to Message 489.  

> 24/09/2005 17:44:03||Suspending computation and network activity - running CPU benchmarks
> 24/09/2005 17:44:03|rosetta@home|Pausing result 1pvaA_abrelax_20533_0 (removed from memory)
> 24/09/2005 17:44:03|rosetta@home|Pausing result 1pvaA_abrelax_23518_0 (removed from memory)
> 24/09/2005 17:44:04|rosetta@home|Unrecoverable error for result 1pvaA_abrelax_20533_0 ( - exit code -1073741819 (0xc0000005))
> 24/09/2005 17:44:04|rosetta@home|Unrecoverable error for result 1pvaA_abrelax_23518_0 ( - exit code -1073741819 (0xc0000005))
> 24/09/2005 17:44:04||request_reschedule_cpus: process exited
>
> Has anyone else seen this kind of error?


I got this same error, but in a different situation. Mine happened when I manually switched to another project during a run. I think the error comes from the same action. Rosetta must have trouble holding its information once it is well into the crunch, maybe 50% or more. I played with switching at around 8% and didn't have a problem with crashes, only later in the runs.

I think this bug is new with 4.77, but we might have to wait until the Rosetta peeps are back from vacation.

[AF>Belgique]Mamouth
Joined: 18 Sep 05
Posts: 4
Credit: 580,683
RAC: 0
Message 527 - Posted: 26 Sep 2005, 6:28:42 UTC - in response to Message 497.  

> @Mamouth: is this your first benchmark with Rosetta?

No idea, to be honest.

> Mamouth --
>
> By any chance, did you get some type of Microsoft Windows message just prior to the error message you list in your thread? It might have had the format "(Program) has crashed do you want to report it to Microsoft?"
>
> The reason I ask is that the form and format of your error are exceptionally similar to one seen from another BOINC process (ClimatePrediction), detailed in the BOINC-Wiki, indicating that the BOINC daemon was not properly exited.

No Microsoft error; my computer was only running BOINC.
Also, my computer is stable and not overclocked.

PlaNed
Joined: 25 Sep 05
Posts: 3
Credit: 37,334
RAC: 0
Message 537 - Posted: 26 Sep 2005, 11:25:42 UTC

I have the same problem!

26/09/2005 14:04:45||Suspending computation and network activity - running CPU benchmarks
26/09/2005 14:04:45|rosetta@home|Pausing result 1pvaA_abrelax_no_cst_05910_0 (removed from memory)
26/09/2005 14:04:47|rosetta@home|Unrecoverable error for result 1pvaA_abrelax_no_cst_05910_0 ( - exit code -1073741819 (0xc0000005))
26/09/2005 14:04:47||request_reschedule_cpus: process exited
26/09/2005 14:04:47||Running CPU benchmarks

Polian
Joined: 21 Sep 05
Posts: 152
Credit: 10,141,266
RAC: 0
Message 724 - Posted: 28 Sep 2005, 23:31:07 UTC

I noticed the same problem, with similar errors, here on my Linux box.

9/28/2005 4:24:55 PM||Running CPU benchmarks
9/28/2005 4:24:56 PM|rosetta@home|Unrecoverable error for result 1btn__abrelax_no_cst_18915_0 (process exited with code 131 (0x83))
9/28/2005 4:24:57 PM||request_reschedule_cpus: process exited
9/28/2005 4:24:57 PM|rosetta@home|Unrecoverable error for result 1btn__abrelax_no_cst_18922_0 (process exited with code 131 (0x83))
9/28/2005 4:24:57 PM||request_reschedule_cpus: process exited
9/28/2005 4:25:03 PM||Aborting CPU benchmarks, one or more active tasks are still running.
9/28/2005 4:25:04 PM||Resuming computation and network activity
9/28/2005 4:25:04 PM|rosetta@home|Deferring communication with project for 53 seconds
9/28/2005 4:25:04 PM|rosetta@home|Computation for result 1btn__abrelax_no_cst_18915_0 finished
9/28/2005 4:25:04 PM||schedule_cpus: must schedule
9/28/2005 4:25:04 PM|rosetta@home|Restarting result 1btn__abrelax_no_cst_17139_0 using rosetta version 4.77
9/28/2005 4:25:04 PM|rosetta@home|resume_or_start(): unexpected process state 2
9/28/2005 4:25:04 PM|rosetta@home|Starting result 1btn__abrelax_no_cst_19387_0 using rosetta version 4.77
9/28/2005 4:25:04 PM||ACTIVE_TASK_SET::check_app_exited(): pid 16299 not found
9/28/2005 4:25:05 PM|rosetta@home|Computation for result 1btn__abrelax_no_cst_18922_0 finished
9/28/2005 4:25:05 PM||ACTIVE_TASK_SET::check_app_exited(): pid 16300 not found
9/28/2005 4:25:06 PM||ACTIVE_TASK_SET::check_app_exited(): pid 16301 not found
9/28/2005 4:25:07 PM||ACTIVE_TASK_SET::check_app_exited(): pid 16302 not found
9/28/2005 4:25:58 PM|rosetta@home|Requesting 24514.95 seconds of work
9/28/2005 4:25:58 PM|rosetta@home|Sending scheduler request to https://boinc.bakerlab.org/rosetta_cgi/cgi
9/28/2005 4:26:02 PM|rosetta@home|Scheduler request to https://boinc.bakerlab.org/rosetta_cgi/cgi succeeded

STE\/E
Joined: 17 Sep 05
Posts: 125
Credit: 3,277,715
RAC: 1,208
Message 727 - Posted: 29 Sep 2005, 0:42:42 UTC
Last modified: 29 Sep 2005, 0:43:03 UTC

Same here: I lost 2 WUs this morning when this PC ran its benchmarks.

2005-09-28 07:12:26 [---] Suspending computation and network activity - running CPU benchmarks
2005-09-28 07:12:26 [rosetta@home] Pausing result 1pvaA_abrelax_24584_0 (removed from memory)
2005-09-28 07:12:26 [rosetta@home] Pausing result 1pvaA_abrelax_24437_0 (removed from memory)
2005-09-28 07:12:28 [rosetta@home] Unrecoverable error for result 1pvaA_abrelax_24584_0 ( - exit code -1073741819 (0xc0000005))
2005-09-28 07:12:28 [rosetta@home] Unrecoverable error for result 1pvaA_abrelax_24437_0 ( - exit code -1073741819 (0xc0000005))
2005-09-28 07:12:28 [---] request_reschedule_cpus: process exited
2005-09-28 07:12:28 [---] Running CPU benchmarks
2005-09-28 07:13:25 [---] Benchmark results:
2005-09-28 07:13:25 [---] Number of CPUs: 2
2005-09-28 07:13:25 [---] 1754 double precision MIPS (Whetstone) per CPU
2005-09-28 07:13:25 [---] 1887 integer MIPS (Dhrystone) per CPU
2005-09-28 07:13:25 [---] Finished CPU benchmarks
2005-09-28 07:13:25 [---] Resuming computation and network activity
2005-09-28 07:13:25 [---] request_reschedule_cpus: Resuming activities
2005-09-28 07:13:25 [rosetta@home] Computation for result 1pvaA_abrelax_24584_0 finished
2005-09-28 07:13:25 [rosetta@home] resume_or_start(): unexpected process state 2
2005-09-28 07:13:26 [rosetta@home] Starting result 1pvaA_abrelax_24396_0 using rosetta version 4.77
2005-09-28 07:13:27 [rosetta@home] Computation for result 1pvaA_abrelax_24437_0 finished
2005-09-28 07:14:40 [---] Exit requested by user
2005-09-28 07:14:40 [---] request_reschedule_cpus: exit_tasks

Blanckaert
Joined: 21 Sep 05
Posts: 3
Credit: 420,828
RAC: 0
Message 6814 - Posted: 19 Dec 2005, 21:52:05 UTC

Did anyone find out what was causing this problem? I just had it happen to me today, when switching (automatically) from one process to another...


12/19/2005 6:38:09 AM|SZTAKI Desktop Grid|Computation for result 164e958e-f3df-4be9-a718-152ca7812ebc_3 finished
12/19/2005 6:38:09 AM|rosetta@home|Resuming result 1ogw__topology_sample_121951_1 using rosetta version 480
12/19/2005 7:38:09 AM|rosetta@home|Pausing result 1ogw__topology_sample_121951_1 (removed from memory)
12/19/2005 7:38:09 AM|SETI@home|Starting result 13fe05aa.24442.15824.628416.214_2 using setiathome version 418
12/19/2005 7:38:11 AM|rosetta@home|Unrecoverable error for result 1ogw__topology_sample_121951_1 ( - exit code -1073741819 (0xc0000005))
12/19/2005 7:38:11 AM||Rescheduling CPU: process exited
12/19/2005 7:38:11 AM|rosetta@home|Computation for result 1ogw__topology_sample_121951_1 finished

Mark


Tern
Joined: 25 Oct 05
Posts: 575
Credit: 4,488,060
RAC: 2
Message 6816 - Posted: 19 Dec 2005, 21:59:07 UTC - in response to Message 6814.  

> Did anyone find out what was causing this problem?
>
> 12/19/2005 7:38:09 AM|rosetta@home|Pausing result 1ogw__topology_sample_121951_1 (removed from memory)


Yes - you have "leave applications in memory when preempted" set to "no", which causes Rosetta to fail. BOINC V5.2.13 will obey this setting even when benchmarks are run - earlier versions switched the Rosetta app out during a benchmark regardless of the setting.
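The pattern described here (a result is paused and "removed from memory", then the same result reports an unrecoverable error) can be checked mechanically in a saved message log. The following is only a rough sketch: it assumes the pipe-delimited `timestamp|project|message` layout seen in the logs pasted in this thread, and the function name `failed_after_removal` is made up for illustration. It is not a BOINC tool.

```python
import re

# Message patterns copied from the log lines pasted in this thread.
PAUSED = re.compile(r"Pausing result (\S+) \(removed from memory\)")
FAILED = re.compile(r"Unrecoverable error for result (\S+)")


def failed_after_removal(log_lines):
    """Return names of results that errored after being removed from memory."""
    removed = set()
    failed = []
    for line in log_lines:
        # Assumed layout: timestamp|project|message
        parts = line.split("|", 2)
        if len(parts) != 3:
            continue  # skip lines that do not use the pipe-delimited layout
        message = parts[2]
        match = PAUSED.search(message)
        if match:
            removed.add(match.group(1))
            continue
        match = FAILED.search(message)
        if match and match.group(1) in removed:
            failed.append(match.group(1))
    return failed


sample = [
    "12/19/2005 7:38:09 AM|rosetta@home|Pausing result 1ogw__topology_sample_121951_1 (removed from memory)",
    "12/19/2005 7:38:09 AM|SETI@home|Starting result 13fe05aa.24442.15824.628416.214_2 using setiathome version 418",
    "12/19/2005 7:38:11 AM|rosetta@home|Unrecoverable error for result 1ogw__topology_sample_121951_1 ( - exit code -1073741819 (0xc0000005))",
]
print(failed_after_removal(sample))  # ['1ogw__topology_sample_121951_1']
```

Logs that use the bracketed `[project]` layout, like one pasted earlier in this thread, would need a different split step.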

Blanckaert
Joined: 21 Sep 05
Posts: 3
Credit: 420,828
RAC: 0
Message 7037 - Posted: 21 Dec 2005, 16:34:56 UTC - in response to Message 6816.  

>> Did anyone find out what was causing this problem?
>>
>> 12/19/2005 7:38:09 AM|rosetta@home|Pausing result 1ogw__topology_sample_121951_1 (removed from memory)
>
> Yes - you have "leave applications in memory when preempted" set to "no", which causes Rosetta to fail. BOINC V5.2.13 will obey this setting even when benchmarks are run - earlier versions switched the Rosetta app out during a benchmark regardless of the setting.



Yes... but mine happened during a switch to another project... NOT during benchmark testing.


Mark


Tern
Joined: 25 Oct 05
Posts: 575
Credit: 4,488,060
RAC: 2
Message 7041 - Posted: 21 Dec 2005, 16:57:19 UTC - in response to Message 7037.  

"Leave applications in memory when preempted" set to "no" causes Rosetta to sometimes fail, whenever it is switched out of memory, for whatever reason...

Blanckaert
Joined: 21 Sep 05
Posts: 3
Credit: 420,828
RAC: 0
Message 7050 - Posted: 21 Dec 2005, 17:51:46 UTC - in response to Message 7041.

> "Leave applications in memory when preempted" set to "no" causes Rosetta to sometimes fail, whenever it is switched out of memory, for whatever reason...

Okay, so is this something that is being looked at, to be fixed? My system is still running Rosetta 4.81, so I'm not sure whether that is the up-to-date program or not...
Mark


Andrew
Joined: 19 Sep 05
Posts: 162
Credit: 105,512
RAC: 0
Message 7069 - Posted: 21 Dec 2005, 18:33:12 UTC - in response to Message 7050.  

> Okay, so is this something that is being looked at, to be fixed? My system is still running Rosetta 4.81, so I'm not sure whether that is the up-to-date program or not...

The devs are looking into this issue, but it isn't fixed yet. (4.81 is the latest Rosetta version.)

The workaround for this bug is setting "Leave applications in memory when preempted" to "yes", as Bill has stated.

NJMHoffmann
Joined: 17 Dec 05
Posts: 45
Credit: 45,891
RAC: 0
Message 7077 - Posted: 21 Dec 2005, 19:21:01 UTC - in response to Message 7069.  

> The workaround for this bug is setting "Leave applications in memory when preempted" to "yes", as Bill has stated.

I think that only helps if the computer is always on. There is no chance to "leave in memory" if you shut down your PC. And then there are the checkpoints, which are too infrequent (in my opinion). I had a slow(!!) PC do the same calculation again and again until I noticed, aborted the WUs and set Rosetta to "suspend".

Norbert

Andrew
Joined: 19 Sep 05
Posts: 162
Credit: 105,512
RAC: 0
Message 7086 - Posted: 21 Dec 2005, 20:21:43 UTC - in response to Message 7077.  
Last modified: 21 Dec 2005, 20:23:14 UTC

>> The workaround for this bug is setting "Leave applications in memory when preempted" to "yes", as Bill has stated.
>
> I think that only helps if the computer is always on. There is no chance to "leave in memory" if you shut down your PC. And then there are the checkpoints, which are too infrequent (in my opinion). I had a slow(!!) PC do the same calculation again and again until I noticed, aborted the WUs and set Rosetta to "suspend".
>
> Norbert

That is correct. The bug occurs whenever the client is removed from memory, whether because the computer is turned off, BOINC is shut down, "Leave applications in memory when preempted" is set to "no", etc. Unfortunately, if you have a slow computer that you regularly turn off, then you won't be able to run Rosetta on it at the moment.

When/if this bug is fixed is up to the devs. Right now, they have some other issues that they're fixing. Hopefully they'll fix this in the new year, when they're back at school. :)

NJMHoffmann
Joined: 17 Dec 05
Posts: 45
Credit: 45,891
RAC: 0
Message 7094 - Posted: 21 Dec 2005, 21:35:27 UTC - in response to Message 7086.  

> When/if this bug is fixed is up to the devs. Right now, they have some other issues that they're fixing. Hopefully they'll fix this in the new year, when they're back at school. :)

And while they are fixing things, they can look at why there are 9 MB of *.gz files left when suspending Rosetta. SETI and LHC are better at housekeeping (they have empty directories when suspended).

Norbert

River~~
Joined: 15 Dec 05
Posts: 761
Credit: 285,578
RAC: 0
Message 7114 - Posted: 22 Dec 2005, 0:11:07 UTC - in response to Message 7094.  


> And while they are fixing things, they can look at why there are 9 MB of *.gz files left when suspending Rosetta. SETI and LHC are better at housekeeping (they have empty directories when suspended).


Rosetta has so-called 'sticky' data files, that is, files that deliberately do not go away between work units, but I feel that is a benefit, not a disadvantage. It is intended to save you downloading the same data time and time again. Whether you suspend between WUs or keep running, keeping the file on disk saves a repeated download.

Any time you have no Rosetta results showing in the GUI, you can always get rid of these files by using reset project - this deletes all files that relate to the selected project. But don't unless you need the disk space -- you will just be slugging Rosetta's internet connection and your own to get the same files back the next time Rosetta has you looking at that protein again.

The project also has the capability of telling your box to delete those files if they become obsolete. I'd only expect them to do this if they stopped researching a particular protein.

Einstein falls somewhere between Rosetta and SETI in this - it has data files that apply for a few hundred WU, and get automatically deleted after the last of those WU is returned.

River~~



©2024 University of Washington
https://www.bakerlab.org