Message boards : Number crunching : Problems with Rosetta version 5.80
Author | Message |
---|---|
P . P . L . Joined: 20 Aug 06 Posts: 581 Credit: 4,865,274 RAC: 0 |
This one failed after 34sec, first one in a while. https://boinc.bakerlab.org/rosetta/workunit.php?wuid=107003644 11/5/2007 2:12:36 PM|rosetta@home|Reason: Unrecoverable error for result 1n0u__TREEJUMP_ABRELAX_TOR_EQ_-1_PROB_.1_SAVE_ALL_OUT-1n0u_-_BARCODE__2244_8499_0 (Incorrect function. (0x1) - exit code 1 (0x1)) 11/5/2007 2:12:36 PM|rosetta@home|Computation for task 1n0u__TREEJUMP_ABRELAX_TOR_EQ_-1_PROB_.1_SAVE_ALL_OUT-1n0u_-_BARCODE__2244_8499_0 finished 11/5/2007 2:12:36 PM|rosetta@home|Output file 1n0u__TREEJUMP_ABRELAX_TOR_EQ_-1_PROB_.1_SAVE_ALL_OUT-1n0u_-_BARCODE__2244_8499_0_0 for task 1n0u__TREEJUMP_ABRELAX_TOR_EQ_-1_PROB_.1_SAVE_ALL_OUT-1n0u_-_BARCODE__2244_8499_0 absent Pete. |
rsubler Joined: 24 Jun 07 Posts: 8 Credit: 172,618 RAC: 0 |
After 19 hours of crunching on work unit 106367190, this error condition has suddenly appeared. I tried limiting BOINC to this work unit, rebooting my computer, and running only BOINC, all to no avail. The error condition persists. The computer is an AMD X2 3800+; both cores are available to BOINC, as are 90% of the 1 GB of physical memory and 75% of a 3 GB swap file. Is there anything that I can do besides purging the WU? Ron |
Mod.Sense Volunteer moderator Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
rsubler, if you open Windows task manager and go to the processes tab, how much memory does it indicate is used for that process? (it will have Rosetta in the name) Rosetta Moderator: Mod.Sense |
rsubler Joined: 24 Jun 07 Posts: 8 Credit: 172,618 RAC: 0 |
Mod.Sense, when I activate the next Rosetta WU (also 5.80), Task Manager shows 130,808k. When I try to activate the problem WU, Task Manager does not have a Rosetta entry. This is with all other BOINC WUs suspended. Ron |
Mod.Sense Volunteer moderator Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
rsubler, you say "after 19 hrs of crunching"... has it recorded that much CPU time? Or has it been "waiting for memory" that long? If it has recorded that much CPU time, then it should mean it has been in a "running" status for that long. Has the amount of time spent increased in the last several hours? Looks like your runtime preference must be 24hrs. What is shown for the % completed on it now? Have you exited and restarted BOINC since you noticed this one? Rosetta Moderator: Mod.Sense |
rsubler Joined: 24 Jun 07 Posts: 8 Credit: 172,618 RAC: 0 |
Mod.Sense 1. That is the amount of CPU time shown by BOINC. It has not changed for the last hour or two -- since I noticed the problem. 2. Yes, I have been running 24-hour Rosetta WUs for some months. The problem WU is now showing 79.646% complete. 3. Maybe. I restricted BOINC to the problem WU by suspending all others and then rebooted my system. I did not explicitly exit BOINC. I will try this next. I just did an exit of BOINC and restart. The problem persists. Thanks, Ron |
P . P . L . Joined: 20 Aug 06 Posts: 581 Credit: 4,865,274 RAC: 0 |
I had another task fail last night, same type, different computer. https://boinc.bakerlab.org/rosetta/workunit.php?wuid=107168558 11/5/2007 7:25:53 PM|rosetta@home|Reason: Unrecoverable error for result 2reb__TREEJUMP_ABRELAX_TOR_EQ_-1_PROB_.1_SAVE_ALL_OUT-2reb_-_BARCODE__2244_13080_0 (Incorrect function. (0x1) - exit code 1 (0x1)) 11/5/2007 7:25:53 PM|rosetta@home|Computation for task 2reb__TREEJUMP_ABRELAX_TOR_EQ_-1_PROB_.1_SAVE_ALL_OUT-2reb_-_BARCODE__2244_13080_0 finished 11/5/2007 7:25:53 PM|rosetta@home|Output file 2reb__TREEJUMP_ABRELAX_TOR_EQ_-1_PROB_.1_SAVE_ALL_OUT-2reb_-_BARCODE__2244_13080_0_0 for task 2reb__TREEJUMP_ABRELAX_TOR_EQ_-1_PROB_.1_SAVE_ALL_OUT-2reb_-_BARCODE__2244_13080_0 absent Is someone looking at this problem? Pete. |
rsubler Joined: 24 Jun 07 Posts: 8 Credit: 172,618 RAC: 0 |
To Mod.Sense Re: Waiting for memory, rsubler. The problem has cured itself and the WU is now running. I wish anyone good luck in duplicating the fault and cure. The WU resumed running while I was playing an ancient game with DOSBOX. I thought that DOSBOX was a resource hog, but that is the only variable of which I am aware. Cheers, Ron |
Conan Joined: 11 Oct 05 Posts: 151 Credit: 4,244,078 RAC: 843 |
Had one of these errors on the 6th and now another two on the 7th. Two were on Windows and one on Linux, all with the same error code on the same '1n0u' type WU. This WU This WU This WU <core_client_version>5.8.15</core_client_version> <![CDATA[ <message> Incorrect function. (0x1) - exit code 1 (0x1) </message> <stderr_txt> # cpu_run_time_pref: 21600 # random seed: 1408543 ERROR:: Exit from: .pose.cc line: 769 Also on core client version 5.10.21 |
Luuklag Joined: 13 Sep 07 Posts: 262 Credit: 4,171 RAC: 0 |
A failed one: https://boinc.bakerlab.org/rosetta/result.php?resultid=117906669 ERROR:: Exit from: .pose.cc line: 769 |
M.L. Joined: 21 Nov 06 Posts: 182 Credit: 180,462 RAC: 0 |
Result ID 118032106 Name 2reb__TREEJUMP_ABRELAX_TOR_EQ_-1_PROB_.1_SAVE_ALL_OUT-2reb_-_BARCODE__2244_13258_0 Workunit 107265106 Created 5 Nov 2007 8:09:11 UTC Sent 5 Nov 2007 11:10:58 UTC Received 8 Nov 2007 16:17:30 UTC Server state Over Outcome Client error Client state Compute error Exit status 1 (0x1) Computer ID 510574 Report deadline 15 Nov 2007 11:10:58 UTC CPU time 5042.90625 stderr out <core_client_version>5.10.28</core_client_version> <![CDATA[ <message> Incorrect function. (0x1) - exit code 1 (0x1) </message> <stderr_txt> # cpu_run_time_pref: 21600 # random seed: 1419133 ERROR:: Exit from: .pose.cc line: 769 </stderr_txt> ]]> Validate state Invalid Claimed credit 20.725275001017 Granted credit 0 application version 5.80 |
M.L. Joined: 21 Nov 06 Posts: 182 Credit: 180,462 RAC: 0 |
To Luuklag. SNAP. |
M.L. Joined: 21 Nov 06 Posts: 182 Credit: 180,462 RAC: 0 |
Result ID 118192540 Name 1n0u__TREEJUMP_ABRELAX_TOR_EQ_-5_PROB_.5_SAVE_ALL_OUT-1n0u_-_BARCODE__2243_13643_1 Workunit 107277272 Created 5 Nov 2007 22:07:48 UTC Sent 5 Nov 2007 22:08:22 UTC Received 8 Nov 2007 23:31:01 UTC Server state Over Outcome Client error Client state Compute error Exit status 1 (0x1) Computer ID 510574 Report deadline 15 Nov 2007 22:08:22 UTC CPU time 3377.8125 stderr out <core_client_version>5.10.28</core_client_version> <![CDATA[ <message> Incorrect function. (0x1) - exit code 1 (0x1) </message> <stderr_txt> # cpu_run_time_pref: 21600 # random seed: 1488748 ERROR:: Exit from: .pose.cc line: 769 </stderr_txt> ]]> Validate state Invalid Claimed credit 13.8820928833196 Granted credit 0 --also failed as WU 118046654 5 Nov. |
rochester new york Joined: 2 Jul 06 Posts: 2842 Credit: 2,020,043 RAC: 0 |
Could someone tell me why I got all these errors? |
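
The client log lines quoted in several posts above appear to share a pipe-delimited layout (timestamp, project, message), and the failing tasks seem to cluster on a few targets such as 1n0u and 2reb. For anyone who wants to tally their own failures before reporting them, here is a minimal Python sketch. It assumes that layout holds for every line and that the target is the text before the first `__` in the task name; both are guesses from this thread, not documented BOINC behavior. The helper names (`parse_log_line`, `failed_tasks`, `target_of`, `count_failures_by_target`) are made up for illustration.

```python
import re
import sys
from collections import Counter

# Assumed message wording, copied from the log lines quoted in this thread.
ERROR_RE = re.compile(
    r"Reason: Unrecoverable error for result (?P<task>\S+) "
    r"\((?P<detail>.*)\)$"
)


def parse_log_line(line):
    """Split one log line into (timestamp, project, message), or return None."""
    parts = line.strip().split("|", 2)
    if len(parts) != 3:
        return None
    return tuple(parts)


def failed_tasks(lines):
    """Yield (task_name, error_detail) for each unrecoverable-error line."""
    for line in lines:
        parsed = parse_log_line(line)
        if parsed is None:
            continue  # not in the assumed timestamp|project|message layout
        match = ERROR_RE.search(parsed[2])
        if match:
            yield match.group("task"), match.group("detail")


def target_of(task_name):
    """Assumption: the target ID is the text before the first '__'."""
    return task_name.split("__", 1)[0]


def count_failures_by_target(lines):
    """Count unrecoverable errors per assumed target ID."""
    return Counter(target_of(task) for task, _ in failed_tasks(lines))


if __name__ == "__main__":
    # Reads log lines from standard input and prints one count per target.
    for target, count in count_failures_by_target(sys.stdin).most_common():
        print(target, count)
```

To try it, paste the messages from your BOINC manager into a text file and pipe that file into the script. Lines that do not match the assumed layout are skipped rather than treated as errors.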
©2024 University of Washington
https://www.bakerlab.org