Report problems with Rosetta version 5.32

Message boards : Number crunching : Report problems with Rosetta version 5.32

Mod.Sense
Volunteer moderator
Project administrator
Joined: 22 Aug 06
Posts: 3520
Credit: 0
RAC: 0
Message 29213 - Posted: 12 Oct 2006, 1:46:35 UTC

Version 5.32 is now active on Rosetta@home. There is no need to do anything with your PC; the update will download automatically when you download your first v5.32 tasks. There is also no need to change your existing tasks. They will complete normally and should not be aborted.

Very early in the day with 5.32 there was a batch of work units which had errors and were removed from the server. If you have any of these tasks, please abort them. They have a task name with the following pattern:
"DOC_....pert_bench_1263_..."
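
A minimal sketch of how a volunteer might check task names against the pattern above. It assumes that each run of dots in the pattern stands for arbitrary characters and that only "DOC_" and "pert_bench_1263_" are literal; the function name and the sample task names are made up for illustration.

```python
import re

# Assumption: the dots in the moderator's pattern are elisions, so only
# the "DOC_" prefix and the "pert_bench_1263_" fragment are matched literally.
BAD_BATCH = re.compile(r"^DOC_.*pert_bench_1263_")


def is_bad_batch(task_name: str) -> bool:
    """Return True if the task name matches the withdrawn batch pattern."""
    return BAD_BATCH.search(task_name) is not None


if __name__ == "__main__":
    # Both names below are hypothetical examples, not real tasks.
    for name in ("DOC_XXXX_pert_bench_1263_1_0", "1di2__ABRELAX_1278_3896_0"):
        print(name, "->", "abort" if is_bad_batch(name) else "keep")
```

Anchoring the match at the start of the name keeps other task types that merely contain "DOC_" somewhere in the middle from being flagged.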

Report any problems you encounter with v5.32 here in this thread. Please note your operating system (Windows, Linux, Mac), and provide the task names or links to the tasks you had problems with.
Rosetta Moderator: Mod.Sense
ID: 29213 · Rating: 0
frederick corse
Joined: 7 Oct 05
Posts: 10
Credit: 1,545,999
RAC: 0
Message 29219 - Posted: 12 Oct 2006, 2:44:39 UTC

Hello, I was running two PSH 0007 hiv tasks when the new version was posted. Two programs downloaded at the same time and the first one aborted. The second one didn't show what it was in the header. The screensaver started up but ran only the first step, then hung. The program will run without the screensaver; if the screensaver is used, it resets the program. I tried another program on another computer, and when the screensaver switched to another program it aborted that one. I am running one dual G5 at 2.5 GHz with 8 GB of RAM; the other is a dual G5 at 2.0 GHz, also with 8 GB of RAM.
ID: 29219 · Rating: 0
P . P . L .
Joined: 20 Aug 06
Posts: 581
Credit: 4,865,274
RAC: 0
Message 29223 - Posted: 12 Oct 2006, 6:15:52 UTC

A silly question, but will the old app 5.25 be deleted from the folder when the new one takes over?

Thanks.

ID: 29223 · Rating: 0
Mod.Sense
Volunteer moderator
Project administrator
Joined: 22 Aug 06
Posts: 3520
Credit: 0
RAC: 0
Message 29230 - Posted: 12 Oct 2006, 11:37:32 UTC - in response to Message 29223.  

A silly question, but will the old app 5.25 be deleted from the folder when the new one takes over?

Thanks.


Yes, once you've completed and reported all of your v5.25 tasks, this version of the application will be removed from your system. So you don't have to worry about different versions of Rosetta taking up more and more space over time. But for a short time you may have both versions on your system.

Rosetta Moderator: Mod.Sense
ID: 29230 · Rating: 0
Norman_RKN
Joined: 9 Feb 06
Posts: 3
Credit: 139,056
RAC: 0
Message 29243 - Posted: 12 Oct 2006, 16:57:45 UTC - in response to Message 29213.  

<core_client_version>5.4.11</core_client_version>
<![CDATA[
<message>
- exit code 1073807364 (0x40010004)
</message>
<stderr_txt>
# random seed: 2004175
# cpu_run_time_pref: 86400

</stderr_txt>
]]>

1t4o_1_CASPR_1_1t4o_1_xxidid_model_07_core_0001IGNORE_THE_REST_idl_1274_1334_0

:(
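
A small sketch for reading the "# key: value" lines that appear inside the stderr_txt section above. It assumes that cpu_run_time_pref is expressed in seconds; the function name is hypothetical and not part of BOINC or Rosetta.

```python
def parse_stderr_header(text: str) -> dict:
    """Collect '# key: value' lines into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("#") and ":" in line:
            key, _, value = line[1:].partition(":")
            fields[key.strip()] = value.strip()
    return fields


# The two lines quoted in the report above.
sample = """# random seed: 2004175
# cpu_run_time_pref: 86400
"""

fields = parse_stderr_header(sample)
# Assumption: the preference is in seconds, so divide by 3600 for hours.
hours = int(fields["cpu_run_time_pref"]) / 3600
print(fields["random seed"], hours)  # prints: 2004175 24.0
```

Under that assumption, this task was configured for a 24-hour run time, yet the log contains nothing beyond the two header lines.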

ID: 29243 · Rating: 0
SOAN
Joined: 27 Sep 05
Posts: 252
Credit: 63,160
RAC: 0
Message 29251 - Posted: 12 Oct 2006, 22:21:28 UTC

Not a problem exactly, but after the new version kicked in, it seems like it is taking longer for WUs to initialize. I wonder if this means that the new version is slower? I'm having trouble telling whether this is due to the new version, new work units, or my computer slowing down for some reason. The RAC for the computer that I'm watching looks pretty steady - maybe I'm just paying more attention to something that's been there all along?
ID: 29251 · Rating: 0
Mod.Sense
Volunteer moderator
Project administrator
Joined: 22 Aug 06
Posts: 3520
Credit: 0
RAC: 0
Message 29254 - Posted: 12 Oct 2006, 22:36:40 UTC

It seems the new Docking tasks progress through the "steps" more slowly than prior tasks. I suppose that since docking works with two proteins rather than just one, it has more work to do at each step. In fact, I had one last night which had run for over 8 hours and was still crunching model 3!

So, I believe it is primarily due to the new science that the new version allows us to explore.

Rosetta Moderator: Mod.Sense
ID: 29254 · Rating: 0
Chu
Joined: 23 Feb 06
Posts: 120
Credit: 112,439
RAC: 0
Message 29258 - Posted: 12 Oct 2006, 23:50:53 UTC - in response to Message 29254.  

For most of the proteins simulated in "RELAX" WUs, the size is around 100 residues. For Docking WUs, it is very common for the total protein size to exceed 300 residues. Here we are targeting the interaction between two proteins whose individual structures are already known. The number of "steps" in the simulation is much smaller in docking, but within each step a lot of computation is devoted to refining the more detailed interactions between protein sidechains across the interface.
It seems the new Docking tasks progress through the "steps" more slowly than prior tasks. I suppose that since docking works with two proteins rather than just one, it has more work to do at each step. In fact, I had one last night which had run for over 8 hours and was still crunching model 3!

So, I believe it is primarily due to the new science that the new version allows us to explore.


ID: 29258 · Rating: 0
Bandit
Joined: 21 May 06
Posts: 12
Credit: 197,197
RAC: 0
Message 29280 - Posted: 13 Oct 2006, 8:56:44 UTC - in response to Message 29258.  

For the last few days, when I "become active" on my computer and Rosetta (5.32) suspends itself, I've gotten a message that Rosetta has a problem, asking whether the error should be reported to Microsoft. I finally got an error message this morning. The other times had no error message.

10/13/2006 4:51:51 AM||Suspending network activity - user is active
10/13/2006 4:51:59 AM|rosetta@home|Unrecoverable error for result 1t4o_1_CASPR_1_1t4o_1_xxidid_model_04_core_0001IGNORE_THE_REST_idl_1274_1944_0 (One or more arguments are invalid (0x80000003) - exit code -2147483645 (0x80000003))

Any ideas?

Bandit's Mom
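
The BOINC messages in this thread print each exit code twice: as a signed decimal and as 32-bit hex. The sketch below shows that the two forms are the same value, and attaches the names that these values carry in Microsoft's published NTSTATUS list. Whether those names explain the actual cause of any failure reported here is not established by the thread; the function names are hypothetical.

```python
# Names taken from Microsoft's NTSTATUS list. They label the numeric value
# only; they are not a diagnosis of the failures reported in this thread.
KNOWN_CODES = {
    0x40010004: "DBG_TERMINATE_PROCESS",
    0x80000003: "STATUS_BREAKPOINT",
    0xC0000005: "STATUS_ACCESS_VIOLATION",
}


def to_hex32(code: int) -> str:
    """Render a signed exit code as unsigned 32-bit hex, as BOINC prints it."""
    return f"0x{code & 0xFFFFFFFF:08x}"


def describe(code: int) -> str:
    """Return the hex form plus a known name, or 'unknown'."""
    name = KNOWN_CODES.get(code & 0xFFFFFFFF, "unknown")
    return f"{to_hex32(code)} {name}"


if __name__ == "__main__":
    for code in (1073807364, -2147483645, -1073741819):
        print(code, "->", describe(code))
```

Masking with 0xFFFFFFFF is what turns a negative decimal such as -2147483645 into the 0x80000003 shown next to it in the log.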
ID: 29280 · Rating: 0
LG
Joined: 25 Jun 06
Posts: 2
Credit: 1,910
RAC: 0
Message 29286 - Posted: 13 Oct 2006, 15:55:19 UTC

A pervasive "shared memory" error after coming up from power system upgrade.
Please see that thread for response(s) on possible cause. Thanks.
ID: 29286 · Rating: 0
Chu
Joined: 23 Feb 06
Posts: 120
Credit: 112,439
RAC: 0
Message 29298 - Posted: 13 Oct 2006, 16:58:04 UTC - in response to Message 29280.  

The run log shows that the simulation was stuck and the score was not changed for a long time. So the watchdog forced it to stop.
For the last few days, when I "become active" on my computer and Rosetta (5.32) suspends itself, I've gotten a message that Rosetta has a problem, asking whether the error should be reported to Microsoft. I finally got an error message this morning. The other times had no error message.

10/13/2006 4:51:51 AM||Suspending network activity - user is active
10/13/2006 4:51:59 AM|rosetta@home|Unrecoverable error for result 1t4o_1_CASPR_1_1t4o_1_xxidid_model_04_core_0001IGNORE_THE_REST_idl_1274_1944_0 (One or more arguments are invalid (0x80000003) - exit code -2147483645 (0x80000003))

Any ideas?

Bandit's Mom


ID: 29298 · Rating: 0
Bandit
Joined: 21 May 06
Posts: 12
Credit: 197,197
RAC: 0
Message 29305 - Posted: 13 Oct 2006, 21:38:17 UTC - in response to Message 29298.  
Last modified: 13 Oct 2006, 21:38:44 UTC

The run log shows that the simulation was stuck and the score was not changed for a long time. So the watchdog forced it to stop.


Well, that makes sense. My RAC has taken a nosedive in the last few days. Any suggestions as to what, if anything, I can do to fix it? Preferably simple and straightforward, and with step-by-step instructions.
ID: 29305 · Rating: 0
Faust
Joined: 7 Sep 06
Posts: 14
Credit: 49,559
RAC: 0
Message 29322 - Posted: 14 Oct 2006, 2:29:51 UTC


Hi,

Got this today -

14/10/2006 03:56:49|rosetta@home|Unrecoverable error for result 1di2__BOINC_OLDRELAXFLAGS_ABRELAX_SAVE_ALL_OUT__1278_3896_0 ( - exit code 1073807364 (0x40010004))

<core_client_version>5.4.11</core_client_version>
<message>
- exit code 1073807364 (0x40010004)
</message>
<stderr_txt>
# random seed: 3496105
# cpu_run_time_pref: 10800

</stderr_txt>

Outcome: Client error
Client state: Compute error

Should I worry about other WUs, or is it just a one-time glitch?


Thanks,
Faust.
ID: 29322 · Rating: 0
Chu
Joined: 23 Feb 06
Posts: 120
Credit: 112,439
RAC: 0
Message 29326 - Posted: 14 Oct 2006, 5:59:16 UTC - in response to Message 29305.  
Last modified: 14 Oct 2006, 6:03:35 UTC

There is nothing wrong on your side. That problem happens occasionally and randomly when the simulation goes into some extreme conditions and cannot rescue itself. That is one of the reasons why the "watchdog" feature was added: to prevent a job from running endlessly. Sorry for your RAC loss; we wish we had a better solution for this problem.
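
As an illustration only, the idea of a watchdog that stops a run whose score has stopped changing can be sketched as follows. This is not Rosetta's actual watchdog code; the class name, the timeout value, and the injectable clock are all assumptions made for the example.

```python
import time


class StallWatchdog:
    """Flags a run whose score has not changed for `timeout` seconds."""

    def __init__(self, timeout: float, clock=time.monotonic):
        self.timeout = timeout
        self.clock = clock  # injectable so the example can use a fake clock
        self.last_score = None
        self.last_change = clock()

    def report(self, score: float) -> None:
        """Record the latest score; restart the timer only if it changed."""
        if score != self.last_score:
            self.last_score = score
            self.last_change = self.clock()

    def stalled(self) -> bool:
        """True once the score has been unchanged for at least `timeout`."""
        return self.clock() - self.last_change >= self.timeout


if __name__ == "__main__":
    now = [0.0]  # fake clock, in seconds
    watchdog = StallWatchdog(timeout=10, clock=lambda: now[0])
    watchdog.report(-120.5)
    now[0] = 15.0
    watchdog.report(-120.5)  # same score, so the timer keeps running
    print(watchdog.stalled())  # prints: True
```

A real watchdog would typically run in a separate thread or process so that it can still act when the simulation loop itself is stuck.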

The run log shows that the simulation was stuck and the score was not changed for a long time. So the watchdog forced it to stop.


Well, that makes sense. My RAC has taken a nosedive in the last few days. Any suggestions as to what, if anything, I can do to fix it? Preferably simple and straightforward, and with step-by-step instructions.


ID: 29326 · Rating: 0
Chu
Joined: 23 Feb 06
Posts: 120
Credit: 112,439
RAC: 0
Message 29327 - Posted: 14 Oct 2006, 6:01:19 UTC - in response to Message 29322.  

Looks like a random unlucky error as there have been valid results returned for these WUs. Thanks for checking with us.

Hi,

Got this today -

14/10/2006 03:56:49|rosetta@home|Unrecoverable error for result 1di2__BOINC_OLDRELAXFLAGS_ABRELAX_SAVE_ALL_OUT__1278_3896_0 ( - exit code 1073807364 (0x40010004))

<core_client_version>5.4.11</core_client_version>
<message>
- exit code 1073807364 (0x40010004)
</message>
<stderr_txt>
# random seed: 3496105
# cpu_run_time_pref: 10800

</stderr_txt>

Outcome: Client error
Client state: Compute error

Should I worry about other WUs, or is it just a one-time glitch?


Thanks,
Faust.


ID: 29327 · Rating: 0
Bandit
Joined: 21 May 06
Posts: 12
Credit: 197,197
RAC: 0
Message 29334 - Posted: 14 Oct 2006, 9:42:25 UTC - in response to Message 29326.  

I'm not sure that this is the same problem, but I just got these messages. They're timed for when I last became active again.

10/14/2006 5:29:16 AM|rosetta@home|Unrecoverable error for result 1ogw__BOINC_OLDRELAXFLAGS_ABRELAX_SAVE_ALL_OUT__1278_3836_0 ( - exit code 1073807364 (0x40010004))

10/14/2006 5:37:06 AM|rosetta@home|Unrecoverable error for result DOC_1STF_pose_bound_perturb_benchmark_1280_155_0 ( - exit code 1073807364 (0x40010004))

Bandit's Mom


There is nothing wrong on your side. That problem happens occasionally and randomly when the simulation goes into some extreme conditions and cannot rescue itself. That is one of the reasons why the "watchdog" feature was added: to prevent a job from running endlessly. Sorry for your RAC loss; we wish we had a better solution for this problem.

The run log shows that the simulation was stuck and the score was not changed for a long time. So the watchdog forced it to stop.


Well, that makes sense. My RAC has taken a nosedive in the last few days. Any suggestions as to what, if anything, I can do to fix it? Preferably simple and straightforward, and with step-by-step instructions.



ID: 29334 · Rating: 0
alpha_fruit
Joined: 28 May 06
Posts: 5
Credit: 27,634
RAC: 0
Message 29338 - Posted: 14 Oct 2006, 10:22:43 UTC

Here is my log file. I even rebooted my PC and still nada. Any help appreciated.

10/14/2006 6:03:22 AM||Starting BOINC client version 5.4.9 for windows_intelx86
10/14/2006 6:03:22 AM||libcurl/7.15.3 OpenSSL/0.9.8a zlib/1.2.3
10/14/2006 6:03:22 AM||Data directory: C:\Program Files\BOINC
10/14/2006 6:03:26 AM||Processor: 1 GenuineIntel Intel(R) Celeron(R) CPU 3.20GHz
10/14/2006 6:03:26 AM||Memory: 446.48 MB physical, 1.03 GB virtual
10/14/2006 6:03:26 AM||Disk: 67.73 GB total, 52.75 GB free
10/14/2006 6:03:27 AM|rosetta@home|URL: http://boinc.bakerlab.org/rosetta/; Computer ID: 229529; location: home; project prefs: default
10/14/2006 6:03:27 AM||General prefs: from rosetta@home (last modified 2006-06-23 09:00:43)
10/14/2006 6:03:27 AM||General prefs: no separate prefs for home; using your defaults
10/14/2006 6:03:27 AM||Local control only allowed
10/14/2006 6:03:29 AM||Listening on port 31416
10/14/2006 6:03:30 AM|rosetta@home|MD5 check failed for 1dtj_.fasta.gz
10/14/2006 6:03:30 AM|rosetta@home|expected 80cdd6cee4f0e01b5fce7f297fb5ac67, got 68d2b09ed5d0630440534a4c3566a1f0
10/14/2006 6:03:30 AM|rosetta@home|Started download of file 1dtj_.fasta.gz
10/14/2006 6:03:33 AM||Project communication failed: attempting access to reference site
10/14/2006 6:03:33 AM|rosetta@home|Temporarily failed download of 1dtj_.fasta.gz: http error
10/14/2006 6:03:33 AM|rosetta@home|Backing off 3 hours, 6 minutes and 4 seconds on download of file 1dtj_.fasta.gz
10/14/2006 6:03:36 AM||Access to reference site succeeded - project servers may be temporarily down.
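
The "MD5 check failed" lines in the log above compare an expected digest with the digest of the downloaded file. A minimal sketch of that kind of check with Python's standard hashlib module is shown below. It is not the BOINC client's own code; the function names are hypothetical, and the path to the file on disk would have to be supplied by the reader.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def md5_matches(path: str, expected: str) -> bool:
    """Compare a file's MD5 digest with an expected hex string."""
    return md5_of_file(path) == expected.strip().lower()
```

For the file in the log, the expected value to pass in would be 80cdd6cee4f0e01b5fce7f297fb5ac67; a mismatch means the local copy differs from what the server described, for example because of a truncated or corrupted download.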

ID: 29338 · Rating: 0
timtam
Joined: 28 Nov 05
Posts: 1
Credit: 109,604
RAC: 0
Message 29416 - Posted: 16 Oct 2006, 1:23:18 UTC

Similar unrecoverable errors. I've now had 8 in the last day.
I notice it races from 1% to 100%.

Is there anything I should be looking for? Maybe a hard disk/file fault, considering the speed of failure (20 to 25 seconds of runtime)?
SETI and Climatechange are both running OK.

16/10/2006 11:15:44 AM|rosetta@home|Started download of file 1r69.pdb.gz
16/10/2006 11:15:46 AM|rosetta@home|Finished download of file 1r69.pdb.gz
16/10/2006 11:15:46 AM|rosetta@home|Throughput 6260 bytes/sec
16/10/2006 11:15:53 AM|rosetta@home|Unrecoverable error for result 1n0u__BOINC_OLDRELAXFLAGS_ABRELAX_SAVE_ALL_OUT__1278_7757_0 ( - exit code -1073741819 (0xc0000005))
16/10/2006 11:15:53 AM|rosetta@home|Deferring scheduler requests for 4 minutes and 27 seconds
16/10/2006 11:15:53 AM||Rescheduling CPU: application exited
16/10/2006 11:15:53 AM|rosetta@home|Computation for task 1n0u__BOINC_OLDRELAXFLAGS_ABRELAX_SAVE_ALL_OUT__1278_7757_0 finished
16/10/2006 11:15:53 AM|rosetta@home|Starting task 1n0u__BOINC_NEWRELAXFLAGS_ABRELAX_SAVE_ALL_OUT__1275_9015_0 using rosetta version 532
16/10/2006 11:16:18 AM|rosetta@home|Unrecoverable error for result 1n0u__BOINC_NEWRELAXFLAGS_ABRELAX_SAVE_ALL_OUT__1275_9015_0 ( - exit code -1073741819 (0xc0000005))
16/10/2006 11:16:18 AM|rosetta@home|Deferring scheduler requests for 4 minutes and 30 seconds
16/10/2006 11:16:18 AM||Rescheduling CPU: application exited
16/10/2006 11:16:18 AM|rosetta@home|Computation for task 1n0u__BOINC_NEWRELAXFLAGS_ABRELAX_SAVE_ALL_OUT__1275_9015_0 finished
16/10/2006 11:16:18 AM|rosetta@home|Starting task 1n0u__BOINC_OLDRELAXFLAGS_ABRELAX_SAVE_ALL_OUT__1278_7756_0 using rosetta version 532
16/10/2006 11:16:44 AM|rosetta@home|Unrecoverable error for result 1n0u__BOINC_OLDRELAXFLAGS_ABRELAX_SAVE_ALL_OUT__1278_7756_0 ( - exit code -1073741819 (0xc0000005))
16/10/2006 11:16:44 AM|rosetta@home|Deferring scheduler requests for 19 minutes and 19 seconds
16/10/2006 11:16:44 AM||Rescheduling CPU: application exited
16/10/2006 11:16:44 AM|rosetta@home|Computation for task 1n0u__BOINC_OLDRELAXFLAGS_ABRELAX_SAVE_ALL_OUT__1278_7756_0 finished
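
The run time of each failed task can be read off a log like the one above by pairing every "Starting task" line with the matching "Unrecoverable error" line. The sketch below assumes the pipe-separated layout and the day/month/year timestamps shown in this post (other posts in the thread use month/day/year); the function names are hypothetical.

```python
import re
from datetime import datetime

START = re.compile(r"Starting task (\S+)")
ERROR = re.compile(r"Unrecoverable error for result (\S+) .*exit code (-?\d+)")


def parse_line(line: str):
    """Split 'timestamp|project|message' and parse the timestamp."""
    stamp, project, message = line.split("|", 2)
    # Assumption: day/month/year with a 12-hour clock, as in this post.
    when = datetime.strptime(stamp, "%d/%m/%Y %I:%M:%S %p")
    return when, project, message


def failed_runs(lines):
    """Return (task, exit_code, seconds_run) for each unrecoverable error."""
    started = {}
    results = []
    for line in lines:
        when, _, message = parse_line(line)
        if match := START.search(message):
            started[match.group(1)] = when
        elif match := ERROR.search(message):
            task, code = match.group(1), int(match.group(2))
            begin = started.get(task)
            # seconds is None when the start line is not in the excerpt.
            seconds = (when - begin).total_seconds() if begin else None
            results.append((task, code, seconds))
    return results


if __name__ == "__main__":
    log = [
        "16/10/2006 11:15:53 AM|rosetta@home|Starting task 1n0u__BOINC_NEWRELAXFLAGS_ABRELAX_SAVE_ALL_OUT__1275_9015_0 using rosetta version 532",
        "16/10/2006 11:16:18 AM|rosetta@home|Unrecoverable error for result 1n0u__BOINC_NEWRELAXFLAGS_ABRELAX_SAVE_ALL_OUT__1275_9015_0 ( - exit code -1073741819 (0xc0000005))",
    ]
    for task, code, seconds in failed_runs(log):
        print(task, code, seconds)
```

For the two lines used in the example, the elapsed time is 25 seconds, which agrees with the 20 to 25 seconds mentioned in the post.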

ID: 29416 · Rating: 0
Astro
Joined: 2 Oct 05
Posts: 987
Credit: 500,253
RAC: 0
Message 29420 - Posted: 16 Oct 2006, 1:45:32 UTC
Last modified: 16 Oct 2006, 1:52:14 UTC

From all the errors with FRA 2rio and the new ones displayed by timtam, it seems that something about these new WUs doesn't agree with some computers. Is anyone else seeing the same?
ID: 29420 · Rating: 0
Feet1st
Joined: 30 Dec 05
Posts: 1740
Credit: 3,636,261
RAC: 706
Message 29424 - Posted: 16 Oct 2006, 3:43:40 UTC

I just began a FRA 2rio task. It's taking considerably more memory than most WUs: about 225MB. Folks with only the 256MB recommended minimum are seeing constant page faults running DOC WUs as well. Windows shows me the peak memory usage of my FRA 2rio was 278MB.
ID: 29424 · Rating: 0

©2019 University of Washington
http://www.bakerlab.org