minirosetta v1.25 bug thread

Message boards : Number crunching : minirosetta v1.25 bug thread

BitSpit
Joined: 5 Nov 05
Posts: 33
Credit: 4,147,344
RAC: 0
Message 53435 - Posted: 29 May 2008, 11:53:31 UTC
Last modified: 29 May 2008, 11:54:18 UTC

To me, 1.24 is looking a lot better than 1.25.

First up is t0391 LOOP IGNORE. These had a tendency to get stuck but still use CPU time and increase the time and percentage. They would run at least twice my selected runtime before I noticed. Upon restarting BOINC, they would go back to between 1 and 3 hours.

https://boinc.bakerlab.org/rosetta/result.php?resultid=166663997
https://boinc.bakerlab.org/rosetta/result.php?resultid=166584405
https://boinc.bakerlab.org/rosetta/result.php?resultid=166495787
https://boinc.bakerlab.org/rosetta/result.php?resultid=166469516
https://boinc.bakerlab.org/rosetta/result.php?resultid=166456690
https://boinc.bakerlab.org/rosetta/result.php?resultid=166451086 (got stuck 4 times. I aborted it.)

Next is 2oq2a BOINC CASP8 LOOPRELAX. They crashed with "Maximum disk usage exceeded".

https://boinc.bakerlab.org/rosetta/result.php?resultid=166703386
https://boinc.bakerlab.org/rosetta/result.php?resultid=166702956
https://boinc.bakerlab.org/rosetta/result.php?resultid=166700665
https://boinc.bakerlab.org/rosetta/result.php?resultid=166705432


I've reduced my runtime from 8 hours to 4 to see if it helps.

David Emigh
Joined: 13 Mar 06
Posts: 158
Credit: 417,178
RAC: 0
Message 53437 - Posted: 29 May 2008, 14:12:50 UTC
Last modified: 29 May 2008, 14:16:40 UTC

Compute error at 25000+ seconds

resultid=166515872

Moderate sized debugger report at the link.

I'm 3/5 with Mini 1.25 so far.
Rosie, Rosie, she's our gal,
If she can't do it, no one shall!

Speedy
Joined: 25 Sep 05
Posts: 162
Credit: 703,854
RAC: 0
Message 53444 - Posted: 29 May 2008, 20:52:50 UTC - in response to Message 53434.  

I remember reading that they are looking into that issue you have.

Thanks for the info. I'm sure the graphics will get better with time.
Speedy

Have a crunching good day!!

David E K
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Jul 05
Posts: 1018
Credit: 4,003,213
RAC: 0
Message 53445 - Posted: 29 May 2008, 22:32:28 UTC

I think I've figured out the issue with the minirosetta graphics. A fix will be added to the next application update, which will hopefully happen soon. It was a silly error. Our options system is very strict, and the BOINC client passes options to the screensaver that are not recognized by our graphics app. These options caused the graphics app to exit prematurely.

The graphics will definitely get better with time. The current graphics are a placeholder.
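
For readers wondering how a strict options system can make an app exit, here is a minimal Python sketch of the general idea. It is not the minirosetta code: the program name and both option names are hypothetical. A strict parser treats any unrecognized option as fatal, while a lenient one sets the unknown options aside and carries on.

```python
import argparse


def parse_graphics_args(argv):
    """Parse only the options a (hypothetical) graphics app knows about."""
    parser = argparse.ArgumentParser(prog="graphics_app")
    # Hypothetical option, not taken from the real graphics app.
    parser.add_argument("--fullscreen", action="store_true")
    # parse_args() would exit on an unknown option (the strict behavior);
    # parse_known_args() returns the unrecognized options instead.
    known, unknown = parser.parse_known_args(argv)
    return known, unknown


# "--some_client_flag" stands in for an option the client might pass along.
known, unknown = parse_graphics_args(["--fullscreen", "--some_client_flag", "1"])
print(known.fullscreen, unknown)
```

Whether the actual fix ignores unknown options or registers them explicitly is not stated in this thread.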

Pepo
Joined: 28 Sep 05
Posts: 115
Credit: 101,358
RAC: 0
Message 53446 - Posted: 29 May 2008, 23:07:27 UTC - in response to Message 53445.  

I think I've figured out the issue with the minirosetta graphics.

Perfect. I was thinking of BOINC 6.2.x being the culprit.

Peter

Speedy
Joined: 25 Sep 05
Posts: 162
Credit: 703,854
RAC: 0
Message 53447 - Posted: 30 May 2008, 2:23:15 UTC - in response to Message 53445.  


The graphics will definitely get better with time. The current graphics are a placeholder.

This is great to hear. Your hard work is much appreciated by all of us.
Speedy
Have a crunching good day!!

John Moffitt
Joined: 20 Mar 07
Posts: 5
Credit: 135,888
RAC: 0
Message 53451 - Posted: 30 May 2008, 7:34:44 UTC
Last modified: 30 May 2008, 7:39:09 UTC

I don't know if this is an error or a bug, but the application is crunching MUCH slower than the estimated time. The estimated time is just over 3 hours, but right now the two units that are being worked on are at 8 hours, and only 33 and 35% done. Since I have a 7 day queue set up, I now have about 70 units waiting to be crunched, all of them with a deadline of 9 days from now, meaning that about 50 will be aborted...

And since there has been a lot of mentions on the topic of the graphics, I figured I would mention that my graphics button is completely grayed out.
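
The "about 50 will be aborted" estimate can be checked with a rough calculation. The sketch below assumes progress is linear, that two tasks run at a time (taken from "the two units that are being worked on"), and that every queued task takes as long as the current ones. None of that is guaranteed, so treat the result as a ballpark figure.

```python
def estimate_backlog(hours_elapsed, fraction_done, queued, deadline_days, concurrent):
    """Estimate how many queued tasks can finish before the deadline."""
    # Linear extrapolation of the total hours per task at the observed rate.
    hours_per_task = hours_elapsed / fraction_done
    # Total CPU hours available across all concurrently running tasks.
    capacity_hours = deadline_days * 24 * concurrent
    can_finish = int(capacity_hours // hours_per_task)
    will_miss = max(queued - can_finish, 0)
    return hours_per_task, can_finish, will_miss


# Figures from the post: 8 hours in, 33% done, 70 queued, 9 day deadline.
hours_per_task, can_finish, will_miss = estimate_backlog(8, 0.33, 70, 9, 2)
print(round(hours_per_task, 1), can_finish, will_miss)
```

With these numbers each task needs roughly 24 hours, so only about 17 of the 70 would finish in time, which is in line with the estimate above.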

Jipsu
Joined: 27 Jan 08
Posts: 10
Credit: 454,555
RAC: 0
Message 53461 - Posted: 30 May 2008, 14:01:06 UTC - in response to Message 53451.  

I don't know if this is an error or a bug, but the application is crunching MUCH slower than the estimated time. The estimated time is just over 3 hours, but right now the two units that are being worked on are at 8 hours, and only 33 and 35% done. Since I have a 7 day queue set up, I now have about 70 units waiting to be crunched, all of them with a deadline of 9 days from now, meaning that about 50 will be aborted...

And since there has been a lot of mentions on the topic of the graphics, I figured I would mention that my graphics button is completely grayed out.


Is it one of these h00x tasks?
h003__BOINC_ABRELAXt397__IGNORE_THE_REST-S25-10-S3-7--h003_-_3578_594_0
Took 18 hours to complete one decoy.

<core_client_version>5.10.45</core_client_version>
<![CDATA[
<stderr_txt>
# cpu_run_time_pref: 86400
# cpu_run_time_pref: 86400
# cpu_run_time_pref: 86400
======================================================
DONE :: 1 starting structures 65543.7 cpu seconds
This process generated 1 decoys from 1 attempts
======================================================

BOINC :: Watchdog shutting down...
BOINC :: BOINC support services shutting down...
called boinc_finish

</stderr_txt>
]]>

And don't worry if some of the WUs will be aborted, they will be assigned to other users. Just let it crunch! :)
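
For anyone who wants to read these numbers out of a stderr report automatically, here is a small Python sketch. The regular expressions are written against the lines shown above only; other task types may print different text, so the patterns are an assumption.

```python
import re

STDERR = """\
# cpu_run_time_pref: 86400
DONE :: 1 starting structures 65543.7 cpu seconds
This process generated 1 decoys from 1 attempts
"""


def summarize(stderr_text):
    """Pull the runtime preference, CPU time and decoy count out of stderr."""
    pref = re.search(r"cpu_run_time_pref:\s*(\d+)", stderr_text)
    done = re.search(
        r"DONE ::\s*\d+ starting structures\s+([\d.]+) cpu seconds", stderr_text
    )
    decoys = re.search(r"generated (\d+) decoys from (\d+) attempts", stderr_text)
    return {
        "pref_hours": int(pref.group(1)) / 3600,    # seconds to hours
        "cpu_hours": float(done.group(1)) / 3600,   # seconds to hours
        "decoys": int(decoys.group(1)),
    }


print(summarize(STDERR))
```

Here 86400 seconds is a 24 hour runtime preference, and 65543.7 CPU seconds is a little over 18 hours for the single decoy.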

Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 53464 - Posted: 30 May 2008, 16:41:38 UTC - in response to Message 53451.  

...crunching MUCH slower than the estimated time. The estimated time is just over 3 hours, but right now the two units that are being worked on are at 8 hours, and only 33 and 35% done.


This would be normal, assuming you've just changed your runtime preference to 24 hours (the maximum). You are correct, having a cache full of work when making such a change is not desirable. As you say, you can just wait and abort the tasks that will pass their deadlines... or, if you prefer, you could revise your runtime preference back down, and reduce your cache size, let the work crunch through for a few days, then set your runtime preference back to your desired level, complete and report a few results at that runtime, then gradually increase your cache back to your desired level.

...having said all of that... some specific tasks will have long running models. And so you may have a couple such long running models running now. If you've not modified your runtime preference, then I would expect you just hit a couple that run longer. When they complete, your other tasks are likely a mixture of other, more typical work and so things will get back to normal on their own.

Rosetta Moderator: Mod.Sense

The_Bad_Penguin
Joined: 5 Jun 06
Posts: 2751
Credit: 3,038,130
RAC: 4,539
Message 53466 - Posted: 30 May 2008, 20:30:27 UTC

rb_05_28_11656_20546_T0422_IGNORE_THE_REST_05_15_3589_98

CPU time 10228.14
stderr out <core_client_version>5.10.13</core_client_version>
<![CDATA[
<stderr_txt>

</stderr_txt>
]]>


Validate state Invalid

Claimed credit 43.898742771417
Granted credit 0
application version 1.25

The_Bad_Penguin
Joined: 5 Jun 06
Posts: 2751
Credit: 3,038,130
RAC: 4,539
Message 53468 - Posted: 30 May 2008, 20:32:24 UTC
Last modified: 30 May 2008, 20:32:41 UTC

rb_05_28_11655_20541_T0421_IGNORE_THE_REST_07_09_3588_208

stderr out <core_client_version>5.10.13</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>

ERROR: in::file::zip minirosetta_database_rev22619.zip does not exist!
ERROR:: Exit from: ..\..\src\apps\public\boinc\minirosetta.cc line: 91
called boinc_finish

</stderr_txt>
]]>


Validate state Invalid
Claimed credit 14.4683648963448
Granted credit 0
application version 1.25

The_Bad_Penguin
Joined: 5 Jun 06
Posts: 2751
Credit: 3,038,130
RAC: 4,539
Message 53469 - Posted: 30 May 2008, 20:34:41 UTC

rb_05_28_11655_20541_T0421_IGNORE_THE_REST_10_14_3588_139


stderr out <core_client_version>5.10.13</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>

ERROR: in::file::zip minirosetta_database_rev22619.zip does not exist!
ERROR:: Exit from: ..\..\src\apps\public\boinc\minirosetta.cc line: 91
called boinc_finish

</stderr_txt>
]]>


Validate state Invalid
Claimed credit 34.8923101867806
Granted credit 0
application version 1.25

The_Bad_Penguin
Joined: 5 Jun 06
Posts: 2751
Credit: 3,038,130
RAC: 4,539
Message 53470 - Posted: 30 May 2008, 20:36:23 UTC

rb_05_28_11655_20541_T0421_IGNORE_THE_REST_08_13_3588_139


CPU time 3602.984
stderr out <core_client_version>5.10.13</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>

ERROR: in::file::zip minirosetta_database_rev22619.zip does not exist!
ERROR:: Exit from: ..\..\src\apps\public\boinc\minirosetta.cc line: 91
called boinc_finish

</stderr_txt>
]]>


Validate state Invalid
Claimed credit 15.463854408087
Granted credit 0
application version 1.25

neil.hunter14
Joined: 9 May 06
Posts: 10
Credit: 278,867
RAC: 3
Message 53480 - Posted: 31 May 2008, 11:49:41 UTC - in response to Message 53470.  

Just started some 1.25 minirosettas on my Linux machine.
First couple ran OK, but my third one seems to be stuck at 95% with just 10m 06s to go. The CPU Time is still running, and the BOINC Manager functions normally, but the WU is not getting any closer to 100% and completing.
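
The symptom described here (CPU time still climbing while the percentage stays put) can be expressed as a simple check. This is only an illustrative Python sketch: the sample values and the 30 minute threshold are made up, and it is not how the BOINC client or the Rosetta watchdog actually decides anything.

```python
def is_stalled(samples, min_cpu_seconds=1800):
    """Return True if CPU time advanced but progress did not.

    samples: list of (cpu_seconds, fraction_done) tuples, oldest first.
    min_cpu_seconds: arbitrary threshold chosen for this sketch.
    """
    if len(samples) < 2:
        return False
    first_cpu, first_frac = samples[0]
    last_cpu, last_frac = samples[-1]
    cpu_advanced = last_cpu - first_cpu
    return cpu_advanced >= min_cpu_seconds and last_frac <= first_frac


# Hypothetical readings: about 33 minutes of CPU time with no progress.
print(is_stalled([(1000, 0.95), (2000, 0.95), (3000, 0.95)]))
```

Keep in mind that a single long-running model can look exactly like this for a while, so a check of this kind can only flag a task for a closer look.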

Neil
____________


Speedy
Joined: 25 Sep 05
Posts: 162
Credit: 703,854
RAC: 0
Message 53490 - Posted: 31 May 2008, 22:24:44 UTC - in response to Message 53480.  

Just started some 1.25 minirosettas on my Linux machine.
First couple ran OK, but my third one seems to be stuck at 95% with just 10m 06s to go. The CPU Time is still running, and the BOINC Manager functions normally, but the WU is not getting any closer to 100% and completing.

Neil
____________

If I were you, I would let it run and see if it aborts itself.
Cheers
Speedy
Have a crunching good day!!

Greg_BE
Joined: 30 May 06
Posts: 4876
Credit: 4,562,816
RAC: 3,333
Message 53491 - Posted: 31 May 2008, 22:37:00 UTC

t393_looprelax_round1_fullatom_relax_aaT0393_2I76A_10_0001_3559_630_1
This task just started, but the graphics window is blank, and when I try to close it, Windows says it is not responding and has to close it.
The task computation is not affected, just the graphics window.

Dr Who Fan
Joined: 28 May 06
Posts: 58
Credit: 219,040
RAC: 0
Message 53496 - Posted: 1 Jun 2008, 6:27:20 UTC

Completed ok but did not receive credit b/c of the "file error"... yet it says credit was granted in the output as noted below.
Same results for my wingman.

Task ID 167779317
Name h004__BOINC_CASP8_ABRELAXt407__IGNORE_THE_REST-S25-7-S3-7--h004_-_3600_642_0
Workunit 153133982
Created 31 May 2008 1:15:51 UTC
Sent 31 May 2008 1:19:51 UTC
Received 31 May 2008 10:29:18 UTC
Server state Over
Outcome Client error
Client state Compute error
Exit status 0 (0x0)
Computer ID 229877
Report deadline 10 Jun 2008 1:19:51 UTC
CPU time 4338.063
stderr out

<core_client_version>6.1.0</core_client_version>
<![CDATA[
<stderr_txt>
# cpu_run_time_pref: 7200
======================================================
DONE :: 1 starting structures 4337.98 cpu seconds
This process generated 1 decoys from 1 attempts
======================================================

BOINC :: Watchdog shutting down...
BOINC :: BOINC support services shutting down...
called boinc_finish

</stderr_txt>
<message>
<file_xfer_error>
<file_name>h004__BOINC_CASP8_ABRELAXt407__IGNORE_THE_REST-S25-7-S3-7--h004_-_3600_642_0_0</file_name>
<error_code>-161</error_code>
</file_xfer_error>


</message>

]]>

Validate state Invalid
Claimed credit 12.2542776836286
Granted credit 12.2542776836286
application version 1.25
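
For anyone collecting several of these reports, the file transfer error can be pulled out of the "stderr out" text with a short Python sketch like the one below. It uses a regular expression rather than an XML parser because the pasted text is not a complete XML document. The sample is copied from the report above, and the function name is made up for this example.

```python
import re

SAMPLE = """<message>
<file_xfer_error>
<file_name>h004__BOINC_CASP8_ABRELAXt407__IGNORE_THE_REST-S25-7-S3-7--h004_-_3600_642_0_0</file_name>
<error_code>-161</error_code>
</file_xfer_error>
</message>"""


def find_transfer_errors(stderr_out):
    """Return a list of (file_name, error_code) pairs found in the text."""
    pattern = re.compile(
        r"<file_xfer_error>\s*<file_name>(.*?)</file_name>\s*"
        r"<error_code>(-?\d+)</error_code>",
        re.DOTALL,
    )
    return [(name, int(code)) for name, code in pattern.findall(stderr_out)]


for name, code in find_transfer_errors(SAMPLE):
    print(code, name)
```

The sketch only extracts the code. What -161 means should be checked against the BOINC error code list rather than guessed from this output.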


Ygrámul
Joined: 13 Apr 08
Posts: 1
Credit: 54,976
RAC: 0
Message 53505 - Posted: 1 Jun 2008, 13:58:17 UTC

Hi,
I've had a couple of tasks stuck in the "waiting to run" state. I did not give too much importance to the first, and as the client seemed not to be running it, I aborted it. I've also aborted the second, sorry :-( but at least have got the details:

Task ID: 167805217
Name: h005__BOINC_CASP8_ABRELAXt397__IGNORE_THE_REST-S25-7-S3-5--h005_-_3593_3373_0
Workunit: 153158110

I'm running version 5.10.45 of the client on Linux (Debian Etch) with the 2.6.22 kernel. The CPU is an Intel Core 2 Duo.

The first task got stuck at about 36%, the second at about 85%. Hope this can be useful!
Cheers!

Loki
Joined: 9 Dec 05
Posts: 9
Credit: 36,264
RAC: 0
Message 53519 - Posted: 2 Jun 2008, 16:05:07 UTC

https://boinc.bakerlab.org/rosetta/result.php?resultid=166970891


</stderr_txt>
<message>
<file_xfer_error>
<file_name>h003__BOINC_ABRELAXt407__IGNORE_THE_REST-S25-8-S3-5--h003_-_3340_694_0_0</file_name>
<error_code>-161</error_code>
</file_xfer_error>

Mike Francis
Joined: 24 Nov 05
Posts: 8
Credit: 623,519
RAC: 0
Message 53523 - Posted: 2 Jun 2008, 20:23:49 UTC

6/2/2008 2:37:45 PM|rosetta@home|Starting 1ctf__BOINC_ABRELAX_SAVE_ALL_OUT_IGNORE_THE_REST-S25-9-S3-12--1ctf_-_3325_46_2
6/2/2008 2:37:45 PM|rosetta@home|Starting task 1ctf__BOINC_ABRELAX_SAVE_ALL_OUT_IGNORE_THE_REST-S25-9-S3-12--1ctf_-_3325_46_2 using minirosetta version 125
6/2/2008 2:37:48 PM|rosetta@home|Computation for task 1ctf__BOINC_ABRELAX_SAVE_ALL_OUT_IGNORE_THE_REST-S25-9-S3-12--1ctf_-_3325_46_2 finished
6/2/2008 2:37:48 PM|rosetta@home|Output file 1ctf__BOINC_ABRELAX_SAVE_ALL_OUT_IGNORE_THE_REST-S25-9-S3-12--1ctf_-_3325_46_2_0 for task 1ctf__BOINC_ABRELAX_SAVE_ALL_OUT_IGNORE_THE_REST-S25-9-S3-12--1ctf_-_3325_46_2 absent
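
The log lines above show the task finishing three seconds after it started and the output file being reported absent. Here is a minimal Python sketch that reads those facts from the pipe-delimited lines. It assumes the timestamp format shown in this post (month/day/year with AM/PM) and an English locale for the AM/PM marker, and the function names are made up for this example.

```python
from datetime import datetime

TASK = "1ctf__BOINC_ABRELAX_SAVE_ALL_OUT_IGNORE_THE_REST-S25-9-S3-12--1ctf_-_3325_46_2"
LOG = "\n".join([
    f"6/2/2008 2:37:45 PM|rosetta@home|Starting task {TASK} using minirosetta version 125",
    f"6/2/2008 2:37:48 PM|rosetta@home|Computation for task {TASK} finished",
    f"6/2/2008 2:37:48 PM|rosetta@home|Output file {TASK}_0 for task {TASK} absent",
])


def parse_line(line):
    """Split one log line into (timestamp, project, message)."""
    stamp, project, message = line.split("|", 2)
    return datetime.strptime(stamp, "%m/%d/%Y %I:%M:%S %p"), project, message


def summarize_log(text):
    """Return (elapsed_seconds, output_absent) for a single task's log lines."""
    started = finished = None
    output_absent = False
    for line in text.splitlines():
        if not line.strip():
            continue
        when, _project, message = parse_line(line)
        if message.startswith("Starting task"):
            started = when
        elif message.startswith("Computation for task") and message.endswith("finished"):
            finished = when
        elif message.startswith("Output file") and message.endswith("absent"):
            output_absent = True
    elapsed = (finished - started).total_seconds() if started and finished else None
    return elapsed, output_absent


print(summarize_log(LOG))
```

A run of only a few seconds together with an absent output file suggests the task exited almost immediately instead of computing anything.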

©2021 University of Washington
https://www.bakerlab.org