Problems with Minirosetta v1.54

Speedy
Joined: 25 Sep 05
Posts: 159
Credit: 598,637
RAC: 0
Message 60564 - Posted: 9 Apr 2009, 7:42:01 UTC - in response to Message 60563.  

I'm interested to know how selecting a black screen, instead of the BOINC graphics, as your screen saver, and avoiding activating the BOINC graphics,
helps increase a work unit's score?

Just because your computer doesn't have to do the computation for the graphics thread too, then.

OK thanks
Have a crunching good day!!
ID: 60564 · Rating: 0
robertmiles
Joined: 16 Jun 08
Posts: 689
Credit: 9,446,754
RAC: 4,864
Message 60569 - Posted: 9 Apr 2009, 12:41:10 UTC - in response to Message 60561.  

I'm interested to know how selecting a black screen, instead of the BOINC graphics, as your screen saver, and avoiding activating the BOINC graphics,
helps increase a work unit's score? Thanks in advance


Selecting a black screen, which only needs to be calculated once, cuts down on the CPU time needed to calculate the graphics, and lets more of what's available be used for the scientific calculations. Since Rosetta@home uses the number of decoys produced as a more important factor in calculating how much credit to give you than the CPU time required to do it, this is likely to increase the number of decoys your computer produces for that workunit, and therefore the resulting score.

Also, something involving the graphics seems to be able to trigger the lockfile problem for a workunit, with the results then returned marked as invalid and therefore worth a score of zero. Once a lockfile problem occurs, 1.54 seems to be unable to erase the lockfile from the slot used by that workunit, and therefore lets the problem spread to any 1.54 workunits run later in the same slot but before the next reboot. My results for Ralph@home indicate that the 1.58 now being tested there has kept this same problem, and therefore needs more testing before the 1.54 used at Rosetta@home is replaced with a newer version.
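
A minimal sketch for spotting a lock file left behind in a slot directory. It assumes the lock file is named `boinc_lockfile` and that you pass in the path to your own BOINC `slots` directory; both are assumptions for illustration, not something confirmed in this thread. A lock file in a slot that is currently running a task is expected, so only a file that remains after the task has ended would point to the problem described above.

```python
from pathlib import Path

# Assumption: the lock file name; change it if your client uses another name.
LOCKFILE_NAME = "boinc_lockfile"


def find_lockfiles(slots_dir, lockfile_name=LOCKFILE_NAME):
    """Return the lock-file paths found in the subdirectories of slots_dir."""
    root = Path(slots_dir)
    if not root.is_dir():
        return []
    found = []
    for slot in sorted(root.iterdir()):
        candidate = slot / lockfile_name
        # Only look inside slot subdirectories, and only report real files.
        if slot.is_dir() and candidate.is_file():
            found.append(candidate)
    return found
```

Calling `find_lockfiles(...)` with your slots path while no Rosetta task is running, and getting a non-empty list back, would be worth including in a report.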
ID: 60569 · Rating: 0
Evan
Joined: 23 Dec 05
Posts: 268
Credit: 402,585
RAC: 0
Message 60570 - Posted: 9 Apr 2009, 12:54:21 UTC

Also, something involving the graphics seems to be able to trigger the lockfile problem for a workunit, with the results then returned marked as invalid


I turn the graphics on and off several times during the course of the day to check on the performance and I haven't encountered this lockfile problem for a long time now on both Rosetta and Ralph.

Having said that, Murphy's Law states 'watch this space'!!
ID: 60570 · Rating: 0
robertmiles
Joined: 16 Jun 08
Posts: 689
Credit: 9,446,754
RAC: 4,864
Message 60573 - Posted: 9 Apr 2009, 14:23:20 UTC - in response to Message 60570.  

Also, something involving the graphics seems to be able to trigger the lockfile problem for a workunit, with the results then returned marked as invalid


I turn the graphics on and off several times during the course of the day to check on the performance and I haven't encountered this lockfile problem for a long time now on both Rosetta and Ralph.

Having said that, Murphy's Law states 'watch this space'!!


The lockfile problem results could vary depending on what operating system version and what BOINC version is used; if so, my results could easily apply only when using BOINC 6.2.28 under Vista SP1. In other words, I suspect that results from just the two of us aren't enough; we need more people with access to other operating system versions and more versions of BOINC to test for graphics causing the lockfile problem and report the results, along with which operating system version and which BOINC version was used.
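
A small sketch of how such a report could be put together so that everyone posts the same fields. The function name and the report layout are hypothetical; the BOINC version is passed in by hand because this sketch does not try to query the client.

```python
import platform


def lockfile_report(boinc_version, graphics_opened, lockfile_seen):
    """Build a short text report for posting in the thread."""
    # platform.system() and platform.release() describe the local OS.
    os_desc = f"{platform.system()} {platform.release()}"
    return (
        f"OS: {os_desc}\n"
        f"BOINC: {boinc_version}\n"
        f"Graphics opened: {'yes' if graphics_opened else 'no'}\n"
        f"Lockfile problem seen: {'yes' if lockfile_seen else 'no'}"
    )


print(lockfile_report("6.2.28", graphics_opened=True, lockfile_seen=True))
```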
ID: 60573 · Rating: 0
Paul D. Buck
Joined: 17 Sep 05
Posts: 815
Credit: 1,812,737
RAC: 0
Message 60591 - Posted: 10 Apr 2009, 6:12:14 UTC

New error:

ERROR: dis==0 in pairtermderiv!
ERROR:: Exit from: src/core/scoring/methods/PairEnergy.cc line: 338
BOINC:: Error reading and gzipping output datafile: default.out

For this task

The same task run on an XP machine ran for a long time and only failed at validation. That is kind of interesting: it almost seems as if my machine (OS X) tipped over on an assertion or parameter file error ... what is the difference between the OS platforms, guys?
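
For comparing failures like this across machines, a minimal sketch that pulls the error message, source file, and line number out of stderr text. The regular expressions are written against the three lines quoted above only; other minirosetta error formats are not covered, and the function name is hypothetical.

```python
import re

# Matches lines such as "ERROR:: Exit from: <path> line: <number>".
EXIT_RE = re.compile(r"^ERROR:: Exit from: (?P<path>.+?) line: (?P<line>\d+)\s*$")
# Matches lines such as "ERROR: <message>" (single colon).
MSG_RE = re.compile(r"^ERROR: (?P<msg>.+?)\s*$")


def parse_rosetta_errors(stderr_text):
    """Return a list of dicts with keys: message, source, line."""
    results = []
    pending = None  # most recent "ERROR: ..." message not yet paired
    for raw in stderr_text.splitlines():
        line = raw.strip()
        m = EXIT_RE.match(line)
        if m:
            results.append({
                "message": pending,
                "source": m.group("path"),
                "line": int(m.group("line")),
            })
            pending = None
            continue
        m = MSG_RE.match(line)
        if m:
            pending = m.group("msg")
    return results
```

Grouping reports by `source` and `line` makes it easier to see whether Mac and Windows hosts are failing at the same place.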
ID: 60591 · Rating: 0
Snags
Joined: 22 Feb 07
Posts: 198
Credit: 1,910,444
RAC: 407
Message 60599 - Posted: 10 Apr 2009, 22:27:45 UTC

I have a 10-hour preferred runtime on my MacBook Pro. I spotted lb_all_multi_threshold.2.0_hb_t317__IGNORE_THE_REST_1I9SA_12_10355_4_0
still running at 10 hours and 20 minutes, so I opened the graphics window to check on it. It was on model 33, step 1920, stage unk. Checking on it later, it had run another CPU hour but failed to make any progress, so I shut down BOINC completely and restarted. It now showed 5 hours and 20 minutes of CPU time consumed, all the rest the same. Within a few seconds it returned to step 0 and apparently restarted model 33 from the beginning. I didn't catch exactly when it reached step 1920, but it would have been about 3-4 CPU minutes after the restart. It didn't get stuck this time but continued on its merry way. It had also moved out of the unk stage by the time I glanced at it 4+ minutes after the restart. It has now finished successfully and validated with 58 models completed in 10 (non-stuck) hours.

Hope this helps.

Snags
ID: 60599 · Rating: 0
jswolf19
Joined: 3 Apr 09
Posts: 3
Credit: 1,040,577
RAC: 0
Message 60605 - Posted: 11 Apr 2009, 14:08:57 UTC

I was looking through the RALPH minirosetta v1.54 bug thread and found an issue about setting day-of-week overrides (http://ralph.bakerlab.org/forum_thread.php?id=432&nowrap=true#4590). I had some set on network usage; when I cleared them and restarted BOINC (which I upgraded to 6.6.20), I started registering progress on a minirosetta task, and the stderr output now progresses past:

Initializing options.... ok

This appears to have been the cause of my problem.
ID: 60605 · Rating: 0
TomaszPawel
Joined: 28 Apr 07
Posts: 54
Credit: 2,791,145
RAC: 0
Message 60620 - Posted: 14 Apr 2009, 10:59:53 UTC

ID: 60620 · Rating: 0
Gavin Shaw
Joined: 1 Feb 07
Posts: 10
Credit: 506,456
RAC: 0
Message 60638 - Posted: 14 Apr 2009, 23:09:36 UTC

While not exactly a bug, this morning I had a rather large upload file...

Task 243404526 had a 6.8MB file to upload. The task only ran for about 50 minutes and my preference is set to 4 hours. It did 99 decoys from 99 attempts.

Thought admin might want to know...

Never surrender and never give up. In the darkest hour there is always hope.

ID: 60638 · Rating: 0
Mod.Sense
Volunteer moderator
Project administrator
Joined: 22 Aug 06
Posts: 3545
Credit: 0
RAC: 0
Message 60639 - Posted: 15 Apr 2009, 0:20:20 UTC

Wow, good thing the watchdog only lets 99 models run. Just imagine how large it would have been with a 4 hour run!
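
A rough worked estimate of that, using the figures from Gavin's post (6.8MB, 99 decoys, about 50 minutes, 4-hour preference). It assumes the decoy rate and the size per decoy stay constant, which is an assumption for illustration only.

```python
def project_upload_mb(observed_mb, observed_minutes, target_minutes):
    """Scale an observed upload size linearly to a longer runtime."""
    return observed_mb * (target_minutes / observed_minutes)


# 6.8 MB after about 50 minutes, projected to a 4-hour (240-minute) run.
projected = project_upload_mb(6.8, 50, 4 * 60)
print(f"{projected:.1f} MB")  # prints "32.6 MB"
```

At the same rate, 99 decoys in 50 minutes would be about 475 decoys in 4 hours, so the upload would have been nearly five times larger without the limit.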
Rosetta Moderator: Mod.Sense
ID: 60639 · Rating: 0
svincent
Joined: 30 Dec 05
Posts: 208
Credit: 7,317,047
RAC: 2,223
Message 60650 - Posted: 15 Apr 2009, 15:50:53 UTC

This task failed on Mac with an error in pairtermderiv that's been reported previously.

243548575

Hbond tripped.

ERROR: dis==0 in pairtermderiv!
ERROR:: Exit from: src/core/scoring/methods/PairEnergy.cc line: 338
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish

</stderr_txt>


ID: 60650 · Rating: 0
Mod.Sense
Volunteer moderator
Project administrator
Joined: 22 Aug 06
Posts: 3545
Credit: 0
RAC: 0
Message 60651 - Posted: 15 Apr 2009, 16:41:26 UTC
Last modified: 15 Apr 2009, 16:41:53 UTC

Further notes on svincent's failed task:

ERROR: dis==0 in pairtermderiv!
ERROR:: Exit from: src/core/scoring/methods/PairEnergy.cc line: 338
BOINC:: Error reading and gzipping output datafile: default.out

...and only 198 seconds of runtime.
Rosetta Moderator: Mod.Sense
ID: 60651 · Rating: 0
Mod.Sense
Volunteer moderator
Project administrator
Joined: 22 Aug 06
Posts: 3545
Credit: 0
RAC: 0
Message 60653 - Posted: 15 Apr 2009, 18:27:09 UTC
Last modified: 15 Apr 2009, 18:27:52 UTC


!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Alert about problem WUs.

Problem task names all begin with "res_careful_". For details on which proteins are known to have problems and should be aborted, and which will run OK and should be run normally, please see the link above.
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Rosetta Moderator: Mod.Sense
ID: 60653 · Rating: 0
l_mckeon
Joined: 5 Jun 07
Posts: 44
Credit: 180,717
RAC: 0
Message 60657 - Posted: 15 Apr 2009, 21:27:57 UTC

The following two tasks had shorter run times than usual (about 1:30 hrs and 1:50 hrs from memory) and their uploads totalled around 16MB.

rest3d85_ip40_1t4w.patchdock.6.pdb_0002_fa_dock.xml_score12_pert38_DOCK_10797_354_0_0
rest3d85_ip40_1t4w.patchdock.6.pdb_0002_fa_dock.xml_score12_pert38_DOCK_10797_354_0_0

ID: 60657 · Rating: 0
Mod.Sense
Volunteer moderator
Project administrator
Joined: 22 Aug 06
Posts: 3545
Credit: 0
RAC: 0
Message 60659 - Posted: 15 Apr 2009, 22:45:20 UTC

l_mckeon, yes, those tasks hit the 99 model limit before reaching your normal runtime preference.
Rosetta Moderator: Mod.Sense
ID: 60659 · Rating: 0
Gavin Shaw
Joined: 1 Feb 07
Posts: 10
Credit: 506,456
RAC: 0
Message 60660 - Posted: 15 Apr 2009, 23:29:56 UTC

Had another big one overnight.

Task 243710356 was another 6.8MB upload, again with 99 decoys.

Of course I have now seen a post about some problem with units, but it didn't help as the unit had already crunched :)

Never surrender and never give up. In the darkest hour there is always hope.

ID: 60660 · Rating: 0
Sid Celery
Joined: 11 Feb 08
Posts: 980
Credit: 21,925,287
RAC: 14,452
Message 60661 - Posted: 16 Apr 2009, 0:14:14 UTC

Another Validation error with this job:

crys__BOINC_ABRELAX_R120G_CRYSTALLIN_SAVE_ALL_OUT_IGNORE_THE_REST-S25-9-S3-3--crys_-_9344_11912_2

No errors reported within the Task Details of any of them.

Previous ones reported here and here.
ID: 60661 · Rating: 0
Paul D. Buck
Joined: 17 Sep 05
Posts: 815
Credit: 1,812,737
RAC: 0
Message 60662 - Posted: 16 Apr 2009, 6:48:51 UTC

Looks like I might have gotten one of the problems:

ERROR: [ERROR] Unable to open constraints file: resample_outward0.05_ub0.1_lb0.02_median.t364_.cst
ERROR:: Exit from: ..\..\src\core\scoring\constraints\ConstraintIO.cc line: 330
BOINC:: Error reading and gzipping output datafile: default.out

task 243804881
ID: 60662 · Rating: 0
Speedy
Joined: 25 Sep 05
Posts: 159
Credit: 598,637
RAC: 0
Message 60678 - Posted: 17 Apr 2009, 3:21:20 UTC

This task http://boinc.bakerlab.org/rosetta/result.php?resultid=243902658 made 99 decoys and the upload was about 7.14MB. Is this normal for these tasks?
Have a crunching good day!!
ID: 60678 · Rating: 0
Greg_BE
Joined: 30 May 06
Posts: 4871
Credit: 3,645,451
RAC: 783
Message 60680 - Posted: 17 Apr 2009, 7:27:34 UTC - in response to Message 60678.  

This task http://boinc.bakerlab.org/rosetta/result.php?resultid=243902658 made 99 decoys and the upload was about 7.14MB. Is this normal for these tasks?


There is a limiter built into the program. It stops the crunching at 99 decoys.
This is normal.
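
A minimal sketch of the stopping rule as described in this thread: a task ends when it reaches either the preferred runtime or the 99-decoy limit, whichever comes first. This is an illustration only, not the actual minirosetta code; the function and constant names are hypothetical.

```python
MAX_DECOYS = 99  # limit mentioned in this thread


def should_stop(elapsed_hours, preferred_hours, decoys_done, max_decoys=MAX_DECOYS):
    """Return True once either the decoy limit or the runtime preference is hit."""
    return decoys_done >= max_decoys or elapsed_hours >= preferred_hours


# A fast task: 99 decoys after about 50 minutes with a 4-hour preference.
print(should_stop(50 / 60, 4, 99))  # True, stopped by the decoy limit
# A slow task: 20 decoys after 1 hour with a 4-hour preference.
print(should_stop(1, 4, 20))        # False, keeps crunching
```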
ID: 60680 · Rating: 0



©2019 University of Washington
http://www.bakerlab.org