Problems with Rosetta version 5.51

Rhiju
Volunteer moderator

Joined: 8 Jan 06
Posts: 223
Credit: 3,546
RAC: 0
Message 37811 - Posted: 14 Mar 2007, 19:11:36 UTC - in response to Message 37802.  

Hi all:

ThorLite, AMD_is_logical, and others: the "0 decoys" message is misleading. Many of the workunits that you report are producing tons of decoys and getting credit. A small fraction aren't getting credit, though, and I'm tracking those down. I think I know the overall fix, and am working on it over on ralph.
I do want to say that the results are streaming in beautifully, and the data is pretty awesome.

I'm not entirely sure about the quad core problem, thor -- if you attach your project to ralph part-time, I'll be doing an update soon. I think I know one possible issue with the graphics, and I'll have the fix on the next update. We obviously don't want to lose your machine!

Thanks to everybody for posting and crunching!


http://boinc.bakerlab.org/rosetta/result.php?resultid=67366016

1esy__BOINC_RNA_ABINITIO-1esy_-_1609_5735_0

CPU time 2009.7416

# random seed: 1591680
# cpu_run_time_pref: 86400
======================================================
DONE :: 1 starting structures built 30 (nstruct) times
This process generated 0 decoys from 0 attempts
======================================================

This has been a very reliable cruncher that's set for 24 hour execution preference.

-- David
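
For anyone comparing their own results against the output quoted above, here is a minimal Python sketch (not part of Rosetta or BOINC) that pulls the run-time preference and the decoy/attempt counts out of a stderr dump. The function name `summarize` and the regular expressions are illustrative assumptions based only on the line formats shown in this thread.

```python
import re

# Sample stderr text copied from the result quoted above.
STDERR = """\
# random seed: 1591680
# cpu_run_time_pref: 86400
======================================================
DONE :: 1 starting structures built 30 (nstruct) times
This process generated 0 decoys from 0 attempts
======================================================
"""


def summarize(stderr_text):
    """Return the run-time preference in hours plus decoy and attempt counts.

    Any field whose line is missing from the text is reported as None.
    """
    pref = re.search(r"cpu_run_time_pref:\s*(\d+)", stderr_text)
    done = re.search(r"generated (\d+) decoys from (\d+) attempts", stderr_text)
    return {
        # The preference is logged in seconds; 3600 seconds per hour.
        "pref_hours": int(pref.group(1)) / 3600 if pref else None,
        "decoys": int(done.group(1)) if done else None,
        "attempts": int(done.group(2)) if done else None,
    }


summary = summarize(STDERR)
print(summary)
```

The 86400-second preference works out to the 24 hours mentioned above. As noted earlier in the thread, a "0 decoys" line by itself does not necessarily mean the workunit produced nothing.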


ID: 37811 · Rating: 0
Gatekeeper

Joined: 26 Feb 07
Posts: 4
Credit: 966,551
RAC: 0
Message 37812 - Posted: 14 Mar 2007, 19:16:45 UTC

I've had 4 Validate errors on RNA WUs in the last 8 hours. All had very low CPU times (see WU #s 60249661, 60260569, 60268231 and 60385171). The error messages are the same as others have posted. Odd, as I have VERY seldom had ANY errors with Rosetta (a pleasant change from SETI (smile)).
ID: 37812 · Rating: 0
Rhiju
Volunteer moderator

Joined: 8 Jan 06
Posts: 223
Credit: 3,546
RAC: 0
Message 37813 - Posted: 14 Mar 2007, 19:29:12 UTC - in response to Message 37812.  

Hi all:

I found the fix for the validate errors (I think!) and will be testing this on RALPH later today. For now, I am not sending out any more RNA WUs until everything is fixed. In the meantime, you'll have some nice protein workunits to crunch on.

Thanks to everyone for posting, and if you're interested, give me feedback on 5.52 on ralph tomorrow!

ID: 37813 · Rating: 0
Greg_BE
Joined: 30 May 06
Posts: 4835
Credit: 3,080,498
RAC: 592
Message 37816 - Posted: 14 Mar 2007, 20:16:39 UTC - in response to Message 37813.  

Good to know. I just started RNA WUs here, and the first one was done in under an hour but shows 6 hours of computing time and has a validation error.
There goes my user average again. *sigh* But the graphics are cool!

ID: 37816 · Rating: 0
Ulrich Metzner
Joined: 17 Sep 05
Posts: 22
Credit: 255,680
RAC: 0
Message 37818 - Posted: 14 Mar 2007, 20:31:05 UTC
Last modified: 14 Mar 2007, 20:31:33 UTC

http://boinc.bakerlab.org/rosetta/result.php?resultid=67218730

This one was unusually short, with the following output:

DONE :: 1 starting structures built 30 (nstruct) times
This process generated 0 decoys from 0 attempts
greetz, Uli

ID: 37818 · Rating: 0
Greg_BE
Joined: 30 May 06
Posts: 4835
Credit: 3,080,498
RAC: 592
Message 37822 - Posted: 14 Mar 2007, 21:58:37 UTC

What's odd is that I see the same thing, but one RNA unit crashed, another went OK, and a third one is crunching now: 1 starting structure, 30 nstruct times, 0 decoys, and 0 attempts.

http://boinc.bakerlab.org/rosetta/result.php?resultid=67120285 - this is ok
http://boinc.bakerlab.org/rosetta/result.php?resultid=67185107 - validate error
http://boinc.bakerlab.org/rosetta/result.php?resultid=67244663 - this one is ok
http://boinc.bakerlab.org/rosetta/result.php?resultid=67311112 - also ok

Starting the last RNA WU. It should be done in about an hour.
ID: 37822 · Rating: 0
Mod.Sense
Volunteer moderator
Project administrator

Joined: 22 Aug 06
Posts: 3435
Credit: 0
RAC: 0
Message 37824 - Posted: 15 Mar 2007, 1:58:11 UTC - in response to Message 37822.  

What's odd is that I see the same thing, but one RNA unit crashed, another went OK, and a third one is crunching now: 1 starting structure, 30 nstruct times, 0 decoys, and 0 attempts.


...which is why Rhiju made this post and stopped putting RNA tasks on the queue for the time being.

Rosetta Moderator: Mod.Sense
ID: 37824 · Rating: 0
Ronald van Butselaar

Joined: 26 Jan 07
Posts: 2
Credit: 62,756
RAC: 0
Message 37848 - Posted: 15 Mar 2007, 12:08:07 UTC

I got 3 validate errors in the 2 days since this new version. Before that I never had problems.

http://boinc.bakerlab.org/rosetta/result.php?resultid=67516356
http://boinc.bakerlab.org/rosetta/result.php?resultid=67514510
http://boinc.bakerlab.org/rosetta/result.php?resultid=67456956

ID: 37848 · Rating: 0
Mod.Sense
Volunteer moderator
Project administrator

Joined: 22 Aug 06
Posts: 3435
Credit: 0
RAC: 0
Message 37849 - Posted: 15 Mar 2007, 12:41:11 UTC - in response to Message 37848.  

Yep, all three of those are the new RNA tasks, part of the new science released in the new version. They are examples of the validate errors that Rhiju posted he is now testing a fix for on Ralph.
Rosetta Moderator: Mod.Sense
ID: 37849 · Rating: 0
WiZMaC

Joined: 22 Jun 06
Posts: 4
Credit: 485,344
RAC: 0
Message 37852 - Posted: 15 Mar 2007, 13:22:37 UTC
Last modified: 15 Mar 2007, 13:26:52 UTC

Hi guys,

I am now getting the following errors:

http://boinc.bakerlab.org/rosetta/result.php?resultid=67722686
[error] Process 1816 not found
Restarting DOCKING_xxxxxxxxxxxxxxxxxxxxxxxxxxxxx_0 - message timeout

[error] Process 2110 not found
Restarting DOCKING_xxxxxxxxxxxxxxxxxxxxxxxxxxxxx_0 - message timeout

What to do?

ID: 37852 · Rating: 0
Rhiju
Volunteer moderator

Joined: 8 Jan 06
Posts: 223
Credit: 3,546
RAC: 0
Message 37861 - Posted: 15 Mar 2007, 18:53:53 UTC - in response to Message 37852.  

Hmm, I don't know -- I get similar output if my laptop goes to sleep and aborts the run too many times.
If that's not the case for you, it may be worth restarting BOINC.

ID: 37861 · Rating: 0
netwraith
Joined: 3 Sep 06
Posts: 80
Credit: 13,483,227
RAC: 0
Message 37882 - Posted: 16 Mar 2007, 12:29:36 UTC



I am having another issue with a WU. This one is at 32 hours on a 14-hour maximum. I am going to abort it, as it appears to be stuck (strace shows it only generating SIGALRM signals).

http://boinc.bakerlab.org/rosetta/workunit.php?wuid=60408235


Looking for a team ??? Join BoincSynergy!!


ID: 37882 · Rating: 0
hedera
Joined: 15 Jul 06
Posts: 66
Credit: 2,909,765
RAC: 1,371
Message 37898 - Posted: 17 Mar 2007, 4:14:43 UTC
Last modified: 17 Mar 2007, 4:20:05 UTC

I can't attach Rosetta at all. I brought my machine up this afternoon around 17:30 PDT and BOINC has had the little red and white circle ever since - it's now 21:14 PDT. I try to attach Rosetta, and put in my password, and nothing happens. Help?? I'd LOVE to be able to test the new graphics.

OOPS. Never mind. I've been working with Unix systems too long. I found a post in the Windows section which said, basically, kill BOINC and restart it; so I did, and it connected right up. I do have a Microsoft box at home, after all... sorry. I'll go play with the graphics now.
--hedera

Never be afraid to try something new. Remember that amateurs built the ark. Professionals built the Titanic.

ID: 37898 · Rating: 0
Thomas Leibold

Joined: 30 Jul 06
Posts: 55
Credit: 19,627,164
RAC: 0
Message 37927 - Posted: 17 Mar 2007, 18:42:42 UTC

I don't think I have seen this one before:

<core_client_version>5.4.9</core_client_version>
<message>
process exited with code 1 (0x1)
</message>
<stderr_txt>
Graphics are disabled due to configuration...
# random seed: 1057481
# cpu_run_time_pref: 28800
ERROR:: Exit at: refold.cc line:337

</stderr_txt>

This is the 5.51 Rosetta Linux client on workunit 60792243
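
When reporting an exit like this, it can help to pull out the source file and line number so that similar reports can be grouped. A minimal Python sketch follows, assuming the "ERROR:: Exit at: <file> line:<n>" format seen in the output above; the function name `find_exit_location` is hypothetical and not part of Rosetta or BOINC.

```python
import re

# Sample stderr text copied from the report above.
STDERR = """\
Graphics are disabled due to configuration...
# random seed: 1057481
# cpu_run_time_pref: 28800
ERROR:: Exit at: refold.cc line:337
"""


def find_exit_location(stderr_text):
    """Return (source_file, line_number) for the first exit message, or None."""
    match = re.search(r"ERROR:: Exit at: (\S+) line:(\d+)", stderr_text)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


location = find_exit_location(STDERR)
print(location)
```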

Team Helix
ID: 37927 · Rating: 0
Ulrich Metzner
Joined: 17 Sep 05
Posts: 22
Credit: 255,680
RAC: 0
Message 37991 - Posted: 19 Mar 2007, 2:10:24 UTC

ID: 37991 · Rating: 0
Rhiju
Volunteer moderator

Joined: 8 Jan 06
Posts: 223
Credit: 3,546
RAC: 0
Message 37999 - Posted: 19 Mar 2007, 8:36:36 UTC - in response to Message 37991.  

Hi all: Long RNA workunit times and validate errors have been fixed and you'll see the benefits during the next update -- I just need a couple more days of testing on RALPH with a new app.



Invalid: http://boinc.bakerlab.org/rosetta/result.php?resultid=67574739


ID: 37999 · Rating: 0
flying_pizza

Joined: 11 Mar 06
Posts: 1
Credit: 25,011
RAC: 0
Message 38010 - Posted: 19 Mar 2007, 14:36:55 UTC - in response to Message 37793.  

after trying to open the graphics window, my first wu (Workunit 60192061) finished with:

<core_client_version>5.8.15</core_client_version>
<![CDATA[
<message>
- exit code 1073807364 (0x40010004)
</message>
<stderr_txt>
# random seed: 1432142
No heartbeat from core client for 31 sec - exiting

</stderr_txt>
]]>

2nd is running right now and the graph worked.


I had the same with WU 60823665.
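
As a quick sanity check on the exit code in the output above, the decimal and hexadecimal forms are the same number. A minimal Python sketch (the variable names are illustrative only):

```python
# Exit code as reported in decimal in the <message> block above.
EXIT_CODE = 1073807364

# Format as an 8-digit hexadecimal value for comparison with the log.
as_hex = f"0x{EXIT_CODE:08x}"

# Parse the hex form from the log and confirm it is the same integer.
matches = int("0x40010004", 16) == EXIT_CODE

print(as_hex, matches)
```

Searching for the hex form is usually more useful than the decimal one, since system status codes tend to be documented in hex.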

ID: 38010 · Rating: 0