Minirosetta 1.90 and 1.91

Message boards : Number crunching : Minirosetta 1.90 and 1.91

j2satx
Joined: 17 Sep 05
Posts: 97
Credit: 3,670,592
RAC: 0
Message 62721 - Posted: 2 Aug 2009, 14:19:49 UTC - in response to Message 62703.  

Thanks! There was a change in that flag and I missed it. That work unit is disabled.

I got a few errors on lr5_combine_mods_run01_rlbn WUs.

https://boinc.bakerlab.org/rosetta/result.php?resultid=269713462
https://boinc.bakerlab.org/rosetta/result.php?resultid=269758962
https://boinc.bakerlab.org/rosetta/result.php?resultid=269787057
https://boinc.bakerlab.org/rosetta/result.php?resultid=269811876

They end after about 10 seconds with the error:

Native pose needed for OptionKeys::relax::constrain_relax_to_native_coords
ERROR:: Exit from: src/protocols/relax/ClassicRelax.cc line: 544
BOINC:: Error reading and gzipping output datafile: default.out



Is this one of the WUs tested on Ralph?

Mel
Joined: 21 Jul 06
Posts: 4
Credit: 3,617,489
RAC: 0
Message 62725 - Posted: 2 Aug 2009, 17:35:32 UTC - in response to Message 62652.  

This version should solve the slowdown and instant-quitting problems.

New protocol added for predictions of changes in protein stability by mutations.

-----
DEK posted an explanation on what caused all the trouble in the last week:
https://boinc.bakerlab.org/rosetta/forum_thread.php?id=5011&nowrap=true#62640

Here are some more details on what was done so far:
1. The signature problem that initially caused massive network traffic was solved on Monday.
2. The slowdown of the program was caused by two large changes in the code. One change allows Rosetta to model large, symmetric molecules (oligomers, http://en.wikipedia.org/wiki/Oligomer). The other allows modeling atomic interactions with higher definition. The bug introduced by the first change has been fixed, and the second change has been temporarily reverted pending further evaluation of its computational cost.

Unfortunately, because of the signature error and the program update, the web server will be extremely busy for the next few days, so download and upload errors are still expected.



Mel
Joined: 21 Jul 06
Posts: 4
Credit: 3,617,489
RAC: 0
Message 62726 - Posted: 2 Aug 2009, 17:41:53 UTC

I can no longer run Rosetta, as none of my 4 machines will load minirosetta_database_rev31588.zip. This all started on the 30th of July. I have been running Rosetta since 2006 and this is the first time I have seen this problem; however, it will run using version 5.10.45 ONLY. Is there a fix for this, or do I just have to run the old version?

Path7
Joined: 25 Aug 07
Posts: 128
Credit: 61,751
RAC: 0
Message 62727 - Posted: 2 Aug 2009, 18:57:32 UTC - in response to Message 62726.  

I can no longer run Rosetta, as none of my 4 machines will load minirosetta_database_rev31588.zip. This all started on the 30th of July. I have been running Rosetta since 2006 and this is the first time I have seen this problem; however, it will run using version 5.10.45 ONLY. Is there a fix for this, or do I just have to run the old version?

Hello Mel,
I had the same problem:
2-8-2009 19:50:27|rosetta@home|[error] Signature verification failed for minirosetta_database_rev31588.zip
2-8-2009 19:50:27|rosetta@home|[error] Checksum or signature error for minirosetta_database_rev31588.zip
This is on a Vista laptop with BOINC 5.10.45.
I tried many times to download a new WU without success.
Then I decided to reset the project, and the download proceeded successfully.
I'm not sure whether I was just lucky or the reset did the trick.

Good luck,
Path7.

Sid Celery
Joined: 11 Feb 08
Posts: 1990
Credit: 38,522,839
RAC: 15,277
Message 62728 - Posted: 2 Aug 2009, 19:19:29 UTC - in response to Message 62726.  
Last modified: 2 Aug 2009, 19:24:44 UTC

I can no longer run Rosetta, as none of my 4 machines will load minirosetta_database_rev31588.zip. This all started on the 30th of July. I have been running Rosetta since 2006 and this is the first time I have seen this problem; however, it will run using version 5.10.45 ONLY. Is there a fix for this, or do I just have to run the old version?

I wasn't aware this was happening any more.

Looking at your computers:
Computer ID 1116016 has failed to download this file using BOINC 6.2.14.
Computer ID 1116076 looks OK using BOINC 5.10.45, except for those WUs reporting errors as above:
Native pose needed for OptionKeys::relax::constrain_relax_to_native_coords
ERROR:: Exit from: src/protocols/relax/ClassicRelax.cc line: 544
BOINC:: Error reading and gzipping output datafile: default.out

And
ERROR: Option matching -in:detect_disulfides not found in command line top-level context

Computer ID 1116054 has downloaded 56 files but hasn't reported anything back yet.
Computer ID 1114788 has downloaded 1 file but hasn't reported anything back yet (running XP Pro SP2 - others all run SP3).

On the machines that don't work, let your failed jobs return, upgrade to 6.4.7, which seems to be the most stable released version, and try again. If that doesn't work, you may have to go back to 5.10.45. There don't seem to be many differences between your machines, so it is odd that some work and some don't.

EDIT: Also, why do you use just a 1-hour run time? It seems like an awful lot of to-ing and fro-ing when you could get much the same results with less impact on the servers here. Is it an indication of previous problems?

Alberthuang
Joined: 5 Dec 05
Posts: 6
Credit: 171,257
RAC: 0
Message 62735 - Posted: 3 Aug 2009, 4:16:38 UTC - in response to Message 62727.  
Last modified: 3 Aug 2009, 4:27:16 UTC

I can no longer run Rosetta, as none of my 4 machines will load minirosetta_database_rev31588.zip. This all started on the 30th of July. I have been running Rosetta since 2006 and this is the first time I have seen this problem; however, it will run using version 5.10.45 ONLY. Is there a fix for this, or do I just have to run the old version?

Hello Mel,
I had the same problem:
2-8-2009 19:50:27|rosetta@home|[error] Signature verification failed for minirosetta_database_rev31588.zip
2-8-2009 19:50:27|rosetta@home|[error] Checksum or signature error for minirosetta_database_rev31588.zip
This is on a Vista laptop with BOINC 5.10.45.
I tried many times to download a new WU without success.
Then I decided to reset the project, and the download proceeded successfully.
I'm not sure whether I was just lucky or the reset did the trick.

Good luck,
Path7.


I switched to another computer at home more than a month ago because the previous one would not turn on due to a mainboard problem. My current computer has the same problem too! Its CPU is a Pentium 4 3.06GHz with Hyper-Threading, and it has 1GB of DDR2 SDRAM. The OS is Windows XP SP3, and the BOINC manager version is 6.6.36. Today my computer downloaded 2 minirosetta version 1.90 workunits (Task IDs 270050079 and 270139479), and the BOINC manager showed the same download error message you mentioned. The task details show the following message:

<core_client_version>6.6.36</core_client_version>
<![CDATA[
<message>
app_version download error: couldn't get input files:
<file_xfer_error>
<file_name>minirosetta_database_rev31588.zip</file_name>
<error_code>-120</error_code>
<error_message>signature verification failed</error_message>
</file_xfer_error>

</message>
]]>

I have reset the Rosetta@home project with the BOINC manager on this computer, but the error still happened. So in the end I decided to abort both workunits and stop my computer from downloading and running Rosetta@home workunits until this problem is completely solved.
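
For anyone collecting reports like this one, the error record in the task details can be pulled apart with a few lines of Python. This is only a sketch: the helper name `parse_file_xfer_errors` is made up for illustration, and a regular expression is used instead of an XML parser on the assumption that the task output is a fragment rather than a single well-formed document.

```python
import re

# Sample task output, copied from the task details quoted in this post.
TASK_OUTPUT = """<core_client_version>6.6.36</core_client_version>
<![CDATA[
<message>
app_version download error: couldn't get input files:
<file_xfer_error>
<file_name>minirosetta_database_rev31588.zip</file_name>
<error_code>-120</error_code>
<error_message>signature verification failed</error_message>
</file_xfer_error>

</message>
]]>
"""

# One <file_xfer_error> block; DOTALL lets "." span newlines.
_BLOCK = re.compile(r"<file_xfer_error>(.*?)</file_xfer_error>", re.DOTALL)
# A simple <tag>value</tag> pair inside a block.
_FIELD = re.compile(r"<(\w+)>(.*?)</\1>", re.DOTALL)


def parse_file_xfer_errors(text):
    """Return one dict per <file_xfer_error> block in text (hypothetical helper)."""
    errors = []
    for block in _BLOCK.findall(text):
        fields = {tag: value.strip() for tag, value in _FIELD.findall(block)}
        # The error code is numeric in the sample, so convert it when present.
        if "error_code" in fields:
            fields["error_code"] = int(fields["error_code"])
        errors.append(fields)
    return errors


if __name__ == "__main__":
    for err in parse_file_xfer_errors(TASK_OUTPUT):
        print(err["file_name"], err["error_code"], err["error_message"])
```

Grouping the results by `file_name` across several failed tasks would show at a glance whether every failure points at the same download.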



P . P . L .
Joined: 20 Aug 06
Posts: 581
Credit: 4,865,274
RAC: 0
Message 62737 - Posted: 3 Aug 2009, 4:45:14 UTC

Hi.

Not really an error, I guess, but has the limit of 99 been taken off tasks? I saw this on one of mine:

======================================================
DONE :: 109 starting structures 14284.6 cpu seconds
This process generated 109 decoys from 109 attempts
======================================================
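
As a quick sanity check on that summary, the average CPU time per structure can be worked out from the two numbers in the DONE line. The sketch below assumes the line always has the shape shown here; the helper name `seconds_per_structure` is hypothetical.

```python
import re

# The summary line as it appears in the task output above.
DONE_LINE = "DONE :: 109 starting structures 14284.6 cpu seconds"

# Capture the structure count and the total CPU seconds.
_DONE = re.compile(
    r"DONE\s*::\s*(\d+)\s+starting structures\s+([\d.]+)\s+cpu seconds"
)


def seconds_per_structure(line):
    """Return (count, cpu_seconds, average) for a DONE line (hypothetical helper)."""
    match = _DONE.search(line)
    if match is None:
        raise ValueError("not a DONE summary line")
    count = int(match.group(1))
    cpu_seconds = float(match.group(2))
    return count, cpu_seconds, cpu_seconds / count


if __name__ == "__main__":
    count, cpu_seconds, average = seconds_per_structure(DONE_LINE)
    print(f"{count} structures in {cpu_seconds} CPU seconds, {average:.1f} s each")
```

For this task that works out to roughly 131 CPU seconds per structure, so passing 99 structures in about four hours of CPU time is plausible if the cap is no longer applied.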



TimL
Joined: 16 Sep 06
Posts: 17
Credit: 15,480,956
RAC: 0
Message 62739 - Posted: 3 Aug 2009, 9:55:21 UTC

looprebuild_t374_nat_rlx_A_12863_3605_1 failed with error
ERROR: Option matching -in:detect_disulfides not found in command line top-level context



Greg_BE
Joined: 30 May 06
Posts: 5664
Credit: 5,710,284
RAC: 2,004
Message 62741 - Posted: 3 Aug 2009, 12:09:07 UTC

6 failed tasks, all with the same error message:

Native pose needed for OptionKeys::relax::constrain_relax_to_native_coords
ERROR:: Exit from: ..\..\src\protocols\relax\ClassicRelax.cc line: 544
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish


lr5_combine_mods_run01_rlbn_1tit_IGNORE_THE_REST_NATIVE_14608_97_0
lr5_combine_mods_run01_rlbn_1tul_IGNORE_THE_REST_NATIVE_14608_97_0
lr5_combine_mods_run01_rlbn_1ubi_IGNORE_THE_REST_NATIVE_14608_97_0
lr5_combine_mods_run01_rlbn_1ugh_IGNORE_THE_REST_NATIVE_14608_97_0
lr5_combine_mods_run01_rlbn_1unp_IGNORE_THE_REST_NATIVE_14608_97_0
lr5_combine_mods_run01_rlbn_1unr_IGNORE_THE_REST_NATIVE_14608_97_0

Sid Celery
Joined: 11 Feb 08
Posts: 1990
Credit: 38,522,839
RAC: 15,277
Message 62745 - Posted: 3 Aug 2009, 15:04:06 UTC
Last modified: 3 Aug 2009, 15:06:41 UTC

More of the same:

Incorrect function. (0x1) - exit code 1 (0x1)
lr5_combine_mods_run01_rlbn_1ttz_IGNORE_THE_REST_NATIVE_14608_21_1

Native pose needed for OptionKeys::relax::constrain_relax_to_native_coords
ERROR:: Exit from: ..\..\src\protocols\relax\ClassicRelax.cc line: 544
BOINC:: Error reading and gzipping output datafile: default.out
lr5_combine_mods_run01_rlbn_1ew4_IGNORE_THE_REST_NATIVE_14608_26_0
lr5_combine_mods_run01_rlbn_2ci2_IGNORE_THE_REST_NATIVE_14608_42_1
lr5_combine_mods_run01_rlbn_2hl7_IGNORE_THE_REST_NATIVE_14608_141_0

Is there a range of WUs we can abort before they even start?
A pattern's forming but I don't want to assume anything.

Greg_BE
Joined: 30 May 06
Posts: 5664
Credit: 5,710,284
RAC: 2,004
Message 62756 - Posted: 3 Aug 2009, 22:22:25 UTC - in response to Message 62741.  

Make that 7 now: lr5_combine_mods_run01_rlbn_1pxu_IGNORE_THE_REST_NATIVE_14608_4_1

Run time is between 0 and 10 seconds on these errors.
Someone please check what is going on; you're killing my RAC.


6 failed tasks, all with the same error message:

Native pose needed for OptionKeys::relax::constrain_relax_to_native_coords
ERROR:: Exit from: ..\..\src\protocols\relax\ClassicRelax.cc line: 544
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish


lr5_combine_mods_run01_rlbn_1tit_IGNORE_THE_REST_NATIVE_14608_97_0
lr5_combine_mods_run01_rlbn_1tul_IGNORE_THE_REST_NATIVE_14608_97_0
lr5_combine_mods_run01_rlbn_1ubi_IGNORE_THE_REST_NATIVE_14608_97_0
lr5_combine_mods_run01_rlbn_1ugh_IGNORE_THE_REST_NATIVE_14608_97_0
lr5_combine_mods_run01_rlbn_1unp_IGNORE_THE_REST_NATIVE_14608_97_0
lr5_combine_mods_run01_rlbn_1unr_IGNORE_THE_REST_NATIVE_14608_97_0


Sid Celery
Joined: 11 Feb 08
Posts: 1990
Credit: 38,522,839
RAC: 15,277
Message 62760 - Posted: 4 Aug 2009, 6:40:23 UTC - in response to Message 62756.  

Run time is between 0 and 10 seconds on these errors.
Someone please check what is going on; you're killing my RAC.

How can they be killing your RAC if they only run 10 seconds each?

Granted credit is definitely low compared to claimed credit in the current batches of WUs but the failing jobs have nothing to do with that.

Rabenherz85
Joined: 25 Jun 09
Posts: 3
Credit: 9,089
RAC: 0
Message 62761 - Posted: 4 Aug 2009, 7:28:14 UTC

Here are 4 CPUs waiting for work... but Rosetta doesn't download anything and just requests tasks for GPU?

Greg_BE
Joined: 30 May 06
Posts: 5664
Credit: 5,710,284
RAC: 2,004
Message 62766 - Posted: 4 Aug 2009, 17:14:59 UTC - in response to Message 62760.  
Last modified: 4 Aug 2009, 17:16:49 UTC

Run time is between 0 and 10 seconds on these errors.
Someone please check what is going on; you're killing my RAC.

How can they be killing your RAC if they only run 10 seconds each?

Granted credit is definitely low compared to claimed credit in the current batches of WUs but the failing jobs have nothing to do with that.



Well, in general the RAC was diving like a dive bomber, but now I see it is going the other way, going up like a rocket. So it must have been all this server trouble and who knows what else that was causing it to drop so badly. From the graph it looks like it has climbed 40 points in the last 3 days. So never mind my statement about killed RAC. It also looks like I finally got out of that bad batch of tasks.

TimL
Joined: 16 Sep 06
Posts: 17
Credit: 15,480,956
RAC: 0
Message 62767 - Posted: 4 Aug 2009, 20:30:12 UTC

lr5_combine_mods_run01_rlbn_1wdv_IGNORE_THE_REST_NATIVE_14608_112_1 failed with:

ERROR:: Exit from: ..\..\src\protocols\relax\ClassicRelax.cc line: 544
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish


Greg_BE
Joined: 30 May 06
Posts: 5664
Credit: 5,710,284
RAC: 2,004
Message 62769 - Posted: 4 Aug 2009, 21:11:10 UTC - in response to Message 62767.  
Last modified: 4 Aug 2009, 21:13:45 UTC

Oops, duplicate of the one above.

Greg_BE
Joined: 30 May 06
Posts: 5664
Credit: 5,710,284
RAC: 2,004
Message 62770 - Posted: 4 Aug 2009, 21:12:39 UTC - in response to Message 62767.  

lr5_combine_mods_run01_rlbn_1wdv_IGNORE_THE_REST_NATIVE_14608_112_1 failed with:

ERROR:: Exit from: ..\..\src\protocols\relax\ClassicRelax.cc line: 544
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish



Tim, watch out on those run01 tasks.
I had that same error in most if not all of them.
They will die within 10 secs. if that error shows up.
Will you list the ones that die here? It would be nice to see how many bugged out on your system.

Sid Celery
Joined: 11 Feb 08
Posts: 1990
Credit: 38,522,839
RAC: 15,277
Message 62771 - Posted: 4 Aug 2009, 21:30:11 UTC - in response to Message 62770.  

Tim, watch out on those run01 tasks.
I had that same error in most if not all of them.
They will die within 10 secs. if that error shows up.
Will you list the ones that die here? It would be nice to see how many bugged out on your system.

Not quite true.

The ones that start "lr5_combine_mods_run01_rlbn_" have been problematic but
I'm currently running one starting "lr5_combine_mods_run01_rlbd_" and it's going just fine.
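
The naming pattern described here can be turned into a simple check for sorting a task list. The sketch below is only an illustration: the `rlbn` prefix comes from the failures reported in this thread, the helper name `is_suspect` is made up, the `rlbd` task name is a hypothetical placeholder, and nothing here talks to BOINC itself.

```python
# Prefix shared by the work units reported as failing in this thread.
SUSPECT_PREFIX = "lr5_combine_mods_run01_rlbn_"


def is_suspect(task_name):
    """True if the task name starts with the failing prefix (hypothetical helper)."""
    return task_name.startswith(SUSPECT_PREFIX)


# Names taken from the posts in this thread, except where noted.
TASKS = [
    "lr5_combine_mods_run01_rlbn_1tit_IGNORE_THE_REST_NATIVE_14608_97_0",
    "lr5_combine_mods_run01_rlbn_1wdv_IGNORE_THE_REST_NATIVE_14608_112_1",
    "looprebuild_t374_nat_rlx_A_12863_3605_1",
    "lr5_combine_mods_run01_rlbd_EXAMPLE",  # hypothetical name, not from the thread
]

if __name__ == "__main__":
    for name in TASKS:
        label = "suspect" if is_suspect(name) else "ok"
        print(f"{label:8} {name}")
```

Matching on the full prefix rather than on `run01` alone keeps the `rlbd` tasks out of the suspect list, which is the distinction being made in this post.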

Greg_BE
Joined: 30 May 06
Posts: 5664
Credit: 5,710,284
RAC: 2,004
Message 62772 - Posted: 5 Aug 2009, 1:25:09 UTC - in response to Message 62771.  

Tim, watch out on those run01 tasks.
I had that same error in most if not all of them.
They will die within 10 secs. if that error shows up.
Will you list the ones that die here? It would be nice to see how many bugged out on your system.

Not quite true.

The ones that start "lr5_combine_mods_run01_rlbn_" have been problematic but
I'm currently running one starting "lr5_combine_mods_run01_rlbd_" and it's going just fine.


Thanks for the clarification.

dgnuff
Joined: 1 Nov 05
Posts: 350
Credit: 24,773,605
RAC: 0
Message 62773 - Posted: 5 Aug 2009, 3:41:43 UTC

Agreed - I had this guy error out on one of my boxes:

resultid 269800139

Name matches the format noted by Sid, error matches the one posted by Tim.