Message boards : Number crunching : MiniRosetta 3.17 Problems.
Author | Message |
---|---|
P . P . L . Joined: 20 Aug 06 Posts: 581 Credit: 4,865,274 RAC: 0 |
Hi. I've had two different types of tasks error out; the same types ran on this rig before under the 3.14 app without erroring. https://boinc.bakerlab.org/rosetta/workunit.php?wuid=418800096 place_CE_20110919_EBOV_GP_2d1v_ProteinInterfaceDesign_31440_359_0 <core_client_version>6.10.58</core_client_version> <![CDATA[ <message> process exited with code 1 (0x1, -255) </message> ERROR: drSOP ERROR:: Exit from: src/protocols/protein_interface_design/movers/PlaceStubMover.cc line: 1063 BOINC:: Error reading and gzipping output datafile: default.out called boinc_finish </stderr_txt> ================================================================================= https://boinc.bakerlab.org/rosetta/workunit.php?wuid=418800129 3filtr5A_CYpa_2aak_ProteinInterfaceDesign_23Aug2011_30588_1098_0 <core_client_version>6.10.58</core_client_version> <![CDATA[ <message> process exited with code 1 (0x1, -255) </message> ERROR: drSOP ERROR:: Exit from: src/protocols/protein_interface_design/movers/PlaceStubMover.cc line: 1063 BOINC:: Error reading and gzipping output datafile: default.out called boinc_finish </stderr_txt> |
cmiles Joined: 4 Jan 11 Posts: 9 Credit: 0 RAC: 0 |
Thank you for reporting this problem. We're taking a look at it now. |
Shawn Volunteer moderator Project developer Project scientist Joined: 22 Jan 10 Posts: 17 Credit: 53,741 RAC: 0 |
Thanks for letting us know. As you are probably aware, we recently changed our version of Rosetta@home. These current jobs are associated with protocols written for an older version. I did not notice any compatibility problems at the time, but I will do some more testing on these jobs to find out why they didn't work. |
Shawn Volunteer moderator Project developer Project scientist Joined: 22 Jan 10 Posts: 17 Credit: 53,741 RAC: 0 |
Thanks for letting us know. I think we've identified the problem, and the ProteinInterfaceDesign team is now aware of the issue. Thanks once again for your time, your computational resources, and your feedback! |
Chilean Joined: 16 Oct 05 Posts: 711 Credit: 26,694,507 RAC: 0 |
See, if you guys would post a summary of this problem on the front page... it'd have a profound effect on users. They'd see that the Rosetta team is working... etc. Same goes for when the server goes down. Say: "Hey, someone unplugged the servers during last night's party. We'll get that fixed as soon as possible." Something along those lines would be great for people trying to find out what's going on. Just my humble advice. |
pieface Joined: 20 Sep 05 Posts: 17 Credit: 797,661 RAC: 0 |
I really don't mind the small things like the DrSOP problem; they tie up some resources for download and then upload, but I don't get charged extra for that. But during the same timeframe I also had something like a dozen ProteinInterfaceDesign and Ploop2x3 tasks run to their full allotted time (6 hrs or so, depending on how the watchdog was feeling), and then when the validator finally caught up they were marked as invalid. I had some of these on both machines I had crunching Rosetta: one is a Win XP x64 system and the other a Win7 box, no overclocking at all. Here are a couple of examples. Any ideas, or did anyone else get those kinds of results in this last batch? Ploop2x3 Ploop2x3 PID Note: edited to take out 'over the weekend'. |
.clair. Joined: 2 Jan 07 Posts: 274 Credit: 26,399,595 RAC: 0 |
Yup, Doctor SOP has a problem :¬) It is unusual for me to get errors, and I have now got four errors with that in the output. https://boinc.bakerlab.org/rosetta/result.php?resultid=458811930 https://boinc.bakerlab.org/rosetta/result.php?resultid=458960603 https://boinc.bakerlab.org/rosetta/result.php?resultid=459144123 https://boinc.bakerlab.org/rosetta/result.php?resultid=459297330 |
cmiles Joined: 4 Jan 11 Posts: 9 Credit: 0 RAC: 0 |
@pieface, @clive_G1FYE The errors you're both experiencing are due to a name change in a protocol commonly used by protein designers. These same work units executed without error on the previous version of Rosetta@Home. However, the name change was not done in a backwards-compatible way. We're putting a system in place to prevent these kinds of errors from happening in the future. Thank you both for reporting these problems. |
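One common way to make a rename backwards compatible is to keep the old name as an alias for the new one, so job files written for the previous version still resolve. The sketch below illustrates that idea only; the thread does not say which protocol was renamed or how Rosetta looks names up, so `OldMoverName`, `NewMoverName`, `REGISTRY`, `ALIASES` and `resolve` are all hypothetical.

```python
# Hypothetical names throughout: the thread does not identify the renamed
# protocol, and this is not Rosetta's actual lookup code.
REGISTRY = {"NewMoverName": "implementation of the renamed protocol"}

# Old names kept as aliases so jobs written for the older version still work.
ALIASES = {"OldMoverName": "NewMoverName"}


def resolve(name):
    """Map a protocol name from a job file to its current canonical name."""
    canonical = ALIASES.get(name, name)
    if canonical not in REGISTRY:
        raise KeyError(f"unknown protocol name: {name}")
    return canonical


print(resolve("OldMoverName"))  # old job files keep working
print(resolve("NewMoverName"))  # new job files use the new name directly
```

With an alias table like this, an unknown name still fails loudly, but a renamed one does not.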
Trotador Joined: 30 May 09 Posts: 108 Credit: 291,214,977 RAC: 1 |
lots of errors, stop downloading units https://boinc.bakerlab.org/rosetta/result.php?resultid=459660390 https://boinc.bakerlab.org/rosetta/result.php?resultid=459660074 https://boinc.bakerlab.org/rosetta/result.php?resultid=459660070 https://boinc.bakerlab.org/rosetta/result.php?resultid=459635613 https://boinc.bakerlab.org/rosetta/result.php?resultid=459658860 |
Trotador Joined: 30 May 09 Posts: 108 Credit: 291,214,977 RAC: 1 |
More info: T0... units seem OK; ab_07_19... crashing, all of them; 2stubs... crash; place_CE_... crash; rlx_jsr... OK |
Mad_Max Joined: 31 Dec 09 Posts: 209 Credit: 25,965,700 RAC: 14,164 |
All my WUs with names starting with 3filtr5A_CYpa_ have compute errors too: https://boinc.bakerlab.org/rosetta/result.php?resultid=458277890 https://boinc.bakerlab.org/rosetta/result.php?resultid=459075255 https://boinc.bakerlab.org/rosetta/result.php?resultid=459077432 https://boinc.bakerlab.org/rosetta/result.php?resultid=459085059 https://boinc.bakerlab.org/rosetta/result.php?resultid=459095484 And all WUs with names starting with ploop2x3_design_ end with validate errors: https://boinc.bakerlab.org/rosetta/result.php?resultid=458647710 https://boinc.bakerlab.org/rosetta/result.php?resultid=458839191 https://boinc.bakerlab.org/rosetta/result.php?resultid=459110460 |
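For anyone wanting to check their own task list the same way, here is a minimal sketch that groups results by work unit name prefix and counts outcomes per batch. The prefixes and the three sample names are taken from posts in this thread; the matching rule, the outcome strings, and the helper names `batch_of` and `tally` are assumptions made for illustration.

```python
from collections import Counter

# Batch prefixes mentioned in this thread. Matching on a case-insensitive
# name prefix is an assumption, not an official naming rule.
PREFIXES = ["3filtr5A_CYpa_", "ploop2x3_design_", "ab_07_19_", "place_CE_"]


def batch_of(wu_name):
    """Return the known prefix a work unit name starts with, else 'other'."""
    lowered = wu_name.lower()
    for prefix in PREFIXES:
        if lowered.startswith(prefix.lower()):
            return prefix
    return "other"


def tally(results):
    """Count (batch, outcome) pairs from an iterable of (wu_name, outcome)."""
    return Counter((batch_of(name), outcome) for name, outcome in results)


# Work unit names quoted in this thread, all reported as compute errors.
sample = [
    ("3filtr5A_CYpa_2aak_ProteinInterfaceDesign_23Aug2011_30588_1098_0", "compute error"),
    ("place_CE_20110919_EBOV_GP_2d1v_ProteinInterfaceDesign_31440_359_0", "compute error"),
    ("ab_07_19_1fnaA_filtnr_IGNORE_THE_REST_06_08_28682_52_1", "compute error"),
]

for (batch, outcome), count in sorted(tally(sample).items()):
    print(batch, outcome, count)
```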
TJ Joined: 29 Mar 09 Posts: 127 Credit: 4,799,890 RAC: 0 |
All my WUs error out very soon. I got these error messages: ERROR: [ERROR] invalid header input for kill_hairpins file. ERROR:: Exit from: ..\..\..\src\core\scoring\SS_Killhairpins_Info.cc line: 370 BOINC:: Error reading and gzipping output datafile: default.out called boinc_finish Greetings, TJ. |
.clair. Joined: 2 Jan 07 Posts: 274 Credit: 26,399,595 RAC: 0 |
Yup, I got some dead hairpin files as well in the ab_07_19_ series. The things you have to do to a protein to make it behave :¬) https://boinc.bakerlab.org/rosetta/result.php?resultid=459619880 https://boinc.bakerlab.org/rosetta/result.php?resultid=459639244 |
P . P . L . Joined: 20 Aug 06 Posts: 581 Credit: 4,865,274 RAC: 0 |
Some more errors, on a different type of task; the others I've had have been running OK apart from these. https://boinc.bakerlab.org/rosetta/workunit.php?wuid=401795240 ab_07_19_1fnaA_filtnr_IGNORE_THE_REST_06_08_28682_52_1 <core_client_version>6.10.58</core_client_version> <![CDATA[ <message> process exited with code 1 (0x1, -255) </message> Starting work on structure: _00001 ERROR: [ERROR] invalid header input for kill_hairpins file. ERROR:: Exit from: src/core/scoring/SS_Killhairpins_Info.cc line: 370 BOINC:: Error reading and gzipping output datafile: default.out called boinc_finish Watchdog active. </stderr_txt> ================================================================================== https://boinc.bakerlab.org/rosetta/workunit.php?wuid=401801710 ab_07_19_1acfA_control_IGNORE_THE_REST_03_07_28679_51_0 <core_client_version>6.10.58</core_client_version> <![CDATA[ <message> process exited with code 1 (0x1, -255) </message> Starting work on structure: _00001 ERROR: [ERROR] invalid header input for kill_hairpins file. ERROR:: Exit from: src/core/scoring/SS_Killhairpins_Info.cc line: 370 BOINC:: Error reading and gzipping output datafile: default.out called boinc_finish </stderr_txt> |
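The stderr dumps quoted in this thread all carry an `ERROR:: Exit from: <file> line: <n>` entry, which is the quickest way to tell the failure types apart. A minimal sketch of pulling that out of a saved stderr text follows; the regular expression is inferred only from the output quoted here, and the helper name `exit_location` is made up for this example.

```python
import re

# Pattern inferred from the stderr quoted in this thread; other Rosetta
# builds may format this line differently.
EXIT_RE = re.compile(r"ERROR:: Exit from: (?P<file>\S+) line: (?P<line>\d+)")


def exit_location(stderr_text):
    """Return (source_file, line_number) from a stderr dump, or None."""
    match = EXIT_RE.search(stderr_text)
    if match is None:
        return None
    return match.group("file"), int(match.group("line"))


sample = (
    "Starting work on structure: _00001 "
    "ERROR: [ERROR] invalid header input for kill_hairpins file. "
    "ERROR:: Exit from: src/core/scoring/SS_Killhairpins_Info.cc line: 370 "
    "BOINC:: Error reading and gzipping output datafile: default.out "
    "called boinc_finish"
)

print(exit_location(sample))
# → ('src/core/scoring/SS_Killhairpins_Info.cc', 370)
```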
cmiles Joined: 4 Jan 11 Posts: 9 Credit: 0 RAC: 0 |
The offending jobs have been removed. Rosetta is a large and diverse project. Unlike more focused efforts such as SETI@Home, the breadth of compute tasks being performed on Rosetta@Home is incredible. While offering enormous flexibility, this greatly complicates testing and validation. Unfortunately, some bad jobs slipped in this time. In many cases, Rosetta@Home users such as myself find out about failing jobs when you do, and we're just as frustrated when such jobs are distributed. Thank you for your continued support. |
TPCBF Joined: 29 Nov 10 Posts: 111 Credit: 5,081,932 RAC: 1,845 |
cmiles wrote: "The offending jobs have been removed." But why the *snap* can no sysadmin post some proper info about this in a timely fashion? It's just a matter of simple communication; it doesn't even cost much time. :-( Ralf |
Snags Joined: 22 Feb 07 Posts: 198 Credit: 2,888,320 RAC: 0 |
cmiles wrote: "The offending jobs have been removed." Why isn't ralph being used to catch these errors? All workunits I've received from ralph recently have been using app version 3.14. Best, Snags |
TPCBF Joined: 29 Nov 10 Posts: 111 Credit: 5,081,932 RAC: 1,845 |
Snags wrote: "Why isn't ralph being used to catch these errors? All workunits I've received from ralph recently have been using app version 3.14." Yeah, what RALPH@Home has been doing recently is a bit odd. Several times I got swamped with sets of 20 WUs at a time, with a mix of applications labeled both as "Rosetta Mini Beta 3.17" (currently 2 awaiting their turn) and as "Rosetta Mini 3.14" (another 20 WUs piled up to be processed eventually). Ralf |
cmiles Joined: 4 Jan 11 Posts: 9 Credit: 0 RAC: 0 |
RALPH has separate executables for minirosetta (current version of Rosetta@Home) and minirosetta_beta (next version of Rosetta@Home). At the moment, the two applications are identical, despite their different version numbers: minirosetta => 3.18, minirosetta_beta => 3.17. During the update process, the two versions will diverge. The idea behind this is to always have a running version of the software currently deployed on Rosetta@Home available for testing. |
TPCBF Joined: 29 Nov 10 Posts: 111 Credit: 5,081,932 RAC: 1,845 |
cmiles wrote: "RALPH has separate executables for minirosetta (current version of Rosetta@Home) and minirosetta_beta (next version of Rosetta@Home). At the moment, the two applications are identical, despite their different version numbers." And are you sure that everyone's on the same page here? :? Ralf |
©2024 University of Washington
https://www.bakerlab.org