Message boards : Number crunching : Minirosetta v1.45 bug thread
Author | Message |
---|---|
DaveSun Joined: 3 May 07 Posts: 5 Credit: 200,480 RAC: 0 |
Got a validation error on score12_rlbd_1gvp_IGNORE_THE_REST_DECOY_5473_170. Any indication as to what may have caused this? The task ran for the full time, so there was no indication of a problem on my end. |
Joined: 16 Jun 08 Posts: 1221 Credit: 13,567,355 RAC: 1,235 |
Quote: "Got a validation error on score12_rlbd_1gvp_IGNORE_THE_REST_DECOY_5473_170. Any indication as to what may have caused this?" I don't know, but I noticed that your wingman on that workunit seems to have chosen a shorter workunit size, and therefore shut down before reaching whatever caused the problem. Also, I've noticed that choosing a preferred workunit length above 10 hours seems to get me more problematic workunits, so if you get such problems often, you might want to try reducing your preferred workunit size. |
Joined: 16 Jun 08 Posts: 1221 Credit: 13,567,355 RAC: 1,235 |
Quote: "Got a validation error on score12_rlbd_1gvp_IGNORE_THE_REST_DECOY_5473_170. Any indication as to what may have caused this?" I took another look at your results and noticed that the task returned 596 decoys. I don't think I've seen a workunit before that returned a three-digit number of decoys, so perhaps someone should check whether both minirosetta 1.45 and the workunit validation software can handle that many decoys for one workunit correctly. |
A Few Good Men Joined: 25 Mar 07 Posts: 14 Credit: 2,031,382 RAC: 0 |
The last 24 hours have produced this error on 5 WUs: Server state: Over; Outcome: Client error; Client state: Compute error; Exit status: -226 (0xffffff1e); Computer ID: 963376; Report deadline: 22 Dec 2008 1:30:07 UTC; CPU time: 21570.15; stderr out: <core_client_version>6.2.19</core_client_version> <![CDATA[ <message> too many exit(0)s </message> <stderr_txt> # cpu_run_time_pref: 86400 # cpu_run_time_pref: 86400 # cpu_run_time_pref: 86400 # cpu_run_time_pref: 86400 # cpu_run_time_pref: 86400 Can't acquire lockfile - exiting Can't acquire lockfile - exiting |
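The exit status in the report above is printed twice, once as a signed decimal and once in hexadecimal. As a minimal sketch (Python; the helper names `to_unsigned32` and `to_signed32` are hypothetical and not part of BOINC), the two forms appear to be the same 32-bit value read under two's complement:

```python
def to_unsigned32(code: int) -> int:
    """Reinterpret a signed 32-bit exit code as an unsigned 32-bit value."""
    return code & 0xFFFFFFFF


def to_signed32(value: int) -> int:
    """Reinterpret an unsigned 32-bit value as a signed 32-bit exit code."""
    # If the top bit is set, the value represents a negative number.
    return value - 0x100000000 if value & 0x80000000 else value


# The decimal and hexadecimal forms from the report above.
print(hex(to_unsigned32(-226)))
print(to_signed32(0xFFFFFF1E))
```

This only shows that -226 and 0xffffff1e are one number written two ways; it says nothing about what caused the error.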
DaveSun Joined: 3 May 07 Posts: 5 Credit: 200,480 RAC: 0 |
Quote: "Got a validation error on score12_rlbd_1gvp_IGNORE_THE_REST_DECOY_5473_170. Any indication as to what may have caused this?" I've been running at this setting for several months without any major trouble, and have had several tasks that returned triple-digit decoys. I set up to run 1 day after running at less than 10 hours for a long time and having units run for what seemed like forever. This way I've not had any tasks run over my preference, and it works well for my setup. I just don't remember a task that ran to completion here and failed to validate before this one. |
Joined: 16 Jun 08 Posts: 1221 Credit: 13,567,355 RAC: 1,235 |
Quote: "Got a validation error on score12_rlbd_1gvp_IGNORE_THE_REST_DECOY_5473_170. Any indication as to what may have caused this?" Then perhaps the limit handled successfully is higher than 99 decoys per workunit, but not as high as 596. |
svincent Joined: 30 Dec 05 Posts: 219 Credit: 11,794,805 RAC: 0 |
Assertion failure in Task 213968874 (abinitio_abrelax_nohomfrag_129_B_1qgvA_5483_146_0), Workunit 195032150, Mac OS X 10.4.11. Failed after 30 seconds. <core_client_version>6.2.18</core_client_version> <![CDATA[ <message> process exited with code 1 (0x1, -255) </message> <stderr_txt> ERROR: Assertion failure: assert( ( begin + size - 1 ) <= pose.total_residue() ); ERROR:: Exit from: src/protocols/abinitio/FragmentMover.cc line: 110 called boinc_finish # cpu_run_time_pref: 14400 </stderr_txt> ]]> |
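The failed assertion in the stderr output above is a bounds check. The following Python sketch restates the same condition outside the application, assuming residues are numbered from 1 and that `begin` and `size` are a fragment's start position and length; the function name `fragment_fits` and the example numbers are hypothetical, not taken from FragmentMover.cc:

```python
def fragment_fits(begin: int, size: int, total_residue: int) -> bool:
    """Return True when a fragment of `size` residues starting at the
    1-based position `begin` ends at or before the last residue."""
    last_position = begin + size - 1
    return last_position <= total_residue


# Hypothetical numbers: a 9-residue fragment on a 100-residue chain.
print(fragment_fits(92, 9, 100))  # last position is 100, inside the chain
print(fragment_fits(93, 9, 100))  # last position is 101, past the end
```

Under these assumptions, the assertion fires whenever a fragment would extend past the end of the chain, which is consistent with input files that do not match the protein being modelled.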
Nothing But Idle Time Joined: 28 Sep 05 Posts: 209 Credit: 139,545 RAC: 0 |
When I encountered two cs_vanilla compute errors in a row, I set Rosetta to NNW (No New Work). That was 4 days ago, and it will remain so until the software is fixed and the fix is announced here. It behooves the project team to fix these errors ASAP rather than wait until this thread (like its predecessors) is cluttered with hundreds of posts reporting the same stuff. I do not understand this counter-productive behavior. |
Volunteer moderator Project administrator Project developer Project scientist Joined: 1 Jul 05 Posts: 1018 Credit: 4,334,829 RAC: 0 |
We are definitely working on it and will likely have an update within a few days, after testing on RALPH. |
Mike Tyka Joined: 20 Oct 05 Posts: 96 Credit: 2,190 RAC: 0 |
Quote: "Assertion failure in Task 213968874 (abinitio_abrelax_nohomfrag_129_B_1qgvA_5483_146_0)" Apologies for this. I screwed up the submit for two proteins: 1qgv and 1t2j. I tried to remove the jobs as soon as I noticed, but around 200 WUs went out anyway. If you get a WU with either of those two protein tags, please abort it! For the cs_vanilla jobs, a fix is going out onto RALPH@home right now. If you get cs_vanilla jobs, also feel free to abort them. We'll resubmit once the error is fixed. http://beautifulproteins.blogspot.com/ http://www.miketyka.com/ |
Joined: 30 May 06 Posts: 5652 Credit: 5,622,096 RAC: 0 |
Read here for two links on how to take care of lockfiles. |
DaveSun Joined: 3 May 07 Posts: 5 Credit: 200,480 RAC: 0 |
Quote: "Got a validation error on score12_rlbd_1gvp_IGNORE_THE_REST_DECOY_5473_170. Any indication as to what may have caused this?" While that is possible, you'd think that if there were a limit, it would be coded into the app and tasks would end once the limit was reached. |
Mike Francis Joined: 24 Nov 05 Posts: 8 Credit: 623,519 RAC: 0 |
12/13/2008 12:39:43 AM|rosetta@home|Starting loopbuild_reference_native_cst_hombench_loopbuild_t306__IGNORE_THE_REST_1B70B_12_5533_4_1 12/13/2008 12:39:43 AM|rosetta@home|Starting task loopbuild_reference_native_cst_hombench_loopbuild_t306__IGNORE_THE_REST_1B70B_12_5533_4_1 using minirosetta version 145 12/13/2008 12:46:00 AM|rosetta@home|Computation for task loopbuild_reference_native_cst_hombench_loopbuild_t306__IGNORE_THE_REST_1B70B_12_5533_4_1 finished 12/13/2008 12:46:00 AM|rosetta@home|Output file loopbuild_reference_native_cst_hombench_loopbuild_t306__IGNORE_THE_REST_1B70B_12_5533_4_1_0 for task loopbuild_reference_native_cst_hombench_loopbuild_t306__IGNORE_THE_REST_1B70B_12_5533_4_1 absent |
Joined: 18 Jul 06 Posts: 109 Credit: 1,859,263 RAC: 0 |
https://boinc.bakerlab.org/rosetta/result.php?resultid=213868228 Validate error Done 43,178.07 !!!!! https://boinc.bakerlab.org/rosetta/result.php?resultid=213166655 https://boinc.bakerlab.org/rosetta/result.php?resultid=212932042 https://boinc.bakerlab.org/rosetta/result.php?resultid=212932029 https://boinc.bakerlab.org/rosetta/result.php?resultid=212906401 https://boinc.bakerlab.org/rosetta/result.php?resultid=212906412 https://boinc.bakerlab.org/rosetta/result.php?resultid=212906413 https://boinc.bakerlab.org/rosetta/result.php?resultid=212931903 https://boinc.bakerlab.org/rosetta/result.php?resultid=212896182 https://boinc.bakerlab.org/rosetta/result.php?resultid=212896182 https://boinc.bakerlab.org/rosetta/result.php?resultid=212881858 https://boinc.bakerlab.org/rosetta/result.php?resultid=212692623 https://boinc.bakerlab.org/rosetta/result.php?resultid=212611598 https://boinc.bakerlab.org/rosetta/result.php?resultid=212499093 |
Joined: 30 May 06 Posts: 5652 Credit: 5,622,096 RAC: 0 |
1wjdA_ZNMP_ABRELAX_tetraL_IGNORE_THE_REST_ZINC_METALLOPROTEIN-1wjdA-_5478_4043_0 got stuck and was showing 23.45% remaining, which is odd, since the messages in BOINC Manager showed it had started only about 5 minutes earlier, before getting interrupted by benchmark testing. After I aborted the task, the next one started and the cores went to 100% immediately. |
Joined: 2 Jul 06 Posts: 2842 Credit: 2,020,043 RAC: 0 |
Server Status Page is showing a problem 839am 12/14/08 |
Joined: 16 Jun 08 Posts: 1221 Credit: 13,567,355 RAC: 1,235 |
Quote: "https://boinc.bakerlab.org/rosetta/result.php?resultid=213868228" I notice that your results are the first I've seen that were run under BOINC 6.4.1. I wonder if that's the source of the problem instead of minirosetta 1.45? |
mikylinux Joined: 25 Jul 07 Posts: 3 Credit: 73,155 RAC: 0 |
https://boinc.bakerlab.org/rosetta/result.php?resultid=213307491 |
AMD_is_logical Joined: 20 Dec 05 Posts: 299 Credit: 31,460,681 RAC: 0 |
Quote: "Server Status Page is showing a problem 839am 12/14/08" As of 14 Dec 2008 20:26:34 UTC, the Server Status Page shows program rah_make_work1 on host srv3 with status "Not running", and work units "Ready to send": 1. It looks like program rah_make_work2 isn't able to handle the load all by itself. |
©2023 University of Washington
https://www.bakerlab.org