Posts by Erik

1) Message boards : Number crunching : Problem with task "exited with zero status but no 'finished' file" error (Message 78017)
Posted 11 Mar 2015 by Erik
Post:
Allowing BOINC to use 100% CPU time got rid of that error for me, possibly in conjunction with a 12-hour run time. In practice, the CPU usage tends to run between 75 and 90%. I also found that rebooting the machine will cause an in-progress work unit to fail.
2) Message boards : Number crunching : Two consistent and persistent errors (Message 77979)
Posted 26 Feb 2015 by Erik
Post:
I don't know if this will be helpful, but yesterday when I rebooted my computer to install updates, several of the Rosetta units in progress at the time failed, even though I shut down BOINC gracefully. The tasks were all from either the SAVE_ALL_OUT or IGNORE_THE_REST group. Here are a couple of examples:


24-Feb-15 22:12:24 | rosetta@home | Task TL_test_2008_0165_0994_0960_2059_00350256_0157_0891_0009_0875_0001_fold_SAVE_ALL_OUT_244879_1833_0 exited with zero status but no 'finished' file
24-Feb-15 22:12:24 | rosetta@home | Task rb_02_23_53371_99352_ab_stage0_h004___robetta_IGNORE_THE_REST_07_15_244897_85_0 exited with zero status but no 'finished' file

The next to finish was:

25-Feb-15 02:17:35 | rosetta@home | Computation for task TL_test_1478_0993_0262_0916_2046_0140_0741_0187_0164_0011_0153_0001_fold_SAVE_ALL_OUT_244870_3991_0 finished
3) Message boards : Number crunching : Two consistent and persistent errors (Message 77940)
Posted 14 Feb 2015 by Erik
Post:
I'm pretty sure I would have had some a couple of months ago, but those have cycled out of my logs by now. If no one has any current ones, I can just set my client to grab 48-hour sets.
4) Message boards : Number crunching : Two consistent and persistent errors (Message 77938)
Posted 13 Feb 2015 by Erik
Post:
Is there a way to edit the title that I'm missing? I'd like to add [Fixed] to the title.
5) Message boards : Number crunching : Two consistent and persistent errors (Message 77937)
Posted 13 Feb 2015 by Erik
Post:
So I checked my log after getting home from work today, and it looks like everything is completing successfully now. The last unit to fail was just before I changed the preferences.

So, the fix was to set the target CPU run time to twelve hours and the max CPU time usage to 100%.

I hope the CPU usage requirement will be fixed soon. I don't want to have to run my box at 100% all day in a desert summer.
6) Message boards : Number crunching : Two consistent and persistent errors (Message 77930)
Posted 13 Feb 2015 by Erik
Post:
The odd thing is, Rosetta is the only project which returns those errors. I currently have the processor time set to the default of 50%. I'll let it run at 100% today and see if that makes a difference.
7) Message boards : Number crunching : Two consistent and persistent errors (Message 77928)
Posted 13 Feb 2015 by Erik
Post:
Thanks for responding, David. I currently have the target CPU run time preference set to twelve hours. I haven't received any time-out errors since, but the second error, "exited with zero status but no 'finished' file," is returned for nearly every unit I process.

12-Feb-15 20:37:30 | rosetta@home | Task rb_02_08_53120_99050_ab_stage0_h002___robetta_IGNORE_THE_REST_09_16_242762_18_0 exited with zero status but no 'finished' file
12-Feb-15 20:37:30 | rosetta@home | If this happens repeatedly you may need to reset the project.
12-Feb-15 20:49:19 | rosetta@home | Task rb_02_08_53120_99050_ab_stage0_h002___robetta_IGNORE_THE_REST_09_16_242762_18_0 exited with zero status but no 'finished' file
12-Feb-15 20:49:19 | rosetta@home | If this happens repeatedly you may need to reset the project.
12-Feb-15 20:56:31 | rosetta@home | Task rb_02_08_53120_99050_ab_stage0_h003___robetta_IGNORE_THE_REST_11_13_242763_13_0 exited with zero status but no 'finished' file
12-Feb-15 20:56:31 | rosetta@home | If this happens repeatedly you may need to reset the project.
12-Feb-15 21:01:56 | rosetta@home | Task rb_02_08_53120_99050_ab_stage0_h003___robetta_IGNORE_THE_REST_11_13_242763_13_0 exited with zero status but no 'finished' file
12-Feb-15 21:01:56 | rosetta@home | If this happens repeatedly you may need to reset the project.
12-Feb-15 21:08:22 | rosetta@home | Task Ross3X3_SAVE_ALL_OUT_t149_009_242754_249_0 exited with zero status but no 'finished' file
12-Feb-15 21:08:22 | rosetta@home | If this happens repeatedly you may need to reset the project.
12-Feb-15 21:08:26 | rosetta@home | Task rb_02_08_53120_99050_ab_stage0_h003___robetta_IGNORE_THE_REST_11_13_242763_13_0 exited with zero status but no 'finished' file
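For anyone who wants to tally how often each task hits this message, here is a minimal Python sketch. It assumes the event log has been copied out as plain text in the "timestamp | project | message" layout shown above; the function name and the regular expression are my own illustration, not part of BOINC.

```python
import re
from collections import Counter

# Matches lines of the form shown in the log excerpt above:
#   <timestamp> | <project> | Task <name> exited with zero status but no 'finished' file
# The pattern is an assumption based only on those pasted lines.
LINE_RE = re.compile(
    r"^(?P<stamp>[^|]+?) \| (?P<project>[^|]+?) \| "
    r"Task (?P<task>\S+) exited with zero status but no 'finished' file$"
)


def count_unfinished_exits(log_text):
    """Return a Counter mapping task name -> number of matching log lines."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            counts[match.group("task")] += 1
    return counts


# A few lines copied from the excerpt above, used as sample input.
SAMPLE_LOG = """
12-Feb-15 20:37:30 | rosetta@home | Task rb_02_08_53120_99050_ab_stage0_h002___robetta_IGNORE_THE_REST_09_16_242762_18_0 exited with zero status but no 'finished' file
12-Feb-15 20:37:30 | rosetta@home | If this happens repeatedly you may need to reset the project.
12-Feb-15 20:49:19 | rosetta@home | Task rb_02_08_53120_99050_ab_stage0_h002___robetta_IGNORE_THE_REST_09_16_242762_18_0 exited with zero status but no 'finished' file
12-Feb-15 21:08:22 | rosetta@home | Task Ross3X3_SAVE_ALL_OUT_t149_009_242754_249_0 exited with zero status but no 'finished' file
"""

if __name__ == "__main__":
    for task, hits in count_unfinished_exits(SAMPLE_LOG).most_common():
        print(hits, task)
```

A task that shows up more than once is being restarted and failing again, which is the pattern in the excerpt above.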
8) Message boards : Number crunching : Two consistent and persistent errors (Message 77876)
Posted 30 Jan 2015 by Erik
Post:
The timeout errors have been resolved by adjusting the runtime preference. The "exited with zero status" errors are still occurring. Some of these do complete, but not many.

1/29/2015 1:26:18 PM | rosetta@home | Computation for task rb_01_23_53132_98689__t000__0_C1_SAVE_ALL_OUT_IGNORE_THE_REST_242144_2228_0 finished
9) Message boards : Number crunching : Two consistent and persistent errors (Message 77869)
Posted 29 Jan 2015 by Erik
Post:
I changed that. The other changes per the link above don't appear to have made any difference.
10) Message boards : Number crunching : Two consistent and persistent errors (Message 77862)
Posted 28 Jan 2015 by Erik
Post:
I just saw this. I'll see if any of these steps make a difference.

http://boincfaq.mundayweb.com/index.php?language=1&view=116
11) Message boards : Number crunching : Two consistent and persistent errors (Message 77861)
Posted 28 Jan 2015 by Erik
Post:
1/26/2015 9:01:31 AM | rosetta@home | Starting task rb_01_23_53131_98688__t000__0_C1_SAVE_ALL_OUT_IGNORE_THE_REST_242143_2568_0
1/26/2015 9:43:45 AM | rosetta@home | Aborting task rb_01_23_53132_98689__t000__0_C1_SAVE_ALL_OUT_IGNORE_THE_REST_242144_2218_0: exceeded elapsed time limit 122964.12 (500000.00G/4.07G)

1/26/2015 3:04:53 PM | rosetta@home | Task rb_01_23_53132_98689__t000__0_C2_SAVE_ALL_OUT_IGNORE_THE_REST_242144_2226_0 exited with zero status but no 'finished' file
1/26/2015 3:04:53 PM | rosetta@home | If this happens repeatedly you may need to reset the project.

I see other computers have been completing the work units, and I have reset the project a couple times with no effect, including reinstalling the client. Very nearly every unit I've run on this machine in the past couple of weeks gets one of these errors. Anyone have ideas or insights?
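As a rough check on the "exceeded elapsed time limit" line, the numbers in parentheses look like a work bound divided by a speed estimate, both in units of 10^9 (the "G"). That reading is an assumption on my part from the message text alone. A small Python sketch of the arithmetic, with hypothetical function names:

```python
def elapsed_limit_seconds(fpops_bound_g, flops_g):
    """Limit in seconds, assuming limit = work bound / estimated speed."""
    return fpops_bound_g / flops_g


def implied_speed_g(fpops_bound_g, limit_seconds):
    """Speed (in G per second) that would produce the logged limit."""
    return fpops_bound_g / limit_seconds


if __name__ == "__main__":
    # Values copied from the log line: 122964.12 (500000.00G/4.07G)
    from_rounded = elapsed_limit_seconds(500000.00, 4.07)
    implied = implied_speed_g(500000.00, 122964.12)
    print(f"limit from rounded speed: {from_rounded:.2f} s")
    print(f"logged limit in hours:    {122964.12 / 3600:.1f} h")
    print(f"implied unrounded speed:  {implied:.4f} G")
```

The quotient of the two printed figures comes out slightly below the logged limit, which is consistent with 4.07G being a rounded display of a slightly lower speed. Either way the limit works out to roughly 34 hours of elapsed time.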

©2024 University of Washington
https://www.bakerlab.org