1)
Message boards :
Number crunching :
minirosetta 2.17
(Message 69825)
Posted 14 Mar 2011 by Kartsa
Post: So, there are no error messages or anything. Here is the log from the restart to the point when I noticed that a WU had hung:

14/03/2011 20:26:33 Not using a proxy
14/03/2011 20:26:33 rosetta@home Restarting task T0596_rR_rs_stg0_lrlxjcst_t000__casp9_SAVE_ALL_OUT_23234_4284_0 using minirosetta version 217
14/03/2011 20:26:33 rosetta@home Restarting task T0623_rH_rs_stg0_lrlxjcst_t000__casp9_SAVE_ALL_OUT_23166_4307_0 using minirosetta version 217
14/03/2011 20:26:33 rosetta@home Restarting task T0528_rR_rs_stg0_lrlxjcst_t000__casp9_SAVE_ALL_OUT_23193_4413_0 using minirosetta version 217
14/03/2011 20:26:34 rosetta@home Restarting task IF3_like_SAVE_ALL_OUT_i016_008_23333_342_0 using minirosetta version 217
14/03/2011 20:39:24 rosetta@home Computation for task T0596_rR_rs_stg0_lrlxjcst_t000__casp9_SAVE_ALL_OUT_23234_4284_0 finished
14/03/2011 20:39:24 rosetta@home Starting IF3_like_SAVE_ALL_OUT_relax_i016_23334_521_0
14/03/2011 20:39:25 rosetta@home Starting task IF3_like_SAVE_ALL_OUT_relax_i016_23334_521_0 using minirosetta version 217
14/03/2011 20:39:26 rosetta@home Started upload of T0596_rR_rs_stg0_lrlxjcst_t000__casp9_SAVE_ALL_OUT_23234_4284_0_0
14/03/2011 20:39:46 rosetta@home Finished upload of T0596_rR_rs_stg0_lrlxjcst_t000__casp9_SAVE_ALL_OUT_23234_4284_0_0
14/03/2011 21:12:48 rosetta@home Computation for task IF3_like_SAVE_ALL_OUT_relax_i016_23334_521_0 finished
14/03/2011 21:12:48 rosetta@home Starting IF3_like_SAVE_ALL_OUT_i016_009_23333_1081_0
14/03/2011 21:12:48 rosetta@home Starting task IF3_like_SAVE_ALL_OUT_i016_009_23333_1081_0 using minirosetta version 217
14/03/2011 21:12:50 rosetta@home Started upload of IF3_like_SAVE_ALL_OUT_relax_i016_23334_521_0_0
14/03/2011 21:13:13 rosetta@home Finished upload of IF3_like_SAVE_ALL_OUT_relax_i016_23334_521_0_0
14/03/2011 21:41:14 rosetta@home Computation for task T0528_rR_rs_stg0_lrlxjcst_t000__casp9_SAVE_ALL_OUT_23193_4413_0 finished
14/03/2011 21:41:14 rosetta@home Starting IF3_like_SAVE_ALL_OUT_i016_008_23333_703_0
14/03/2011 21:41:15 rosetta@home Starting task IF3_like_SAVE_ALL_OUT_i016_008_23333_703_0 using minirosetta version 217
14/03/2011 21:41:17 rosetta@home Started upload of T0528_rR_rs_stg0_lrlxjcst_t000__casp9_SAVE_ALL_OUT_23193_4413_0_0
14/03/2011 21:41:27 rosetta@home Finished upload of T0528_rR_rs_stg0_lrlxjcst_t000__casp9_SAVE_ALL_OUT_23193_4413_0_0

And it was the task T0623_rH_rs_stg0_lrlxjcst_t000__casp9_SAVE_ALL_OUT_23166_4307_0 that got stuck.

Edit, just to add: I had all the other projects suspended before the restart, so only Rosetta was running.
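For anyone who wants to narrow down candidates from a log like the one above, here is a minimal sketch in Python. It assumes the message wording seen in this post ("Restarting task ... using", "Starting task ... using", "Computation for task ... finished"); the function name `unfinished_tasks` is made up for illustration and is not part of BOINC. It lists tasks that were started but never reported as finished. That list also contains tasks that are simply still running, so the log alone cannot say which one is hung; CPU usage still has to be checked by hand.

```python
import re

# Patterns assumed from the log lines quoted in this post, not from BOINC documentation.
STARTED = re.compile(r"(?:Restarting|Starting) task (\S+) using")
FINISHED = re.compile(r"Computation for task (\S+) finished")

def unfinished_tasks(log_text):
    """Return task names that were started but have no 'finished' line, in start order."""
    finished = set(FINISHED.findall(log_text))
    result = []
    for name in STARTED.findall(log_text):
        if name not in finished and name not in result:
            result.append(name)
    return result

if __name__ == "__main__":
    sample = (
        "14/03/2011 20:26:33 rosetta@home Restarting task A_0 using minirosetta version 217\n"
        "14/03/2011 20:26:33 rosetta@home Restarting task B_0 using minirosetta version 217\n"
        "14/03/2011 20:39:24 rosetta@home Computation for task A_0 finished\n"
    )
    print(unfinished_tasks(sample))
```

Run against the log in this post, this leaves four tasks without a "finished" line, T0623_rH_rs_stg0_lrlxjcst_t000__casp9_SAVE_ALL_OUT_23166_4307_0 among them.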
2)
(Message 69824)
Posted 14 Mar 2011 by Kartsa
Post: "completely exit BOINC and restart it"

Yep, this seems to get them running again.

BOINC version 6.10.58, using the default(?) memory settings: 50% when in use and 90% when not in use. I have 8 GB in total, and Rosetta rarely uses more than 2 GB in total (4 WUs, ~500 MB each; usually it's a lot less, 250-400 MB each). Apps are not left in memory when suspended.

At this point I can't say anything certain about the possible messages, since I just restarted the client; I will post them the next time some WU 'hangs'.

The other projects I'm running, SETI and Einstein, aren't affected by this problem.
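As a rough check of the numbers above, the arithmetic can be sketched like this. It assumes the percentages apply to total physical RAM and that 1 GB = 1024 MB; the helper name `limit_mb` is hypothetical.

```python
def limit_mb(total_gb, percent):
    """Memory limit in MB for a given share of total RAM (assumes 1 GB = 1024 MB)."""
    return total_gb * 1024 * percent / 100

TOTAL_GB = 8
in_use_limit = limit_mb(TOTAL_GB, 50)   # limit while the computer is in use
idle_limit = limit_mb(TOTAL_GB, 90)     # limit while the computer is not in use
peak_usage = 4 * 500                    # 4 WUs at roughly 500 MB each

print(in_use_limit, idle_limit, peak_usage)
print("fits under in-use limit:", peak_usage < in_use_limit)
```

Under these assumptions, even the peak of about 2000 MB is well below the stricter of the two limits.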
3)
(Message 69810)
Posted 13 Mar 2011 by Kartsa
Post: "Workunit T0635_rR_rs_stg0_lrlxjcst_t000__casp9_SAVE_ALL_OUT_23623_2499 has stopped using ANY CPU time, even though BOINC thinks it is still running."

I'm having similar issues with some of the units on two different machines (Win 7 and XP). BOINC thinks they are still running, but they are not using any CPU and progress doesn't increase. I'm not using any throttling at all, full 100% all the time. I just abort the failed units, since suspending/resuming doesn't make any difference.

I've been having these problems for a couple of months, I think; most of the WUs work just fine. Tried resetting the project, didn't help.

Two examples:
T0590_rH_rs_stg0_lrlxjcst_t000__casp9_SAVE_ALL_OUT_23137_1570
T0620_rH_rs_stg0_lrlxjcst_t000__casp9_SAVE_ALL_OUT_23164_2093 (someone seems to have successfully finished this one, though...)
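The symptom described above (no CPU use and no progress while the task is listed as running) could be checked mechanically along these lines. This is only a sketch: the function name `looks_stalled` and the sample format are hypothetical, and how the CPU time and progress readings are collected is left open.

```python
def looks_stalled(samples, min_samples=3):
    """samples: list of (cpu_seconds, fraction_done) readings, oldest first.

    Returns True when the most recent `min_samples` readings are identical,
    i.e. neither CPU time nor progress has moved between them.
    """
    if len(samples) < min_samples:
        return False  # not enough readings to judge
    recent = samples[-min_samples:]
    return all(reading == recent[0] for reading in recent)

if __name__ == "__main__":
    print(looks_stalled([(100.0, 0.25), (100.0, 0.25), (100.0, 0.25)]))
    print(looks_stalled([(100.0, 0.25), (130.0, 0.27), (160.0, 0.30)]))
```

Readings should be spaced far enough apart (minutes rather than seconds) that a healthy task would show some change between them.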
©2024 University of Washington
https://www.bakerlab.org