Posts by William Timbrook

1) Message boards : Number crunching : Report long-running models here (Message 55850)
Posted 18 Sep 2008 by William Timbrook
Post:
William, parts of what you describe are normal and expected, and some parts are not. I've moved your posts here to this thread because you appear to have a 3hr runtime (the default) configured for that host, and so the 8hrs you report is well beyond that.

Your task was abinitio_nohomfrag_70_A_1qgvA_4466_9601, v1.34, running on BOINC 6.2.18 and Windows 2000.

So it ran longer than expected.

The parts of what you describe that are normal are that any time you end BOINC or remove a task from memory (which happens if BOINC switches to running another project, suspending the R@h task, and you are not keeping suspended tasks in memory), you will lose some work. The amount lost depends on when Rosetta was last able to save a checkpoint. And some tasks are able to checkpoint more frequently than others.

So seeing the CPU time reduced (sometimes all the way back to zero) when the task restarts is normal.
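
To make the rollback concrete, here is a rough sketch of the idea in Python. It is only an illustration, not how BOINC or Rosetta is actually coded: the function name, its parameters, and the assumption that the last checkpoint sat at about 100 minutes are all made up for the example.

```python
def cpu_time_after_restart(checkpoint_times, stop_time, kept_in_memory=False):
    """Return the CPU time (in seconds) a task resumes from after it is stopped.

    checkpoint_times: CPU times at which a checkpoint was saved (hypothetical data).
    stop_time: CPU time accumulated when the task was stopped or removed from memory.
    kept_in_memory: True if the suspended task stayed in memory, so nothing is lost.
    """
    if kept_in_memory:
        return stop_time
    # Only checkpoints saved before the stop can be restored.
    usable = [t for t in checkpoint_times if t <= stop_time]
    # With no checkpoint at all, the task starts over from zero.
    return max(usable, default=0)


# Illustrative numbers: 8 hours of CPU time, last checkpoint assumed at 100 minutes.
stopped_at = 8 * 3600
resumed_at = cpu_time_after_restart([100 * 60], stopped_at)
print("resumes at", resumed_at // 60, "minutes; lost", (stopped_at - resumed_at) // 60, "minutes")
```

A task that checkpoints more often simply has more entries in the list, so the gap between the stop time and the latest usable checkpoint tends to be smaller.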

The other thing is that the 3 hours you are probably currently seeing as the initial estimated time to completion is just based on your runtime preference (which you can set here on the website in your Rosetta-specific preferences). More precisely, it is based on your BOINC client's history of working tasks with your runtime preference. Some tasks take longer than that. So, rather than showing a negative estimated time to complete once the original estimate is reached, the program starts to make time pass slower and slower once it reaches about 10 minutes remaining. So the part you describe about 10 minutes remaining for an extended period of time is normal as well.
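
As a rough picture of that "time passes slower and slower" behavior, here is a small Python sketch. The formula is purely an assumption chosen to reproduce the effect described above; it is not the actual calculation used by Rosetta or the BOINC client, and the function name and 600-second threshold are hypothetical.

```python
def displayed_remaining(target_runtime, elapsed, threshold=600):
    """Return an estimated time remaining (seconds) that never goes negative.

    target_runtime: the runtime the estimate is based on (e.g. 3 hours).
    elapsed: CPU time used so far.
    threshold: point (about 10 minutes) at which the countdown starts to slow.
    """
    linear = target_runtime - elapsed
    if linear >= threshold:
        # Normal countdown while more than ~10 minutes remain.
        return linear
    # Seconds the task has run past the point where ~10 minutes remained.
    overrun = threshold - linear
    # Assumed decay curve: keeps shrinking but always stays above zero.
    return threshold * threshold / (threshold + overrun)


three_hours = 3 * 3600
for hours in (1, 3, 8):
    remaining = displayed_remaining(three_hours, hours * 3600)
    print(hours, "h elapsed:", round(remaining / 60, 1), "min shown as remaining")
```

Under this assumed curve, a task well past its target still shows a small positive time remaining, which matches the "stuck near 10 minutes" display people report.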

The resulting confusion when tasks go longer than your preference is why I started this thread, and why the Project Team is working to address these long-running models that cause runtimes to be exceeded.




Thanks for the update.
I had another one like that which was experiencing the same thing. Seeing the 10hrs of cpu time just didn't look that comforting.
I wanted to finish the jobs but... some other hosts can pick those 2 up.
I was on 5.10.45 (?) and upgraded that host to 6.2.18, but it didn't seem to make a difference.
2) Message boards : Number crunching : Report long-running models here (Message 55825)
Posted 17 Sep 2008 by William Timbrook
Post:
[quote]I've noticed a couple of odd work unit packets but I think I now have a specific question.

Work Unit 175541763 has clocked over 8 hours of cpu time and 98.001% done. I went to the host machine, stopped and restarted boinc. Now the work unit is showing 55.704% done and about 100 minutes of cpu time.

Is this a known opportunity?[/quote]


After 2 restarts of boinc (and upgrading to 6.2.18), the unit got stuck with 9:52 to finish. I aborted it.
3) Message boards : Number crunching : Report long-running models here (Message 55823)
Posted 17 Sep 2008 by William Timbrook
Post:
I've noticed a couple of odd work unit packets but I think I now have a specific question.

Work Unit 175541763 has clocked over 8 hours of cpu time and 98.001% done. I went to the host machine, stopped and restarted boinc. Now the work unit is showing 55.704% done and about 100 minutes of cpu time.

Is this a known opportunity?



Thanks,
William






©2022 University of Washington
https://www.bakerlab.org