Report stuck & aborted WU here please - II

Message boards : Number crunching : Report stuck & aborted WU here please - II

Profile Feet1st
Joined: 30 Dec 05
Posts: 1755
Credit: 4,690,520
RAC: 0
Message 15208 - Posted: 1 May 2006, 19:46:35 UTC - in response to Message 15203.  

It was stuck at 1.044x ...Confusing.

Let me attempt to explain. While model 1 is running, progress will show 1-point-something percent. The fractions correspond to specific points within the model where it has completed one portion and is beginning another. The completion figure is revised fractionally, so it is really a sort of counter showing where within model 1 the run presently is. The project team used this information for debugging and for helping to resolve these "hung WU" issues.

Once model 1 completes, it basically looks at the target runtime and compares that with how long you've run already. Your next model will tend to take roughly as long as the last one did. Now that we've COMPLETED one, we have some idea how long the future models will take. So, let's say the target runtime is 4 hrs and model 1 took 1 hr: it would recompute progress to be about 25%, and then begin model 2. The estimated time remaining would then recalculate to roughly 3 hrs. As model 2 progresses, you'll see fractional increases, again as it reaches various points within the model. The estimated time remaining will be shown to INCREASE during model 2. Once model 2 completes (another hour), progress is recalculated to be 50%, and the estimated time remaining is recalculated too, so it takes a sudden drop from what it was near the end of the model.

In short, with these proteins varying so widely in size, it really doesn't KNOW how far done it is until it completes that first model to give it a frame of reference. In your case, the first model exceeded your runtime preference and so you zipped right to 100%.
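
To make the arithmetic concrete, here is a rough sketch of the recalculation that happens each time a model completes. This is NOT the actual Rosetta@home code, just a simplified illustration of the behavior described in this post. The function name `progress_after_models` and its parameters are made up for the example, and it assumes progress is simply elapsed time divided by target runtime, capped at 100%.

```python
def progress_after_models(target_runtime_h, model_runtimes_h):
    """Return a list of (progress_fraction, est_remaining_h), one entry
    per completed model.

    Hypothetical helper: it assumes progress is elapsed time divided by
    the target runtime, capped at 1.0, which is a simplification of the
    behavior described in the post.
    """
    results = []
    elapsed = 0.0
    for runtime in model_runtimes_h:
        elapsed += runtime
        # Progress is only recalculated when a model finishes.
        progress = min(elapsed / target_runtime_h, 1.0)
        remaining = max(target_runtime_h - elapsed, 0.0)
        results.append((progress, remaining))
        if progress >= 1.0:
            # Target runtime reached or exceeded: no further models start.
            break
    return results


# Target runtime 4 hrs, two models of 1 hr each: 25% then 50%.
print(progress_after_models(4, [1, 1]))

# First model alone exceeds a 2 hr preference: straight to 100%.
print(progress_after_models(2, [3]))
```

With a 4 hr target and 1 hr models, the sketch gives 25% with 3 hrs remaining after model 1, then 50% with 2 hrs remaining after model 2. When the first model already runs past the runtime preference, it jumps directly to 100%, which matches the case quoted at the top of this post.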

Making this more intuitive is definitely on the list of desirable things to improve. What's really unfortunate is that it's the MOST confusing for the short runtimes... which is the default :(

Add this signature to your EMail:
Running Microsoft's "System Idle Process" will never help cure cancer, AIDS nor Alzheimer's. But running Rosetta@home just might!
https://boinc.bakerlab.org/rosetta/
ID: 15208 · Rating: 0
cduk

Joined: 10 Dec 05
Posts: 3
Credit: 27,710
RAC: 0
Message 15209 - Posted: 1 May 2006, 20:26:24 UTC

Feet1st,

Since posting I've re-read the FAQs (which have changed quite a bit since the last time I looked - I'll make a mental note to re-visit more often).

After doing this and after your excellent explanation, I now understand what was happening. It wasn't actually stuck, but since the progress %age wasn't moving and I hadn't seen this before in this or other BOINC projects, I mistakenly thought it had.

I appreciate that time estimation can be extremely difficult....!

Many thanks for your help.

CD
ID: 15209 · Rating: 0



©2026 University of Washington
https://www.bakerlab.org