Job Sizes

No longer involved

Joined: 19 Mar 06
Posts: 22
Credit: 327,220
RAC: 0
Message 37442 - Posted: 5 Mar 2007, 1:29:14 UTC

I have used the preferences panel provided to limit the size of jobs being received to about 4 hours. However, the current jobs are running at 10+ hours estimated time. If there is a preferences option, why is it not being used? Am I not detecting something here?
ID: 37442 · Rating: 0
Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 37444 - Posted: 5 Mar 2007, 1:40:27 UTC

Your machines are hidden... have you completed any work units? How long did they actually take to complete?

In short, don't worry too much about the estimate. You must complete at least one model, and for some work units that can take a couple of hours. But at each model completion, Rosetta decides if it should crunch another model, or mark the task completed. If the time of the first models indicates that another can be completed within your runtime preference, then it will begin another model.

...but the estimated time is really only accurate when you've completed a model.
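
A rough Python sketch of the per-model decision described above. The stopping rule used here (start another model only if one more average-length model still fits inside the runtime preference) is an assumption for illustration; the thread does not give Rosetta's exact logic, and the names `run_task`, `model_seconds` and `runtime_pref_seconds` are made up.

```python
def run_task(model_seconds, runtime_pref_seconds):
    """Simulate one task; return (models_completed, elapsed_seconds).

    model_seconds: hypothetical time each successive model would take.
    runtime_pref_seconds: the user's target runtime preference.
    """
    elapsed = 0.0
    completed = 0
    for seconds in model_seconds:
        # At least one model always runs, whatever the preference says.
        if completed > 0:
            average = elapsed / completed
            # Assumed rule: stop if another average-length model would not fit.
            if elapsed + average > runtime_pref_seconds:
                break
        elapsed += seconds
        completed += 1
    return completed, elapsed


# Fast machine: 30-minute models, 4-hour preference.
fast = run_task([1800] * 20, 4 * 3600)
# Slow machine: a single model takes 7 hours, so the task overruns the preference.
slow = run_task([7 * 3600] * 5, 4 * 3600)
```

Under this assumed rule the fast machine stops at or just under the preference, while the slow machine still has to finish its one mandatory model and so runs well past 4 hours.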
Rosetta Moderator: Mod.Sense
ID: 37444 · Rating: 0
No longer involved

Joined: 19 Mar 06
Posts: 22
Credit: 327,220
RAC: 0
Message 37531 - Posted: 6 Mar 2007, 15:07:29 UTC - in response to Message 37444.  
Last modified: 6 Mar 2007, 15:13:01 UTC

Your machines are hidden... have you completed any work units? How long did they actually take to complete?

In short, don't worry too much about the estimate. You must complete at least one model, and for some work units that can take a couple of hours. But at each model completion, Rosetta decides if it should crunch another model, or mark the task completed. If the time of the first models indicates that another can be completed within your runtime preference, then it will begin another model.

...but the estimated time is really only accurate when you've completed a model.


In short then, it does not matter what we select in the preferences; Rosetta will decide that running two jobs, both at 10+ hours, is OK. This usually results in four or six jobs in a partially completed state at any given time. Maybe that is why I get these messages about "if this continues you may need to reset the project". Oh well, I'll just keep cancelling them when I see them sitting in the queue. When it downloads a group of them I just suspend the project. I don't need disk space taken up with uncompleted jobs.
ID: 37531 · Rating: 0
AMD_is_logical

Joined: 20 Dec 05
Posts: 299
Credit: 31,460,681
RAC: 0
Message 37533 - Posted: 6 Mar 2007, 16:23:21 UTC - in response to Message 37531.  

In short then, it does not matter what we select in the preferences,


The preference determines how long the WU will crunch. The estimated time that BOINC displays is a random guess by the BOINC client and it is not directly affected by the preference. After many WUs are completed, BOINC will notice that they are taking less time than its estimate, and then BOINC will adjust its guesses downward.

In other words, ignore the estimated time that BOINC shows you.
ID: 37533 · Rating: 0
Profile meshmar

Joined: 1 Apr 06
Posts: 26
Credit: 176,432
RAC: 0
Message 37537 - Posted: 6 Mar 2007, 17:46:25 UTC

Setting preferences affects how long the WU actually crunches. The fact that the estimated time is way off has a major impact on my downloaded queue for ALL my projects - not just Rosetta. If a Rosetta WU can't be accurately predicted by Boinc then it is a Rosetta issue - not a Boinc issue ...
ID: 37537 · Rating: 0
Nothing But Idle Time

Joined: 28 Sep 05
Posts: 209
Credit: 139,545
RAC: 0
Message 37538 - Posted: 6 Mar 2007, 18:08:52 UTC - in response to Message 37537.  

Setting preferences affects how long the WU actually crunches. The fact that the estimated time is way off has a major impact on my downloaded queue for ALL my projects - not just Rosetta. If a Rosetta WU can't be accurately predicted by Boinc then it is a Rosetta issue - not a Boinc issue ...

Boinc doesn't "predict" the run time; it doesn't know that you have chosen a run time of 4 hours or 24 hours until you run a few tasks at the preferred length and boinc learns it by adjusting the DCF accordingly. You can hasten the learning process by adjusting the DCF manually yourself, if you feel up to it. It is better to learn how the system is designed to work and accommodate it.
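
To illustrate the "learning" described here: the client scales the project-supplied estimate by a per-project duration correction factor (DCF) and adjusts that factor as tasks finish. The smoothing rule below (move 10% of the way toward the observed ratio after each task) is a simplified assumption for illustration, not the BOINC client's actual update rule, and `update_dcf` and `displayed_estimate` are hypothetical names.

```python
def update_dcf(dcf, project_estimate_hours, actual_hours, weight=0.1):
    """Return a new DCF nudged toward the observed actual/estimate ratio."""
    observed_ratio = actual_hours / project_estimate_hours
    return dcf + weight * (observed_ratio - dcf)


def displayed_estimate(dcf, project_estimate_hours):
    """Estimate the client would show: project estimate scaled by the DCF."""
    return dcf * project_estimate_hours


dcf = 1.0
history = []
for _ in range(30):
    # Record what the client would display before this task runs.
    history.append(displayed_estimate(dcf, 10.0))
    # Each task really finishes in about 4 hours (the runtime preference).
    dcf = update_dcf(dcf, 10.0, 4.0)
```

In this toy model the displayed estimate starts at 10 hours and drifts toward 4 hours over a few dozen tasks, which is why the first estimates after changing a preference look wrong.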
ID: 37538 · Rating: 0
Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 37540 - Posted: 6 Mar 2007, 18:13:38 UTC

GEM, the only way I've seen multiple tasks get started and not continue crunching, the way you describe, is when BOINC feels it is short of the memory required. It seems to halt the second task, which throws it over the memory limit you've asked BOINC to live within, and then it begins another task. If that second task used less memory, then perhaps it could keep both running. And BOINC has no way to know, until it starts running it, how successful it will be in that effort.

Aborting the tasks is not resolving anything. BOINC will run them when it can do so within your memory preferences.

The error about the zero finish line is another issue, and not related to the rest of this discussion.

I did not say that 10 hours was OK. I am confident that if you actually let them run to completion, you will find that they complete normally in roughly your 4 hour preference. What I was attempting to imply is just that there is always that "one model completed" minimum. On a slow computer and a task with a very long protein, it might take 6 or 8 hours to complete that one model. This would be true regardless of the runtime preference, and the estimated completion time will not be recomputed until that one model completes. But if your CPU is faster than, say, 1 GHz, you should see completion within the runtime you've configured for the location of a given computer. My 3 GHz machines are crunching about 2 models per hour for the current tasks. After each model, Rosetta reassesses whether it should end now to conform to your runtime preference. It also recomputes the completion percentage, which then causes BOINC to revise the estimated time to completion. So, if your machine is like mine, it should complete within a range of 3hr 15min to 3hr 45min with a 4hr preference in place.
Rosetta Moderator: Mod.Sense
ID: 37540 · Rating: 0
Profile meshmar

Joined: 1 Apr 06
Posts: 26
Credit: 176,432
RAC: 0
Message 37571 - Posted: 7 Mar 2007, 12:17:47 UTC - in response to Message 37538.  

Setting preferences affects how long the WU actually crunches. The fact that the estimated time is way off has a major impact on my downloaded queue for ALL my projects - not just Rosetta. If a Rosetta WU can't be accurately predicted by Boinc then it is a Rosetta issue - not a Boinc issue ...

Boinc doesn't "predict" the run time; it doesn't know that you have chosen a run time of 4 hours or 24 hours until you run a few tasks at the preferred length and boinc learns it by adjusting the DCF accordingly. You can hasten the learning process by adjusting the DCF manually yourself, if you feel up to it. It is better to learn how the system is designed to work and accommodate it.


I've had Rosetta running on the same boxes with no changes in settings for a long time. The predicted time changes every time a different type of WU is sent by Rosetta. Sometimes this can be a drastic change. Some of the recent WUs have been predicted for 24 hour run times! I should not be required to adjust the DCF every time Rosetta sends a different type of WU. If I have a preferred run time of 4 hours, then my predicted time should NEVER be more than 4 hours the first time a new type of WU is sent. If it is, then Rosetta and Boinc have issues that need to be fixed.

ID: 37571 · Rating: 0
Nothing But Idle Time

Joined: 28 Sep 05
Posts: 209
Credit: 139,545
RAC: 0
Message 37573 - Posted: 7 Mar 2007, 12:44:24 UTC - in response to Message 37571.  
Last modified: 7 Mar 2007, 12:48:24 UTC

I've had Rosetta running on the same boxes with no changes in settings for a long time. The predicted time changes every time a different type of WU is sent by Rosetta. Sometimes this can be a drastic change. Some of the recent WUs have been predicted for 24 hour run times!
I have never observed this behavior, new one on me. Once boinc learned that my preference is 8 hours I have never seen the estimated run time vary from that number regardless of what task I was given. Anyone else observe this behavior? The only thing I can think of is that the estimated runtime supplied by the Rosetta project is not the one usually provided, then the DCF would lead to an incorrect estimate. Example: project supplied estimate is 8 hours and your DCF is 0.5 leading to a runtime estimate of 4 hours. If a new task is estimated at 12 hours by the project -- rather than the usual 8 -- then your DCF would lead to an estimated runtime of 6 hours instead of 4...something like that.
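
The arithmetic in that example, written out as a tiny Python sketch. It assumes the displayed estimate is simply the project-supplied estimate multiplied by the DCF; `client_estimate` is a hypothetical helper name.

```python
def client_estimate(project_estimate_hours, dcf):
    """Displayed estimate, assuming estimate = project estimate * DCF."""
    return project_estimate_hours * dcf


usual = client_estimate(8, 0.5)     # the usual 8-hour project estimate
unusual = client_estimate(12, 0.5)  # a task the project estimates at 12 hours
```

With the DCF held at 0.5, the usual task shows 4 hours and the unusual one shows 6, even though both would stop at the same runtime preference.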
ID: 37573 · Rating: 0
BennyRop

Joined: 17 Dec 05
Posts: 555
Credit: 140,800
RAC: 0
Message 37600 - Posted: 7 Mar 2007, 23:52:21 UTC - in response to Message 37573.  

Anyone else observe this behavior?

I've had my runtime preference set at 24 hours ever since that was possible, and noticed my last few jobs saying they were 6 or 8 hours long. My settings here still say 24 hours. I'm still running Boinc 5.4.9, in case the new server software requires a newer Boinc client to work right.


ID: 37600 · Rating: 0
Nothing But Idle Time

Joined: 28 Sep 05
Posts: 209
Credit: 139,545
RAC: 0
Message 37613 - Posted: 8 Mar 2007, 12:23:36 UTC - in response to Message 37600.  

Anyone else observe this behavior?

I've had my runtime preference set at 24 hours ever since that was possible, and noticed my last few jobs saying they were 6 or 8 hours long. My settings here still say 24 hours. I'm still running Boinc 5.4.9, in case the new server software requires a newer Boinc client to work right.
Thanks. Well, maybe I don't understand how things REALLY work, but my brain tells me that if the DCF and your preferred run time are to work as intended then every estimated task runtime (provided by the project) must remain constant no matter what... otherwise the DCF will never become adjusted to your run preference, it will always be a moving target.
ID: 37613 · Rating: 0
Profile River~~
Joined: 15 Dec 05
Posts: 761
Credit: 285,578
RAC: 0
Message 37622 - Posted: 8 Mar 2007, 18:14:17 UTC - in response to Message 37613.  

Anyone else observe this behavior?

I've had my runtime preference set at 24 hours ever since that was possible, and noticed my last few jobs saying they were 6 or 8 hours long. My settings here still say 24 hours. I'm still running Boinc 5.4.9, in case the new server software requires a newer Boinc client to work right.
Thanks. Well, maybe I don't understand how things REALLY work, but my brain tells me that if the DCF and your preferred run time are to work as intended then every estimated task runtime (provided by the project) must remain constant no matter what... otherwise the DCF will never become adjusted to your run preference, it will always be a moving target.


hmmm,

In the normal case (i.e. non-Rosetta), the estimated run length needs to be proportional to the number of flops in a task, so that machines can learn the constant of proportionality.
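
A minimal Python sketch of that proportionality, assuming the estimate is the task's estimated flops divided by the host's benchmark speed, scaled by a learned constant (the DCF). The function names and the numbers are illustrative assumptions, not taken from BOINC.

```python
def estimated_seconds(task_flops, benchmark_flops_per_sec, dcf=1.0):
    """Estimate proportional to the task's flops, scaled by the learned constant."""
    return task_flops / benchmark_flops_per_sec * dcf


def learned_constant(task_flops, benchmark_flops_per_sec, actual_seconds):
    """Constant of proportionality that would have made the estimate exact."""
    return actual_seconds / estimated_seconds(task_flops, benchmark_flops_per_sec)


# A task of 3.6e13 flops on a 2e9 flops/s host is first estimated at 5 hours.
first_guess = estimated_seconds(3.6e13, 2e9)
# It actually takes 7.5 hours, so the machine learns a constant of 1.5.
constant = learned_constant(3.6e13, 2e9, 27000.0)
# A task with twice the flops is then estimated at twice the corrected time.
next_guess = estimated_seconds(7.2e13, 2e9, dcf=constant)
```

This only works if the constant stays the same from task to task, which is the premise questioned in the rest of this post.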

In the case of Rosetta, the wall time is the time used to calibrate the DCF, as it is the wall time that determines when a run stops, for any given setting.

So to a first approximation I agree that the server-provided run length wants to be constant for any given machine. Ideally, so that new users start from a plausible approximation, it should depend inversely on the benchmark, so that the initial approximation is 3 hours. Even if it does not depend on the benchmark, if it is constant then the machine only has to learn the DCF once.

But then, different tasks run at different cpu efficiencies. A task that over-fills memory and needs a lot of swapping will run less cpu in the same wall time. This will mean that the DCF rises, as the average useful cpu cycle takes longer.

This means that keeping a constant asserted value is still not going to keep the predicted run length stable when there is a varied mix of kinds of workunit. The whole DCF approach is based on the premise that work from a given project has (for example) the same mix of adds and multiplies, and the same proportion of page faults, etc etc over a huge number of WU. This assumption works well on Einstein, SETI, and LHC, which effectively run the same program for months or years.

However, as these effects cannot be predicted easily, and as they vary considerably from one machine to another, I still agree that constant is the best approach -- just don't be surprised if the effects fluctuate rather.

R~~

ID: 37622 · Rating: 0




©2024 University of Washington
https://www.bakerlab.org