Time estimates seem less than optimal

Message boards : Number crunching : Time estimates seem less than optimal

crystalsys
Joined: 11 Aug 09
Posts: 8
Credit: 1,544,190
RAC: 627
Message 97285 - Posted: 8 Jun 2020, 12:12:12 UTC

So I check the controller before I go to bed, and it's crunching on tasks that had 8 hour estimates. Get up in the morning, they've been running all night, now the estimate is over a day? This will make the later ones, which also think they are 8 hour jobs, report late.
ID: 97285 · Rating: 0
Jim1348
Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 97287 - Posted: 8 Jun 2020, 12:29:33 UTC - in response to Message 97285.  

So I check the controller before I go to bed, and it's crunching on tasks that had 8 hour estimates. Get up in the morning, they've been running all night, now the estimate is over a day? This will make the later ones, which also think they are 8 hour jobs, report late.

All three of my Ubuntu machines are now stuck on 8 hour estimates, even though I run the 18-hour jobs. They no longer correct themselves. It must be the new BOINC version (7.16.6 or 7.16.7, or whatever they have decided to call it today) and how it interacts with the server.
ID: 97287 · Rating: 0
mikey
Joined: 5 Jan 06
Posts: 1894
Credit: 8,767,285
RAC: 12,464
Message 97294 - Posted: 8 Jun 2020, 23:18:06 UTC - in response to Message 97285.  

So I check the controller before I go to bed, and it's crunching on tasks that had 8 hour estimates. Get up in the morning, they've been running all night, now the estimate is over a day? This will make the later ones, which also think they are 8 hour jobs, report late.


Reduce the cache size and just abort them so your wingmen don't have to wait for the Project to abort them. Aborted units are put right back into the available units pile.
ID: 97294 · Rating: 0
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1481
Credit: 14,584,354
RAC: 14,609
Message 97300 - Posted: 9 Jun 2020, 5:55:44 UTC - in response to Message 97285.  

So I check the controller before I go to bed, and it's crunching on tasks that had 8 hour estimates.
The Estimate is for CPU time. Actual Runtime (the time it takes to actually process the Task) can be much, much, much longer if the system is overcommitted with other processes.
I suggest you use Task Manager or similar on your Windows system to see what else is using the CPU and taking CPU time away from processing Rosetta work.
Grant
Darwin NT
ID: 97300 · Rating: 0
theCase
Joined: 6 May 07
Posts: 7
Credit: 702,041
RAC: 0
Message 97444 - Posted: 18 Jun 2020, 16:47:25 UTC

I also have a problem with Rosetta having short deadlines. Rosetta consistently sends work units that cannot be finished in time. Currently I have ten(!) work units estimated at 8 hours each.

I've set up my "target CPU time" to two hours, "store at least" is 0.1, "additional" is 0.5, My PC is on 25% of the time with BOINC available 99.9%. Resource share is set to 15% (WCC plays a lot nicer). This is on an Intel I7-77xx eight core laptop.

Given the information Rosetta has available why does it continue to send WU's that will not be finished? I'd like to look at the scheduler code you're using because it is not doing a good job.
ID: 97444 · Rating: 0
sgaboinc
Joined: 2 Apr 14
Posts: 282
Credit: 208,966
RAC: 0
Message 97445 - Posted: 18 Jun 2020, 17:55:34 UTC - in response to Message 97444.  
Last modified: 18 Jun 2020, 18:03:09 UTC

I also have a problem with Rosetta having short deadlines. Rosetta consistently sends work units that cannot be finished in time. Currently I have ten(!) work units estimated at 8 hours each.

I've set up my "target CPU time" to two hours, "store at least" is 0.1, "additional" is 0.5, My PC is on 25% of the time with BOINC available 99.9%. Resource share is set to 15% (WCC plays a lot nicer). This is on an Intel I7-77xx eight core laptop.

Given the information Rosetta has available why does it continue to send WU's that will not be finished? I'd like to look at the scheduler code you're using because it is not doing a good job.
Try using 0.1/0 instead, i.e. don't cache tasks.
I think the long run times of several hours per task partly stretch out the timelines, particularly if boinc-client is multitasking between projects: each project gets less time to run, and tasks that need a long time to run get stretched out even further. Hence, if you run multiple projects, don't cache tasks.
ID: 97445 · Rating: 0
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1481
Credit: 14,584,354
RAC: 14,609
Message 97446 - Posted: 18 Jun 2020, 18:50:59 UTC - in response to Message 97444.  
Last modified: 18 Jun 2020, 18:53:18 UTC

Given the information Rosetta has available why does it continue to send WU's that will not be finished?
Why wouldn't they be finished?
If you are using all cores and threads, that's 8 you have available to process work, so those 10 Tasks would be done in just over 16 hours with an 8 hour Target CPU time.
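A quick sketch of that arithmetic, assuming every Task takes exactly the Target CPU time and each thread runs one Task at a time (the function name is just for illustration, not anything from BOINC):

```python
import math


def hours_to_finish(n_tasks, n_threads, hours_per_task):
    """Hours of processing needed when each thread runs one task at a time."""
    # Tasks run in "waves": with 8 threads, 10 tasks need 2 waves.
    waves = math.ceil(n_tasks / n_threads)
    return waves * hours_per_task


# 10 Tasks, 8 threads, 8 hour Target CPU time
print(hours_to_finish(10, 8, 8))
```

In practice Runtime is a little longer than CPU time, hence "just over".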

BOINC takes into account the number of cores & threads it can use, the amount of time a system is on, the amount of time it can do work while a system is on, the amount of time it takes to actually process a Task (Runtime as opposed to the CPU time), Task deadlines, Estimated Completion times and your Resource share settings.
As posted by sgaboinc, the smaller your cache- the better (and the "additional days" value should be the smaller one of the 2 values to avoid odd behaviour). It reduces the risks of missed deadlines, and it allows BOINC to juggle things so that projects are done in accordance with your Resource share settings sooner rather than later. Whenever you make changes to anything that affects processing (such as Resource share settings & Target CPU time), the Manager has to re-juggle things to match those changes. The time it takes to do that can range from several days (no cache, only 2 projects with 50:50 Resource share settings) to months (large cache, multiple projects with varying Resource share settings).
Grant
Darwin NT
ID: 97446 · Rating: 0
JohnDK
Joined: 6 Apr 20
Posts: 33
Credit: 2,390,240
RAC: 432
Message 97447 - Posted: 18 Jun 2020, 19:04:46 UTC - in response to Message 97287.  

So I check the controller before I go to bed, and it's crunching on tasks that had 8 hour estimates. Get up in the morning, they've been running all night, now the estimate is over a day? This will make the later ones, which also think they are 8 hour jobs, report late.

All three of my Ubuntu machines are now stuck on 8 hour estimates, even though I run the 18-hour jobs. They no longer correct themselves. It must be the new BOINC version (7.16.6 or 7.16.7, or whatever they have decided to call it today) and how it interacts with the server.

The same here, both linux and windows. I had set my 24/7 hosts to 14 hours, but, I think, after the 4.20 app they would only run around 8 hours.

My understanding was that longer running tasks would mean less pressure on the servers, but now I've changed my settings back to the default 8 hours runtime. Things work just as well; it just means more work for the servers, I guess.
ID: 97447 · Rating: 0
theCase
Joined: 6 May 07
Posts: 7
Credit: 702,041
RAC: 0
Message 97454 - Posted: 19 Jun 2020, 2:28:41 UTC - in response to Message 97445.  

I will change my settings as you recommend:

    But why do other projects (WCG) NOT have a problem with my settings?

    Why does Rosetta send me 8 hour work units when I specify 2 Hour WU's?

    Why does Rosetta send me WU's due in three days when they can't be finished?


As I mentioned, I believe there are problems in the scheduler algorithm. Something is not right....

ID: 97454 · Rating: 0
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1481
Credit: 14,584,354
RAC: 14,609
Message 97455 - Posted: 19 Jun 2020, 7:45:11 UTC - in response to Message 97454.  

Why does Rosetta send me 8 hour work units when I specify 2 Hour WU's?
Have you saved those changes?
If they aren't saved, then the changes won't be sent through the next time the computer contacts the project.



Why does Rosetta send me WU's due in three days when they can't be finished?
What makes you think they can't be finished?
As I pointed out, 16 hours is all that's needed to finish them.



If you're so concerned about what is being sent, turn on the CPU scheduling debug option for the Event log & see what is being asked for and why (BOINC Manager, Advanced view, Options, Event log options).
Grant
Darwin NT
ID: 97455 · Rating: 0
theCase
Joined: 6 May 07
Posts: 7
Credit: 702,041
RAC: 0
Message 97462 - Posted: 19 Jun 2020, 13:59:21 UTC - in response to Message 97455.  
Last modified: 19 Jun 2020, 14:00:03 UTC

Why does Rosetta send me 8 hour work units when I specify 2 Hour WU's?
Have you saved those changes?
Yes, as of two/three months ago.
If they aren't saved, then the changes won't be sent through the next time the computer contacts the project.

Why does Rosetta send me WU's due in three days when they can't be finished?
What makes you think they can't be finished?
My PC "duty time" is about 25% and I'll be turning off my PC for the next several days; it will probably be off till the 22nd (the 10 Rosetta tasks currently running are due on 6/21).
As I pointed out, 16 hours is all that's needed to finish them.

If you're so concerned about what is being sent, turn on the CPU scheduling debug option for the Event log & see what is being asked for and why (BOINC Manager, Advanced view, Options, Event log options).
Weird, I never noticed the "event log options" even after running BOINC for 20 years. I will take a look! Thanks for all your help.
ID: 97462 · Rating: 0
mikey
Joined: 5 Jan 06
Posts: 1894
Credit: 8,767,285
RAC: 12,464
Message 97464 - Posted: 19 Jun 2020, 16:45:48 UTC - in response to Message 97462.  


Why does Rosetta send me WU's due in three days when they can't be finished?
What makes you think they can't be finished?
My PC "duty time" is about 25% and I'll be turning off my PC for the next several days; it will probably be off till the 22nd (the 10 Rosetta tasks currently running are due on 6/21).
As I pointed out, 16 hours is all that's needed to finish them.


And exactly HOW would ANY program know that you PLAN to shut down your pc for the next several days after running it for days on end around 25% of the time?
The easy answer is to set your pc to no new tasks and just abort all the tasks you won't be able to finish; they will be sent to the 'tasks available' list and someone else will download and crunch them. Boinc uses an algorithm to figure out how often you crunch and how fast your pc goes through tasks, but it takes time to figure that out, and since you only crunch about 25% of the time it's taking a bit longer. My pc's, on the other hand, crunch 24/7 so it's a bit quicker.
ID: 97464 · Rating: 0
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1481
Credit: 14,584,354
RAC: 14,609
Message 97466 - Posted: 19 Jun 2020, 18:54:57 UTC - in response to Message 97462.  

my PC "duty time" is about 25% and I'll be turning off my PC for the next several days, it will probably be off till the 22nd. (the 10 Rosetta tasks currently running are due on 6/21)
As Mikey posted, BOINC bases its work Scheduling on your system's past usage. And based on that, the work can be done in time; it has no way of knowing that you plan to shut it down for a longer than usual period of time.
Grant
Darwin NT
ID: 97466 · Rating: 0
theCase
Joined: 6 May 07
Posts: 7
Credit: 702,041
RAC: 0
Message 97471 - Posted: 19 Jun 2020, 23:03:46 UTC

While I understand the thought process of "how can one plan for a three day shutdown", consider this:

    My target CPU time is 2 hours.
    Time devoted to Rosetta is 15% (as opposed to other projects).
    My CPU is on 25% of the time.


These three items are available to the scheduler.

Why do I get ten work units with an estimated run time of 10 hours all at once? Why are they due in three days?

In a perfect world, the scheduler would send me one task every few days (e.g. 15% of 6 hours a day is 54 minutes of CPU time dedicated to Rosetta). Instead, the scheduler sends nothing for days, then slams my PC with 100 hours' worth of tasks due in three days.
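
The arithmetic behind those figures, as a quick sketch. It assumes the Resource share acts as a straight share of powered-on time, which is only one reading of the setting and may not be how BOINC applies it; the function name is made up for illustration.

```python
def daily_minutes_for_project(on_percent, share_percent):
    """Minutes per day a project would get if its share applied to on-time."""
    on_minutes = 24 * 60 * on_percent / 100   # minutes the PC is on per day
    return on_minutes * share_percent / 100   # slice given to one project


# PC on 25% of the time, 15% of that for Rosetta
print(daily_minutes_for_project(25, 15))

# Ten work units estimated at 10 hours each
queued_hours = 10 * 10
print(queued_hours)
```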

I feel there may be a problem with the scheduler...

ID: 97471 · Rating: 0
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1481
Credit: 14,584,354
RAC: 14,609
Message 97472 - Posted: 20 Jun 2020, 1:34:46 UTC - in response to Message 97471.  

Why do I get ten work units with an estimated run time of 10 hours all at once?
Because that is what you requested, according to your preferences & settings.



Why are they due in three days?
Because that is how long the deadlines are.



In a perfect world, the scheduler would send me one task every few days, (e.g 15% of 6 hours a day is 54 minutes of CPU time dedicated to Rosetta), instead the scheduler sends nothing for days, then slams my PC with 100 hours worth of tasks due in three days.
I feel there may be a problem with the scheduler...
I feel there may be a problem with your understanding of what Resource share is, how it is worked out, and how BOINC goes about meeting your settings with the resources it has available & the time limits it is aware of.
Processing time between projects isn't allocated according to time spent, but according to work actually done for each project, based on what is known as REC (Recent Estimated Credit).
If Rosetta is owed more processing, then it will get more work; if your other project is owed more processing, then it will get more work.
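
To illustrate the idea of a project being "owed" processing, here is a deliberately simplified sketch. It is not BOINC's actual scheduler code: the function name, the 85 share for the other project and the REC figures are all made up for illustration; only the 15% Rosetta share comes from this thread.

```python
def most_owed_project(resource_share, rec):
    """Return the project whose share of recent work (REC) lags furthest
    behind its share of the Resource share settings.

    Simplified illustration only; the real client logic is more involved.
    """
    total_share = sum(resource_share.values())
    total_rec = sum(rec.values())

    def debt(name):
        target = resource_share[name] / total_share
        actual = rec[name] / total_rec if total_rec else 0.0
        return target - actual  # positive means the project is owed work

    return max(resource_share, key=debt)


shares = {"Rosetta": 15, "WCG": 85}       # WCG value is hypothetical
recent = {"Rosetta": 50.0, "WCG": 950.0}  # hypothetical REC figures
print(most_owed_project(shares, recent))
```

With these made-up numbers Rosetta has done 5% of the recent work against a 15% share, so it is the project that gets topped up. That could look like nothing for days followed by a batch of Tasks.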

As I have mentioned several times before, your system is capable of processing the work it received. The only reason any work would miss its deadline is that you intend to do something the Scheduler has no way of knowing about. The Scheduler isn't the issue; the issue is that what you plan to do is outside the usual operation of that system.
Grant
Darwin NT
ID: 97472 · Rating: 0
mikey
Joined: 5 Jan 06
Posts: 1894
Credit: 8,767,285
RAC: 12,464
Message 97479 - Posted: 20 Jun 2020, 23:21:05 UTC - in response to Message 97471.  

While I understand the thought process of "how can one plan for three day shutdown" , consider this:

    My target CPU time is 2 hours,
    Time devoted to Rosetta is 15% (as opposed to other projects).
    MY CPU is on 25% of the time.


These three items are available to the scheduler.

Why do I get ten work units with an estimated run time of 10 hours all at once? Why are they due in three days?

In a perfect world, the scheduler would send me one task every few days, (e.g 15% of 6 hours a day is 54 minutes of CPU time dedicated to Rosetta), instead the scheduler sends nothing for days, then slams my PC with 100 hours worth of tasks due in three days...



I think Grant is getting to this part of your comment "Time devoted to Rosetta is 15% (as opposed to other projects)."
Where did you set that? Are you thinking 'Resource Share' works like that?
ID: 97479 · Rating: 0




©2024 University of Washington
https://www.bakerlab.org