Why are my 'Remaining' time estimates so far off?

scott
Joined: 18 Aug 19
Posts: 4
Credit: 2,238,450
RAC: 0
Message 98134 - Posted: 16 Jul 2020, 19:49:26 UTC
Last modified: 16 Jul 2020, 19:50:05 UTC

Hi Everyone!

I have been running BOINC on my desktop, which has a Ryzen 5 2600 (6 cores/12 threads @ 3.66 GHz) and 16 GB of memory. When I start a batch of tasks, BOINC estimates that each will take 8 hours to complete, but they end up taking about 24 hours.

Why is the estimate so far off of the actual time required? I have linked a screenshot of my BOINC window and the properties of one of the tasks.

https://imgur.com/a/vrkpngV

Thank you for your help!
ID: 98134 · Rating: 0
manalog
Joined: 8 Apr 15
Posts: 24
Credit: 233,155
RAC: 0
Message 98135 - Posted: 16 Jul 2020, 20:11:33 UTC - in response to Message 98134.  

Which run time did you set in your preferences?
The default is 8 hours, but you could have changed it to one day. The other possibility is that your computer is taking 24 hours just for the first decoy, but this is very unlikely (particularly given your host's performance).
ID: 98135 · Rating: 0
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1465
Credit: 14,162,844
RAC: 15,905
Message 98138 - Posted: 16 Jul 2020, 20:43:37 UTC - in response to Message 98134.  

Why is the estimate so far off of the actual time required?
If your system is busy doing things other than processing BOINC work, or you have "Use at most 100% of CPU time" set to anything less than 100% it will take more than 8 hours Runtime to do 8 hours of CPU time. A heavily overcommitted system can take 24 hours to complete one 8 hour Task (or longer).

However in your case there is no sign of any recently completed work in your Task list, so it's not possible to see if that is the case here.


I have linked a screenshot of my BOINC window and the properties of one of the tasks.
And it shows a group of Tasks that have just started processing (3 min elapsed) with 8 hours to go, which is what I would expect.
Grant
Darwin NT
ID: 98138 · Rating: 0
CIA
Joined: 3 May 07
Posts: 100
Credit: 21,059,812
RAC: 0
Message 98164 - Posted: 17 Jul 2020, 17:18:35 UTC - in response to Message 98134.  
Last modified: 17 Jul 2020, 17:20:58 UTC

Just as a heads up, you are running an older version of BOINC. I suggest you download the current version. https://boinc.berkeley.edu/download_all.php
ID: 98164 · Rating: 0
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1465
Credit: 14,162,844
RAC: 15,905
Message 98190 - Posted: 17 Jul 2020, 23:20:01 UTC

OK, looking at the results: you aren't using the default Target CPU time, which is 8 hours; you've set it to 24 hours. The Estimated times (as I understand it) were meant to be determined now by your Target CPU time.
However it would appear that they are set to the default CPU time (8 hours). As you process & return work, then those Estimated times will eventually end up matching the actual processing time. But it will take a while.

As it is, there are signs of heavy usage of that system- the CPU time & Runtime show a difference of an hour and a quarter
Run time 1 days 1 hours 12 min 51 sec
CPU time 1 days 0 hours  0 min  2 sec
On a lightly used system I would expect a difference of maybe 5 min over 24 hrs of processing.
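The gap between the two figures above can be checked by converting both durations to seconds. This is only a minimal sketch: the `parse_duration` helper is hypothetical and handles just the "N days N hours N min N sec" layout shown on the task page.

```python
import re
from datetime import timedelta

# Seconds per unit, using the unit names as they appear on the task page.
UNIT_SECONDS = {"days": 86400, "hours": 3600, "min": 60, "sec": 1}

def parse_duration(text):
    """Parse a string such as '1 days 1 hours 12 min 51 sec' into a timedelta."""
    total = 0
    for value, unit in re.findall(r"(\d+)\s*(days|hours|min|sec)", text):
        total += int(value) * UNIT_SECONDS[unit]
    return timedelta(seconds=total)

run_time = parse_duration("1 days 1 hours 12 min 51 sec")
cpu_time = parse_duration("1 days 0 hours  0 min  2 sec")

# Time the task was running without being given the CPU.
overhead = run_time - cpu_time
print(overhead)  # 1:12:49
```

So the task spent roughly an hour and a quarter of its run time not actually computing.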
Grant
Darwin NT
ID: 98190 · Rating: 0
Jim1348
Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 98196 - Posted: 18 Jul 2020, 8:59:50 UTC - in response to Message 98190.  

However it would appear that they are set to the default CPU time (8 hours). As you process & return work, then those Estimated times will eventually end up matching the actual processing time. But it will take a while.

Unfortunately, the corrections no longer work. All of my Ryzen 3000's are set for 18 hour work units, but show 8 hour estimates.
It has been that way for weeks/months. (And I use a config.xml to speed up the corrections too.)
ID: 98196 · Rating: 0
Brian Nixon
Joined: 12 Apr 20
Posts: 293
Credit: 8,432,366
RAC: 0
Message 98201 - Posted: 18 Jul 2020, 12:31:14 UTC

As far as I can see, all tasks are delivered to all clients (regardless of preferences) with an estimated 80 000 GFLOPs of work to perform and a command-line option to run for 8 hours:
<workunit>
    <rsc_fpops_est>80000000000000.000000</rsc_fpops_est>
    <command_line>… -cpu_run_time 28800 …</command_line>
</workunit>
The associated application is declared as achieving about 2.78 GFLOPS (2.78 billion floating-point operations per second):
<app_version>
    <app_name>rosetta</app_name>
    <version_num>420</version_num>
    <flops>2777777777.777778</flops>
</app_version>
The application must also see the project preference for target run time and, if set, override the command-line parameter with it. But BOINC does not know about that, so it will initially estimate that each task will take 80 000 ÷ 2.78 ≈ 28 800 seconds, which is 8 hours.
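The arithmetic behind that initial estimate can be reproduced directly from the two values in the XML above. A minimal sketch; the function name is made up for illustration, and it assumes the estimate is simply the operation count divided by the declared speed.

```python
RSC_FPOPS_EST = 80_000_000_000_000.0   # <rsc_fpops_est> from the workunit
APP_FLOPS = 2_777_777_777.777778       # <flops> from the app_version

def initial_estimate_seconds(fpops_est, flops):
    """Estimated run time = estimated operations / operations per second."""
    return fpops_est / flops

estimate = initial_estimate_seconds(RSC_FPOPS_EST, APP_FLOPS)
print(f"{estimate:.0f} s = {estimate / 3600:.2f} h")  # 28800 s = 8.00 h
```

Note that the result matches the -cpu_run_time 28800 value on the command line, which suggests the two server-side numbers were chosen to agree with the 8-hour default.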

As tasks complete, BOINC compares the elapsed time with the initial estimate and calculates a correction factor. Over time, this factor gets adjusted so that future estimates should be closer to the actual run time. You can see the adjustments by enabling dcf_debug in your Event Log options, and see the factor (for your own machines) on the Computer Details page online. So with target run time set to 24 hours, the correction factor should end up at 3 and the estimate should match the target. (Lots of ‘should’s here, I know; maybe it doesn’t actually work like this, or at all…)
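To show why the factor should settle at 3, here is a toy model of a correction factor. The update rule (simple exponential smoothing with a hypothetical `weight`) is an assumption chosen for illustration; it is not the BOINC client's actual algorithm.

```python
def update_dcf(dcf, estimated_s, actual_s, weight=0.1):
    """Move the correction factor a fraction of the way toward actual/estimated."""
    ratio = actual_s / estimated_s
    return dcf + weight * (ratio - dcf)

BASE_ESTIMATE_S = 8 * 3600    # uncorrected estimate (8 hours)
TARGET_RUN_S = 24 * 3600      # target run time set in preferences (24 hours)

dcf = 1.0
for _ in range(100):          # 100 completed tasks
    dcf = update_dcf(dcf, BASE_ESTIMATE_S, TARGET_RUN_S)

corrected_hours = BASE_ESTIMATE_S * dcf / 3600
print(round(dcf, 3), round(corrected_hours, 1))
```

Whatever the real smoothing looks like, any rule that tracks the actual-to-estimated ratio ends up near 24 ÷ 8 = 3, at which point the corrected estimate matches the target.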
ID: 98201 · Rating: 0
Jim1348
Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 98202 - Posted: 18 Jul 2020, 14:07:51 UTC - in response to Message 98201.  

As tasks complete, BOINC compares the elapsed time with the initial estimate and calculates a correction factor. Over time, this factor gets adjusted so that future estimates should be closer to the actual run time. You can see the adjustments by enabling dcf_debug in your Event Log options, and see the factor (for your own machines) on the Computer Details page online. So with target run time set to 24 hours, the correction factor should end up at 3 and the estimate should match the target. (Lots of ‘should’s here, I know; maybe it doesn’t actually work like this, or at all…)

The estimate corrections stopped working with 4.20. I don't know if it is a bug or a feature, or even if they know about it.
Communication is less than ideal.
ID: 98202 · Rating: 0
scott
Joined: 18 Aug 19
Posts: 4
Credit: 2,238,450
RAC: 0
Message 98203 - Posted: 18 Jul 2020, 15:03:45 UTC - in response to Message 98190.  


As it is, there are signs of heavy usage of that system- the CPU time & Runtime show a difference of an hour and a quarter
Run time 1 days 1 hours 12 min 51 sec
CPU time 1 days 0 hours  0 min  2 sec
On a lightly used system I would expect a difference of maybe 5 min over 24 hrs of processing.


Yea fairly decent usage. I also have it set to use 100% of the CPUs 80% of the time, and to stop when non-BOINC is over 75%, which doesn't happen too often.

OK, looking at the results: you aren't using the default Target CPU time, which is 8 hours; you've set it to 24 hours. The Estimated times (as I understand it) were meant to be determined now by your Target CPU time.


Oh ok, thank you! I went into the settings and changed it.

Just as a heads up, you are running an older version of BOINC. I suggest you download the current version. https://boinc.berkeley.edu/download_all.php


Thanks, I will upgrade after these tasks finish!

My current ones are showing 11 hours down and 16 left, so it might be a bit more accurate. We'll see with the next batch! Thanks for the help everyone!
ID: 98203 · Rating: 0
Brian Nixon
Joined: 12 Apr 20
Posts: 293
Credit: 8,432,366
RAC: 0
Message 98205 - Posted: 18 Jul 2020, 16:36:50 UTC - in response to Message 98196.  

Jim1348 wrote:
(And I use a config.xml to speed up the corrections too.)
Could you give more detail about that? I wonder whether something here is interacting badly with BOINC’s attempts to determine the correction factor automatically.
ID: 98205 · Rating: 0
Jim1348
Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 98206 - Posted: 18 Jul 2020, 16:47:08 UTC - in response to Message 98205.  

Could you give more detail about that?
It has always worked thus far. Maybe something has changed in BOINC?

<cc_config>        
 <options>                
  <rec_half_life_days>1.000000</rec_half_life_days>
  <use_all_gpus>1</use_all_gpus>                
  <allow_multiple_clients>1</allow_multiple_clients>        
  <allow_remote_gui_rpc>1</allow_remote_gui_rpc> 
  <max_file_xfers_per_project>4</max_file_xfers_per_project>
  <max_file_xfers>4</max_file_xfers> 
 </options>
</cc_config>

ID: 98206 · Rating: 0
Brian Nixon
Joined: 12 Apr 20
Posts: 293
Credit: 8,432,366
RAC: 0
Message 98209 - Posted: 18 Jul 2020, 17:38:24 UTC - in response to Message 98206.  

Thanks. I don’t see anything there that would affect the correction factor.

Digging deeper, I think the problem is that the project is configured with
<dont_use_dcf/>
which prevents the duration correction factor from being used at all…

Getting that fixed will require a project admin’s attention. I’ve mentioned it in the Ralph forum thread where the change was announced.
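To illustrate what that flag means for the estimate, here is a sketch that assumes the flag shows up as an empty `<dont_use_dcf/>` element inside a project block. The XML fragment, the `project_name` element, and the `effective_estimate` helper are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; only the dont_use_dcf element comes from the thread.
PROJECT_XML = """
<project>
    <project_name>example</project_name>
    <dont_use_dcf/>
</project>
""".strip()

def effective_estimate(base_s, dcf, project_xml):
    """Return the displayed estimate, ignoring dcf when the flag is present."""
    project = ET.fromstring(project_xml)
    if project.find("dont_use_dcf") is not None:
        return base_s            # correction factor is never applied
    return base_s * dcf

print(effective_estimate(28800, 3.0, PROJECT_XML) / 3600)  # 8.0
```

With the flag present, the estimate stays at 8 hours no matter how many 24-hour tasks have been returned.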
ID: 98209 · Rating: 0
Jim1348
Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 98212 - Posted: 18 Jul 2020, 18:34:48 UTC - in response to Message 98209.  

Getting that fixed will require a project admin’s attention. I’ve mentioned it in the Ralph forum thread where the change was announced.

Very good. They did it for a reason. I hope it can be changed. You have saved us the trouble of asking. Thanks.
ID: 98212 · Rating: 0
Daedalus
Joined: 1 Aug 08
Posts: 39
Credit: 9,948,624
RAC: 1,251
Message 98215 - Posted: 18 Jul 2020, 20:18:05 UTC

I use a 4-hour target time, and all the WUs show a time to completion of 8 hours. But they complete in four and a half hours.
ID: 98215 · Rating: 0
mikey
Joined: 5 Jan 06
Posts: 1894
Credit: 8,456,727
RAC: 11,262
Message 98216 - Posted: 18 Jul 2020, 22:49:59 UTC - in response to Message 98215.  

I use a 4-hour target time, and all the WUs show a time to completion of 8 hours. But they complete in four and a half hours.


Do you split time with other BOINC projects?
ID: 98216 · Rating: 0
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1465
Credit: 14,162,844
RAC: 15,905
Message 98218 - Posted: 18 Jul 2020, 23:17:46 UTC

BOINC stopped using DCF a long time ago (when CreditNew came out, from memory).
Rosetta uses its own credit mechanism (which I believe reverts to CreditNew under certain circumstances).


As far as I can see, all tasks are delivered to all clients (regardless of preferences) with an estimated 80 000 GFLOPs of work to perform and a command-line option to run for 8 hours:
On all other projects, the rsc_fpops_est value is used for Estimated completion time & Credit calculations. Due to the way Rosetta works (fixed run time, not time to finish a given amount of data) they have a modified credit & Estimated completion time mechanism.


As far as I can see, all tasks are delivered to all clients (regardless of preferences) with an estimated 80 000 GFLOPs of work to perform and a command-line option to run for 8 hours:
That <command_line>… -cpu_run_time 28800 …</command_line> would explain the fixed Completion time estimates. My understanding was that it was meant to be supplied by the user's Target CPU time setting.

The BOINC Manager uses that value for determining how much work to request, and I would expect the Scheduler makes use of it for determining how much work to actually send out (along with Max tasks per day etc). So that value is available to be given to each Task as it is sent out to different hosts.


At least the present system does stop most people from getting way more than they can handle when new work types/applications come out. The only ones now impacted will be those with larger than the default target CPU Runtime, and at least they won't get huge amounts more than they can process. Just a bit more.
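The effect of a fixed 8-hour estimate on the amount of work fetched can be sketched with a very simplified buffer model. Everything here is an assumption for illustration (the one-day cache, the helper names, and the idea that the client just fills each thread's buffer with estimated work); BOINC's real work-fetch logic is considerably more involved.

```python
import math

def tasks_to_fill_buffer(threads, buffer_hours, estimated_task_hours):
    """Tasks needed so every thread has `buffer_hours` of estimated work."""
    return threads * math.ceil(buffer_hours / estimated_task_hours)

def hours_to_clear(tasks, threads, actual_task_hours):
    """Wall-clock hours to finish the tasks when `threads` run at a time."""
    return math.ceil(tasks / threads) * actual_task_hours

threads = 12
buffer_hours = 24   # hypothetical one-day cache

tasks = tasks_to_fill_buffer(threads, buffer_hours, estimated_task_hours=8)
print(tasks)                                                 # 36
print(hours_to_clear(tasks, threads, actual_task_hours=24))  # 72
```

In this model the over-fetch is bounded by the ratio of the actual run time to the estimate, so it cannot grow without limit the way a badly wrong estimate for a new application could.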
Grant
Darwin NT
ID: 98218 · Rating: 0
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1465
Credit: 14,162,844
RAC: 15,905
Message 98221 - Posted: 18 Jul 2020, 23:22:56 UTC - in response to Message 98215.  

I use a 4-hour target time, and all the WUs show a time to completion of 8 hours. But they complete in four and a half hours.
The CPU is doing things other than just crunching BOINC work, that's why the discrepancy between CPU time & Runtime.

Your system.
Run time  4 hours 17 min 44 sec
CPU time  3 hours 57 min 49 sec


My lightly used system.
Run time  7 hours 55 min 24 sec
CPU time  7 hours 52 min 27 sec


My dedicated cruncher.
Run time  7 hours 58 min 13 sec
CPU time  7 hours 57 min 30 sec
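Expressed as a ratio of CPU time to run time, the three sets of figures above compare as follows. A small sketch; the dictionary labels are just shorthand for the three systems, and "efficiency" is an informal name for the ratio.

```python
def to_seconds(hours, minutes, seconds):
    return hours * 3600 + minutes * 60 + seconds

# (run time, CPU time) for the three systems quoted above
systems = {
    "Daedalus": (to_seconds(4, 17, 44), to_seconds(3, 57, 49)),
    "lightly used": (to_seconds(7, 55, 24), to_seconds(7, 52, 27)),
    "dedicated": (to_seconds(7, 58, 13), to_seconds(7, 57, 30)),
}

# Percentage of the run time during which the task actually had the CPU.
efficiency = {name: 100 * cpu_s / run_s for name, (run_s, cpu_s) in systems.items()}

for name, percent in efficiency.items():
    print(f"{name}: {percent:.1f}% of run time spent on the CPU")
```

The first system comes out at a little over 92%, while the other two are above 99%.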

Grant
Darwin NT
ID: 98221 · Rating: 0
scott
Joined: 18 Aug 19
Posts: 4
Credit: 2,238,450
RAC: 0
Message 98235 - Posted: 19 Jul 2020, 22:34:34 UTC

Well, after updating to the newest version and changing the time setting in my profile to 8 hours, it still appears to be doing the same thing. After updating, it also downloaded more tasks than it is currently running. I usually have it stop between sets of tasks so they stay on the same schedule, and it hadn't downloaded extra tasks before the update. They aren't any smaller, so I'm not sure why they populated, but they are probably based on the incorrect estimate of 8 hours. I can get these done before the deadline; I was just curious why it happened after updating.

Picture of current tasks:

https://imgur.com/a/7AJutph
ID: 98235 · Rating: 0
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1465
Credit: 14,162,844
RAC: 15,905
Message 98239 - Posted: 20 Jul 2020, 5:44:52 UTC - in response to Message 98235.  
Last modified: 20 Jul 2020, 5:56:01 UTC

Well, after updating to the newest version and changing the time setting in my profile to 8 hours, it still appears to be doing the same thing.
Since you haven't changed anything else, of course it is.
You have so many cores & threads available, your system is on for so many hours each day, it is able to process BOINC work for a certain percentage of that time, you have your cache set to a certain size, and you have your Resource share between projects set to a certain ratio.
All the Manager is doing is meeting your settings.

If you only want it to download enough work to process, and no more, then set your cache to zero.


The larger your cache, the longer it takes to process work, the more projects you do, the more settings you change, etc, etc, etc the longer it will take for your Resource share settings to be honoured.



I also have it set to use 100% of the CPUs 80% of the time, and to stop when non-BOINC is over 75%, which doesn't happen too often.
That's part of your Task processing time problem. Setting "Use at most xx % of CPU time" to anything less than 100% means Tasks will take much longer than they should. If you have a problem with your system cooling, then limit the number of cores/threads in use, but keep "Use at most xx % of CPU time" at 100%.
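As a rough illustration of the slowdown, assume the setting behaves like a simple duty cycle, so a Task only accumulates CPU time for the configured fraction of wall-clock time. That model and the `wall_clock_hours` helper are assumptions for this sketch, not a description of how the BOINC client measures run time.

```python
def wall_clock_hours(cpu_hours, cpu_time_fraction):
    """Wall-clock time needed to accumulate `cpu_hours` at a given duty cycle."""
    if not 0 < cpu_time_fraction <= 1:
        raise ValueError("cpu_time_fraction must be in (0, 1]")
    return cpu_hours / cpu_time_fraction

print(wall_clock_hours(8, 1.0))   # 8.0
print(wall_clock_hours(8, 0.8))   # 10.0
```

Under that assumption an 8-hour Task takes about 10 hours at 80%, before any other load on the machine is counted.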

And i personally don't bother with "Suspend when non-BOINC CPU usage is above --- %" at all. Rosetta (and most other BOINC projects) all run at idle priority. If something else starts running, it gets as much CPU time as it needs. No need to stop BOINC from processing, that happens anyway.
And other than badly behaved web page scripts, very few other general use programmes require even that much CPU time.

If you have a programme that really needs the CPU time when it's running (e.g. rendering), then you can use the Exclusive applications option in the BOINC Manager to suspend BOINC when that application starts up and resume it when the application exits.
Grant
Darwin NT
ID: 98239 · Rating: 0
Daedalus
Joined: 1 Aug 08
Posts: 39
Credit: 9,948,624
RAC: 1,251
Message 98338 - Posted: 25 Jul 2020, 21:35:24 UTC - in response to Message 98216.  

I use a 4-hour target time, and all the WUs show a time to completion of 8 hours. But they complete in four and a half hours.


Do you split time with other BOINC projects?


I used to. But it was too much of a hassle so my main rig works for folding now.

Neither of my two computers is a pure crunching box. I use them for everyday work. I even used to game on my main rig before the COVID crisis.
ID: 98338 · Rating: 0



©2024 University of Washington
https://www.bakerlab.org