"Completed, too late to validate" - result of too short estimated runtime?

Chris

Joined: 12 Apr 06
Posts: 6
Credit: 13,598,060
RAC: 0
Message 94526 - Posted: 15 Apr 2020, 10:21:47 UTC

Hi all,
For about a week now, I've been contributing next to nothing, possibly because:
- estimated runtime for my rig is about 5 hrs per task
- real runtime for most WUs is between 20 and 24 hrs
- deadline for new WUs is very short

As a result, the majority of WUs are either aborted or not validated, because they finish too late.
I've set my job cache to 0.8 + 0.1, two days ago, but this morning it's still the same: 24 hrs of crunching for nothing, too late.
There are unstarted WUs in the queue with a deadline of today, which I know I have to abort, or they will be a waste of CPU.
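
To put rough numbers on the mismatch described above, here is a minimal sketch. It assumes a deliberately simplified model in which the client fills the cache (0.8 + 0.1 days) per core using its estimated runtime; the real BOINC work-fetch logic is more involved, and the 22 hr figure is just the midpoint of the 20 to 24 hr range mentioned above.

```python
def tasks_per_core(window_hours, hours_per_task):
    """Number of tasks that fit into a time window on one core."""
    return window_hours / hours_per_task

# Cache setting from the post: 0.8 + 0.1 days, converted to hours.
window = (0.8 + 0.1) * 24

# What the client believes fits (5 hr estimate) vs. what really fits
# (22 hr is an assumed midpoint of the observed 20-24 hr runtimes).
fetched = tasks_per_core(window, 5)
finishable = tasks_per_core(window, 22)

print(f"fetched ~{fetched:.2f}, finishable ~{finishable:.2f} tasks per core")
```

Under those assumptions the client asks for roughly four times more work per core than it can finish within the cache window, which is consistent with tasks sitting in the queue until their deadline has passed.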

I'm using two PCs with Win10, one is 6800K, the other 9700K, both with similar issues.
What could I possibly do to make sure those WUs are processed and validated?

Chris
Profile Grant (SSSF)

Joined: 28 Mar 20
Posts: 1507
Credit: 14,928,967
RAC: 21,309
Message 94531 - Posted: 15 Apr 2020, 11:24:40 UTC - in response to Message 94526.  

What could I possibly do to make sure those WUs are processed and validated?

Chris
It all depends on what values your other settings have.
For starters, go to your Account, Rosetta@home preferences, make sure "Target CPU run time" is "not selected" and save the changes.
Hit update on the BOINC Manager for changes to take effect.
That will mean the Target CPU Runtime for any newly started Tasks will be 8 hours (the default). Tasks that have already started will continue to run with the previous setting.


I've set my job cache to 0.8 + 0.1, two days ago,
If you're running more than one project, that is still too big.

With your computer(s) hidden it's not possible to see what else is going on, but I'd suggest the following settings, particularly the much smaller cache setting:
Computing
   Usage limits	
                                   Use at most 100% of the CPUs
                                   Use at most 100% of CPU time

   When to suspend	
           Suspend when computer is on battery (not selected)
               Suspend when computer is in use (not selected)
 Suspend GPU computing when computer is in use (not selected)
   'In use' means mouse/keyboard input in last 3 minutes
  Suspend when no mouse/keyboard input in last --- minutes
     Suspend when non-BOINC CPU usage is above --- %
                          Compute only between ---

   Other	
                                Store at least 0.3 days of work
                     Store up to an additional 0.02 days of work
                    Switch between tasks every 60 minutes
     Request tasks to checkpoint at most every 60 seconds

   Disk
                              Use no more than 20 GB
                                Leave at least 2 GB free
                              Use no more than 60 % of total

   Memory
          When computer is in use, use at most 95 %
      When computer is not in use, use at most 95 %
 Leave non-GPU tasks in memory while suspended (not selected)
                   Page/swap file: use at most 75 %
It will still take some time for the Estimated completion times to match the Target CPU Runtime, so some Tasks may still time out, but it will help stop you from getting even more work that you won't be able to finish.
And it will allow the BOINC Manager to eventually honour your Resource share settings once the Estimated completion times get closer to reality.
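
As a rough illustration of why the smaller cache helps, the sketch below compares the worst-case turnaround of a freshly downloaded Task under the old and the suggested settings. It assumes a simplified model (the buffer is sized per core from the estimated runtime, then worked through at the real runtime); `turnaround_hours` is a hypothetical helper, not anything from BOINC itself, and the 22 hr real runtime is an assumed midpoint of the range reported earlier in the thread.

```python
def turnaround_hours(cache_days, est_hours, real_hours):
    """Hours from download to completion for the last task in the buffer.

    Simplified model: the buffer holds cache_days of work as measured by
    the estimate, but each buffered task really takes real_hours.
    """
    buffered_tasks = cache_days * 24 / est_hours
    wait = buffered_tasks * real_hours   # time spent queued behind other tasks
    return wait + real_hours             # plus the task's own runtime

# Old situation: 0.8 + 0.1 day cache, 5 hr estimate, ~22 hr real runtime.
old = turnaround_hours(0.8 + 0.1, est_hours=5, real_hours=22)

# Suggested settings once estimates match an 8 hr Target CPU Runtime.
new = turnaround_hours(0.3 + 0.02, est_hours=8, real_hours=8)

print(f"old: {old:.1f} h ({old / 24:.1f} days)")
print(f"new: {new:.1f} h ({new / 24:.1f} days)")
```

In this model the old settings leave the last buffered Task finishing almost five days after download, while the suggested ones bring that under one day, so a short deadline is much easier to meet.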

"Use at most 100% of the CPUs" may cause memory problems with Tasks that require more RAM, but with your system(s) hidden there's no way of knowing.
Grant
Darwin NT
Chris

Joined: 12 Apr 06
Posts: 6
Credit: 13,598,060
RAC: 0
Message 94535 - Posted: 15 Apr 2020, 12:36:22 UTC - in response to Message 94531.  

Many thanks Grant,
My computers should be visible now.
I'm running only R@H these days after SETI closed, but anyway I followed your suggestion about job cache.
The older PC is running 100%CPU/Time, but it has liquid cooling, the newer one runs 6 out of 8 cores, since it's getting too hot and loud.

Best regards from literally the other side of the globe ;)
Chris
Profile Grant (SSSF)

Joined: 28 Mar 20
Posts: 1507
Credit: 14,928,967
RAC: 21,309
Message 94572 - Posted: 15 Apr 2020, 23:00:05 UTC - in response to Message 94535.  
Last modified: 15 Apr 2020, 23:03:22 UTC

I'm running only R@H these days after SETI closed, but anyway I followed your suggestion about job cache.
It'd be worth making sure the Target CPU Runtime in the Rosetta project preferences is set to 8 hours (or better yet "not selected" so it can change according to the project's requirements).



The older PC is running 100%CPU/Time, but it has liquid cooling, the newer one runs 6 out of 8 cores, since it's getting too hot and loud.
Good choice: if things get hot or noisy, limiting cores is better than reducing "Use at most xx % of CPU time" from 100%.
Both systems have plenty of RAM for the number of cores/threads, so you won't run into problems when the next batch of Tasks with large RAM requirements comes through (roughly 1.3GB RAM per Task).
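
The RAM check behind that statement is simple arithmetic; a minimal sketch follows. The 1.3 GB per Task figure is the rough one quoted above. The task counts are assumptions: 12 concurrent Tasks for the i7-6800K (6 cores with Hyper-Threading, all threads in use) and 6 for the i7-9700K (6 of its 8 cores in use). Compare the results against the RAM actually installed, which is not stated in this thread.

```python
def ram_needed_gb(running_tasks, gb_per_task=1.3):
    """Approximate RAM needed if every running task uses gb_per_task."""
    return running_tasks * gb_per_task

# Assumed concurrent task counts (see the note above).
systems = [
    ("i7-6800K, 12 threads", 12),
    ("i7-9700K, 6 of 8 cores", 6),
]

for name, tasks in systems:
    print(f"{name}: ~{ram_needed_gb(tasks):.1f} GB for BOINC tasks")
```

Anything the operating system and other programs need comes on top of these figures.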

But I did notice an issue with the i7-6800K system.
Run time	1 days 1 hours 57 min 50 sec
CPU time	1 days 0 hours  2 min 34 sec
With 24 hours of Runtime, I'd expect the difference between CPU time & Runtime to be around 10 min or so, not 2 hrs.
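
Working the two figures above through gives the exact gap and the share of wall-clock time the Task actually spent on the CPU. This is a small sketch using only the numbers quoted above; the variable names are just for illustration.

```python
from datetime import timedelta

# Values as reported for the Task on the i7-6800K system.
run_time = timedelta(days=1, hours=1, minutes=57, seconds=50)
cpu_time = timedelta(days=1, hours=0, minutes=2, seconds=34)

gap = run_time - cpu_time          # wall-clock time not spent on this task
efficiency = cpu_time / run_time   # fraction of run time spent on the CPU

print(f"gap: {gap}")
print(f"CPU efficiency: {efficiency:.1%}")
```

The gap comes to 1:55:16, or about 92.6% efficiency. For comparison, a 10 minute gap over 24 hours would be a little over 99%.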

Is this the system you use during the day? Do you make use of any CPU intensive programmes? Because a difference between CPU time & Runtime that large shows the CPU is doing lots of work other than crunching BOINC work.
If you're making a lot of CPU-heavy use of the system yourself, no problem. If not, you might want to use Task Manager (or Process Explorer) to see what is taking that chunk of CPU time away from BOINC work.
Grant
Darwin NT

©2024 University of Washington
https://www.bakerlab.org