How to Limit CPU cores?

Questions and Answers : Preferences : How to Limit CPU cores?

hadron

Joined: 4 Sep 22
Posts: 47
Credit: 760,162
RAC: 5,164
Message 107178 - Posted: 9 Oct 2022, 0:14:12 UTC - in response to Message 107176.  

I assume that "high priority" is determined by the resource share settings,
Partially.
<snip>
And looking at your LHC Tasks, that is an issue. It's taking you 13 hrs 45 min to do 12 hrs 15 min worth of work. Same with Rosetta: it's taking you 9 hours to do 8 hours' worth of work. So that's yet another thing BOINC has to take into account when it's trying to work out what to do & when, for which project & application.
Depending on the application, GPU Tasks may require a CPU thread to support each running GPU Task. If they have to share time with a CPU Task, then processing times increase significantly. Or you have some other application on your system sucking up CPU time.

I've made the two changes you suggested in the part I snipped out. All of that makes sense to me. You did, in fact, go much further than I needed: I didn't say so, but my original question was about how priority is determined when things are "normal", or what you refer to as "balanced".
The only CPU-intensive program I run that I can think of is the weekly check on the RAID array -- 6x6TB drives, so it takes quite some time. Even before I started running BOINC, whenever that check was running, watching videos (all are on that array) was difficult, with interruptions at least in the video stream, and sometimes also in the audio or subtitle streams. So far, though, I haven't noticed any problems with BOINC tasks.
All in all, then, I suppose I could give BOINC one more thread to play with.

As for what I have quoted: that time difference I attribute to the fact that my processor has only 1 FPU per core, so two tasks running on that core must share the FPU between them. Without evidence contradicting that notion, I'm not all that concerned by this point. I don't run any GPU tasks. I have an AMD RX 570, which is capable of at best OpenCL 1.1, and if I understand correctly, BOINC (VirtualBox?) requires OpenCL 2.0. In any event, when I did enable GPU on WCG, I was not given any OPNG tasks, even though those were available at the time.
Profile Grant (SSSF)

Joined: 28 Mar 20
Posts: 1481
Credit: 14,589,068
RAC: 14,761
Message 107185 - Posted: 9 Oct 2022, 4:00:50 UTC - in response to Message 107178.  
Last modified: 9 Oct 2022, 4:08:55 UTC

That time difference I attribute to the fact that my processor has only 1 FPU per core, so two tasks running on that core must share the FPU between them.
Nope.
Look at my Run times & CPU times for Universe. Bugger all difference, even on the system I use daily. And I use all cores & threads all the time for BOINC processing.



Without evidence contradicting that notion, I'm not all that concerned by this point.
*shrug*
A system working well shouldn't have a significant difference between CPU time & Run time.
Sure, if you're doing video transcoding or some similarly CPU-intensive task, there will be a big difference in those times for Tasks done while doing that video work, if BOINC is trying to use those same cores/threads. The same goes for people who run Folding@home & BOINC but haven't reserved the cores & threads needed to support Folding, so BOINC tries to use them as well.
But for a system doing the usual email, word processing, YouTube-type things, there should be very little difference between the two. A job that runs once a week (or once a day) or so will only impact the processing of Tasks being done at that same time; the rest of the time, CPU times & Run times should be very close to the same.


Edit: e.g., the top computer on the Top Computers list (definitely a dedicated cruncher):
Run time	23 hours 58 min 27 sec
CPU time	23 hours 56 min 18 sec
Only a couple of minutes' difference (roughly 0.15%) over 24 hours of processing.
Grant
Darwin NT
hadron

Joined: 4 Sep 22
Posts: 47
Credit: 760,162
RAC: 5,164
Message 107186 - Posted: 9 Oct 2022, 4:26:07 UTC - in response to Message 107185.  
Last modified: 9 Oct 2022, 4:58:10 UTC

OK, contradictory evidence -- in which case, I have absolutely no idea why there is such a difference between run and CPU times here. The only other things I can think of that might be eating up a lot of CPU time are a Java-based BitTorrent client and Firefox (3 windows / 25 tabs), and my system monitor shows that neither of those is taking as much CPU time as just one instance of VBoxHeadless.

PS, I was running with "Use at most 95% of CPU time". Increasing that to 100 as you suggested seems to have brought the two times a little closer. That's from tasks currently running; I'll have a clearer picture once they're complete.
hadron

Joined: 4 Sep 22
Posts: 47
Credit: 760,162
RAC: 5,164
Message 107192 - Posted: 9 Oct 2022, 15:06:48 UTC

OK, the board is clear, both app_config.xml files are gone, and everything is set according to all your recommendations. Current settings are (sketched as config-file entries below):
max CPUs 11, use 100% of CPU time (it does seem to make a bit of a difference)
Store 0.2 days / Store additional 0.01 days
Suspend when non-BOINC... disabled
Recovery half-life 1.0 days
On LHC, max CPUs 2 (this affects ATLAS only)
For now, I am going to let LHC tasks come down as they will; I will figure out how to limit Theory tasks when things have settled down.
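For the record, I believe these map roughly onto the following entries in global_prefs_override.xml and cc_config.xml in the BOINC data directory -- a sketch from memory of the client documentation rather than anything authoritative, so the tag names and the exact percentage should be double-checked against your own client version (the values simply mirror the list above):

<global_preferences>
    <max_ncpus_pct>92.0</max_ncpus_pct>                        <!-- roughly 11 of 12 threads -->
    <cpu_usage_limit>100.0</cpu_usage_limit>                   <!-- "use at most 100% of CPU time" -->
    <work_buf_min_days>0.2</work_buf_min_days>                 <!-- "Store at least 0.2 days of work" -->
    <work_buf_additional_days>0.01</work_buf_additional_days>  <!-- "Store up to an additional 0.01 days" -->
    <suspend_cpu_usage>0</suspend_cpu_usage>                    <!-- 0 should disable "suspend when non-BOINC CPU usage is above..." -->
</global_preferences>

<cc_config>
    <options>
        <rec_half_life_days>1.0</rec_half_life_days>            <!-- the REC ("recovery") half-life is a client option, not a web preference -->
    </options>
</cc_config>

The LHC "Max # CPUs" limit is a project web preference, so it has no equivalent in these local files. The client should pick the files up after a restart, or after re-reading its config/prefs files from the Manager.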

On re-enabling task downloads, I have received 11 Rosetta tasks (all running) and nothing from LHC; the event log is telling me that it is not requesting LHC tasks because the job cache is full.
Changing the "Store at least...days" setting to 0.5 brought in 6 more Rosetta tasks, and 6 from LHC -- all ATLAS, which is surprising, as there are plenty of CMS and Theory tasks available also. Perhaps LHC's "max CPUs" setting does not work as advertised??
Profile Grant (SSSF)

Joined: 28 Mar 20
Posts: 1481
Credit: 14,589,068
RAC: 14,761
Message 107213 - Posted: 10 Oct 2022, 7:21:45 UTC - in response to Message 107192.  

Changing the "Store at least...days" setting to 0.5 brought in 6 more Rosetta tasks, and 6 from LHC -- all ATLAS, which is surprising, as there are plenty of CMS and Theory tasks available also. Perhaps LHC's "max CPUs" setting does not work as advertised??
Continually changing things means they will never, ever settle down. Limiting the number of CPUs for one Project will impact other Projects' processing, and not always in the way you might expect.

Leave the "Use at most x% of the CPUs" limit on to keep one thread free for your non-BOINC activities. Put the cache back to 0.2 days, and I would suggest removing the max CPU limit for LHC as well.
And then just let things be for several days, at least.
The fact that you went and increased the cache & downloaded all that other work just means it's going to take that much longer for things to finally settle down, since it now has to clear all that work, with its particular deadlines, before it can have any hope of meeting your Resource Share settings.

Reduce the cache again (removing the LHC CPU limit will most likely speed up the processing of those Tasks, which will get your Resource Share met sooner), and then just leave it alone for at least a few days. Even with the REC half-life set to 1, the extra work you got is going to impact its ability to sort things out quickly.



Perhaps LHC's "max CPUs" setting does not work as advertised??
How do you think it's meant to work???
From what I can see, it just limits the number of CPUs one of those Tasks can use. It does not limit the number or type of Tasks, just the number of CPU cores/threads they may use while being processed.
Generally, if an application needs or can use more than one thread at a time, then let it: the sooner work gets done, the sooner your Resource Share settings will be met. The longer it takes to process work, the longer it will take for those settings to be met.
Grant
Darwin NT
hadron

Joined: 4 Sep 22
Posts: 47
Credit: 760,162
RAC: 5,164
Message 107216 - Posted: 10 Oct 2022, 10:57:12 UTC - in response to Message 107213.  
Last modified: 10 Oct 2022, 11:04:08 UTC

Changing the "Store at least...days" setting to 0.5 brought in 6 more Rosetta tasks, and 6 from LHC -- all ATLAS, which is surprising, as there are plenty of CMS and Theory tasks available also. Perhaps LHC's "max CPUs" setting does not work as advertised??
Continually changing things means they will never, ever settle down. Limiting the number of CPUs for one Project will impact other Projects' processing, and not always in the way you might expect.

Leave the "Use at most x% of the CPUs" limit on to keep one thread free for your non-BOINC activities. Put the cache back to 0.2 days, and I would suggest removing the max CPU limit for LHC as well.
And then just let things be for several days, at least.
The fact that you went and increased the cache & downloaded all that other work just means it's going to take that much longer for things to finally settle down, since it now has to clear all that work, with its particular deadlines, before it can have any hope of meeting your Resource Share settings.

Reduce the cache again (removing the LHC CPU limit will most likely speed up the processing of those Tasks, which will get your Resource Share met sooner), and then just leave it alone for at least a few days. Even with the REC half-life set to 1, the extra work you got is going to impact its ability to sort things out quickly.


I'm not sure I want to tackle any of this right now. There is a backlog of 1.5 million Rosetta tasks awaiting validation. I don't see much chance of things on my system "settling down" until my completed tasks start getting validated.
Meanwhile, I can use the intervening time to play with a few of the settings to see just what they do. For example, you appear not to understand what I said previously. If I set Max CPUs on LHC to 2, then all I get are ATLAS tasks set to run on 2 threads (ATLAS is the only LHC project capable of multi-threading). I get nothing for CMS or Theory, even though the server stats show those projects have plenty of tasks available. If I set the limit to "no limit", all I get are ATLAS tasks set for 8 threads, that being the maximum number of threads ATLAS tasks can handle. There are still no tasks for the other 2 projects.
It is only when I set the number to 1 that I am sent tasks for all 3 projects.
That is why I said "does not work as advertised"; the wording "Max CPUs" suggests that tasks needing anything up to and including that number of threads should be sent, which should then always include a mix of all three projects. It does not.
What I most need to do now is find out whether a command-line setting (--nthreads N) in an app_config.xml file can override what LHC sends out (see the BOINC user manual, Client Configuration section, under "project-level configuration", for how to use this).

In all of the above, do also bear in mind that a single-threaded ATLAS task can take up to 24 hours to complete, while a CMS task can take up to 12 hours. 0.2 days is less than 5 hours, and just 2 Rosetta tasks or 3 Theory tasks -- or just 1 ATLAS task running on 8 threads -- will fill that size cache. With a cache size that small, is it not possible that I might never see anything other than Rosetta tasks, or perhaps LHC tasks -- all dependent on whose task is being reported when ET calls home?

These are all questions I feel I need to answer to my own satisfaction before trying to achieve this "balance" between Rosetta and LHC, and with the hiatus in Rosetta task validation, now seems a good time to do that.

The final fly in the ointment is that I still do not fully understand just what that "balance" is supposed to mean. You say it is a balance between the total work done by each group, Rosetta and LHC. However, the only numbers for "work done" I can see in BOINC are the awarded credits -- and LHC hands out more credits than Rosetta on a per-hour basis. Given that I have 433K credits on LHC but only 6,450 on Rosetta, finding that 80-20 balance looks like it will take a very long time to achieve, and once achieved, might never be sustainable.
Please note that I'm not trying to pick a fight with you on this last point; it is simply how things look to me at the moment.

On a more positive note, I have noticed that setting "Use at most x% of CPU time" to 100% in my local preferences has definitely improved the situation regarding Run time vs. CPU time, so I do thank you for being so insistent on that point.
KLiK

Joined: 1 Apr 14
Posts: 2
Credit: 949,857
RAC: 107
Message 107218 - Posted: 10 Oct 2022, 14:13:41 UTC - in response to Message 107216.  

This issue also pisses me off, as Rosetta simply ignores app_config, or is not taking it into account at all.

So I'm abandoning Rosetta@home and all future tasks until they (the scientists) figure out how to make software that is "inclusive" rather than "exclusive". 👍

non-profit org. Play4Life in Zagreb, Croatia, EU
hadron

Joined: 4 Sep 22
Posts: 47
Credit: 760,162
RAC: 5,164
Message 107229 - Posted: 10 Oct 2022, 19:55:38 UTC - in response to Message 107218.  

This issue also pisses me off, as Rosetta simply ignores app_config, or is not taking it into account at all.

So I'm abandoning Rosetta@home and all future tasks until they (the scientists) figure out how to make software that is "inclusive" rather than "exclusive". 👍

I haven't noticed this. Previously, I was limiting Rosetta to 2 running tasks, and this setting in the app_config file was being honoured.
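For reference, that limit came from an app_config.xml in the project's folder under the BOINC data directory (something like projects/boinc.bakerlab.org_rosetta) -- a minimal sketch, assuming the internal application name is "rosetta"; check client_state.xml for the exact name, since the project has more than one application:

<app_config>
    <app>
        <name>rosetta</name>                  <!-- assumed internal app name; confirm in client_state.xml -->
        <max_concurrent>2</max_concurrent>    <!-- no more than 2 Rosetta tasks running at once -->
    </app>
</app_config>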
Profile Grant (SSSF)

Joined: 28 Mar 20
Posts: 1481
Credit: 14,589,068
RAC: 14,761
Message 107237 - Posted: 11 Oct 2022, 4:51:05 UTC - in response to Message 107229.  

This issue also pisses me off, as Rosetta simply ignores app_config, or is not taking it into account at all.

So I'm abandoning Rosetta@home and all future tasks until they (the scientists) figure out how to make software that is "inclusive" rather than "exclusive". 👍

I haven't noticed this. Previously, I was limiting Rosetta to 2 running tasks, and this setting in the app_config file was being honoured.
There has been an issue for years with making use of the setting to limit running Tasks per Project. It has always been very intermittent as to when the bug will manifest: some people will get it without fail on a particular project, whereas others can use it without problems on that project but have the problem on a different one.
I think the latest version of BOINC is meant to include a fix for the bug, but I'm not sure that is the case.
Grant
Darwin NT
hadron

Joined: 4 Sep 22
Posts: 47
Credit: 760,162
RAC: 5,164
Message 107257 - Posted: 11 Oct 2022, 23:36:03 UTC - in response to Message 107237.  

This issue also pisses me off, as Rosetta simply ignores app_config, or is not taking it into account at all.

So I'm abandoning Rosetta@home and all future tasks until they (the scientists) figure out how to make software that is "inclusive" rather than "exclusive". 👍

I haven't noticed this. Previously, I was limiting Rosetta to 2 running tasks, and this setting in the app_config file was being honoured.
There has been an issue for years with making use of the setting to limit running Tasks per Project. It has always been very intermittent as to when the bug will manifest: some people will get it without fail on a particular project, whereas others can use it without problems on that project but have the problem on a different one.
I think the latest version of BOINC is meant to include a fix for the bug, but I'm not sure that is the case.

I never had any problems with it when I was limiting LHC tasks, so maybe it is fixed (BOINC 7.18.1).
hadron

Joined: 4 Sep 22
Posts: 47
Credit: 760,162
RAC: 5,164
Message 107267 - Posted: 12 Oct 2022, 6:13:18 UTC

OK, I am up and running, with a good mix of all task types, Rosetta, ATLAS, CMS and Theory. I did have to fudge things a bit to get to this state, but I have reset things according to previous suggestions -- and yes, Grant, I will leave things alone now :D

I even found out how to get the ATLAS tasks to run on 2 threads, even though in my LHC preferences I am only calling for single-thread tasks. Details of how to do this in the app_config.xml file are available on request.
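Roughly, it's an app_config.xml in the LHC project folder along these lines -- a sketch only, since the internal app name, the plan class, and the --nthreads option should all be confirmed against client_state.xml and LHC's own documentation before copying it:

<app_config>
    <app_version>
        <app_name>ATLAS</app_name>                          <!-- assumed internal app name -->
        <plan_class>vbox64_mt_mcore_atlas</plan_class>      <!-- assumed plan class; must match the one your Tasks use -->
        <avg_ncpus>2</avg_ncpus>                            <!-- how many threads BOINC budgets per Task -->
        <cmdline>--nthreads 2</cmdline>                     <!-- how many threads the vbox app itself is told to use -->
    </app_version>
</app_config>

If avg_ncpus is left out, BOINC keeps budgeting whatever the server sent, which could also explain a Task actually using 2 threads while BOINC shows only 1.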
There may, however, be a small glitch in this method. The running ATLAS task is definitely running on 2 threads (as determined by comparing CPU time with real time at 1-minute intervals), but BOINC thinks it is using only 1. For a time, BOINC was using all 12 threads, until I set "use at most... CPUs" to 10 (85%). The next ATLAS task should tell me whether this was just an artifact of the way I grabbed that first task, or whether a bug report is in order.

PS, omg is Rosetta ever a CPU hog!! 9 running Rosetta tasks have pushed the CPU temperature to over 85°C, a temperature I have never seen before on this system.
Profile Grant (SSSF)

Joined: 28 Mar 20
Posts: 1481
Credit: 14,589,068
RAC: 14,761
Message 107268 - Posted: 12 Oct 2022, 6:37:03 UTC - in response to Message 107267.  

PS, omg is Rosetta ever a CPU hog!! 9 running Rosetta tasks have pushed the CPU temperature to over 85°C, a temperature I have never seen before on this system.
You might want to check your CPU cooling. The Rosetta applications haven't been optimised & their thermal impact even when running at 100% load & using all cores and threads isn't particularly high.
Grant
Darwin NT
hadron

Joined: 4 Sep 22
Posts: 47
Credit: 760,162
RAC: 5,164
Message 107271 - Posted: 12 Oct 2022, 8:00:59 UTC - in response to Message 107268.  

PS, omg is Rosetta ever a CPU hog!! 9 running Rosetta tasks have pushed the CPU temperature to over 85°C, a temperature I have never seen before on this system.
You might want to check your CPU cooling. The Rosetta applications haven't been optimised & their thermal impact even when running at 100% load & using all cores and threads isn't particularly high.

It's the stock AMD cooler, a Wraith Spire. The processor's TDP is 95 W, which is what the cooler is rated for.
Maybe the fan just needs to be thoroughly cleaned -- I did that recently, but maybe I didn't get out all the junk. I'm OK so long as the CPU temperature stays well below 95°C, which is its maximum rating. Unless it starts getting close to that, I'm not going to declare an emergency.
Do you have any recommendations for a possible replacement, preferably one which does not need a back mount?
Profile Grant (SSSF)

Joined: 28 Mar 20
Posts: 1481
Credit: 14,589,068
RAC: 14,761
Message 107272 - Posted: 12 Oct 2022, 8:52:55 UTC - in response to Message 107271.  

Do you have any recommendations for a possible replacement, preferably one which does not need a back mount?
I use water cooling these days due to the high ambient temperature & humidity here (without the aircon running, the temperature in the house gets to the mid-to-high 30s °C, and at this time of year the relative humidity is often 85%+).

For an air cooler apparently it's hard to go past the Thermalright Peerless Assassin 120 SE, excellent cooling at a reasonable price, but it does require a backing plate.
I had a quick look at a couple of stores, and there's not much that's available without a backing plate these days. And those without a backing plate don't appear to be much (if any) better than the stock Wraith Spire cooler anyway.


Pulling the fan off & giving the heatsink and fan blades a good clean-out could make a difference. If the system's several years old, removing the heatsink, cleaning its base & the CPU, and applying fresh heatsink compound can also make a difference. Or do what some people have done & fit a slightly larger & faster fan for improved airflow.
But given the mucking around involved in removing & re-fitting the old heatsink, or in getting a larger fan mounted & working, I'd just go with a new, much better heatsink assembly with a backing plate (I figure it'd be a similar level of hassle, if not less, and you'd know it will be better).
Grant
Darwin NT
Bryn Mawr

Joined: 26 Dec 18
Posts: 374
Credit: 10,701,566
RAC: 5,484
Message 107277 - Posted: 12 Oct 2022, 14:05:26 UTC - in response to Message 107271.  
Last modified: 12 Oct 2022, 14:10:52 UTC


Do you have any recommendations for a possible replacement, preferably one which does not need a back mount?


On my R9 3900 rigs I've replaced the stock coolers with the uprated Wraith Prism. Although the 3900 is rated at 65 W TDP, it runs at about 88 W at 100% CPU load, and the extra cooling of the 105 W-rated Prism makes about a 10°C difference in the reported temperature.
hadron

Joined: 4 Sep 22
Posts: 47
Credit: 760,162
RAC: 5,164
Message 107311 - Posted: 13 Oct 2022, 23:37:36 UTC

Thanks for the cooler recommendations, guys. If/when I do replace the Wraith Spire, I think it will be with a Cooler Master Hyper 212 EVO. My reasoning is as follows:

    I have no room for a water cooler, otherwise I'd go with that.

    The only Wraith Prisms I can find available here are all used items taken from what I assume are recycling facilities in China. All you get is the fan, nothing else, and many are reported to be damaged.

    The Hyper 212 and the Thermalright Grant mentioned are comparable. The PA 120 SE has slightly higher airflow (66 vs 63 CFM), but the Hyper 212 EVO has significantly higher static pressure (2.5 vs 1.5 mm H2O), which IMO is the more important of the two.

    And finally, I can get the PA 120 for $88 Canadian, plus tax, while the Hyper 212 EVO is only $50.



On the crunching side, I've finally got things sorted out. Looks like ATLAS tasks will have to run on one thread for now. I miscued, and wound up with 3 of them running on 2 threads each, with a rather large basket of other tasks as well. Things do not go well when you have 11 tasks running wanting a total of 14 threads, when there are only 12 available!! So I had to babysit things until those 3 tasks were finished.
I'll let things go for a few days or more as they are, then maybe think about limiting the number of simultaneous Theory tasks I allow.
