Only using 3 out of 4 available cores.

dgnuff
Joined: 1 Nov 05
Posts: 350
Credit: 24,773,605
RAC: 0
Message 75564 - Posted: 4 May 2013, 21:48:32 UTC

This system of mine:

https://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=1088056

is a quad-core Intel Q8200. I was checking on it this morning and found that it's only running three Rosetta processes and nothing else, meaning one core is idle.

I've checked the local preferences here which say to use at most 16 cores, and I've set the local preferences to use 100% of the CPU.

Any ideas?
dgnuff
Joined: 1 Nov 05
Posts: 350
Credit: 24,773,605
RAC: 0
Message 75565 - Posted: 4 May 2013, 22:35:24 UTC - in response to Message 75564.  

Weird. No idea at all.

I used Process Explorer to kill boinc and its children, and then restarted from the shortcut in Programs\Startup. All 4 cores are now active.

mikey
Joined: 5 Jan 06
Posts: 1894
Credit: 8,756,248
RAC: 13,174
Message 75566 - Posted: 5 May 2013, 11:04:17 UTC - in response to Message 75565.  

Weird. No idea at all.

I used Process Explorer to kill boinc and its children, and then restarted from the shortcut in Programs\Startup. All 4 cores are now active.


Hmm, I have no idea either. It could have been a bad unit that crashed but didn't let go of its resources, but I really don't know.
mikey
Joined: 5 Jan 06
Posts: 1894
Credit: 8,756,248
RAC: 13,174
Message 75568 - Posted: 6 May 2013, 11:18:21 UTC

Was reading a thread at MilkyWay and saw this:
OP "I just noticed it starting to happen again. It was only running 10 CPU tasks, then 9.

In order to fix it, I seem to have to both reset the project and restart BOINC.

Kind of weird. The only change I've made somewhat recently was getting my GPUs to run more tasks at once, but this issue doesn't seem to affect the GPU crunching, which stays at 4 tasks.

I added the XML files to change how many tasks the GPUs run.

cc_config.xml in the BOINC folder:

<cc_config>
  <options>
    <use_all_gpus>1</use_all_gpus>
  </options>
</cc_config>



app_config.xml in the milkyway.cs.rpi.edu_milkyway folder:

<app_config>
  <app>
    <name>milkyway</name>
    <max_concurrent>4</max_concurrent>
    <gpu_versions>
      <gpu_usage>.5</gpu_usage>
      <cpu_usage>.05</cpu_usage>
    </gpu_versions>
  </app>
</app_config>



Not sure if that is related, but I added those in fairly recently, and hadn't been having the issue before that.

But then also, I did recently update to 7.0.64 in order for the second config file to work, so it may just be a weird intermittent issue with the version too."

Reply: "Bump that max concurrent up to 12, as that setting also includes CPU work."

Are you using an app_config file like that too?
Polian
Joined: 21 Sep 05
Posts: 152
Credit: 10,141,266
RAC: 0
Message 75570 - Posted: 6 May 2013, 13:52:49 UTC

I noticed your machine has 2GB of memory. Some of these Rosetta tasks have been using over 1GB of memory per process lately. Perhaps BOINC has detected that you are simply out of physical memory and is not launching a fourth instance.
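
As a rough illustration of that reasoning (a sketch only, not BOINC's actual scheduling logic; the function name, the `ram_fraction` parameter, and the 600 MB figure are made up for the example), the number of tasks that can run at once is bounded by the RAM budget divided by the per-task footprint:

```python
def tasks_that_fit(total_ram_mb, per_task_mb, cores=4, ram_fraction=1.0):
    """Rough upper bound on concurrent tasks; not BOINC's real algorithm."""
    # Share of physical RAM that BOINC is allowed to use (assumed setting).
    budget_mb = total_ram_mb * ram_fraction
    # Never more tasks than cores, never more than fit in the RAM budget.
    return min(cores, int(budget_mb // per_task_mb))


print(tasks_that_fit(2048, 1024))  # tasks near 1 GB each on a 2 GB machine
print(tasks_that_fit(2048, 600))   # hypothetical 600 MB tasks
```

With 2 GB and tasks near 1 GB each this gives 2; with the hypothetical 600 MB tasks it gives 3, which on a quad core would look like one idle core.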
mikey
Joined: 5 Jan 06
Posts: 1894
Credit: 8,756,248
RAC: 13,174
Message 75579 - Posted: 7 May 2013, 11:56:12 UTC

I am going to add a note about an additional app_config.xml line that may help fix the problem. I am cutting and pasting this from the MilkyWay boards, where I saw it.
The line you need to add is:
<max_concurrent>?</max_concurrent>

The question mark should be replaced by a number. As the reply there put it:
"The app_config is for both CPU and GPU.
And make that 16 as you have a 12 core machine plus those 2 GTX680's."

Obviously, dgnuff, your own PC would dictate your numbers instead of the "16" in the above example. In that example he is assuming 2 tasks per GPU on the 2 GPUs (4 in total) plus the 12 CPU threads to come up with 16.
I hope this works for you!
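
For what it's worth, the arithmetic behind that "16" can be sketched like this. The helper names are hypothetical, and the sketch assumes, as the MilkyWay reply says, that max_concurrent counts the CPU and GPU tasks of the app together:

```python
def max_concurrent_value(cpu_threads, gpus, tasks_per_gpu):
    """Total tasks of one app allowed at once: CPU tasks plus GPU tasks."""
    return cpu_threads + gpus * tasks_per_gpu


def app_config_line(value):
    """Format the single app_config.xml line discussed above."""
    return f"<max_concurrent>{value}</max_concurrent>"


# The MilkyWay example: 12 CPU threads, 2 GPUs, 2 tasks per GPU.
print(app_config_line(max_concurrent_value(12, 2, 2)))
```

For the MilkyWay machine this prints the line with 16. A quad core running no GPU tasks would use 4 instead.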
Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 75597 - Posted: 10 May 2013, 20:54:53 UTC

BOINC GPU applications require CPU resources to get them started, feed them work, and organize their results. The amount of CPU resource varies greatly depending on which BOINC project you are running, but it's possible that the BOINC Manager has essentially estimated that one CPU is busy keeping the GPU busy. I believe the projects provide the BOINC Manager some estimate of CPU % per active GPU task.

Rosetta Moderator: Mod.Sense
David Ball

Joined: 25 Nov 05
Posts: 25
Credit: 1,439,333
RAC: 0
Message 75627 - Posted: 19 May 2013, 9:18:52 UTC - in response to Message 75564.  


I've checked the local preferences here which say to use at most 16 cores, and I've set the local preferences to use 100% of the CPU.
Any ideas?


First, I'm fairly certain that the most recent versions (like 7.0.64) of boinc ignore the "use at most 16 cores" field and calculate the number of cores to use from the "use 100% of the CPU" field.
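
A minimal sketch of that calculation, assuming the client multiplies the detected core count by the percentage, rounds down, and never goes below one core. The function name is hypothetical and the rounding rule is an assumption, not something checked against the BOINC source:

```python
def usable_cores(detected_cores, cpu_percent):
    """Cores BOINC would use for a given 'use at most N% of the CPUs' value.

    Assumption: round down, but always allow at least one core.
    """
    return max(1, int(detected_cores * cpu_percent // 100))


for pct in (100, 75, 10):
    print(pct, usable_cores(4, pct))
```

Under that assumption a quad core at 100% uses 4 cores, while a setting of 75% would give exactly the 3-of-4 symptom, so the percentage field is worth double-checking.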

Second, there's the 2 GB memory problem. What percentage of the memory and swap space do you have boinc set to use? Is boinc set to "Leave applications in memory while suspended"?

If it's a memory problem you usually end up with a work unit that says "Waiting for memory" instead of running but there might be some situations where it will simply not start up a 4th work unit. I very rarely have a memory problem but I did see one today on a CPU work unit and it was because a GPU work unit grabbed a lot of memory. Apparently, boinc gives priority to the GPU work unit when it decides to make something wait for memory so it suspended a CPU work unit with the status message "Waiting for memory".

The operating system can change the amount of memory that it uses or start up some maintenance task and that can cause it to use more memory. In fact, the boinc manager is a separate program and it requires 20+ MB of memory. I'm attached to a lot of projects and my boinc manager is currently using 29MB and has used a peak of 54MB of memory with a commit size (IIRC this includes swap space) of 77MB.

I always set my page file to at least twice the size of memory so on that machine I would set it to a 4GB swap space and configure boinc to allow it to use up to 75% of the swap space.
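
Worked through for that 2GB machine, the numbers come out as below. This is just the rule of thumb above written down; the function names are made up for the example:

```python
def page_file_gb(ram_gb, factor=2):
    """Page file size under the 'at least twice the RAM' rule of thumb."""
    return ram_gb * factor


def boinc_swap_gb(page_file_size_gb, swap_fraction=0.75):
    """Swap space BOINC may use when allowed a given fraction of the page file."""
    return page_file_size_gb * swap_fraction


pf = page_file_gb(2)          # 2 GB of RAM
print(pf, boinc_swap_gb(pf))  # page file size, and BOINC's 75% share of it
```

So a 4 GB page file with the 75% setting leaves BOINC up to 3 GB of swap to work with.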

Boinc.exe 7.0.x has a problem where it gradually uses more memory the longer it runs. I saw some of the test versions of boinc.exe in the 7.0.5x and 7.0.6x series actually reach over 450MB of memory, but I'm a boinc alpha tester and was running boinc.exe with all kinds of debug options set in cc_config.xml when that happened. With the normal options, boinc.exe doesn't grow much, but you have so little memory that even a little growth could affect things.

Basically, I think you need to find a way to get more memory in that machine and make sure you have at least a 4GB page file.

Regards,

David Ball
Have you read a good Science Fiction book lately?
