There's a max WU of 8 with Virtualbox

Message boards : Number crunching : There's a max WU of 8 with Virtualbox

Dougga

Joined: 27 Nov 06
Posts: 28
Credit: 5,248,050
RAC: 0
Message 104834 - Posted: 16 Feb 2022, 9:28:27 UTC

I have the new Intel Core i9-12900K, which has 24 threads and 16 cores.
Boinc/Virtualbox will only run 8 work units for some reason.

The Boinc UI doesn't seem to have a max, but I'm still looking.

Can someone help me out?
ID: 104834 · Rating: 0
MJH333

Joined: 29 Jan 21
Posts: 18
Credit: 5,748,861
RAC: 0
Message 104835 - Posted: 16 Feb 2022, 9:55:17 UTC - in response to Message 104834.  

It’s a lack of RAM.

See Falconet’s post (Feb 11 at 5:16pm) on the World Community Grid forum here https://www.worldcommunitygrid.org/forums/wcg/viewthread_thread,44037_offset,0
ID: 104835 · Rating: 0
Grant (SSSF)

Joined: 28 Mar 20
Posts: 1634
Credit: 16,775,951
RAC: 13,112
Message 104836 - Posted: 16 Feb 2022, 11:10:38 UTC - in response to Message 104835.  
Last modified: 16 Feb 2022, 11:12:29 UTC

It’s a lack of RAM.
Or Disk space even if you have enough RAM.
For Python Tasks roughly 3.5GB of RAM per Task is required (even though much less is actually used), and 7.5GB of disk space is needed per Task.
For Rosetta 4.20 Tasks, allowing 1.3GB of RAM per Task means you won't run into lack of memory issues for that work type.
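
As a rough illustration of the RAM figures above, here is a minimal sketch that estimates how many tasks fit into the share of memory BOINC is allowed to use. The 90% memory setting and the function name are assumptions for the example, not values reported by the original poster.

```python
import math

# Per-task RAM allowances taken from the post above.
PYTHON_TASK_GB = 3.5    # Python (VirtualBox) tasks
ROSETTA_420_GB = 1.3    # Rosetta 4.20 tasks


def max_tasks_by_ram(total_ram_gb: float, ram_pct: float, per_task_gb: float) -> int:
    """Count the tasks whose RAM allowance fits in BOINC's permitted share.

    ram_pct is the "use at most xx %" memory preference (assumed here).
    """
    allowed_gb = total_ram_gb * ram_pct / 100.0
    return math.floor(allowed_gb / per_task_gb)


# Hypothetical 32GB machine with BOINC limited to 90% of RAM.
python_fit = max_tasks_by_ram(32, 90, PYTHON_TASK_GB)
r420_fit = max_tasks_by_ram(32, 90, ROSETTA_420_GB)
print(f"Python tasks that fit: {python_fit}")
print(f"Rosetta 4.20 tasks that fit: {r420_fit}")
```

Under those assumptions the Python count comes out well below the thread count of a 24-thread CPU, so RAM rather than CPU would be the limit.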

Check your Event log for the messages about what Rosetta needs, and you still need to run the BOINC benchmarks.
Grant
Darwin NT
ID: 104836 · Rating: 0
computezrmle

Joined: 9 Dec 11
Posts: 63
Credit: 9,680,103
RAC: 0
Message 104837 - Posted: 16 Feb 2022, 12:27:30 UTC - in response to Message 104836.  

... you still need to run the BOINC benchmarks.

Not necessarily if the server runs a recent BOINC version (Rosetta does).
Then the benchmark results are just taken to initialize corresponding "speed" fields in the app_version record for that client.
Once initialized the server recalculates the "speed" values based on the reported runtimes.
Those updated values are sent back to the client as <flops>.

Example (all from the same client):
Benchmark p_fpops: 7271969760.658784
Rosetta 4.20 flops: 2777777850.436758
Rosetta python flops: 3409656081.627964

If a client has never sent a benchmark result, 1000000000.000000 (p_fpops) is used as the default.
The more p_fpops and flops differ, the more the credits and runtime estimates jump up/down when a new app_version is sent out or the flops are reset server side. It also takes longer until the numbers return to values the volunteer is familiar with.
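
To show why the speed value matters, here is a small sketch assuming the runtime estimate is the task's estimated operation count divided by the speed value in use. The `RSC_FPOPS_EST` figure is made up for illustration; the three speed values are the ones listed above.

```python
# Speed values from the example above (floating-point operations per second).
P_FPOPS = 7271969760.658784               # client benchmark
ROSETTA_420_FLOPS = 2777777850.436758     # server-adjusted, Rosetta 4.20
ROSETTA_PYTHON_FLOPS = 3409656081.627964  # server-adjusted, Rosetta python
DEFAULT_P_FPOPS = 1_000_000_000.0         # default when no benchmark was sent

# Hypothetical operation count for one task; NOT a real Rosetta value.
RSC_FPOPS_EST = 8.0e13


def estimated_runtime_seconds(fpops_est: float, flops: float) -> float:
    """Assumed model: estimated runtime = estimated operations / speed."""
    return fpops_est / flops


for label, flops in [
    ("benchmark p_fpops", P_FPOPS),
    ("Rosetta 4.20 flops", ROSETTA_420_FLOPS),
    ("Rosetta python flops", ROSETTA_PYTHON_FLOPS),
    ("default p_fpops", DEFAULT_P_FPOPS),
]:
    hours = estimated_runtime_seconds(RSC_FPOPS_EST, flops) / 3600
    print(f"{label:>22}: {hours:6.2f} h")
```

The bigger the gap between the initial value and the server-adjusted one, the further the first estimates sit from what the task really takes.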
ID: 104837 · Rating: 0
Greg_BE

Joined: 30 May 06
Posts: 5690
Credit: 5,859,226
RAC: 10
Message 104848 - Posted: 16 Feb 2022, 20:29:41 UTC - in response to Message 104837.  

... you still need to run the BOINC benchmarks.

Not necessarily if the server runs a recent BOINC version (Rosetta does).
Then the benchmark results are just taken to initialize corresponding "speed" fields in the app_version record for that client.
Once initialized the server recalculates the "speed" values based on the reported runtimes.
Those updated values are sent back to the client as <flops>.

Example (all from the same client):
Benchmark p_fpops: 7271969760.658784
Rosetta 4.20 flops: 2777777850.436758
Rosetta python flops: 3409656081.627964

If a client has never sent a benchmark result, 1000000000.000000 (p_fpops) is used as the default.
The more p_fpops and flops differ, the more the credits and runtime estimates jump up/down when a new app_version is sent out or the flops are reset server side. It also takes longer until the numbers return to values the volunteer is familiar with.



I'm running 9 Python and 6 x 4.2 right now. Forgot to take BOINC out of suspend mode before I went to work. So Rosie is complaining.
ID: 104848 · Rating: 0
computezrmle

Joined: 9 Dec 11
Posts: 63
Credit: 9,680,103
RAC: 0
Message 104849 - Posted: 16 Feb 2022, 20:50:56 UTC - in response to Message 104848.  

I'm running 9 Python and 6 x 4.2 right now. Forgot to take BOINC out of suspend mode before I went to work. So Rosie is complaining.

Did you read my post?
It appears that you didn't as I don't see any relationship.

You nearly always make full copies of all comments you find (including replies to ... including replies to ...) and clutter the forum with them.
It would be easier for everybody reading it if you would focus on the small pieces you want to refer to.
Everything else can be read in the original posts close to the replies which can be found via an existing link.
ID: 104849 · Rating: 0
Dougga

Joined: 27 Nov 06
Posts: 28
Credit: 5,248,050
RAC: 0
Message 104850 - Posted: 16 Feb 2022, 21:17:11 UTC - in response to Message 104835.  
Last modified: 16 Feb 2022, 21:56:17 UTC

It’s a lack of RAM.


There are no complaints in the event logs regarding ram.
Looking at Resource Monitor, it claims...

Memory:
Hardware reserved: 259 MB
In Use: 18756 MB
Modified: 148 MB
Standby: 13608 MB <----- this is oddly high
Free: 1 MB

It appears to me that the VMs are reserving an enormous amount of memory they are not using, which is the problem.
An additional 32GB ram is back-ordered so it will be added shortly.
Thanks.
ID: 104850 · Rating: 0
Greg_BE

Joined: 30 May 06
Posts: 5690
Credit: 5,859,226
RAC: 10
Message 104851 - Posted: 16 Feb 2022, 21:40:01 UTC - in response to Message 104849.  

Well then why not reply to the reply and skip the quote?
Geeess...you need some whine to go with that?
ID: 104851 · Rating: 0
Sid Celery

Joined: 11 Feb 08
Posts: 2074
Credit: 40,613,760
RAC: 5,140
Message 104854 - Posted: 16 Feb 2022, 23:05:54 UTC - in response to Message 104836.  
Last modified: 16 Feb 2022, 23:10:00 UTC

It’s a lack of RAM.
Or Disk space even if you have enough RAM.
For Python Tasks roughly 3.5GB of RAM per Task is required (even though much less is actually used), and 7.5GB of disk space is needed per Task.
For Rosetta 4.20 Tasks, allowing 1.3GB of RAM per Task means you won't run into lack of memory issues for that work type.

Check your Event log for the messages about what Rosetta needs, and you still need to run the BOINC benchmarks.

I have a weird thing going on with Disk space, over and above the understandable RAM limitations I have

Based on
Disk
Use no more than xx disk space - unselected
Leave at least 1GB free - selected
Use no more than xx% of total - unselected

RAM
When computer is in use, use at most 80%
When computer is not in use, use at most 90%
Leave non-GPU tasks in memory while suspended - selected
Page/swap file: use at most 75%

Event log shows my preferences as
16/02/2022 16:59:22 | | Reading preferences override file
16/02/2022 16:59:22 | | Preferences:
16/02/2022 16:59:22 | | max memory usage when active: 26143.12 MB
16/02/2022 16:59:22 | | max memory usage when idle: 29411.01 MB
16/02/2022 16:59:22 | | max disk usage: 826.78 GB
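
As a quick sanity check on those two memory lines, the sketch below backs out the total RAM they imply from the 80% and 90% settings, then counts how many Python tasks would fit. The 3.5GB per-task figure is the rough estimate quoted at the top of this post and is treated here as an assumption.

```python
import math

# Limits copied from the Event Log lines above.
ACTIVE_LIMIT_MB = 26143.12  # "max memory usage when active" (80% setting)
IDLE_LIMIT_MB = 29411.01    # "max memory usage when idle" (90% setting)

# Assumption: roughly 3.5GB reserved per Python task, as quoted above.
PER_PYTHON_TASK_MB = 3.5 * 1024


def implied_total_mb(limit_mb: float, pct: float) -> float:
    """Total RAM that would produce this limit at the given percentage."""
    return limit_mb / (pct / 100.0)


def tasks_that_fit(limit_mb: float, per_task_mb: float) -> int:
    """Whole number of tasks that fit under the memory limit."""
    return math.floor(limit_mb / per_task_mb)


total_from_active = implied_total_mb(ACTIVE_LIMIT_MB, 80)
total_from_idle = implied_total_mb(IDLE_LIMIT_MB, 90)
print(f"Implied total RAM: {total_from_active:.1f} MB / {total_from_idle:.1f} MB")
print("Python tasks when active:", tasks_that_fit(ACTIVE_LIMIT_MB, PER_PYTHON_TASK_MB))
print("Python tasks when idle:", tasks_that_fit(IDLE_LIMIT_MB, PER_PYTHON_TASK_MB))
```

Both lines point at the same total, which is what you would expect if the two limits are simple percentages of installed RAM.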


I have 7 python tasks running and 3 python tasks "waiting to run"

Event log shows:
16/02/2022 22:39:26 | Rosetta@home | Message from server: rosetta python projects needs 2059.75MB more disk space. You currently have 17013.73 MB available and it needs 19073.49 MB.

I get the RAM limitations, though there's no complaint about that in the event log, but how come there's only 17GB of disk space left out of 826GB to download another 19GB task?

Then, while I was typing this, 8 Rosetta 4.20 tasks got downloaded without complaint and are running.
Am I supposed to understand this, or is this just how it goes? I doubt I'd have the RAM to run any more tasks anyway; they'd just be a buffer.

Edit:
Disk tab shows:
Used by Boinc 84.96Gb
Free, available to Boinc: 748.68Gb
Free, not available to Boinc: 1.00Gb
Used by other programs: 96.27Gb
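
For anyone wanting to pull the numbers out of that server message, here is a minimal sketch. The regular expression is written against the single message quoted above, so it is an assumption that other disk messages follow the same wording.

```python
import re

# The server message exactly as it appears in the Event Log above.
MESSAGE = (
    "rosetta python projects needs 2059.75MB more disk space. "
    "You currently have 17013.73 MB available and it needs 19073.49 MB."
)

PATTERN = re.compile(
    r"needs (?P<short>[\d.]+)\s*MB more disk space\. "
    r"You currently have (?P<have>[\d.]+)\s*MB available "
    r"and it needs (?P<need>[\d.]+)\s*MB"
)


def parse_disk_message(text: str):
    """Return the three MB figures as floats, or None if the text differs."""
    match = PATTERN.search(text)
    if match is None:
        return None
    return {key: float(value) for key, value in match.groupdict().items()}


info = parse_disk_message(MESSAGE)
shortfall_mb = info["need"] - info["have"]
need_bytes = info["need"] * 1024 * 1024
print(f"Shortfall recomputed: {shortfall_mb:.2f} MB (log says {info['short']:.2f} MB)")
print(f"Requirement in bytes: {need_bytes:,.0f}")
```

The recomputed shortfall agrees with the logged one to within rounding, so the message is internally consistent. The open question is why only about 17GB counts as available.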
ID: 104854 · Rating: 0
Grant (SSSF)

Joined: 28 Mar 20
Posts: 1634
Credit: 16,775,951
RAC: 13,112
Message 104863 - Posted: 17 Feb 2022, 8:27:11 UTC - in response to Message 104850.  

It’s a lack of RAM.


There are no complaints in the event logs regarding ram.
So what messages are in the Event log when it requests more work?
If BOINC doesn't have enough RAM or disk space to start more tasks, it generally mentions it in the Event log, as many people have posted here many times ever since Python work was released.

Keep in mind it doesn't matter how much RAM your system has, if you don't let BOINC make use of it.
In Computing preferences, Memory, there are "When computer is in use, use at most xx %" and "When computer is not in use, use at most xx %"; generally set both to at least 95% so BOINC can make use of it.
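
If you prefer a file to the GUI, the sketch below builds the equivalent local override. It assumes the override file is `global_prefs_override.xml` with a `global_preferences` root and the `ram_max_used_busy_pct` / `ram_max_used_idle_pct` elements; check your own client's file before relying on it. The function name is made up for the example, and it only returns a string rather than writing into the BOINC data directory.

```python
import xml.etree.ElementTree as ET


def build_memory_override(busy_pct: float, idle_pct: float) -> str:
    """Return override XML setting the two memory percentages.

    Element names are assumed to match the client's preference names.
    """
    for pct in (busy_pct, idle_pct):
        if not 0 < pct <= 100:
            raise ValueError(f"percentage out of range: {pct}")
    root = ET.Element("global_preferences")
    ET.SubElement(root, "ram_max_used_busy_pct").text = f"{busy_pct:.2f}"
    ET.SubElement(root, "ram_max_used_idle_pct").text = f"{idle_pct:.2f}"
    if hasattr(ET, "indent"):  # pretty-print where available (Python 3.9+)
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


print(build_memory_override(95, 95))
```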
Grant
Darwin NT
ID: 104863 · Rating: 0
Grant (SSSF)

Joined: 28 Mar 20
Posts: 1634
Credit: 16,775,951
RAC: 13,112
Message 104864 - Posted: 17 Feb 2022, 8:35:24 UTC - in response to Message 104854.  

I have a weird thing going on with Disk space, over and above the understandable RAM limitations I have

Based on
Disk
Use no more than xx disk space - unselected
Leave at least 1GB free - selected
Use no more than xx% of total - unselected
I think it was .clair. who was having similar issues. I think it was a case of them using local preferences, and with those unselected the web based preferences were used. To get around it, instead of leaving those other values unselected, they put values in there that would give BOINC more than they would ever need. Both locally & in their web account settings.
Then it stopped complaining about disk space & downloaded the extra Tasks.
Grant
Darwin NT
ID: 104864 · Rating: 0
MJH333

Joined: 29 Jan 21
Posts: 18
Credit: 5,748,861
RAC: 0
Message 104867 - Posted: 17 Feb 2022, 13:29:38 UTC - in response to Message 104850.  

It’s a lack of RAM.


There are no complaints in the event logs regarding ram.

Dougga,

I have two 4C/4T laptops running Pythons. With 8GB of RAM, they would run only 2 tasks at a time. I increased the memory on both to 16GB, and they both now run 4 Pythons.

The fact that your machine has 32GB of RAM and can run only 8 Pythons is what led me to think that RAM is the issue. But as Grant (SSSF) has pointed out, disk space can also be a problem with the Pythons.

I have to confess that I couldn’t remember what the Event Logs said about this issue when I only had 8GB of RAM. So I conducted a little experiment this morning, taking 8GB out of one of the laptops.

Doing this caused 2 of the 4 Pythons to stop running, with the message “Waiting for memory” showing in the Status section of Boinc Manager for the tasks. Aborting one and downloading another caused the Status message for the new one to read “Ready to start” instead of “Waiting for memory”. So the message for tasks which haven’t yet started because of lack of RAM seems to be simply “Ready to start”.

I also looked at the Event Log. It did not mention the fact that tasks weren’t running because of a lack of memory. I think I have the default options for the Event Log.

Changing those options by using Options>Event Log options in Boinc Manager to add in mem_usage_debug resulted in the Event Log recording that those 2 Pythons “can’t run, too big”.
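
The same flag can be set without the Manager. This sketch assumes the client reads log flags from a `cc_config.xml` file with a `cc_config` root and a `log_flags` section; the helper name is made up, and it only builds the text rather than touching the BOINC data directory.

```python
import xml.etree.ElementTree as ET


def build_cc_config(flags):
    """Return cc_config XML with each named log flag switched on.

    The layout (cc_config > log_flags > <flag>1</flag>) is an assumption.
    """
    root = ET.Element("cc_config")
    log_flags = ET.SubElement(root, "log_flags")
    for name in flags:
        ET.SubElement(log_flags, name).text = "1"
    if hasattr(ET, "indent"):  # pretty-print where available (Python 3.9+)
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


# mem_usage_debug is the option named above.
print(build_cc_config(["mem_usage_debug"]))
```

Turn the flag off again once you have your answer, as debug flags make the Event Log noisy.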

I thought I would record this, in case you or others find it helpful.

Cheers,
Mark
ID: 104867 · Rating: 0
.clair.

Joined: 2 Jan 07
Posts: 274
Credit: 26,399,595
RAC: 0
Message 104880 - Posted: 17 Feb 2022, 21:54:38 UTC - in response to Message 104854.  
Last modified: 17 Feb 2022, 22:04:41 UTC

Sid Celery
I have a weird thing going on with Disk space, over and above the understandable RAM limitations I have
Based on
Disk
Use no more than xx disk space - unselected
Leave at least 1GB free - selected
Use no more than xx% of total - unselected

I finally found the combination that works, so try this from my thread on the problem:
https://boinc.bakerlab.org/rosetta/forum_thread.php?id=14903&postid=104879
Use no more than - 500 GB . . [the total size of the disk it's on; this is so I can run 45 work units together. I have now seen BOINC using 352GB of disk and 80GB of RAM, so don't worry about setting this BIG.]
Leave at least ## GB free . . [untick this box, not needed]
Use no more than ## % of total . . [untick this box, not needed]
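
To make the interplay of the three disk boxes easier to reason about, here is a simplified model, assuming the allowance is the smallest of whichever limits are ticked. It is only a sketch and not the client's actual code. The example figures are the ones from Sid Celery's Disk tab earlier in the thread.

```python
from typing import Optional


def allowed_disk_gb(
    total_gb: float,
    free_gb: float,
    boinc_used_gb: float,
    max_used_gb: Optional[float] = None,   # "Use no more than xx GB"
    min_free_gb: Optional[float] = None,   # "Leave at least xx GB free"
    max_used_pct: Optional[float] = None,  # "Use no more than xx % of total"
) -> float:
    """Assumed model: the tightest ticked limit wins; None means unticked."""
    limits = []
    if max_used_gb is not None:
        limits.append(max_used_gb)
    if min_free_gb is not None:
        limits.append(boinc_used_gb + free_gb - min_free_gb)
    if max_used_pct is not None:
        limits.append(total_gb * max_used_pct / 100.0)
    if not limits:
        limits.append(boinc_used_gb + free_gb)
    return max(0.0, min(limits))


# Disk tab figures: 84.96 used by BOINC, 748.68 + 1.00 free, 96.27 other.
TOTAL_GB = 84.96 + 748.68 + 1.00 + 96.27
FREE_GB = 748.68 + 1.00

only_min_free = allowed_disk_gb(TOTAL_GB, FREE_GB, 84.96, min_free_gb=1.0)
with_500_cap = allowed_disk_gb(TOTAL_GB, FREE_GB, 84.96, max_used_gb=500.0)
print(f"Only 'leave 1GB free' ticked: {only_min_free:.2f} GB")
print(f"Only 'use no more than 500GB' ticked: {with_500_cap:.2f} GB")
```

On this model, leaving only the 1GB box ticked should give BOINC more room than the 500GB cap, not less. That the cap is what cured the disk messages suggests something other than these local limits was being applied.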
ID: 104880 · Rating: 0
Dougga

Joined: 27 Nov 06
Posts: 28
Credit: 5,248,050
RAC: 0
Message 104897 - Posted: 18 Feb 2022, 2:06:02 UTC
Last modified: 18 Feb 2022, 2:12:51 UTC

Well it appears at least for now the VirtualBox work units are gone and I have 24 wu running.
That's a first.

After further investigation, the computer appears to be using its cores differently.

Parked:(8) 1,3,5,7,9,11,13,15
Idle or mild use:(8) 0,2,4,6,8,10,12,14
Maxed Out:(8) 16-23

Someone mentioned Windows 11 might make better use of the cores and the new Intel CPUs have multiple types of cores. More research...
ID: 104897 · Rating: 0
Sid Celery

Joined: 11 Feb 08
Posts: 2074
Credit: 40,613,760
RAC: 5,140
Message 104904 - Posted: 18 Feb 2022, 4:28:33 UTC - in response to Message 104880.  

Sid Celery
I have a weird thing going on with Disk space, over and above the understandable RAM limitations I have
Based on
Disk
Use no more than xx disk space - unselected
Leave at least 1GB free - selected
Use no more than xx% of total - unselected

I finally found the combination that works, so try this from my thread on the problem:
https://boinc.bakerlab.org/rosetta/forum_thread.php?id=14903&postid=104879
Use no more than - 500 GB . . [the total size of the disk it's on; this is so I can run 45 work units together. I have now seen BOINC using 352GB of disk and 80GB of RAM, so don't worry about setting this BIG.]
Leave at least ## GB free . . [untick this box, not needed]
Use no more than ## % of total . . [untick this box, not needed]

I just spotted that in your other thread. Well worth a try once I get back to that PC on Sunday.
It kind of indicates a problem with Boinc working with Python tasks. I've not come across it before
Thanks for the pointer
ID: 104904 · Rating: 0
Sid Celery

Joined: 11 Feb 08
Posts: 2074
Credit: 40,613,760
RAC: 5,140
Message 105043 - Posted: 20 Feb 2022, 3:30:18 UTC - in response to Message 104904.  

Sid Celery
I have a weird thing going on with Disk space, over and above the understandable RAM limitations I have
Based on
Disk
Use no more than xx disk space - unselected
Leave at least 1GB free - selected
Use no more than xx% of total - unselected

I finally found the combination that works, so try this from my thread on the problem:
https://boinc.bakerlab.org/rosetta/forum_thread.php?id=14903&postid=104879
Use no more than - 500 GB . . [the total size of the disk it's on; this is so I can run 45 work units together. I have now seen BOINC using 352GB of disk and 80GB of RAM, so don't worry about setting this BIG.]
Leave at least ## GB free . . [untick this box, not needed]
Use no more than ## % of total . . [untick this box, not needed]

I just spotted that in your other thread. Well worth a try once I get back to that PC on Sunday.
It kind of indicates a problem with Boinc working with Python tasks. I've not come across it before
Thanks for the pointer

I've set the disk space to 500GB rather than leaving it unlimited (826GB on my system) and it worked straight off.
30 more Python tasks came down straight away, on top of the ones I already had, which were fewer than my 16 threads.
Not sure why, but you definitely hit on something.
ID: 105043 · Rating: 0
Greg_BE

Joined: 30 May 06
Posts: 5690
Credit: 5,859,226
RAC: 10
Message 105046 - Posted: 20 Feb 2022, 8:25:36 UTC

This bug (disk space errors when running 15 or so Pythons) has been submitted to the BOINC guys on GitHub.
One guy said he will dig into it.
All settings are OK; the disk space setup in BOINC was OK (leave 2GB free, no other restriction).
Drive has more than enough capacity (500GB dedicated).

Now there is a difference between my drive and yours (Sid). You have a way larger drive and you're bringing it down to my drive level of 500.
I have a 500 (465 usable after formatting) and I said keep 2 gigs free, so 463 gigs, and I had the problem of disk errors here and in SiDock.
15 SiDock or 15 Pythons both created disk space errors despite having more than enough space.
ID: 105046 · Rating: 0
Sid Celery

Joined: 11 Feb 08
Posts: 2074
Credit: 40,613,760
RAC: 5,140
Message 105055 - Posted: 20 Feb 2022, 15:16:58 UTC - in response to Message 105046.  
Last modified: 20 Feb 2022, 15:17:25 UTC

This bug (disk space errors when running 15 or so python) has been submitted to the BOINC guys on Github.
One guy said he will dig into.
All settings are ok, disk space setup in BOINC was ok. (leave 2GB free no restriction)
Drive has more than enough capacity (500GB dedicated)

Now there is a difference between my drive and yours (Sid). You have a way larger drive and you're bringing it down to my drive level of 500.
I have a 500 (465 automatic allocation when formatting) and I said keep 2 gigs free, so 463 gigs and I had the problem of disk errors here and in SiDock.
15 SiDock or 15 Pythons both created disk space errors despite having more than enough space.

Not sure if this is of any significance at all, but I can run 10 Python tasks at a time on my 8C/16T machine (32GB RAM, with BOINC allowed 90% when the computer is in use and 95% when not in use, and 500GB disk allocated).
A whole load of WCG tasks came down unexpectedly (97), so I suspended 2 Pythons, leaving 8, and the other 8 cores of my PC started running WCG tasks.
So I'm going to tweak the RAM up a touch more and add another 100GB of disk space and see what happens, if anything.
ID: 105055 · Rating: 0
Sid Celery

Joined: 11 Feb 08
Posts: 2074
Credit: 40,613,760
RAC: 5,140
Message 105079 - Posted: 20 Feb 2022, 21:04:36 UTC - in response to Message 105055.  

This bug (disk space errors when running 15 or so python) has been submitted to the BOINC guys on Github.
One guy said he will dig into.
All settings are ok, disk space setup in BOINC was ok. (leave 2GB free no restriction)
Drive has more than enough capacity (500GB dedicated)

Now there is a difference between my drive and yours (Sid). You have a way larger drive and you're bringing it down to my drive level of 500.
I have a 500 (465 automatic allocation when formatting) and I said keep 2 gigs free, so 463 gigs and I had the problem of disk errors here and in SiDock.
15 SiDock or 15 Pythons both created disk space errors despite having more than enough space.

Not sure if this is of any significance at all, but I can run 10 Python tasks at a time on my 8C/16T machine (32GB RAM, with BOINC allowed 90% when the computer is in use and 95% when not in use, and 500GB disk allocated).
A whole load of WCG tasks came down unexpectedly (97), so I suspended 2 Pythons, leaving 8, and the other 8 cores of my PC started running WCG tasks.
So I'm going to tweak the RAM up a touch more and add another 100GB of disk space and see what happens, if anything.

Increased RAM to 95% in use, 95% not in use, with 600GB disk space, and could run 10 Pythons only or 9 Pythons and 6 WCG.
Increased RAM to 97% and 97%, with disk still at 600GB, and could run 11 Pythons only or 10 Pythons and 6 WCG.
I don't think increasing the disk space made any difference tbh - just the RAM, as I stepped it up 1% at a time.
ID: 105079 · Rating: 0
.clair.

Joined: 2 Jan 07
Posts: 274
Credit: 26,399,595
RAC: 0
Message 105083 - Posted: 20 Feb 2022, 21:36:58 UTC
Last modified: 20 Feb 2022, 21:43:01 UTC

My theory . . . .
On my 16-CPU system I can run at most 11 Pythons together because it has only 32GB of memory, 100% available to BOINC.
The "properties" of a Python task say it wants 2.79GB of memory, so 2.79 x 11 = 30.69GB, plus a bit for system use, and the RAM counts as fully used.
With no other tasks running, the machine actually uses about 20-21GB of RAM; the rest is what BOINC calculates has been eaten by Pythons {greedy objects}.
But if any R4.2 tasks appear [or WCG in your case] they will use the other CPUs and the RAM that the Pythons are hogging but not actually using.
Ticking "cpu_sched_debug" in the Event Log options produces a lot of output; I won't inflict it all on the forum.
Here are the last few lines that look most relevant:
20/02/2022 17:51:32 | Rosetta@home | [cpu_sched_debug] enforce: task aagb-AIB_pp-NMPHE-ACBC13T-mACPenC12C_12_2594285_2_0 can't run, too big 2861.02MB > 1294.93MB
20/02/2022 17:51:32 | Rosetta@home | [cpu_sched_debug] enforce: task aaas-AGLY_pp-NMPHE-mVAL-mSUGA_0_2406724_2_0 can't run, too big 2861.02MB > 1294.93MB
20/02/2022 17:51:32 | Rosetta@home | [cpu_sched_debug] enforce: task aagb-NMPHE_pp-mTIQ-LARE-mB3PHG_pp_5_2674514_2_0 can't run, too big 2861.02MB > 1294.93MB
20/02/2022 17:51:32 | Rosetta@home | [cpu_sched_debug] enforce: task aaas-HPR_pp-SAR-AGLY-mSUGA_pp_12_2432415_2_0 can't run, too big 2861.02MB > 1294.93MB
20/02/2022 17:51:32 |  | [cpu_sched_debug] using 11.00 out of 15 CPUs
20/02/2022 17:51:32 |  | [cpu_sched_debug] enforce_run_list: end

It's the "can't run, too big" message.
I had seen this a while ago when the "disk space messages from server" were doing my head in.
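
The log lines above can be checked with a few lines of code. This sketch assumes the 11 CPUs in use are all running Pythons of the same size, and that the figure after the ">" is the memory headroom still left. Both are readings of the log, not documented facts.

```python
import re

# Two of the "too big" lines quoted above, plus the CPU summary line.
LOG = """\
20/02/2022 17:51:32 | Rosetta@home | [cpu_sched_debug] enforce: task aagb-AIB_pp-NMPHE-ACBC13T-mACPenC12C_12_2594285_2_0 can't run, too big 2861.02MB > 1294.93MB
20/02/2022 17:51:32 | Rosetta@home | [cpu_sched_debug] enforce: task aaas-AGLY_pp-NMPHE-mVAL-mSUGA_0_2406724_2_0 can't run, too big 2861.02MB > 1294.93MB
20/02/2022 17:51:32 |  | [cpu_sched_debug] using 11.00 out of 15 CPUs
"""

TOO_BIG = re.compile(r"task (\S+) can't run, too big ([\d.]+)MB > ([\d.]+)MB")
USING = re.compile(r"using ([\d.]+) out of (\d+) CPUs")

# (task name, MB the task wants, MB assumed to be left) for each blocked task.
blocked = [
    (name, float(need), float(room))
    for name, need, room in TOO_BIG.findall(LOG)
]
running = float(USING.search(LOG).group(1))

need_mb, room_mb = blocked[0][1], blocked[0][2]
implied_limit_mb = running * need_mb + room_mb
print(f"Blocked tasks: {len(blocked)}")
print(f"Implied memory limit: {implied_limit_mb:.2f} MB")
print(f"Pythons that fit under it: {int(implied_limit_mb // need_mb)}")
```

Under those assumptions the implied limit lands very close to 32GB, which fits the theory that the 12th Python is refused on reserved memory and not on CPU count.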
ID: 105083 · Rating: 0



©2024 University of Washington
https://www.bakerlab.org