Problems and Technical Issues with Rosetta@home

markfw
Joined: 26 Jan 07
Posts: 5
Credit: 317,031,416
RAC: 185,123
Message 103502 - Posted: 25 Nov 2021, 0:05:41 UTC
Last modified: 25 Nov 2021, 0:08:51 UTC

I have an EPYC 7742 with 128 gig ram. When I added virtualbox (now required I guess) now I only get 16 tasks, the machine is almost idle. What needs to be done to fix this ?

Running cinnamon mint 19 (ubuntu Linux)
ID: 103502 · Rating: 0

robertmiles
Joined: 16 Jun 08
Posts: 1234
Credit: 14,338,560
RAC: 2,014
Message 103503 - Posted: 25 Nov 2021, 0:15:14 UTC - in response to Message 103502.  

I have an EPYC 7742 with 128 gig ram. When I added virtualbox (now required I guess) now I only get 16 tasks, the machine is almost idle. What needs to be done to fix this ?

Running cinnamon mint 19 (ubuntu Linux)

Each of the Python tasks reserves 7.45GB of memory. The amount it actually uses is closer to 100 MB, but that is less important.

To get more tasks, you'll have to either add a lot more memory, or wait for tasks that don't reserve so much memory.

You may have to also tell BOINC that it can use a bigger fraction of the disk space.
ID: 103503 · Rating: 0

.clair.
Joined: 2 Jan 07
Posts: 274
Credit: 26,399,595
RAC: 0
Message 103504 - Posted: 25 Nov 2021, 0:38:21 UTC - in response to Message 103502.  

I have an EPYC 7742 with 128 gig ram. When I added virtualbox (now required I guess) now I only get 16 tasks, the machine is almost idle. What needs to be done to fix this ?
Running cinnamon mint 19 (ubuntu Linux)

Ooops,
Back to the Python RAM calculator again ;) so 128 cpu/threads x 7.45 GB. Hmm, that's a cool 953.6 giblets of rambo
If you want to keep all of its 128 cpu/threads busy you will have to fit an entire TERABYTE of ram, utterly nuts . . . . .
{nice system :}
there is another AMD EPYC 7742 at `top hosts` position 106 that does have a terrible bite of RAM,
But it's still stuffed for running full on coz it's got 256 cpu/threads to play with . . . . .
That would help keep the house warm
Don't fancy the lecky bill though :}
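
The back-of-envelope sum above, as a minimal Python sketch. The 7.45 GB per-task reservation is the figure quoted in this thread; the function name is made up for illustration.

```python
# Memory each Python (VirtualBox) task reserves, as quoted in this thread.
RESERVED_GB_PER_TASK = 7.45


def ram_needed_gb(threads: int, reserved_gb: float = RESERVED_GB_PER_TASK) -> float:
    """RAM needed to run one task on every thread at the same time."""
    return threads * reserved_gb


if __name__ == "__main__":
    # 128 threads is the single EPYC 7742, 256 threads the dual-socket host.
    for threads in (128, 256):
        print(f"{threads} threads need about {ram_needed_gb(threads):.1f} GB")
```

So a single 7742 wants just under a terabyte, and the 256-thread box roughly double that.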
ID: 103504 · Rating: 0

markfw
Joined: 26 Jan 07
Posts: 5
Credit: 317,031,416
RAC: 185,123
Message 103506 - Posted: 25 Nov 2021, 2:24:42 UTC - in response to Message 103504.  

I have an EPYC 7742 with 128 gig ram. When I added virtualbox (now required I guess) now I only get 16 tasks, the machine is almost idle. What needs to be done to fix this ?
Running cinnamon mint 19 (ubuntu Linux)

Ooops,
Back to the Python RAM calculator again ;) so 128 cpu/threads x 7.45 GB. Hmm, that's a cool 953.6 giblets of rambo
If you want to keep all of its 128 cpu/threads busy you will have to fit an entire TERABYTE of ram, utterly nuts . . . . .
{nice system :}
there is another AMD EPYC 7742 at `top hosts` position 106 that does have a terrible bite of RAM,
But it's still stuffed for running full on coz it's got 256 cpu/threads to play with . . . . .
That would help keep the house warm
Don't fancy the lecky bill though :}


Well, I WAS number 29 in this project worldwide. If they are now only allowing 8 gig ram per task, I think I will quit, and run WCG. I can't have my 1100 threads of EPYC going idle. I have a dual 7601 EPYC system and for some reason, it only has 5 tasks for 256 gig ram. I have a contributor on my team with a dual 7742, so 256 threads, and 256 gig ram. He is only getting 30 tasks, and both of us are almost idle.

Whatever they did to require virtualbox was a mistake. All of Anandtech may quit this project because of this.
ID: 103506 · Rating: 0

Jim1348
Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 103507 - Posted: 25 Nov 2021, 2:46:04 UTC - in response to Message 103506.  

Whatever they did to require virtualbox was a mistake. All of Anandtech may quit this project because of this.

It may be required for the science, even though they are not using much of it now. But maybe they are planning bigger payloads later?
However, you are right in wanting to keep your machine busy. If you can find better work for it, go to it.

But I just increased the memory on my Ryzen 3950X to 128 GB, so that I could run 16 full cores on the pythons. That is full enough.
I am willing to do without the benefit of virtualization for now.
ID: 103507 · Rating: 0

robertmiles
Joined: 16 Jun 08
Posts: 1234
Credit: 14,338,560
RAC: 2,014
Message 103508 - Posted: 25 Nov 2021, 2:49:31 UTC - in response to Message 103507.  

Whatever they did to require virtualbox was a mistake. All of Anandtech may quit this project because of this.

It may be required for the science, even though they are not using much of it now. But maybe they are planning bigger payloads later?
However, you are right in wanting to keep your machine busy. If you can find better work for it, go to it.

But I just increased the memory on my Ryzen 3950X to 128 GB, so that I could run 16 full cores on the pythons. That is full enough.
I am willing to do without the benefit of virtualization for now.

That should work if the tasks are not written to require virtualization. However, I believe that they do require virtualization, at least under Windows.
ID: 103508 · Rating: 0

Jim1348
Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 103509 - Posted: 25 Nov 2021, 3:02:28 UTC - in response to Message 103508.  
Last modified: 25 Nov 2021, 3:11:53 UTC

That should work if the tasks are not written to require virtualization. However, I believe that they do require virtualization, at least under Windows.

I probably should have said that I am not using virtual cores. Virtual cores just make more efficient use of a full core by running two threads on it at once. However, each work unit on a virtual core still requires the full (8 GB) amount of memory.

I have been running 12 work units on a Ryzen 3900X for almost a week on 50% of the (virtual) cores, so each work unit effectively gets a full core.
So I am willing to use only the full cores and lose the benefits of the virtual cores when I say that I am willing to give up the benefits of virtualization, in order to save on memory.

The use of virtual cores is not the VirtualBox type of virtualization, which the Python tasks require even on Linux.
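
A rough way to see how the two limits interact: the number of Python tasks that can run is capped both by the threads BOINC is allowed to use and by how many 7.45 GB reservations fit in RAM. The sketch below assumes the client simply takes the smaller of the two. The function and parameter names are hypothetical, and real BOINC scheduling has more going on than this.

```python
import math


def runnable_tasks(threads: int, ram_gb: float,
                   cpu_fraction: float = 1.0,
                   ram_fraction: float = 1.0,
                   reserved_gb: float = 7.45) -> int:
    """Estimate concurrent tasks as the smaller of the CPU and RAM limits."""
    by_cpu = math.floor(threads * cpu_fraction)               # threads BOINC may use
    by_ram = math.floor(ram_gb * ram_fraction / reserved_gb)  # reservations that fit
    return min(by_cpu, by_ram)


if __name__ == "__main__":
    # Ryzen 3950X (16 cores / 32 threads) with 128 GB, limited to 50% of the threads.
    print(runnable_tasks(threads=32, ram_gb=128, cpu_fraction=0.5))
    # EPYC 7742 (128 threads) with 128 GB: RAM is the limit long before the CPU is.
    print(runnable_tasks(threads=128, ram_gb=128))
```

On these assumptions the 3950X is CPU-limited at 16 tasks, while the 128-thread EPYC is RAM-limited to 17 at most, which is in the same range as the 16 reported at the top of the thread.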
ID: 103509 · Rating: 0

Grant (SSSF)
Joined: 28 Mar 20
Posts: 1725
Credit: 18,382,444
RAC: 19,446
Message 103511 - Posted: 25 Nov 2021, 5:25:14 UTC - in response to Message 103507.  

Whatever they did to require virtualbox was a mistake. All of Anandtech may quit this project because of this.
It may be required for the science
No, it's not.
Python is just a language, although unlike the various forms of C & others, it's an interpreted language, not compiled.

I'm guessing they're using it because it's meant to be a lot easier to use than most other high-level languages, so it'd be good for non-programmers (eg scientists) to develop applications with. The use of VirtualBox to run it is just the easiest way of getting it to run on multiple operating systems without having to code for Python on each OS and then debug the quirks caused by the code behaving differently on each one. In theory the same code on any given Python interpreter should give exactly the same result whatever OS it's running under, but different OSes have different ways of doing things, so that's not how it actually works out. Eg the Rosetta 4.xx applications are each written for a specific OS (Android, Windows, Linux, OSX/OSX on M1 etc): the underlying procedures, algorithms etc are the same, but getting them all to work as intended requires the programme to be written for each OS specifically.

Ideally they would just produce Python applications for each OS like they do for the present Rosetta 4.20 applications, but it would still require people to manually install Python on their Windows systems (and some Linux distros don't include it as standard either) in order for them to be able to process Python tasks.
Grant
Darwin NT
ID: 103511 · Rating: 0

Jim1348
Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 103512 - Posted: 25 Nov 2021, 5:58:20 UTC - in response to Message 103511.  

That is a nice explanation, but any use of VirtualBox might be done with a native implementation, whether using python or otherwise.
So it is required for us in order to do their science, rather than in less convenient ways.
ID: 103512 · Rating: 0

Michael Goetz
Joined: 17 Jan 08
Posts: 12
Credit: 179,114
RAC: 0
Message 103516 - Posted: 25 Nov 2021, 12:26:50 UTC - in response to Message 103502.  
Last modified: 25 Nov 2021, 12:31:06 UTC

I have an EPYC 7742 with 128 gig ram. When I added virtualbox (now required I guess) now I only get 16 tasks, the machine is almost idle. What needs to be done to fix this ?

Running cinnamon mint 19 (ubuntu Linux)


There *IS* a way to make this work, but it's complicated. Two ways, actually.

As others have said, the tasks are set up such that BOINC must provide 7.5 GB of memory for each task. They don't actually need that much, and don't actually use that much. If, for example, you recompiled BOINC and turned off that restriction (which is something you CAN do), the tasks would all run just fine.

If you're not a programmer and don't want to change the BOINC software, what you can do is Google the instructions for running multiple BOINC instances on the same computer. Instead of running the normal single BOINC instance, run 8 instances. Each instance will think it's running on a computer with 128 GB. Each of the 8 instances will therefore be able to run 16 tasks. Set the memory limits in the preferences to 100%.

It's a nuisance to do. You have to change cc_config.xml to enable multiple BOINC clients. You have to explicitly start each BOINC instance on a different port number. You have to control each instance individually, again using that unique port number. Finally, if you use WUProp, you need to tell it about the port number with app_config.xml. You should probably also set each instance to use 12.5% of the CPU in case regular tasks are available. (6.25% if the CPU is hyperthreaded and has 256 threads).
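
The arithmetic behind the multi-instance workaround can be sketched as follows. This only works out the numbers (how many instances, and what CPU percentage each should get). It does not start or configure any BOINC client. The function names are hypothetical, and the 16-tasks-per-instance figure is the one reported earlier in the thread for a 128 GB machine.

```python
import math


def instances_needed(total_threads: int, tasks_per_instance: int) -> int:
    """How many BOINC instances it takes to put one task on every thread."""
    return math.ceil(total_threads / tasks_per_instance)


def cpu_percent_per_instance(tasks_per_instance: int, total_threads: int) -> float:
    """CPU share that lets one instance run exactly tasks_per_instance threads."""
    return 100.0 * tasks_per_instance / total_threads


if __name__ == "__main__":
    tasks = 16  # tasks a single instance gets with 128 GB, per this thread
    print(instances_needed(128, tasks))          # instances for a 128-thread CPU
    print(cpu_percent_per_instance(tasks, 128))  # percent of CPU per instance
    print(cpu_percent_per_instance(tasks, 256))  # same 16 tasks on 256 threads
```

For 128 threads that gives 8 instances at 12.5% each. With 256 threads, 16 tasks is 16/256 of the machine, i.e. 6.25% per instance.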

EDIT: It would, of course, just be much easier if the project lowered the memory limit. It's just a single setting in the work generator and won't affect how the tasks actually run. I'm not privy to the thought process of the admins, nor am I aware of all the facts, but I'd guess the odds are 50/50 that they lower the memory requirements to something more reasonable at some point. 500 MB seems reasonable to me, based on what I've seen.
Want to find one of the largest known primes? Try PrimeGrid. Or help cure disease at WCG.

ID: 103516 · Rating: 0

markfw
Joined: 26 Jan 07
Posts: 5
Credit: 317,031,416
RAC: 185,123
Message 103520 - Posted: 25 Nov 2021, 18:31:39 UTC - in response to Message 103516.  
Last modified: 25 Nov 2021, 18:34:30 UTC

I have an EPYC 7742 with 128 gig ram. When I added virtualbox (now required I guess) now I only get 16 tasks, the machine is almost idle. What needs to be done to fix this ?

Running cinnamon mint 19 (ubuntu Linux)


There *IS* a way to make this work, but it's complicated. Two ways, actually.

As others have said, the tasks are set up such that BOINC must provide 7.5 GB of memory for each task. They don't actually need that much, and don't actually use that much. If, for example, you recompiled BOINC and turned off that restriction (which is something you CAN do), the tasks would all run just fine.

If you're not a programmer and don't want to change the BOINC software, what you can do is Google the instructions for running multiple BOINC instances on the same computer. Instead of running the normal single BOINC instance, run 8 instances. Each instance will think it's running on a computer with 128 GB. Each of the 8 instances will therefore be able to run 16 tasks. Set the memory limits in the preferences to 100%.

It's a nuisance to do. You have to change cc_config.xml to enable multiple BOINC clients. You have to explicitly start each BOINC instance on a different port number. You have to control each instance individually, again using that unique port number. Finally, if you use WUProp, you need to tell it about the port number with app_config.xml. You should probably also set each instance to use 12.5% of the CPU in case regular tasks are available. (6.25% if the CPU is hyperthreaded and has 256 threads).

EDIT: It would, of course, just be much easier if the project lowered the memory limit. It's just a single setting in the work generator and won't affect how the tasks actually run. I'm not privy to the thought process of the admins, nor am I aware of all the facts, but I'd guess the odds are 50/50 that they lower the memory requirements to something more reasonable at some point. 500 MB seems reasonable to me, based on what I've seen.



Either is too much for me. I have had bladder cancer and lost. I am still going to the hospital weekly or more, and having surgeries bi-annually. I don't have time to mess with all that. I have 1100 cores, and 8 EPYC boxes, 3 of them are 64 core 7742's. They used to do Rosetta a lot (I am number 29 worldwide), but now all of that will be put on WCG. If Rosetta wants my computers, they need to make it easy to use, like all other BOINC projects.

Edit: as to memory, they should let BOINC do it, and also, NO virtualbox. It worked fine for years before, not now.
ID: 103520 · Rating: 0

Falconet
Joined: 9 Mar 09
Posts: 354
Credit: 1,276,393
RAC: 2,018
Message 103539 - Posted: 26 Nov 2021, 14:03:53 UTC - in response to Message 103520.  
Last modified: 26 Nov 2021, 14:49:21 UTC

Sorry to hear about your cancer.
I wish you nothing but the best.
ID: 103539 · Rating: 0

Peter Humphrey
Joined: 26 Jul 18
Posts: 5
Credit: 4,347,407
RAC: 755
Message 103540 - Posted: 26 Nov 2021, 14:47:06 UTC

How can I limit the number of simultaneous R@H tasks? I have 64GB installed, and R@H is consuming the lot, making other projects wait for memory. 10 R@H jobs are running just now, and using all the memory. I'd prefer to limit them to, say, 4, but unlike other projects I can't see a way to influence this.
ID: 103540 · Rating: 0

Falconet
Joined: 9 Mar 09
Posts: 354
Credit: 1,276,393
RAC: 2,018
Message 103541 - Posted: 26 Nov 2021, 14:52:54 UTC - in response to Message 103540.  

The easiest way would be to go to the devices list page, click on "details" and scroll down till you find "VirtualBox VM jobs". Click on skip.

This way you won't receive the Rosetta Python tasks, which are the ones that ask for over 7 GB of RAM but actually use a fraction of that (around 100 MB). However, this may mean that there are times when no Rosetta work will be sent to your device, because the standard Rosetta 4.2 application doesn't always have work available.
ID: 103541 · Rating: 0

Jim1348
Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 103542 - Posted: 26 Nov 2021, 15:11:52 UTC - in response to Message 103541.  

The easiest way would be to go to the devices list page, click on "details" and scroll down till you find "VirtualBox VM jobs". Click on skip.

Good find. I wish they would do it the other way also, and allow us to skip the regular Rosettas.
But this is a start.
ID: 103542 · Rating: 0

Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 103547 - Posted: 26 Nov 2021, 19:12:24 UTC - in response to Message 103370.  

I pointed that out about a year ago when MIP on WCG (which uses Rosetta) went in-house. They didn't need the crunchers any more.


I don't think so.
In-house HPC needs a few things:
- a lot of high-performance hardware
- a large, well-prepared IT team
- simulations that are as homogeneous as possible

Rosetta@Home does not have these. WCG (when it was IBM) did.
When IPD/BakerLab needs a lot of computing power for work that cannot be split up on BOINC, they always use external sources: AWS, Azure, TACC, etc.

But maybe I'm wrong.....


TACC has their own supercomputer (I was trying that project for a time) to run protein folding. They toss scraps off to BOINC.
Not worth wasting your time there.
I don't know about the rest.

RAH has a neural network AI now that takes care of the majority of their work. 2 million tasks to process yet we get a little something here and there? Now they have python, but how long that will last is a good question.
I once went through the Robetta page and saw very little assigned to BOINC.
ID: 103547 · Rating: 0

Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 103548 - Posted: 26 Nov 2021, 19:25:18 UTC - in response to Message 103520.  

Wishing you all the best with your treatments and your life.
Put your systems where you think they will work best.
If you want COVID related work, SiDock and QuChem are both out there.
If you're focusing on cancer, then stay with WCG and the Mapping Cancer Markers and Childhood Cancer projects.
My mother-in-law died of a rare, untreatable form of abdominal cancer, so that is why I got started here; I thought they were looking at cancer stuff.
Later on I discovered WCG and joined up with their cancer projects.

I don't know what all FAH has, but it's random stuff for a wide variety of science from what I can tell.
ID: 103548 · Rating: 0

Jim1348
Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 103551 - Posted: 26 Nov 2021, 20:45:01 UTC - in response to Message 103548.  

SiDock certainly does COVID. That is their only work at the moment, though they will move to other stuff later.

As for QuChemPedia, I don't think they do that sort of thing. But it is a good project otherwise.
https://quchempedia.univ-angers.fr/athome/forum_thread.php?id=78#709

FAH has been very BIG on COVID. I am sure that if the new Omicron variant needs them, they will be there.
https://foldingathome.org/2021/09/27/covid-moonshot-wellcome-trust-funding/?lng=en

WCG/OPNG is always short of work; they have too many crunchers. Their CPU work would be done faster on a GPU anyway.

Rosetta has done its share too, but where they are at the moment is a big question. They don't tell us.
ID: 103551 · Rating: 0

Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 103552 - Posted: 27 Nov 2021, 0:14:41 UTC - in response to Message 103551.  

SiDock certainly does COVID. That is their only work at the moment, though they will move to other stuff later.

As for QuChemPedia, I don't think they do that sort of thing. But it is a good project otherwise.
https://quchempedia.univ-angers.fr/athome/forum_thread.php?id=78#709

FAH has been very BIG on COVID. I am sure that if the new Omicron variant needs them, they will be there.
https://foldingathome.org/2021/09/27/covid-moonshot-wellcome-trust-funding/?lng=en

WCG/OPNG is always short of work; they have too many crunchers. Their CPU work would be done faster on a GPU anyway.

Rosetta has done its share too, but where they are at the moment is a big question. They don't tell us.



WCG has plenty of CPU work. I have 6 MCM running and another 85 in queue. They are fast processing. About 90 minutes. OpenPandemics is running COVID. It runs about 2 hours on CPU. I just picked up on QuChem because it was Vbox and it looked interesting. I'm sure their research will help something related to COVID eventually or something else health related. So many things to study and model.

RAH doesn't talk about anything anymore short of the news bites on the homepage.
Dr. B used to write in his journal here about every 2 weeks or so, or whenever they discovered something interesting. Now it's nothing. It's a shame. I'll have to read the homepage sometime and see what they say. I have so many things going on I don't pay attention to the homepage.
ID: 103552 · Rating: 0

Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 103553 - Posted: 27 Nov 2021, 0:18:19 UTC - in response to Message 103540.  

How can I limit the number of simultaneous R@H tasks? I have 64GB installed, and R@H is consuming the lot, making other projects wait for memory. 10 R@H jobs are running just now, and using all the memory. I'd prefer to limit them to, say, 4, but unlike other projects I can't see a way to influence this.



Simple version is this: you can't. You can try messing with resource share, but from what I have read, that's a long term thing and may or may not affect your CPU count. Other than this there is no way.
I messed around with app_config but that can make a mess of things. The only thing I have found is to use a bunch of other projects that have core totals and use them to occupy your system.

Jim_1348 would know more about this than me.
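
For what it's worth, the BOINC client documentation describes a project_max_concurrent element in app_config.xml that caps how many of a project's tasks run at once. The sketch below is a small Python helper that writes such a file with a limit of 4. The helper name and the output filename are illustrative, the file is assumed to belong in the Rosetta project folder inside the BOINC data directory, and whether the cap behaves well with the VirtualBox Python tasks has not been tested here.

```python
import xml.etree.ElementTree as ET


def build_app_config(max_tasks: int) -> str:
    """Return app_config.xml text limiting the project to max_tasks running tasks."""
    root = ET.Element("app_config")
    limit = ET.SubElement(root, "project_max_concurrent")
    limit.text = str(max_tasks)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    xml_text = build_app_config(4)
    # Written to the current directory here; copying it into the project
    # folder of the BOINC data directory is left as a manual step.
    with open("app_config.xml", "w", encoding="utf-8") as handle:
        handle.write(xml_text + "\n")
    print(xml_text)
```

The client should pick the file up after a restart or after being told to re-read its config files. Deleting the file and re-reading again would undo the limit if it does make a mess of things.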
ID: 103553 · Rating: 0



©2024 University of Washington
https://www.bakerlab.org