Problems and Technical Issues with Rosetta@home

Grant (SSSF)
Joined: 28 Mar 20
Posts: 1725
Credit: 18,382,444
RAC: 19,446
Message 103554 - Posted: 27 Nov 2021, 0:28:59 UTC - in response to Message 103552.  

I'll have to read the homepage sometime and see what they say. I have so many things going on I don't pay attention to the homepage.
No news for over 12 months.
Grant
Darwin NT
ID: 103554 · Rating: 0
Jim1348
Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 103555 - Posted: 27 Nov 2021, 2:13:00 UTC - in response to Message 103553.  

Simple version is this: you can't. You can try messing with resource share, but from what I have read, that's a long term thing and may or may not affect your CPU count. Other than this there is no way.
I messed around with app_config but that can make a mess of things. The only thing I have found is to use a bunch of other projects that have core totals and use them to occupy your system.

Jim_1348 would know more about this than me.

Well there isn't really a good way at the moment. Normally you would use an app_config.xml file with a "project_max_concurrent" tag, but that produces excessive downloads due to a BOINC bug. It should be fixed eventually.
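
For anyone who wants to see what that file would look like, here is a minimal sketch, written as a short Python script that builds the XML text. The limit of 4 is only a placeholder, not a value anyone in this thread recommended, and the excessive-download bug mentioned above would still apply if you actually used it. The file would normally go in the Rosetta project folder inside the BOINC data directory.

```python
import xml.etree.ElementTree as ET


def build_app_config(max_concurrent):
    """Return app_config.xml text that caps how many tasks of one project run at once."""
    root = ET.Element("app_config")
    limit = ET.SubElement(root, "project_max_concurrent")
    limit.text = str(max_concurrent)
    return ET.tostring(root, encoding="unicode") + "\n"


if __name__ == "__main__":
    # 4 is a placeholder limit chosen for illustration only.
    print(build_app_config(4), end="")
```

After saving the output as app_config.xml, the client has to re-read its config files (or be restarted) before the limit takes effect.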

You can set the resource share of each project to get the number of work units you want. But that will take a few days to stabilize, and with the different memory requirements of the regular Rosettas and the pythons, it is something of a hit-or-miss affair. You will probably have to straighten out the mess often.

I just devote an entire machine to Rosetta. The pythons will then limit themselves when they reach the maximum memory limit, though that usually is less than the full number of cores.
Or you can set the "...use at most XX% of the processors" in BOINC manager to limit the number of cores.

I like the idea of Michael Goetz, to use separate BOINC instances. You can use one for each project. That is fairly foolproof.
https://boinc.bakerlab.org/rosetta/forum_thread.php?id=6893&postid=103516#103516

However, I have set up multiple BOINC instances before, and am familiar with it. If you have not done so, it is a bit of a hassle the first time, but easy enough afterward.
https://www.overclock.net/threads/guide-setting-up-multiple-boinc-instances.1628924/
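
As a rough sketch of what the guide boils down to, the snippet below only assembles the command line for starting a second client; it does not run anything. The data directory and port number are made-up placeholders, and the flags (--allow_multiple_clients, --dir, --gui_rpc_port) are the ones I understand the BOINC client to accept, so check them against the guide and your client version before relying on this.

```python
from typing import List


def second_instance_command(data_dir: str, rpc_port: int) -> List[str]:
    """Build the argument list for launching an additional BOINC client.

    data_dir and rpc_port are placeholders: each instance needs its own
    data directory and its own GUI RPC port so the instances do not collide.
    """
    return [
        "boinc",
        "--allow_multiple_clients",  # let more than one client run on this host
        "--dir", data_dir,           # separate data directory for this instance
        "--gui_rpc_port", str(rpc_port),  # separate port for the manager/boinccmd
    ]


if __name__ == "__main__":
    # Hypothetical directory and port, for illustration only.
    print(" ".join(second_instance_command("/var/lib/boinc-rosetta", 31418)))
```

You would then attach only Rosetta to that instance and leave the other projects on the original one.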

Or just do the regular Rosettas when they are available and save yourself some hassle, as Falconet suggests.
https://boinc.bakerlab.org/rosetta/forum_thread.php?id=6893&postid=103541#103541

Good luck.
ID: 103555 · Rating: 0
Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 103557 - Posted: 27 Nov 2021, 8:43:59 UTC - in response to Message 103554.  

I'll have to read the homepage sometime and see what they say. I have so many things going on I don't pay attention to the homepage.
No news for over 12 months.


Not surprised
ID: 103557 · Rating: 0
Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 103562 - Posted: 27 Nov 2021, 13:02:50 UTC - in response to Message 103555.  

Beginning to think I might just go back to 4.2 hit and miss if Python does not settle down.
I put resource share to 50% last night, but I am still getting a lot of Python.
It's leaving my system with unused cores, I think due to memory.
I have 24 gigs, and last night my calculation was 22 gigs just for BOINC, with the remaining amount covering FAH and system usage. Currently, with 3 Pythons, it's 22.88 gigs. 3 SiDock, Einstein and Prime make it 23.34 gigs. That leaves 657 MB left over. Nothing BOINC can run on that. And that leaves me with 7 cores not doing anything.
Not what I expected.
But this calculation conflicts with HWINFO which says only 10.6 is being used and there is 13.8 free.
Windows Task manager says only 54% maximum is being used.
So why the conflicting information between BOINC and Windows?
BOINC memory is set for 100% and Processors is 99%
ID: 103562 · Rating: 0
robertmiles
Joined: 16 Jun 08
Posts: 1234
Credit: 14,338,560
RAC: 2,014
Message 103564 - Posted: 27 Nov 2021, 14:14:15 UTC - in response to Message 103562.  
Last modified: 27 Nov 2021, 14:17:36 UTC

[snip]

But this calculation conflicts with HWINFO which says only 10.6 is being used and there is 13.8 free.
Windows Task manager says only 54% maximum is being used.
So why the conflicting information between BOINC and Windows?
BOINC memory is set for 100% and Processors is 99%

I suspect that one is including memory reserved but not actually used, and one is not.

Also, BOINC has a setting for how much of the computer's memory it can use.

The rest of the memory is then left for things like the operating system (usually either Windows or Linux).
ID: 103564 · Rating: 0
Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 103569 - Posted: 27 Nov 2021, 17:16:13 UTC - in response to Message 103564.  

[snip]

But this calculation conflicts with HWINFO which says only 10.6 is being used and there is 13.8 free.
Windows Task manager says only 54% maximum is being used.
So why the conflicting information between BOINC and Windows?
BOINC memory is set for 100% and Processors is 99%

I suspect that one is including memory reserved but not actually used, and one is not.

Also, BOINC has a setting for how much of the computer's memory it can use.

The rest of the memory is then left for things like the operating system (usually either Windows or Linux).



Ok..so if I am giving it 100% memory and it is using it all, but not using it efficiently to run other tasks to keep the other cores busy, then what DO I set it at?? Because this isn't what I want it to do. So either resource share has to go down some more for RAH to eliminate it using 2-3 cores or change the memory settings?
ID: 103569 · Rating: 0
Falconet
Joined: 9 Mar 09
Posts: 354
Credit: 1,276,393
RAC: 2,018
Message 103572 - Posted: 27 Nov 2021, 18:11:05 UTC - in response to Message 103569.  
Last modified: 27 Nov 2021, 18:13:40 UTC

The problem is that BOINC is reserving all that RAM because the Pythons say they need all that RAM. By reserve, BOINC is simply taking into account that the Pythons say they need X RAM to run and therefore will only start other tasks with the remainder of non-reserved RAM.
The actual amount of RAM used is far lower than the one that is reserved.
RAM in reserve does not equal RAM in use which is why you see BOINC saying it doesn't have enough memory to run other tasks while Windows Task Manager says you're only using 54% of available RAM.


On my Ryzen 1400 with 16 GB of RAM, I can run 2 Pythons plus 6 MCM tasks. If I tried running Einstein@home CPU tasks, I probably couldn't run 6 because BOINC is told it needs to reserve a lot of RAM for the Pythons.

16 GB of RAM means I can run 2 Pythons with BOINC reserving 7,629 MB of RAM (from the log on my laptop which can't run these tasks) for each Python. That means I have 16384 MB - 15258 MB (Reserved for the 2 Pythons) = 1126 MB of RAM available for the 6 remaining threads, barely over 1 GB. If an Einstein@home CPU app says it needs 350 MB of RAM to run, BOINC will only run 3 of those Einstein tasks while the other 3 threads remain unused because BOINC can't find enough RAM to reserve for each of those remaining threads. With the 2 Pythons plus the 3 Einstein tasks, BOINC would only find a measly 76 MB of RAM - not enough for what a single Einstein@home task asks for. But possibly enough for some other task of some other project.
While BOINC can't find more than 76 MB of RAM, it doesn't mean that the system only has 76 MB of available RAM. It could have 10 GB available for all I know.
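
The arithmetic above can be written out as a small Python sketch. It is only a simplified model of the reservation bookkeeping described in this post, not BOINC's actual scheduler logic; the function name is made up, and the figures are the ones from my example (7,629 MB reserved per Python, 350 MB assumed per Einstein task).

```python
def tasks_that_fit(total_mb, reserved_mb_each, reserved_count, other_task_mb, free_threads):
    """Return (remaining_mb, runnable_tasks, leftover_mb) under a simple reservation model.

    remaining_mb   : RAM left after the big tasks have reserved their share
    runnable_tasks : how many smaller tasks can still reserve memory,
                     capped by the number of free threads
    leftover_mb    : RAM that stays unreserved but is too small for another task
    """
    remaining_mb = total_mb - reserved_mb_each * reserved_count
    runnable_tasks = min(free_threads, remaining_mb // other_task_mb)
    leftover_mb = remaining_mb - runnable_tasks * other_task_mb
    return remaining_mb, runnable_tasks, leftover_mb


if __name__ == "__main__":
    # 16 GB host, 2 Pythons at 7629 MB each, Einstein tasks at 350 MB, 6 free threads.
    print(tasks_that_fit(16384, 7629, 2, 350, 6))  # (1126, 3, 76)
```

So 3 of the 6 remaining threads get work and 76 MB stays unreserved, even though the system itself may have far more RAM actually free.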

If it is causing too much trouble on your computer, I think you should set Rosetta to receive no new work and see if they change the amount of RAM required, which is something Admin said he would ask about. Or simply skip the Pythons and run the 4.20's whenever they are available.
ID: 103572 · Rating: 0
Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 103573 - Posted: 27 Nov 2021, 18:57:44 UTC - in response to Message 103572.  

The problem is that BOINC is reserving all that RAM because the Pythons say they need all that RAM. By reserve, BOINC is simply taking into account that the Pythons say they need X RAM to run and therefore will only start other tasks with the remainder of non-reserved RAM.
The actual amount of RAM used is far lower than the one that is reserved.
RAM in reserve does not equal RAM in use which is why you see BOINC saying it doesn't have enough memory to run other tasks while Windows Task Manager says you're only using 54% of available RAM.


On my Ryzen 1400 with 16 GB of RAM, I can run 2 Pythons plus 6 MCM tasks. If I tried running Einstein@home CPU tasks, I probably couldn't run 6 because BOINC is told it needs to reserve a lot of RAM for the Pythons.

16 GB of RAM means I can run 2 Pythons with BOINC reserving 7,629 MB of RAM (from the log on my laptop which can't run these tasks) for each Python. That means I have 16384 MB - 15258 MB (Reserved for the 2 Pythons) = 1126 MB of RAM available for the 6 remaining threads, barely over 1 GB. If an Einstein@home CPU app says it needs 350 MB of RAM to run, BOINC will only run 3 of those Einstein tasks while the other 3 threads remain unused because BOINC can't find enough RAM to reserve for each of those remaining threads. With the 2 Pythons plus the 3 Einstein tasks, BOINC would only find a measly 76 MB of RAM - not enough for what a single Einstein@home task asks for. But possibly enough for some other task of some other project.
While BOINC can't find more than 76 MB of RAM, it doesn't mean that the system only has 76 MB of available RAM. It could have 10 GB available for all I know.

If it is causing too much trouble on your computer, I think you should set Rosetta to receive no new work and see if they change the amount of RAM required, which is something Admin said he would ask about. Or simply skip the Pythons and run the 4.20's whenever they are available.



Ahh! very good explanation. Yes they should lower the RAM, if it is not going to be used, then why grab it?
Well then I am going to abandon Python for now and watch the threads or you could send me a message when you see something about lowering RAM requirements. It's killing my other projects.
ID: 103573 · Rating: 0
.clair.
Joined: 2 Jan 07
Posts: 274
Credit: 26,399,595
RAC: 0
Message 103574 - Posted: 28 Nov 2021, 1:13:28 UTC
Last modified: 28 Nov 2021, 2:11:35 UTC

Front page - Total queued jobs: 0
server status - Tasks ready to send 27 .. Tasks in progress 85407
Is this the big cleanout of RPP memory monsters?
Looks like it will be a quiet weekend for Rosetta crunching.
Let's hope the VM comes back with a much reduced memory footprint.
Till then I will pop over to cosmo and give it a quick scrub with my Vb machine
{wich haz turned into a crash test dummy of errors and aborts of late}
ID: 103574 · Rating: 0
sgaboinc
Joined: 2 Apr 14
Posts: 282
Credit: 208,966
RAC: 0
Message 103576 - Posted: 28 Nov 2021, 15:40:58 UTC - in response to Message 103229.  
Last modified: 28 Nov 2021, 15:44:10 UTC

I wish they'd use KVM/QEMU instead of Virtualbox for Linux. It's the much more efficient method of virtualization on Linux that doesn't require installing external DKMS modules since it's supported directly by the Linux kernel. That said, I don't see why we're even using virtualization when a sandboxed namespace does the job just as well. Anyway, call me when there's interest in seeking open source contributors to transition from Python to Rust.


i've been wondering if some could use things like docker. that would do away with the need for virtualization. after all, python runs natively in linux. besides docker, there are things like lxc https://linuxcontainers.org/. but i'd guess setup is an issue,
and i'd guess it isn't as 'cross platform' as virtualbox.
ID: 103576 · Rating: 0
sgaboinc
Joined: 2 Apr 14
Posts: 282
Credit: 208,966
RAC: 0
Message 103577 - Posted: 28 Nov 2021, 15:55:14 UTC
Last modified: 28 Nov 2021, 16:08:45 UTC

no more work ? really?
As of 28 Nov 2021, 12:00:15 UTC [ Scheduler running ]
Total queued jobs: 0
In progress: 66,178
Successes last 24h: 48,304
Users (last day ): 1,376,949 (+11)
Hosts (last day ): 4,479,172 (+48)
Credits last 24h : 10,490,148
Total credits : 140,755,449,638
TeraFLOPS estimate: 104.901
ID: 103577 · Rating: 0
Falconet
Joined: 9 Mar 09
Posts: 354
Credit: 1,276,393
RAC: 2,018
Message 103578 - Posted: 28 Nov 2021, 16:10:58 UTC - in response to Message 103577.  

I haven't seen COVID-19 work at Rosetta@home since last year - July or August?
Except for the odd Robetta work unit; most Robetta work nowadays goes to RoseTTAFold, so we don't crunch that.

I'm sure there will be more work soon, COVID or not.
ID: 103578 · Rating: 0
Falconet
Joined: 9 Mar 09
Posts: 354
Credit: 1,276,393
RAC: 2,018
Message 103579 - Posted: 28 Nov 2021, 16:12:46 UTC - in response to Message 103573.  

Ahh! very good explanation. Yes they should lower the RAM, if it is not going to be used, then why grab it?
Well then I am going to abandon Python for now and watch the threads or you could send me a message when you see something about lowering RAM requirements. It's killing my other projects.



I think if you subscribe to a thread you automatically receive an email notification.
ID: 103579 · Rating: 0
Jim1348
Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 103582 - Posted: 28 Nov 2021, 17:31:50 UTC - in response to Message 103577.  

no more work ? really?

I am continuing to get work, though most of it is _1.

And the new ones seem to be taking longer, or at least the estimates are.
It may just be how my machines are set up. I am doing more work units now, but they may be cache-limited.
ID: 103582 · Rating: 0
.clair.
Joined: 2 Jan 07
Posts: 274
Credit: 26,399,595
RAC: 0
Message 103583 - Posted: 28 Nov 2021, 19:33:58 UTC - in response to Message 103582.  

no more work ? really?

I am continuing to get work, though most of it is _1.
And the new ones seem to be taking longer, or at least the estimates are.
It may just be how my machines are set up. I am doing more work units now, but they may be cache-limited.

You may be getting the ones that I `aborted` . `errored` . `crashed`
wimin drivers . . . . :) Nnnnnn,,
ID: 103583 · Rating: 0
Jim1348
Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 103584 - Posted: 28 Nov 2021, 21:56:26 UTC - in response to Message 103583.  

Abort some more. They are really out of pythons now, though I did pick up a few of the regular Rosettas.
But even they seem to be out now. On some of my machines, I can make it until tomorrow. On others, I can't.
ID: 103584 · Rating: 0
Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 103585 - Posted: 28 Nov 2021, 23:19:00 UTC - in response to Message 103584.  

Abort some more. They are really out of pythons now, though I did pick up a few of the regular Rosettas.
But even they seem to be out now. On some of my machines, I can make it until tomorrow. On others, I can't.



Hurry up...3 left
ID: 103585 · Rating: 0
Jim1348
Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 103586 - Posted: 28 Nov 2021, 23:55:25 UTC - in response to Message 103585.  

Hurry up...3 left
Yes, you have to get them when you can. But I pick up a few more from time to time, so I should make it until tomorrow.
Hopefully they will throw some more in the hopper.
ID: 103586 · Rating: 0
nwayno
Joined: 28 May 20
Posts: 6
Credit: 7,006,260
RAC: 1
Message 103588 - Posted: 29 Nov 2021, 4:07:13 UTC - in response to Message 80629.  

Yes, there have been no work units for several weeks. I switched to World Community Grid. My Raspberry Pis have nothing to do, so I am powering those off.

It would certainly help if, as you said, they posted something like: "Yeah, it's broke, we're working on it." I will check in again after the first of the year as well.
ID: 103588 · Rating: 0
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1725
Credit: 18,382,444
RAC: 19,446
Message 103589 - Posted: 29 Nov 2021, 6:44:04 UTC - in response to Message 103588.  

Yes, there have been no work units for several weeks.
Not true.
There have been periods over the last 3 weeks where there have been no new Rosetta 4.20 Tasks available from the project: there were 4 days with no new work from Nov 11th, and after that it was generally 1-2 days between spurts of new Rosetta 4.20 work, along with the occasional batch of RB Tasks being sent out as well.
But it has been just the last 36 hours or so where there has been no new Python work available either, just the very occasional resend.
Grant
Darwin NT
ID: 103589 · Rating: 0



©2024 University of Washington
https://www.bakerlab.org