Problems and Technical Issues with Rosetta@home

Jim1348
Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 98478 - Posted: 11 Aug 2020, 0:29:12 UTC - in response to Message 98477.  
Last modified: 11 Aug 2020, 1:07:29 UTC

Folding is planning to offer a BOINC version. Probably not until after they finish their COVID-19 work, though.

I think "planning" is a bit ahead of where they are at the moment. I have been in those discussions for some time.
They will probably look into it at some point. It could be done, but it takes some work.

PS - By the way, they are beta testing a new CPU core, a8. It is said to offer a 40 to 50% increase in output and to allow more advanced science. It is based on the latest GROMACS 2020.
I don't know much beyond that, but want to check it out for a while. (I don't do the betas, but you can look on their forums at the beta section if you have a login).
If you set the "advanced" flag in their Control app, you can get the first ones right after the beta finishes.

Corgi
Joined: 19 Jun 19
Posts: 5
Credit: 2,453,241
RAC: 2,154
Message 98479 - Posted: 11 Aug 2020, 4:52:33 UTC - in response to Message 98467.  

Quite a wonderful bunch of replies! I'm having a brain hiccup, though:
How many cores do you have?
...remind me where I look to answer this, please?

Computer's always on; I remember to take BOINC off pause about half the time before I go to bed [/sheepish].

I might try that GPU-slot idea mentioned as well.

Brian Nixon
Joined: 12 Apr 20
Posts: 293
Credit: 8,432,366
RAC: 0
Message 98481 - Posted: 11 Aug 2020, 8:22:38 UTC - in response to Message 98479.  

How many cores do you have?
...remind me where I look to answer this, please?
Computer details: Number of processors: 4

When running multiple BOINC projects, you might need to review the Resource share setting for each one, and the associated client setting Switch between tasks, to make sure Rosetta tasks have a sufficient chance to run. (I only run Rosetta, so others can advise better here.)
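
As a rough illustration of what Resource share means: BOINC treats each project's share as a weight, so over the long run a project should get roughly its own share divided by the sum of the shares of all attached projects. The project names and share values below are made up for the example, and short-term scheduling (deadlines, work availability) can deviate from these figures.

```python
def share_fractions(shares):
    """Return each project's approximate long-run fraction of computing.

    shares: mapping of project name -> Resource share (hypothetical values).
    """
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("at least one project needs a positive resource share")
    return {name: value / total for name, value in shares.items()}


# Hypothetical setup: Rosetta raised to 300, two other projects left at 100.
example = share_fractions({"Rosetta@home": 300, "Project B": 100, "Project C": 100})
for name, fraction in example.items():
    print(f"{name}: {fraction:.0%}")
```

With these made-up numbers Rosetta would get about 60% of the machine and the other two about 20% each.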

Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 12,116,986
RAC: 6,010
Message 98485 - Posted: 11 Aug 2020, 17:13:04 UTC - in response to Message 98479.  

Computer's always on; I remember to take BOINC off pause about half the time before I go to bed [/sheepish].


Do what I do: tell Boinc which programs need it to pause while they run. Then it pauses by itself and always remembers to turn back on.

If you're using Boinc Manager, it's in "Options", "Exclusive Applications". Then just add the path to the program in either the top box for stopping Boinc altogether (ignore what it says, it's wrong, it stops your GPU as well), or the bottom box to only stop the GPU and leave the CPU running.

If you're using Boinctasks, it's in "Extra", "Boinc Preference", "Exclusive Applications", making sure you have the correct computer selected. Then add the path to the program and tick "GPU only" if applicable, otherwise it pauses everything.
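
For anyone who prefers editing the configuration by hand, the same list can, as far as I know, be kept in the client's cc_config.xml using the exclusive_app and exclusive_gpu_app options. The sketch below only builds the XML text; the executable names are placeholders, and it assumes you will merge the result into any existing cc_config.xml in the BOINC data directory rather than overwrite it.

```python
import xml.etree.ElementTree as ET


def build_cc_config(exclusive_apps, exclusive_gpu_apps):
    """Build a cc_config.xml tree listing exclusive applications.

    exclusive_apps: executables that should suspend all computing.
    exclusive_gpu_apps: executables that should suspend GPU computing only.
    """
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    for exe in exclusive_apps:
        ET.SubElement(options, "exclusive_app").text = exe
    for exe in exclusive_gpu_apps:
        ET.SubElement(options, "exclusive_gpu_app").text = exe
    return root


# Placeholder executable names, not taken from this thread.
root = build_cc_config(["video_editor.exe"], ["some_game.exe"])
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

The client has to re-read its config files (or be restarted) before a hand-edited file takes effect.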

10esseetony
Joined: 24 Dec 11
Posts: 5
Credit: 23,602,985
RAC: 0
Message 98791 - Posted: 7 Sep 2020, 19:26:06 UTC
Last modified: 7 Sep 2020, 19:26:38 UTC

Problems and Technical Issues, eh? How about 41GB of RAM for ONE task? Name: ygG5REMC******1009391_1307_0


Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 12,116,986
RAC: 6,010
Message 98793 - Posted: 7 Sep 2020, 19:35:23 UTC - in response to Message 98791.  
Last modified: 7 Sep 2020, 19:36:17 UTC

Problems and Technical Issues, eh? How about 41GB of RAM for ONE task? Name: ygG5REMC******1009391_1307_0



Time to upgrade your computer ;-)
Two of mine have 36GB RAM, but they'll take 128GB :-)

By the way, your image requires an Anandtech login.

10esseetony
Joined: 24 Dec 11
Posts: 5
Credit: 23,602,985
RAC: 0
Message 98811 - Posted: 7 Sep 2020, 21:39:46 UTC - in response to Message 98793.  
Last modified: 7 Sep 2020, 21:41:40 UTC

Thanks for letting me know the image can't be accessed, but clearly you are BOINC'ing around with the wrong team, if you don't have an AnandTech forum account. :P The pic is just a screenshot of the properties of the offending task. As for the RAM upgrade, I am already maxed out at 512GB. I am not quite ready to buy 8 sticks of 128GB just yet....not sure the MB can support it anyway. ( I am teasing....X570 MB with 64GB of RAM, Ryzen 3950X. So you can imagine my surprise at seeing tasks "waiting for memory," especially since I am only letting 8 Rosetta tasks run for the moment )

gbaker
Joined: 10 Jul 09
Posts: 1
Credit: 60,020,721
RAC: 1,677
Message 98812 - Posted: 7 Sep 2020, 21:47:49 UTC - in response to Message 98791.  

I'm having a similar issue. It seems like one or two tasks keep using as much memory as they can. A task typically starts with normal behavior, then suddenly its memory usage starts climbing fast, at a linear rate of about 10GB per minute. If no memory limit is set for BOINC, memory usage increases until it hits the 32GB limit in my system (and quickly burns through swap). Then memory usage crashes back down to normal (about 12GB). If I set a memory limit, BOINC generally just suspends the task as "waiting for memory" (which leaves a core and a chunk of memory unavailable to other BOINC tasks).
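
To put rough numbers on that cycle: starting from about 12GB in use and growing linearly at about 10GB per minute, the 32GB of physical memory is gone in about two minutes. The small sketch below is just that arithmetic; it ignores swap, and the function name is made up for illustration.

```python
def minutes_until_full(baseline_gb, limit_gb, growth_gb_per_min):
    """Minutes until memory use reaches the limit, assuming linear growth."""
    if growth_gb_per_min <= 0:
        raise ValueError("growth rate must be positive")
    headroom_gb = max(limit_gb - baseline_gb, 0)
    return headroom_gb / growth_gb_per_min


# Figures from the description above: ~12GB normal use, 32GB RAM, ~10GB/min.
print(minutes_until_full(12, 32, 10))  # 2.0
```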

Sometimes it will behave normally for a few minutes, and sometimes it immediately starts another cycle of fast increase in usage, then immediately falls back to normal once it fills memory.

In my case, the problem workunits that have popped up while writing this are some variation on q1RftdTf_fold_and_dock. (This wasn't the only problem workunit, just the one that happened to be causing problems at this moment.)
Here is a link to one of the tasks that seemed to be causing the problem (which I aborted):
https://boinc.bakerlab.org/rosetta/result.php?resultid=1255773464

I'm currently running Rosetta on two computers, but only one of them seems to be running into this issue (at least at the current moment):
https://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=5287905

The other computer seems to be running a non-overlapping set of tasks, so I don't know whether or not it would experience the problem given the same tasks.

This isn't really a major issue for me, but I'm assuming it's not intended behavior.

Grant (SSSF)
Joined: 28 Mar 20
Posts: 1729
Credit: 18,491,225
RAC: 20,847
Message 98817 - Posted: 7 Sep 2020, 22:23:39 UTC - in response to Message 98791.  

Problems and Technical Issues, eh? How about 41GB of RAM for ONE task? Name: ygG5REMC******1009391_1307_0

Looks like another batch of dodgy Work Units.

BTW- I'd suggest reducing the size of your cache, to 0.
You only need to cache Tasks if there's a chance of running out of work before you next get new work. Being signed up to a dozen active projects, that will never occur, so no need for a cache.
So no more missed deadlines- no point processing work if you're not going to get Credit for it.

Your account, Computing preferences, Other:
Store at least 0.01 days of work
Store up to an additional 0.01 days of work
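
The same two values can, as far as I know, also be set per machine in a global_prefs_override.xml file in the BOINC data directory, where they are called work_buf_min_days and work_buf_additional_days. The sketch below only generates the XML text with the 0.01 values suggested above; whether you want a local override at all, rather than the website preferences, is an assumption on my part.

```python
import xml.etree.ElementTree as ET


def build_prefs_override(min_days, additional_days):
    """Return global_prefs_override.xml text with the two work-buffer settings."""
    root = ET.Element("global_preferences")
    ET.SubElement(root, "work_buf_min_days").text = str(min_days)
    ET.SubElement(root, "work_buf_additional_days").text = str(additional_days)
    return ET.tostring(root, encoding="unicode")


# "Store at least 0.01 days" and "Store up to an additional 0.01 days".
override_xml = build_prefs_override(0.01, 0.01)
print(override_xml)
```

A local override takes precedence over the website preferences on that machine only, so other hosts on the account keep their own cache size.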

Grant
Darwin NT

Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 12,116,986
RAC: 6,010
Message 98820 - Posted: 7 Sep 2020, 22:40:02 UTC - in response to Message 98811.  

Thanks for letting me know the image can't be accessed, but clearly you are BOINC'ing around with the wrong team, if you don't have an AnandTech forum account. :P The pic is just a screenshot of the properties of the offending task. As for the RAM upgrade, I am already maxed out at 512GB. I am not quite ready to buy 8 sticks of 128GB just yet....not sure the MB can support it anyway. ( I am teasing....X570 MB with 64GB of RAM, Ryzen 3950X. So you can imagine my surprise at seeing tasks "waiting for memory," especially since I am only letting 8 Rosetta tasks run for the moment )


Even if I had an account, I doubt I'd be asked to log in for an inline image. Or does it stay logged in forever on your browser?

Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 12,116,986
RAC: 6,010
Message 98822 - Posted: 7 Sep 2020, 22:41:49 UTC - in response to Message 98817.  

Problems and Technical Issues, eh? How about 41GB of RAM for ONE task? Name: ygG5REMC******1009391_1307_0

Looks like another batch of dodgy Work Units.

BTW- I'd suggest reducing the size of your cache, to 0.
You only need to cache Tasks if there's a chance of running out of work before you next get new work. Being signed up to a dozen active projects, that will never occur, so no need for a cache.
So no more missed deadlines- no point processing work if you're not going to get Credit for it.

Your account, Computing preferences, Other:
Store at least 0.01 days of work
Store up to an additional 0.01 days of work


Unless you run Milkyway on a GPU. Those have tasks that can take 30 seconds. And they refuse to fix the server (I've asked two successive project leaders and nothing gets fixed) - you cannot download new tasks if you're reporting completed tasks, so you need a big buffer (well 3 hours anyway).

robertmiles
Joined: 16 Jun 08
Posts: 1234
Credit: 14,338,560
RAC: 1,227
Message 98827 - Posted: 7 Sep 2020, 22:51:35 UTC - in response to Message 98820.  

Even if I had an account, I doubt I'd be asked to log in for an inline image. Or does it stay logged in forever on your browser?


My Firefox browser will save logon information indefinitely as long as you use it every so often.

Grant (SSSF)
Joined: 28 Mar 20
Posts: 1729
Credit: 18,491,225
RAC: 20,847
Message 98829 - Posted: 7 Sep 2020, 22:58:30 UTC - in response to Message 98822.  

Unless you run Milkyway on a GPU. Those have tasks that can take 30 seconds. And they refuse to fix the server (I've asked two successive project leaders and nothing gets fixed) - you cannot download new tasks if you're reporting completed tasks, so you need a big buffer (well 3 hours anyway).
If it were your only project, yes. If you're running more than one project, it's still not necessary even if one of the projects has issues with work allocation. Your other project will pick up work, and then BOINC will do extra for the first project when it can get work to balance out the debt between projects.
Grant
Darwin NT

Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 12,116,986
RAC: 6,010
Message 98835 - Posted: 7 Sep 2020, 23:49:30 UTC - in response to Message 98827.  

Even if I had an account, I doubt I'd be asked to log in for an inline image. Or does it stay logged in forever on your browser?


My Firefox browser will save logon information indefinitely as long as you use it every so often.


Yes, my Opera browser does that too, but all it does is fill in the password when you're asked for it. I don't think an inline image would pass the correct request through.

Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 12,116,986
RAC: 6,010
Message 98836 - Posted: 7 Sep 2020, 23:52:13 UTC - in response to Message 98829.  
Last modified: 7 Sep 2020, 23:52:40 UTC

Unless you run Milkyway on a GPU. Those have tasks that can take 30 seconds. And they refuse to fix the server (I've asked two successive project leaders and nothing gets fixed) - you cannot download new tasks if you're reporting completed tasks, so you need a big buffer (well 3 hours anyway).
If it were your only project, yes. If you're running more than one project, it's still not necessary even if one of the projects has issues with work allocation. Your other project will pick up work, and then BOINC will do extra for the first project when it can get work to balance out the debt between projects.


I run more than Milkyway and I need the buffer. Otherwise Boinc only ever asks MW for a couple of 30 second tasks, as that's all it needs to fill the buffer. Then it hits the problem of not getting any more until it's backed off for 10 minutes. So even if I've said half Einstein, half MW, it ends up only managing to run MW a tenth of the time.
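
A back-of-the-envelope check of that "tenth of the time" figure, under a deliberately simplified model: the client fetches a batch of tasks, crunches them, and then has to sit out the full backoff before it can fetch again. The batch sizes are assumptions for illustration; only the 30 second task length and the 10 minute backoff come from the posts here.

```python
def duty_cycle(tasks_per_fetch, task_seconds, backoff_seconds):
    """Fraction of time spent crunching when every fetch is followed by a backoff."""
    busy = tasks_per_fetch * task_seconds
    return busy / (busy + backoff_seconds)


# Small buffer: a couple of 30 second tasks, then a 10 minute wait.
print(f"{duty_cycle(2, 30, 600):.0%}")    # 9%

# Roughly 3 hours of buffered 30 second tasks (360 tasks) per fetch.
print(f"{duty_cycle(360, 30, 600):.0%}")  # 95%
```

In this model the backoff is a fixed cost per fetch, so the more work each fetch brings in, the less it matters.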

10esseetony
Joined: 24 Dec 11
Posts: 5
Credit: 23,602,985
RAC: 0
Message 98838 - Posted: 8 Sep 2020, 0:11:06 UTC - in response to Message 98820.  
Last modified: 8 Sep 2020, 0:18:59 UTC

You are correct, no, one shouldn't have to log in to see the image, now that you mention it. I'll just link to the thread, but beware, have your adblocker turned on: https://forums.anandtech.com/threads/recent-changes-in-projects.2500471/post-40275238

My tasks that timed out were not due to an inability to complete them; I had simply forgotten that I had 'temporarily' suspended Rosetta on that machine. ///insert forehead slap emoji here///

I would caution against having zero cache as you suggest....I pay too much for my energy bill to have my machines idle for ANY length of time (internet outage/server outage/server upgrade/home router locked up/etc etc). Rosetta has run dry many times and I do not check my machines but once daily.

Grant (SSSF)
Joined: 28 Mar 20
Posts: 1729
Credit: 18,491,225
RAC: 20,847
Message 98839 - Posted: 8 Sep 2020, 0:23:21 UTC - in response to Message 98836.  

Otherwise Boinc only ever asks MW for a couple of 30 second tasks, as that's all it needs to fill the buffer. Then it hits the problem of not getting any more until it's backed off for 10 minutes. So even if I've said half Einstein, half MW, it ends up only managing to run MW a tenth of the time.
Looks like it's been an issue forever.
J Stateson built a BOINC client to work around Milkyway's stuffed up server configuration.

Finally getting new tasks only seconds after running out. May not be worth the hassle.
Grant
Darwin NT

mikey
Joined: 5 Jan 06
Posts: 1895
Credit: 9,216,696
RAC: 1,087
Message 98840 - Posted: 8 Sep 2020, 0:29:32 UTC - in response to Message 98836.  

Peter Hucker wrote:
I run more than Milkyway and I need the buffer. Otherwise Boinc only ever asks MW for a couple of 30 second tasks, as that's all it needs to fill the buffer. Then it hits the problem of not getting any more until it's backed off for 10 minutes. So even if I've said half Einstein, half MW, it ends up only managing to run MW a tenth of the time.


MilkyWay needs us to run other projects' tasks that take more than 10 minutes, because that's the backoff the project requires: NO communication with MW for 10 minutes before it will send new GPU tasks. Personally I use PrimeGrid, as they have short tasks and respect the zero resource share. I run 1, maybe 2, PG tasks, and then MW refills the cache and I am off and crunching them again. If the GPU is not the fastest, then Collatz will work as a zero resource share project too.

If you want to go outside the norm, a user at MilkyWay made an alternative Boinc Manager that handles the 10 minute backoff so that it's not a problem. I don't know how, but people that use it say it works.

Grant (SSSF)
Joined: 28 Mar 20
Posts: 1729
Credit: 18,491,225
RAC: 20,847
Message 98841 - Posted: 8 Sep 2020, 0:31:11 UTC - in response to Message 98838.  

I pay too much for my energy bill to have my machines idle for ANY length of time
?
If they are idle, the power they'd be using (unless they're really, really old systems) would be bugger all.



Rosetta has run dry many times and I do not check my machines but once daily.
Rosetta might have run out, but you are also doing work for over a dozen other projects. I can't see all those projects running out of work at the same time- so you'll do a bit more work for those projects, then a bit extra for Rosetta when it has work again.
Hence no need for a cache, let alone one more than a few hours or so.

If you have crappy internet, what's the longest usual outage? Set the cache for that. Even so, with the short deadlines with Rosetta, anything larger than a couple of days when running that many projects will result in some missed deadlines as the systems work out how to meet their Resource share settings.
Grant
Darwin NT

hangint3n
Joined: 23 Mar 20
Posts: 8
Credit: 1,958,078
RAC: 0
Message 98845 - Posted: 8 Sep 2020, 0:55:15 UTC - in response to Message 98812.  

Just had a similar problem on my box. It froze the whole thing up.

===
hangint3n

©2024 University of Washington
https://www.bakerlab.org