Message boards : News : Rosetta's role in fighting coronavirus
Author | Message |
---|---|
Laurent Send message Joined: 15 Mar 20 Posts: 14 Credit: 88,800 RAC: 0 |
As advice from an internet stranger: pick the most commonly us... I typed a lengthy post, hit "submit" and got a server error. I checked if the post made it and saw nothing, so I repeated that about 10 times. Still got errors, lost interest, aborted. You see the end result: 5 partial posts (if an admin could please trim the mess down to one post...?). The short version of the end: pick the most commonly used class combinations and build a monolithic GPU implementation (like the one used for most rosetta-boinc WUs). Do not care too much about making modules, but do care a lot about data layout on the GPU. Modules and the like come later, but getting to a data layout that is actually efficient (low number of copies between GPU and host) is hard. Making the data layout modular in the first go will bite you hard later. I also wrote something about not aiming for MPI and multiple GPUs in the first go. I have seen a few do-it-all port attempts that failed hard because taking a rare corner case into consideration made the data layout on the GPU really, really awful. KISS (keep it simple and stupid) is king in GPU land. Adding multiple GPUs / nodes later and using them in a task-parallel way is very often easier and more efficient (in coding time and runtime). BTW, and very off-topic: I saw your post about tn-grid and GPU. Thanks for the tip, that one looks doable in 2 weeks. I contacted valterc (picked the last admin posting anything GPU related), hoping to get a bit of feedback from the attempt made by Daniel. |
Sid Celery Send message Joined: 11 Feb 08 Posts: 2087 Credit: 40,639,753 RAC: 4,834 |
The self-serving arrogance of this. Lol - I can see the gallery's applause from here. Can I suggest something? Can you increase your runtimes back to the default 8hrs rather than the 4hrs <you've edited> it to on your PCs? Because the project, which is benefitting from a fantastic surge of new users atm, has <run out> of tasks right now, and if you hadn't run your tasks for <half> the time you wouldn't have grabbed <double> the number of tasks for yourself to the exclusion of <other people>, and they'd be able to run <more> tasks for the project and for <longer>, and maybe we wouldn't have run out quite so <early>. But look how many tasks <you've> got. You are not "rosetta". And again, lol. Spare me. |
[VENETO] boboviz Send message Joined: 1 Dec 05 Posts: 1982 Credit: 9,230,406 RAC: 4,461 |
BTW, and very off-topic: I saw your post about tn-grid and GPU. Thanks for the tip, that one looks doable in 2 weeks. I contacted valterc (picked the last admin posting anything GPU related), hoping to get a bit of feedback from the attempt made by Daniel. Great!! P.S. If you have a problem contacting Valterc, send me a PM. I know him personally. |
pembo Send message Joined: 11 Oct 16 Posts: 2 Credit: 516,933 RAC: 0 |
Awesome, thank you! |
Xech Send message Joined: 19 Mar 20 Posts: 2 Credit: 2,118,278 RAC: 0 |
Big thanks for the reply talking about whether this is helping. Would love to see further publications on the progress being made with the influx of users, even though it may be lackluster or unintelligible to the layman. I also get that doing the work itself is of paramount importance, and that takes priority over belly rubs. Even weekly drops of what's on the docket would, I think, help keep existing users excited and help fuel new users as well. Thank you again! |
bcov Volunteer moderator Project developer Project scientist Send message Joined: 8 Nov 16 Posts: 12 Credit: 11,348 RAC: 0 |
has <run out> of tasks right now As far as I can tell, it hasn't run out. I've been submitting to the R@H queue and apparently my jobs need 2 GB of RAM (sorry, 1 GB users...). But with that in mind, I see 1.4M jobs lined up and ready to go. Also, on the topic of job length, here's the tricky scenario I'm in. Let's say, for instance, I have 1M proteins that need to be designed. My goal is to do this as fast as possible. But how best to do this? To avoid wasting your CPU cycles, I need to make sure that all jobs have enough proteins to go for 8 hours. I submit this way knowing full well that not all of the proteins will be designed. Some people design for fewer hours, and some people just have slow computers (which are both fine, btw). There's also another case where someone is running for 8 hours, but the job gets interrupted a bunch of times and doesn't finish for 2 weeks. With this in mind, I wait about 2 days after all the jobs have started, at which point I see what has finished and resubmit anything that hasn't come back yet. This process has to repeat a few times, with each submission getting closer and closer to 100% completion, but I typically give up at around 80%-90% and call it a day. (I don't necessarily need all the outputs, I just need a lot.) (Also, when I resubmit, I don't give up on jobs currently running; I just give someone else the same job too.) So with that in mind, yes, 8 hours is better than 4 hours because our server is under a lot of load at the moment. (More users, and these design jobs are roughly 100X worse on the server.) But more important than the number of hours is the turnaround time, so if your 8-hour jobs are taking a week to finish, there's nothing wrong with going 4 hours. |
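The submit, wait, resubmit cycle described in the post above can be sketched as a toy simulation. This is only an illustration under stated assumptions, not the project's actual submission tooling: `run_campaign`, `return_prob` (the chance that some copy of a still-missing job comes back during one wait window) and the other parameter names are hypothetical, and copies left running from earlier rounds are folded into that single probability rather than tracked individually.

```python
import random


def run_campaign(n_jobs, return_prob, target_fraction, max_rounds, seed=0):
    """Simulate rounds of: submit, wait one window, resubmit what is missing.

    return_prob is an ASSUMED per-round chance that a missing job comes back;
    the real value depends on volunteers' machines and settings.
    Returns a list of (round_no, jobs_submitted_this_round, fraction_done).
    """
    rng = random.Random(seed)
    finished = set()
    history = []
    for round_no in range(1, max_rounds + 1):
        # Resubmit every job that has not come back yet. Copies still running
        # from earlier rounds are not cancelled; a duplicate is simply issued.
        pending = [job for job in range(n_jobs) if job not in finished]
        for job in pending:
            if rng.random() < return_prob:
                finished.add(job)
        fraction_done = len(finished) / n_jobs
        history.append((round_no, len(pending), fraction_done))
        # Stop once "enough" results are in (the post mentions 80%-90%).
        if fraction_done >= target_fraction:
            break
    return history


if __name__ == "__main__":
    for round_no, submitted, done in run_campaign(1000, 0.6, 0.85, 10):
        print(f"round {round_no}: submitted {submitted}, {done:.1%} complete")
```

Each round submits fewer jobs than the last, which is why the completion rate creeps toward 100% and why stopping at a target fraction saves several extra rounds of waiting.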
RandyF Send message Joined: 2 Nov 14 Posts: 6 Credit: 7,744,262 RAC: 0 |
First of all, thank you for what you all do. I am currently running Rosetta on a laptop, a desktop, and trying to get work units for my 3 Android devices. Unfortunately, no matter how many times I reset or update, I haven't received one work unit on any Android device. Any advice? Thank you. |
yoerik Send message Joined: 24 Mar 20 Posts: 128 Credit: 169,525 RAC: 0 |
First of all, thank you for what you all do. I am currently running Rosetta on a laptop, a desktop, and trying to get work units for my 3 Android devices. Unfortunately, no matter how many times I reset or update, I haven't received one work unit on any Android device. Any advice? Thank you. They haven't released units for Android for a while. In the meantime, please run projects that actively use Android. Universe@Home, World Community Grid and Asteroids@Home all have Android units available right now. The second Rosetta sends out Android units again (if ever), you'll know. |
Fritzhuber Send message Joined: 19 Jan 16 Posts: 3 Credit: 176,501 RAC: 0 |
As of today there are some Android WUs at the Ralph@Home project (the official alpha test project for Rosetta@home). If those run stably, there might be new Android WUs for Rosetta as well. Maybe you can help test those WUs at Ralph@Home. |
yoerik Send message Joined: 24 Mar 20 Posts: 128 Credit: 169,525 RAC: 0 |
has <run out> of tasks right now Thanks for this. Have to ask as a noob, though - what is the max ideal turnaround time for a slow PC? The laptops I'm running WUs on can take 10-12 hours per WU - and I can abort the Rosetta queue so I'm not slowing the research down. |
RandyF Send message Joined: 2 Nov 14 Posts: 6 Credit: 7,744,262 RAC: 0 |
Thank you! Would love to help test! Pardon my ignorance, but how do I add Ralph@Home work units using the Android blank client? I don't see it in the list of projects. |
[VENETO] boboviz Send message Joined: 1 Dec 05 Posts: 1982 Credit: 9,230,406 RAC: 4,461 |
Pardon my ignorance, but how do I add Ralph@Home work units using the Android blank client? I don't see it in the list of projects. There are three dots at the top right of the project page in the Android client; tap them and choose "add a project by URL". |
nealburns5 Send message Joined: 11 May 19 Posts: 37 Credit: 10,184,436 RAC: 0 |
I think they are all supposed to take around 8 hours (of CPU time) no matter how fast or slow the machine. They must be scaling the WUs based on benchmarks. |
yoerik Send message Joined: 24 Mar 20 Posts: 128 Credit: 169,525 RAC: 0 |
To clarify - I asked about turnaround time (the time between downloading the WU and submitting it for validation) - as the project team member was talking about. They said they were more concerned about turnaround time, not CPU time. Thanks for trying to help, though! |
Falconet Send message Joined: 9 Mar 09 Posts: 352 Credit: 1,112,255 RAC: 642 |
Bcov, thanks for the explanation. I assume this is why some WUs end far earlier than the CPU time target set in the prefs: they computed all the protein work packaged in the WU. I have my prefs set at 24 hours for now. |
Sid Celery Send message Joined: 11 Feb 08 Posts: 2087 Credit: 40,639,753 RAC: 4,834 |
has <run out> of tasks right now It was for a very short time - so short that it didn't exhaust buffers - and tasks started coming through again shortly after. A pre-emptive comment that was resolved before anyone noticed. So with that in mind, yes, 8 hours is better than 4 hours because our server is under a lot of load at the moment. (More users and these design jobs are roughly 100X worse on the server). But, more importantly than the number of hours is the turnaround time, so if your 8 hour jobs are taking a week to finish, there's nothing wrong with going 4 hours. Personally I hold 1.5 days buffer, plus the 8hr runtime, so everything I grab goes back in less than 2 days. The subject's been discussed here before and you seem to confirm again it's the optimal balance of security of work for users and turnaround time for you. As such, 4hr task runtimes only serve to double the hit on your servers, which you confirm they can ill-afford, for zero benefit. If ever you guys need a quicker turnaround time, please pipe up. The adjustment for those who read here and pay attention is trivial. |
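The arithmetic behind "1.5 days buffer plus the 8hr runtime goes back in less than 2 days", and behind the claim that 4-hour runtimes double the number of tasks fetched, can be checked with a few lines. This is a back-of-the-envelope sketch: the function names are hypothetical, and it assumes a task waits out the whole buffer before starting and then runs for exactly its target runtime.

```python
def worst_case_turnaround_days(buffer_days, runtime_hours):
    """Days from download to return if a task sits through the entire
    buffer and then runs for its full target runtime (assumed model)."""
    return buffer_days + runtime_hours / 24.0


def tasks_to_fill_buffer(buffer_days, runtime_hours):
    """Tasks a single core must hold to cover the buffer, assuming each
    task runs for exactly the target runtime."""
    return buffer_days * 24.0 / runtime_hours


if __name__ == "__main__":
    for hours in (8, 4):
        days = worst_case_turnaround_days(1.5, hours)
        tasks = tasks_to_fill_buffer(1.5, hours)
        print(f"{hours}h runtime: {days:.2f} days turnaround, "
              f"{tasks:.1f} tasks per core in the buffer")
```

Under these assumptions, halving the runtime barely changes the turnaround but doubles the number of tasks each core requests, which matches the point about server load made above.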
Sid Celery Send message Joined: 11 Feb 08 Posts: 2087 Credit: 40,639,753 RAC: 4,834 |
Also, on the topic of job length, here's the tricky scenario I'm in. Let's say for instance I have 1M proteins that need to be designed. My goal is to do this as fast as possible. But how to best do this? To avoid wasting your CPU cycles, I need to make sure that all jobs have enough proteins to go for 8 hours. And on this, I was just about to say something. My 8-core PC has 16 GB RAM, running 6 tasks needing 1 GB, 1 needing 1.6 GB and 1 needing 2 GB. I'm only just fitting them in. Same with a 4-core laptop with 8 GB running 4 × 1 GB tasks right now, but which reported insufficient memory earlier today. The reason is now obvious. There are lots of reports of problems with tasks atm. I'd inspect the demands on memory before reporting. With the work being run atm, Rosetta is a very demanding project. Users may need to consider whether upgrades are required to support it fully, not just with RAM but also with CPU cooling. And now may be the time to blow out all those dust-bunnies that've built up in your PC case so it runs cooler. A little bit of maintenance never did any harm. Make use of those masks you bought too! |
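A quick way to check whether a set of tasks fits in memory, using the figures from the post above. This is a sketch under assumptions: `allowed_fraction` is a hypothetical stand-in for whatever share of RAM the client is permitted to use (the post does not state it), and memory used by the operating system or other programs is ignored.

```python
def memory_headroom_gb(total_ram_gb, task_needs_gb, allowed_fraction=1.0):
    """RAM left over (in GB) after all tasks are loaded.

    allowed_fraction is an ASSUMED share of total RAM available to tasks.
    """
    return total_ram_gb * allowed_fraction - sum(task_needs_gb)


def fits(total_ram_gb, task_needs_gb, allowed_fraction=1.0):
    """True if every task can be resident at the same time."""
    return memory_headroom_gb(total_ram_gb, task_needs_gb, allowed_fraction) >= 0


# Task mixes quoted in the post.
desktop_tasks = [1.0] * 6 + [1.6, 2.0]  # 8-core PC with 16 GB
laptop_tasks = [1.0] * 4                # 4-core laptop with 8 GB

if __name__ == "__main__":
    for fraction in (1.0, 0.5):
        print(f"allowed fraction {fraction}: "
              f"desktop fits={fits(16, desktop_tasks, fraction)}, "
              f"laptop fits={fits(8, laptop_tasks, fraction)}")
```

The desktop mix sums to about 9.6 GB, so whether it "only just" fits depends heavily on how much of the 16 GB is really available to the tasks; replacing a 1 GB task with a 2 GB one on the laptop would be enough to tip it over if only half the RAM is usable.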
Sid Celery Send message Joined: 11 Feb 08 Posts: 2087 Credit: 40,639,753 RAC: 4,834 |
Personally I hold 1.5 days buffer, plus the 8hr runtime, so everything I grab goes back in less than 2 days. Inadvertently, it seems I provided the answer to later questions from others. Any task buffer size you choose, from the default values up to these values, is fine, but don't exceed them; otherwise tasks are returned too late. |
Tom Send message Joined: 28 Mar 20 Posts: 1 Credit: 245,052 RAC: 0 |
The international team CRUNCHERS SANS FRONTIERES is on board! Thanks to the Rosetta@home project and its Admins for allowing us to participate in the fight against COVID-19. Keep crunching hard! |
Millenium Send message Joined: 20 Sep 05 Posts: 68 Credit: 184,283 RAC: 0 |
Welcome to all new crunchers! It is time to crunch! And it's interesting to learn more about how the project works. |
©2024 University of Washington
https://www.bakerlab.org