Problems and Technical Issues with Rosetta@home

GoldenHat
Joined: 14 Apr 20
Posts: 3
Credit: 122,663
RAC: 0
Message 95463 - Posted: 28 Apr 2020, 6:48:23 UTC - in response to Message 94755.  

I'm running Windows 10 64-bit. No, I haven't checked the system monitor, but I will. Since that post the problem seems to have disappeared: I rebooted the PC and it's been fine. I notice it sometimes happens for a short period of time but then settles again. I'm not a big techie, so I'm not going to spend ages fault-finding or trying to understand how it works. It's running, so I'll leave it.

Thanks very much for your willingness to assist; I do appreciate you taking the time to reply. I'll keep my eye on the monitor if it goes funky again.

Richard.

Michael E.@ team Carl Sagan
Joined: 5 Apr 08
Posts: 16
Credit: 1,847,250
RAC: 308
Message 95513 - Posted: 28 Apr 2020, 23:18:56 UTC

This post asks about fixing the estimated Remaining time on long Rosetta tasks. I tend to be pretty direct so here goes...

I was using long 36-hour Rosetta tasks and cut them down to 24 hours, but I still have the same issue. This project-specific preference is set in the web interface: Your account > Rosetta@home preferences > Target CPU run time.

On the computer, under Advanced View > Options > Computing Preferences, I set "Store at least" to 1 day of work, but I still get jobs that do not complete and have to be aborted.

With 24+ hour tasks, the estimated Remaining time shows about 6 hours until roughly 6-7 hours have elapsed; only then is a more accurate figure calculated, such as 17 hours left.

Questions/strong suggestion:
+ Could the estimated Remaining time for such 24+ hour tasks be doubled to prevent the need to abort so many tasks?
+ Could there be a limit on the number of downloaded tasks (maybe just long tasks) at a time to 2?

Could the option for long tasks > 10 or 12 hours be disabled until the estimated Remaining time can be fixed? I do not think it is a good practice for people to abort tasks.

Does it matter to the Rosetta@home research if we use 8-10 hour tasks rather than 24+ hour tasks?

Mike

CIA
Joined: 3 May 07
Posts: 100
Credit: 21,059,812
RAC: 0
Message 95515 - Posted: 29 Apr 2020, 0:09:43 UTC - in response to Message 95513.  


> Does it matter to the Rosetta@home research if we use 8-10 hour tasks rather than 24+ hour tasks?


Honestly if you are running 24 hour tasks why even have a cache? Are you running another project besides Rosetta? As long as you have a good internet connection the downtime from finishing a task and getting a fresh one is next to nothing and you are only hitting the server once per day for a new task (vs 3 times a day for 8 hour tasks). All my machines that are set to 24 hour tasks run 0 cache without issue.

Brummit
Joined: 14 Jul 14
Posts: 2
Credit: 30,582
RAC: 0
Message 95517 - Posted: 29 Apr 2020, 1:14:49 UTC - in response to Message 95457.  

Shall do. Step 1 - complete the tasks I have now. Then download more if successful.
Thanks.

Michael E.@ team Carl Sagan
Joined: 5 Apr 08
Posts: 16
Credit: 1,847,250
RAC: 308
Message 95519 - Posted: 29 Apr 2020, 1:50:02 UTC - in response to Message 95515.  

My original question: Does it matter to the Rosetta@home research if we use 8-10 hour tasks rather than 24+ hour tasks?


> Honestly if you are running 24 hour tasks why even have a cache? Are you running another project besides Rosetta? As long as you have a good internet connection the downtime from finishing a task and getting a fresh one is next to nothing and you are only hitting the server once per day for a new task (vs 3 times a day for 8 hour tasks). All my machines that are set to 24 hour tasks run 0 cache without issue.


So you are telling me that the same type of tasks are sent regardless of task length? That is, they get split up so there can be smaller tasks?

I want to understand the needs of the researchers. For example, if longer tasks do different types of calculations than small tasks and few people process them, I can do the long tasks.

robertmiles
Joined: 16 Jun 08
Posts: 1225
Credit: 13,856,762
RAC: 2,065
Message 95523 - Posted: 29 Apr 2020, 2:55:31 UTC - in response to Message 95519.  
Last modified: 29 Apr 2020, 2:57:21 UTC

[snip]
> So you are telling me that the same type of tasks are sent regardless of task length? That is, they get split up so there can be smaller tasks?
>
> I want to understand the needs of the researchers. For example, if longer tasks do different types of calculations than small tasks and few people process them, I can do the long tasks.

The tasks are sent out as batches of calculations, sometimes with one starting point and sometimes with a list of starting points. This part is the same between short and long tasks. There are often 100 steps per batch.

A target time set by the user is sent along with them. This controls how many steps of the batch are calculated.
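
The relationship described above (a fixed batch of steps, with the target time deciding how many get run) can be pictured with a small toy model. This is only a sketch under stated assumptions: the function name, the fixed time per step, and the rule that a new step starts only while the elapsed time is below the target are all hypothetical, not taken from the Rosetta@home application.

```python
def steps_completed(step_hours, target_hours, batch_size=100):
    """Count how many steps of a batch finish for a given target run time.

    Hypothetical rule: a new step is started only while the elapsed time
    is still below the target, so the last step may overrun the target.
    """
    elapsed = 0.0
    done = 0
    while done < batch_size and elapsed < target_hours:
        elapsed += step_hours  # pretend every step costs the same time
        done += 1
    return done, elapsed


# Same batch, different user target times (half-hour steps are assumed).
print(steps_completed(0.5, 8))    # shorter target, fewer steps
print(steps_completed(0.5, 24))   # longer target, more steps
# A slow step can push the task past its target:
print(steps_completed(3.0, 4))
```

In this model the batch itself is identical for short and long tasks; only the stopping point differs.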

There has been no clear statement on how two tasks from the same workunit are compared if they haven't done an equal number of steps.

Sometimes, the quality of the last step can be calculated rapidly; if so, this calculation is often used in place of an additional task per workunit to allow comparison.

There has been talk, but not yet action, about a new class of workunits that can use up to 4 gigabytes of memory each, rather than the usual up to 2 gigabytes. This is intended to allow work on larger proteins, which will probably also require larger target times.

Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 95524 - Posted: 29 Apr 2020, 3:07:34 UTC - in response to Message 95519.  

> I want to understand the needs of the researchers. For example, if longer tasks do different types of calculations than small tasks and few people process them, I can do the long tasks.


The long tasks do the same calculations as the short tasks. They just do more of them. Check the number of "decoys" in your completed results. What the researchers need is thousands of completed decoys. Long tasks might complete 100 of them, short tasks might complete 20 of them. Combine a machine running long ones with a machine running short ones and you get 120 completed decoys.
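
As a rough illustration of the arithmetic above, the sketch below assumes a fixed cost of 12 minutes per decoy. That figure is purely hypothetical (real decoy times vary by work unit) and was chosen only so the numbers match the 100 and 20 in the example.

```python
MINUTES_PER_DECOY = 12  # assumed for illustration only; real values vary


def decoys_completed(runtime_hours):
    """Decoys finished in a task of the given length, at a fixed assumed cost."""
    return (runtime_hours * 60) // MINUTES_PER_DECOY


long_task = decoys_completed(20)   # one machine running a long task
short_task = decoys_completed(4)   # another machine running a short task
print(long_task, short_task, long_task + short_task)

# The same 24 CPU hours spent on a single task give the same total.
print(decoys_completed(24))
```

Under this assumption the total number of decoys depends on CPU hours contributed, not on how those hours are split into tasks.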

...and once you successfully complete and report about a dozen tasks of the same runtime preference, BOINC Manager will have a much better guess on the runtime to expect for future tasks. Once the estimated runtime of an unstarted task approaches your current runtime preference, you will stop getting more work than you can complete before the deadline (assuming your cache size is less than the 3 day deadlines). To help things in the meantime, a smaller cache size helps avoid getting more work than you can complete within the 3 day deadline.
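
To see why a too-low estimate combined with a cache can overrun the deadline, here is a deliberately simplified model. It is not BOINC's actual work-fetch algorithm; the function names, the 4-core host, and the "fill the cache by estimated hours" rule are assumptions made for the sketch. The 6-hour estimate, 24-hour actual runtime, 1-day cache, and 3-day deadline come from the posts above.

```python
import math

DEADLINE_HOURS = 3 * 24  # 3-day deadline mentioned in the thread


def tasks_fetched(cache_days, cores, estimated_hours):
    """Tasks needed to fill the cache if each is believed to take estimated_hours."""
    return math.ceil(cache_days * 24 * cores / estimated_hours)


def hours_to_finish(n_tasks, cores, actual_hours):
    """Wall-clock hours to run n_tasks, one task per core at a time."""
    return math.ceil(n_tasks / cores) * actual_hours


# Estimate says 6 hours, tasks really take 24 hours (hypothetical 4-core host).
fetched = tasks_fetched(cache_days=1, cores=4, estimated_hours=6)
needed = hours_to_finish(fetched, cores=4, actual_hours=24)
print(fetched, needed, needed <= DEADLINE_HOURS)

# Once the estimate matches reality, the same cache setting is harmless.
fetched = tasks_fetched(cache_days=1, cores=4, estimated_hours=24)
needed = hours_to_finish(fetched, cores=4, actual_hours=24)
print(fetched, needed, needed <= DEADLINE_HOURS)
```

In this toy model the underestimate makes the host fetch four times as much work as it can run in a day, which is why a smaller cache helps until the estimate settles.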
Rosetta Moderator: Mod.Sense

Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 95525 - Posted: 29 Apr 2020, 3:14:50 UTC - in response to Message 95523.  

> There has been talk, but not yet action, about a new class of workunits that can use up to 4 gigabytes of memory each, rather than the usual up to 2 gigabytes. This is intended to allow work on larger proteins, which will probably also require larger target times.


Actually, Admin said the models for those particular large proteins should only run about an hour, and typically consumed 1.8GB. I had presumed they would take longer as well, so I apologize for contributing to the mistaken information. It all depends on the type of study they perform on the protein. In this case, they are planning to do "comparative modeling" on the large proteins.
Rosetta Moderator: Mod.Sense

robertmiles
Joined: 16 Jun 08
Posts: 1225
Credit: 13,856,762
RAC: 2,065
Message 95526 - Posted: 29 Apr 2020, 3:30:09 UTC - in response to Message 95525.  
Last modified: 29 Apr 2020, 3:31:53 UTC

>> There has been talk, but not yet action, about a new class of workunits that can use up to 4 gigabytes of memory each, rather than the usual up to 2 gigabytes. This is intended to allow work on larger proteins, which will probably also require larger target times.
>
> Actually, Admin said the models for those particular large proteins should only run about an hour, and typically consumed 1.8GB. I had presumed they would take longer as well, so I apologize for contributing to the mistaken information. It all depends on the type of study they perform on the protein. In this case, they are planning to do "comparative modeling" on the large proteins.

Correction - models needing up to 4 gigabytes are talked about for SOME BOINC project, but I'm not sure if it is Rosetta@home.

Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 95527 - Posted: 29 Apr 2020, 3:37:41 UTC - in response to Message 95526.  
Last modified: 29 Apr 2020, 3:39:00 UTC

R@h has talked about "4GB tasks"... where they are telling the BOINC Manager "kill this task if it should ever try to go larger than 4GB of memory, something is wrong with it if it is going that large (memory bound)". The actual observed footprint though is typically 1.8GB.
Rosetta Moderator: Mod.Sense

Michael E.@ team Carl Sagan
Joined: 5 Apr 08
Posts: 16
Credit: 1,847,250
RAC: 308
Message 95528 - Posted: 29 Apr 2020, 4:31:36 UTC - in response to Message 95527.  

Many thanks to Mod.Sense, robertmiles, CIA and (previously) Grant for the clear explanations.

So it seems R@h users should stick to the default task size, which is 8 hours. For older systems or those not used 24x7, choose shorter length tasks.

OK about the system learning the Remaining time after about 12 tasks (good!). I do think limiting the number of work units for new hosts that have not run a dozen tasks would help.

Also, I ran into a strange situation where a 24-hour work unit reached zero Remaining time but kept processing for 10 extra hours. Work unit was 1043928617 and task 1043928617.
I had to restart my PC and that task reset to 10 hours remaining so I aborted it as it was out of time.

If you want a beta tester for the large R@h tasks, let me know.

Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 95529 - Posted: 29 Apr 2020, 5:14:10 UTC - in response to Message 95528.  

> If you want a beta tester for the large R@h tasks, let me know.


Folks that enjoy this sorta thing can join Ralph@home. This project is run by the same folks as R@h; "Ralph" is sorta short for "Rosetta Alpha". It works just like any other BOINC project. It is where new application versions, server configurations, and work unit types are tested. They only send work when they need testers though, so get connected and let the BOINC scheduler keep asking for work periodically, and eventually you will get some.
Rosetta Moderator: Mod.Sense

Daedalus
Joined: 1 Aug 08
Posts: 39
Credit: 9,978,641
RAC: 990
Message 95558 - Posted: 29 Apr 2020, 17:34:56 UTC

I get an awful lot of tasks that complete very slowly. I set my runtime goal to 4 hours. Most rotten tasks advance at exactly 6.480% an hour. I kill them mercilessly. But I have to babysit Rosetta and check all my tasks one by one as soon as they start crunching. This is a waste of my time and my electricity. Should I dump my whole queue?

P.S. Previously I was getting a lot of tasks with a completion time of 9 hours. But you could detect them and dump them before they started, so less effort was needed.

Millenium
Joined: 20 Sep 05
Posts: 68
Credit: 184,283
RAC: 0
Message 95559 - Posted: 29 Apr 2020, 18:28:09 UTC

Or maybe they are just fine and it takes more than 4 hours to complete. The default runtime is 8 hours after all. What happens if you let them run? Do they keep going for days? Do they eventually end?

Daedalus
Joined: 1 Aug 08
Posts: 39
Credit: 9,978,641
RAC: 990
Message 95562 - Posted: 29 Apr 2020, 19:56:26 UTC

OK, I may have an optimistic explanation for this: the completion percentages shown in the BOINC Manager might be wildly inaccurate.

I had "very slow" WUs and "less slow" WUs. I killed a lot of the very slow ones and let the less slow ones run their course. Surprise: the less slow tasks, projected to take 6 hours or more, stopped at around four and a half hours, as promised. They were at 85-90% completion and suddenly jumped to 100% with the correct run time.

Let's hope it will be the same with the "very slow" ones. I will only know tomorrow.
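
For reference, this is the kind of linear projection that makes a task look "very slow". It is only a sketch: the function name is made up, and the assumption that progress advances at a constant rate is exactly what the observation above calls into question. The 6.480% per hour figure and the 4-hour runtime goal are from the earlier post.

```python
def projected_hours(percent_per_hour):
    """Naive total run time if the progress bar advanced at a constant rate."""
    return 100.0 / percent_per_hour


TARGET_HOURS = 4  # runtime goal mentioned earlier in the thread

projection = projected_hours(6.48)
print(round(projection, 1))       # hours implied by a steady 6.480% per hour
print(projection > TARGET_HOURS)  # projection far exceeds the target
```

A projection this far above the target is a reason to distrust the progress percentage rather than the task, given that the "less slow" tasks finished close to the target anyway.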

robertmiles
Joined: 16 Jun 08
Posts: 1225
Credit: 13,856,762
RAC: 2,065
Message 95566 - Posted: 29 Apr 2020, 20:41:58 UTC - in response to Message 95563.  

Daedalus,

The estimated completion times usually ARE wildly inaccurate for about the next dozen tasks after any of the following events occur:

1. You adjust your target time on the Rosetta@home server.

2. You start using a new version of a Rosetta@home application.

3. Any other actions making a big difference in how long the tasks run. For example, having Folding@home and Rosetta@home running at the same time without doing enough to eliminate their competition for CPU time.
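
One way to picture why it takes roughly a dozen tasks for the estimate to recover is a simple running average that starts from a stale value. This is a toy model, not the estimator BOINC actually uses; the function name, the 6-hour starting estimate, and the 24-hour true runtime are assumptions for the sketch.

```python
def estimates_after_each_task(initial_estimate, observed_runtimes):
    """Running average of runtimes, seeded with a stale initial estimate."""
    total = initial_estimate
    count = 1
    history = []
    for runtime in observed_runtimes:
        total += runtime
        count += 1
        history.append(total / count)  # estimate used for the next task
    return history


# Stale estimate of 6 hours, tasks now really take 24 hours.
history = estimates_after_each_task(6.0, [24.0] * 12)
print([round(h, 1) for h in history])
```

In this model the estimate climbs toward the true runtime with every completed task, and any of the events listed above would reset that process.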

Tom M
Joined: 20 Jun 17
Posts: 85
Credit: 5,196,651
RAC: 47,128
Message 95572 - Posted: 29 Apr 2020, 23:30:00 UTC - in response to Message 95562.  

> OK, I may have an optimistic explanation for this: the completion percentages shown in the BOINC Manager might be wildly inaccurate.
>
> I had "very slow" WUs and "less slow" WUs. I killed a lot of the very slow ones and let the less slow ones run their course. Surprise: the less slow tasks, projected to take 6 hours or more, stopped at around four and a half hours, as promised. They were at 85-90% completion and suddenly jumped to 100% with the correct run time.
>
> Let's hope it will be the same with the "very slow" ones. I will only know tomorrow.


Sounds reasonable. Just wait for it :)

Tom M
Help, my tagline is missing..... Help, my tagline is......... Help, m........ Hel.....

Daedalus
Joined: 1 Aug 08
Posts: 39
Credit: 9,978,641
RAC: 990
Message 95588 - Posted: 30 Apr 2020, 9:34:38 UTC

Yes, save for a few tasks cancelled by the server, all my tasks computed normally. So I will ignore the progress reported by the BOINC Manager. :)

Millenium
Joined: 20 Sep 05
Posts: 68
Credit: 184,283
RAC: 0
Message 95596 - Posted: 30 Apr 2020, 11:18:47 UTC

Yup, as long as the WUs finish, it's fine. Sure, if you see one running for a day or more, then maybe that WU has a problem.

Sid Celery
Joined: 11 Feb 08
Posts: 1990
Credit: 38,522,839
RAC: 15,277
Message 95612 - Posted: 30 Apr 2020, 15:17:03 UTC - in response to Message 95517.  

> Shall do. Step 1 - complete the tasks I have now. Then download more if successful.
> Thanks.

It seems like you're not running tasks 12-15hrs every day. Or if you are, maybe you have some kind of setting to suspend work while the computer is in use?
It's not clear why nothing is returned within 3 days when you only have 8hr tasks if your machine is on 12-15hrs a day.

©2024 University of Washington
https://www.bakerlab.org