0 new tasks, Rosetta?

Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 11,814,265
RAC: 12,040
Message 98805 - Posted: 7 Sep 2020, 20:59:51 UTC - in response to Message 98803.  

I like the 2nd idea better, as a strict number of tasks means the person bringing a 64 core machine can't even get enough to fill it up once if the number is too small. 3 per core is a good start: 8 hour tasks times 3 tasks equals 24 hours. Then let more flow through as the machine starts returning tasks, up to the amount they can return in 3 days. If you want the user to feel like they have enough tasks, then 4 or 5 per core would be more than 24 hours of work. I think a short 'Notice' from Rosetta would help people figure out what's going on and why they aren't getting 300 tasks to 'fill their cache'.

On a side note I HATE the default of Boinc's 10 day workunit cache setting!!! I would much rather see a 1 or 2 day setting to start with and then let the user up that as they start to figure out what's going on. Sending a brand new user 300 workunits that take 5 hours each on a quad core machine is crazy!!

I think the Team GPUUsersGroup has a Boinc add-on that limits the number of workunits you can get from a project, and that would help a lot at those projects that just keep endlessly sending workunits even though you have 1/2 day cache settings.
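As a rough illustration of the limit suggested in the quoted post (a starting allowance per core that grows to what the host can return inside the deadline), here is a minimal Python sketch. The function name, its parameters, and the use of max() are assumptions made for illustration only; none of this is actual BOINC or Rosetta server code.

```python
import math


def in_progress_limit(cores, returned_per_day=None, per_core=3, deadline_days=3):
    """Hypothetical cap on the number of tasks a host may hold at once.

    cores            -- number of CPU cores on the host
    returned_per_day -- observed tasks returned per day, or None for a new host
    per_core         -- starting allowance per core (3 in the quoted post)
    deadline_days    -- task deadline in days (3 in this thread)
    """
    starting_limit = per_core * cores
    if returned_per_day is None:
        # New host with no history: fall back to the per-core allowance.
        return starting_limit
    # Assumption: never drop below the starting allowance, and grow up to
    # what the host has shown it can return before the deadline.
    return max(starting_limit, math.floor(returned_per_day * deadline_days))
```

With 8 hour tasks, 3 per core is 24 hours of work, so in_progress_limit(64) allows 192 tasks on a new 64 core machine, and a host that has been returning 192 tasks a day could hold up to 576.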


I thought the Boinc default was quite small, although since I immediately change it on installation I may have misremembered.

Are you telling me Rosetta actually sends 10 days of work with a 3 day deadline? Boinc can't be that useless.
Profile robertmiles

Joined: 16 Jun 08
Posts: 1232
Credit: 14,277,903
RAC: 1,635
Message 98826 - Posted: 7 Sep 2020, 22:46:08 UTC - in response to Message 98805.  

[snip]

Are you telling me Rosetta actually sends 10 days of work with a 3 day deadline? Boinc can't be that useless.


It's not that useless, AFTER enough tasks of each new version have been returned for a proper calculation of how long each task will run. This is usually 10 successful tasks.

But watch for the problem to start again every time the application is updated.
Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 11,814,265
RAC: 12,040
Message 98834 - Posted: 7 Sep 2020, 23:47:45 UTC - in response to Message 98826.  

[snip]

Are you telling me Rosetta actually sends 10 days of work with a 3 day deadline? Boinc can't be that useless.


It's not that useless, AFTER enough tasks of each new version have been returned for a proper calculation of how long each task will run. This is usually 10 successful tasks.

But watch for the problem to start again every time the application is updated.


Surely before it has its own timing, it will believe what the server tells it. And the Rosetta server knows all tasks are exactly 8 hours.
Profile Grant (SSSF)

Joined: 28 Mar 20
Posts: 1679
Credit: 17,825,538
RAC: 22,979
Message 98837 - Posted: 7 Sep 2020, 23:55:42 UTC - in response to Message 98834.  
Last modified: 8 Sep 2020, 0:00:12 UTC

And the Rosetta server knows all tasks are exactly 8 hours.
Some people run them for 2 hours, others for a day and a half.




The changes made several months back now set the initial Estimated completion time to 8 hours. So the problems of some systems picking up hundreds of Tasks when a new application is released (or a new system is attached to Rosetta) should no longer occur.
But people with large caches, or limited time when BOINC can actually do work, may still end up with a few more than they can process and miss the deadline. But not the hundreds+ that used to occur.
Grant
Darwin NT
mikey
Joined: 5 Jan 06
Posts: 1895
Credit: 9,165,606
RAC: 4,021
Message 98842 - Posted: 8 Sep 2020, 0:34:45 UTC - in response to Message 98837.  

And the Rosetta server knows all tasks are exactly 8 hours.
Some people run them for 2 hours, others for a day and a half.




The changes made several months back now set the initial Estimated completion time to 8 hours. So the problems of some systems picking up hundreds of Tasks when a new application is released (or a new system is attached to Rosetta) should no longer occur.
But people with large caches, or limited time when BOINC can actually do work, may still end up with a few more than they can process and miss the deadline. But not the hundreds+ that used to occur.


That's very good!!

I have mine set at 2 hours but they don't stop at 2 hours, some go longer which is actually okay with me because I don't run them on a regular basis.
Keith Myers
Joined: 29 Mar 20
Posts: 97
Credit: 332,619
RAC: 363
Message 98847 - Posted: 8 Sep 2020, 0:59:33 UTC - in response to Message 98805.  

I thought the Boinc default was quite small, although since I immediately change it on installation I may have misremembered.

Are you telling me Rosetta actually sends 10 days of work with a 3 day deadline? Boinc can't be that useless.

Ha ha LOL. My very first connection to Rosetta upon joining sent me 246 tasks in a single download after congratulating me for joining. With a 3 day deadline. Had to abort all but ten after setting NNT before the next scheduler connection or it would have kept sending work.
Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 11,814,265
RAC: 12,040
Message 98850 - Posted: 8 Sep 2020, 1:48:26 UTC - in response to Message 98847.  

I thought the Boinc default was quite small, although since I immediately change it on installation I may have misremembered.

Are you telling me Rosetta actually sends 10 days of work with a 3 day deadline? Boinc can't be that useless.

Ha ha LOL. My very first connection to Rosetta upon joining sent me 246 tasks in a single download after congratulating me for joining. With a 3 day deadline. Had to abort all but ten after setting NNT before the next scheduler connection or it would have kept sending work.


Then there was the bug which only occurred on a couple of my machines, where they downloaded 1 task every 10 minutes, vastly exceeding my buffer setting. Restarting the Boinc client seemed to stop it being silly.
Profile Grant (SSSF)

Joined: 28 Mar 20
Posts: 1679
Credit: 17,825,538
RAC: 22,979
Message 98856 - Posted: 8 Sep 2020, 2:05:40 UTC - in response to Message 98842.  

I have mine set at 2 hours but they don't stop at 2 hours, some go longer which is actually okay with me because I don't run them on a regular basis.
The default is 8 hours. Most Tasks can produce useful results within 2 hours, but others can't, so they will continue to run until they can produce a useful result, or run for up to another 10 hours, at which time regardless of the result they are ended by the watchdog timer, whichever comes first.
Grant
Darwin NT
Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 11,814,265
RAC: 12,040
Message 98858 - Posted: 8 Sep 2020, 2:17:01 UTC - in response to Message 98856.  

I have mine set at 2 hours but they don't stop at 2 hours, some go longer which is actually okay with me because I don't run them on a regular basis.
The default is 8 hours. Most Tasks can produce useful results within 2 hours, but others can't, so they will continue to run until they can produce a useful result, or run for up to another 10 hours, at which time regardless of the result they are ended by the watchdog timer, whichever comes first.


Mine stop between 7 hours 50 and 8 hours 5. So I assume the timer allows the current result to complete?
Profile Grant (SSSF)

Joined: 28 Mar 20
Posts: 1679
Credit: 17,825,538
RAC: 22,979
Message 98860 - Posted: 8 Sep 2020, 2:26:28 UTC - in response to Message 98858.  

Mine stop between 7 hours 50 and 8 hours 5. So I assume the timer allows the current result to complete?
Yep.
If the Target CPU time is 8 hours, and it figures the next Decoy can't be done before the 8 hours is up, then it will end the Task there; those are the Tasks that finish before 8 hours. If it thinks it can do the next Decoy within 8 hours, then it'll start work on it, and in most cases it'll finish not long after 8 hours.
But every so often you'll get some Tasks that can't finish the Decoy anywhere near the Target CPU time, so that's where the Watchdog timer comes in- it gives it an extra 10 hours to finish & if it's not done by then it gets ended anyway.
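The behaviour described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function and variable names are made up, and guessing the next Decoy's duration from the average of the Decoys already completed is an assumption, since the post does not say how the application makes that guess.

```python
WATCHDOG_EXTRA_HOURS = 10.0  # extra allowance mentioned in the post above


def run_task(decoy_hours, target_hours=8.0):
    """Simulate one Task; decoy_hours lists how long each Decoy would take.

    Returns (cpu_hours_used, decoys_completed, ended_by_watchdog).
    """
    elapsed = 0.0
    completed = 0
    watchdog_limit = target_hours + WATCHDOG_EXTRA_HOURS
    for hours in decoy_hours:
        if completed > 0:
            # Assumption: predict the next Decoy from the average so far.
            average = elapsed / completed
            if elapsed + average > target_hours:
                break  # next Decoy is not expected to fit, so stop early
        if elapsed + hours > watchdog_limit:
            # The Decoy never finishes in time; the watchdog ends the Task.
            return watchdog_limit, completed, True
        elapsed += hours
        completed += 1
    return elapsed, completed, False
```

For example, ten Decoys of 1.5 hours each stop after five (7.5 hours), a slow fifth Decoy of 2.25 hours pushes the Task to 8.25 hours, and a single Decoy that would need 30 hours is cut off at 18 hours.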
Grant
Darwin NT
Profile robertmiles

Joined: 16 Jun 08
Posts: 1232
Credit: 14,277,903
RAC: 1,635
Message 98863 - Posted: 8 Sep 2020, 3:05:27 UTC - in response to Message 98834.  

[snip]

Are you telling me Rosetta actually sends 10 days of work with a 3 day deadline? Boinc can't be that useless.


It's not that useless, AFTER enough tasks of each new version have been returned for a proper calculation of how long each task will run. This is usually 10 successful tasks.

But watch for the problem to start again every time the application is updated.


Surely before it has its own timing, it will believe what the server tells it. And the Rosetta server knows all tasks are exactly 8 hours.

The server doesn't know how to tell the client that in a way that the client will understand. You might try telling the BOINC developers to add a feature to let the server set the run time based on what the server knows. If they add it, though, expect Rosetta@home and Ralph@home to be the only two BOINC projects that use it.
Sid Celery

Joined: 11 Feb 08
Posts: 2124
Credit: 41,206,907
RAC: 10,305
Message 98879 - Posted: 8 Sep 2020, 16:47:49 UTC - in response to Message 98778.  

I hadn't been paying much attention recently, but while restarting my PC that crashed a few days ago I just started refilling my cache - grabbed 17 ok and then that seemed to be the very last of them.
Now actually zero ready to send and nothing being downloaded

I managed one re-send since, but just now have managed to fill my cache on 2 machines.
Not sure if it's a sign of many coming through as all figures still show zero, but dribs and drabs are making an appearance.
Hopefully that gives the project guys a few days extra to get some more regular work available.

Not entirely sure what everyone else is talking about, but having over 3 million tasks queued to send means everything's been sorted very quickly.
Thanks guys
Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 11,814,265
RAC: 12,040
Message 98880 - Posted: 8 Sep 2020, 18:13:46 UTC - in response to Message 98863.  

Surely before it has its own timing, it will believe what the server tells it. And the Rosetta server knows all tasks are exactly 8 hours.

The server doesn't know how to tell the client that in a way that the client will understand. You might try telling the BOINC developers to add a feature to let the server set the run time based on what the server knows. If they add it, though, expect Rosetta@home and Ralph@home to be the only two BOINC projects that use it.


I thought Boinc already did this. If I set up a new machine with Boinc, it downloads a sensible amount. Or is that just a standard number of tasks set at the project end? For example the huge climate change ones only downloaded one, but projects with small tasks downloaded many.

This seems like a very basic thing that should have been included from the start. I don't understand Boinc programmers at all. And they don't like my attitude, so I won't bother asking.
Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 11,814,265
RAC: 12,040
Message 98881 - Posted: 8 Sep 2020, 18:14:40 UTC - in response to Message 98879.  

I hadn't been paying much attention recently, but while restarting my PC that crashed a few days ago I just started refilling my cache - grabbed 17 ok and then that seemed to be the very last of them.
Now actually zero ready to send and nothing being downloaded

I managed one re-send since, but just now have managed to fill my cache on 2 machines.
Not sure if it's a sign of many coming through as all figures still show zero, but dribs and drabs are making an appearance.
Hopefully that gives the project guys a few days extra to get some more regular work available.

Not entirely sure what everyone else is talking about, but having over 3 million tasks queued to send means everything's been sorted very quickly.
Thanks guys


I wish other projects would tell us their queue size.
CIA

Joined: 3 May 07
Posts: 100
Credit: 21,059,812
RAC: 0
Message 98883 - Posted: 8 Sep 2020, 19:10:21 UTC - in response to Message 98881.  
Last modified: 8 Sep 2020, 19:11:21 UTC

Since we are off topic anyway, I have 9 machines running on my account. They run Rosetta 24/7. Back in March when I set these 9 machines up, they had the default 8hr runtime, but these days 8 of the 9 are now set to 24hr runtimes. They've been on 24hr runtimes for about 3 months, and returned hundreds of tasks while at that time limit. Boinc still shows fresh tasks as 8 hours on them though, so I have my cache at 0 to avoid getting dozens of WU's I can't possibly finish before the deadline. Once they start crunching they gradually adjust the "Time Remaining" and "Elapsed" time to = 24 hours, but when they are new they still show 8.

Do I just need to detach and re-attach to Rosetta to get it to see these machines now have a 24hr runtime set on them?
Profile Grant (SSSF)

Joined: 28 Mar 20
Posts: 1679
Credit: 17,825,538
RAC: 22,979
Message 98884 - Posted: 8 Sep 2020, 19:18:15 UTC - in response to Message 98880.  

I thought Boinc already did this. If I set up a new machine with Boinc, it downloads a sensible amount. Or is that just a standard number of tasks set at the project end? For example the huge climate change ones only downloaded one, but projects with small tasks downloaded many.

This seems like a very basic thing that should have been included from the start. I don't understand Boinc programmers at all. And they don't like my attitude, so I won't bother asking.
The problem isn't with BOINC, it's with the projects.
There is no way on Earth BOINC can know how long it will take to process something until it has actually processed it successfully. Then it can make reasonable estimates for completion time based on the processing time from that previous work. And if a system hasn't done any processing for any other projects previously (a fresh install) then it also doesn't know how much time the system will be on, or how much of that time it will have to do BOINC processing.

It is up to each project to provide the initial estimate of how long it will take to process a Task, that's what the wu.fpops_est / fpops_est value is for (amongst other things). Ideally they would provide an estimate that will result in a fairly high initial Estimated completion time, which will be revised down to the actual value as Valid work is returned.
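To make the arithmetic behind that initial estimate concrete, here is a minimal Python sketch. It assumes the estimate is simply the declared operation count divided by an assumed speed for the host; the function names and the 4 GFLOPS figure are hypothetical, and the real BOINC client applies further corrections that are not modelled here.

```python
SECONDS_PER_HOUR = 3600.0


def initial_estimate_hours(fpops_est, assumed_flops):
    """Estimated run time before any task of this application has finished.

    fpops_est     -- floating-point operations the project declares for the task
    assumed_flops -- speed assumed for the host, in operations per second
    """
    return fpops_est / assumed_flops / SECONDS_PER_HOUR


def fpops_for_target(target_hours, assumed_flops):
    """Operation count a project would declare to get a chosen initial estimate."""
    return target_hours * SECONDS_PER_HOUR * assumed_flops
```

Under these assumptions, a project that wants an 8 hour initial estimate on a host assumed to run at 4 GFLOPS would declare about 1.152e14 operations. If the application really only achieves 2 GFLOPS on that host, the same declaration corresponds to 16 hours of work, which is why the estimate has to be corrected from returned results.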


AFAIK Rosetta is the only project that has a fixed processing time. All the other projects have Work Units that are processed until there is no more data to be processed. The more powerful the processor, the sooner it is done. Since Rosetta Tasks run for a fixed length of time, and people can select how long that is, the best method would be for the initial Estimated completion time to be based on their Target CPU time. If it's set for 2 hours, then the Estimated completion time should be 2 hours. 36hrs, then the Estimated completion time should be 36hours.
In the end, the project went with using the default Target CPU time, which is 8 hours. So those that have opted for 2 hours won't get as much work as their cache settings would need, and those that chose 36hrs will get slightly more than they will be able to process before the deadline.
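The effect of a fixed 8 hour estimate on cache filling can be worked through with a small Python sketch. It assumes the machine runs around the clock, that the client requests just enough tasks to cover the cache at the estimated run time, and that tasks have the 3 day deadline mentioned earlier in the thread; the function names and example numbers are hypothetical.

```python
import math


def tasks_requested(cache_days, cores, estimated_hours):
    """Tasks needed to fill the cache, going by the estimated run time."""
    return math.floor(cache_days * 24 * cores / estimated_hours)


def tasks_finishable(deadline_days, cores, actual_hours):
    """Tasks the host can really finish before the deadline."""
    return math.floor(deadline_days * 24 * cores / actual_hours)
```

With a 2 day cache on 4 cores, tasks_requested(2, 4, 8) is 24. At a 24 hour Target CPU time, tasks_finishable(3, 4, 24) is only 12, so some of those tasks would miss the deadline. At a 2 hour Target CPU time, the same 24 tasks amount to 12 hours of work rather than 2 days.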

There hasn't been a new Application released since these changes were made, but it's pretty likely we won't see a repeat of previous new application rollouts where people were getting hundreds (if not thousands) of Tasks they were never going to be able to finish in time.
Now the only issues are when people have larger than needed caches, or are micro managing things & don't actually understand about caching & resource share. BOINC is working as intended, it is doing as they have asked it to. They just don't understand what it is they've asked, because they don't understand how things are actually meant to work.
Grant
Darwin NT
Profile robertmiles

Joined: 16 Jun 08
Posts: 1232
Credit: 14,277,903
RAC: 1,635
Message 98886 - Posted: 8 Sep 2020, 19:26:27 UTC - in response to Message 98880.  
Last modified: 8 Sep 2020, 19:33:21 UTC

Surely before it has its own timing, it will believe what the server tells it. And the Rosetta server knows all tasks are exactly 8 hours.

The server doesn't know how to tell the client that in a way that the client will understand. You might try telling the BOINC developers to add a feature to let the server set the run time based on what the server knows. If they add it, though, expect Rosetta@home and Ralph@home to be the only two BOINC projects that use it.


I thought Boinc already did this. If I set up a new machine with Boinc, it downloads a sensible amount. Or is that just a standard number of tasks set at the project end? For example the huge climate change ones only downloaded one, but projects with small tasks downloaded many.

This seems like a very basic thing that should have been included from the start. I don't understand Boinc programmers at all. And they don't like my attitude, so I won't bother asking.


The server already tells the client how much calculation is required. If the client already has a good number for its speed of its calculations, then calculating how many tasks to download is simple. Some BOINC projects are better than others at supplying good default values to be used for the speed until it can be measured, usually from the speed of the first ten successfully finished tasks.

It would probably be better, though, if for new versions of an application, the starting estimate of the speed was about the same as for the previous version.
Brian Nixon

Joined: 12 Apr 20
Posts: 293
Credit: 8,432,366
RAC: 0
Message 98888 - Posted: 8 Sep 2020, 20:04:32 UTC - in response to Message 98883.  

CIA wrote:
we are off topic
Not everybody appreciates that, so be careful. But since you asked a question here, I’m happy to try to answer it here… :-⁠)


Do I just need to detach and re-attach to Rosetta to get it to see these machines now have a 24hr runtime set on them?
It won’t make any difference. All Rosetta tasks are delivered to all clients with a declaration that they will require 8 hours of CPU time. That is regardless of user run time preference, as I assume it would be an excessive load on the server to adjust the value for each task it sends out. Moreover, the project has disabled the mechanism by which each BOINC client can learn over time how the actual CPU time differs from the declared value. The result is that tasks that have not yet started will always show a remaining time estimate of 8 hours. Once a task has started running (and taken the run time preference into account) the calculation is based on the rate at which it makes progress, and so becomes more meaningful. (More detailed discussion starting here.)
Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 11,814,265
RAC: 12,040
Message 98890 - Posted: 8 Sep 2020, 21:13:31 UTC - in response to Message 98884.  

I thought Boinc already did this. If I set up a new machine with Boinc, it downloads a sensible amount. Or is that just a standard number of tasks set at the project end? For example the huge climate change ones only downloaded one, but projects with small tasks downloaded many.

This seems like a very basic thing that should have been included from the start. I don't understand Boinc programmers at all. And they don't like my attitude, so I won't bother asking.
The problem isn't with BOINC, it's with the projects.
There is no way on Earth BOINC can know how long it will take to process something until it has actually processed it successfully. Then it can make reasonable estimates for completion time based on the processing time from that previous work. And if a system hasn't done any processing for any other projects previously (a fresh install) then it also doesn't know how much time the system will be on, or how much of that time it will have to do BOINC processing.

It is up to each project to provide the initial estimate of how long it will take to process a Task, that's what the wu.fpops_est / fpops_est value is for (amongst other things). Ideally they would provide an estimate that will result in a fairly high initial Estimated completion time, which will be revised down to the actual value as Valid work is returned.


AFAIK Rosetta is the only project that has a fixed processing time. All the other projects have Work Units that are processed until there is no more data to be processed. The more powerful the processor, the sooner it is done. Since Rosetta Tasks run for a fixed length of time, and people can select how long that is, the best method would be for the initial Estimated completion time to be based on their Target CPU time. If it's set for 2 hours, then the Estimated completion time should be 2 hours. 36hrs, then the Estimated completion time should be 36hours.
In the end, the project went with using the default Target CPU time, which is 8 hours. So those that have opted for 2 hours won't get as much work as their cache settings would need, and those that chose 36hrs will get slightly more than they will be able to process before the deadline.

There hasn't been a new Application released since these changes were made, but it's pretty likely we won't see a repeat of previous new application rollouts where people were getting hundreds (if not thousands) of Tasks they were never going to be able to finish in time.
Now the only issues are when people have larger than needed caches, or are micro managing things & don't actually understand about caching & resource share. BOINC is working as intended, it is doing as they have asked it to. They just don't understand what it is they've asked, because they don't understand how things are actually meant to work.


I don't understand why Boinc can't make a good estimate before it's run one. If you look at the properties of a work unit you've downloaded, it tells you how many flops it has to do. Boinc also knows how many flops per second your CPU runs at, based on its own benchmarks. So it knows pretty accurately how long those tasks will take.
Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 11,814,265
RAC: 12,040
Message 98891 - Posted: 8 Sep 2020, 21:16:30 UTC - in response to Message 98888.  

CIA wrote:
we are off topic
Not everybody appreciates that, so be careful. But since you asked a question here, I’m happy to try to answer it here… :-⁠)


Do I just need to detach and re-attach to Rosetta to get it to see these machines now have a 24hr runtime set on them?
It won’t make any difference. All Rosetta tasks are delivered to all clients with a declaration that they will require 8 hours of CPU time. That is regardless of user run time preference, as I assume it would be an excessive load on the server to adjust the value for each task it sends out. Moreover, the project has disabled the mechanism by which each BOINC client can learn over time how the actual CPU time differs from the declared value. The result is that tasks that have not yet started will always show a remaining time estimate of 8 hours. Once a task has started running (and taken the run time preference into account) the calculation is based on the rate at which it makes progress, and so becomes more meaningful. (More detailed discussion starting here.)


Why does Rosetta do this unusual 8 hour thing? What's wrong with the standard way other projects use, issuing a standard-sized piece of work which takes 1-10 hours depending on CPU speed? WCG does similar work to Rosetta and seems to manage that way.



©2024 University of Washington
https://www.bakerlab.org