Message boards : Number crunching : What are your tips for new Rosetta@home users?
Author | Message |
---|---|
Mod.Zilla Volunteer moderator Joined: 5 Sep 06 Posts: 423 Credit: 6 RAC: 0 |
If you've posted a message with tips that you would like moved to this thread, please send me a private message with the IDs of your posts, and I will move them for you. Please only ask me to move your own posts. My tip is simply to leave the BOINC Manager alone and let it decide what to run. The more you try to force it to do things, the more confused it can get. Post to the Number Crunching message board with any questions you have about how to adjust your settings to achieve specific objectives for your machine(s). Rosetta Informational Moderator: Mod.Zilla |
rulez-alex Joined: 27 Aug 11 Posts: 11 Credit: 189,802 RAC: 0 |
Is it right to increase the Target CPU run time in the project settings from the default 8 hours to one day, in order to reduce the load on the server and calculate more in one task? |
Mod.Sense Volunteer moderator Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
Yes, just be aware that your existing WUs will take on the new runtime preference. So you want to be sure you don't have too many tasks on-board when you make such changes. Rosetta Moderator: Mod.Sense |
Sid Celery Joined: 11 Feb 08 Posts: 2119 Credit: 41,175,312 RAC: 11,123 |
Is it right to increase the processing time of a task from 8 hours in the project settings by default Target CPU run time to one day in order to reduce the load on the server and calculate more in one task. ? Yes, I've been considering this myself. Maybe not to 24hrs, but certainly to 12hrs. The chances of having too many tasks and overfilling buffers to miss deadlines are currently zero. It may be worth reducing the buffer size at the same time for when tasks do come back on stream. If I make the change I'll do both at the same time |
Mod.Sense Volunteer moderator Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
Right, upping the runtime preference when you have no work, but have a 4 day buffer, is going to get you way more than 4 days of work. This is because BOINC Manager hasn't seen the new runtime preference run to completion. So it doesn't know any better than to order work at the previous size. So if you were doing 4 hour WUs, and you bump it to 12, you'll basically get three times more work than you expected. If your cache is 4 days, that would mean you get 12 days of work. i.e. too much. So what Sid is saying is basically reduce the network preferences to, for example, 2 days (down from 4), so BOINC requests 2 days of work at the old runtime it knows about. Rosetta Moderator: Mod.Sense |
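A rough sketch of the arithmetic in the post above, for anyone who wants to plug in their own numbers. It assumes, as described there, that BOINC keeps sizing its requests by the old runtime until a task has completed at the new one. The function names are made up for illustration and are not part of BOINC.

```python
def days_of_work_received(buffer_days, old_runtime_hours, new_runtime_hours):
    """Real days of work that arrive while BOINC still estimates tasks at the old runtime."""
    if buffer_days < 0 or old_runtime_hours <= 0 or new_runtime_hours <= 0:
        raise ValueError("buffer must be >= 0 and runtimes must be positive")
    # Tasks are requested as if they took old_runtime_hours,
    # but each one actually runs for new_runtime_hours.
    return buffer_days * new_runtime_hours / old_runtime_hours


def buffer_for_target(target_days, old_runtime_hours, new_runtime_hours):
    """Temporary buffer setting that yields roughly target_days of real work."""
    if target_days < 0 or old_runtime_hours <= 0 or new_runtime_hours <= 0:
        raise ValueError("target must be >= 0 and runtimes must be positive")
    return target_days * old_runtime_hours / new_runtime_hours


if __name__ == "__main__":
    # 4 day cache, runtime bumped from 4 to 12 hours.
    print(days_of_work_received(4, 4, 12))  # 12.0
    # Buffer to set temporarily if you want about 3 real days of work.
    print(buffer_for_target(3, 4, 12))      # 1.0
```

Once a few tasks have completed at the new runtime, BOINC's estimates should catch up and the buffer can go back to its usual value.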
Sid Celery Joined: 11 Feb 08 Posts: 2119 Credit: 41,175,312 RAC: 11,123 |
Right, upping the runtime preference when you have no work, but have a 4 day buffer, is going to get you way more than 4 days of work. This is because BOINC Manager hasn't seen the new runtime preference run to completion. So it doesn't know any better than to order work at the previous size. So if you were doing 4 hour WUs, and you bump it to 12, you'll basically get three times more work than you expected. If your cache is 4 days, that would mean you get 12 days of work. i.e. too much. Just to expand on this, because lots of things have been said in lots of places and it's very likely people haven't picked up all elements of it, particularly users completely new to Rosetta and Boinc. A proportion of tasks come down on Rosetta with 3 day deadlines. That means no-one should store a buffer of more than 3 days, otherwise they'll download tasks that won't meet their deadline as soon as they arrive, and Boinc will have to intervene to run them at high priority at some point, ahead of other tasks and projects with longer deadlines. People who've recently arrived from other projects with much longer deadlines may be used to something very different, but a large buffer (in terms of days) can conflict, forcing other projects' tasks to be suspended while Rosetta tasks permanently seem to run ahead. Old ideas on buffer sizes will definitely conflict with what's necessary here. On top of this, one researcher recently said that, after they release new tasks to us, they wait 2 days to see what kind of results have been returned, and if that's around 80% of tasks issued they'll start to review the results and take them into account when considering the next batch being put out. So we can treat 2 days as a soft turnaround time within the 3 day deadline - there's no need to only target completion just before the shutters come down. But what does a 2 day buffer actually comprise?
In Boinc's advanced view, under the Options menu, the first option is Computing preferences. On the first tab, titled Computing, the bottom section titled Other has two fields: 'Store at least x.x days of work' and 'Store up to an additional x.x days of work'. The first sets how much work should be left, before you completely run out of tasks, when Boinc asks for more. I believe Boinc's default is 0.1 days (2.4hrs). The second sets the number of days' worth of buffer you want to call down. So if you enter a figure of 1.0 days and have a 4-core machine running 24 hours a day using 8hr tasks, 4 cores x 3 tasks a day will give you a buffer of 12 tasks. I think Boinc's default for the 2nd field is actually only 0.25 days (6hrs). These two fields are additive, so the default buffer is 0.1+0.25=0.35 days - well within the required turnaround time. So when you have 0.3499 days of tasks left, another (default 8hr = 0.33 of a day) task will be called for, meaning it will be returned about 0.68 days later - again within the turnaround time. While tasks are in very short supply (like now, but it will always happen from time to time for shorter periods) it's worth keeping a buffer of tasks on hand. The size is partly a function of run-time. If you keep the default 8hr tasks, and allow for the possibility of task over-runs when the watchdog needs to cut in (4hrs over default), the buffer you set in the above two fields should total 1.5 days, eg: 0.1 & 1.4 or 0.25 & 1.25 as you prefer. Setting it in Boinc means it also affects other projects you run, but it will also avoid/prevent/resolve conflicts between projects. If you choose to increase task runtimes to 12hrs (+4 for potential watchdog interventions = 0.67 days) your buffer should reduce to 1.33 days total. Make a similar adjustment for any other task runtime change. These numbers should be considered maximums for buffers to meet the soft turnaround time mentioned above on the Rosetta project.
One other change from Boinc defaults is in the Disk and Memory tab under Options/Computing preferences. In the section titled Memory, tick the option to 'Leave non-GPU tasks in memory when suspended'. I believe the failure to have this option ticked is the cause of problems when PCs or laptops come out of Sleep mode or after shutdowns. It has frequently solved otherwise inexplicable errors. There seem to be problems restarting tasks that this seems to solve. This should be considered a default setting within Rosetta. E&OE |
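The buffer sums in the post above can be sketched in a few lines. This is only an illustration of that arithmetic, assuming the 2 day soft turnaround and the 4 hour watchdog allowance quoted there; the function names and defaults are hypothetical and nothing here talks to BOINC.

```python
HOURS_PER_DAY = 24.0


def tasks_in_buffer(cores, task_runtime_hours, buffer_days):
    """Number of tasks a buffer of buffer_days holds on a machine running 24 hours a day."""
    if cores < 1 or task_runtime_hours <= 0 or buffer_days < 0:
        raise ValueError("cores >= 1, runtime > 0 and buffer >= 0 are required")
    tasks_per_core_per_day = HOURS_PER_DAY / task_runtime_hours
    return cores * tasks_per_core_per_day * buffer_days


def max_buffer_days(task_runtime_hours, watchdog_overrun_hours=4.0,
                    soft_turnaround_days=2.0):
    """Largest total of the two 'Store ...' fields that still meets the soft turnaround."""
    if task_runtime_hours <= 0 or watchdog_overrun_hours < 0:
        raise ValueError("runtime must be positive and overrun must be >= 0")
    # Worst case: the last task fetched runs its full target time plus the watchdog margin.
    worst_case_days = (task_runtime_hours + watchdog_overrun_hours) / HOURS_PER_DAY
    return max(0.0, soft_turnaround_days - worst_case_days)


if __name__ == "__main__":
    print(tasks_in_buffer(4, 8, 1.0))      # 4 cores x 3 tasks a day
    print(max_buffer_days(8))              # default 8hr tasks
    print(round(max_buffer_days(12), 2))   # 12hr tasks
```

Split the result between the two fields however you prefer, since they are additive.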
Mr P Hucker Joined: 12 Aug 06 Posts: 1600 Credit: 11,738,758 RAC: 8,494 |
I see "Rosetta for portable devices" in the status, with plenty jobs, but none made it to my phones. How long have they been there? Phones take an age to finish stuff so they may have a full cache and it's not easy to interrogate them. Welcome back! With some 30,000 new hosts over the last 2 days, the servers have been hammered. You are probably one of the only people with active WUs, so just keep crunching. The BOINC Manager will clean them up if they expire. Does it? In my experience (on any project), Boinc Manager only deletes unstarted expired units. It continues to try to finish one it's halfway through even if it will be late. I abort those manually when I spot them. Total newbie here, and don't know that much about whether I'm actually helping or hindering. I usually use desktops, but the only laptop I have (an old Acer I was given) overheats if I run 4 WUs (it's a quad core). The fan goes flat out and gets to 70-80C with only 2 WUs. Unfortunately I know of no way to upgrade a laptop fan. A proportion of tasks come down on Rosetta with 3 day deadlines. That means no-one should store a buffer of more than 3 days, otherwise they'll download tasks that won't meet deadline as soon as they arrive and Boinc will have to intervene to run them at high priority at some point ahead of other tasks and projects with longer deadlines. Surely Boinc isn't that daft - it actually downloads more than the deadline of the tasks in question?! I just use a 1+1 day buffer. Otherwise if I change the priority of a project, it takes ages for it to take effect. I'm glad there's such a committed effort, however the irony isn't lost in using computers with components made in China who are profiting not only from the world's misery they created but covered it up. Everything in China is a lie. Try buying anything from a Chinese seller on Ebay. External hard disk - 8 year old broken drive put in a new box and cleverly reformatted to look like new. Replacement for that - same problem. 
1TB thumb drive - actually 0.1TB, reformatted cleverly to make it look bigger. Pass the 0.1TB mark and data is overwritten. Li-Ion batteries - quoted as 2500mAH in AA size - not even yet invented - Sony only get 1000. Actual capacity - 300mAH, I measured it. The virus starting in China does not prove that the Chinese did it. It now looks like natural processes among animals did it instead. And passed to us due to their "wet meat markets". No hygiene over there. There are no known cases in Antarctica. But are you about to move there? All you need to do is be away from people. That is not best done by everyone going for a walk in the town and passing each other! I'm being safer by breaking the rules, and driving off to a countryside path where I meet virtually nobody. |
James Lee* Joined: 28 Jan 17 Posts: 6 Credit: 2,770,691 RAC: 0 |
After the last post by the Admin, I want to make sure that everyone here knows about two IMPORTANT values that need to be set correctly. The first is "Resource Share", and the second, which causes the most angst here, is the "primary and secondary Buffers" for days of work. Since I have over a dozen machines, all different and all custom built, with over a hundred threads combined and at least one high performance GPU each, I CANNOT be bothered with constant checks on their performance, workload, etc. From my experience working with and on multiple projects, the following are my recommendations. First, if this is the project you really want to run the most, set your resource share to 999999. That way, when WUs are available, BOINC will look here first. Set your second favorite project to 5000, and your 3rd favorite to 100. This way you will NEVER have an idle processor or thread. Next, for your buffers.... Set your primary buffer to ".21" and your second to 0 (zero). This allows BOINC to always go through its sequence of resource share to get your next WU. DO NOT BUFFER FOR DAYS - ONLY BUFFER FOR HOURS! This is IMPORTANT, for when a project first starts up after a lag in WUs, it does not have to deal with such a flood of requests, and as a project goes through a series of WUs for collection, there are not days of WUs that have to be waited on. If you use the optimal number of .21, BOINC will take care of everything. A smaller number will give you a wait period, and a larger number is not healthy for the project. THIS WILL MAKE THE PROJECT RUN A LOT SMOOTHER AND REDUCE THE ADMINS' FRETS! DO NOT COMPLAIN about the lack of WUs during software or hardware changeovers. Let your other projects fill in the CPU time while this one is doing what they can do in order to continue. If you do not have the smarts to follow this advice, then do NOT bother to post! |
Sid Celery Joined: 11 Feb 08 Posts: 2119 Credit: 41,175,312 RAC: 11,123 |
I see "Rosetta for portable devices" in the status, with plenty jobs, but none made it to my phones. How long have they been there? Phones take an age to finish stuff so they may have a full cache and it's not easy to interrogate them. Same... <sigh> A proportion of tasks come down on Rosetta with 3 day deadlines. That means no-one should store a buffer of more than 3 days, otherwise they'll download tasks that won't meet deadline as soon as they arrive and Boinc will have to intervene to run them at high priority at some point ahead of other tasks and projects with longer deadlines. <side-eye> It's not Boinc deciding what to download. It's user settings in conflict with deadlines, then Boinc dealing with the consequences. We had a protracted thread on the subject a few years ago with a whole pile of unintended consequences you wouldn't believe. Eyeroll territory. Your settings of 1+1 will meet all the hard deadlines here, but if that researcher's comments are what he actually does, your results won't be taken into account in his next iteration. I don't think he was lying to us, so you ought to adjust if you run Rosetta, because all your results will return after he starts making decisions. That's what he's saying. |
Sid Celery Joined: 11 Feb 08 Posts: 2119 Credit: 41,175,312 RAC: 11,123 |
After the last post by the Admin, I want to make sure that everyone here knows about two IMPORTANT values that need to be set correctly. The first is "Resource Share", and the second, which causes the most angst here, is the "primary and secondary Buffers" for days of work. Since I have over a dozen machines, all different and all custom built, with over a hundred threads combined and at least one high performance GPU each, I CANNOT be bothered with constant checks on their performance, workload, etc. From my experience working with and on multiple projects, the following are my recommendations. First, if this is the project you really want to run the most, set your resource share to 999999. That way, when WUs are available, BOINC will look here first. Set your second favorite project to 5000, and your 3rd favorite to 100. This way you will NEVER have an idle processor or thread. Next, for your buffers.... Set your primary buffer to ".21" and your second to 0 (zero). This allows BOINC to always go through its sequence of resource share to get your next WU. DO NOT BUFFER FOR DAYS - ONLY BUFFER FOR HOURS! This is IMPORTANT, for when a project first starts up after a lag in WUs, it does not have to deal with such a flood of requests, and as a project goes through a series of WUs for collection, there are not days of WUs that have to be waited on. If you use the optimal number of .21, BOINC will take care of everything. A smaller number will give you a wait period, and a larger number is not healthy for the project. THIS WILL MAKE THE PROJECT RUN A LOT SMOOTHER AND REDUCE THE ADMINS' FRETS! DO NOT COMPLAIN about the lack of WUs during software or hardware changeovers. Let your other projects fill in the CPU time while this one is doing what they can do in order to continue. If you do not have the smarts to follow this advice, then do NOT bother to post! Blimey! And I thought I was being hopeful trying to get people down to 1.5 days!
Not that I can disagree with it. Basically it's saying that you should only aim to grab 1 extra task per core you run, 5 hours before you complete the ones you're running. The rest, I guess, is redundant as the project holds the stock of tasks already. It might be considered harsh as it doesn't account for things like temporary loss of connectivity at the user's end or at the project, though it does almost reflect the default BOINC settings of 0.1 & 0.25. The direction of travel is to store 'less' rather than 'more', so if you run a big buffer, definitely reduce it. And it will definitely help tasks to be issued more widely to users once shortages like the one we currently have are resolved. And it conforms to both the soft and hard deadlines the researcher here indicated. Resource share is a very good point I never addressed. I have Rosetta set at 2900 and WCG at 100, which is similar to what's suggested. I don't have a third project set up. |
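For anyone unsure what the resource share numbers in these posts amount to: as far as I understand it, BOINC treats each share as a weight relative to the sum of the shares of all attached projects, and aims for that split over the long run rather than at every moment. A small sketch of that proportion, with a made-up function name and hypothetical project labels:

```python
def share_fractions(resource_shares):
    """Turn per-project resource shares into long-run fractions of computing time."""
    total = sum(resource_shares.values())
    if total <= 0 or any(share < 0 for share in resource_shares.values()):
        raise ValueError("shares must be non-negative and sum to more than zero")
    return {name: share / total for name, share in resource_shares.items()}


if __name__ == "__main__":
    # The two setups mentioned in the thread; "second" and "third" are placeholder names.
    setups = (
        {"Rosetta": 2900, "WCG": 100},
        {"Rosetta": 999999, "second": 5000, "third": 100},
    )
    for shares in setups:
        for name, fraction in share_fractions(shares).items():
            print(f"{name}: {fraction:.2%}")
```

Only the ratios matter, so 2900 and 100 behave the same as 29 and 1.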
nealburns5 Joined: 11 May 19 Posts: 37 Credit: 10,184,436 RAC: 0 |
Next, for your buffers.... Set your primary buffer to ".21" and your second to 0(zero). This allows boinc to always go thru its sequence of resource share to get your next WU. DO NOT BUFFER FOR DAYS - ONLY BUFFER FOR HOURS! This is IMPORTANT, for when a project first starts up after a lag in WUs, it does not have to do such a flood of requests, and as a project goes thru a series of WUs for collection, there are not days of WUs that have to be waited on. If you use the optimal number of .21, BOINC will take care of everything. A smaller number will give you a wait period, and a larger number is not healthy for the project. Are these the buffers that you're talking about? Which one is primary and which is secondary? |
Mr P Hucker Joined: 12 Aug 06 Posts: 1600 Credit: 11,738,758 RAC: 8,494 |
Er.... let's say I set Boinc to get a 5 day buffer. Boinc tries to get tasks from the project. The project says "deadline 3 days". It doesn't take a rocket scientist to program Boinc to get the smaller of the two numbers - the buffer I set, and the deadline.
If he would like them back sooner, he should set a smaller deadline. So what happens to those that are later than the invisible deadline? Do they get remotely cancelled so my Boinc will get more? DO NOT BUFFER FOR DAYS - ONLY BUFFER FOR HOURS! This is IMPORTANT, for when a project first starts up after a lag in WUs, it does not have to do such a flood of requests, and as a project goes thru a series of WUs for collection, there are not days of WUs that have to be waited on. I disagree. For example I have GPUs running Einstein. With my 1+1 day buffer, it waits till it gets below 1 day, then downloads a whole day of tasks at once. This must be better for the project than grabbing 1 WU at a time many many times a day.
I adjust resources depending on what project I want to run most. For example a project comes up with a bunch of new tasks for a different sub project that I'm interested in, so I raise its priority. Or I might have two projects I think are equally important, or one that's slightly or vastly more important than the other. There's no fixed sensible choice for resource share. It's the user's opinion on what importance he assigns to projects. And I do weird stuff with the GPUs - Milkyway only hands out 2.5 hours of work at a time, and has a bug whereby it won't hand out work if I've returned a completed WU within the previous 10 minutes. My GPUs finish them every 11.5 seconds. And Boinc always returns WUs while trying to get more tasks. Therefore I set Einstein to 0 priority on those GPUs, so it grabs only 1 WU to tide it over while it's twiddling its thumbs. |
Mr P Hucker Joined: 12 Aug 06 Posts: 1600 Credit: 11,738,758 RAC: 8,494 |
Next, for your buffers.... Set your primary buffer to ".21" and your second to 0(zero). This allows boinc to always go thru its sequence of resource share to get your next WU. DO NOT BUFFER FOR DAYS - ONLY BUFFER FOR HOURS! This is IMPORTANT, for when a project first starts up after a lag in WUs, it does not have to do such a flood of requests, and as a project goes thru a series of WUs for collection, there are not days of WUs that have to be waited on. If you use the optimal number of .21, BOINC will take care of everything. A smaller number will give you a wait period, and a larger number is not healthy for the project. Yes. The at least is the primary, the additional is the secondary. |
James Lee* Joined: 28 Jan 17 Posts: 6 Credit: 2,770,691 RAC: 0 |
The "Store at least" is the primary. The "Store up to an additional" is the secondary. Also the "Switch between tasks" should be set to 9999. Also, as backup of work done, change checkpoints from 60 seconds to 120 seconds, if you are running more than 4 threads/cores or you do not have an SSD. This will keep up processing optimally. The CPU has to wait for disc IOs. James |
nealburns5 Joined: 11 May 19 Posts: 37 Credit: 10,184,436 RAC: 0 |
James Lee and Peter Hucker -- Thanks for the tips. |
Sid Celery Joined: 11 Feb 08 Posts: 2119 Credit: 41,175,312 RAC: 11,123 |
Next, for your buffers.... Set your primary buffer to ".21" and your second to 0(zero). This allows boinc to always go thru its sequence of resource share to get your next WU. DO NOT BUFFER FOR DAYS - ONLY BUFFER FOR HOURS! This is IMPORTANT, for when a project first starts up after a lag in WUs, it does not have to do such a flood of requests, and as a project goes thru a series of WUs for collection, there are not days of WUs that have to be waited on. If you use the optimal number of .21, BOINC will take care of everything. A smaller number will give you a wait period, and a larger number is not healthy for the project. James' settings (0.21+0) are pared to the bone and for systems running multiple projects at once, while mine (0.1+1.4) are maximums. Anything between the two is fine if it works for you, as you prefer. Also, I note you only run Rosetta rather than multiple projects which makes everything much simpler. The 3rd option is a bit weird as, for me, when WCG tasks take their turn on my setup, it operates as the amount of time WCG runs before it switches back to Rosetta tasks. More of a switch between projects setting than between tasks. I want the WCG tasks to run to completion rather than stop halfway and stay in memory, so I use 360 there to ensure they do. Again not relevant if you only run a single project. I've never touched the last setting tbh, so I've changed to what's been suggested here. Sounds reasonable enough. |
Sid Celery Joined: 11 Feb 08 Posts: 2119 Credit: 41,175,312 RAC: 11,123 |
It's not Boinc deciding what to download. It's user settings in conflict with deadlines, then Boinc dealing with the consequences. You're not wrong, it's just that's not what Boinc does. Iirc the guy set Boinc to keep at least 10 days work, but no task from the project had a deadline as long as his minimum 10 days, so it decided not to download any at all until a core was empty, then downloaded 1 task. We tried to convince him to reduce the 10 days to like 5, just to bring 8-day tasks down, but he wouldn't do it until he understood why, which he never quite managed to do and even though he had no tasks in his buffer at all. I think it was the longest week of my life. Your settings of 1+1 will meet all the hard deadlines here, but if that researcher's comments are what he actually does, your results won't be taken into account in his next iteration. I don't think he was lying to us, so you ought to adjust if you run Rosetta, because all your results will return after he starts making decisions. That's what he's saying. I don't think he wanted them earlier, he was just getting a preview before deadline, then making changes before waiting for the stragglers. If the deadline was shorter I get the impression he'd take a look earlier as well. People are like that. I'm not sure what he said he did with the balance. Ignored them I think, not cancelled them. It was only a few days ago, maybe you can find it. Anyway, your choice what you do with that information. Resource share is a very good point I never addressed. I have Rosetta set at 2900 and WCG at 100, which is similar to what's suggested. I don't have a third project set up. Yup, that's what he's saying. Resource share is useful to match whatever you need to do and however you want to adjust it as you see fit. People rarely ever mention it, possibly because it's so individual. |
Tom M Joined: 20 Jun 17 Posts: 87 Credit: 14,965,669 RAC: 49,460 |
If you are running a laptop and need to cool it down, there are things called "Cooling tables". The ones I have had work off a USB port, but you can also buy a PSU and drive one directly from your power strip. You can speed up production if you leave at least 1 core/thread free on any system. On both the website and in Boinc Manager there is a "use at most XXX% of cpu" setting, which should be set to idle at least 1 cpu/thread, up to 4 if you are running a larger count cpu (eg. 8c/16t plus). My rule of thumb is 90% down to as low as 75% to get 1 or more threads idled. You cannot select only "COVID-19" named tasks. If you do not select the "Use cpu" toggle you will not get any tasks. Apparently the algorithms that Rosetta@home is using do not adapt efficiently to being processed by GPUs due to memory constraints. Hang out in the Cafe. And don't practice "social standoffness" there :) Tom Help, my tagline is missing..... Help, my tagline is......... Help, m........ Hel..... |
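To see what a given "use at most XXX% of cpu" value does on your own machine, the sum is simple enough to sketch. One assumption to flag: the code below rounds the thread count down and never goes below one thread, which is my reading of how BOINC behaves rather than something confirmed in this thread. The function names are hypothetical.

```python
import math


def threads_used(total_threads, cpu_percent):
    """Threads BOINC would run at a given 'use at most X% of CPUs' value (assumes rounding down)."""
    if total_threads < 1 or not 0 < cpu_percent <= 100:
        raise ValueError("need at least 1 thread and a percentage in (0, 100]")
    # The tiny epsilon guards against floating point results such as 2.9999999.
    return max(1, math.floor(total_threads * cpu_percent / 100 + 1e-9))


def percent_to_idle(total_threads, idle_threads):
    """Highest percentage that leaves idle_threads free."""
    if not 0 <= idle_threads < total_threads:
        raise ValueError("idle_threads must be between 0 and total_threads - 1")
    return 100.0 * (total_threads - idle_threads) / total_threads


if __name__ == "__main__":
    for total in (4, 8, 16):
        for pct in (90, 75):
            used = threads_used(total, pct)
            print(f"{total} threads at {pct}%: {used} running, {total - used} idle")
    print(percent_to_idle(16, 1))
```

On a 4 thread machine 90% still leaves one thread idle under this assumption, which matches the rule of thumb in the post.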
bkil Joined: 11 Jan 20 Posts: 97 Credit: 4,433,288 RAC: 0 |
Due to how power saving and turbo boost are implemented in processors, it is much more efficient to run 50% of the time on every core than to use 50% of the cores. TL;DR: power is proportional to the square of the voltage, voltage is also increased by turbo boost, and switching off all cores at once enables the CPU and chipset to enter a lower power state than if even a single one is running. I've also verified this with a watt meter, but laptops expose their power gauges in the battery if you want to give it a quick try. It is interesting, though, that it further reduces power if I synchronously stop & continue all threads at the same time using a simple script, compared to how BOINC implements CPU usage percentage throttling. |
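The voltage point in the post above follows from the textbook approximation for dynamic power in CMOS chips, P ≈ C·V²·f. The sketch below only illustrates that scaling: it ignores static/leakage power and package sleep states, and the voltages used are made-up examples, not measurements from any real CPU.

```python
def relative_dynamic_power(voltage, frequency, base_voltage=1.0, base_frequency=1.0):
    """Dynamic power relative to a baseline, using P ~ C * V^2 * f."""
    if min(voltage, frequency, base_voltage, base_frequency) <= 0:
        raise ValueError("voltages and frequencies must be positive")
    return (voltage / base_voltage) ** 2 * (frequency / base_frequency)


def relative_energy_per_unit_work(voltage, base_voltage=1.0):
    """Energy per unit of work relative to a baseline, assuming work done scales with frequency."""
    if voltage <= 0 or base_voltage <= 0:
        raise ValueError("voltages must be positive")
    # Power scales with V^2 * f and throughput with f, so f cancels out.
    return (voltage / base_voltage) ** 2


if __name__ == "__main__":
    # Hypothetical boost state: 20% higher clock at 15% higher voltage.
    print(round(relative_dynamic_power(1.15, 1.2), 3))
    print(round(relative_energy_per_unit_work(1.15), 3))
```

Under these assumptions a boosted core spends more energy on each unit of work, which is the efficiency argument being made; whether that outweighs other effects is a separate question.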
Grant (SSSF) Joined: 28 Mar 20 Posts: 1675 Credit: 17,701,559 RAC: 19,647 |
Due to how power saving and turbo boost is implemented in processors, it is much more efficient to run 50% of the time over every core than to use 50% of the cores. And it causes a lot more thermal stress on the CPU. Heat, cool, heat, cool, heat, cool, heat, cool. Expand, contract, expand, contract, expand, contract, expand, contract. You get better performance, with less stress, by limiting the number of threads and allowing them to run at 100%. If you want to limit clock speed to limit power usage then use Power Options, Change plan settings (for the plan you have selected), Change advanced power settings, click on the + next to Processor power management, Maximum processor state, and select a percentage (it will still cause some temperature spiking, but nowhere near as bad as starting & stopping cores/threads). With that done, you could then allow more cores to run, as the performance of the CPU has been limited & heat won't be such a problem. Grant Darwin NT |
©2024 University of Washington
https://www.bakerlab.org