Message boards : Number crunching : Deadlines are all screwed up.
| Author | Message |
|---|---|
|
Sandman192 Joined: 22 Sep 07 Posts: 16 Credit: 2,047,596 RAC: 0 |
I get work, it runs 24 hours a day with my CPU at 100%, it doesn't even get canceled after being 2 days behind, and it just keeps crunching all day long. That's 4 GHz at 100% with no game playing at all in that time, and apparently it's not fast enough, and this is only happening with Rosetta's work. Prime work, some of which takes 15 days to finish on CPUs, still finishes on time even after I suspend it for over 2 days to play games. BOINC checks how long the work takes, how fast my CPU is, and when the deadline is, yet it still grabs the work and never finishes it on time. Resetting does nothing.
4/23/2020 12:14:15 AM | Rosetta@home | Result r4d_2598_fold_SAVE_ALL_OUT_918965_80_0 is no longer usable
And from the Rosetta site: 51 errors. Didn't show all, but you get the message.
|
Grant (SSSF) Joined: 28 Mar 20 Posts: 1895 Credit: 18,534,891 RAC: 0 |
"Resetting does nothing."
Oh, it does something: it makes it worse, because it throws out all the information it has got up to that point and has to start all over again from scratch.
1. Rosetta Tasks have a 3 day deadline (at present). So if you have a cache set longer than that, you will run into problems until the estimated completion times are correct. And the more projects you do, and the bigger the cache, the longer that will take to occur.
2. If you run more than a couple of projects, then you don't even need a cache. And having a cache just means it takes longer for things to settle down.
3. The default runtime for a Rosetta Task is 8 hours. If you set a longer Target CPU time, then it takes longer for things to sort themselves out.
4. The more you do with your system, the less time there is for Rosetta (and any other project) to actually process work.
One of your Tasks:
Run time 1 days 21 hours 16 min 27 sec
CPU time 1 days 11 hours 51 min 43 sec
A lightly used system may have a minute's difference between CPU time & Run time for each hour of CPU time. So for that Task, the difference should be 36 min; you've got almost 10 hours. So your system is spending a lot of time doing things other than processing BOINC work.
This is what you need:
In your Account, Preferences, Preferences for this project:
Rosetta@home preferences
Target CPU run time (not selected)
Make sure it is "Not selected"; that way it will use the project default (8 hours). Update Preferences to save the changes.
In your Account, Preferences, When and how BOINC uses your computer:
Computing preferences
Computing
Usage limits
Use at most 100% of the CPUs
Use at most 100% of CPU time
When to suspend
Suspend when computer is on battery (not selected)
Suspend when computer is in use (not selected)
Suspend GPU computing when computer is in use (not selected)
'In use' means mouse/keyboard input in last 3 minutes
Suspend when no mouse/keyboard input in last --- minutes
Suspend when non-BOINC CPU usage is above --- %
Compute only between ---
Other
Store at least 0.05 days of work
Store up to an additional 0.02 days of work
Switch between tasks every 60 minutes
Request tasks to checkpoint at most every 60 seconds
Disk
Use no more than 20 GB
Leave at least 2 GB free
Use no more than 60 % of total
Memory
When computer is in use, use at most 95 %
When computer is not in use, use at most 95 %
Leave non-GPU tasks in memory while suspended (not selected)
Page/swap file: use at most 75 %
Update Preferences to save the changes. Then on the BOINC Manager, on the Project tab, select Rosetta, Update.
Given the number of projects you are attached to, and the size of your cache, it will take at least a week for these settings to have a significant impact on the BOINC Manager being able to make a start on meeting your Resource share settings. But it will stop you from getting more work than you can possibly return, while still keeping the system busy processing BOINC work (at least when it's not busy doing other things).
Grant Darwin NT |
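The Run time versus CPU time comparison in the post above can be checked with a few lines of Python. This is only a sketch: the one-minute-per-hour allowance is the rule of thumb quoted in the post, and the helper name `idle_gap` is made up for illustration.

```python
from datetime import timedelta

# Figures quoted in the post for one task.
run_time = timedelta(days=1, hours=21, minutes=16, seconds=27)
cpu_time = timedelta(days=1, hours=11, minutes=51, seconds=43)


def idle_gap(run: timedelta, cpu: timedelta) -> timedelta:
    """Wall-clock time the task spent not computing (hypothetical helper)."""
    return run - cpu


# Rule of thumb from the post: about one minute of overhead per hour of CPU time.
cpu_hours = cpu_time.total_seconds() / 3600
expected_gap = timedelta(minutes=cpu_hours)

actual_gap = idle_gap(run_time, cpu_time)

print(f"expected gap: about {expected_gap.total_seconds() / 60:.0f} min")
print(f"actual gap:   {actual_gap}")
```

With the quoted figures the expected gap is about 36 minutes, while the actual gap is 9 hours 24 minutes 44 seconds, which is the "almost 10 hours" the post refers to.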
|
Raistmer Joined: 7 Apr 20 Posts: 49 Credit: 798,155 RAC: 0 |
Another way could be to change "not selected" to something smaller than 8 hours. The minimum is 2 hours. The lower the value, the less efficient the computation of those tasks will be. But if it allows tasks to complete in time, the overall efficiency for that particular host could improve. So one can gradually decrease that value to the point where no deadlines are missed. |
|
Sid Celery Joined: 11 Feb 08 Posts: 2474 Credit: 46,506,558 RAC: 3,757 |
"I get work, work running 24 hours a day, work running my CPU at 100%, work doesn't even get canceled after 2 days of being behind, and just keep crunching all day long. 4 GHz at 100% work time and no playing game at all that time and apparently is not fast enough and this is only happening to Rosetta's work. Prime work that some take 15 days to finish on CPUs finish even after suspending it to play games for over 2 days it still finishes it on time."
No-one likes to hear it's their fault rather than someone else's, but the 3-day deadlines are the deadlines. Your settings are, frankly, catastrophic. You've increased your runtime to 36 hours from the default 8hrs, and I don't know how many days of cache you've set, but it's not consistent with how many projects you run on that PC, even if it's only Rosetta. Boinc does the scheduling, not Rosetta. The cache needs to be a maximum of 1 day in total until Boinc works out how long tasks run for you, and you need to return your runtime to 8hrs (either explicitly, or by not setting it explicitly to anything, which also covers future changes). Change those two things, leave it untouched for at least a week, and everything will work out fine.
|
|
Sandman192 Joined: 22 Sep 07 Posts: 16 Credit: 2,047,596 RAC: 0 |
For people who said:
"Another way could be to change 'not selected' to smth smaller than 8 hours."
Then why would Rosetta give you an option to set it for 1 day 12 hours? If a 4 GHz CPU can't crunch it on time, then why is that an option? That would mean almost no one can, because only a handful of computers run at 5 GHz (if it was a speed issue to begin with). Plus, BOINC knows what speed a computer is and compensates for the long work, even for slow computers, or they just won't get the work.
"Oh, it does something- it makes it worse, because it throws out all the information it has got up to that point and has to start all over again from scratch."
I mean resetting does nothing to fix the strange won't-finish-on-time problem.
"I don't know how many days cache you've set"
or
"Given the number of projects you are attached to, and the size of your cache"
Cache is irrelevant since it does not affect whether work finishes on time. I have had no problems for years with work not finishing on time. That includes some WUs that took 15 days to finish, on work other than Rosetta. Until Rosetta came with this option for changing the "Target CPU run time". That's with 10 days of work plus 10 more of additional work. Again, NO problems for years until now.
Number of projects you attached? I gave you the number of errors, not the number of projects. All you see is how many got errors because they won't finish. How do you know how many I attached? I only have Rosetta to grab for work. Anyway, that is still irrelevant.
"Boinc does the scheduling, not Rosetta"
You're right. But Rosetta DOES add the timelines to their work that BOINC has to rely on.
"You've increased your runtime to 36 hours from the default 8hrs"
Yes, because I want it that way. It's an option. If they're giving us an option that we can't use, then why give it to us in the first place?
Ok, right now I'm running 1 day 12 hours and it seems it will finish on time for tomorrow. So I don't know why only 51 of them failed by not finishing on time. After that, all I have in my cache is 6 1/2 hour deadline work for Rosetta. |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1895 Credit: 18,534,891 RAC: 0 |
The reason there is an option is that some people will do 10 projects and some people will do one. Some people will set the Resource share to be even; others will have it all in favour of 1 project. An option that is suitable for just 1 project is almost never remotely suitable if you have more than just a couple of projects.
"Another way could be to change 'not selected' to smth smaller than 8 hours."
Just because it's an option doesn't mean you need to change it; the default is 8 hours, because that is what the project would prefer people used. If you choose to use a longer time, then you may need to change other settings in order to make it possible. For the very reasons I pointed out.
As it does work (or can't finish it), it records that information and makes changes based on that to -eventually- stop the problem from occurring. But if you keep throwing out that information, it has to start all over again from scratch.
"Oh, it does something- it makes it worse, because it throws out all the information it has got up to that point and has to start all over again from scratch."
"I mean resetting does nothing to fix the strange won't finish on time."
So you just keep prolonging the problems you are having.
"Cache is irrelevant since it does not affect not finish on time. I have no problems for years with work not finish on time. That's for some WU that took 15 days to finish on work other than Rosetta. Until Rosetta can with this option for changing the 'Target CPU run time'. That's with 10 days of work with 10 more of additional work. Again NO problems until now for years."
Your excessive cache is the reason you are having the problem. Rosetta has very short deadlines: they are 3 days for the present Tasks. BOINC has no history of how long it takes you to complete work, so the Estimated completion times aren't even close to accurate. So you get more work than you could possibly do. So you miss deadlines. It is that simple.
The point of a cache is so you don't run out of work. As you have dozens of projects you are attached to, there is no chance of all of those projects being down or not having work at the same time. So there is no need for any sort of cache at all. Even 0.5 days & 0.01 extra days would be more than is necessary, but at least it would stop you from missing deadlines.
"How do you know how many I attached. I only have Rosetta to grab for work. Anyway, that is still irrelevant."
It is not irrelevant; it is the reason you are having problems. Lots of projects + excessive cache = lots of problems while the BOINC manager sorts things out. I saw how many projects you are attached to when I went to see how many errors your system was having.
They also give the option of 2 hours. It's an option, why not select it?
"You've increased your runtime to 36 hours from the default 8hrs"
"Yes, because I want it that way. It's an option. If they're giving us an option that we can't use then why give us it in the first place."
As I pointed out earlier, the option is there because there are many different projects. What is good for a project with short deadlines is not necessarily good for a project with long deadlines. What might be good for a single project is not good if there are lots of projects. What might be good for a system with certain options selected won't be a suitable choice if other options are selected.
"Ok, right now I'm running 1 day 12 hours and it seems to finish on time for tomorrow. So I don't know why only 51 of them failed on not finishing on time. After that, I only have in my cache is 6 1/2 hour deadline work for Rosetta."
Because some Tasks will finish earlier than their Target CPU time, and some will take longer (up to 10 hours). Because of the number of projects you run, the excessive cache you use, and the change from the default Target CPU Run time, the simple fact of the matter is that the Estimated times are wrong, and will be that way for ages. And until they finally match your selected Target CPU Runtime you will continue to miss deadlines.
EDIT- and I notice you didn't address the fact that your system does not spend 100% of its time processing work.
Grant Darwin NT |
|
Sid Celery Joined: 11 Feb 08 Posts: 2474 Credit: 46,506,558 RAC: 3,757 |
"Then why would Rosetta give you an option to set it for 1 day 12 hours? If a 4 GHz CPU can't crunch it on time then why that an option? Meaning almost no one can because 5GHz is only a hand full of computers (if it was a speed issue, to begin with). Plus, BOINC knows what speed a computer is and compensate for the long work and even for slow computers or just won't get the work."
36hrs runtime is fine, but only at the right time. It seems that when a new version of Rosetta comes out, whatever the runtime set, the tasks come down on the assumed basis of (someone said) 4.5hr runtimes. This is bad enough when the default 8hr runtime is set. You say you have a 10+10hr cache set, which is usually no problem. Boinc will initially bring down what it thinks are 50 x 4.5hr tasks for your 12 cores, but instead of completing them in 20hrs, if your runtime is actually set at 36hrs, they'll take 150hrs to complete, long after the 3-day deadlines. This is why so many of your tasks got cancelled, and they will continue to be until Boinc recognises and factors in your real runtime. At the default 8hr runtime they'll take 33 hours, which is within deadline. A quick calculation reveals that, with the size of cache you have, a 16hr runtime is the maximum you can set to complete all tasks within deadline. And only when Boinc eventually reflects 16hr runtimes can you increase your runtime again to get nearer your preference.
"I don't know how many days cache you've set or Given the number of projects you are attached to, and the size of your cache"
On cache, see above. On what works with other projects, Rosetta's requirements are independent of any other project and every other project is independent of Rosetta. And your own settings have to work with all of them.
I may have misread what you've written. Are you saying you have a 10 DAY minimum cache plus 10 DAYS more? Then it's very simple. Rosetta deadlines are 3 days. If you set 10+10 days cache, every task will fail to meet deadline the minute you download them and Boinc will have to run every Rosetta task high priority to meet deadline, while every other task will cancel for failing to meet deadline. Those settings are completely incompatible with Rosetta, independent of anything that works with any other project. It may also mean no other project's tasks on each host will ever get a chance to run at all.
"Number of projects you attached? I gave you the number of errors not the number of projects. All you see is how may that got errors because it won't finish. How do you know how many I attached. I only have Rosetta to grab for work. Anyway, that is still irrelevant."
You're right, I don't. But you do, so make the adjustments that will work.
"You've increased your runtime to 36 hours from the default 8hrs"
You can make any choices you want, but 1) not immediately when a new program version arrives and/or 2) if you make incompatible choices, they're your choices to be incompatible.
"No-one like to hear it's their fault rather than someone else's, but the 3-day deadlines are the deadlines. Your settings are, frankly, catastrophic."
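The arithmetic in the post above can be reproduced with a short sketch. It assumes the simple model the post appears to use (total task hours spread evenly over the cores); the 50 tasks, 12 cores, and 3-day deadline are the figures from the post, and the function names are hypothetical.

```python
DEADLINE_HOURS = 3 * 24  # Rosetta's 3-day deadline, per the thread


def hours_to_clear(tasks: int, runtime_hours: float, cores: int) -> float:
    """Wall-clock hours to finish every queued task, assuming the work
    spreads evenly across all cores (a simplification)."""
    return tasks * runtime_hours / cores


def max_runtime(tasks: int, cores: int,
                deadline_hours: float = DEADLINE_HOURS) -> float:
    """Longest per-task runtime that still clears the queue by the deadline."""
    return deadline_hours * cores / tasks


tasks, cores = 50, 12

# 4.5 h is the assumed runtime at download; 8 h is the default; 36 h is the
# runtime the original poster selected.
for runtime in (4.5, 8, 36):
    total = hours_to_clear(tasks, runtime, cores)
    verdict = "ok" if total <= DEADLINE_HOURS else "misses deadline"
    print(f"{runtime:>4} h runtime -> {total:6.1f} h to clear ({verdict})")

print(f"max runtime: {max_runtime(tasks, cores):.2f} h")
```

Under this model the 36hr setting needs 150 hours against a 72-hour deadline, the 8hr default needs about 33 hours, and the limit works out to a little over 17 hours per task, so the 16hr figure in the post sits just inside it.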
|
©2025 University of Washington
https://www.bakerlab.org