Message boards : Number crunching : Question about estimated time to complete and ...
Author | Message |
---|---|
Bill Swisher Joined: 10 Jun 13 Posts: 35 Credit: 33,050,243 RAC: 43,930 |
how that affects the BOINC software retrieving tasks. For example: I have a computer that processes Rosetta software, and it sits in the queue saying it's going to take 8 hours to run. I also have 2 other projects running under BOINC; they take either 60 minutes or 90 minutes (depending). The BOINC software is supposed to spread its workload evenly across all 3 projects. Fine and dandy. Except Rosetta projects actually only take 3 hours to process. So the queue for tasks is skewed against Rosetta getting its share of processor time because of the estimated time to complete. Is there something to make it recompute that estimate, or is that a function of the software that resides at Rosetta@home? |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1679 Credit: 17,780,330 RAC: 22,826 |
Some background info. 1. The more Projects you run, the longer it takes for your Resource Share settings to be honoured. 2. The larger your cache, the longer it takes for your Resource Share settings to be honoured. 3. The Resource Share setting values are a ratio, not percentages. 4. Your Resource Share is worked out using REC (Recent Estimated Credit). It does not use awarded Credit. It does not use time spent processing. So, to your question: "So the queue for tasks is skewed against Rosetta getting its share of processor time because of the estimated time to complete." No, not really anyway. Because the actual Runtime doesn't match up closely with the initial Remaining (estimated) time, you will see odd behaviour if you run with a cache, but your Resource Share setting will be honoured in the long term. See point 4 above: awarded Credit and time spent processing Tasks are not used for determining the Scheduling of work. What is or isn't in the queue will vary as you process work for the different projects. Due to the very different Runtimes of Tasks (not only between projects but between different applications within each project), the contents of any queue you might have will change day by day (and as you are running multiple projects you'd be better off with no queue. 0.1 days and 0.01 additional days would be plenty. It would keep enough work so there is a Task ready to start as soon as one finishes). "Is there something to make it recompute that estimate, or is that a function of the software that resides at Rosetta@home?" First part: it does that as the Task is processed. 
If you check the Remaining (estimated) time for running Rosetta Tasks, you will notice that it reduces down towards 3 hours as the Task progresses (for those Tasks that will take more than 8 hours to complete, it will move upward, closer to what it estimates the completion time will be, although in many cases a Task that would need to go much over 8 hours in order to finish will end before 8 hours). Second part: on other projects the Remaining (estimated) time is set by the project, and as work is returned the actual processing time is used to refine that initial estimate, until after 10 or so completed Tasks it should be pretty close to the actual time taken. That is done for each application, for each project. Here it is set by the project, and the actual time to process a Task doesn't have any effect on that initial Remaining (estimated) time (very long and painful story). It is what it is. Grant Darwin NT |
Sid Celery Joined: 11 Feb 08 Posts: 2122 Credit: 41,184,189 RAC: 10,001 |
"Except Rosetta projects actually only take 3 hours to process. So the queue for tasks is skewed against Rosetta getting its share of processor time because of the estimated time to complete." While what Grant says may be true, I don't think it answers your direct question. I believe the 3hr runtime is actually a dumb error at Rosetta in setting the default runtime for those tasks. Whether it is or it isn't, it's inconsistent with the initial (and forced) 8hr scheduled time. So, in Boinc, select Rosetta and go to Your Account. Choose Rosetta@home preferences. Click Edit Preferences. Against Target CPU run time (where it claims the default is 8hrs but turns out it isn't), specify 8hrs explicitly rather than 'not selected', then Update Preferences. Then the Estimated Runtime in Boinc should match the actual runtime, rather than adjusting and only realising the difference mid-task. And unstarted tasks won't mislead Boinc scheduling further, irrespective of the cache size you choose. And other projects won't be sold short so much. |
Bill Swisher Joined: 10 Jun 13 Posts: 35 Credit: 33,050,243 RAC: 43,930 |
Yes, I do run a cache. It's set at 4 days +.5 days. I tend to be out of town for 3 to 5 days per week during the summer. Should the network drop dead I don't want the computers sitting around twiddling their collective thumbs, so to speak. Thanks for the reply. |
Bill Swisher Joined: 10 Jun 13 Posts: 35 Credit: 33,050,243 RAC: 43,930 |
I've made the change to 4 hours. The other 4 computers have different processor specs and only 1 seems to be getting work. I'll see what it looks like when new work arrives. Thanks for the assist. |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1679 Credit: 17,780,330 RAC: 22,826 |
"Yes, I do run a cache. It's set at 4 days +.5 days. I tend to be out of town for 3 to 5 days per week during the summer. Should the network drop dead I don't want the computers sitting around twiddling their collective thumbs, so to speak." The deadlines for Rosetta are 3 days. And the additional days value is best set to 0.01 (the larger the additional days value, the lower your cache will fall before it refills. The 0.01 value means it won't drop more than that below your "days to keep" value). "I've made the change to 4 hours, the other 4 computers have different processor specs and only 1 seems to be getting work. I'll see what it looks like when new work arrives." The lack of Rosetta work on those systems is probably due to the cache setting. If the manager thinks it won't be able to finish in time, it won't ask for more work. I'm pretty sure that the logic for the work fetch Scheduler is along the lines of: it does its best to honour Resource Share settings, but if it can't return work before the deadline, then it doesn't get any. Either: eventually, to honour your Resource Share settings, the debt owed to Rosetta will increase so much that it will stop asking for work from your other projects, till the cache runs down enough that it can complete Rosetta Tasks before the deadline. It'll do a whole bunch of them, then start getting work for the other projects again, till eventually it won't be able to complete any new Rosetta Tasks in time, so it'll stop asking for Rosetta work again. And then the cycle will repeat. Or: the debt to Rosetta will continue to grow till it can do a Task, but then it goes back to the other projects, regardless of what the debt owed to Rosetta is, because it can't both keep the debt between the other Projects managed and return any Rosetta work in time. You can enable Work Fetch flags in the event log to see just what is going on, but expect an avalanche of messages with the number of projects you have. Grant Darwin NT |
Sid Celery Joined: 11 Feb 08 Posts: 2122 Credit: 41,184,189 RAC: 10,001 |
"I've made the change to 4 hours, the other 4 computers have different processor specs and only 1 seems to be getting work. I'll see what it looks like when new work arrives." Good idea. I should've mentioned just edging it up an hour at a time in case it has some unexpected effects. And pay attention to what Grant says about cache size. When cache exceeds deadline I've seen Boinc do weird things, from grabbing way too many tasks to grabbing none at all. The maximum you can have that Boinc can cope with is something like 2 + 0.5 + runtime, as long as the total doesn't exceed the 3-day deadline. I'm away from home 3 days every week too, but I trust my network to stay up sufficiently to use 0.5 + 0.1 with a 12hr runtime. And on the rare occasions my network does go down, just tough luck for me. |
Bill Swisher Joined: 10 Jun 13 Posts: 35 Credit: 33,050,243 RAC: 43,930 |
"I'm away from home 3 days every week too but I trust my network to stay up sufficient to use 0.5 + 0.1 with a 12hr runtime" The 3 days is only in effect during the summer. I'm a snowbird. So I have a couple of other computers where it's really HOT right now. They're shut down at the moment. In the fall I'll go back there and fire them off also. Locally the internet is normally working fine, but I still have a script that runs every 3 hours, as a cronjob, to verify the network is up. Plus, once a week at o'dark-thirty, I have a timer that turns the power off to my wifi router and the fiber (was cable) modem for a minute in case one of them decides to muck up. |
Sid Celery Joined: 11 Feb 08 Posts: 2122 Credit: 41,184,189 RAC: 10,001 |
"I'm away from home 3 days every week too but I trust my network to stay up sufficient to use 0.5 + 0.1 with a 12hr runtime" Understood. I'm in England. I don't have anything like that problem... (I wish). Though I'm running one (this PC) at a reduced rate atm due to being in a dusty location, and I've had to do a partial clean of the cooling system this week. |
Frik Joined: 1 Dec 05 Posts: 2 Credit: 566,696 RAC: 1,028 |
"Yes, I do run a cache. It's set at 4 days +.5 days. I tend to be out of town for 3 to 5 days per week during the summer. Should the network drop dead I don't want the computers sitting around twiddling their collective thumbs, so to speak." "The deadlines for Rosetta are 3 days. And the additional days value is best set to 0.01 (the larger the additional days value, the lower your cache will fall before it refills. The 0.01 value means it won't drop more than that below your "days to keep" value)." Hello, If Rosetta's deadline is 3 days then it is a little difficult to always get a good score for workunits completed. My PC does not run 24 hrs but only when I am here at home. So is it not possible for Rosetta to make this deadline a little longer, like some other projects in BOINC? 3 days is just a little too steep to get to on some days. I also run 3 other PCs, so having a little longer deadline would really be nice. Regards Frik Brits |
Sid Celery Joined: 11 Feb 08 Posts: 2122 Credit: 41,184,189 RAC: 10,001 |
"Yes, I do run a cache. It's set at 4 days +.5 days. I tend to be out of town for 3 to 5 days per week during the summer. Should the network drop dead I don't want the computers sitting around twiddling their collective thumbs, so to speak." "The deadlines for Rosetta are 3 days. And the additional days value is best set to 0.01 (the larger the additional days value, the lower your cache will fall before it refills. The 0.01 value means it won't drop more than that below your "days to keep" value)." A long time ago the deadline was routinely 7 days, but when the pandemic started they needed to look at the results very quickly, make adjustments and put out fresh tasks that learned from the results of the previous ones, so they reduced deadlines to just 3 days. While the urgency no longer applies at such a level, I doubt that will change in future. The point being, this isn't a project where results don't really matter (like a lot I could name!). I notice your other issue is that you've set your tasks to a 2-day (48hr) CPU runtime, which I respect tbh, but it isn't helping you. Also, at one time, the project had a period when the results sent back to them were so large from a 2-day runtime that they ran out of disk space, so the maximum runtime allowed is now reduced to 1 day 12 hours (36hrs). While again, this issue no longer applies, the fact that you use such a long runtime, are not always online and are struggling to meet deadlines means your best option is to consider reducing your CPU runtime. Reducing to the current maximum of 36hrs may be sufficient, but if not, cut it down further until you can be consistently successful. |
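
As a rough illustration of two points made in the thread (Resource Share values are ratios rather than percentages, and the cache settings plus the task runtime should fit inside the 3-day Rosetta deadline), here is a minimal Python sketch. The function names, the project names `ProjectA` and `ProjectB`, and the share values of 100/100/200 are made up for illustration. This only encodes the rule of thumb quoted above ("2 + 0.5 + runtime" not exceeding the deadline); it is not the actual BOINC client scheduling logic.

```python
def share_fractions(resource_shares):
    """Turn Resource Share values (ratios) into fractions of the total.

    resource_shares: dict mapping a project name to its share value.
    The values only matter relative to each other, e.g. 100/100/200
    gives the same split as 1/1/2.
    """
    total = sum(resource_shares.values())
    if total <= 0:
        raise ValueError("resource shares must add up to a positive number")
    return {name: share / total for name, share in resource_shares.items()}


def cache_fits_deadline(days_to_keep, additional_days, runtime_hours,
                        deadline_days=3.0):
    """Rule of thumb from the thread (an assumption, not BOINC's real check):
    cache days + additional days + task runtime should not exceed the
    project deadline, which is 3 days for Rosetta per the posts above.
    """
    total_days = days_to_keep + additional_days + runtime_hours / 24.0
    return total_days <= deadline_days


if __name__ == "__main__":
    # Hypothetical share values for three projects.
    print(share_fractions({"Rosetta": 100, "ProjectA": 100, "ProjectB": 200}))

    # The 4 + 0.5 day cache from the thread, with the 8hr estimated runtime.
    print(cache_fits_deadline(4.0, 0.5, 8))

    # The 0.5 + 0.1 day cache with a 12hr runtime mentioned in the thread.
    print(cache_fits_deadline(0.5, 0.1, 12))
```

With these inputs the 4 + 0.5 day cache fails the check while the 0.5 + 0.1 day cache passes, which matches the advice in the thread to shrink the cache below the deadline.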
©2024 University of Washington
https://www.bakerlab.org