Message boards : Number crunching : reserve wu's not downloading
Author | Message |
---|---|
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
Someone want to explain why the scheduler is ignoring my settings? I have "connect every 1 day" and "keep 7 days extra work" set. Since my initial download of 5.80 after the server crash, the reserve keeps going down and no new work is downloaded. There are no messages saying "no work", yet I have only 4.16 days of work left at a maximum 6 hr run time per work unit. In my other thread, I saw something about the scheduler downloading new work at the last minute. That should not be happening when your prefs are set for a 7 day reserve. There should always be a 7 day reserve of work, at least logically. Or does the scheduler want to work through all the work and then download new work when the 7 days are up? I don't understand the settings. Prior to the crash it always kept the reserve up to date, so there was always a queue of work to do, even during the outage. So why is 5.80 not doing that as well? Or is all this a BOINC problem? Is 5.10.20 having a brain fart again? |
The_Bad_Penguin Joined: 5 Jun 06 Posts: 2751 Credit: 4,271,025 RAC: 0 |
I'd like to see this answer. I will be out of state for 1 week, and unfortunately, the network my crunchers are on goes off-line with some frequency. I'd hate to have only 0.5 days of Rosie reserve (as is the case now) on my quadcore, have the network go down on my first day away, and have Rosie be out a quadcore's worth of processing for nearly 1 week. I don't have much hope that the PCs will reconnect to the network without manual intervention. So, I'd really like to have/force BOINC/Rosie provide me with 28 days (7 days * 4 cores) worth of reserves... |
Jmarks Joined: 16 Jul 07 Posts: 132 Credit: 98,025 RAC: 0 |
If you use rah's general prefs, 'Connect to network about every' should be set to the number of days you want in reserve, irrespective of the number of CPUs. Rah uses the benchmarks run on each PC to figure out how many WUs to send for your cache. So 7 days should work for your quad-core CPU too, because rah already knows how much your PC will do. Jmarks |
The_Bad_Penguin Joined: 5 Jun 06 Posts: 2751 Credit: 4,271,025 RAC: 0 |
I have this set to "4 days", and I have less than half a day of reserve. (Quoted from Jmarks: "If you use rah's general prefs, 'Connect to network about every' should be set to the number of days you want in reserve, irrespective of the number of CPUs. Rah uses the benchmarks run on each PC to figure out how many WUs to send for your cache. So 7 days should work for your quad-core CPU too, because rah already knows how much your PC will do.") |
Jmarks Joined: 16 Jul 07 Posts: 132 Credit: 98,025 RAC: 0 |
(Quoted from The_Bad_Penguin: "I have this set to '4 days', and I have less than half a day of reserve.") If you have not changed any of your other settings, maybe something is wrong with rah. I would try rebooting your PC. P.S. I changed my Target CPU run time from .5 days to 1 day. Because of this my PC has not asked for work, so I do not know if rah is down. Jmarks |
Mod.Sense Volunteer moderator Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
I've seen several similar reports. The question always boils down to this: "How much work did your client request from Rosetta?" The reason I always ask is that people tend to assume that Rosetta is in control of all of this, and is therefore the cause of such low/no work conditions. I suspect that more recent, and perhaps beta, BOINC versions have some issues in the work scheduler. They've also been adding more overrides and controls over work fetch policies for each specific machine. If your machine is not requesting new work from Rosetta, then the BOINC client's code needs to be issuing requests more frequently. If your machine is requesting new work, but not enough seconds of work to fill out your cache, then again, the problem is with the BOINC client. If your machine is requesting new work and gets back a "no work from project" message, then you have a Rosetta issue. Rosetta Moderator: Mod.Sense |
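The three cases in the post above can be sketched as a small decision helper. This is only an illustration in Python: the function name, its parameters, and the returned strings are hypothetical, and the values would have to be read by hand from the BOINC message log rather than from any real BOINC API.

```python
from typing import Optional


def diagnose_work_shortage(requested_seconds: float,
                           seconds_to_fill_cache: float,
                           server_reply: Optional[str]) -> str:
    """Map the three cases from the post to the likely culprit.

    All inputs are hypothetical values copied by hand from the message log.
    """
    if requested_seconds <= 0:
        # Case 1: the client never asked Rosetta for work.
        return "BOINC client: not requesting work"
    if requested_seconds < seconds_to_fill_cache:
        # Case 2: the client asked, but for less than the cache needs.
        return "BOINC client: requesting too little work"
    if server_reply == "no work from project":
        # Case 3: the client asked for enough, the project had nothing.
        return "Rosetta: no work available"
    return "undetermined: check the message log"


# A 7 day cache is 7 * 86400 seconds of work.
print(diagnose_work_shortage(0, 7 * 86400, None))
```

The order of the checks follows the order of the post: client-side causes are ruled out first, and the project is blamed only when a sufficient request came back empty.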
Sailor Joined: 19 Mar 07 Posts: 75 Credit: 89,192 RAC: 0 |
I had the same problem some days ago with my AMD 4000+ and Spinhenge. Here is how I worked around it: Select "Extras", then "preferences" in the BOINC Manager. There, go into "network usage" and look for "Additional work buffer". Once I changed my settings there, BOINC started requesting work right away like I wanted. Side note: this will override the settings for all projects on your local PC. To me it didn't matter, as the only project I run on that machine is Spinhenge. Also, in the future the machine won't take changes made through the project's interface unless you select "clear" in the preferences. http://www.MIAteam.eu |
The_Bad_Penguin Joined: 5 Jun 06 Posts: 2751 Credit: 4,271,025 RAC: 0 |
Thanx! I'll test a day or two before I go away to see how this works on the quadcore. (Quoted from Sailor: "I had the same problem some days ago with my AMD 4000+ and Spinhenge.") |
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
That's what I am saying... for me... I have, in that very menu of BOINC Manager (Advanced/Preferences/Network usage), connect every 1 day and additional work 7 days, and the scheduler ignores that. Note: I see that 5.69 gave me some work, but still not a 7 day buffer. (Quoted from The_Bad_Penguin: "Thanx! I'll test a day or two before I go away to see how this works on the quadcore.") |
The_Bad_Penguin Joined: 5 Jun 06 Posts: 2751 Credit: 4,271,025 RAC: 0 |
I have 5.10.13 and just tried it now. I set a 1 day buffer, for a test. I have 3 hr WUs set, and a quadcore. In 24 hrs, each core can complete 8 WUs, and with 4 cores, that's 32 WUs per day. Just checked BM: 4 WUs being crunched, 28 WUs hanging around, for a total of 32. Seems to be working for me. Maybe there's a difference in this functionality between 5.10.13 and 5.10.20? It would seem to be a BM issue, at least relative to your installation/settings. (Quoted from Greg_BE: "That's what I am saying... for me... I have, in that very menu of BOINC Manager (Advanced/Preferences/Network usage), connect every 1 day and additional work 7 days, and the scheduler ignores that.") |
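The arithmetic in the post above can be checked with a few lines of Python. This is a back-of-the-envelope sketch, not how BOINC itself sizes a cache: the function name and parameters are made up for illustration, and it assumes every WU runs for exactly the target run time on every core.

```python
def work_units_for_buffer(buffer_days: float,
                          wu_runtime_hours: float,
                          cores: int) -> int:
    """Estimate how many WUs fill a buffer (hypothetical helper)."""
    # One core finishes 24 / runtime WUs per day.
    wus_per_core_per_day = 24 / wu_runtime_hours
    return round(buffer_days * wus_per_core_per_day * cores)


# 1 day buffer, 3 hr WUs, quadcore: 8 WUs per core, 32 in total.
print(work_units_for_buffer(1, 3, 4))  # 32
```

With 4 WUs running and 28 waiting, the 32 reported in the post matches this estimate for a 1 day buffer.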
Sailor Joined: 19 Mar 07 Posts: 75 Credit: 89,192 RAC: 0 |
Hm, I've got 5.10.20 and it works... maybe just reset Rosetta, Greg? http://www.MIAteam.eu |
Ingleside Joined: 25 Sep 05 Posts: 107 Credit: 1,514,472 RAC: 0 |
(Quoted from Greg_BE: "Someone want to explain why the scheduler is ignoring my settings?") Well, if I didn't miscount, your computer currently has 34 unfinished WUs, and for a single core with a 6-hour run-time preference this means at least 8.5 days of expected run time. But it's possible you don't actually have 34 WUs, since Rosetta@home apparently hasn't enabled resending of "lost" WUs... Anyway, going on the assumption that this count isn't correct, here are some basic "debug" tips... If you look at the Tasks tab, is the running task marked "High Priority"? If so, work requests to that task's project are blocked until it is no longer in deadline trouble. Setting "connect every N days" too large compared to the deadline is one way to be permanently in deadline trouble. That shouldn't be a problem if it's set to 1 day with a 10(?) day deadline, but if you've got any tasks with deadlines of 2-3 days or shorter, that's the likely culprit. Not sure if there are any limitations on "Cache additional N days"... Note for any others who think they've got a problem: "High Priority" only shows up in BOINC v5.10.14 and later. Are you running multiple projects? If I remember correctly, if at least n_cpu tasks are in deadline trouble, even if they're for other projects, all work requests are blocked. Also, if by chance Rosetta@home has a negative long_term_debt, the client won't normally ask it for work. A couple of obvious ones, just to mention them: setting "no new work", suspending a task, or suspending network activity will also block work requests. A wrong venue or wrong override values can also be the cause, but shouldn't be for you, since you're looking at the "advanced preferences"... Still nothing? At the bottom of the computer's summary there are these 4 fields: "% of time BOINC client is running", "While BOINC running, % of time work is allowed", "Average CPU efficiency", and "Result duration correction factor". Both % values should be close to 100%, and "Average CPU efficiency" close to 1. If they are significantly lower, that can block work requests. 
As for "Result duration correction factor", I'm not sure what would be "correct" with a 6-hour run-time preference, but maybe somewhere around 1-2... If it's significantly higher than 2, that can again block work requests... Another method, to take a closer look at whether BOINC has "lost its marbles": make a text file and place it in the BOINC directory. Name the file "cc_config.xml", and it should include these lines: <cc_config> <log_flags> <rr_simulation>1</rr_simulation> <work_fetch_debug>1</work_fetch_debug> </log_flags> </cc_config> After saving the file, in BOINC Manager just click "Advanced/Read config file". Now, debug logging often makes a large log very quickly, so when finished debugging, the "easy" way is to re-edit the file, change each 1 to 0, save the file, and "Read config file" again. This makes it easy if you later want to re-enable debug logging. ;) If the log shows one or more results in "deadline trouble", the project is blocked from asking... "I make so many mistakes. But then just think of all the mistakes I don't make, although I might." |
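For anyone who would rather generate the file than type it, the cc_config.xml from the post above could be produced with a short Python script. This is a sketch under assumptions: the helper name `build_cc_config` is hypothetical, only the two log flags named in the post are used, and where to save the file (the BOINC directory on your machine) is left to you.

```python
import xml.etree.ElementTree as ET


def build_cc_config(debug_enabled: bool) -> str:
    """Return cc_config.xml text with both debug flags set to 1 or 0."""
    value = "1" if debug_enabled else "0"
    root = ET.Element("cc_config")
    log_flags = ET.SubElement(root, "log_flags")
    # The two flags named in the post; no others are assumed here.
    for flag in ("rr_simulation", "work_fetch_debug"):
        ET.SubElement(log_flags, flag).text = value
    return ET.tostring(root, encoding="unicode")


# Print the "debugging on" version; save the text as cc_config.xml by hand.
print(build_cc_config(True))
```

Calling `build_cc_config(False)` gives the same file with both flags set to 0, which matches the "change the 1 to 0" step for switching the logging off again. Either way, BOINC Manager still has to be told to re-read the config file.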
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
Ahh... it is possible the high priority was messing things up. That would explain a lot of the hold-up on getting work. Now that I am home and able to check my computer, I see I NOW have enough work to keep my system busy for the 8 days I told it to request. Last download was around 12 UTC. |
Luuklag Joined: 13 Sep 07 Posts: 262 Credit: 4,171 RAC: 0 |
What do you guys think about the preferences for hard disk space usage in the BOINC client? If you, e.g., request 7 days of work but you don't have enough gigs to store those 7 days, you may, for example, only download 4 days' worth, because your settings don't match each other and the one overrules the other. |
Ingleside Joined: 25 Sep 05 Posts: 107 Credit: 1,514,472 RAC: 0 |
(Quoted from Luuklag: "What do you guys think about the preferences for hard disk space usage in the BOINC client? If you, e.g., request 7 days of work but you don't have enough gigs to store those 7 days, you may, for example, only download 4 days' worth, because your settings don't match each other and the one overrules the other.") With too little disk/memory you'll still continue making work requests, but the scheduler reply will tell you what is wrong and suggest possible solutions. It's possible you'll get a 24-hour deferral. Now, if I remember correctly, the BOINC defaults for disk are: use at most 100 GB or 50%, and leave 0.1 GB free, so as long as you keep to the defaults it shouldn't be a problem. Still, some projects have AFAIK changed the defaults... For memory, if you've got very little memory it's possible to hit the 90% rule introduced in v5.8.xx, but more commonly it's the 50% "if active" limit users will see. Looking at other projects, some Predictor WUs have demanded over 1 GB, so many may have been hit by this... And there's a fairly new message, something along the lines of "too slow download-speed", from WCG on their "Africa/climate" WUs... "I make so many mistakes. But then just think of all the mistakes I don't make, although I might." |
TA_JC Joined: 7 Nov 05 Posts: 13 Credit: 7,004,672 RAC: 5,601 |
Well, we'll see... my box just started its last WU, and the other projects are suspended. Last time it downloaded a bunch of WUs at the end, but it hasn't kept the queue full at all, just like others are experiencing. |
TA_JC Joined: 7 Nov 05 Posts: 13 Credit: 7,004,672 RAC: 5,601 |
Hmmm....once again, BOINC let the queue run down until I had an idle core, then it downloaded about 8 days' work. At least I got work, I guess! :P |
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
The 5.8's are diminishing in my queue and the 5.69 is taking over. But Rosie is back to normal for me on filling up my queue. |
Ingleside Joined: 25 Sep 05 Posts: 107 Credit: 1,514,472 RAC: 0 |
(Quoted from TA_JC: "Hmmm....once again, BOINC let the queue run down until I had an idle core, then it downloaded about 8 days' work. At least I got work, I guess! :P") This behaviour often means "too large cache-size", at least if you're running with a large "Connect every N days"... "I make so many mistakes. But then just think of all the mistakes I don't make, although I might." |
Keck_Komputers Joined: 17 Sep 05 Posts: 211 Credit: 4,246,150 RAC: 0 |
(Quoted from TA_JC: "Hmmm....once again, BOINC let the queue run down until I had an idle core, then it downloaded about 8 days' work. At least I got work, I guess! :P") DOH! Why didn't I think of this before? Deadlines here are ~11 days. So any combination of connect and extra settings totalling more than ~5 days can cause this kind of problem due to the excessive queue length. Total queue length should not be more than 40% of the deadlines for BOINC to work smoothly. BOINC WIKI BOINCing since 2002/12/8 |
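The 40% rule of thumb from the post above is easy to check for your own settings. The Python below is only a sketch of that rule as stated in the thread, with hypothetical function names; it is not taken from the BOINC client, and the 40% fraction and ~11 day deadline are the poster's figures.

```python
def max_smooth_queue_days(deadline_days: float, fraction: float = 0.40) -> float:
    """Longest total queue (connect + extra days) the rule of thumb allows."""
    return deadline_days * fraction


def queue_is_too_long(connect_days: float,
                      extra_days: float,
                      deadline_days: float) -> bool:
    """True if the combined settings exceed 40% of the deadline."""
    return connect_days + extra_days > max_smooth_queue_days(deadline_days)


# Connect every 1 day plus 7 days extra work, against an ~11 day deadline.
print(queue_is_too_long(1, 7, 11))  # True
```

For an 11 day deadline the limit works out to 4.4 days, in line with the "~5 days" in the post, so a 1 + 7 day setting is well over it.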
©2024 University of Washington
https://www.bakerlab.org