unable to increase work cache size

PresterJohn
Joined: 4 Nov 05
Posts: 24
Credit: 2,121,609
RAC: 0
Message 3945 - Posted: 22 Nov 2005, 19:36:37 UTC
Last modified: 22 Nov 2005, 19:40:01 UTC

in anticipation of the long thanksgiving weekend, i want to increase the work cache on my crunchers so that they will have enough work to crunch for the 5 days that i will be away on holiday (these machines will not have any internet access during that time).

however, it seems that even setting 'Connect to network every XX days' to the max of 10, i only seem to be able to d/l enough work to last for approx 3 days. what gives?!?

in addition, is there a correlation between the size of the work cache and the cpu benchmark for each machine?

logically, if a powerful machine is being used, shouldn't it download more WU's than a less powerful machine (assuming they are using the same preference setting) because the faster machine will complete more WU's per 24hrs than the slower PC? my casual observation over the last two weeks doesn't seem to bear this theory out.

it's a shame that w/this project, users can't even properly download more work when needed and are bound by some arbitrary limit. i guess my machines will sit idle after all their WU's are completed because they won't be able to get enough work. a pity...
- team XPC - 'Where merry times and good crunching meet head-on!'

stephan_t
Joined: 20 Oct 05
Posts: 129
Credit: 35,464
RAC: 0
Message 3949 - Posted: 22 Nov 2005, 20:24:54 UTC

Hi PresterJohn -

If you can only download 3 days' worth of WUs at a time, it probably means you are hitting your 100 WU/day cap. I had the same situation with my faster machine.

Now, it's not a bad thing either though, and here's why. I have a P4 2.66 non-HT that's currently benchmarking at 1400/2700, which is about right. Now that machine takes 14 hours to complete a single WU. Yes, 14 hours. Amusingly, a marginally faster machine of mine completes the same WUs in roughly a third of that time.

What gives, I'm still trying to figure out; overheating is probably involved. But the fact remains: if I were able to download a number of WUs based on my benchmark numbers, then that machine would be allowed to download far more WUs than it can actually crunch.

I think that's why there's a hard limit on the number of WU you can download in one go, and in one day.

(anyone else feel free to correct me if I'm wrong).
Team CFVault.com
http://www.cfvault.com

Andrew
Joined: 19 Sep 05
Posts: 162
Credit: 105,512
RAC: 0
Message 3952 - Posted: 22 Nov 2005, 20:39:30 UTC
Last modified: 22 Nov 2005, 20:39:49 UTC

There's also a limit because of clients that just error out all the time. One machine that errors out a WU every X seconds can go through a lot of WUs in a day.

PresterJohn - can you determine if you're hitting your daily quota? Also note that your quota can decrease depending on how many client errors you're generating.

Tern
Joined: 25 Oct 05
Posts: 575
Credit: 4,586,521
RAC: 2,928
Message 3959 - Posted: 22 Nov 2005, 20:52:51 UTC - in response to Message 3945.  

however, it seems that even setting 'Connect to network every XX days' to the max of 10, i only seem to be able to d/l enough work to last for approx 3 days. what gives?!?


Are you running BOINC V5.x? Prior versions did not use the "Duration Correction Factor", and for many people, a "10-day cache" was actually more like 3 days. The newer BOINC is much more accurate. Or, as the other posters have said, you may be hitting the maximum-per-day. In that case, you may have to set it to 10 a few days before you actually need 10 days worth, and connect daily, to get that many stored up on that fast a computer.
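A rough sketch of the "set it early and connect daily" idea: if the server hands out at most a fixed number of WUs per day while the machine keeps crunching, it can take several daily connections to build up the cache. The function name, the simple once-a-day model, and the example numbers (a 180 WU target, 18 WUs crunched per day, a hypothetical quota of 30) are assumptions for illustration only, not how the BOINC scheduler is actually implemented; only the 100 per day cap comes from this thread.

```python
from typing import Optional


def days_to_fill(target_wus: int, daily_quota: int,
                 crunched_per_day: float, max_days: int = 365) -> Optional[int]:
    """Return how many daily connections it takes for the cache to reach
    target_wus, or None if it never gets there within max_days."""
    cached = 0.0
    for day in range(1, max_days + 1):
        # The server hands out at most daily_quota WUs per day.
        cached += min(daily_quota, target_wus - cached)
        if cached >= target_wus:
            return day
        # The machine finishes some work before the next connection.
        cached = max(0.0, cached - crunched_per_day)
    return None


# Assumed figures: 10 days at 18 WUs/day is a 180 WU cache.
print(days_to_fill(180, 100, 18))  # 2  (with the 100/day cap)
print(days_to_fill(180, 30, 18))   # 14 (with a hypothetical cap of 30/day)
```

If the quota is lower than what the machine crunches per day, the cache never fills under this model and the function returns None.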

stephan_t
Joined: 20 Oct 05
Posts: 129
Credit: 35,464
RAC: 0
Message 3966 - Posted: 22 Nov 2005, 21:20:52 UTC
Last modified: 22 Nov 2005, 21:23:15 UTC

Yes and I should have added, also make sure you wait while the scheduler gets the WUs in - if you haven't hit the daily quota yet. Your mileage may vary, but mine does around 12 downloads, pauses for 10 minutes, does another 10 downloads, pause, etc. until either the 10 day cap or the daily quota is reached. Don't expect one big long download of 400 work units or anything like that.
Team CFVault.com
http://www.cfvault.com

PresterJohn
Joined: 4 Nov 05
Posts: 24
Credit: 2,121,609
RAC: 0
Message 3967 - Posted: 22 Nov 2005, 21:21:19 UTC - in response to Message 3952.  
Last modified: 22 Nov 2005, 21:33:13 UTC

There's also a limit because of clients that just error out all the time. One machine that errors out a WU every X seconds can go through a lot of WUs in a day.

PresterJohn - can you determine if you're hitting your daily quota? Also note that your quota can decrease depending on how many client errors you're generating.


Andrew & Stephan,

is the 100 WU per day limit a per machine basis or a per user basis?

if it's a per machine basis, then i'm not anywhere close to hitting the 100 WU/day limit. and if it's a per user basis, IMO perhaps this is something that needs to be reconsidered for the good of the project since it seems to be set unrealistically low.

I was able to d/l 52 WU's on monday morning after uploading two days worth of completed jobs and today, i was only able to d/l 18 jobs. the queries that i'm getting are taking approx 80 minutes each to finish...thus 55 cached WU's is only about 3 days worth.
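The arithmetic above can be checked with a quick sketch. The function name and the assumption of one CPU crunching around the clock are illustrative only; the 55 WUs and 80 minutes per WU are the figures quoted in this post.

```python
def cache_days(work_units: int, minutes_per_wu: float, cpus: int = 1) -> float:
    """Days a cache of work units lasts, assuming the machine crunches 24/7."""
    total_minutes = work_units * minutes_per_wu / cpus
    return total_minutes / (24 * 60)


# 55 cached WUs at roughly 80 minutes each on a single CPU.
print(f"{cache_days(55, 80):.2f} days")  # 3.06 days
```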

the frustrating part of this is that i have slower machines which appear to be able to download more cached WU's than my faster machines. not anywhere even close to 5 days worth (much less 10 days), but they appear to be able to receive more than some of the other machines.

none of my machines are giving me regular client errors. during the three weeks of crunching r@h, i've had two client errors (on two different machines) and aside from a brief hiccup with stalled WU's which the 4.79 upgrade seemed to have fixed, everything appears to be working correctly.

- team XPC - 'Where merry times and good crunching meet head-on!'

PresterJohn
Joined: 4 Nov 05
Posts: 24
Credit: 2,121,609
RAC: 0
Message 3968 - Posted: 22 Nov 2005, 21:25:51 UTC - in response to Message 3966.  

Yes and I should have added, also make sure you wait while the scheduler gets the WUs in - if you haven't hit the daily quota yet. Your mileage may vary, but mine does around 12 downloads, pauses for 10 minutes, does another 10 downloads, pause, etc. until either the 10 day cap or the daily quota is reached. Don't expect one big long download of 400 work units or anything like that.


that interval has since been shortened (i'm seeing 4 minutes) but i am giving the scheduler plenty of time. the machine in question hasn't downloaded any more WU's for over 2 hrs.

- team XPC - 'Where merry times and good crunching meet head-on!'

Tern
Joined: 25 Oct 05
Posts: 575
Credit: 4,586,521
RAC: 2,928
Message 3970 - Posted: 22 Nov 2005, 21:30:31 UTC - in response to Message 3967.  

is the 100 WU per day limit a per machine basis or a per user basis?


It's per machine. With your computers hidden, I can't see any information that I could use to help out... definitely MUST know BOINC version, and would help to know Result Duration Correction Factor for the machine in question. Here is mine - if yours is drastically different, that could be part of the problem.

% of time BOINC client is running 99.133 %
While BOINC running, % of time host has an Internet connection 100 %
While BOINC running, % of time work is allowed 99.9788 %
Average CPU efficiency 0.995069
Result duration correction factor 1.051188

PresterJohn
Joined: 4 Nov 05
Posts: 24
Credit: 2,121,609
RAC: 0
Message 3971 - Posted: 22 Nov 2005, 21:31:00 UTC - in response to Message 3959.  

however, it seems that even setting 'Connect to network every XX days' to the max of 10, i only seem to be able to d/l enough work to last for approx 3 days. what gives?!?


Are you running BOINC V5.x? Prior versions did not use the "Duration Correction Factor", and for many people, a "10-day cache" was actually more like 3 days. The newer BOINC is much more accurate. Or, as the other posters have said, you may be hitting the maximum-per-day. In that case, you may have to set it to 10 a few days before you actually need 10 days worth, and connect daily, to get that many stored up on that fast a computer.


all machines are running 5.3.x and using the same preferences for the cache size, etc.


>>In that case, you may have to set it to 10 a few days before you actually need 10 days worth

unfortunately, even prepping the machine several days in advance isn't doing the trick.

just my opinion, but the entire way the WU's are managed for this project really leaves a lot to be desired. and if Rosetta wants to take its place in the forefront of preferred DC projects, this really needs to be reviewed and amended.
- team XPC - 'Where merry times and good crunching meet head-on!'

stephan_t
Joined: 20 Oct 05
Posts: 129
Credit: 35,464
RAC: 0
Message 3972 - Posted: 22 Nov 2005, 21:33:18 UTC

It's per CPU, and Rosetta sets it at 100 per CPU. So that's 200 per 24h on a machine that shows up as two CPUs. Assuming you are using a Pentium 4 3.0 GHz HT, that would take a lot more than 10 days to process: around 36 days.

Please see this link for a more detailed explanation.

I think your problem lies somewhere else. On my P4, I managed to keep downloading until I reached 200 WUs. I'm still crunching the units downloaded 10 days ago, and I have enough in reserve for another 10. In my opinion the fail-safes are actually set too high, because now I could potentially be the reason for the delay on around 110 perfectly good WUs. See 'WU hogs' for more information :-)

Please try to change your settings to say, 7 days, and then update the project via boinc manager. Copy paste your logs and we can help you further. It could also help if you had a link to your machine stats.

Team CFVault.com
http://www.cfvault.com

Tern
Joined: 25 Oct 05
Posts: 575
Credit: 4,586,521
RAC: 2,928
Message 3973 - Posted: 22 Nov 2005, 21:33:39 UTC - in response to Message 3971.  

all machines are running 5.3.x and using the same preferences for the cache size, etc.


5.3.x is beta code. Unsupported.

just my opinion, but the entire way the WU's are managed for this project really leaves a lot to be desired. and if Rosetta wants to take its place in the forefront of preferred DC projects, this really needs to be reviewed and amended.


This project does it exactly the same way every other BOINC project does it...

stephan_t
Joined: 20 Oct 05
Posts: 129
Credit: 35,464
RAC: 0
Message 3975 - Posted: 22 Nov 2005, 21:36:43 UTC - in response to Message 3973.  
Last modified: 22 Nov 2005, 21:37:24 UTC

This project does it exactly the same way every other BOINC project does it...


I would like to emphasize this. Bill Michael is absolutely right: this is not Rosetta-specific, and every single BOINC project behaves the same. In fact, Rosetta has the highest daily limit per CPU of all. Einstein has 8. But again, this is probably not why you can't download more WUs. The problem lies elsewhere, so please post your logs.

Team CFVault.com
http://www.cfvault.com

PresterJohn
Joined: 4 Nov 05
Posts: 24
Credit: 2,121,609
RAC: 0
Message 3977 - Posted: 22 Nov 2005, 21:42:43 UTC - in response to Message 3970.  
Last modified: 22 Nov 2005, 21:49:21 UTC

is the 100 WU per day limit a per machine basis or a per user basis?


It's per machine. With your computers hidden, I can't see any information that I could use to help out... definitely MUST know BOINC version, and would help to know Result Duration Correction Factor for the machine in question. Here is mine - if yours is drastically different, that could be part of the problem.

% of time BOINC client is running 99.133 %
While BOINC running, % of time host has an Internet connection 100 %
While BOINC running, % of time work is allowed 99.9788 %
Average CPU efficiency 0.995069
Result duration correction factor 1.051188


here are the numbers for the machine in question:

% of time BOINC client is running 98.3857 %
While BOINC running, % of time host has an Internet connection 100 %
While BOINC running, % of time work is allowed 35.231 %
Average CPU efficiency 0.984731
Result duration correction factor 0.395016

i don't know why the bold-faced stuff is showing at 35%???
- team XPC - 'Where merry times and good crunching meet head-on!'

PresterJohn
Joined: 4 Nov 05
Posts: 24
Credit: 2,121,609
RAC: 0
Message 3978 - Posted: 22 Nov 2005, 21:44:37 UTC - in response to Message 3973.  

>>5.3.x is beta code. Unsupported.

i could be misquoting the version since i am doing it off the top of my head. is there a way to dbl-check the version # within the client?

- team XPC - 'Where merry times and good crunching meet head-on!'

PresterJohn
Joined: 4 Nov 05
Posts: 24
Credit: 2,121,609
RAC: 0
Message 3979 - Posted: 22 Nov 2005, 21:47:47 UTC - in response to Message 3975.  
Last modified: 22 Nov 2005, 21:48:19 UTC

This project does it exactly the same way every other BOINC project does it...


I would like to emphasize this. Bill Michael is absolutely right: this is not Rosetta-specific, and every single BOINC project behaves the same.


if this is standard for all BOINC projects then chalk up my remarks to ignorance since rosetta is the first boinc project i've ever crunched.

notwithstanding... there do seem to be other parameters which affect WU's and their download, such as the connect interval, which David Kim has since reduced but which could be tweaked further, i think. i know of some fad users who are on dial-up and this project is not that dial-up friendly...

- team XPC - 'Where merry times and good crunching meet head-on!'

Tern
Joined: 25 Oct 05
Posts: 575
Credit: 4,586,521
RAC: 2,928
Message 3980 - Posted: 22 Nov 2005, 21:51:29 UTC - in response to Message 3977.  
Last modified: 22 Nov 2005, 22:07:15 UTC


here are the numbers for the machine in question:

% of time BOINC client is running 98.3857 %
While BOINC running, % of time host has an Internet connection 100 %
While BOINC running, % of time work is allowed 35.231 %
Average CPU efficiency 0.984731
Result duration correction factor 0.395016

i don't know why the bold-faced stuff is showing that way???


BINGO! Your DCF shows this is a VERY efficient machine; the benchmarks say "100 minutes" and then you finish that WU in 40 minutes. But that "work allowed 35%" means that BOINC thinks this computer only works 8-hour days. Ten 8-hour days... about 3 24-hour days. Possible things to look at: preferences that say "do work only" between certain hours, or only after the machine has been idle for so long. This is a "calculated based on history" value, and I'm not _positive_ we can "fix" it in one simple change... but there is hope...
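The two numbers at work here can be sketched roughly as follows. This is a simplification rather than the client's actual work-fetch code, and the function names are made up for illustration; the inputs are the 10-day setting, the 35.231 % "work allowed" figure, and the 0.395016 correction factor posted above.

```python
def effective_cache_days(connect_every_days: float, active_frac: float) -> float:
    """Days of round-the-clock work that a 'connect every N days' setting
    amounts to if the client believes work is only allowed part of the time."""
    return connect_every_days * active_frac


def corrected_runtime_minutes(benchmark_estimate_minutes: float, dcf: float) -> float:
    """Benchmark-based runtime estimate scaled by the duration correction factor."""
    return benchmark_estimate_minutes * dcf


print(round(effective_cache_days(10, 0.35231), 2))         # 3.52
print(round(corrected_runtime_minutes(100, 0.395016), 1))  # 39.5
```

Under these assumptions a "10 day" cache shrinks to about three and a half days of real work, which matches what the machine is actually receiving.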

I'll go ahead and post what I would try, to save time, in case it's not a web pref. This may not work, haven't done it, but it's a possibility. First, stop BOINC and all science apps completely; File/Exit, then verify with task manager. Make a backup copy of the entire BOINC folder just in case. Locate "client_state.xml" in the BOINC folder. Near the top, just after host_info, are time_stats. The one in question is "active_frac". Change that value to 1.000000 (that's six zeroes). Change nothing else. Save the file, relaunch BOINC, hit "Update" on Rosetta and see if it gives you a ton of work...

EDIT:: I'm assuming Windows, so "open with Notepad" is the way to do this. If not Windows, any plain-text editor should be okay.
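For anyone who would rather script the edit than use Notepad, here is a minimal sketch. It assumes the value is stored as an `<active_frac>...</active_frac>` element in client_state.xml, as described above; the function name and the sample file contents in the demo are hypothetical. It only backs up the one file, not the whole BOINC folder, and BOINC must be fully stopped before running it against a real file.

```python
import re
import shutil
import tempfile
from pathlib import Path


def set_active_frac(state_file: Path, new_value: float = 1.0) -> bool:
    """Back up state_file, then rewrite the first <active_frac> value.

    Returns True if a value was replaced, False if no such element was found.
    """
    text = state_file.read_text()
    pattern = re.compile(r"(<active_frac>)[^<]*(</active_frac>)")
    new_text, count = pattern.subn(
        lambda m: f"{m.group(1)}{new_value:.6f}{m.group(2)}", text, count=1
    )
    if count == 0:
        return False
    # Keep a copy of the untouched file next to the original.
    shutil.copy2(state_file, state_file.with_name(state_file.name + ".bak"))
    state_file.write_text(new_text)
    return True


if __name__ == "__main__":
    # Demo on a throwaway file with made-up contents, not a real BOINC folder.
    with tempfile.TemporaryDirectory() as tmp:
        demo = Path(tmp) / "client_state.xml"
        demo.write_text(
            "<client_state>\n<time_stats>\n"
            "    <active_frac>0.352310</active_frac>\n"
            "</time_stats>\n</client_state>\n"
        )
        set_active_frac(demo)
        print(demo.read_text())
```

Nothing else in the file is touched, in line with the "change nothing else" advice above.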

Tern
Joined: 25 Oct 05
Posts: 575
Credit: 4,586,521
RAC: 2,928
Message 3981 - Posted: 22 Nov 2005, 21:53:24 UTC - in response to Message 3978.  

is there a way to dbl-check the version # within the client?


Help/About BOINC Manager. 5.2.7 is, I think, the latest recommended release.

Again, I know some people are paranoid about showing their computer info, but click on my name and look at what it shows for mine. It's not everything it will show YOU for your computers, but it would let us see what you're running, look at results, etc...

Tern
Joined: 25 Oct 05
Posts: 575
Credit: 4,586,521
RAC: 2,928
Message 3984 - Posted: 22 Nov 2005, 22:01:18 UTC - in response to Message 3979.  

notwithstanding... there do seem to be other parameters which affect WU's and their download, such as the connect interval, which David Kim has since reduced but which could be tweaked further, i think. i know of some fad users who are on dial-up and this project is not that dial-up friendly...


Yes, BOINC isn't great for dial-up users, and the deferred interval the projects can set can make it worse. However, the V5 versions are a LOT better than before. A major part of SETI Classic being kept running this long was waiting on BOINC V5, for this and other reasons. There are more dial-up-friendly changes in the works, but nothing that I know of that'll be out in the next few days/weeks.

Knowing what is BOINC and what is a project is sometimes difficult. I'm attached to five projects, and have been trying to not only read and absorb the WIKI, but also edit parts of it, plus monitor all these boards, so I've finally begun to get a grasp on what's going on. And I do say _begun_, because this system is extremely powerful, and therefore necessarily awfully complex. The nice thing is that there are users, and project staff (sometimes) around to help. Rosetta is, in my opinion, the "best run" of the BOINC projects, at least of the ones I've investigated so far.

Someone said today on the "new users welcome" thread on SETI, that the most important advice they could give was "Don't Panic". How true...

stephan_t
Joined: 20 Oct 05
Posts: 129
Credit: 35,464
RAC: 0
Message 3985 - Posted: 22 Nov 2005, 22:02:09 UTC - in response to Message 3981.  

is there a way to dbl-check the version # within the client?


Help/About BOINC Manager. 5.2.7 is, I think, the latest recommended release.

Again, I know some people are paranoid about showing their computer info, but click on my name and look at what it shows for mine. It's not everything it will show YOU for your computers, but it would let us see what you're running, look at results, etc...


Haha yes, first time I saw my IP and hostname in there I was quite taken aback :-)
Obviously if you logout you'll see that those are NOT displayed to anyone else.

Maybe that's something that should be made more clear to the new users?
Team CFVault.com
http://www.cfvault.com

Tern
Joined: 25 Oct 05
Posts: 575
Credit: 4,586,521
RAC: 2,928
Message 3987 - Posted: 22 Nov 2005, 22:09:56 UTC - in response to Message 3985.  

Maybe that's something that should be made more clear to the new users?


Definitely. My main complaint with _all_ of BOINC is that the "officially provided" documentation is way too little. The WIKI helps, but... it's more of an in-depth resource than "getting started" guide. Paul has put a "getting started" section up, but if you add enough to that to cover everything, you're back to the whole Wiki!

Just to make everyone a little more paranoid... :-)




©2024 University of Washington
https://www.bakerlab.org