Message boards : Number crunching : Credit posting screwiness
Author | Message |
---|---|
arcturus Joined: 22 Sep 05 Posts: 16 Credit: 525,440 RAC: 0 |
This concerns the linux client. Anyone care to tell me what's the difference between the following messages? - 'finished upload' - 'returning X # of results' for it isn't until I see that second message, forced by an 'update_pref' command, that I'm finally awarded points. This is true across 3 different projects incl Rosetta. Obviously I'm trying to avoid this extra command step. Any ideas? |
Paul D. Buck Joined: 17 Sep 05 Posts: 815 Credit: 1,812,737 RAC: 0 |
You cannot, the reporting process is a two step process ... look in the Wiki and search on reporting process for details. |
arcturus Joined: 22 Sep 05 Posts: 16 Credit: 525,440 RAC: 0 |
But WHY am I forced to do an 'update_pref' command? |
Tern Joined: 25 Oct 05 Posts: 576 Credit: 4,695,359 RAC: 10 |
But WHY am I forced to do an 'update_pref' command? You shouldn't be... I think what you're seeing is that the "delay" between uploading and reporting is long enough on your machine that you notice it and manually update before it gets to the automatic report. It will report when it goes to get more work, plus at several other times I'm too lazy to go look up, that are more "safety" related, such as "before it would pass deadline". If your cache size is large, it basically means it reports less often. Take a look next time you have some "ready to report", and count the number, note the oldest one. Leave it alone for a day, and look again. Unless your cache size is huge, or something really is "broke", you'll see that the old ones have been reported. |
Webmaster Yoda Joined: 17 Sep 05 Posts: 161 Credit: 162,253 RAC: 0 |
In a nutshell, uploading goes to one server, reporting goes to another. Updates are generally done on the basis of your "connect to network every..." setting. If that's set to 2 days, it may take two days before your results are reported (unless you do a manual update or BOINC needs to contact the server for other reasons). Judging by the number of WU "in progress" on a couple of your machines, you may have a setting in the order of 2 or 3 days, but I may be wrong. If you have a permanent internet connection and want BOINC to report more often (without having to do it manually), lower that setting to something like 0.5 or 0.25 days (or even 0.1 = 2.4 hours). From my perspective it would be better if BOINC had two settings - one for "connect every" and one for cache size (e.g. connect every 2 hours, keep a day's cache) but that's not the way it works. *** Join BOINC@Australia today *** |
dgnuff Joined: 1 Nov 05 Posts: 350 Credit: 24,773,605 RAC: 0 |
From my perspective it would be better if BOINC had two settings - one for "connect every" and one for cache size (e.g. connect every 2 hours, keep a day's cache) but that's not the way it works. Quoted for truth. This could fix quite a few problems. |
Tern Joined: 25 Oct 05 Posts: 576 Credit: 4,695,359 RAC: 10 |
From my perspective it would be better if BOINC had two settings - one for "connect every" and one for cache size (e.g. connect every 2 hours, keep a day's cache) but that's not the way it works. Repeated for emphasis. :-) This has been asked for repeatedly, for a long time. As Dr. Anderson is opposed to having two values, the other alternative is to have a single "cache size" value, and a flag for "dial-up". If on dial-up, then the cache size would also indicate how often they could connect; if not, then BOINC would connect "at will". Unfortunately for the specific thing being discussed here, report_results_immediately, the setting would be "yes" for dial-up users (since you don't want to wait until the _next_ connection to report) and the setting for always-on users would stay pretty much as it is. The idea behind it is to collect up a batch of reports and do them all at once, to reduce the load on the reporting server. (Which for most projects is separate from the upload server.) Now, some enterprising person could recompile the code with the report_results_immediately switch permanently set to "yes"... (This has been done before.) |
FluffyChicken Joined: 1 Nov 05 Posts: 1260 Credit: 369,635 RAC: 0 |
From my perspective it would be better if BOINC had two settings - one for "connect every" and one for cache size (e.g. connect every 2 hours, keep a day's cache) but that's not the way it works. This boinc core client has it enabled here. http://boinc.truxoft.com/ I use it for that very reason, I'm on dial-up and I would love this to be an option (and an option that can be set by the people that do the project e.g. Rosetta, not just the lead developer ;)) It also has a very handy PORT setting for RPC, which should be in the client by default as well. All results have to be reported sometime, so the load never goes away :-s Team mauisun.org |
Ingleside Joined: 25 Sep 05 Posts: 107 Credit: 1,514,472 RAC: 0 |
In a nutshell, uploading goes to one server, reporting goes to another. Reporting of results happens: 1; When asking for more work. 2; User manually updating. 3; When finished result has less than 1 day till deadline. 4; If N days since result finished, where N is the same as cache-setting. 5; Can also happen due to trickle-message in CPDN. For someone with a permanent connection and mainly running one project, normally it will crunch result-1 to the end, immediately upload it, and somewhere during crunching result-2 the work on the computer has dropped under the cache-setting so it asks for more work. So, you'll normally report 1-2 results each time it asks for more work. This is the same regardless of cache-setting being 0.1 or 10 days. The reason it doesn't completely follow this is that the amount of cached work is based on expected run-time, not actual. Example: Someone has just crunched through a bunch of "fast" results where cpu-time is 1.5h, and "to completion" has also dropped to 1.5h. Cache-setting 5 days, meaning 80 wu cached. If user gets 1 "normal" result taking 2h, after finishing this result "to completion" also jumps to 2h, meaning client suddenly thinks you've got 6.58 days cached. Since 6.58 days > 5 days cache-setting, the client can crunch many results before it is again time to ask for more work. As for some of the likely reasons reporting is a 2-step-process, take a look here |
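A rough sketch of the reporting rules and the cache arithmetic from the post above, in Python. The function and parameter names are made up for illustration and are not taken from the BOINC source; rule 5 (CPDN trickle messages) is left out, and the real client logic is certainly more involved.

```python
def should_report(requesting_work, manual_update, days_to_deadline,
                  days_since_finished, cache_days):
    """Return True if a finished, uploaded result would be reported now.

    Mirrors rules 1-4 from the post; all names here are hypothetical.
    """
    return (
        requesting_work                        # 1: asking for more work
        or manual_update                       # 2: user manually updating
        or days_to_deadline < 1.0              # 3: under 1 day till deadline
        or days_since_finished >= cache_days   # 4: N days since it finished
    )


def estimated_cache_days(results_cached, est_hours_per_result):
    """Days of work the client believes it holds, using *expected* run time."""
    return results_cached * est_hours_per_result / 24.0


# Worked example from the post: a 5-day cache filled at 1.5 h per result.
filled = round(5 * 24 / 1.5)                     # 80 results
before = estimated_cache_days(filled, 1.5)       # 5.0 days
# One 2 h result finishes; the estimate for the remaining 79 jumps to 2 h.
after = estimated_cache_days(filled - 1, 2.0)    # roughly 6.58 days
print(filled, before, round(after, 2))
```

Because `after` is above the 5-day setting, rule 1 does not fire for a while, which is why several finished results can sit at "ready to report" in that situation.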
River~~ Joined: 15 Dec 05 Posts: 761 Credit: 285,578 RAC: 0 |
... One of those bloomin' obvious points that is, surprisingly, Not Quite True. It turns out that the load on the database depends on the number of contacts the client makes, not the number of results being reported. The reason for BOINC designing a delay is in the hope that by the time a result is reported you will have further results to report in the same connect. That does save load on the database server. This is not an issue for upload, where the files can be quite big on some projects, and it spreads the load on the network better if the result uploads soon after it finishes. My ideal solution would be to have a setting that said 'report every N uploads', so that the report never forced a separate dial up but even for dial up users there was scope for saving database connects. This issue especially affects Einstein@home, where the database connect limit is the critical bottleneck, or so I am told. I have no idea if that is an issue here. |
Tern Joined: 25 Oct 05 Posts: 576 Credit: 4,695,359 RAC: 10 |
This boinc core client has it enabled here. The problem with the Trux's BOINC Client, unless there are two there to choose from, is that the benchmarks on it are extremely optimized. This is the client that I was using when I was running 90% SETI and 10% Rosetta, that I removed when I reversed that share because of the extremely high claimed and granted credit - the "cheating" factor on Rosetta. |
Tern Joined: 25 Oct 05 Posts: 576 Credit: 4,695,359 RAC: 10 |
For someone with a permanent connection and mainly running one project, normally it will be crunch result-1 to end, immediately upload it, and somewhere during crunching result-2 work on computer has dropped under cache-setting so asks for more work. So, you'll normally report 1-2 results each time asks for more work. This is the same regardless of cache-setting being 0.1 or 10 days. This also undermines the whole reason for the delay. You're saying that in the "normal" case, you will upload a result, and then before the next result is uploaded, you'll report it. One upload and one report, and all the report_results_immediately being "no" has done is delay the report, not cause them to be "bunched". I understand the whole "load on the servers" issue, and why it would be nice to report multiple results at the same time; but given the reality of the way things work, unless the results for the project are very "short" (my fast PC sometimes reports 3 or 4 SETI at one time, never more than 1 Einstein, but it can do a SETI result in just over 30 minutes...) the current implementation is not reducing this load by very much. I would love to hear from the projects what the "average" number of reported results in one connection is. Based on what _I've_ seen on my own machines, I suspect it will be very close to "1", in spite of all the effort that's been expended. I also think that _some_ projects don't care at all, because the server in question is not stressed to begin with. Meanwhile, many participants are confused, a few lose credits, a few are angry... and I _really_ think this is a SETI-driven solution to a problem only they have, and once SETI_enhanced goes in, _it_ will solve the problem, and we'll be left with a _greater_ confusion because the reports will be happening that much less often. |
ecaf Joined: 26 Oct 05 Posts: 1 Credit: 45,802 RAC: 0 |
If you have a permanent connection and Boinc is set to network connection always active then it will download and upload anytime a result is ready and request more work. If network is set to run based on preferences it will wait based on your preference settings. At least this is what I have seen on my machines. Ecaf |
Ingleside Joined: 25 Sep 05 Posts: 107 Credit: 1,514,472 RAC: 0 |
I understand the whole "load on the servers" issue, and why it would be nice to report multiple results at the same time; but given the reality of the way things work, unless the results for the project are very "short" (my fast PC sometimes reports 3 or 4 SETI at one time, never more than 1 Einstein, but it can do a SETI result in just over 30 minutes...) the current implementation is not reducing this load by very much. Well, with immediate reporting there's 2 RPC/result, with delayed reporting you'll get around 1 RPC/result. Now, if updating result-info in database is 99% of the load there's no point with the delay. But, if triggering a new scheduler-thread, parsing request, looking-up and updating host-info and so on, and at the end sending-back a reply is a large part of the load, getting half the number of connections to scheduling-server will mean a significant drop in load... For someone running multiple projects with roughly equal shares and short cache-setting, there will likely still be 2 RPC/result. For someone mainly running one project and cache-setting shorter than 1/2 the crunch-time, there'll also be 2 RPC/result. But, if cache-setting is slightly larger, with the current client you'll suddenly drop to around 1 RPC/result, and therefore potentially higher server-capacity. Anyway, since Einstein@home-results have very little variation in crunch-times, and actual crunch-times mostly are a little longer than initially expected, even with only 1 result you've reached a "stable" condition, with reporting last result while crunching on the next. |
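To put numbers on the "2 RPC/result versus 1 RPC/result" point above, here is a deliberately simple counting model in Python. It assumes exactly one work request per result crunched, which is only the "stable" single-project case described in the post; `scheduler_rpcs` is a hypothetical helper, not anything from BOINC.

```python
def scheduler_rpcs(n_results, report_immediately):
    """Count scheduler contacts for n_results under a toy model.

    Assumption: one work request per result crunched. With immediate
    reporting every finished result adds a separate report RPC; with
    delayed reporting the report rides along on the next work request.
    """
    work_requests = n_results
    extra_reports = n_results if report_immediately else 0
    return work_requests + extra_reports


for mode in (True, False):
    total = scheduler_rpcs(100, mode)
    print(f"report_immediately={mode}: {total / 100:.0f} RPC/result")
```

Whether halving the contacts matters then depends, as the post says, on how much of the server load is per-connection overhead rather than per-result database updates.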
dgnuff Joined: 1 Nov 05 Posts: 350 Credit: 24,773,605 RAC: 0 |
From my perspective it would be better if BOINC had two settings - one for "connect every" and one for cache size (e.g. connect every 2 hours, keep a day's cache) but that's not the way it works. Has Dr. Anderson explained why he is opposed to two separate values? |
Paul D. Buck Joined: 17 Sep 05 Posts: 815 Credit: 1,812,737 RAC: 0 |
Ingleside, I added the discussion to Reporting Process for some reason I had missed the suggestion to add it ... :( As usual, I made some changes so you need to check to see if I messed it up ... I did add a note in the middle that reflects *MY* observed behavior with buffer size and reporting. |
Paul D. Buck Joined: 17 Sep 05 Posts: 815 Credit: 1,812,737 RAC: 0 |
Has Dr. Anderson explained why he is opposed to two separate values? Yes. ===== Result deadlines are increasingly important in BOINC. Some projects (like Predictor@home) need fast turnaround; a human being is waiting to see the results. For throughput-oriented projects (like SETI@home) latency is important because it delays granting credit. A few months back it became apparent that the BOINC client's work-fetch and CPU scheduling policies often resulted in missed deadlines, which is a bad situation: it wastes computation, delays credit, and can cause correct results to be granted no credit. John McLeod (with some help from me and others) developed a set of interrelated mechanisms (CPU scheduling, work fetch policy, improved completion-time estimates) that try to get as much work done as possible, while meeting deadlines and honoring resource shares. These mechanisms are tricky, and in some cases they do counter-intuitive things, like not fetching any work at all for a particular project for a long time. During this process, the idea of user-specified cache size made less and less sense. We kept it around, but it became muddled. Going forward, I'd like to completely get rid of user prefs relating to cache size and network connection period. BOINC should "do the right thing" with no user tweaking. If user input is needed to do good scheduling (which I doubt) it should be in high-level terms like: "This computer will be off-line for 3 days" or "Network connects cost me money, please make as few as possible" rather than low-level things like cache sizes. -- David (AKA "the lead developer") ==== Edit The title of the message was: "Re: [boinc_dev] Connect to network every n days" On the dev mailing list. You can get the archive if you want the whole discussion. |
FluffyChicken Joined: 1 Nov 05 Posts: 1260 Credit: 369,635 RAC: 0 |
I actually knew the 'not quite true on the loading' when I posted it. I would have thought BOINC may be intelligent enough to see I have 20 jobs being uploaded, so when all 20 jobs have eventually gone, 'update stats'. Can you tell me the answer to this: when I'm connected, the jobs auto send themselves. I then disconnect but they are sat at the 'report' stage, my computer crashes, virus etc... Have to do a clean install. Is BOINC still able to report those 20 odd results? Anyways most of my comments are for 'Rosetta under BOINC', not really BOINC as a whole. HENCE why I said leave the reporting option up to the project to 'enable' or 'disable' or 'give the option to members' and not just the lead Developer(s). This gives the members who are crunching the project a voice (e.g. me and you for Rosetta), who can give comments to Rosetta for them to decide on how it affects their bandwidth, request load, server setup and users' happiness. After all very few of us bother going to 'BOINC', most come to the project forums, as they are who we are doing the work for, when asking questions. It would then be up to the project people (e.g. Rosetta) to ask for the modifications. Going forward, I'd like to completely get rid of user prefs relating to cache size and network connection period. BOINC should "do the right thing" with no user tweaking. If user input is needed to do good scheduling (which I doubt) it should be in high-level terms like: "This computer will be off-line for 3 days" or "Network connects cost me money, please make as few as possible" rather than low-level things like cache sizes Totally agree but that will take something like a centralised setup (Account Manager ?) to co-ordinate between multiple projects better. Yes Truxoft does inflate the benchmarks relative to the standard core-client, but it's nothing compared to the widely used Cruncherz. (I use Truxoft for the 2 features mentioned) Inflated benchmark score, well that's for Rosetta to sort out, not me.
Welcome to open source ;-) Team mauisun.org |
River~~ Joined: 15 Dec 05 Posts: 761 Credit: 285,578 RAC: 0 |
...I would have thought BOINC may be intelligent enough to see I have 20 jobs being uploaded, so when all 20 jobs have eventually gone, 'update stats'... Sorry to be pedantic, it is not about updating the stats (which for BOINC participants usually means the credit scores etc), it is about updating the database that keeps track of which results are in progress, overdue, returned, etc ...Can you tell me the answer to this, In my experience no. If you have data that has been uploaded OK, but not yet reported, a few days before the deadline, and your net connection goes down till after the deadline, on projects that enforce the deadlines you lose the credit and on projects that re-issue non-returned results a new copy of that WU will have been sent to someone else. This as far as I am concerned is a major flaw in the BOINC design and in my opinion all results should be reported as soon as the last relevant data file is uploaded, and projects that are getting free computing from donors should not be so penny-pinching about buying extra bandwidth on their database machines. However, my previous answer was describing BOINC as it is rather than as I'd like it to be -- and in fairness there is so much going in its favour that I am content to tolerate its petty annoyances. R~~ |
Paul D. Buck Joined: 17 Sep 05 Posts: 815 Credit: 1,812,737 RAC: 0 |
Well, there is some good news in a way, I just updated the Wiki with more on the Reporting Process with information Ingleside wrote up, and he looked at it and made some more corrections. But, this at least explains the rationale behind the design. With regard to crashes, as long as the basic data folder for BOINC is not damaged the work is not lost even if there is a problem with the executables. However, damage to the client state file will cause the information to be lost. IF there is not a quorum of results, and you are "late" reporting, the work will be re-issued automatically. However, if you DO report before the new quorum of results is formed you will, if your work validates, be issued credit and your work will be part of the quorum of results (possibly the canonical result). On Rosetta@Home and CPDN the quorum size is 1 and so, as long as you don't blow the deadline you will be ok. Oh, and CPDN is not that strict on blowing past the deadline ... There is work under way for a new project management tool called the account manager. I am not sure what all it is going to do for us, or to us, but this may also have features that will address our multi-project management issues. One other note is that BOINC View also has been steadily adding management and monitoring features for us BOINC Addicts (one I am REALLY looking forward to is the MySQL database for completed work) ... I am not sure if I answered more of your questions or not ... |