Dial up or capped internet experiment

Message boards : Number crunching : Dial up or capped internet experiment

hugothehermit
Joined: 26 Sep 05
Posts: 238
Credit: 314,893
RAC: 0
Message 45377 - Posted: 25 Aug 2007, 9:11:39 UTC

For those of us on dial-up or capped internet, I am trying an experiment: I'm attempting to make R@H do a 4-day run per Work Unit.

~5 MB a day is too much for me, though I should still be able to report within the time limit. If it works, I'll try to push it out to nine days, even though a failed WU will then mean the loss of four to nine days of computing, and our RAC will sink like a lead balloon.

(Win XP)

I changed C:\Program Files\BOINC\account_boinc.bakerlab.org_rosetta.xml

From
...
<project_specific>
<max_fps>0</max_fps>
<max_cpu>0</max_cpu>
<cpu_run_time>86400</cpu_run_time>
</project_specific>
...

To
...
<project_specific>
<max_fps>0</max_fps>
<max_cpu>0</max_cpu>
<cpu_run_time>345600</cpu_run_time>
</project_specific>
</venue>
</project_preferences>
...

Then I made C:\Program Files\BOINC\account_boinc.bakerlab.org_rosetta.xml read-only so it wouldn't be reset. As for the credits stuff, well, that doesn't make any difference for R@H.

Then I exited BOINC (and BoincView) and restarted them.
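
For anyone who would rather script the edit than do it by hand, here is a minimal Python sketch of the same change. It is only an illustration: the `SAMPLE` string is modelled on the fragment above rather than copied from a real account file, and `set_cpu_run_time` is a made-up helper name, not anything BOINC provides.

```python
import xml.etree.ElementTree as ET

SECONDS_PER_DAY = 86400

# Assumption: a cut-down stand-in for the preferences fragment shown above.
# A real account file contains more elements than this.
SAMPLE = """<project_preferences>
<venue>
<project_specific>
<max_fps>0</max_fps>
<max_cpu>0</max_cpu>
<cpu_run_time>86400</cpu_run_time>
</project_specific>
</venue>
</project_preferences>"""


def set_cpu_run_time(xml_text: str, days: float) -> str:
    """Return xml_text with every <cpu_run_time> set to `days` in seconds."""
    root = ET.fromstring(xml_text)
    seconds = int(days * SECONDS_PER_DAY)
    for node in root.iter("cpu_run_time"):
        node.text = str(seconds)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # 4 days * 86400 s/day = 345600 s, the value used in the post.
    print(set_cpu_run_time(SAMPLE, 4))
```

Working on a string keeps the sketch away from the real file; you would still need to exit BOINC, write the result back, and set the read-only flag yourself.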

It seems to be working. I'm not sure about the watchdog, but I'm guessing that it uses the values on your computer to figure out if all is well.

We shall see.


Um... I guess if the project staff have a problem with this they will nuke it. But dial-up and capped-internet users like me want to contribute, and it's very hard on us: it took me about 1.5 hours to get the latest WUs.
FluffyChicken
Joined: 1 Nov 05
Posts: 1260
Credit: 369,635
RAC: 0
Message 45384 - Posted: 25 Aug 2007, 10:27:05 UTC

I know the problem...
(luckily I don't have to worry about it any more)
But you have to remember the turnaround time in which they may want results (yes, I guess you could say 10 days).
Also, what happens when BOINC tries to write to the read-only file?
Team mauisun.org
hugothehermit
Joined: 26 Sep 05
Posts: 238
Credit: 314,893
RAC: 0
Message 45389 - Posted: 25 Aug 2007, 11:08:27 UTC

But you have to remember the turnaround time in which they may want results (yes, I guess you could say 10 days).


If I return the WUs before the deadline, I'm sure it will be fine.

Also, what happens when BOINC tries to write to the read-only file?

It should be okay (I think); it should just give an error message (in the Messages tab, I'm guessing). But then, I did say it was an experiment :)

So far it looks good
BoincView:

(CPU time)1 day 01:24:28 (% done)26.47%

The truth is I really need to lower my downloads. No matter how much I love R@H, I can't keep this up indefinitely.
Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 45402 - Posted: 25 Aug 2007, 14:46:06 UTC

The watchdog shouldn't cause you any problem. It will see that you are making progress on the task, and reaching checkpoints, and you would still be under the 4 or 5x limit over your preferred runtime, so it would have no cause to step in and end a task.

The runtime preference had originally allowed values up to 4 days, but it was later revised downward to help the watchdog detect problems and keep productive work progressing on each machine. So, it might be possible for that limit to be extended again now that the issue of work units stalling and NEEDING the watchdog to step in, seems to be behind us.

One problem you may run into some day is that your output file will be larger, and there is a limit to the size it is allowed to reach. But the only times I have heard about that causing a problem were when the application was cranking out some messages much more frequently than expected, and it was then quickly revised.
Rosetta Moderator: Mod.Sense
Ingleside
Joined: 25 Sep 05
Posts: 107
Credit: 1,514,472
RAC: 0
Message 45502 - Posted: 27 Aug 2007, 0:26:08 UTC - in response to Message 45402.  

The watchdog shouldn't cause you any problem. It will see that you are making progress on the task, and reaching checkpoints, and you would still be under the 4 or 5x limit over your preferred runtime, so it would have no cause to step in and end a task.

The runtime preference had originally allowed values up to 4 days, but it was later revised downward to help the watchdog detect problems and keep productive work progressing on each machine. So, it might be possible for that limit to be extended again now that the issue of work units stalling and NEEDING the watchdog to step in, seems to be behind us.

The BOINC client is also keeping an eye on things, and if you pass the #flops-limit the WU will be aborted.

Now, for the comparatively "slow" computer of hugothehermit this shouldn't be a problem, since, if I haven't miscalculated, his limit is 4.9 days. For a fast Core2 duo/quad, on the other hand, the current WU limit is less than 2 days...

"I make so many mistakes. But then just think of all the mistakes I don't make, although I might."
Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 45504 - Posted: 27 Aug 2007, 3:14:57 UTC

the #flops-limit


Forgot that one. Good call.

Yes, if they consider bumping the runtime preference back to a 4-day maximum, they will have to bump the flops limit as well.
Rosetta Moderator: Mod.Sense
hugothehermit
Joined: 26 Sep 05
Posts: 238
Credit: 314,893
RAC: 0
Message 45505 - Posted: 27 Aug 2007, 3:34:17 UTC
Last modified: 27 Aug 2007, 4:12:46 UTC

The BOINC client is also keeping an eye on things, and if you pass the #flops-limit the WU will be aborted.

Now, for the comparatively "slow" computer of hugothehermit this shouldn't be a problem, since, if I haven't miscalculated, his limit is 4.9 days. For a fast Core2 duo/quad, on the other hand, the current WU limit is less than 2 days...


Where is that defined, do you know? I don't mind a little .xml editing :)

Though I did screw myself up a bit. I have a four-day cache, and I misread the number of days I had to return all of the results: I had 8 days but I thought I had 14. The dreaded d/m/y vs. m/d/y got me.
So I will probably end up changing it back to one day and lowering my cache before the next round of tests, assuming that the ones that are running now are valid.
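
A rough way to sanity-check this up front, sketched in Python. It assumes the worst case is a task that sits in the cache for the full cache period and then runs for the full preferred runtime on a machine crunching around the clock; both function names are made up for illustration and are not part of BOINC.

```python
def worst_case_turnaround_days(cache_days: float, run_days: float) -> float:
    """Assumed worst case: wait out the whole cache, then run the full runtime."""
    return cache_days + run_days


def fits_deadline(cache_days: float, run_days: float, deadline_days: float) -> bool:
    """True only if the worst case finishes strictly before the deadline."""
    return worst_case_turnaround_days(cache_days, run_days) < deadline_days


if __name__ == "__main__":
    # Four-day cache plus four-day WUs against the real 8-day deadline,
    # then against the misread 14-day one.
    print(fits_deadline(4, 4, 8))
    print(fits_deadline(4, 4, 14))
```

The strict comparison leaves no slack for uploading, which matters on dial-up.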

Edit:
The work units, when generated, come with a limit on the maximum number of operations it will process before it aborts itself. It's a check to make sure a work unit won't run and run and run (well you get my point). That server side value in conjunction with your benchmarks determine how much CPU time a work unit will get before self destructing (now how that is calculated is another question). There really isn't anything you can do about it.
here

If that's right, then the server side is sending something to the client side. It must put it somewhere, but where is it? Or maybe it's not editable? Hmm... I guess I need to have a look around.

Edit:

rsc_fpops_bound is the culprit. I will have to look into what to do about it, if anything, as I don't want to send back rubbish results.
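
To see why the same bound gives a slow machine more days than a fast one, here is a small Python sketch. It assumes the CPU-time limit is simply the bound divided by the benchmarked floating-point speed; the quoted explanation leaves the exact calculation open, so treat that as a guess. The bound and benchmark figures are made-up illustrative numbers, not real Rosetta values, and `max_cpu_days` is a hypothetical helper.

```python
SECONDS_PER_DAY = 86400


def max_cpu_days(fpops_bound: float, benchmark_flops: float) -> float:
    """Assumed limit: operations allowed / operations per second, in days."""
    if benchmark_flops <= 0:
        raise ValueError("benchmark_flops must be positive")
    return fpops_bound / benchmark_flops / SECONDS_PER_DAY


if __name__ == "__main__":
    bound = 9e14  # hypothetical rsc_fpops_bound, for illustration only
    for label, flops in [("slower host", 2e9), ("faster host", 6e9)]:
        print(f"{label}: about {max_cpu_days(bound, flops):.1f} days before abort")
```

Under that assumption a host that benchmarks three times faster hits the limit in a third of the time, which matches the pattern described earlier in the thread.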
Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 45511 - Posted: 27 Aug 2007, 13:21:13 UTC

Sounds like you've got it, hugo. It's just a question of whether the validator would catch that your actuals (CPU seconds and FP ops) exceed the project limits. I'd say that is doubtful, because the 24-hour limit does allow you to go over, and, as was pointed out earlier, the FP ops limit may not be exceeded anyway.
Rosetta Moderator: Mod.Sense
hugothehermit

Posts: 238
Credit: 314,893
RAC: 0
Message 45578 - Posted: 29 Aug 2007, 22:52:36 UTC
Last modified: 29 Aug 2007, 22:55:11 UTC

Results here and here

The returned WU results were both about 729 KiB.

So all worked well. I now need to pull the cache down and set it back to one-day WUs so I can finish all of the WUs on my machine; then I will set it up for 4-day WUs from then on :)

Edit: couldn't spell url :)




©2024 University of Washington
https://www.bakerlab.org