users who run off-line are impacted by shorter deadlines

Message boards : Number crunching : users who run off-line are impacted by shorter deadlines



AuthorMessage
FluffyChicken
Joined: 1 Nov 05
Posts: 1260
Credit: 369,635
RAC: 0
Message 17348 - Posted: 30 May 2006, 10:32:01 UTC - in response to Message 17244.  

(But I really wish BOINC had an option to enable 'finish job through to the end' rather than this switching thing; there is loads of time to finish these shorter-deadline jobs.)

Once the BOINC client is rewritten to match the specs (see: Client scheduling policies, CPU scheduling policies 6.), it will only start a new WU for the same project if the deadline is too near.

Norbert


/Offtopic-ish-rant/
I know why it's doing it, but BOINC isn't very clever:
Currently running Rosetta job = 10 days left until deadline, 70% through.
A new job comes along with a 7-day deadline, and it switches to this new job.

Not very clever because a runtime is 24 hrs (it knows this, approximately). It knows the on-time and run-while-on time (never off, 99%). So it should be able to conclude that if, in about 7 hrs when the currently running job has finished, there will still be at least 24 hrs left to finish the new job before its deadline, then it shouldn't switch.
Of course BOINC ignores all this, sees that 7 days is before 10 days, and switches mid-task :(
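
The check described above can be sketched in a few lines. This is only an illustration of the reasoning in this post, not BOINC's actual scheduler; the function name `should_preempt` and its parameters are hypothetical, and it assumes a single CPU and a known fraction of time the machine is on and crunching.

```python
def should_preempt(current_remaining_h, new_runtime_h, new_deadline_h, on_fraction=0.99):
    """Return True only if finishing the current task first would make
    the new task miss its deadline (all times in hours).

    Hypothetical helper: names and the 0.99 default are assumptions
    taken from the numbers in the post, not from the BOINC client.
    """
    # Convert CPU hours into wall-clock hours using the on/run fraction.
    wall_current = current_remaining_h / on_fraction
    wall_new = new_runtime_h / on_fraction
    # Switch only when running the tasks back to back would overshoot.
    return wall_current + wall_new > new_deadline_h


# The case from the post: a 24-hr job that is 70% done (about 7.2 hrs left)
# and a new 24-hr job with a 7-day (168-hr) deadline.
print(should_preempt(7.2, 24.0, 168.0))
```

With these numbers the two jobs need roughly 32 wall-clock hours in sequence, far less than 168, so the sketch would not switch mid-task.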

Team mauisun.org
ID: 17348 · Rating: 0
mikus
Joined: 7 Nov 05
Posts: 58
Credit: 700,115
RAC: 0
Message 17443 - Posted: 31 May 2006, 17:28:31 UTC

I still had some 14-day-deadline WUs left in my queue, so I let them all crunch to completion. That made my queue empty. When I connected, a full six days (my queue size) of work was downloaded (#_of_WUs x specified_CPU_time). This of course plays havoc with the BOINC client, which subtracts (queue size + 1 day) from the deadlines -- making my computer overcommitted (on paper) before it even starts on the work. But, given the new deadlines of seven days, the *actual* completion time of the last WUs assigned to me is also tight. The __server__ ought to have recognized this, and not have downloaded so many WUs to my system.
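
A minimal sketch of the "overcommitted on paper" effect, assuming the client rule is exactly as this post describes it (queue size plus one day subtracted from each deadline). The function names are hypothetical and the real client logic may differ.

```python
def effective_deadline_days(deadline_days, queue_days, margin_days=1.0):
    """Deadline the client plans against, per the rule described in the post:
    the nominal deadline minus (queue size + 1 day). Assumed, not verified."""
    return deadline_days - (queue_days + margin_days)


def overcommitted_on_paper(total_work_days, deadline_days, queue_days):
    """True if the queued work cannot fit before the shortened deadline."""
    return total_work_days > effective_deadline_days(deadline_days, queue_days)


# Six days of work, six-day queue: compare 7-day and 14-day deadlines.
print(effective_deadline_days(7, 6), overcommitted_on_paper(6, 7, 6))
print(effective_deadline_days(14, 6), overcommitted_on_paper(6, 14, 6))
```

Under that assumed rule, a 7-day deadline leaves zero planning days for a six-day queue, while a 14-day deadline leaves seven.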

As I said at the beginning of this thread, if I am forced by the shorter deadlines to shorten my queue size, it is MUCH EASIER for me to simply stop participating in Rosetta.
.
ID: 17443 · Rating: 0
NJMHoffmann
Joined: 17 Dec 05
Posts: 45
Credit: 45,891
RAC: 0
Message 17445 - Posted: 31 May 2006, 17:56:14 UTC - in response to Message 17443.  

The __server__ ought to have recognized this, and not have downloaded so many WUs to my system.

This bug in the (server-side) scheduler makes it (nearly) impossible for the user to set queue length and resource share to values that make both the user and the project happy.

Norbert

(And then there is the bug on the users' side: those who set their queue length close to the deadline. This only works if the host is attached to several projects and EDF (earliest deadline first) can postpone some WUs.)
ID: 17445 · Rating: 0
Feet1st
Joined: 30 Dec 05
Posts: 1755
Credit: 4,690,520
RAC: 0
Message 17446 - Posted: 31 May 2006, 18:37:54 UTC - in response to Message 17443.  

...making my computer overcommitted (on paper) before it even starts on the work... I am forced by the shorter deadlines to shorten my queue size, it is MUCH EASIER for me to simply stop participating in Rosetta.

I see BOINC waffle between "overcommitted" and "round-robin" all the time and my queue size is half the size of yours so it's not the looming deadline.

It is the client that decides how much work to request. If your WU runtime preference is accurately reflected in the initial estimated runtime to completion, then it will work out just fine. BOINC goes into a bit of a panic there for a half day or so, but what harm is it to run in earliest deadline first mode?

It seems the client is just trying to assure you've got work to crunch on during the time it will be without a network connection, which sounds directly in-line with your objectives. So I clearly must be missing your point. Are you saying it should have downloaded LESS WUs? Are you actually having problems completing them before the deadline? Or is it just BOINC's concern about it that is disturbing?
Add this signature to your EMail:
Running Microsoft's "System Idle Process" will never help cure cancer, AIDS nor Alzheimer's. But running Rosetta@home just might!
https://boinc.bakerlab.org/rosetta/
ID: 17446 · Rating: 0
mikus
Joined: 7 Nov 05
Posts: 58
Credit: 700,115
RAC: 0
Message 17490 - Posted: 1 Jun 2006, 3:58:28 UTC - in response to Message 17446.  
Last modified: 1 Jun 2006, 4:14:53 UTC

...making my computer overcommitted (on paper) before it even starts on the work... I am forced by the shorter deadlines to shorten my queue size, it is MUCH EASIER for me to simply stop participating in Rosetta.

It is the client that decides how much work to request. If your WU runtime preference is accurately reflected in the initial estimated runtime to completion, then it will work out just fine. BOINC goes into a bit of a panic there for a half day or so, but what harm is it to run in earliest deadline first mode?

It seems the client is just trying to assure you've got work to crunch on during the time it will be without a network connection, which sounds directly in-line with your objectives. So I clearly must be missing your point. Are you saying it should have downloaded LESS WUs? Are you actually having problems completing them before the deadline? Or is it just BOINC's concern about it that is disturbing?

I have no problem with the *client* requesting 518400 seconds of work if my queue size is 6 days. After all, the client does NOT know what kind of work will be downloaded.

My point was that the *server* downloaded 12 WUs, all of them having a deadline of seven days, KNOWING that for Rosetta my 'CPU time' specification is 12 hours. To finish crunching the last of those ought to take ((12 WUs * 12 hrs/WU) / 24 hrs/day) = 6 days, when the deadline of whichever of those WUs is finished last is 7 days. That's cutting it pretty close: (7 days deadline - 6 days to complete) = 1 day "buffer". Should *anything* delay processing during those 6 days, it might take me MORE time than the deadline to finish crunching ALL the WUs that were downloaded by the server (not even counting any additional time taken for me to report back to the server the results of my having crunched the last of these downloaded WUs). I personally think the *server* ought to have downloaded 10 or 8 WUs, thereby allowing me a "buffer" (for me meeting the deadline of whichever of these WUs will be completed last) of 2 or 3 days.
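
The arithmetic above can be checked with a short script. It is a sketch under the post's own assumptions (one CPU crunching around the clock, 12 hours per WU, a 7-day deadline); the helper names are made up for illustration.

```python
SECONDS_PER_DAY = 86_400


def crunch_days(num_wus, hours_per_wu):
    """Days of continuous crunching needed for a batch on one CPU."""
    return num_wus * hours_per_wu / 24


def buffer_days(num_wus, hours_per_wu, deadline_days):
    """Slack between finishing the last WU and the shared deadline."""
    return deadline_days - crunch_days(num_wus, hours_per_wu)


# A 6-day queue expressed in seconds, as the client requests it.
print(6 * SECONDS_PER_DAY)

# Batches of 12, 10 and 8 WUs at 12 hrs each against a 7-day deadline.
for n in (12, 10, 8):
    print(n, crunch_days(n, 12), buffer_days(n, 12, 7))
```

Twelve WUs leave a one-day buffer, while ten or eight leave two or three days, which matches the figures in the post.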

[Added by edit: OR the server could set a deadline of 7 days from the download for the *first* of those WUs, but a deadline of (7 days + 3 days) for those WUs whose crunching will be PRECEDED by an estimated three days of crunching (of other WUs that were downloaded first within this same "download assemblage").]
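
As a rough sketch of the staggered-deadline idea in the edit above: each WU's deadline is pushed back by the estimated crunch time of the WUs queued ahead of it in the same download. This is a hypothetical illustration of the proposal, not something the BOINC server is known to do.

```python
def staggered_deadlines(num_wus, hours_per_wu, base_deadline_days=7.0):
    """Deadline (in days after download) for each WU in one batch.

    WU i is assumed to wait for i earlier WUs, so its deadline is
    extended by i * hours_per_wu. Names and defaults are assumptions.
    """
    return [base_deadline_days + i * hours_per_wu / 24 for i in range(num_wus)]


deadlines = staggered_deadlines(12, 12)
print(deadlines[0])   # first WU: plain 7-day deadline
print(deadlines[6])   # WU preceded by three days of crunching
print(deadlines[-1])  # last WU in the batch
```

For the 12-WU batch discussed here, the WU that sits behind three days of other work would get a deadline of 7 + 3 = 10 days.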


Regarding BOINC's concern, I have advocated (here and in the BOINC forum) that the user ought to be allowed to specify __two__ parameter values - one for setting 'queue size' (how much work should be kept available to be crunched) and the other for specifying 'time between connects' (how long it might be between completing work and reporting it). But in my opinion a *severe* constraint on the "politically correct" sum of those two parameters is being imposed on off-line users by having deadlines as short as 7 days. With my queue size, I'd much prefer to see 8-day or even 9-day deadlines.
.
ID: 17490 · Rating: 0
Feet1st
Joined: 30 Dec 05
Posts: 1755
Credit: 4,690,520
RAC: 0
Message 17492 - Posted: 1 Jun 2006, 4:39:25 UTC - in response to Message 17490.  

It seems the client is just trying to assure you've got work to crunch on during the time it will be without a network connection, which sounds directly in-line with your objectives. So I clearly must be missing your point. Are you saying it should have downloaded LESS WUs? Are you actually having problems completing them before the deadline? Or is it just BOINC's concern about it that is disturbing?

I have no problem with the *client* requesting 518400 seconds of work if my queue size is 6 days. After all, the client does NOT know what kind of work will be downloaded.

I personally think the *server* ought to have downloaded 10 or 8 WUs, thereby allowing me a "buffer" (for me meeting the deadline of whichever of these WUs will be completed last) of 2 or 3 days.

[Added by edit: OR the server could set a deadline of 7 days from the download for the *first* of those WUs, but a deadline of (7 days + 3 days) for those WUs whose crunching will be PRECEDED by an estimated three days of crunching (of other WUs that were downloaded first within this same "download assemblage").]


If I hear you right, if it had only given you the 8 or 10 WUs, you'd have been done crunching them in 4 or 5 days, gone a day or two without establishing an internet connection and then reported them all in when you sit down to that PC and dial up to the net. That was my point about how it's trying to keep you with enough WUs to KEEP you crunching the whole time. If you do want to shut the machine off, or be positive that you NEVER pass a deadline, then your cache size (and therefore your use of that machine really) is larger than fits a project with a 7 day deadline.

One thing I've done in the past is to download a climate WU so I ALWAYS have something to crunch. But climate is no good for a dial up user. Another approach would be to add another project that has a consistent 14 day deadline. R@H gets the first 5 days of crunch time, the other project gets the last 2.

This was my point about how consistency of KNOWING what the deadline will be is important to some users. And how the mixture of 7 and 14 day deadlines removes some of the consistency for folks. But... actually... might work WELL for your situation, if you could consistently happen to get some of each type of work... which certainly isn't assured.

The 8 day deadline seems like it would give the project the timeliness they need (I doubt the 1 day difference to make it an 8 day deadline will make a material difference to them), and yet allow you a little more latitude for a machine that you perhaps only visit once a week (unless you skip and go a day late once in a while).

While your idea for BOINC enhancement may resolve the problem, it isn't something that Rosetta can change, and will take considerable time to roll out. But the 8 day deadline would be as easy as it was to change from 14 to 7, so should be quite easy for them to change. That was why I had asked if 8 rather than 7 would help the situation for you.

Sounds like an 8 day deadline would be helpful for you. And still give the project what they need. So, I'm hoping they might consider that idea.

They can't really tag the deadline to be 7 days from the 3 days of expected crunching on the first WUs... because that was the whole point, they shortened the deadlines because they want specific results RETURNED sooner. And that approach doesn't get them returned sooner... and your next connect time would be past that deadline anyway unless they WERE crunched in the first week because nothing went wrong.
ID: 17492 · Rating: 0
mikus
Joined: 7 Nov 05
Posts: 58
Credit: 700,115
RAC: 0
Message 17511 - Posted: 1 Jun 2006, 14:42:40 UTC - in response to Message 17492.  
Last modified: 1 Jun 2006, 14:45:58 UTC

It seems the client is just trying to assure you've got work to crunch on during the time it will be without a network connection, which sounds directly in-line with your objectives. So I clearly must be missing your point. Are you saying it should have downloaded LESS WUs? Are you actually having problems completing them before the deadline? Or is it just BOINC's concern about it that is disturbing?

I personally think the *server* ought to have downloaded 10 or 8 WUs, thereby allowing me a "buffer" (for me meeting the deadline of whichever of these WUs will be completed last) of 2 or 3 days.

[Added by edit: OR the server could set a deadline of 7 days from the download for the *first* of those WUs, but a deadline of (7 days + 3 days) for those WUs whose crunching will be PRECEDED by an estimated three days of crunching (of other WUs that were downloaded first within this same "download assemblage").]


If I hear you right, if it had only given you the 8 or 10 WUs, you'd have been done crunching them in 4 or 5 days, gone a day or two without establishing an internet connection and then reported them all in when you sit down to that PC and dial up to the net. That was my point about how it's trying to keep you with enough WUs to KEEP you crunching the whole time. If you do want to shut the machine off, or be positive that you NEVER pass a deadline, then your cache size (and therefore your use of that machine really) is larger than fits a project with a 7 day deadline.

... They can't really tag the deadline to be 7 days from the 3 days of expected crunching on the first WUs... because that was the whole point, they shortened the deadlines because they want specific results RETURNED sooner. And that approach doesn't get them returned sooner... and your next connect time would be past that deadline anyway unless they WERE crunched in the first week because nothing went wrong.

The people who specify things appear to believe that the world runs according to __rigid__ timetables. In my case, if I see that my system is close to running out of work (or has a completed WU whose deadline is close), I will then and there connect to the server (no matter *what* the "interval between connects" value happens to say). [My reason for specifying a large (queue size) value is (given non-exceptional circumstances) to not run out of work if I happen to be absent for three days, or if the server is inaccessible when I try to connect.]

My point is that if the server gives me six days of work, ALL with a deadline of seven days, that seems "out of whack" to me. [There has been discussion that the server might "filter" the WUs it hands out according to the memory size of the client system -- why not consider "filtering" the WUs according to the estimated "time to crunch" of ALL the WUs that are now being handed out to that one client ?]
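
One way to picture such a "filter" is a cap on how many WUs go out in a single request, so that the whole batch still finishes with some slack before the deadline. The sketch below is an assumption-laden illustration: `max_wus_to_send` and its parameters are hypothetical and do not correspond to any real BOINC server setting.

```python
import math

SECONDS_PER_DAY = 86_400


def max_wus_to_send(requested_seconds, seconds_per_wu, deadline_days, buffer_days):
    """Number of WUs to hand out for one work request.

    Sends what the client asked for, unless that much work would not
    finish at least `buffer_days` before the shared deadline.
    Assumes one CPU crunching continuously.
    """
    wanted = math.ceil(requested_seconds / seconds_per_wu)
    budget_seconds = (deadline_days - buffer_days) * SECONDS_PER_DAY
    fits = int(budget_seconds // seconds_per_wu)
    return max(0, min(wanted, fits))


# The request from this thread: 518400 seconds of work, 12-hr WUs, 7-day deadline.
for slack in (1, 2, 3):
    print(slack, max_wus_to_send(518_400, 12 * 3600, 7, slack))
```

With a one-day buffer the client still gets all 12 WUs; asking for two or three days of slack trims the batch to 10 or 8.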

[And if the project needs the results returned "sooner", why hand out _so much_ work at one time that the last results from that download can't help being returned "later" ?]

Please note that if on Jun 1 the server has available a 7-day-deadline WU, but that WU happens to be "handed out" only on Jun 2, the deadline seen by the user will be Jun 9. In effect, the "delay in handing out" of the WU __extended__ the date on which the results will be expected back. Why can't "delay due to crunching preceding WUs" (as "handed out" in this same "download assemblage") *also* be used to extend the date on which the results (of the last WUs in this "download assemblage") would be expected back ?
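
The date arithmetic in this paragraph can be written out as follows. The sketch assumes the deadline is simply the hand-out date plus a fixed number of days; the optional `queued_ahead_days` parameter is hypothetical and models the extension being asked for.

```python
from datetime import date, timedelta


def deadline_for(handed_out, deadline_days=7, queued_ahead_days=0):
    """Calendar date by which results are expected back.

    `queued_ahead_days` is an assumed extra allowance for work queued
    ahead of this WU in the same download; 0 reproduces a plain deadline.
    """
    return handed_out + timedelta(days=deadline_days + queued_ahead_days)


print(deadline_for(date(2006, 6, 1)))                       # handed out Jun 1
print(deadline_for(date(2006, 6, 2)))                       # handed out a day later
print(deadline_for(date(2006, 6, 1), queued_ahead_days=3))  # proposed extension
```

A one-day delay in handing out the WU moves the due date from Jun 8 to Jun 9, and the proposed allowance would move it in the same way.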


You say that my cache size is too large to fit a project with 7-day deadlines. I agree. But when I joined Rosetta, it had (I think) 28-day deadlines. [And the description of the project did NOT indicate that deadlines would come to be drastically shortened.] I believe that I am making a contribution to Rosetta. To repeat - if what I do doesn't "fit", I'll just leave.
.
ID: 17511 · Rating: 0
Feet1st
Joined: 30 Dec 05
Posts: 1755
Credit: 4,690,520
RAC: 0
Message 17551 - Posted: 2 Jun 2006, 18:56:12 UTC - in response to Message 17511.  

To repeat - if what I do doesn't "fit", I'll just leave.

Certainly no one wants you to leave. The suggestions you are making pertain more to BOINC than Rosetta. A change of deadline from 7 days to 8 though is something I think they could almost literally do tomorrow if they decided it was something they should do. You are describing things that are possible, but not presently available options. And the project didn't make the decision to go to a 7 day deadline without cause either. So, I'm just trying to help fill that gap between the short and long term. I hope that I'm helping in that respect.
ID: 17551 · Rating: 0



©2024 University of Washington
https://www.bakerlab.org