WUs Advancing Together

Message boards : Number crunching : WUs Advancing Together

Snags
Joined: 22 Feb 07
Posts: 198
Credit: 2,824,095
RAC: 1,014
Message 65958 - Posted: 4 May 2010, 17:58:05 UTC

Hi Ross,
I was suggesting that you check the memory use of all the applications running on your machine, not just BOINC and rosetta. And to recheck at various times during the course of the day. The memory requirements of any particular application may fluctuate or increase steadily (especially if they have a memory leak) even if you are not actively using them (I once left Firefox open on an unattended machine for over a day and came back to find it claiming nearly a GB of memory!).

I am going to try to describe what I think could be happening.

I have been making some assumptions about the settings on your machine; perhaps I should list those first:

1. SETI and rosetta have equal resource share

2. task switch interval is 60 minutes

3. tasks are not kept in memory while suspended

4. BOINC is allowed to run while user is active

If you don't know if these are correct, post if you are using web-based or on-board preferences, and I'll tell you where to look.

So imagine the computer fresh off a reboot. Only the start up items are running and no memory has yet been lost to leaks. BOINC starts crunching a rosetta WU. You start your workday opening apps A and B.
After an hour, BOINC switches to SETI, removing the rosetta WU from memory. You've stopped actively using app A but left it running in the background; it's still claiming memory, of course, and there are minor fluctuations in its requirements. You've opened app C, which happens to be a real memory hog.
After an hour BOINC switches from SETI to rosetta, but there is no longer enough memory available to run rosetta WU #1, so BOINC checks to see if it has another rosetta WU (it's trying to honor your resource share preferences; otherwise it could just keep crunching SETI). It finds one, and as that workunit can be started with the amount of memory that's available, crunching begins on rosetta WU #2.
After an hour BOINC switches to SETI. Near the end of the hour you close app C and go to lunch.
At the switch interval BOINC tries to run rosetta WU #1. This time there's enough memory so rosetta WU #1 gets an hour of crunch time and the progress bar advances.
Another switch interval, more SETI crunching, you come back from lunch and open apps D, E, and F. Perhaps app B, left running all day, has a memory leak and after several hours has turned into a real memory hog. At some point there's not enough memory to run either rosetta WU #1 or #2 and BOINC starts #3. Eventually you stop work for the day, quitting all those programs, maybe even doing another reboot before leaving the computer running on its own all night. At that point more memory is available to run rosetta and WU #1 gets another go.

To reiterate: the memory preferences set a maximum amount BOINC is allowed to use. There's no guarantee that this amount is ever available to BOINC, certainly no guarantee that it is available at any particular point in time (such as at the moment BOINC is switching tasks). BOINC will try to honor the resource share settings even if it is forced to give up on a particular task due to resource limitations.
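The scenario above can be sketched in a few lines of Python. This is only an illustration of the idea, not BOINC's actual scheduler: the function name `pick_task`, the WU memory needs, and the free-memory figures are all made up.

```python
from typing import Optional

def pick_task(tasks: list[tuple[str, int]], free_mb: int) -> Optional[str]:
    """Return the first task (in queue order) whose memory need fits in free_mb."""
    for name, needed_mb in tasks:
        if needed_mb <= free_mb:
            return name
    return None  # nothing fits; a real client would presumably wait for memory

# Hypothetical memory needs (MB) for three rosetta WUs, oldest first.
queue = [("WU1", 400), ("WU2", 150), ("WU3", 100)]

# Hypothetical free memory at each switch back to rosetta.
moments = [("fresh boot", 900), ("app C open", 200),
           ("leaky app B", 120), ("overnight", 900)]
for label, free_mb in moments:
    print(f"{label:12s} free={free_mb:4d} MB -> {pick_task(queue, free_mb)}")
```

With these invented numbers the oldest WU only runs when the machine is otherwise quiet, which is the pattern described above.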

A few things to keep in mind:
Different rosetta WUs will require different amounts of memory to run, some requiring dramatically more than others.
If BOINC has to stop a task because it has run out of available memory I would have expected to see a "waiting for memory" message. I don't know if BOINC knows how much rosetta WU #1 now needs to run or if it has to actually try to start it up before determining there's not enough memory to run it. If, at the task switch point, BOINC checks how much memory is available and knows how much rosetta WU #1 will require, then it could be going straight to rosetta WU #2 without leaving a clue in the message log. This might be a question for the gurus on the BOINC forum. I would still expect you to see the occasional "waiting for memory" message, though, as the rosetta WU must occasionally run out of memory mid-hour, presuming you are in fact running tight on memory.

Which leads to the other possibility: that there is no issue regarding memory, and that BOINC became afraid it had overfetched rosetta workunits when you increased your cache and went into panic mode. As has been suggested by several others, BOINC will sort this out eventually even though the process might appear nonsensical to the user. If, after leaving BOINC alone for a week or so during which the other workload on the computer is fairly consistent, the user is missing deadlines, then some tweaks would be in order. One of those might be reducing the target CPU runtime. This is a rosetta specific setting, further explanation of which will have to wait for a later post.

Snags


ID: 65958 · Rating: 0
Jochen
Joined: 6 Jun 06
Posts: 133
Credit: 3,847,433
RAC: 0
Message 65960 - Posted: 4 May 2010, 19:26:22 UTC

Nice post, Snagletooth!

I am just unsure about one thing:
As far as I understand, 'leaked' memory is moved to the swap file sooner or later (since the corresponding memory pages are no longer requested by any thread, Windows decides to swap it to disk). One could run out of swap file space, though, but that is a different story. :)
A memory leak should not really eat up your RAM. It should not slow down the system either (besides the load on the CPU and resources when swapping the memory to the disk).
I have read this somewhere, but I can't remember where... So I am not sure how trustworthy this information is.

Jochen


ID: 65960 · Rating: 0
Ross Parlette
Joined: 10 Nov 05
Posts: 32
Credit: 2,165,044
RAC: 0
Message 65962 - Posted: 4 May 2010, 22:56:35 UTC

I'm still looking for Target Runtime. Currently, my "active" profile (latest timestamp) is online at SETI, but I haven't made any changes yet anyway. When I do, I'll make them slowly and here on Rosetta. Am I correct in understanding that the latest profile trumps the others? Or does it only trump similar features?

I've looked at Your account | Computing preferences here at Rosetta, but I don't see Target runtime. What is the category (Processor usage, Disk and memory usage, or Network usage) and Leading text for it? (I can't seem to find the salt on the table either.) Do I have to turn on the "super expert" bit to see it? (grin)

My SETI profile leaves applications in memory while suspended. I did this some time ago because they kept having errors when I asked that they be written out to disk by BOINC. Should I try changing this now? It might well solve the problem of insufficient swap space, an error I get fairly often. I only have 2.5 GB RAM, a honkin' amount back in the day when this PC was new.

Rosetta and SETI are each 50 share and they are my only BOINC tasks; 60 minute swap time; BOINC runs while I am posting this, for example.

Rosetta & BOINC are now giving me a fresh WU well before the previous Rosetta WU is completed. I guess my reliability score is improving.
ID: 65962 · Rating: 0
Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 65972 - Posted: 5 May 2010, 12:12:15 UTC

Ross, did you see the link for "Rosetta@home preferences"? ...and then the setting for "Target CPU run time"?
Rosetta Moderator: Mod.Sense
ID: 65972 · Rating: 0
Snags
Joined: 22 Feb 07
Posts: 198
Credit: 2,824,095
RAC: 1,014
Message 65981 - Posted: 5 May 2010, 16:44:35 UTC - in response to Message 65960.  

Nice post, Snagletooth!

I am just unsure about one thing:
As far as I understand, 'leaked' memory is moved to the swap file sooner or later (since the corresponding memory pages are no longer requested by any thread, Windows decides to swap it to disk). One could run out of swap file space, though, but that is a different story. :)
A memory leak should not really eat up your RAM. It should not slow down the system either (besides the load on the CPU and resources when swapping the memory to the disk).
I have read this somewhere, but I can't remember where... So I am not sure how trustworthy this information is.

Jochen



Well now I'm unsure too! I shouldn't be mistaken for a computer expert of any kind (and I don't run Windows so I can't add personal observations or testing to the mix). I was trying to imagine a sequence of events that could explain the BOINC actions Ross observed. If you are right I don't think it invalidates the whole scenario but it definitely adds to the list of things I should learn more about.

I always experience a slight nervous flutter as I hit the "post reply" button, a bit worried that a gap in my knowledge or a lapse in my thinking skills has led to some fundamental error. I sincerely hope others will post with corrections and not allow any suspected misunderstandings or bad logic to go unchallenged. I've learned a lot reading this and other BOINC boards and there's always so much more to explore.

Snags
ID: 65981 · Rating: 0
Snags
Joined: 22 Feb 07
Posts: 198
Credit: 2,824,095
RAC: 1,014
Message 65982 - Posted: 5 May 2010, 17:06:55 UTC - in response to Message 65962.  

Ross,
I'm going to start with the last thing you wrote since it's the most important:

Rosetta & BOINC are now giving me a fresh WU well before the previous Rosetta WU is completed. I guess my reliability score is improving.


Everything appears to be working so while I'll try to answer your questions I'll also try to refrain from suggesting any changes. And I advise you to restrain yourself from making changes, at least until the other volunteers have had a chance to point out where I've gone wrong.

First up: You can set your computing preferences on either website or in the BOINC manager. And yes, last changed prevails. If you go to your account page you should see three links in the preferences section. The first link is to the computing preferences which are referred to as the General prefs in the BOINC manager message log. The snippet of log you posted earlier includes this line:

3/12/2010 4:08:26 PM rosetta@home General prefs: from rosetta@home (last modified 12-Mar-2010 16:08:06)

If, since you made that post, you have gone to the SETI website computing preferences page and hit update preference (even if you didn't actually change anything) then those are the prefs BOINC will start using. They'll become active the next time the BOINC client contacts the SETI website and the next time you restart BOINC you will see the change reflected in the message log. If you use the BOINC manager to set these preferences instead of the web page you will see an additional line in the log: Reading override preferences file. I'll repeat mod.sense's advice: It will be a lot less confusing if you pick one place and stick to it.

Going back to your account page, preference section, the third link, resource share and graphics, will take you to project specific preferences. This is where you will find the target CPU time. It's not turnaround time or elapsed time but the amount of CPU time the WU is expected to require. The default is three hours, the maximum is 24 hours. The rosetta app will try to honor this preference, but for some WUs exceptions must be made. Deviations will be larger and more frequent for target values at either extreme. BOINC uses this number along with other information to estimate how long a particular workunit will take to complete and, by extension, to plan on-board and work fetch scheduling. Take heed of mod.sense's warning to make changes incrementally. The changes will apply to all the rosetta WUs on board, so the larger the cache the more dramatic the effect. Worst case, BOINC won't be able to avoid missing deadlines, but eventually it will clean up the mess; you just might want to cover your eyes while it does so.

If someone is consistently missing deadlines, reducing the target CPU time is one of several options that could help. Most crunchers, though, would get more benefit from increasing the amount of CPU time BOINC gets (and there is more than one way to do this), especially if their target runtime is already on the short side (say, 6 hours or less).

Snags

ID: 65982 · Rating: 0
Jochen
Joined: 6 Jun 06
Posts: 133
Credit: 3,847,433
RAC: 0
Message 65983 - Posted: 5 May 2010, 17:25:32 UTC - in response to Message 65981.  

Well now I'm unsure too!

You are welcome. ;)

But thinking a bit more about it, both could be true. I was thinking of a slow memory leak, 1 MB here and there... But if memory is eaten up fast, it could be a totally different story and I can't imagine what the side effects could be on a Win 7 or Vista based system...

I could write a little program, that does nothing but leak memory with a configurable timer and amount of memory to leak... But I would NOT run it on my computer. ;)
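A minimal sketch of such a program in Python, offered only as an illustration. The function name `leak` and its parameters are made up, and `max_chunks` is a safety cap added here so the sketch stops on its own. It imitates a leak by simply keeping every allocation referenced.

```python
import time

def leak(chunk_mb: int, interval_s: float, max_chunks: int) -> list[bytearray]:
    """Allocate chunk_mb every interval_s seconds and never release it.

    max_chunks caps the total so the sketch cannot exhaust the machine.
    """
    hoard = []
    for _ in range(max_chunks):
        # The list keeps each buffer referenced, so it is never freed.
        hoard.append(bytearray(chunk_mb * 1024 * 1024))
        time.sleep(interval_s)
    return hoard

if __name__ == "__main__":
    held = leak(chunk_mb=1, interval_s=0.1, max_chunks=5)
    print(f"holding {sum(len(b) for b in held) // (1024 * 1024)} MB")
```

Watching such a process in the task manager would be one way to see how the operating system treats memory that is held but no longer touched.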

Jochen

ID: 65983 · Rating: 0
Ross Parlette
Joined: 10 Nov 05
Posts: 32
Credit: 2,165,044
RAC: 0
Message 65989 - Posted: 5 May 2010, 22:00:36 UTC

Ah, found it! The key was selecting

Resource share and graphics Rosetta@home preferences.

My target CPU run time is 1 day. That seems a tiny bit modest; I would estimate 1.25 to 1.5 days (30 hours is the example I recall), but I'll leave it alone for the nonce. Yes this computer is a bit long in the tooth. (sigh)

If I do decide to change it, will Rosetta become the default site for ALL of my preferences / configurations or are the ones under Resource share and graphics separable from the rest?

BTW I now have **2** fresh Rosetta WU in addition to the one in progress. The one in progress is due the 13th and the two fresh ones are both due on the 14th (AM, PM). I guess I'll keep an eye out to see if they steal processor time, although there's enough wall clock time that it won't matter if they do.

In the meantime, I have another question. I've told BOINC that the network may be used at any time (I have DSL here at home). It would seem to this naive user that in addition to uploading right away, it should report fairly soon as well. I'm not worried about posting credit but clearing results by providing a quorum for a particular WU. But perhaps this is only an issue for SETI. If so, I'll ask over there.

On the BOINC site, one person was quite upset that he was not being given as much credit (by some other project) as his machine was claiming for him. A mod pointed out that the particular project wasn't even doing actual work yet but was only releasing test data, but he was not to be dissuaded and became quite huffy, even abusive and the mods locked the thread. I assume that the claimed to allowed credit difference is due to the differences in computational power of the various computers, as well as other features of the computer environment (swapping, memory usage / availability, disk speed, gonzo OS problems, etc.). It reminded me of the wag who said, "Golf isn't a matter of life and death; it's MUCH MORE important."


ID: 65989 · Rating: 0
Ross Parlette
Joined: 10 Nov 05
Posts: 32
Credit: 2,165,044
RAC: 0
Message 65990 - Posted: 5 May 2010, 22:02:00 UTC

I forgot to ask about writing out to disk. Should I try that? I had bad experiences with it some time ago but maybe it's improved.
ID: 65990 · Rating: 0
mikey
Joined: 5 Jan 06
Posts: 1894
Credit: 8,773,304
RAC: 3,957
Message 65996 - Posted: 6 May 2010, 12:13:23 UTC - in response to Message 65989.  
Last modified: 6 May 2010, 12:17:08 UTC

In the meantime, I have another question. I've told BOINC that the network may be used at any time (I have DSL here at home). It would seem to this naive user that in addition to uploading right away, it should report fairly soon as well. I'm not worried about posting credit but clearing results by providing a quorum for a particular WU. But perhaps this is only an issue for SETI. If so, I'll ask over there.


BOINC is set up to report to each project once every 24 hours if you just leave it alone, unless you change the default settings or manually do an update.

On the BOINC site, one person was quite upset that he was not being given as much credit (by some other project) as his machine was claiming for him. A mod pointed out that the particular project wasn't even doing actual work yet but was only releasing test data, but he was not to be dissuaded and became quite huffy, even abusive and the mods locked the thread. I assume that the claimed to allowed credit difference is due to the differences in computational power of the various computers, as well as other features of the computer environment (swapping, memory usage / availability, disk speed, gonzo OS problems, etc.). It reminded me of the wag who said, "Golf isn't a matter of life and death; it's MUCH MORE important."


Credits are granted a little bit differently at each project, so they are really not the same from here to there. But basically they are granted based on how fast your PC goes through the workunit. The faster you crunch, the fewer resources of your PC you use and the fewer credits you get for that unit. BUT since you crunched it sooo fast, you will finish more units than the PC that takes longer and will get more credits per day than the slower PC. There are variables in each project: some workunits have this or that in them, and the more of them you crunch the more credits you get, but again that is project specific. Some projects even give fixed credits per workunit, but those units tend to be fairly short, in the 1/2 hour or so range. And yes, there are people that think a project gives too much or too little credit and will raise a stink and then usually move on. There are even different groups of people that think either that all projects should give the exact same credits or that each project should be able to do its own thing. My advice is to try not to get labeled yet; you are early in the Boincing process and will have a long time to make that decision. Me, I am in the latter group, but that is just me.
ID: 65996 · Rating: 0
Snags
Joined: 22 Feb 07
Posts: 198
Credit: 2,824,095
RAC: 1,014
Message 65999 - Posted: 6 May 2010, 13:21:19 UTC - in response to Message 65989.  

Ah, found it! The key was selecting

Resource share and graphics Rosetta@home preferences.

My target CPU run time is 1 day. That seems a tiny bit modest; I would estimate 1.25 to 1.5 days (30 hours is the example I recall), but I'll leave it alone for the nonce. Yes this computer is a bit long in the tooth. (sigh)

I assume you spotted the 30 hours in the elapsed column on the BOINC manager? Elapsed time is the wall clock time while the application was running. Actual CPU time is going to be somewhat less. Try this: With the tasks tab open in BOINC manager, highlight any task that has accumulated any elapsed time, then open the properties window (last button, top panel on the left). You should see separate lines for CPU and Elapsed times. To see the actual CPU time of tasks already reported, navigate to your tasks page (click on your name, over there on the left next to one of your posts, select |Computers View|, then select the number in the results column). It was looking at the CPU time (sec) column that led me to correctly guess that your target CPU time was set for 24 hours.
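The difference between CPU time and elapsed time can be seen in a small Python sketch. It has nothing to do with how BOINC itself measures the two; it only uses Python's standard `time.process_time()` (CPU time of the process) and `time.perf_counter()` (wall clock), and the durations are arbitrary.

```python
import time

def busy(seconds: float) -> None:
    """Spin the CPU for roughly `seconds` of wall-clock time."""
    end = time.perf_counter() + seconds
    while time.perf_counter() < end:
        pass

wall_start = time.perf_counter()
cpu_start = time.process_time()

busy(0.2)        # counts toward both CPU time and elapsed time
time.sleep(0.3)  # counts toward elapsed time only

elapsed = time.perf_counter() - wall_start
cpu = time.process_time() - cpu_start
print(f"elapsed {elapsed:.2f}s, cpu {cpu:.2f}s")
```

The sleep stands in for any stretch where the application is loaded but something else has the processor.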

If I do decide to change it, will Rosetta become the default site for ALL of my preferences / configurations or are the ones under Resource share and graphics separable from the rest?


They are separate.

BTW I now have **2** fresh Rosetta WU in addition to the one in progress. The one in progress is due the 13th and the two fresh ones are both due on the 14th (AM, PM). I guess I'll keep an eye out to see if they steal processor time, although there's enough wall clock time that it won't matter if they do.

In the meantime, I have another question. I've told BOINC that the network may be used at any time (I have DSL here at home). It would seem to this naive user that in addition to uploading right away, it should report fairly soon as well. I'm not worried about posting credit but clearing results by providing a quorum for a particular WU. But perhaps this is only an issue for SETI. If so, I'll ask over there.


It does report fairly soon; as long as it has an internet connection it will report no more than 24 hours after completing the task. From the BOINC FAQ Service:

  For BOINC 5.8 and above: Completed work is reported at the first of:
       1) 24 hours before deadline.
       2) Connect Every X before deadline.
       3) 24 hours after task completion.
       4) Immediately if the upload completes later than either 1, 2, or 3 upon completion of the task.
       5) On a trickle up message (CPDN only, I believe).
       6) On a trickle down request.
       7) On a server scheduled connection. Used, but I am not certain by which project.
       8) On a request for new work.
       9) When the user pushes the update button.
      10) On a request from an account manager.
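As a rough reading of rules 1 to 4 only, here is a Python sketch of the latest time a finished task would be reported if none of the other triggers fire first. The function name and the example dates are made up, and treating "Connect Every X before deadline" as the deadline minus the connect interval is an assumption, not something taken from the BOINC source.

```python
from datetime import datetime, timedelta

def latest_report_time(completed: datetime, upload_done: datetime,
                       deadline: datetime, connect_every: timedelta) -> datetime:
    """Latest report time under rules 1-4, ignoring triggers 5-10."""
    day = timedelta(hours=24)
    due = min(deadline - day,            # rule 1
              deadline - connect_every,  # rule 2 (assumed interpretation)
              completed + day)           # rule 3
    # Rule 4: if the upload finishes after that point, report right away.
    return max(due, upload_done)

example = latest_report_time(
    completed=datetime(2010, 5, 6, 12, 0),
    upload_done=datetime(2010, 5, 6, 12, 5),
    deadline=datetime(2010, 5, 14, 0, 0),
    connect_every=timedelta(hours=2),
)
print(example)  # 2010-05-07 12:00:00
```

For a task finished well ahead of its deadline, rule 3 is the one that applies, which matches the "no more than 24 hours after completing the task" summary.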

Rosetta doesn't use quorums. Projects set the deadlines to ensure they get results back in a timely manner. Every project defines timely manner for themselves. Why second guess them?

On the BOINC site, one person was quite upset that he was not being given as much credit (by some other project) as his machine was claiming for him. A mod pointed out that the particular project wasn't even doing actual work yet but was only releasing test data, but he was not to be dissuaded and became quite huffy, even abusive and the mods locked the thread. I assume that the claimed to allowed credit difference is due to the differences in computational power of the various computers, as well as other features of the computer environment (swapping, memory usage / availability, disk speed, gonzo OS problems, etc.). It reminded me of the wag who said, "Golf isn't a matter of life and death; it's MUCH MORE important."


I'll leave this explanation to Danny.


Snags
ID: 65999 · Rating: 0
Snags
Joined: 22 Feb 07
Posts: 198
Credit: 2,824,095
RAC: 1,014
Message 66000 - Posted: 6 May 2010, 13:24:15 UTC - in response to Message 65990.  

I forgot to ask about writing out to disk. Should I try that? I had bad experiences with it some time ago but maybe it's improved.


Are you thinking the work is not checkpointed (saved) if "leave application in memory while suspended" is selected?

From the BOINC FAQ Service:
Leave applications in memory while suspended, leaves the science applications in memory (page file) when BOINC switches between applications. Very useful setting for tasks that do not checkpoint and otherwise run from beginning till ending without stopping. If BOINC is stopped or the computer rebooted, the applications in memory will be unloaded and upon a restart of BOINC the tasks that don't checkpoint will still start from the beginning.

You might find this link helpful.

If a task restarted repeatedly without making any progress BOINC would eventually give up on it and send it back with an error message. A few years ago, when rosetta checkpoints were few or nonexistent, this error was seen frequently enough that it was recommended to leave apps in memory when suspended. Since that time checkpointing has become much more frequent. If you look in the Properties window I mentioned in the earlier post you can see when that task last checkpointed. Keep in mind that the Properties window is just a snapshot; checkpoint intervals will vary within a workunit.

Snags



ID: 66000 · Rating: 0
Ross Parlette
Joined: 10 Nov 05
Posts: 32
Credit: 2,165,044
RAC: 0
Message 66018 - Posted: 7 May 2010, 7:20:24 UTC

Snags,

I couldn't find the Properties Window in the Windows Task Manager (Win XP). I have tabs named Applications, Processes, Performance, Networking, and Users. My menu choices are File, Options, View, Shut Down, and Help. I chose View while Processes was up and asked to see CPU Time in addition to the default Mem Usage. My minirosetta task has 10:33:20 CPU Time and BOINC says it has accumulated 10:51:27 elapsed. While SETI is running (now, for example) Rosetta doesn't accumulate any time. While it's not dead on, it's certainly close enough for hobby work.

I now have 4 (count 'em, 4) Rosetta tasks. I'll see how BOINC deals with scheduling them. Two are due on the 14th (as I said already) and the other two on the 16th, so there shouldn't be a problem even if later ones advance with earlier due ones. So far, just one has accumulated any time. OTOH, I now have only one lonely SETI task, delivered well after the previous one had finished.

I just had a thought about the time discrepancy. Most of the time, the non-BOINC usage is not very processor intensive. If I were doing a rendering while BOINC was running, the CPU and Elapsed might diverge further than they do now. With that in mind, I now see that that's what you were saying in Message 65999.

parl
ID: 66018 · Rating: 0
Snags
Joined: 22 Feb 07
Posts: 198
Credit: 2,824,095
RAC: 1,014
Message 66045 - Posted: 10 May 2010, 11:27:33 UTC - in response to Message 66018.  

Snags,


I just had a thought about the time discrepancy. Most of the time, the non-BOINC usage is not very processor intensive. If I were doing a rendering while BOINC was running, the CPU and Elapsed might diverge further than they do now. With that in mind, I now see that that's what you were saying in Message 65999.

parl


You got it :)

To find the Properties window open BOINC Manager, advanced view and follow the directions in my earlier post.


Best,
Snags
ID: 66045 · Rating: 0
Ross Parlette
Joined: 10 Nov 05
Posts: 32
Credit: 2,165,044
RAC: 0
Message 66048 - Posted: 10 May 2010, 18:35:14 UTC

I let BOINC do its thing and I eventually got 5 WU for Rosetta. (I only have 1 for SETI but that's because the project keeps saying there are none available.)

The five WU are due 5/14, 5/16, 5/16, 5/19, & 5/19. All Rosetta work had accumulated on the 5/14 WU until I looked today, when the later 5/19 is now Running high priority. Checking the messages there were no mentions of memory, insufficient or otherwise.

Checking the Task Manager (Win XP) I see that the running WU (5/19 34%) has 122,144 K RAM, while the Waiting to run WU (5/14 73%) has 3,096 K RAM. While I will continue to observe these tasks, I'll keep my grubby fingers off of them, so BOINC can do its thing. My concern is that BOINC will preferentially schedule the WU due later and starve the WU due earlier, to the point where it may miss its deadline. BTW, Rosetta WU tend to take 25 to 30 hours elapsed time. While running, the task tends to take 96 to 99% of the CPU, as my poor little fingers typing this note don't make much of a drain on it.

I notice that in my Your account | Computing preferences on Rosetta, it says Write to disk at most every 60 seconds. This sounds like more of a limitation on check-pointing rather than a guarantee of it. Wassup? My working Computing preferences (at SETI) has Tasks checkpoint to disk at most every 120 seconds, which has the same reverse logic and not much difference in timing, IMNSHO. Checking the Properties bar, it appears that at most 2 minutes lost is about accurate. At that, remaining in memory doesn't seem necessary, as the tasks switch about every hour. After the current set of WU is done, I'm going to try removing idle tasks from memory. 2 minutes lost per hour doesn't sound so bad, particularly when it appears that multiple Rosetta WU are taking up memory when they are idle.

This is an expansion of a post I left at the BOINC site.
ID: 66048 · Rating: 0
Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 66064 - Posted: 11 May 2010, 2:38:52 UTC

I notice that in my Your account | Computing preferences on Rosetta, it says Write to disk at most every 60 seconds. This sounds like more of a limitation on check-pointing rather than a guarantee of it. Wassup?


If you are baking in the kitchen, and someone runs in at any random time and says "ok stop what you are doing and pack up everything in a manner that you can continue it later"... you'd probably end up with a lot of very odd leftovers and half-cooked items that just never really taste right. If you are interrupted while your bread dough is rising, how do you just stop that and continue it later? The other thing would be if someone ran in and demanded a forced stop too frequently, you'd never get any edible baked goods produced at all. You would always just be unpacking the last items when you are told to pack up again.

Similarly, the BOINC manager cannot force an application to take a checkpoint. The setting is there to give you a control on how frequently an application is allowed to use the disk. It gives you a way to limit the amount of disk access. Over time this may help preserve the life of your drive. The length of time between checkpoints varies rather dramatically between various types of work units, but most try to take a checkpoint about every 10-15 minutes.

So yes, you are correct. The setting is a limit on frequency of disk use, rather than a guarantee of taking a checkpoint.
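The "limit, not guarantee" behaviour can be pictured with a small Python sketch. The class `CheckpointThrottle` and the times are hypothetical and are not how BOINC or Rosetta implement it. The application picks the moments when it has something consistent to save, and the setting only decides whether a write is allowed yet.

```python
class CheckpointThrottle:
    """Allow a checkpoint only if min_interval_s has passed since the last one."""

    def __init__(self, min_interval_s: float) -> None:
        self.min_interval_s = min_interval_s
        self.last_write_s = None  # no checkpoint written yet

    def try_checkpoint(self, now_s: float) -> bool:
        too_soon = (self.last_write_s is not None
                    and now_s - self.last_write_s < self.min_interval_s)
        if too_soon:
            return False  # the app keeps working and asks again later
        self.last_write_s = now_s
        return True

throttle = CheckpointThrottle(min_interval_s=60)
# Hypothetical moments (in seconds) at which the app is ready to checkpoint.
results = [(t, throttle.try_checkpoint(t)) for t in [0, 30, 600, 630, 1500]]
print(results)
# [(0, True), (30, False), (600, True), (630, False), (1500, True)]
```

Nothing in the throttle ever forces a write. If the app only asks every 10 to 15 minutes, a 60 second limit never comes into play.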

There is no need to change the setting. In general Rosetta does not use the disk very often anyway.
Rosetta Moderator: Mod.Sense
ID: 66064 · Rating: 0
Ross Parlette
Joined: 10 Nov 05
Posts: 32
Credit: 2,165,044
RAC: 0
Message 66073 - Posted: 11 May 2010, 20:16:03 UTC

Thanks. I'm not sure how to do what I want, then. What I want to do is have BOINC select the one due soonest first and not start the next until the first one is done. This would be done within the 50% share Rosetta has, assuming I can ever get another SETI WU. (What I'd also like to do is not retain Rosetta in memory when it isn't computing. Currently the several WU are occupying 115 MB, 151 MB, 381 MB, and 315 MB.)

I currently have 5 Rosetta WU and now 4 of them are progressing toward completion. (Only one has yet to start.) They have due dates of May 14 (81%), 16 (Ready to start), 16 (7%), 19 (7%), and 19 (72%). The oldest one (May 14) has accumulated 17:25:47 CPU in 21:19:30 elapsed. Since I'm also running SETI (when I can get work, not much lately), I would estimate that the overall elapsed time is half the start to finish time. But this means that if BOINC fritters away the CPU on later due tasks, the one due soonest may expire. It's running just now, but NOT Running high priority.
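The ordering being asked for here is usually called earliest deadline first. A minimal Python sketch using the five due dates and percentages listed above follows. It shows the wished-for policy, not what BOINC does. Treating "Ready to start" as 0% and breaking ties in favour of the task further along are both assumptions.

```python
from datetime import date

# (due date, percent done) as listed in the post; "Ready to start" taken as 0.
tasks = [
    (date(2010, 5, 14), 81),
    (date(2010, 5, 16), 0),
    (date(2010, 5, 16), 7),
    (date(2010, 5, 19), 7),
    (date(2010, 5, 19), 72),
]

def next_task(tasks):
    """Earliest deadline first; on equal deadlines prefer the one further along."""
    unfinished = [t for t in tasks if t[1] < 100]
    return min(unfinished, key=lambda t: (t[0], -t[1]), default=None)

print(next_task(tasks))
```

Under this policy the May 14 task would run to completion before any of the others got CPU time.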

I have mixed feelings about even posting this here, since I really think this is a BOINC problem, not a Rosetta problem. But I still don't understand about the Target CPU run time (currently 24 hours). This is the maximum but it's not clear how reducing this would help, since it seems to take more than a day to complete a Rosetta WU, in the best of circumstances.

I've re-read Snags's message of 5 May (65982) and I still don't see how reducing my Target CPU time could help. I guess (a) Target CPU time and (b) checkpointing are Rosetta issues, so they belong here. Fixing the scheduling is BOINC's problem but that's not going to happen while they're chasing higher priority bugs (and I agree they're higher priority bugs).

So in conclusion (What does it mean when a preacher says, "In Conclusion?" Absolutely nothing) I think my current action will be to reduce my requested backlog to 2 days (from 4) and let the current crop clear out. Then I can revisit the Target CPU time and see if some fine-tuning is in order.
ID: 66073 · Rating: 0
Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 66081 - Posted: 12 May 2010, 3:47:51 UTC

Keep in mind that "keep in memory" doesn't mean real storage, just virtual storage.

Any time you have resource shares for more than one project and one is out of work, things get a little skewed, and any time you change your runtime preference (I'm assuming you had to change it to 24hrs, because the default is 3hrs) it takes BOINC a while to assimilate the new resulting behavior into its work estimates and scheduling.

I think shortening the period of time you are attempting to keep ahead on downloading work and leaving the rest alone is probably best. It will soon realize that a task is approaching a deadline and devote attention to it. Keep in mind that it will only take 20% of a day (about 5 hrs) to complete that May 14 task. From there, BOINC should settle down a bit.
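The "about 5 hrs" figure can be checked with a little arithmetic, sketched here in Python. It assumes progress is linear in CPU time against the 24 hour target, and it uses the 81% figure and the 17:25:47 CPU / 21:19:30 elapsed times Ross reported for the May 14 task. The helper names are made up.

```python
def hms_to_seconds(hms: str) -> int:
    """Convert 'hh:mm:ss' to seconds."""
    h, m, s = (int(x) for x in hms.split(":"))
    return h * 3600 + m * 60 + s

def remaining_cpu_hours(target_hours: float, fraction_done: float) -> float:
    """CPU hours left, assuming progress is linear against the target."""
    return target_hours * (1.0 - fraction_done)

cpu_left = remaining_cpu_hours(24, 0.81)
# Share of elapsed time that was actual CPU time so far on this task.
efficiency = hms_to_seconds("17:25:47") / hms_to_seconds("21:19:30")
elapsed_left = cpu_left / efficiency

print(round(cpu_left, 2))  # 4.56
print(round(elapsed_left, 1))
```

So roughly four and a half CPU hours remain, and somewhat more than that in wall clock time if the task keeps getting the same share of the processor while it runs.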
Rosetta Moderator: Mod.Sense
ID: 66081 · Rating: 0
Ross Parlette
Joined: 10 Nov 05
Posts: 32
Credit: 2,165,044
RAC: 0
Message 66083 - Posted: 12 May 2010, 4:28:38 UTC

But even virtual memory has a limit. And I have occasional errors with virtual memory becoming exhausted. I normally re-boot at that time to clear out any problems.

I don't remember changing my Target CPU time to 24 hours, but it's certainly been that way for a long time.

I've now asked BOINC to Maintain enough work for 3 days (down from 4). We'll see how that goes. The 4 days didn't do what I wanted anyway. SETI still couldn't keep its side of the queue full, or even populated.

What I don't understand (mentioned before) is why BOINC would select the task with the longest lead time for Running with high priority.

And I now have SIX Rosetta WU, with FOUR of them progressing toward completion.
ID: 66083 · Rating: 0
mikey
Joined: 5 Jan 06
Posts: 1894
Credit: 8,773,304
RAC: 3,957
Message 66089 - Posted: 12 May 2010, 13:16:12 UTC - in response to Message 66073.  

Thanks. I'm not sure how to do what I want, then. What I want to do is have BOINC select the one due soonest first and not start the next until the first one is done. This would be done within the 50% share Rosetta has, assuming I can ever get another SETI WU. (What I'd also like to do is not retain Rosetta in memory when it isn't computing. Currently the several WU are occupying 115 MB, 151 MB, 381 MB, and 315 MB.)


We have been asking the BOINC program writers for this for AGES! They have their own idea of how it should work, though, and we must play within their rules. There is a place to voice your opinions; I don't have it right now, but it is a mailing list you can subscribe to so your opinions will be heard. DO NOT be hopeful though: Dr. A and crew have been doing this 'their way', it mostly works according to them, and it is THEIR program. They say they DO read everything that comes through, though.
ID: 66089 · Rating: 0



©2024 University of Washington
https://www.bakerlab.org