Message boards : Number crunching : Leave in Memory?
Author | Message |
---|---|
Mike Gelvin Joined: 7 Oct 05 Posts: 65 Credit: 10,612,039 RAC: 0 |
Ok. The honeymoon is over. When is the problem that forces users to leave the application in memory going to be fixed? It was brought to the foreground over 4 months ago. |
Ethan Volunteer moderator Joined: 22 Aug 05 Posts: 286 Credit: 9,304,700 RAC: 0 |
I don't think quoting a time is going to help anything. They have a list of issues that need to be worked on, and they prioritize their work on what is at the top of the list at any given time. I operate half a dozen machines with 128 MB of RAM, and dozens with 256 MB, none of which have shown the least bit of trouble leaving the application in memory. This is even when they are running all our business applications and the physical memory is full (in some cases using >60% of the scratch space). I'm not suggesting the bug should be forgotten, but there are issues that prevent people from participating (freezing at 1%, max time exceeded, bandwidth) that are currently being worked on. -Ethan |
nasher Joined: 5 Nov 05 Posts: 98 Credit: 618,288 RAC: 0 |
Yes, I would like it if you didn't have to leave it in memory, but guess what: I had that option selected even before I knew it was important for this project. Yeah, there are lots of issues out there, and Leave in memory is one of them... Personally, I don't think it's a rank 1 priority. Also, it may take time to figure out WHY it errors when it swaps... more alpha/beta testing. |
Mike Gelvin Joined: 7 Oct 05 Posts: 65 Credit: 10,612,039 RAC: 0 |
|
Andrew Joined: 19 Sep 05 Posts: 162 Credit: 105,512 RAC: 0 |
|
David Stites Joined: 17 Sep 05 Posts: 25 Credit: 1,837,114 RAC: 0 |
|
Webmaster Yoda Joined: 17 Sep 05 Posts: 161 Credit: 162,253 RAC: 0 |
"I don't leave mine in memory and I run Rosetta just fine." If just fine means it completes SOME work units OK, I guess you're right. Your Athlon X2 (host id 765) has nearly as many WUs with client errors as it has successes - I wonder how many of those are due to you not leaving them in memory. *** Join BOINC@Australia today *** |
Keck_Komputers Joined: 17 Sep 05 Posts: 211 Credit: 4,246,150 RAC: 0 |
"I don't leave mine in memory and I run Rosetta just fine." Yep, I am also seeing this problem. Unfortunately, leaving applications in memory will not help me. The host usually won't get back to the workunit before a restart anyway. BOINC WIKI BOINCing since 2002/12/8 |
Dimitris Hatzopoulos Joined: 5 Jan 06 Posts: 336 Credit: 80,939 RAC: 0 |
Just as feedback, for consideration: I didn't have the "Leave in memory" option enabled until just now, and so far have had 2 errors in 50 WUs on my XP test machine: host128426. On the other hand, I've had 2 errors in 4 WUs on a Linux box so far (which has marginal, under-spec'ed hardware vs. the R@H recommended hardware, i.e. only 256MB RAM, but except for BOINC it's mostly idle), but it's probably the WUs themselves at fault, because they failed on several other PCs as well: err wu1, err wu2. Best UFO Resources Wikipedia R@h How-To: Join Distributed Computing projects that benefit humanity |
David Stites Joined: 17 Sep 05 Posts: 25 Credit: 1,837,114 RAC: 0 |
"I don't leave mine in memory and I run Rosetta just fine." Probably none. David Stites Mount Vernon, WA USA |
Mike Gelvin Joined: 7 Oct 05 Posts: 65 Credit: 10,612,039 RAC: 0 |
|
Andrew Joined: 19 Sep 05 Posts: 162 Credit: 105,512 RAC: 0 |
David Baker has just posted that the new client is to be released "later this week" in this post. Of course, what this upcoming release addresses remains to be seen. :) EDIT: found another post with more info. |
Dimitris Hatzopoulos Joined: 5 Jan 06 Posts: 336 Credit: 80,939 RAC: 0 |
|
Dimitris Hatzopoulos Joined: 5 Jan 06 Posts: 336 Credit: 80,939 RAC: 0 |
I wonder how exactly the process of "removing app from memory" is handled by BOINC and the science app? I guess Rosetta would lose some of its temporary results since its last "checkpoint" (writing temporary results to disk every x minutes or y progress?) when it's pre-empted and removed from memory. It probably won't be able to save its work up to the last second of computation. In short, I've had zero trouble with "leave apps in mem when pre-empted" (even with marginal hosts, as outlined below; btw that Linux box w/256M RAM has over 110 processes running) and wonder how much overhead I'd pay by returning to "removing apps". I know I could look at the source of some open-source science app like SETI, and ask this in the BOINC forums, but ... I thought I'd ask here too :-) Best UFO Resources Wikipedia R@h How-To: Join Distributed Computing projects that benefit humanity |
Moderator9 Volunteer moderator Joined: 22 Jan 06 Posts: 1014 Credit: 0 RAC: 0 |
"I wonder how exactly the process of 'removing app from memory' is handled by BOINC and science app? I guess Rosetta would lose some of its temporary results, since its last 'checkpoint'? (writing temporary results to disk every x minutes or y progress?) when it's pre-empted and removed from memory. It probably won't be able to save its work up to the last sec of computation...." You are correct. If the application is removed from memory during an application swap, it will lose the work performed since the last checkpoint. In the case of Rosetta, the checkpoints occur each time the percentage advances. Since it takes nominally 90-120 min between 10% checkpoints, if you do not keep the application in memory and you set the swap interval to less than the time it takes your machine to reach a 10% mark, the work units can appear to be "hung". This is why the recommended configuration is to set "Keep applications in memory" to "YES". As an added protection, it is also recommended to set a value of nominally 120 min between application swaps. This is of course a "belt and suspenders" approach to the issue. Either of these settings alone has been shown to reduce the problem. It should be noted that ALL BOINC projects suffer from loss of CPU cycles if the applications are not kept in memory. Any work that is not saved at a checkpoint before a swap is lost when the swap occurs. On applications like Rosetta and Climate Prediction this loss is significant (15 min to over an hour). On projects like Predictor, SETI and Einstein, the loss is less, but it is still there (usually 60 seconds using the default setting for writing to disk). People who wish to squeeze every cycle out of their machine usually keep applications in memory for this reason. It is important to note that the memory we are talking about is not actual RAM but in fact is virtual memory (on disk), so the actual impact of this is not that significant. In addition, you can adjust how much memory is used by the application in this regard. With your permission I would like to add your question and the answer to the FAQs, and credit it to you and your team. Moderator9 ROSETTA@home FAQ Moderator Contact |
Carlos_Pfitzner Joined: 22 Dec 05 Posts: 71 Credit: 138,867 RAC: 0 |
"It is important to note that the memory we are talking about is not actual RAM but in fact is virtual memory (on disk), so the actual impact of this is not that significant. In addition you can adjust how much memory is used by the application in this regard." Note that it is indeed the actual RAM; only the system can decide to move RAM to the swap space ... and then ... some apps (e.g. simap) end with no finished file when moved to swap ... Date Host Project ID Message 2/17/2006 10:48:50 AM crobertp.cp3 boincsimap 2825 Result 200601277.018731_1 exited with zero status but no 'finished' file 2/17/2006 10:48:50 AM crobertp.cp3 boincsimap 2826 If this happens repeatedly you may need to reset the project. 2/17/2006 10:48:50 AM crobertp.cp3 --- 2827 request_reschedule_cpus: process exited 2/17/2006 10:48:50 AM crobertp.cp3 boincsimap 2828 Restarting result 200601277.018731_1 using simap version 507 Leaving RAM filled with stopped tasks is not a very good idea ... Ideally all apps should checkpoint, and then exit or suspend into RAM. How about reboots / power losses? Is the RAM kept across reboots? I believe it is not!!! Click signature for global team stats |
Moderator9 Volunteer moderator Joined: 22 Jan 06 Posts: 1014 Credit: 0 RAC: 0 |
"It is important to note that the memory we are talking about is not actual RAM but in fact is virtual memory (on disk), so the actual impact of this is not that significant. In addition you can adjust how much memory is used by the application in this regard." A reboot or system crash is hardly comparable to an application swap. As for the RAM usage: if the actual RAM is required for use by another application, the BOINC storage is placed in virtual memory on disk. If the system does not require it to be moved, then it does stay in RAM. So the RAM is available if the system needs it. Moderator9 ROSETTA@home FAQ Moderator Contact |
Dimitris Hatzopoulos Joined: 5 Jan 06 Posts: 336 Credit: 80,939 RAC: 0 |
"With your permission I would like to add your question and the answer to the FAQs, and credit it to you and your team." Sure, feel free to add/edit this Q (or any other post of mine) to the FAQ. Thx. Best UFO Resources Wikipedia R@h How-To: Join Distributed Computing projects that benefit humanity |
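
The checkpoint arithmetic Moderator9 describes in the thread can be sketched as a toy model. This is only an illustration under stated assumptions, not BOINC or Rosetta code: the function name `cpu_minutes_to_finish` and the numbers used (1000 minutes of work, a checkpoint every 100 minutes, swaps every 60 or 120 minutes) are hypothetical, and the model assumes that an application removed from memory always restarts from its last checkpoint.

```python
def cpu_minutes_to_finish(total_work, checkpoint_every, switch_every,
                          leave_in_memory, max_slices=10_000):
    """Toy model: CPU minutes spent to finish a work unit.

    Returns None if the task never gets past a checkpoint (it looks "hung").
    All arguments are in minutes; the values used below are illustrative only.
    """
    progress = 0  # minutes of useful work currently held by the task
    spent = 0     # total CPU minutes consumed so far

    for _ in range(max_slices):
        start = progress
        # Run until the next application swap, or until the work is done.
        run = min(switch_every, total_work - progress)
        progress += run
        spent += run
        if progress >= total_work:
            return spent
        if not leave_in_memory:
            # Assumption: removal from memory discards everything since
            # the last checkpoint, so roll back to it.
            progress = (progress // checkpoint_every) * checkpoint_every
            if progress <= start:
                return None  # no checkpoint reached in a whole time slice
    return None


# Kept in memory: no work is lost, whatever the swap interval.
print(cpu_minutes_to_finish(1000, 100, 60, leave_in_memory=True))
# Not kept in memory, swap interval shorter than the checkpoint interval.
print(cpu_minutes_to_finish(1000, 100, 60, leave_in_memory=False))
# Not kept in memory, swap interval of 120 minutes.
print(cpu_minutes_to_finish(1000, 100, 120, leave_in_memory=False))
```

In this model the first call finishes in exactly the 1000 minutes of work, the second never advances (the "hung" case), and the third finishes but spends extra CPU time redoing the work lost at each swap. That matches the reasoning behind the two recommended settings: either keeping the application in memory or using a swap interval longer than the checkpoint interval avoids the stall.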
©2024 University of Washington
https://www.bakerlab.org