Message boards : Number crunching : Problems with Rosetta version 5.67
Author | Message |
---|---|
Matt3223 Joined: 15 Dec 05 Posts: 10 Credit: 58,569 RAC: 0 |
Yes, now that I think about it, I also got virtual memory errors with these workunits...the gp04s |
Udo Joined: 11 Oct 05 Posts: 2 Credit: 9,607 RAC: 0 |
...the mentioned memory consumption (> 1GB) on my PC was with WU: gp04__BOINC_SYMM_FOLD_AND_DOCK_SUBSYSTEM-gp04_-delC126__1761_88330_0, and the 2 WUs which aborted on my notebook were: gp04__BOINC_SYMM_FOLD_AND_DOCK_SUBSYSTEM-gp04_-delC126__1761_6708_0 and gp04__BOINC_SYMM_FOLD_AND_DOCK_SUBSYSTEM-gp04_-delC126__1761_4194_0. Now I got a new WU for my PC which only needs 245 MB: 1gidA_BOINC_MG_CHAINBREAK5_RNA_ABINITIO_RNA_CONTACT_RNA_LONG_RANGE_CONTACT_RNA_SASA-1gidA-_1763_1182_0 -> So it seems not to be a problem with app 5.67 itself, but in any case BOINC is not honoring my memory setting (use only 75% of VM). Udo |
TJ Joined: 22 Oct 06 Posts: 1 Credit: 63,391 RAC: 0 |
I'm seeing similar issues on Windows. I don't use a page file. Rosetta is using 749MB (mem usage) and 1.16 GB!! (VM Size) |
Don Joslyn Joined: 22 Oct 05 Posts: 2 Credit: 187,235 RAC: 0 |
Way too much memory being used: And I have 2 running at the same time! Don |
Bill Hepburn Joined: 18 Sep 05 Posts: 14 Credit: 14,873,472 RAC: 1,610 |
Initially, I thought this was a BOINC foible, and I posted on the BOINC forum earlier, but a couple of hours have gone by and it now looks like a Rosetta 5.67 issue: the Rosetta WU looks stuck "Waiting to run" at 2:15 and 75% completion. I have since stopped and restarted BOINC to no avail. BOINC 5.8.16 is running as a service on WinXP Pro. I am attached to Rosetta, Malaria, and Seti on a Pentium D. I have set my "switch every interval" to 300 minutes to allow work units to complete before switching. The machine has been on overnight. I just noticed that there is a completed Seti and Malaria WU to upload. There is a Rosetta WU running (1:30 and 50%) and a Malaria WU at (1:00 and 75%). There are no report deadlines for the next couple of days. All short-term debt values are between +1000 and -1000. Nothing odd there. But there is another Rosetta WU sitting at 2:15 and 75% "Waiting to run". There are WUs from all projects waiting to start. I am a bit baffled why the one Rosetta WU might have gotten stopped, and even more baffled why BOINC would have started a new Rosetta WU with one partially completed. The one waiting has a deadline closer than the one running. Of course, in a few hours I'm sure they will all be completed, uploaded, and gone from sight. But it does seem odd. |
raccoonone Joined: 10 May 06 Posts: 9 Credit: 335,371 RAC: 0 |
I think it was a problem with those FOLD_AND_DOCK_SUBSYSTEM WUs. I just aborted all of mine, and got new units with a different name. Now everything is working fine. |
Bill Hepburn Joined: 18 Sep 05 Posts: 14 Credit: 14,873,472 RAC: 1,610 |
Quote: "I think it was a problem with those FOLD_AND_DOCK_SUBSYSTEM WUs. I just aborted all of mine, and got new units with a different name. Now everything is working fine." It was, indeed, a FOLD_AND_DOCK_SUBSYSTEM WU, although at least one FOLD_AND_DOCK finished satisfactorily. Since then, several more WUs started and uploaded. I looked at the log and there were the two lines about this WU when it started. Nothing about ever pausing it. Oh well, the CPU ate it. I hope it enjoyed it. |
raccoonone Joined: 10 May 06 Posts: 9 Credit: 335,371 RAC: 0 |
Quote: "I think it was a problem with those FOLD_AND_DOCK_SUBSYSTEM WUs. I just aborted all of mine, and got new units with a different name. Now everything is working fine." Ya, I've had some finish just fine too. But I think that most of them require > 750MB of RAM, and therefore they run into computation errors when BOINC tries to throttle their memory usage. |
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
According to my list, I won't see the same WUs you are discussing for at least another day. I am running with only 512 MB of memory, so how is this going to work out? Is it going to use up all the physical memory and then try to blow out my virtual memory as well? You guys are all running big machines and having trouble, so I wonder how my small machine is going to act. Going to be interesting for sure. |
Rhiju Volunteer moderator Joined: 8 Jan 06 Posts: 223 Credit: 3,546 RAC: 0 |
Hi Everybody: Sorry for checking in a little late on this thread. I'm a bit puzzled that the FOLD_AND_DOCK_SUBSYSTEM workunits are taking up so much memory, but I've canceled all those jobs, and won't send any more out until we reduce the memory requirement! Apologies! Thanks for posting so quickly about the problem. It wasn't apparent on ralph. Also: if you have one of these workunits in your queue, please feel free to cancel it rather than risk a system slowdown due to virtual memory problems. Quote: "I think it was a problem with those FOLD_AND_DOCK_SUBSYSTEM WUs. I just aborted all of mine, and got new units with a different name. Now everything is working fine." |
Astro Joined: 2 Oct 05 Posts: 987 Credit: 500,253 RAC: 0 |
Will they still be of use if we finish the ones we have? Does the same apply if we're still using 5.64 on them? Will they be credited? |
MattDavis Joined: 22 Sep 05 Posts: 206 Credit: 1,377,748 RAC: 0 |
Bleh 3 computers had this problem -_- |
B-Roy Joined: 26 Sep 05 Posts: 26 Credit: 46,951 RAC: 14 |
First time I've seen such a huge impact of BOINC on my system performance; it just all collapsed. Good to see that counteraction was taken. <core_client_version>5.8.16</core_client_version> <![CDATA[ <message> - exit code -529697949 (0xe06d7363) </message> <stderr_txt> # cpu_run_time_pref: 10800 Unhandled Exception Detected... - Unhandled Exception Record - Reason: Out Of Memory (C++ Exception) (0xe06d7363) at address 0x7C812A5B Engaging BOINC Windows Runtime Debugger... |
Mike Gelvin Joined: 7 Oct 05 Posts: 65 Credit: 10,612,039 RAC: 0 |
I have a work unit that seems to be "almost" stuck at 97% complete. The % complete has been slowly increasing (by about .4%) over the last two hours. I have work units set to complete in 4 hours, and we are going over 6 with this one. It is wuid=74278854 |
MattDavis Joined: 22 Sep 05 Posts: 206 Credit: 1,377,748 RAC: 0 |
Quote: "I have a work unit that seems to be "almost" stuck at 97% complete. The % complete has been slowly increasing (by about .4%) over the last two hours. I have work units set to complete in 4 hours, and we are going over 6 with this one. It is wuid=74278854" That's a special kind of unit with HUGE decoys. That behavior is normal. |
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
I've had the 1gidA WUs a few times. You will see that the graph moves slowly, if at all, and then suddenly, the next time you check 10 minutes or so later, there is a big burst of data shown in the graphics. These WUs do not behave like the others. Just let them run their course and they will finish. Depending on your computer, your RAC will drop as well. My system is slow to process these too. But that's just the luck of the draw when it comes to WUs. Quote: "I have a work unit that seems to be "almost" stuck at 97% complete. The % complete has been slowly increasing (by about .4%) over the last two hours. I have work units set to complete in 4 hours, and we are going over 6 with this one. It is wuid=74278854" |
Doug Worrall Joined: 19 Sep 05 Posts: 60 Credit: 58,445 RAC: 0 |
Quote: "I have a work unit that seems to be "almost" stuck at 97% complete. The % complete has been slowly increasing (by about .4%) over the last two hours. I have work units set to complete in 4 hours, and we are going over 6 with this one. It is wuid=74278854" Yup, I also have my w/u size set at 2 hours crunch time. The LARGE 1 decoy w/u stop around 10 minute 1 sec. These w/u are great for Rosie, and receive more credit. Have a w/u moving very slowly and at 97.2% done, yes, it will end suddenly. I have graphics disabled for better performance. GL and Happy Crunching. Here is one here: https://boinc.bakerlab.org/rosetta/result.php?resultid=82460229 Doug |
chango369 Joined: 5 May 07 Posts: 10 Credit: 329,311 RAC: 0 |
Quote: "Hi all. This version has been tested a lot on ralph, but please let us know if you see anything unusual. Thanks for all your posts on 5.64 before, too!" Everything appears to be running smoothly now. |
Ai-Leng Joined: 14 Oct 06 Posts: 8 Credit: 4,715 RAC: 0 |
Quote: "I have a work unit that seems to be "almost" stuck at 97% complete. The % complete has been slowly increasing (by about .4%) over the last two hours. I have work units set to complete in 4 hours, and we are going over 6 with this one. It is wuid=74278854" I have the same situation but I'm not overly concerned. The WU does get finished even if it does take a while at the end. |
[HWU] GHz Joined: 4 Oct 05 Posts: 3 Credit: 366,762 RAC: 0 |
I also have memory problems with gp04* WU: https://boinc.bakerlab.org/rosetta/result.php?resultid=82333923 https://boinc.bakerlab.org/rosetta/result.php?resultid=82389538 https://boinc.bakerlab.org/rosetta/result.php?resultid=82411587 https://boinc.bakerlab.org/rosetta/result.php?resultid=82433422 This wu crashed at the start of computation. |
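
The crash log B-Roy posted earlier in the thread reports the same exit code in two forms, -529697949 and 0xe06d7363. A minimal sketch, assuming the code is a 32-bit value that BOINC prints as a signed integer, shows that both are the same number. The helper names below are hypothetical and are not part of BOINC or Rosetta. The low three bytes decode to the ASCII text "msc", which fits the "C++ Exception" wording in the log, since 0xE06D7363 is commonly documented as the code used by Microsoft's C++ runtime for thrown C++ exceptions.

```python
# Hypothetical helpers for reading a Windows-style exit code from a BOINC log.
# Assumption: the exit code is a 32-bit value shown as a signed decimal.

def to_unsigned32(code: int) -> int:
    """Reinterpret a signed 32-bit integer as an unsigned one."""
    return code & 0xFFFFFFFF


def to_signed32(value: int) -> int:
    """Reinterpret an unsigned 32-bit integer as a signed one."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


exit_code = -529697949                 # decimal form from the log
unsigned = to_unsigned32(exit_code)    # same bits, read as unsigned

# The low three bytes of the code are printable ASCII characters.
tag = (unsigned & 0xFFFFFF).to_bytes(3, "big").decode("ascii")

print(hex(unsigned))          # hex form of the exit code
print(to_signed32(unsigned))  # back to the signed decimal form
print(tag)                    # ASCII text stored in the low three bytes
```

Seeing this code next to "Out Of Memory" suggests the application threw a C++ exception when a memory allocation failed, which would be consistent with the memory usage reported above, though the log alone does not prove it.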
©2024 University of Washington
https://www.bakerlab.org