Problems with Rosetta version 5.67

Matt3223
Joined: 15 Dec 05
Posts: 10
Credit: 58,569
RAC: 0
Message 41481 - Posted: 26 May 2007, 13:45:36 UTC

Yes, now that I think about it, I also got virtual memory errors with these workunits as well... the gp04s.

ID: 41481 · Rating: 0
Udo
Joined: 11 Oct 05
Posts: 2
Credit: 9,607
RAC: 0
Message 41487 - Posted: 26 May 2007, 14:06:50 UTC


...the mentioned memory consumption (> 1GB) on my PC was with WU:
gp04__BOINC_SYMM_FOLD_AND_DOCK_SUBSYSTEM-gp04_-delC126__1761_88330_0
and the 2 WUs which aborted on my notebook were:
gp04__BOINC_SYMM_FOLD_AND_DOCK_SUBSYSTEM-gp04_-delC126__1761_6708_0
gp04__BOINC_SYMM_FOLD_AND_DOCK_SUBSYSTEM-gp04_-delC126__1761_4194_0

now I got a new WU for my PC which only needs 245 MB!
1gidA_BOINC_MG_CHAINBREAK5_RNA_ABINITIO_RNA_CONTACT_RNA_LONG_RANGE_CONTACT_RNA_SASA-1gidA-_1763_1182_0

-> seems not to be a problem with app 5.67 itself, but anyway BOINC is not honoring my memory setting (use only 75% of VM).

Udo
ID: 41487 · Rating: 0
TJ
Joined: 22 Oct 06
Posts: 1
Credit: 63,391
RAC: 0
Message 41489 - Posted: 26 May 2007, 15:44:03 UTC

I'm seeing similar issues on Windows.

I don't use a page file.

Rosetta is using 749MB (mem usage) and 1.16 GB!! (VM Size)
ID: 41489 · Rating: 0
Don Joslyn
Joined: 22 Oct 05
Posts: 2
Credit: 187,235
RAC: 0
Message 41491 - Posted: 26 May 2007, 17:00:03 UTC
Last modified: 26 May 2007, 17:04:13 UTC

Way too much memory being used:



And I have 2 running at the same time!

Don
ID: 41491 · Rating: 0
Bill Hepburn
Joined: 18 Sep 05
Posts: 14
Credit: 14,588,658
RAC: 3,963
Message 41492 - Posted: 26 May 2007, 18:23:03 UTC

Initially, I thought this was a BOINC foible, but a couple of hours have passed and it now looks like a Rosetta 5.67 issue. I posted on the BOINC forum earlier, and the Rosetta WU still looks stuck "Waiting to run" at 2:15 and 75% completion. I have since stopped and restarted BOINC to no avail.

BOINC 5.8.16 running as a service on WinXP Pro.

I am attached to Rosetta, Malaria, and Seti on a Pentium D. I have set my "switch every interval" to 300 minutes to allow work units to complete before switching.

The machine has been on overnight. I just noticed that there is a completed Seti and Malaria WU to upload. There is a Rosetta WU running (1:30 and 50%) and a Malaria WU at (1:00 and 75%). There are no report deadlines for the next couple of days. All short term debt values are between +1000 and -1000. Nothing odd there. But there is another Rosetta WU sitting at 2:15 and 75% "Waiting to run".

There are WUs from all projects waiting to start. I am a bit baffled why the one Rosetta WU might have gotten stopped, and even more baffled why it would have started a new Rosetta with one partially completed. The one waiting has a deadline closer than the one running.

Of course, in a few hours I'm sure they will all be completed, uploaded, and gone from sight. But it does seem odd.



ID: 41492 · Rating: 0
raccoonone
Joined: 10 May 06
Posts: 9
Credit: 335,371
RAC: 0
Message 41494 - Posted: 26 May 2007, 19:04:51 UTC

I think it was a problem with those FOLD_AND_DOCK_SUBSYSTEM WUs. I just aborted all of mine, and got new units with a different name. Now everything is working fine.
ID: 41494 · Rating: 0
Bill Hepburn
Joined: 18 Sep 05
Posts: 14
Credit: 14,588,658
RAC: 3,963
Message 41495 - Posted: 26 May 2007, 19:32:08 UTC - in response to Message 41494.  

> I think it was a problem with those FOLD_AND_DOCK_SUBSYSTEM WUs. I just aborted all of mine, and got new units with a different name. Now everything is working fine.


It was, indeed a FOLD_AND_DOCK_SUBSYSTEM WU, although at least one FOLD_AND_DOCK finished satisfactorily. Since then, several more WUs started and uploaded. I looked at the log and there were the two lines about this WU when it started. Nothing about ever pausing it. Oh well, the CPU ate it. I hope it enjoyed it.
ID: 41495 · Rating: 0
raccoonone
Joined: 10 May 06
Posts: 9
Credit: 335,371
RAC: 0
Message 41498 - Posted: 26 May 2007, 20:29:10 UTC - in response to Message 41495.  

>> I think it was a problem with those FOLD_AND_DOCK_SUBSYSTEM WUs. I just aborted all of mine, and got new units with a different name. Now everything is working fine.

> It was, indeed a FOLD_AND_DOCK_SUBSYSTEM WU, although at least one FOLD_AND_DOCK finished satisfactorily. Since then, several more WUs started and uploaded. I looked at the log and there were the two lines about this WU when it started. Nothing about ever pausing it. Oh well, the CPU ate it. I hope it enjoyed it.


Ya, I've had some finish just fine too. But I think that most of them require > 750MB of RAM, and therefore they run into computation errors when boinc tries to throttle their memory usage.
ID: 41498 · Rating: 0
Greg_BE
Joined: 30 May 06
Posts: 5658
Credit: 5,633,150
RAC: 945
Message 41499 - Posted: 26 May 2007, 20:42:58 UTC

According to my list, I won't see the same WUs you are discussing for at least another day. I am running with only 512 MB of memory, so how is this going to work out? Is it going to use up all the physical memory and then try to blow out my virtual memory as well? You guys are all running big machines and having troubles, so I wonder how my small machine is going to act. Going to be interesting for sure.
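
For a rough sense of the numbers in this thread, here is a minimal Python sketch of the arithmetic. It is not how BOINC itself enforces memory preferences. `fits_in_memory` is a hypothetical helper, the 750 MB figure is the estimate raccoonone gave above, the 512 MB value is the machine described in this post, and the 1024 MB host with a 75% cap is an assumed example (the 75% setting mentioned earlier was for virtual memory; it is applied to RAM here purely for illustration).

```python
def fits_in_memory(task_mb, total_mb, max_pct=100.0):
    """Return True if a task needing task_mb fits within max_pct percent of total_mb."""
    budget_mb = total_mb * max_pct / 100.0
    return task_mb <= budget_mb


# Figures quoted in the thread (estimates, not measurements made here).
task_mb = 750        # raccoonone's estimate for the FOLD_AND_DOCK_SUBSYSTEM WUs
small_host_mb = 512  # physical memory on the machine described in this post

# 750 MB does not fit in 512 MB of RAM.
print(fits_in_memory(task_mb, small_host_mb))       # False

# Assumed example host: 1024 MB with a 75% cap gives a 768 MB budget.
print(fits_in_memory(task_mb, 1024, max_pct=75.0))  # True
```

If the estimate holds, a 512 MB machine could only run such a unit by leaning on virtual memory, which is where the slowdowns reported in this thread would come from.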
ID: 41499 · Rating: 0
Rhiju
Volunteer moderator
Joined: 8 Jan 06
Posts: 223
Credit: 3,546
RAC: 0
Message 41504 - Posted: 26 May 2007, 21:16:15 UTC - in response to Message 41498.  
Last modified: 26 May 2007, 21:17:43 UTC

Hi Everybody:
Sorry for checking in a little late on this thread. I'm a bit puzzled that the FOLD_AND_DOCK_SUBSYSTEM workunits are taking up so much memory, but I've canceled all those jobs, and won't send any more out until we reduce the memory requirement! Apologies! Thanks for posting so quickly about the problem. It wasn't apparent on ralph.

Also: if you have one of these workunits in your queue, please feel free to cancel it rather than risk a system slowdown due to virtual memory problems.


>>> I think it was a problem with those FOLD_AND_DOCK_SUBSYSTEM WUs. I just aborted all of mine, and got new units with a different name. Now everything is working fine.

>> It was, indeed a FOLD_AND_DOCK_SUBSYSTEM WU, although at least one FOLD_AND_DOCK finished satisfactorily. Since then, several more WUs started and uploaded. I looked at the log and there were the two lines about this WU when it started. Nothing about ever pausing it. Oh well, the CPU ate it. I hope it enjoyed it.

> Ya, I've had some finish just fine too. But I think that most of them require > 750MB of RAM, and therefore they run into computation errors when boinc tries to throttle their memory usage.


ID: 41504 · Rating: 0
Astro
Joined: 2 Oct 05
Posts: 987
Credit: 500,253
RAC: 0
Message 41506 - Posted: 26 May 2007, 21:21:00 UTC
Last modified: 26 May 2007, 21:21:17 UTC

Will they still be of use if we finish the ones we have? Does the same apply if we're still using 5.64 on them? Will they be credited?
ID: 41506 · Rating: 9.9920072216264E-15
MattDavis
Joined: 22 Sep 05
Posts: 206
Credit: 1,377,748
RAC: 0
Message 41507 - Posted: 26 May 2007, 21:21:51 UTC

Bleh 3 computers had this problem -_-
ID: 41507 · Rating: -3
B-Roy
Joined: 26 Sep 05
Posts: 26
Credit: 43,230
RAC: 4
Message 41508 - Posted: 26 May 2007, 22:31:15 UTC

First time I've seen such a huge impact of BOINC on my system performance; it all just collapsed. Good to see that countermeasures were taken.

<core_client_version>5.8.16</core_client_version>
<![CDATA[
<message>
- exit code -529697949 (0xe06d7363)
</message>
<stderr_txt>
# cpu_run_time_pref: 10800


Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Out Of Memory (C++ Exception) (0xe06d7363) at address 0x7C812A5B

Engaging BOINC Windows Runtime Debugger...
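
As a side note on the exit code in the log above: -529697949 and 0xe06d7363 are the same 32-bit value, read as signed and unsigned respectively. A minimal Python sketch of the conversion follows; the helper name `to_unsigned32` is made up for illustration. The low three bytes spell "msc" in ASCII, which matches the code Microsoft's C++ runtime uses when raising C++ exceptions, consistent with the "C++ Exception" reason shown in the log.

```python
def to_unsigned32(value):
    """Reinterpret a signed 32-bit integer as its unsigned equivalent."""
    return value & 0xFFFFFFFF


exit_code = -529697949                 # signed form, as printed in the log
unsigned = to_unsigned32(exit_code)
hex_code = f"0x{unsigned:08x}"

# The low three bytes are printable ASCII characters.
tag = (unsigned & 0xFFFFFF).to_bytes(3, "big").decode("ascii")

print(hex_code)  # 0xe06d7363
print(tag)       # msc
```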


ID: 41508 · Rating: 0
Mike Gelvin
Joined: 7 Oct 05
Posts: 65
Credit: 10,612,039
RAC: 0
Message 41511 - Posted: 26 May 2007, 22:49:14 UTC

I have a work unit that seems to be "almost" stuck at 97% complete. The % complete has been slowly increasing (by about .4%) over the last two hours. I have work units set to complete in 4 hours, and we are going over 6 with this one. It is wuid=74278854

ID: 41511 · Rating: 0
MattDavis
Joined: 22 Sep 05
Posts: 206
Credit: 1,377,748
RAC: 0
Message 41512 - Posted: 26 May 2007, 23:19:17 UTC - in response to Message 41511.  

> I have a work unit that seems to be "almost" stuck at 97% complete. The % complete has been slowly increasing (by about .4%) over the last two hours. I have work units set to complete in 4 hours, and we are going over 6 with this one. It is wuid=74278854


That's a special kind of unit with HUGE decoys. That behavior is normal.
ID: 41512 · Rating: 0
Greg_BE
Joined: 30 May 06
Posts: 5658
Credit: 5,633,150
RAC: 945
Message 41516 - Posted: 27 May 2007, 8:29:50 UTC - in response to Message 41512.  

I've had the 1gidA WUs a few times. You will see that the graph moves slowly, if at all, and then suddenly, the next time you check 10 minutes or so later, there is a big burst of data shown in the graphics. These WUs do not behave like the others. Just let them run their course and they will finish. Depending on your computer, your RAC will drop as well. My system is slow to process these too. But that's just the luck of the draw when it comes to WUs.

>> I have a work unit that seems to be "almost" stuck at 97% complete. The % complete has been slowly increasing (by about .4%) over the last two hours. I have work units set to complete in 4 hours, and we are going over 6 with this one. It is wuid=74278854

> That's a special kind of unit with HUGE decoys. That behavior is normal.


ID: 41516 · Rating: 0
Doug Worrall
Joined: 19 Sep 05
Posts: 60
Credit: 58,445
RAC: 0
Message 41521 - Posted: 27 May 2007, 10:40:39 UTC - in response to Message 41511.  

> I have a work unit that seems to be "almost" stuck at 97% complete. The % complete has been slowly increasing (by about .4%) over the last two hours. I have work units set to complete in 4 hours, and we are going over 6 with this one. It is wuid=74278854

Yup,
I also have my w/u size set at 2 hours crunch time. The LARGE 1 decoy w/u stop around 10 minute 1 sec. These w/u are great for Rosie, and receive more credit. I have a w/u moving very slowly and at 97.2% done; yes, it will end suddenly. I have graphics disabled for better performance. GL and Happy Crunching.
Here is one here:
https://boinc.bakerlab.org/rosetta/result.php?resultid=82460229
Doug
ID: 41521 · Rating: 0
chango369
Joined: 5 May 07
Posts: 10
Credit: 329,311
RAC: 0
Message 41527 - Posted: 27 May 2007, 15:07:59 UTC - in response to Message 41403.  

> Hi all. This version has been tested a lot on ralph, but please let us know if you see anything unusual. Thanks for all your posts on 5.64 before, too!

Everything appears to be running smoothly now.
ID: 41527 · Rating: 0
Ai-Leng
Joined: 14 Oct 06
Posts: 8
Credit: 4,715
RAC: 0
Message 41541 - Posted: 28 May 2007, 2:24:00 UTC - in response to Message 41511.  

> I have a work unit that seems to be "almost" stuck at 97% complete. The % complete has been slowly increasing (by about .4%) over the last two hours. I have work units set to complete in 4 hours, and we are going over 6 with this one. It is wuid=74278854


I have the same situation but I'm not overly concerned. The wu does get finished even if it does take a while at the end.
ID: 41541 · Rating: 0
[HWU] GHz
Joined: 4 Oct 05
Posts: 3
Credit: 366,762
RAC: 0
Message 41549 - Posted: 28 May 2007, 11:56:50 UTC
Last modified: 28 May 2007, 11:57:20 UTC

ID: 41549 · Rating: 9.9920072216264E-15



©2024 University of Washington
https://www.bakerlab.org