Report Problems with Rosetta Version 5.24

Moderator9
Volunteer moderator
Project administrator

Joined: 22 Jan 06
Posts: 1014
Credit: 0
RAC: 0
Message 19081 - Posted: 21 Jun 2006, 20:53:42 UTC

This thread is for reporting problems with Rosetta Version 5.24
Moderator9
ROSETTA@home FAQ
Moderator Contact
ID: 19081 · Rating: 0
Dimitris Hatzopoulos

Joined: 5 Jan 06
Posts: 336
Credit: 80,939
RAC: 0
Message 19093 - Posted: 22 Jun 2006, 0:07:27 UTC

Not a bug, but I noticed a v5.24 WU FRA_t298_hom001_5_IGNORE_THE_REST_dec22.pdb_747_47_0

which has a working set (RAM usage) of almost 300MB (298) and 837MB virtual.

Plus the data files for this WU are ~11.5MB.

It reminds me that Rosetta could really use a BigWU flag, and of my concern that with such WUs we might lose folks with 512MB RAM (who run other software in addition to crunching for Rosetta) and/or dialup.
Best UFO Resources
Wikipedia R@h
How-To: Join Distributed Computing projects that benefit humanity
ID: 19093 · Rating: 0
Moderator9
Volunteer moderator
Project administrator

Joined: 22 Jan 06
Posts: 1014
Credit: 0
RAC: 0
Message 19095 - Posted: 22 Jun 2006, 0:43:37 UTC - in response to Message 19093.  

Not a bug, but I noticed a v5.24 WU FRA_t298_hom001_5_IGNORE_THE_REST_dec22.pdb_747_47_0

which has a working set (RAM usage) of almost 300MB (298) and 837MB virtual.

Plus the data files for this WU are ~11.5MB.

It reminds me that Rosetta could really use a BigWU flag, and of my concern that with such WUs we might lose folks with 512MB RAM (who run other software in addition to crunching for Rosetta) and/or dialup.

You are correct that this is not a bug. However, the large memory test you have mentioned has already been shown not to be the answer. In fact BOINC will use virtual memory to compensate for memory issues. Moreover, the BOINC scheduler will allow other programs to run and yield the processor to them as necessary.

But if you read the version 5.22 problem reporting thread, you will see many reports from people who were upset that they were denied work because of the size of a work unit. During CASP large work units are just part of the science that is being run.

ID: 19095 · Rating: 0
David E K
Volunteer moderator
Project administrator
Project developer
Project scientist

Joined: 1 Jul 05
Posts: 1016
Credit: 3,845,019
RAC: 976
Message 19096 - Posted: 22 Jun 2006, 0:50:53 UTC

Our work unit generator now alternates between high memory and standard jobs to prevent filling the queue with just high memory ones. This will allow users with low memory machines to keep getting work.
ID: 19096 · Rating: 0
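
A minimal sketch of the alternation described in the post above, assuming the generator simply interleaves two lists of pending jobs. The function name and job labels are hypothetical; the actual Rosetta@home work unit generator is not shown in this thread.

```python
from itertools import zip_longest


def interleave_jobs(high_memory, standard):
    """Alternate standard and high-memory jobs, keeping any leftovers in order."""
    skip = object()  # sentinel marking the exhausted (shorter) list
    ordered = []
    for std_job, big_job in zip_longest(standard, high_memory, fillvalue=skip):
        for job in (std_job, big_job):
            if job is not skip:
                ordered.append(job)
    return ordered


# Hypothetical job labels, for illustration only.
queue = interleave_jobs(["H1", "H2"], ["S1", "S2", "S3"])
print(queue)
```

With this ordering a low-memory host never has to wait behind a long run of high-memory jobs, which is the effect the post describes.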
Bin Qian

Joined: 13 Jul 05
Posts: 33
Credit: 36,897
RAC: 0
Message 19099 - Posted: 22 Jun 2006, 1:05:58 UTC - in response to Message 19096.  


This protein is one of the largest CASP targets we have run on R@h - 336 residues. We've marked these WUs as high memory jobs and they will only be sent to machines with more than 512M memory. As David Kim has said in his post, the queueing system will make sure there are always low memory jobs available for all the users.
ID: 19099 · Rating: 0
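
As a rough illustration of the rule stated in the post above, assuming each job carries a hypothetical `high_memory` flag and the host's RAM is known in megabytes. The real BOINC scheduler logic is not shown in this thread, so the names here are assumptions.

```python
HIGH_MEMORY_THRESHOLD_MB = 512  # figure quoted in the post


def eligible_jobs(host_ram_mb, jobs):
    """Return the jobs a host may receive.

    High-memory jobs require strictly more than 512 MB of RAM;
    standard jobs are always eligible.
    """
    return [
        job
        for job in jobs
        if not job["high_memory"] or host_ram_mb > HIGH_MEMORY_THRESHOLD_MB
    ]


# Hypothetical job records, for illustration only.
jobs = [
    {"name": "standard_job", "high_memory": False},
    {"name": "large_casp_target", "high_memory": True},
]
print([job["name"] for job in eligible_jobs(256, jobs)])
print([job["name"] for job in eligible_jobs(1024, jobs)])
```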
Dimitris Hatzopoulos

Joined: 5 Jan 06
Posts: 336
Credit: 80,939
RAC: 0
Message 19100 - Posted: 22 Jun 2006, 1:21:24 UTC - in response to Message 19099.  


We've marked these WUs as high memory jobs and they will only be sent to machines with more than 512M memory.


I've not kept up with BOINC server developments, but I think that currently such a high-memory job will still be sent to, e.g., a PC with only 256MB RAM. Only AFTER that WU has been downloaded (which might take 45 min to 1 hr for a dialup user) will the local BOINC client notice it's not suitable and dump it.

Personally, I don't mind, as I have fast Internet and a few months ago I upgraded all my PCs with extra RAM "for Rosetta".

But from the perspective of a dialup user... waiting 45 min to 1 hr to download a 12MB WU and then seeing it aborted? He wouldn't be very happy.
ID: 19100 · Rating: 0
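
The download-time estimate in the post above can be checked with a back-of-the-envelope calculation. This is only a sketch: the modem rates are assumed nominal dialup speeds (not taken from the thread), and protocol overhead is ignored.

```python
def download_minutes(size_mb, line_rate_kbit_s):
    """Minutes needed to transfer size_mb megabytes at line_rate_kbit_s kilobits/s."""
    size_bits = size_mb * 1024 * 1024 * 8
    return size_bits / (line_rate_kbit_s * 1000) / 60


# 12 MB is the WU size quoted in the post; the rates are assumptions.
for rate in (56.0, 33.6):
    print(f"{rate} kbit/s: about {download_minutes(12, rate):.0f} minutes")
```

Under these assumptions the transfer takes roughly half an hour at 56 kbit/s and about 50 minutes at 33.6 kbit/s, so the 45 min to 1 hr figure is plausible for a line running below its nominal rate.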
Ananas

Joined: 1 Jan 06
Posts: 232
Credit: 752,471
RAC: 0
Message 19103 - Posted: 22 Jun 2006, 5:28:41 UTC

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x008C0DD1 write attempt to address 0x073CEA1C

resultid=25245144

The box has only 256MB RAM, which could be the reason.

Using no graphics, Rosetta shared the host with SIMAP, which has fairly low RAM requirements.
ID: 19103 · Rating: 0
AMD_is_logical

Joined: 20 Dec 05
Posts: 299
Credit: 31,460,681
RAC: 0
Message 19106 - Posted: 22 Jun 2006, 5:52:42 UTC

The WU t316__CASP7_JUMPABINITIO_SAVE_ALL_OUT_BARCODE_250to373_hom008__737_139_0 hung after it finished crunching. After some hours, I stopped and restarted BOINC, and the WU immediately finished and uploaded.
ID: 19106 · Rating: 0
Tom Philippart

Joined: 29 May 06
Posts: 183
Credit: 834,667
RAC: 0
Message 19114 - Posted: 22 Jun 2006, 13:18:01 UTC

I had 2 computing errors in 24h :(

Shall I specify the WUs?
ID: 19114 · Rating: 0
tralala

Joined: 8 Apr 06
Posts: 376
Credit: 581,806
RAC: 0
Message 19117 - Posted: 22 Jun 2006, 14:16:21 UTC
Last modified: 22 Jun 2006, 14:20:30 UTC

Yesterday this WU was running for 40 minutes when I had to shut my computer down.
Although checkpointing is enabled for this WU and I found a farlxcheck, the WU started from the beginning after I restarted my computer today.

@Dimitri

The big WUs are only sent to machines with 512 MB RAM and above. I'm glad they finally decided to use this BOINC feature to send larger jobs to higher-spec machines. My current WU does use about 300 MB RAM, and I'm happy that my 1 GB RAM is of some use (but I'm not happy that checkpointing still does not work smoothly). Perhaps an announcement of this procedure would be in order in the announcement thread.
ID: 19117 · Rating: 1
Dimitris Hatzopoulos

Joined: 5 Jan 06
Posts: 336
Credit: 80,939
RAC: 0
Message 19120 - Posted: 22 Jun 2006, 16:01:39 UTC - in response to Message 19117.  

@Dimitri

The big WUs are only sent to machines with 512 MB RAM and above. I'm glad they finally decided to use this BOINC feature to send larger jobs to higher-spec machines.


Mea culpa, I thought it worked as I described below. I haven't looked into the sources for quite some time.

ID: 19120 · Rating: 0
Amos Jeffries

Joined: 14 Dec 05
Posts: 3
Credit: 244,222
RAC: 0
Message 19145 - Posted: 23 Jun 2006, 3:52:52 UTC - in response to Message 19120.  

I'm having trouble downloading the 5.24 application; the link just hangs. Any help would be appreciated.

The BOINC manager (win32) and boinc-client (linux) grab about 95% of some files and then fall into a resume loop. I noticed this ~2 days ago. I reset the project 24 hours ago, with no change except that it restarted all downloads from the beginning and hangs in the same place again.

It downloads the small support files and WU files okay; it's just the new application one.

A manual test using wget does the same thing:

"
>wget -c http://boinc.bakerlab.org/rosetta/download/rosetta_5.24_i686-pc-linux-gnu

--15:03:04-- http://boinc.bakerlab.org/rosetta/download/rosetta_5.24_i686-pc-linux-gnu
=> `rosetta_5.24_i686-pc-linux-gnu'
Resolving boinc.bakerlab.org... 140.142.20.103
Connecting to boinc.bakerlab.org|140.142.20.103|:80... connected.
HTTP request sent, awaiting response... 206 Partial Content
Length: 9,705,772 (9.3M), 424,092 (414K) remaining [application/octet-stream]

95% [+++++++++++++++++++++++++++++++ ] 9,281,680 --.--K/s

"

The Windows application hangs at a different size, but with the same consistency:

"
wget -c http://boinc.bakerlab.org/rosetta/download/rosetta_5.24_windows_intelx86.exe
--15:34:45-- http://boinc.bakerlab.org/rosetta/download/rosetta_5.24_windows_intelx86.exe
=> `rosetta_5.24_windows_intelx86.exe'
Resolving boinc.bakerlab.org... 140.142.20.103
Connecting to boinc.bakerlab.org|140.142.20.103|:80... connected.
HTTP request sent, awaiting response... 206 Partial Content
Length: 7,245,824 (6.9M), 642,944 (628K) remaining [application/octet-stream]

91% [++++++++++++++++++++++++++++++++++ ] 6,602,880 --.--K/s
"

ID: 19145 · Rating: 0
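
The figures in the two wget transcripts above can be cross-checked with a little arithmetic. A small sketch (the helper name is hypothetical) that derives the already-downloaded byte count and whole-number percentage from the "Length" and "remaining" values:

```python
def resume_progress(total_bytes, remaining_bytes):
    """Return (bytes already on disk, whole-number percent complete)."""
    done = total_bytes - remaining_bytes
    return done, 100 * done // total_bytes


# Values copied from the two wget transcripts in the post.
print(resume_progress(9_705_772, 424_092))  # Linux binary
print(resume_progress(7_245_824, 642_944))  # Windows binary
```

Both results match the byte counters and percentages wget printed (9,281,680 at 95% and 6,602,880 at 91%), which is consistent with no new data arriving after each resume.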
Mike Gelvin

Joined: 7 Oct 05
Posts: 65
Credit: 10,612,039
RAC: 0
Message 19147 - Posted: 23 Jun 2006, 5:42:47 UTC

Bombed out due to high disk usage

25277738

ID: 19147 · Rating: 0
cnick6

Joined: 30 May 06
Posts: 25
Credit: 6,543,224
RAC: 0
Message 19148 - Posted: 23 Jun 2006, 5:59:52 UTC

Hi Everyone --

I haven't had time to really investigate my problem, but has anyone seen issues with 5.24 where the Windows screensaver will just hang? My disk activity gets really intense but my system will never come back to the desktop. The system is not "locked" as my Numlock key still lights up. Could there be some kind of memory leak with 5.24?

I can't even CTRL-ALT-DEL to task manager. I have to hard reset my machine.

I didn't see this at all with version 5.22 or previous versions.

My work units appear to be completing with 'success'...


Thanks
-Nick
ID: 19148 · Rating: 0
Astro

Joined: 2 Oct 05
Posts: 987
Credit: 500,253
RAC: 0
Message 19157 - Posted: 23 Jun 2006, 11:27:58 UTC - in response to Message 19148.  

Hi Everyone --

I haven't had time to really investigate my problem, but has anyone seen issues with 5.24 where the Windows screensaver will just hang? My disk activity gets really intense but my system will never come back to the desktop. The system is not "locked" as my Numlock key still lights up. Could there be some kind of memory leak with 5.24?

I can't even CTRL-ALT-DEL to task manager. I have to hard reset my machine.

I didn't see this at all with version 5.22 or previous versions.

My work units appear to be completing with 'success'...


Thanks
-Nick

Nick, does the graphic seem like it's just locked and won't be released so Windows can resume using it? Does it seem like you can still interact with whatever screen is right below the graphic, but the graphic just doesn't go away? Like, one time I had a window open that had one of those "shoot the duck and win $1000 in grocery coupons" games, and when I clicked the mouse, I could hear the gunshots and see the HD activity light as I was being swept away to a new window, since my aim was good and I shot the duck (even though I could only see the Rosetta graphic). One night I saw this had happened, but didn't "hard reset" my laptop, and the next morning the Rosetta graphic had disappeared, only to be replaced by the Seti graphic. I still couldn't get my Windows desktop back. I ended up "powering down". I've seen this happen twice; both times Rosetta was the first screen to appear. I've reported this to Rom Walton.

I'm curious to see if you're the second person to see this.

tony

ID: 19157 · Rating: 0
mewbysea

Joined: 29 Jan 06
Posts: 17
Credit: 10,374,832
RAC: 3,807
Message 19160 - Posted: 23 Jun 2006, 12:34:42 UTC
Last modified: 23 Jun 2006, 12:35:17 UTC

Hi folks,

Just wanted to point out this WU 21185008, which failed on two computers. On the computer running Rosetta 5.22, the auto debug kicked in; on the one running version 5.24, the auto debug failed.
ID: 19160 · Rating: 0
tralala

Joined: 8 Apr 06
Posts: 376
Credit: 581,806
RAC: 0
Message 19164 - Posted: 23 Jun 2006, 15:06:35 UTC - in response to Message 19147.  

Bombed out due to high disk usage

25277738


It failed after 0 seconds. This seems to be a problem with your disk space and your general BOINC settings. Free up some space on your disk or allow more disk usage in your general preferences.
ID: 19164 · Rating: 0
cnick6

Joined: 30 May 06
Posts: 25
Credit: 6,543,224
RAC: 0
Message 19169 - Posted: 23 Jun 2006, 15:47:43 UTC - in response to Message 19157.  

Tony, yes, it sounds very similar, but Rosetta will never recover. Rosetta seems to keep running (so the screensaver is always active) -- but I cannot return to the desktop whatsoever. As soon as I try to wake up the machine, the hard disk light is more or less lit up the entire time.

I've let it sit for several hours and it will never come back.

ID: 19169 · Rating: 0
DanSpitz

Joined: 5 Jun 06
Posts: 2
Credit: 15,967
RAC: 0
Message 19186 - Posted: 23 Jun 2006, 22:04:30 UTC - in response to Message 19021.  

Version 5.24 is up:

1. The symbol store that we put in 5.22 was not properly activated, so we're giving it another shot.

2. We can now use prior predictions for which parts of the chain are buried or exposed to guide the Rosetta search.

3. We can efficiently assemble predefined domains of the protein chain into a whole structure.



ID: 19186 · Rating: 0
Bob Guy

Joined: 7 Oct 05
Posts: 39
Credit: 24,895
RAC: 0
Message 19190 - Posted: 24 Jun 2006, 0:34:57 UTC
Last modified: 24 Jun 2006, 0:43:13 UTC

This WU 21358231 crashed when restarting. I have 'leave in memory' turned on but this was after closing and restarting BOINC. Crashed immediately (well, it ran for 12 minutes) upon restarting. Graphics never opened for this WU - not a graphics problem. Error was the -1073741819 (0xc0000005) error.

<Edit for the error code and run time.>
ID: 19190 · Rating: 0
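
The two forms of the error code in the post above are the same 32-bit value, shown once as a signed decimal and once as unsigned hex. A minimal sketch of the conversion (the helper name is hypothetical):

```python
def exit_code_to_hex(code):
    """Reinterpret a signed 32-bit exit code as an unsigned hex string."""
    return f"0x{code & 0xFFFFFFFF:08x}"


print(exit_code_to_hex(-1073741819))  # 0xc0000005
```

0xC0000005 is the Windows access-violation status, the same code that appears in the "Unhandled Exception Record" reported earlier in this thread.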