Report problems with Rosetta version 5.32

Conan
Joined: 11 Oct 05
Posts: 141
Credit: 3,132,431
RAC: 0
Message 30050 - Posted: 26 Oct 2006, 11:58:11 UTC

>>> The freezing of the Rosetta and Ralph work units is definitely a screensaver problem. I have Rosetta running on 7 machines, all bar 2 of which do not have graphics (2 are Linux, 3 are XP installed as services, and 2 are XP installed with default user settings).
>>> The 2 machines that use graphics are the only 2 machines to have any problems, whether it be the WU becoming stuck or returning some 'access violation' or 'exit code' with no cause.

These are a few more that stuck then errored out:-
http://boinc.bakerlab.org/rosetta/result.php?resultid=43600036
http://boinc.bakerlab.org/rosetta/result.php?resultid=42883408
http://boinc.bakerlab.org/rosetta/result.php?resultid=42889015
http://boinc.bakerlab.org/rosetta/result.php?resultid=42888958

These 2 had Access Violations:-
at address 0x0076D4FD read attempt to address 0x00000012 on result id 43369182
at address 0x0076D507 read attempt to address 0x00000011 on result id 42889050

And these 2 came up as invalid with no real error, just 'exit code 1073807364 (0x40010004)':
http://boinc.bakerlab.org/rosetta/result.php?resultid=42888965
http://boinc.bakerlab.org/rosetta/result.php?resultid=42888964
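
For anyone comparing the two notations of that exit code, here is a small Python sketch that checks the decimal and hexadecimal forms agree and splits the value into bit fields. It assumes the value can be read with the Windows NTSTATUS layout (severity, customer flag, facility, code). That interpretation, and the helper name `decode_status`, are illustrative assumptions and do not come from the Rosetta or BOINC output.

```python
def decode_status(value: int) -> dict:
    """Split a 32-bit status value into NTSTATUS-style bit fields.

    Assumed layout (not confirmed by the BOINC logs):
    bits 31-30 severity, bit 29 customer flag,
    bits 27-16 facility, bits 15-0 code.
    """
    value &= 0xFFFFFFFF  # keep only the low 32 bits
    return {
        "hex": f"0x{value:08X}",
        "severity": (value >> 30) & 0x3,
        "customer": (value >> 29) & 0x1,
        "facility": (value >> 16) & 0xFFF,
        "code": value & 0xFFFF,
    }


if __name__ == "__main__":
    exit_code = 1073807364  # decimal value reported for the invalid results
    # The hex form quoted next to it should be the same number.
    assert exit_code == 0x40010004
    for name, field in decode_status(exit_code).items():
        print(f"{name}: {field}")
```

Under that assumed layout the value has severity 1 (informational), facility 1 and code 4. In the Windows headers this number appears to match DBG_TERMINATE_PROCESS, which would suggest the process was terminated from outside rather than crashing on its own, but treat that as a hint and not a diagnosis.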

Hope this can help, as it is limiting my output: when the screen stops doing anything, you find that the CPUs are not doing anything either. On one machine, when this happened on 19/10, the computer then did nothing until I came back from a break on 24/10: 5 days of lost production.
ID: 30050 · Rating: 0
Feet1st
Joined: 30 Dec 05
Posts: 1740
Credit: 3,655,614
RAC: 4
Message 30054 - Posted: 26 Oct 2006, 14:22:31 UTC

Looks like this one didn't survive a normal File -> Exit of BOINC.
If having a DC project with BOINC, using volunteer or cloud computing resources, is of interest to you, but you have no time for the BOINC learning curve,
use a hosting service that understands BOINC projects: http://DeepSci.com
ID: 30054 · Rating: 0
Keith Akins
Joined: 22 Oct 05
Posts: 176
Credit: 71,779
RAC: 0
Message 30063 - Posted: 26 Oct 2006, 16:42:48 UTC

My problems didn't start until 5.32. Suddenly, my system went from almost error-free crunching to a variety of errors, from read access violations to watchdog shutdowns to solid freezes.

I tried Linux (dual boot) for a while. The first three crunched perfectly. The next thing I know, I get a WU halted at 100%, my CPU fan slows to idle, and the WU will not upload. I've heard of this happening before.

It is quite unlikely that a hardware failure, such as a memory address going bad, would occur just as we upgrade to 5.32, and equally unlikely that a bad memory address would cause such a variety of errors. Why would this not happen under Linux? I had memory 70% full just to see.

I hope 5.34 goes better.

ID: 30063 · Rating: 0
River~~
Joined: 15 Dec 05
Posts: 761
Credit: 285,578
RAC: 0
Message 30136 - Posted: 27 Oct 2006, 19:37:40 UTC - in response to Message 30063.  
Last modified: 27 Oct 2006, 19:40:13 UTC

... next thing I know I get a WU halted at 100%, my cpu fan slows to idle and the WU will not upload....


I have had this on both Linux (cli) and Win2k boxes. As you say, only with 5.32.

When this has happened to me, the result uploaded OK once BOINC had been stopped and restarted. If that did not work, the next thing I'd try would be rebooting the OS.

River~~
ID: 30136 · Rating: 0
FluffyChicken
Joined: 1 Nov 05
Posts: 1260
Credit: 317,557
RAC: 0
Message 30321 - Posted: 30 Oct 2006, 21:25:06 UTC

I don't know if anyone has pointed this out before, but since some of these errors seem to invoke the debugger (most don't), maybe they should start sending out the .pdb files again for users who would like to help debug.

Here
http://boinc.berkeley.edu/app_debug_win.php
is the start of a list Rom Walton created while debugging up to the last stable version of Rosetta@Home (i.e. before the docking).
So maybe some others can look into what's causing the failures on some of them.
Team mauisun.org
ID: 30321 · Rating: 0
Chu
Joined: 23 Feb 06
Posts: 120
Credit: 112,439
RAC: 0
Message 30383 - Posted: 31 Oct 2006, 18:00:31 UTC - in response to Message 30321.  
Last modified: 31 Oct 2006, 18:01:03 UTC

Hi FluffyChicken, thanks for pointing this out. Actually, with the help of Rom Walton, we created a symbol store for Rosetta@Home here, and we have added pdb symbols for all the recent R@H updates. The path information has been defined so that the client should be able to download the pdb symbols automatically. This has been working for 5.32. However, I just looked at our symstore and found that the file permissions were not set properly for 5.34 and some newer apps on RALPH. I just corrected the problem, and hopefully the pdb symbols will be loaded again if an error occurs. Thanks again for catching it.
ID: 30383 · Rating: 0


©2019 University of Washington
http://www.bakerlab.org