Report Problems with Rosetta Version 5.25

Message boards : Number crunching : Report Problems with Rosetta Version 5.25

mnb
Joined: 15 Dec 05
Posts: 51
Credit: 69,458
RAC: 0
Message 19923 - Posted: 8 Jul 2006, 11:07:57 UTC - in response to Message 19909.  
Last modified: 8 Jul 2006, 11:13:01 UTC

I have the strangest problem with 5.25. According to Zone Alarm, it's trying to access the internet on its own: it says Rosetta_5.25_windows_intelx86.exe is trying to access the internet. It doesn't matter whether I allow that or not; the work unit stops running at the same time.

If I suspend that work unit the results page will show client error for that work unit. If I close Boinc and restart it again the work unit starts to compute again from a previous check point, but soon it will halt again. I've had a few blue screens as well and had to reboot.

For the record: 5.25 is the first version I've had any problems with.

list of my results
TCU Computer Science
Joined: 7 Dec 05
Posts: 28
Credit: 12,861,977
RAC: 0
Message 19944 - Posted: 9 Jul 2006, 5:49:57 UTC - in response to Message 19744.  

Here is another one:

t329__MAPBACK_CLUSTER02_CASP7_ABRELAX_SAVE_ALL_OUT_CONTACT_ncap_hom001__826_17779
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=22976377


It was running for four days.
The messages show it being paused and resumed at one hour intervals.
But the accumulated time was stuck at 7 hr 48 mins.

After I stopped and restarted BOINC, the accumulated time began increasing, and the work unit finished normally about 30 minutes later.

This occurred on a Linux box different from the previous one.
R.L. Casey
Joined: 7 Jun 06
Posts: 91
Credit: 2,728,885
RAC: 0
Message 19971 - Posted: 10 Jul 2006, 0:39:42 UTC

PILOT ERROR... LOST WUs

I have apparently "lost" a number of WUs, all sent on July 4. These will not report from host Computer ID "246746". These WUs are as follows:

ResultID WUID Time Sent Deadline
======== ======== ====================== =======================
26988429 22969795 4 Jul 2006 6:20:24 UTC 11 Jul 2006 6:20:24 UTC
26988412 22969778 4 Jul 2006 6:20:24 UTC 11 Jul 2006 6:20:24 UTC
26988411 22969777 4 Jul 2006 6:20:24 UTC 11 Jul 2006 6:20:24 UTC
26988118 22969486 4 Jul 2006 6:16:16 UTC 11 Jul 2006 6:16:16 UTC
26988077 22969445 4 Jul 2006 6:16:16 UTC 11 Jul 2006 6:16:16 UTC
26988076 22969444 4 Jul 2006 6:16:16 UTC 11 Jul 2006 6:16:16 UTC
26987457 22968903 4 Jul 2006 6:10:09 UTC 11 Jul 2006 6:10:09 UTC
26987456 22968902 4 Jul 2006 6:10:09 UTC 11 Jul 2006 6:10:09 UTC
26987443 22968889 4 Jul 2006 6:10:09 UTC 11 Jul 2006 6:10:09 UTC
26986960 22968483 4 Jul 2006 6:04:04 UTC 11 Jul 2006 6:04:04 UTC
26986945 22968468 4 Jul 2006 6:04:04 UTC 11 Jul 2006 6:04:04 UTC
26986915 22968438 4 Jul 2006 6:04:04 UTC 11 Jul 2006 6:04:04 UTC
26986338 22967866 4 Jul 2006 5:57:57 UTC 11 Jul 2006 5:57:57 UTC
26986333 22967861 4 Jul 2006 5:57:57 UTC 11 Jul 2006 5:57:57 UTC
26986230 22967763 4 Jul 2006 5:57:57 UTC 11 Jul 2006 5:57:57 UTC
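
For anyone who wants to check a list like this in bulk, here is a minimal Python sketch (not from the original post) that parses rows in the layout above and confirms that each deadline falls exactly seven days after the send time. The function name `parse_row` is an illustrative assumption; only the column layout and the sample rows come from the table.

```python
from datetime import datetime, timedelta

# Timestamp layout used in the table, e.g. "4 Jul 2006 6:20:24"
FMT = "%d %b %Y %H:%M:%S"

def parse_row(line):
    """Split one table row into (result_id, wu_id, sent, deadline)."""
    parts = line.split()
    result_id, wu_id = int(parts[0]), int(parts[1])
    # Each timestamp is four tokens followed by the literal "UTC".
    sent = datetime.strptime(" ".join(parts[2:6]), FMT)
    deadline = datetime.strptime(" ".join(parts[7:11]), FMT)
    return result_id, wu_id, sent, deadline

rows = [
    "26988429 22969795 4 Jul 2006 6:20:24 UTC 11 Jul 2006 6:20:24 UTC",
    "26986230 22967763 4 Jul 2006 5:57:57 UTC 11 Jul 2006 5:57:57 UTC",
]

for row in rows:
    result_id, wu_id, sent, deadline = parse_row(row)
    # Every row in the table should show a seven-day window.
    print(result_id, wu_id, deadline - sent == timedelta(days=7))
```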

My regrets to all; I believe that this was a one-time occurrence! Happy crunching! :-)
Richard Cooper
Joined: 9 Jul 06
Posts: 1
Credit: 36,525
RAC: 0
Message 19995 - Posted: 10 Jul 2006, 10:55:02 UTC

I just loaded BOINC on a Linux system (SuSE 10.1, 2.6.16.13-4-smp) and started Rosetta. My first two work units both seem to have failed, although I got credit.

stderr from one result....

<core_client_version>5.4.9</core_client_version>
<stderr_txt>
Graphics are disabled due to configuration...
# random seed: 1553230
SIGSEGV: segmentation violation
Stack trace (15 frames):
[0x8849d1b]
[0x8861dcc]
[0xffffe500]
[0x8841478]
[0x863320c]
[0x861f8fb]
[0x87efda3]
[0x87382c1]
[0x87e5783]
[0x82c63a1]
[0x805f09d]
[0x846e09d]
[0x8470594]
[0x88c12b4]
[0x8048111]

Exiting...
SIGSEGV: segmentation violation
Stack trace (24 frames):
[0x8849d1b]
[0x8861dcc]
[0xffffe500]
[0x88e2066]
[0x88e2543]
[0x88b36d1]
[0x88b50f9]
[0x829f301]
[0x88c88bf]
[0x8849d66]
[0x8861dcc]
[0xffffe500]
[0x8841478]
[0x863320c]
[0x861f8fb]
[0x87efda3]
[0x87382c1]
[0x87e5783]
[0x82c63a1]
[0x805f09d]
[0x846e09d]
[0x8470594]
[0x88c12b4]
[0x8048111]

Exiting...
Graphics are disabled due to configuration...
# random seed: 1553230
# cpu_run_time_pref: 10800
*** glibc detected *** corrupted double-linked list: 0x09540f00 ***
SIGABRT: abort called
Stack trace (14 frames):
[0x8849d1b]
[0x8861dcc]
[0xffffe500]
[0x88c8374]
[0x88dd1ff]
[0x88e2376]
[0x88e2543]
[0x88b36d1]
[0x88b50f9]
[0x8278866]
[0x88c88bf]
[0x885910f]
[0x8863035]
[0x88f47ea]

Exiting...
Graphics are disabled due to configuration...
# cpu_run_time_pref: 10800
# DONE :: 1 starting structures built 11 (nstruct) times
# This process generated 11 decoys from 11 attempts


BOINC :: Watchdog shutting down...
BOINC :: BOINC support services shutting down...

</stderr_txt>
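
A dump like this is easier to compare across results once it is condensed. The following is a rough Python sketch (not part of the original report) that lists each signal line together with the frame count of the stack trace that follows it. The function name `summarize_stderr` and the two regular expressions are assumptions based only on the text shown above, not on any documented BOINC format.

```python
import re

# Patterns inferred from the stderr text above (an assumption, not a spec).
SIGNAL_RE = re.compile(r"^(SIG[A-Z]+): ")
FRAMES_RE = re.compile(r"^Stack trace \((\d+) frames\):")

def summarize_stderr(text):
    """Return a list of (signal_name, frame_count) in order of appearance."""
    events = []
    lines = text.splitlines()
    for i, line in enumerate(lines):
        sig = SIGNAL_RE.match(line)
        if not sig:
            continue
        frames = None
        # The frame count, when present, is on the line right after the signal.
        if i + 1 < len(lines):
            hit = FRAMES_RE.match(lines[i + 1])
            if hit:
                frames = int(hit.group(1))
        events.append((sig.group(1), frames))
    return events

sample = """# random seed: 1553230
SIGSEGV: segmentation violation
Stack trace (15 frames):
[0x8849d1b]
Exiting...
SIGABRT: abort called
Stack trace (14 frames):
[0x8849d1b]
"""
print(summarize_stderr(sample))
```

Run against the full output above, this would report two SIGSEGV entries followed by one SIGABRT.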

Richard Cooper
jaxom1
Joined: 5 Jun 06
Posts: 180
Credit: 1,586,889
RAC: 0
Message 20022 - Posted: 11 Jul 2006, 1:50:28 UTC
Last modified: 11 Jul 2006, 1:59:19 UTC

Just thought I would post these recent errors.

7/10/2006 7:25:16 PM|rosetta@home|Unrecoverable error for result t319__CASP7_JUMPRELAX_SAVE_ALL_OUT_BARCODE_CONTACT_hom013__739_9660_0 (Incorrect function. (0x1) - exit code 1 (0x1))

7/10/2006 7:25:26 PM|rosetta@home|Unrecoverable error for result t347__CASP7_ABRELAX_SAVE_ALL_OUT_6to90hom017__842_2583_0 (Incorrect function. (0x1) - exit code 1 (0x1))
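
For clients that are not looked at every day, lines like these can be picked out of a saved message log automatically. Below is a small Python sketch, offered as an illustration only: it assumes the `timestamp|project|text` layout seen in the two lines above, and the function name `parse_message` and the regular expression are my own, not part of BOINC.

```python
import re
from datetime import datetime

# Matches the error text seen above; the pattern is an assumption.
ERROR_RE = re.compile(r"Unrecoverable error for result (\S+) \((.*)\)$")

def parse_message(line):
    """Split a 'timestamp|project|text' line and extract error details."""
    stamp, project, text = line.split("|", 2)
    when = datetime.strptime(stamp, "%m/%d/%Y %I:%M:%S %p")
    hit = ERROR_RE.search(text)
    return {
        "when": when,
        "project": project,
        "result": hit.group(1) if hit else None,
        "reason": hit.group(2) if hit else None,
    }

line = ("7/10/2006 7:25:16 PM|rosetta@home|Unrecoverable error for result "
        "t319__CASP7_JUMPRELAX_SAVE_ALL_OUT_BARCODE_CONTACT_hom013__739_9660_0 "
        "(Incorrect function. (0x1) - exit code 1 (0x1))")
info = parse_message(line)
print(info["when"], info["result"], info["reason"])
```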


<core_client_version>5.4.9</core_client_version>
<stderr_txt>
WARNING! attempt to gzip file .aat329.out failed: file does not exist.
# DONE :: 1 starting structures built 0 (nstruct) times
# This process generated 0 decoys from 0 attempts
# 1 starting pdbs were skipped


BOINC :: Watchdog shutting down...
BOINC :: BOINC support services shutting down...

</stderr_txt>
<message>
<file_xfer_error>
<file_name>FRA_t329_CASP7_hom001_6_t329_6_2ah5A_IGNORE_THE_REST_1150_858_4_2_0</file_name>
<error_code>-161</error_code>
</file_xfer_error>

</message>



jaxom1
Joined: 5 Jun 06
Posts: 180
Credit: 1,586,889
RAC: 0
Message 20023 - Posted: 11 Jul 2006, 1:55:16 UTC

Also, noticed this in the messages section.

7/10/2006 7:26:54 PM|rosetta@home|Finished download of file truncate1_hom020_t349_forceand_kill_strands_fix.bar.gz

Anything going on that we should be worried about if we have clients that are not looked at on a daily basis?


Feet1st
Joined: 30 Dec 05
Posts: 1755
Credit: 4,690,520
RAC: 0
Message 20024 - Posted: 11 Jul 2006, 3:16:11 UTC

John, I have no way to be certain either, but thought I'd point out that so far as work unit naming goes, I wouldn't read too much into the word "fix" any more than the word "relax" or "jump".
Add this signature to your EMail:
Running Microsoft's "System Idle Process" will never help cure cancer, AIDS nor Alzheimer's. But running Rosetta@home just might!
https://boinc.bakerlab.org/rosetta/
_heinz
Joined: 30 Jun 06
Posts: 24
Credit: 38,697
RAC: 0
Message 20043 - Posted: 11 Jul 2006, 21:37:27 UTC

After some errors at the beginning, the project now runs fine; no further errors occurred on my machine.
Happy crunching :-)
Vester
Joined: 2 Nov 05
Posts: 257
Credit: 3,158,457
RAC: 9,572
Message 20064 - Posted: 12 Jul 2006, 10:42:37 UTC
Last modified: 12 Jul 2006, 10:43:22 UTC

The client still cannot endure a normal computer restart for Windows Updates (or other restarts) without forgetting where it was, returning an error, and starting a new job.

50,779.30 seconds lost on result ID 28003895 work ID 23917872.
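
To put that figure in more familiar units, a quick Python sketch (an illustration, with `split_duration` being a made-up helper name) converts a seconds count into hours, minutes, and seconds:

```python
def split_duration(seconds):
    """Break a duration in seconds into (hours, minutes, seconds)."""
    hours, rem = divmod(seconds, 3600)
    minutes, secs = divmod(rem, 60)
    return int(hours), int(minutes), secs

h, m, s = split_duration(50779.30)
print(f"{h} h {m} min {s:.1f} s")  # 14 h 6 min 19.3 s
```

So the lost work amounts to a little over 14 hours of CPU time.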
John Parsons
Joined: 3 Mar 06
Posts: 1
Credit: 10,831
RAC: 0
Message 20088 - Posted: 12 Jul 2006, 21:53:51 UTC
Last modified: 12 Jul 2006, 21:54:20 UTC

The screen saver freezes when a key is pressed or the mouse moves. This ONLY HAPPENS if Rosetta is running (it does not happen with SETI) and if preferences are set to not run when the computer is active. I suspect that the "not running" state gets triggered before the screen saver has gone away, locking the screen saver. The mouse cursor still moves, but no processing is being done and the screen saver does not go away. If "run always" is set, the problem does not happen.
[AF>Linux]Arnaud
Joined: 17 Sep 05
Posts: 38
Credit: 10,490
RAC: 0
Message 20198 - Posted: 14 Jul 2006, 18:36:56 UTC

https://boinc.bakerlab.org/rosetta/result.php?resultid=28271572
process exited with code 131 (0x83)
Arnaud
Ian
Joined: 14 Apr 06
Posts: 29
Credit: 175,799
RAC: 1,122
Message 20222 - Posted: 15 Jul 2006, 1:13:12 UTC
Last modified: 15 Jul 2006, 1:15:54 UTC

Flurry of snafus

From 14 Jul, 17:30 and on
https://boinc.bakerlab.org/rosetta/result.php?resultid=28364638
https://boinc.bakerlab.org/rosetta/result.php?resultid=28351566
https://boinc.bakerlab.org/rosetta/result.php?resultid=28343113

Then a string of success and one more snafu at 21:18:

https://boinc.bakerlab.org/rosetta/result.php?resultid=28482985

All 131 (0x83), all on a Mac mini (G4)

I did notice at one point yesterday that no graphic was showing up on the screensaver. Don't know if that's a factor.
Ian Cundell, St Albans, UK
Nite Owl
Joined: 2 Nov 05
Posts: 87
Credit: 3,019,449
RAC: 0
Message 20229 - Posted: 15 Jul 2006, 6:18:37 UTC - in response to Message 20064.  
Last modified: 15 Jul 2006, 6:20:24 UTC

Vester Wrote:
The client still cannot endure a normal computer restart for Windows Updates (or other restarts) without forgetting where it was, returning an error, and starting a new job.

50,779.30 seconds lost on result ID 28003895 work ID 23917872.

With the recent round of Windows fixes/updates I've had several machines do the same thing on reboot, though I think it's more a matter of files getting corrupted than forgetting where it was... It's probably from BOINC shutting down on command while Rosetta stays running...
Sam Miorelli
Joined: 16 Feb 06
Posts: 7
Credit: 1,303,044
RAC: 0
Message 20285 - Posted: 16 Jul 2006, 6:37:37 UTC
Last modified: 16 Jul 2006, 6:39:16 UTC

I just had another WU crash on my PowerMac G4 500 MHz machine:

Sat Jul 15 19:10:51 2006|rosetta@home|Unrecoverable error for result t349__CASP7_MINHELIX_ABRELAX_SAVE_ALL_OUT_BARCODE_truncate1_hom001__965_55051_1 (process exited with code 131 (0x83)

This is the same error code that Rosetta reported last time a WU crashed on my PowerMac. This new phenomenon of compute errors on my Mac is only a 5.25 issue - the last version never errored out on the Mac. Here's the new work unit in question: 24413558

What does this error code mean?
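
One possible reading, offered as a guess rather than an answer: many Unix shells and tools report a process killed by signal N as exit code 128 + N. Whether BOINC or Rosetta 5.25 follows that convention is an assumption here, not something confirmed in this thread. The Python sketch below applies that convention; `decode_exit_code` and the small signal table are illustrative only (the numbers listed are the usual Linux and Mac OS X values).

```python
# Common POSIX signal numbers on Linux and Mac OS X (small illustrative subset).
POSIX_SIGNALS = {3: "SIGQUIT", 6: "SIGABRT", 11: "SIGSEGV"}

def decode_exit_code(code):
    """Interpret an exit code under the ASSUMED 128 + signal convention.

    Returns (signal_number, signal_name), or (None, None) if the code
    does not look like a signal-derived exit status.
    """
    if code > 128:
        sig = code - 128
        return sig, POSIX_SIGNALS.get(sig)
    return None, None

print(hex(131))               # 0x83, matching the log line
print(decode_exit_code(131))  # (3, 'SIGQUIT') under this assumption
```

If the convention applies, 131 (0x83) would correspond to signal 3, but the project team would have to confirm what the application actually does before exiting with that code.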
Purple Rabbit
Joined: 24 Sep 05
Posts: 28
Credit: 3,895,800
RAC: 1,674
Message 20377 - Posted: 17 Jul 2006, 13:00:53 UTC
Last modified: 17 Jul 2006, 13:17:18 UTC

A t354 error (signal 11) on Linux.

https://boinc.bakerlab.org/rosetta/result.php?resultid=28762369

Next Rosetta WU failed to start after error. Restarting BOINC fixed that problem. Running 100% Rosetta (CPDN suspended) on this machine (HT).
Ananas
Joined: 1 Jan 06
Posts: 232
Credit: 752,471
RAC: 0
Message 20380 - Posted: 17 Jul 2006, 14:15:36 UTC
Last modified: 17 Jul 2006, 14:17:42 UTC

Illegal function - often that's a missing DLL, but that cannot be the reason after 8 hours:

<message>Unzulässige Funktion. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
# random seed: 3340663
# cpu_run_time_pref: 36000
ERROR:: Exit at: .dock_structure.cc line:401


Probably a memory shortage: the box has only 512 MB but 4 tasks running plus some paused. I will finish the Einsteins that still occupy some RAM and set them to "no new work", so the Rosettas only have to share the memory with one Sulphur model.
TCU Computer Science
Joined: 7 Dec 05
Posts: 28
Credit: 12,861,977
RAC: 0
Message 20421 - Posted: 17 Jul 2006, 23:40:26 UTC - in response to Message 19944.  

Here is another one:

t321__CASP7_ABRELAX_SAVE_ALL_OUT_nterm_hom004__685_15852
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=23020078


It was running for six days.
The messages show it being paused and resumed at one hour intervals.
But the accumulated time was stuck at 8 hr 16 mins.

When I stopped boinc, I noticed that the process for that work unit remained in the process list.

I rebooted the machine and the work unit finished immediately.

This occurred on a Linux box different from the previous ones.
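
A stuck work unit like this could in principle be spotted by watching whether the accumulated CPU time keeps advancing. The Python sketch below is only an illustration of that idea: the function name `is_stalled`, the sample format, and the one-hour threshold are all assumptions, and it does not use any BOINC interface.

```python
def is_stalled(samples, min_wall_seconds=3600):
    """Return True if CPU time has not advanced for at least min_wall_seconds.

    samples: list of (wall_clock_seconds, cpu_seconds) pairs, oldest first.
    """
    if len(samples) < 2:
        return False
    last_wall, last_cpu = samples[-1]
    # Walk backwards to find the earliest sample with the same CPU time.
    stalled_since = last_wall
    for wall, cpu in reversed(samples[:-1]):
        if cpu != last_cpu:
            break
        stalled_since = wall
    return last_wall - stalled_since >= min_wall_seconds

# Hypothetical hourly readings; 29760 s is the 8 hr 16 min mentioned above.
readings = [(0, 28000), (3600, 29760), (7200, 29760), (10800, 29760)]
print(is_stalled(readings))
```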
Ananas
Joined: 1 Jan 06
Posts: 232
Credit: 752,471
RAC: 0
Message 20444 - Posted: 18 Jul 2006, 7:19:47 UTC
Last modified: 18 Jul 2006, 7:21:09 UTC

OT: I see your result shows "Graphics are disabled due to configuration"

I want that too, how did you do that? Is there a "lite" version for download somewhere?
BennyRop
Joined: 17 Dec 05
Posts: 555
Credit: 140,800
RAC: 0
Message 20450 - Posted: 18 Jul 2006, 9:02:31 UTC

"Graphics are disabled due to configuration"

I'd assume that is the comment in the log when you have your system set up with BOINC installed as a service. (Mine should give the same message.)


Astro
Joined: 2 Oct 05
Posts: 987
Credit: 500,253
RAC: 0
Message 20451 - Posted: 18 Jul 2006, 9:38:00 UTC

I use the boinc Alpha test version 5.5.6 and when installed as a service and initially started, I see the following in my "messages" tab which tells me graphics won't be used:

7/15/2006 11:09:48 PM||Starting BOINC client version 5.5.6 for windows_intelx86
7/15/2006 11:09:48 PM||Libraries: libcurl/7.15.4 OpenSSL/0.9.8a zlib/1.2.3
7/15/2006 11:09:48 PM||Executing as a daemon
7/15/2006 11:09:48 PM||Data directory: C:\Program Files\BOINC
7/15/2006 11:09:48 PM||BOINC is running as a service and as a non-system user.
7/15/2006 11:09:48 PM||No application graphics will be available.



©2024 University of Washington
https://www.bakerlab.org