Problems with version 5.90/5.91

Profile sslickerson

Joined: 14 Oct 05
Posts: 101
Credit: 578,497
RAC: 0
Message 50018 - Posted: 25 Dec 2007, 1:01:23 UTC

My most recent 5.90 errored out WU: 128154725



ID: 50018 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Profile sslickerson

Joined: 14 Oct 05
Posts: 101
Credit: 578,497
RAC: 0
Message 50019 - Posted: 25 Dec 2007, 1:04:06 UTC

My most recent 5.90 errored out WU: 128154725

This workunit was completed once already without error by someone else: 116521023



ID: 50019 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Thomas Leibold

Joined: 30 Jul 06
Posts: 55
Credit: 19,627,164
RAC: 0
Message 50024 - Posted: 25 Dec 2007, 4:45:22 UTC - in response to Message 50019.  

My most recent 5.90 errored out WU: 128154725

This workunit was completed once already without error by someone else: 116521023


Don't be misled by the fact that someone else was successful into thinking that the problem is yours!

You were using a faster cpu and a 24 hour preferred runtime. The workunit errored after over 20 hours of computations.

The other user had a slower cpu and a 3 hour preferred runtime. This workunit never progressed far enough to reach the point of failure!

Team Helix
ID: 50024 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Profile adrianxw
Joined: 18 Sep 05
Posts: 653
Credit: 11,840,739
RAC: 4
Message 50028 - Posted: 25 Dec 2007, 10:16:04 UTC

Another crashed 1zpy__BOINC_TWIST_RINGS_TWIST_ANGLE_SYMM_FOLD_AND_DOCK_RELAX-1zpy_-native__2477_148220.

Q6600 2MB no overclocking, temperature never goes above 50C, Windows XP SP2 fully patched, BOINC 5.10.28, no graphics, leave in memory.
Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
ID: 50028 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Luuklag

Joined: 13 Sep 07
Posts: 262
Credit: 4,171
RAC: 0
Message 50029 - Posted: 25 Dec 2007, 12:17:58 UTC - in response to Message 50003.  

--

*CAUTION* Since I originally posted this I had a few weird aborts. Maybe this should be ignored ... If I figure out exactly what happened I will post the issue *CAUTION*

Another thing... while your client is down....

If you are adept with a text editor, you can edit the client_state.xml and change the 5.90 version references to 5.91.

Just search for '590'. It will find lines like:
<version_num>590</version_num>

For these, change the 590 to 591.

Then search for 5.90. It will find lines like:
<file_name>rosetta_beta_5.90_i686-pc-linux-gnu</file_name>

Change the .90 to .91.

One *CAVEAT* ::: the 5.90 search will also find lines like:
<url>https://boinc.bakerlab.org/rosetta/download/rosetta_beta_5.90_i686-pc-linux-gnu</url>

*Just leave these alone*

Restart the client and 5.91 will be substituted for the 5.90... I don't know what would happen should you do this in the middle of a run, but since the alternative could be aborting the process.....

I am not sure I would do it.

What I would do if I were at Rosetta is re-run all of the Linux 5.90 processes just to be sure that the results are valid. I know about CRCs and checksums, but I would rather not find out the way Intel found out about the bug in the IEEE math processors of yore.... IIIIEEEEEEE!!!!

Of course your mileage may vary...
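
For anyone who wants to see the edit described above spelled out, here is a minimal sketch in Python. It is only an illustration of the three rules in the post (change the version_num lines, change the file_name lines, leave the url lines alone). The function name retarget_client_state is made up for this example, and it assumes each tag sits on its own line as in the snippets quoted above. Given the CAUTION at the top of the post, treat it as untested against a real client and work on a copy of client_state.xml with the client stopped.

```python
def retarget_client_state(text: str, old: str = "5.90", new: str = "5.91") -> str:
    """Return a copy of client_state.xml text with old-version references
    rewritten, following the manual steps described in the post.

    Hypothetical helper for illustration only; not part of BOINC.
    """
    old_num = old.replace(".", "")  # "5.90" -> "590"
    new_num = new.replace(".", "")  # "5.91" -> "591"
    out = []
    for line in text.splitlines(keepends=True):
        stripped = line.strip()
        if stripped.startswith("<url>"):
            # Download URLs are left alone, as the post advises.
            out.append(line)
            continue
        if stripped == f"<version_num>{old_num}</version_num>":
            line = line.replace(old_num, new_num)
        elif stripped.startswith("<file_name>") and f"_{old}_" in stripped:
            line = line.replace(f"_{old}_", f"_{new}_")
        out.append(line)
    return "".join(out)


if __name__ == "__main__":
    sample = (
        "<version_num>590</version_num>\n"
        "<file_name>rosetta_beta_5.90_i686-pc-linux-gnu</file_name>\n"
        "<url>https://boinc.bakerlab.org/rosetta/download/"
        "rosetta_beta_5.90_i686-pc-linux-gnu</url>\n"
    )
    print(retarget_client_state(sample), end="")
```

Working line by line rather than with a blanket search-and-replace is what keeps the url entries untouched, which is the one caveat the post calls out.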



I don't know if this is a good idea, because the project server would then receive a result for a WU it never sent out, and never get back the one it did send, so the team can't use the results. So before uploading them you would have to change it back... I think!?
ID: 50029 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
M.L.

Joined: 21 Nov 06
Posts: 182
Credit: 180,462
RAC: 0
Message 50035 - Posted: 25 Dec 2007, 12:49:15 UTC

Task ID 128589160
Name 1zpy__BOINC_TWIST_RINGS_MORE_SLIDESYMM_FOLD_AND_DOCK-1zpy_-native__2476_931_0
Workunit 116921434
Created 23 Dec 2007 1:05:08 UTC
Sent 23 Dec 2007 1:06:38 UTC
Received 25 Dec 2007 12:47:11 UTC
Server state Over
Outcome Success
Client state Done
Exit status 0 (0x0)
Computer ID 510574
Report deadline 2 Jan 2008 1:06:38 UTC
CPU time 10454.91
stderr out <core_client_version>5.10.30</core_client_version>
<![CDATA[
<stderr_txt>
# cpu_run_time_pref: 14400
# random seed: 3500000
**********************************************************************
Rosetta score is stuck or going too long. Watchdog is ending the run!
Stuck at score -96.9016 for 900 seconds
**********************************************************************
GZIP SILENT FILE: .xx1zpy.out

</stderr_txt>
]]>


Validate state Valid
Claimed credit 43.2107713566122
Granted credit 38.1117339116195
application version 5.90
ID: 50035 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Luuklag

Joined: 13 Sep 07
Posts: 262
Credit: 4,171
RAC: 0
Message 50036 - Posted: 25 Dec 2007, 16:01:01 UTC
Last modified: 25 Dec 2007, 16:55:11 UTC

error after 14000+ secs.

Task ID 128929743
Name 1zpy__BOINC_TWIST_RINGS_TWIST_ANGLE_SYMM_FOLD_AND_DOCK_RELAX-1zpy_-native__2477_73548_0
Workunit 117235939


stderr out <core_client_version>5.10.20</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
# cpu_run_time_pref: 14400
# random seed: 3412383
# cpu_run_time_pref: 14400
# random seed: 3412383
# cpu_run_time_pref: 14400
ABORT: bad to aa_rotno_to_packedrotno
aa,rot1/2/3/4: MET 11 3 2 0 0
chi no 3 nchi 3 aav 1 is_chi_proton_rotamer(aa,aav,i) 0
ERROR:: Exit from: .rotamer_functions.cc line: 1461

</stderr_txt>
]]>

{EDIT:
I also have another task like this one, from the same batch.

There it calculates energy and RMSD but doesn't place any dots, even though the accepted model keeps changing. And when two dots should be very close to each other, e.g. -181 and -183, it doesn't show a thing, and the RMSD is the same.}

{EDIT 2: Well, I guess the dots are too small to display, because when I look more closely at the energy bar, the lines run from -200 to +100 on the same screen and I can see both at once. So I guess the dots end up smaller than 1 pixel and therefore don't get shown.}
ID: 50036 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Evan

Joined: 23 Dec 05
Posts: 268
Credit: 402,585
RAC: 0
Message 50038 - Posted: 25 Dec 2007, 17:34:13 UTC

I have just aborted this one after it stalled at approx 62 percent

1zpy__BOINC_TWIST_RINGS_TWIST_ANGLE_SYMM_FOLD_AND_DOCK_RELAX-1zpy_-native__2477_65532_0

Earlier in the run I thought I might be in with a chance for the hall of fame with a 1.05 RMSD, until I noticed it was followed by an E+XXX. Do I get an award for having the worst result?

I thought one of the upgrades was to abort a run if the result was ridiculous.
ID: 50038 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Snags

Joined: 22 Feb 07
Posts: 198
Credit: 2,888,320
RAC: 0
Message 50047 - Posted: 25 Dec 2007, 23:43:10 UTC

resultid=128827259

1eyvA_BOINC_ABINITIO_VF-S25-9-S3-3--1eyvA-vf__2450_12149_0

On my ppc mac I completed 15 decoys successfully but ran about 2 and a quarter hours over my preferred runtime of 10 hours.
ID: 50047 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Mike.Gibson

Joined: 3 Nov 07
Posts: 19
Credit: 311,844
RAC: 0
Message 50049 - Posted: 26 Dec 2007, 2:10:06 UTC

Hi, folks

I have just had my second WU stick with 10 minutes to go out of an expected 3 hours, using Beta 5.90.

Others have finished OK.

I have suspended both so that other WUs will run.

The only similarity that I can see is that they were both listed as Running, high priority and the successful ones were not.

The report deadline is 31/12, so I was going to try them again when the other WUs were finished.

Any other suggestions, please?

Mike
ID: 50049 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Profile sslickerson

Joined: 14 Oct 05
Posts: 101
Credit: 578,497
RAC: 0
Message 50054 - Posted: 26 Dec 2007, 6:50:59 UTC

This one validated but was caught by the watchdog for getting stuck: 129143112

Granted, this is a good thing: it did what it was supposed to do. But I haven't had this many errors and "other weird things" since the very early days of Rosetta. It seems something is not right.

Tim



ID: 50054 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Eli Bildirici

Joined: 6 Jun 06
Posts: 4
Credit: 38,501,259
RAC: 0
Message 50058 - Posted: 26 Dec 2007, 8:48:56 UTC

Mhm...was wondering what was going on...the graphics on these work units have been screwing with the graphics drivers on multiple Windows machines.

On my laptop, which has an Intel 945GME chipset, the driver crashed; restarting Windows made everything fine.

On my desktop (Radeon 9550), the damn thing just corrupts itself and the screen goes totally haywire...with effort I got it to restart, but I was basically flying blind.

Desktop WU # is: 1zpy_BOINC_TWIST_RINGS_TWIST_ANGLE_SYMM_FOLD_AND_DOCK_RELAX-1zpy_-native_2477_243926__0

Laptop: ..._248644_0

This is pretty critical, guys...the past month and a half or so has seen a bunch of problems; maybe more resources should be reallocated to making sure things are stable rather than the beta/game stuff...
ID: 50058 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Hokke

Joined: 18 Dec 06
Posts: 3
Credit: 158,955
RAC: 0
Message 50063 - Posted: 26 Dec 2007, 11:11:11 UTC
Last modified: 26 Dec 2007, 11:14:22 UTC

With version 5.90 every work unit I run crashes!!


Here is one:
https://boinc.bakerlab.org/rosetta/result.php?resultid=128876130

There is more:
https://boinc.bakerlab.org/rosetta/results.php?userid=136628

Okay, not every one but already three in a row!
ID: 50063 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Psycodad

Joined: 10 Dec 05
Posts: 1
Credit: 124,429
RAC: 0
Message 50064 - Posted: 26 Dec 2007, 11:40:19 UTC

This Wu errored out after 5 hours:


Task

Workunit



ID: 50064 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
AMD_is_logical

Joined: 20 Dec 05
Posts: 299
Credit: 31,460,681
RAC: 0
Message 50074 - Posted: 26 Dec 2007, 17:28:05 UTC

ID: 50074 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Luuklag

Joined: 13 Sep 07
Posts: 262
Credit: 4,171
RAC: 0
Message 50076 - Posted: 26 Dec 2007, 20:09:04 UTC

Got myself an error too:
https://boinc.bakerlab.org/rosetta/result.php?resultid=129070618
Computation error after 6000+ sec.
ID: 50076 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
P . P . L .

Joined: 20 Aug 06
Posts: 581
Credit: 4,865,274
RAC: 0
Message 50080 - Posted: 26 Dec 2007, 21:03:31 UTC

This one fell over after 2 hrs, bad tasks.

https://boinc.bakerlab.org/rosetta/workunit.php?wuid=117204604

1zpy__BOINC_TWIST_RINGS_TWIST_ANGLE_SYMM_FOLD_AND_DOCK_RELAX-1zpy_-native__2477_42213_1

pete.

ID: 50080 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Speedy
Joined: 25 Sep 05
Posts: 163
Credit: 808,337
RAC: 1
Message 50083 - Posted: 26 Dec 2007, 22:29:55 UTC

This WU made the debugger run; I'm not sure why, though. It was using app 5.90. Any ideas?
______
Speedy
Have a crunching good day!!
ID: 50083 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
stoneysilence

Joined: 4 May 07
Posts: 13
Credit: 401,055
RAC: 0
Message 50090 - Posted: 26 Dec 2007, 23:56:43 UTC

I am using BOINC 5.10.30 and am running World Community Grid and Rosetta. Rosetta is currently working on a 5.90 beta process.

It is using 300 MB of RAM as I type this. Luckily I have 4 GB of RAM, but 300 MB is quite a bit. This is the highest I have ever seen it; usually it is 150-200 MB.

World Community Grid hardly ever breaks 100 MB of RAM, though.

System Specs:
Windows Vista Business (32-bit)
C2D E6600
EVGA n680i MB
4GB OCZ DDR2-1066 RAM
Maxtor SATA-150 160gb HD
WD Raptor 76gb HD
ID: 50090 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Profile David Emigh
Joined: 13 Mar 06
Posts: 158
Credit: 417,178
RAC: 0
Message 50091 - Posted: 27 Dec 2007, 0:05:09 UTC
Last modified: 27 Dec 2007, 0:09:08 UTC

This task triggered both the Windows debugger and the BOINC debugger.

It was a "1zpy__BOINC_TWIST_RINGS_TWIST_ANGLE_SYMM_FOLD_AND_DOCK_RELAX-1zpy_ etc." task.
Rosie, Rosie, she's our gal,
If she can't do it, no one shall!
ID: 50091 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote



©2024 University of Washington
https://www.bakerlab.org