Problems with Minirosetta Version 1.67

Message boards : Number crunching : Problems with Minirosetta Version 1.67

Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 61074 - Posted: 9 May 2009, 15:43:07 UTC
Last modified: 9 May 2009, 15:44:50 UTC

Version 1.67 has a nifty new threading mode built in, and it corrects some of the validator problems people have reported in prior releases.

Please report any problems with this version in this thread.
Rosetta Moderator: Mod.Sense

svincent
Joined: 30 Dec 05
Posts: 219
Credit: 11,805,838
RAC: 0
Message 61075 - Posted: 9 May 2009, 17:07:46 UTC

I've lost 5 tasks out of 10 with this problem in task termination on Mac. Until it's fixed I'm not crunching any more 1.67 workunits.

Sample task 249815019

and error message fragment

Thread 0 Crashed:
0 ...etta_1.67_i686-apple-darwin 0x00efc966 __ZN7utility7signals9SignalHubIvN4core12conformation7signals16DestructionEventEE11send_signalES5_ + 1870
1 ...etta_1.67_i686-apple-darwin 0x0002a4a7 __ZN4core12conformation12ConformationD1Ev + 7373
2 ...etta_1.67_i686-apple-darwin 0x000910d0 __ZN4core4pose4PoseD1Ev + 4652
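
For anyone trying to read the mangled frames above, here is a minimal sketch, assuming the symbols follow the Itanium C++ ABI mangling that GCC uses on macOS (where every symbol carries one extra leading underscore). The helper `demangle_simple` is hypothetical and only handles plain nested names that end in a no-argument constructor or destructor; it returns None for anything else, such as the templated `SignalHub` frame. A full tool such as `c++filt` covers the complete grammar.

```python
import re


def demangle_simple(symbol):
    """Demangle a tiny subset of Itanium C++ ABI names (illustrative only)."""
    # macOS prepends one extra underscore to every C++ symbol.
    if symbol.startswith("__Z"):
        symbol = symbol[1:]
    # Only nested names of the form _ZN<len><id>...E are handled here.
    if not symbol.startswith("_ZN"):
        return None
    pos = 3
    parts = []
    # Each component is a decimal length followed by that many characters.
    while pos < len(symbol) and symbol[pos].isdigit():
        digits = re.match(r"\d+", symbol[pos:]).group()
        pos += len(digits)
        parts.append(symbol[pos:pos + int(digits)])
        pos += int(digits)
    if not parts:
        return None
    rest = symbol[pos:]
    if rest[:2] in ("D0", "D1", "D2"):      # destructor variants
        parts.append("~" + parts[-1])
        rest = rest[2:]
    elif rest[:2] in ("C1", "C2", "C3"):    # constructor variants
        parts.append(parts[-1])
        rest = rest[2:]
    # "E" closes the nested name and "v" means an empty parameter list.
    if rest != "Ev":
        return None                          # templates etc. are not supported
    return "::".join(parts) + "()"


print(demangle_simple("__ZN4core4pose4PoseD1Ev"))
# -> core::pose::Pose::~Pose()
print(demangle_simple("__ZN4core12conformation12ConformationD1Ev"))
# -> core::conformation::Conformation::~Conformation()
```

Read this way, frames 1 and 2 are the destructors of `Conformation` and `Pose`, which fits the report that the crash happens during task termination.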



Murasaki
Joined: 20 Apr 06
Posts: 303
Credit: 511,418
RAC: 0
Message 61078 - Posted: 9 May 2009, 21:55:50 UTC
Last modified: 9 May 2009, 22:03:14 UTC

My second 1.67 work unit appears to have validated correctly, but the log includes the following error message:

ERROR: pose.fold_tree().is_cutpoint( *it )
ERROR:: Exit from: ..\..\src\protocols\topology_broker\TopologyBroker.cc line: 560
called boinc_finish



Here is the link to the task details: threading_broker_hb_t303__IGNORE_THE_REST_11860_719_0

Greg_BE
Joined: 30 May 06
Posts: 5662
Credit: 5,701,869
RAC: 2,154
Message 61080 - Posted: 10 May 2009, 0:02:51 UTC
Last modified: 10 May 2009, 0:03:14 UTC

lr8_seq_score12_rlbd_1fkb_IGNORE_THE_REST_DECOY_11810_1955_0

The graphics box shows no activity with 2 hrs and 52 mins to go.
It also says stage unknown.
It is on model 55 but stop = 0.

Greg_BE
Joined: 30 May 06
Posts: 5662
Credit: 5,701,869
RAC: 2,154
Message 61081 - Posted: 10 May 2009, 0:06:19 UTC

threading_broker_hb_t342__IGNORE_THE_REST_11876_188_0 ran into the famous 0xc error, and my firewall triggered on it. Funny thing is, the tasks before this one did not trigger the firewall.

Snags
Joined: 22 Feb 07
Posts: 198
Credit: 2,811,598
RAC: 764
Message 61095 - Posted: 10 May 2009, 19:43:42 UTC
Last modified: 10 May 2009, 19:47:47 UTC

And another on my Mac as well:

threading_broker_hb_t370__IGNORE_THE_REST_11885_966_0

# cpu_run_time_pref: 36000
SIGBUS: bus error

Crashed executable name: minirosetta_1.67_i686-apple-darwin
built using BOINC library version 6.5.0
Machine type Intel 80486 (32-bit executable)
System version: Macintosh OS 10.5.6 build 9G55
Sun May 10 14:11:14 2009

sh: /usr/bin/atos: No such file or directory
0 0x006c0345 SIGPIPE: write on a pipe with no reader
1 0x004a3d8e SIGPIPE: write on a pipe with no reader
2 0x90a9e2bb SIGPIPE: write on a pipe with no reader
3 0xffffffff SIGPIPE: write on a pipe with no reader
4 0x0002a4a7 SIGPIPE: write on a pipe with no reader
5 0x000910d0 SIGPIPE: write on a pipe with no reader
6 0x00518bdc SIGPIPE: write on a pipe with no reader
7 0x00b59c20 SIGPIPE: write on a pipe with no reader
8 0x0013b70a SIGPIPE: write on a pipe with no reader
9 0x00005a5b SIGPIPE: write on a pipe with no reader
10 0x0000292e SIGPIPE: write on a pipe with no reader
11 0x00002855 SIGPIPE: write on a pipe with no reader
12 0x00000019
Thread 0 crashed with X86 Thread State (32-bit):
eax: 0xffffffe1 ebx: 0x90a66802 ecx: 0xbfffc23c edx: 0x90a321c6
edi: 0x00000000 esi: 0x00000000 ebp: 0xbfffc278 esp: 0xbfffc23c
ss: 0x0000001f efl: 0x00000206 eip: 0x90a321c6 cs: 0x00000007
ds: 0x0000001f es: 0x0000001f fs: 0x00000000 gs: 0x00000037

etc.

From the BOINC message log:
Sun May 10 14:11:16 2009|rosetta@home|Computation for task threading_broker_hb_t370__IGNORE_THE_REST_11885_966_0 finished
Sun May 10 14:11:16 2009|rosetta@home|Output file threading_broker_hb_t370__IGNORE_THE_REST_11885_966_0_0 for task threading_broker_hb_t370__IGNORE_THE_REST_11885_966_0 absent
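
The message log lines above are pipe-separated: timestamp, project, message text. A minimal sketch for splitting them into fields, assuming that three-field layout holds for every line and that the timestamp always has the form shown here (the function name `parse_boinc_log_line` is made up for illustration):

```python
from datetime import datetime


def parse_boinc_log_line(line):
    """Split one message-log line into (timestamp, project, message)."""
    # Split on the first two pipes only, so a '|' inside the message survives.
    stamp, project, message = line.rstrip("\n").split("|", 2)
    # Timestamp layout as seen in the log, e.g. "Sun May 10 14:11:16 2009".
    when = datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y")
    return when, project, message


sample = (
    "Sun May 10 14:11:16 2009|rosetta@home|Output file "
    "threading_broker_hb_t370__IGNORE_THE_REST_11885_966_0_0 for task "
    "threading_broker_hb_t370__IGNORE_THE_REST_11885_966_0 absent"
)
when, project, message = parse_boinc_log_line(sample)
# An "Output file ... absent" message marks a task that produced no result file.
print(when.isoformat(), project, message.endswith("absent"))
```

Filtering a saved log for messages ending in "absent" is one quick way to list the tasks that crashed before writing their output.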

Gen_X_Accord
Joined: 5 Jun 06
Posts: 154
Credit: 279,018
RAC: 0
Message 61096 - Posted: 10 May 2009, 20:36:32 UTC

This was weird: I had one task that finished with a claimed credit of 137.07 but a granted credit of 0.00.
Has anybody else had this problem?

Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 61098 - Posted: 10 May 2009, 21:14:39 UTC
Last modified: 10 May 2009, 21:15:53 UTC

Gen X has a 2POIA_BOINC_MPZN_with_zinc_abrelax task showing a Validate state of Workunit error - check skipped from BOINC 6.6.20 in Win XP. But no other errors are shown, and no credit issued.
Rosetta Moderator: Mod.Sense

Snags
Joined: 22 Feb 07
Posts: 198
Credit: 2,811,598
RAC: 764
Message 61099 - Posted: 10 May 2009, 21:39:26 UTC - in response to Message 61098.  
Last modified: 10 May 2009, 22:09:16 UTC

Gen X has a 2POIA_BOINC_MPZN_with_zinc_abrelax task showing a Validate state of Workunit error - check skipped from BOINC 6.6.20 in Win XP. But no other errors are shown, and no credit issued.


Looking at the workunit details I see "too many total results"
Cruncher #1 missed his deadline, cruncher #2 had a validate error, and Gen X became cruncher #3. Cruncher #1 then returned a successfully completed WU late, but before cruncher #3 did, so cruncher #3 (Gen X) missed out.

Cruncher #2 has been given credit presumably because the validate error has been deemed not his fault.

Forgive my hazy memory but hasn't this been discussed before?

Snags

edit: Here's one discussion from 2007

edit#2: Following links from that thread: David E K explains the max results numbers and mentions trying to find a fix, and the BOINC trac ticket remains open.

robertmiles
Joined: 16 Jun 08
Posts: 1224
Credit: 13,847,398
RAC: 1,953
Message 61110 - Posted: 11 May 2009, 16:29:20 UTC

Should the home page be updated to include a link to this thread?

Greg_BE
Joined: 30 May 06
Posts: 5662
Credit: 5,701,869
RAC: 2,154
Message 61115 - Posted: 11 May 2009, 21:19:51 UTC

compute errors

threading_broker_hb_t342__IGNORE_THE_REST_11876_188_0

Log:

-1073741819 (0xc0000005)
Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x0088A3C2 write attempt to address 0x00000000

Engaging BOINC Windows Runtime Debugger...

CPU time 2805.672

and

abinitio_norelax_homfrag__plus_native_1fkb_0001A__SAVE_ALL_OUT_11817_3500_0

Log:

Client state Compute error
Exit status -177 (0xffffff4f)
Computer ID 871217
Report deadline 19 May 2009 6:13:24 UTC
CPU time 0
stderr out

<core_client_version>6.6.20</core_client_version>
<![CDATA[
<message>
Maximum elapsed time exceeded - note: CPU time 0
</message>
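
The exit statuses in these reports appear both as signed decimals and as hex. A small sketch of how the two forms relate, assuming the status is a signed 32-bit value reinterpreted as unsigned (the helper name `exit_status_hex` is illustrative only):

```python
def exit_status_hex(status):
    """Return the unsigned 32-bit hex form of a signed exit status."""
    # Masking with 0xFFFFFFFF maps a negative value to its two's-complement
    # bit pattern, which is what the result pages print in parentheses.
    return "0x{:08x}".format(status & 0xFFFFFFFF)


print(exit_status_hex(-1073741819))  # -> 0xc0000005
print(exit_status_hex(-177))         # -> 0xffffff4f
```

The first value is the Windows access violation code named in the exception record above. The second is the status that accompanies the "Maximum elapsed time exceeded" message.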

JChojnacki
Joined: 17 Sep 05
Posts: 71
Credit: 9,939,211
RAC: 3,308
Message 61117 - Posted: 11 May 2009, 22:08:51 UTC

Compute error on this WU:

https://boinc.bakerlab.org/rosetta/result.php?resultid=249863167

Outcome Client error
Client state Compute error
Exit status -1073741819 (0xc0000005)
Computer ID 1043369
Report deadline 18 May 2009 23:41:27 UTC
CPU time 21000.18


Gen_X_Accord
Joined: 5 Jun 06
Posts: 154
Credit: 279,018
RAC: 0
Message 61122 - Posted: 12 May 2009, 6:38:49 UTC
Last modified: 12 May 2009, 6:44:17 UTC

Here's a bizarre one I'd like to throw out there. I installed Windows 7 on an old 40 GB 5400 rpm hard drive I have, just to check it out, and of course I had to run Rosetta on it. You can see the result under Larry-PC. But I had a problem with it that I cannot explain: after finishing one task, it wouldn't download any more, saying I needed 14000 kb/mb more room or something like that. Even though I liked Windows 7, it wouldn't run any more tasks despite 14 GB free on the hard drive and at least 1 GB of memory free, so I just put my Windows XP hard drive back in. But I have been a bit surprised by the granted credit on the one completed task on that OS, as well as the last few on the one I am using now. Same setup, as you can see, just different hard drives with different OSes on them.

Oh yeah, I forgot to mention that it would only run one work unit on one core instead of 2. That was my first clue that something was funky.

Snags
Joined: 22 Feb 07
Posts: 198
Credit: 2,811,598
RAC: 764
Message 61125 - Posted: 12 May 2009, 10:05:07 UTC

[AF>Slappyto] popolito
Joined: 8 Mar 06
Posts: 13
Credit: 991,449
RAC: 1,126
Message 61126 - Posted: 12 May 2009, 10:43:10 UTC

https://boinc.bakerlab.org/rosetta/workunit.php?wuid=228717046

Betting Slip
Joined: 26 Sep 05
Posts: 71
Credit: 5,702,246
RAC: 0
Message 61127 - Posted: 12 May 2009, 11:09:05 UTC

Why am I getting validate errors?

https://boinc.bakerlab.org/rosetta/result.php?resultid=250683203

https://boinc.bakerlab.org/rosetta/result.php?resultid=250621029

neggha [Lombardia]
Joined: 23 Oct 08
Posts: 1
Credit: 12,005
RAC: 0
Message 61132 - Posted: 12 May 2009, 13:28:57 UTC - in response to Message 61127.  

I've got 2 validate errors today:

https://boinc.bakerlab.org/rosetta/result.php?resultid=250671695
https://boinc.bakerlab.org/rosetta/result.php?resultid=250671693

[TiDC] yattote
Joined: 13 Mar 07
Posts: 2
Credit: 12,427,729
RAC: 0
Message 61133 - Posted: 12 May 2009, 16:09:51 UTC

Wow, in half a day I got 12 validate-error units vs. 3 correct units. What a mess!

svincent
Joined: 30 Dec 05
Posts: 219
Credit: 11,805,838
RAC: 0
Message 61135 - Posted: 12 May 2009, 17:27:35 UTC

After updating from BOINC 6.2.18 to 6.6.20 I've yet to see the termination error I reported earlier (shown again below), although it's a pretty small sample of tasks.

Thread 0 Crashed:
0 ...etta_1.67_i686-apple-darwin 0x00efc966 __ZN7utility7signals9SignalHubIvN4core12conformation7signals16DestructionEventEE11send_signalES5_ + 1870

------

I'm seeing several validation errors though, as others are: all of them exited after quickly doing 99 decoys.


Mike Tyka
Joined: 20 Oct 05
Posts: 96
Credit: 2,190
RAC: 0
Message 61137 - Posted: 12 May 2009, 18:04:11 UTC

The validator error has been found: Our data format was changed and the validator was not updated. We're doing that now.


http://beautifulproteins.blogspot.com/
http://www.miketyka.com/



©2024 University of Washington
https://www.bakerlab.org