Minirosetta 1.98

Message boards : Number crunching : Minirosetta 1.98

Sid Celery

Joined: 11 Feb 08
Posts: 2106
Credit: 40,933,658
RAC: 18,058
Message 63743 - Posted: 18 Oct 2009, 11:56:08 UTC

Over the last week I've had no errors at all. That's the second week running. Well done again guys.

Boinc 6.6.38
Processor: 4 AuthenticAMD AMD Phenom(tm) 9850 Quad-Core Processor [AMD64 Family 16 Model 2 Stepping 3]
OS: Microsoft Windows Vista: Home Premium x64 Edition, Service Pack 2, (06.00.6002.00)
Memory: 8.00 GB physical, 17.52 GB virtual

ID: 63743 · Rating: 0
Evan
Joined: 23 Dec 05
Posts: 268
Credit: 402,585
RAC: 0
Message 63744 - Posted: 18 Oct 2009, 12:57:55 UTC

Over the last week I've had no errors at all. That's the second week running. Well done again guys.


Yes, seconded! My last error report was an exception to the norm.

ID: 63744 · Rating: 0
der_Day
Joined: 17 Apr 08
Posts: 1
Credit: 3,000,504
RAC: 6,442
Message 63746 - Posted: 18 Oct 2009, 16:57:40 UTC

I've had a lot of problems since Friday. Most of my WUs failed after a few seconds, and some produced an error message:
Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x7C9201B3 write attempt to address 0x00BD97B2


See mega_lr8_seq_score12_rlbd_2hbo_IGNORE_THE_REST_DECOY_15198_2497
or mega_lr10_seq_score12_rlbd_1vkk_IGNORE_THE_REST_DECOY_15197_2829_0

Sometimes my wingmen can finish the WU, sometimes they fail too. What should I do?
ID: 63746 · Rating: 0
Evan
Joined: 23 Dec 05
Posts: 268
Credit: 402,585
RAC: 0
Message 63748 - Posted: 18 Oct 2009, 19:38:30 UTC

Yes, seconded! My last error report was an exception to the norm.


Oh dear, I talked too soon!

symm_lr8_seq_score12_ss_1.7_rlbd_1ttz_IGNORE_THE_REST_DECOY_14923_2951

ERROR: !core::conformation::symmetry::is_symmetric( pose )
ERROR:: Exit from: ..\..\src\core\optimization\AtomTreeMinimizer.cc line: 55
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish

ID: 63748 · Rating: 0
Sid Celery
Joined: 11 Feb 08
Posts: 2106
Credit: 40,933,658
RAC: 18,058
Message 63751 - Posted: 18 Oct 2009, 21:37:59 UTC - in response to Message 63748.  

Oh dear, I talked too soon!

symm_lr8_seq_score12_ss_1.7_rlbd_1ttz_IGNORE_THE_REST_DECOY_14923_2951

ERROR: !core::conformation::symmetry::is_symmetric( pose )
ERROR:: Exit from: ..\..\src\core\optimization\AtomTreeMinimizer.cc line: 55
BOINC:: Error reading and gzipping output datafile: default.out

As did I...

symm_lr8_seq_score12_ss_1.7_rlbd_1h75_IGNORE_THE_REST_DECOY_14923_3191_1
ID: 63751 · Rating: 0
Nosferatu*
Joined: 27 Nov 05
Posts: 1
Credit: 1,004,857
RAC: 0
Message 63752 - Posted: 18 Oct 2009, 21:38:25 UTC

Don't know quite where to post this but since the change to 1.98 all workunits result in Computation error. 5.98 is running just fine as are all other projects in boinc.
ID: 63752 · Rating: 0
zibou
Joined: 9 Sep 09
Posts: 2
Credit: 672,998
RAC: 0
Message 63759 - Posted: 19 Oct 2009, 23:18:48 UTC

I see two different behaviours with 1.98:

- On Windows XP, units fail after 5 minutes with a computation error.

- On Windows 2000, units run for 9 hours (when my patience ran out; they normally take 3), the percentage completed stays at 0%, and the time remaining does not move from its initial value. It leaves other projects no time to run. Restarting BOINC Manager resets the clock to zero, and it starts all over again with the same behaviour. I had to suspend the project until a fix is available.
ID: 63759 · Rating: 0
spiceyux
Joined: 9 Nov 05
Posts: 1
Credit: 10,155,875
RAC: 1,209
Message 63760 - Posted: 20 Oct 2009, 1:28:07 UTC - in response to Message 63752.  

Don't know quite where to post this but since the change to 1.98 all workunits result in Computation error. 5.98 is running just fine as are all other projects in boinc.


Just seconding this, I'm having the same behavior. Please advise if there is anything that I can report to help diagnose.
ID: 63760 · Rating: 0
borg
Joined: 4 Dec 07
Posts: 3
Credit: 142,556
RAC: 0
Message 63769 - Posted: 21 Oct 2009, 7:24:39 UTC

21.10.2009 9:14:49 rosetta@home Task lr8_A_seq_score12_shake_ss1.7_rlbd_1c8c_IGNORE_THE_REST_DECOY_14949_3992_0 exited with zero status but no 'finished' file
21.10.2009 9:14:49 rosetta@home If this happens repeatedly you may need to reset the project.

This was happening repeatedly. Finally I aborted the task.
ID: 63769 · Rating: 0
AM
Joined: 15 Jul 06
Posts: 7
Credit: 522,822
RAC: 91
Message 63772 - Posted: 21 Oct 2009, 15:10:43 UTC

This and other mini 1.98 WU's have been resource hogs lately.

https://boinc.bakerlab.org/rosetta/result.php?resultid=287892484
ID: 63772 · Rating: 0
googloo
Joined: 15 Sep 06
Posts: 133
Credit: 22,619,659
RAC: 5,614
Message 63779 - Posted: 23 Oct 2009, 1:16:27 UTC

FYI

I aborted this one:
https://boinc.bakerlab.org/rosetta/result.php?resultid=289747880 because it had run nearly 30 minutes with 0% completed.

FWIW, my wingman had a Compute Error; see
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=264327534
ID: 63779 · Rating: 0
bradipopitt
Joined: 4 Mar 06
Posts: 1
Credit: 1,096,654
RAC: 0
Message 63793 - Posted: 23 Oct 2009, 13:08:37 UTC

Hej there,
I cannot run the new version anymore.
I always get an error like this:

<core_client_version>6.6.38</core_client_version>
<![CDATA[
<message>
- exit code -1073741819 (0xc0000005)
</message>
<stderr_txt>
[2009-10-23 9:33: 1:] :: BOINC:: Initializing ... ok.
[2009-10-23 9:33: 1:] :: BOINC :: boinc_init()
BOINC:: Setting up shared resources ... ok.
BOINC:: Setting up semaphores ... ok.
BOINC:: Updating status ... ok.
BOINC:: Registering timer callback... ok.
BOINC:: Worker initialized successfully.
Registering options..


Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x7C9101B3 write attempt to address 0x00BD0DA2

Engaging BOINC Windows Runtime Debugger...



Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x7C9101B3 write attempt to address 0x0045E5AE

Engaging BOINC Windows Runtime Debugger...


</stderr_txt>
]]>

Do you have any idea how to overcome it?
For the moment I just run Rosetta beta.

good weekend
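
For readers comparing the two numbers in the <message> line above: the negative exit code and the hex value are the same 32-bit quantity, shown once as a signed integer and once as unsigned hex. 0xC0000005 is the Windows status code for an access violation, which matches the "Access Violation" records further down in the same stderr output. A minimal Python sketch of the conversion follows; the function name to_ntstatus is hypothetical and chosen only for illustration.

```python
def to_ntstatus(exit_code: int) -> str:
    """Reinterpret a signed 32-bit process exit code as unsigned hex.

    Hypothetical helper for illustration; not part of BOINC or Rosetta.
    """
    # Masking with 0xFFFFFFFF maps negative values onto their
    # two's-complement 32-bit representation.
    return f"0x{exit_code & 0xFFFFFFFF:08x}"


# The exit code reported in the post above.
print(to_ntstatus(-1073741819))
```

The mask is used rather than adding 2**32 so that codes that are already non-negative, such as the plain exit status 1 seen elsewhere in this thread, pass through unchanged.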
ID: 63793 · Rating: 0
Speedy
Joined: 25 Sep 05
Posts: 163
Credit: 808,098
RAC: 0
Message 63801 - Posted: 23 Oct 2009, 20:55:36 UTC
Last modified: 23 Oct 2009, 21:01:06 UTC

https://boinc.bakerlab.org/rosetta/result.php?resultid=289275155 gave a compute error after running for roughly 1hr 06min, 3812.641 seconds. This is an lr8 task, task ID 289275155.
Have a crunching good day!!
ID: 63801 · Rating: 0
Cesium_133*
Joined: 1 Dec 08
Posts: 28
Credit: 225,332
RAC: 0
Message 63803 - Posted: 24 Oct 2009, 1:50:31 UTC

I'll join the chorus on error-halted WU's (I assume this is the place to give notice about such things) :

symm_lr8_seq_score12_ss_1.7_rlbd_1npu_IGNORE_THE_REST_DECOY_14923_2726_0

Computation error after 1m 13s...

If this isn't the venue to report such glitches, though it seems to be, please let me know :) Incidentally, what % of WU's are found to be corrupted/abortive/unfinishable, and can they be rehabilitated for purposes of this project? :)

The inquiring layman's mind wishes to know... :)
The lovely lady you see isn't I, but Hayley Westenra, a classical crossover singer from Christchurch, NZ. There is no known voice as hers. Check her out- she's seraphic.

ID: 63803 · Rating: 0
macko
Joined: 25 Jun 09
Posts: 32
Credit: 153,495
RAC: 0
Message 63804 - Posted: 24 Oct 2009, 9:14:47 UTC - in response to Message 63736.  

Second chance for this one also proved a failure:

lr8_A_seq_score12_shake_ss1.7_rlbd_1c8c_IGNORE_THE_REST_DECOY_14949_2196

Repeated lines in the stderr txt point to this error

Exception:
failure to read decoy F_00018_0004416_0_0001 from silent-file lr8_shake_1c8c.out
[2009-10-17 20:10:35:] :: BOINC:: Initializing ... ok.


Hi

Similar error with this:
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=263361396
Exception:
failure to read decoy F_00023_0001585_0 from silent-file lr8_1lou.out
CPU time 15.23438

And a different one with these:
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=263361627
Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x00436BD8 read attempt to address 0x018BD000

Engaging BOINC Windows Runtime Debugger...

https://boinc.bakerlab.org/rosetta/workunit.php?wuid=263361627
Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x00436BD8 read attempt to address 0x018BD000

Engaging BOINC Windows Runtime Debugger...

After these 3 the program runs normal
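
The access-violation records quoted in this thread share one line format, so they can be grouped by faulting address to see which reports are plausibly the same crash. The sketch below is only an assumption about how one might do that, not a Rosetta or BOINC tool: the regular expression is inferred solely from the lines posted here, and parse_violations is a hypothetical helper name.

```python
import re
from collections import Counter

# Pattern inferred from the "Unhandled Exception Record" lines in this thread.
RECORD = re.compile(
    r"Reason: Access Violation \((0x[0-9a-fA-F]+)\) "
    r"at address (0x[0-9a-fA-F]+) "
    r"(read|write) attempt to address (0x[0-9a-fA-F]+)"
)


def parse_violations(stderr_text):
    """Return (fault_address, access_kind, target_address) for each record."""
    return [
        (m.group(2).lower(), m.group(3), m.group(4).lower())
        for m in RECORD.finditer(stderr_text)
    ]


# Records copied from the posts above (messages 63746, 63793 and 63804).
sample = """\
Reason: Access Violation (0xc0000005) at address 0x7C9201B3 write attempt to address 0x00BD97B2
Reason: Access Violation (0xc0000005) at address 0x7C9101B3 write attempt to address 0x00BD0DA2
Reason: Access Violation (0xc0000005) at address 0x7C9101B3 write attempt to address 0x0045E5AE
Reason: Access Violation (0xc0000005) at address 0x00436BD8 read attempt to address 0x018BD000
"""

records = parse_violations(sample)

# Count how often each (faulting address, access kind) pair appears.
by_site = Counter((addr, kind) for addr, kind, _ in records)
for (addr, kind), count in by_site.most_common():
    print(addr, kind, count)
```

On the four records quoted so far, the two writes faulting at 0x7C9101B3 fall into one group, while the read at 0x00436BD8 stands apart. That is a hint, not proof, that more than one failure mode is being reported.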


ID: 63804 · Rating: 0
P . P . L .
Joined: 20 Aug 06
Posts: 581
Credit: 4,865,274
RAC: 0
Message 63813 - Posted: 25 Oct 2009, 5:42:22 UTC
Last modified: 25 Oct 2009, 5:43:14 UTC

This one failed many times.

https://boinc.bakerlab.org/rosetta/workunit.php?wuid=262741617

Sun 25 Oct 2009 15:25:52 EST|rosetta@home|Output file lr8_score12_run03_rlbd_1ugh_IGNORE_THE_REST_DECOY_14712_835_2_0 for task absent

<message>
process exited with code 1 (0x1, -255)
</message>

ERROR: Illegal attempt to score with non-identical atom set between pose and etable
ERROR:: Exit from: src/core/scoring/etable/EtableEnergy.cc line: 72
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish
ID: 63813 · Rating: 0
Sid Celery
Joined: 11 Feb 08
Posts: 2106
Credit: 40,933,658
RAC: 18,058
Message 63849 - Posted: 27 Oct 2009, 1:24:36 UTC
Last modified: 27 Oct 2009, 1:30:02 UTC

Just one failure this week, but one with some errors I haven't seen reported before:

symm_lr8_seq_score12_A_rlbd_1lou_IGNORE_THE_REST_DECOY_14880_3610_0
Outcome Client error
Client state Compute error
Exit status 1 (0x1)
CPU time 2.948419

stderr out <core_client_version>6.6.38</core_client_version>

...
Unpacking data: ../../projects/boinc.bakerlab.org_rosetta/mtyka_symm_lr8_seq_score12_A.zip
error: cannot create ./dummy
error: cannot create ./1gvp.symm
error: cannot create ./1a68.pdb.symm
error: cannot create ./1b3a.pdb.symm
error: cannot create ./1bmg.pdb.symm
error: cannot create ./1cg5.pdb.symm
error: cannot create ./1e6i.pdb.symm
error: cannot create ./1ew4.pdb.symm
error: cannot create ./1lve.pdb.symm
error: cannot create ./1pxu.pdb.symm
error: cannot create ./1rki.pdb.symm
error: cannot create ./1t2j.pdb.symm
error: cannot create ./1tif.pdb.symm
error: cannot create ./1tza.pdb.symm
error: cannot create ./1urn.pdb.symm
error: cannot create ./1vie.pdb.symm
error: cannot create ./1who.pdb.symm
error: cannot create ./1wty.pdb.symm
error: cannot create ./1zd0.pdb.symm
error: cannot create ./2hl7.pdb.symm
error: cannot create ./2hsb.pdb.symm
error: cannot create ./2i9c.pdb.symm
error: cannot create ./2iiy.pdb.symm
Unpacking data: ../../projects/boinc.bakerlab.org_rosetta/lr8_1lou.out.zip
Setting database description ...

...

Many, many more errors of a similar type before it closes down after 3 seconds

Edit: wingman errored out after 11 seconds but with very different errors reported. Best to take a look yourself. Well beyond me.
ID: 63849 · Rating: 0
Mike
Joined: 28 Jun 09
Posts: 1
Credit: 688,633
RAC: 0
Message 63850 - Posted: 27 Oct 2009, 1:54:02 UTC
Last modified: 27 Oct 2009, 1:57:21 UTC

I found the following error when I returned to my computer this afternoon:

Unhandled exception at 0x7c9101b3 in minirosetta_1.98_windows_intelx86.exe: 0xC0000005: Access violation writing location 0x00450eb

Actually, I got a dialog offering to run the Visual Studio debugger (BOINC is running on a development machine), and this is what the debugger identified as the error.

This has happened twice before, possibly in the same work unit.

Addendum - this happens whenever the work unit runs - aborting it
Work Unit: symm_lr13_seq_score12_A_rlbd_1bkr_IGNORE_THE_REST_DECOY_15334_130_1


Environment:
OS: Windows XP Professional Service Pack 3
Processor: Intel Core2 CPU 6420@ 2.13GHz (2 CPUs)
Memory: 3584MB RAM
Page File: 2359MB used, 6421MB available

1. Do you want more debugging info if it happens again?
2. Should I abort the work unit?

-- Mike --
ID: 63850 · Rating: 0
D.J.Lankenau
Joined: 26 May 06
Posts: 1
Credit: 614,744
RAC: 0
Message 63858 - Posted: 27 Oct 2009, 18:14:39 UTC

I would like to know if there is a way to avoid units that run under 1.98.
Not only am I losing hours of crunching time for R@H, but all other projects are being affected. At present I am manually aborting all 1.98 units, but this is a hit-and-miss operation. The alternative is to suspend R@H until the problem is fixed.

Please advise
Doug Lankenau
ID: 63858 · Rating: 0
Cesium_133*
Joined: 1 Dec 08
Posts: 28
Credit: 225,332
RAC: 0
Message 63859 - Posted: 27 Oct 2009, 18:44:18 UTC - in response to Message 63858.  

Not only am I losing hours of crunching time for R@H but all other projects are being affected.


Happened to me too a while back... it's why I walked away from Rosetta and went to POEM. Their WU's don't foul up. I gave Rosetta time to fix their issues, and while my experience hasn't been as bad as the last poster, I have had 2-3 comp error aborts. Also, the graphics for 1.98 often won't show when prompted, a symptom of troubles generally in the code.

The alternative is to suspend R@H until the problem is fixed.


Managers of R@H, take note of that last line. The original poster, joined by myself, just said a mouthful... we value the use of our flops!
The lovely lady you see isn't I, but Hayley Westenra, a classical crossover singer from Christchurch, NZ. There is no known voice as hers. Check her out- she's seraphic.

ID: 63859 · Rating: 0


©2024 University of Washington
https://www.bakerlab.org