Rosetta@home

Minirosetta 1.98

Yifan Song
Forum moderator
Project administrator
Project developer
Project scientist

Joined: May 26 09
Posts: 62
ID: 318024
Credit: 7,322
RAC: 0
Message 63690 - Posted 15 Oct 2009 0:31:55 UTC

A few more command line level options are added.
Report bugs here.

Chilean

Joined: Oct 16 05
Posts: 651
ID: 5008
Credit: 10,394,263
RAC: 2,476
Message 63694 - Posted 15 Oct 2009 3:03:29 UTC

What do these "options" mean exactly?
____________

Yifan Song
Forum moderator
Project administrator
Project developer
Project scientist

Joined: May 26 09
Posts: 62
ID: 318024
Credit: 7,322
RAC: 0
Message 63695 - Posted 15 Oct 2009 7:13:11 UTC - in response to Message ID 63694.
Last modified: 15 Oct 2009 7:15:58 UTC

As David described in his journal (http://boinc.bakerlab.org/rosetta/forum_thread.php?id=1177&nowrap=true#63383), we are working on improving the energy functions. Currently, a lot of parameters are defined either in the code or in minirosetta_database. The new options allow us to test different energy functions from the command line.
One example of what we plan to test in the coming weeks is a hydrogen bond potential with sharper distance and angular dependence. As many of you know, hydrogen bonds play an important role in drug design, as well as in protein/DNA interface design for gene therapy. However, the exact form and magnitude of the hydrogen bond potential are still underdetermined. Now we can change the shape of the hydrogen bond potential from the command line and test whether it agrees with the experimental data we've collected.
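
To illustrate what "sharper distance and angular dependence" could mean, here is a toy sketch in Python. It is not Rosetta's energy function: the functional form (a Gaussian well in distance multiplied by a cosine-power angular term) and every parameter name (ideal_dist, dist_width, angle_power) are assumptions chosen purely for illustration.

```python
import math

def toy_hbond_energy(dist, angle_deg, ideal_dist=1.9, dist_width=0.3, angle_power=2):
    """Toy hydrogen-bond score: more negative is more favorable.

    Hypothetical form, not Rosetta's: a Gaussian well centered on
    ideal_dist (angstroms) times cos(deviation)**angle_power, where the
    deviation is measured from a linear (180 degree) geometry.
    """
    deviation = math.radians(180.0 - angle_deg)
    if abs(deviation) >= math.pi / 2:
        return 0.0  # geometry too bent to count as a hydrogen bond
    distance_term = math.exp(-((dist - ideal_dist) ** 2) / (2 * dist_width ** 2))
    angular_term = math.cos(deviation) ** angle_power
    return -distance_term * angular_term

# A "sharper" potential: narrower distance well, steeper angular falloff.
# The same slightly distorted geometry is scored with both settings.
broad = toy_hbond_energy(2.2, 150.0)
sharp = toy_hbond_energy(2.2, 150.0, dist_width=0.15, angle_power=6)
```

With the sharper settings, the distorted geometry gets much less credit than with the broad ones, while an ideal geometry scores the same in both. Which shape agrees better with experiment is exactly the kind of question such tests are meant to answer.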

What do these "options" mean exactly?

Mod.Sense
Forum moderator
Project administrator

Joined: Aug 22 06
Posts: 3389
ID: 106194
Credit: 0
RAC: 0
Message 63700 - Posted 15 Oct 2009 15:48:07 UTC

The "command line" is what drives each task to get it started on your machine at home. Just like Linux/Unix or DOS commands, BOINC applications are started with a number of options that direct them on how to run. So the behavior is easily changed without changing the program. It is also easy to create several tasks with the same protein and starting point, and see if the command line adjustments make a demonstrable difference in the outcome.
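
As a rough illustration of that idea, the Python sketch below starts the same "program" twice with identical input but one different option. The program name toy_app, the flags --input, --hbond-width and --nstruct, and the file name 1lou.pdb are all hypothetical; they are not actual minirosetta options.

```python
import argparse

def build_parser():
    # Hypothetical option names; they are not actual minirosetta flags.
    parser = argparse.ArgumentParser(prog="toy_app")
    parser.add_argument("--input", required=True, help="starting structure file")
    parser.add_argument("--hbond-width", type=float, default=0.3,
                        help="width of a toy hydrogen-bond distance well")
    parser.add_argument("--nstruct", type=int, default=1,
                        help="number of models to generate")
    return parser

# Two tasks: same protein and starting point, different energy setting.
task_a = build_parser().parse_args(["--input", "1lou.pdb"])
task_b = build_parser().parse_args(["--input", "1lou.pdb", "--hbond-width", "0.15"])
```

Because only the option list differs, any difference in the results of the two tasks can be attributed to the changed setting rather than to a change in the program itself.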
____________
Rosetta Moderator: Mod.Sense

Michael G.R.

Joined: Nov 11 05
Posts: 263
ID: 11128
Credit: 8,385,240
RAC: 115
Message 63703 - Posted 15 Oct 2009 16:26:32 UTC

Thank you Yifan.
____________

AMD_is_logical

Joined: Dec 20 05
Posts: 299
ID: 41207
Credit: 31,460,681
RAC: 0
Message 63708 - Posted 16 Oct 2009 2:27:52 UTC

I'm getting some errors with lr8_score12_run03_rlbd WUs. They exit after a few seconds with the message:

ERROR: Illegal attempt to score with non-identical atom set between pose and etable
ERROR:: Exit from: src/core/scoring/etable/EtableEnergy.cc line: 72
BOINC:: Error reading and gzipping output datafile: default.out

Here are a few examples:
http://boinc.bakerlab.org/rosetta/result.php?resultid=288322249
http://boinc.bakerlab.org/rosetta/result.php?resultid=288322220
http://boinc.bakerlab.org/rosetta/result.php?resultid=288263880
http://boinc.bakerlab.org/rosetta/result.php?resultid=288239633
http://boinc.bakerlab.org/rosetta/result.php?resultid=288212752
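
For anyone collecting reports like these, a small Python sketch such as the one below could tally which source locations the "Exit from" lines point to. The function name exit_locations and the path normalization are assumptions made for this example, and the sample text is assembled from error lines quoted in this thread.

```python
import re
from collections import Counter

# Matches lines such as:
#   ERROR:: Exit from: src/core/scoring/etable/EtableEnergy.cc line: 72
EXIT_RE = re.compile(r"ERROR:: Exit from: (?P<path>\S+) line: (?P<line>\d+)")

def exit_locations(stderr_text):
    """Count (source file, line) pairs reported by 'Exit from' lines."""
    counts = Counter()
    for match in EXIT_RE.finditer(stderr_text):
        # Normalize Windows separators and leading ../ so that the same
        # source file reported from different platforms is grouped together.
        path = match.group("path").replace("\\", "/")
        while path.startswith("../"):
            path = path[3:]
        counts[(path, int(match.group("line")))] += 1
    return counts

sample = """\
ERROR: Illegal attempt to score with non-identical atom set between pose and etable
ERROR:: Exit from: src/core/scoring/etable/EtableEnergy.cc line: 72
ERROR: !core::conformation::symmetry::is_symmetric( pose )
ERROR:: Exit from: ..\\..\\src\\core\\optimization\\AtomTreeMinimizer.cc line: 55
ERROR:: Exit from: src/core/optimization/AtomTreeMinimizer.cc line: 55
"""
summary = exit_locations(sample)
```

Grouping by file and line makes it easier to see whether many failed tasks share one cause or several.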

P . P . L .

Joined: Aug 20 06
Posts: 581
ID: 105843
Credit: 4,864,105
RAC: 0
Message 63709 - Posted 16 Oct 2009 4:43:29 UTC

Same error here by the look of it.

http://boinc.bakerlab.org/rosetta/workunit.php?wuid=263115946

Fri 16 Oct 2009 15:22:49 EST|rosetta@home|Output file lr8_score12_run03_rlbd_1py9_IGNORE_THE_REST_DECOY_14712_976_1_0 for task absent

ERROR: Illegal attempt to score with non-identical atom set between pose and etable
ERROR:: Exit from: src/core/scoring/etable/EtableEnergy.cc line: 72
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish

____________


googloo

Joined: Sep 15 06
Posts: 105
ID: 112667
Credit: 6,304,481
RAC: 7,118
Message 63712 - Posted 16 Oct 2009 12:08:08 UTC

How many times must I ask that you post new versions to the Rosetta Application Version Release Log?

If you do that, those of us subscribed to that thread will get an email and be able to adjust our firewalls BEFORE there are problems.

MarcoA

Joined: Sep 2 08
Posts: 9
ID: 276404
Credit: 777,433
RAC: 0
Message 63713 - Posted 16 Oct 2009 13:03:40 UTC

Same as with AMD_is_logical:
http://boinc.bakerlab.org/rosetta/result.php?resultid=288316563

ERROR: Illegal attempt to score with non-identical atom set between pose and etable
ERROR:: Exit from: src/core/scoring/etable/EtableEnergy.cc line: 72

At least it took only 6s of CPU time.

Snagletooth

Joined: Feb 22 07
Posts: 193
ID: 149031
Credit: 1,425,415
RAC: 236
Message 63715 - Posted 16 Oct 2009 14:24:08 UTC

symm_lr8_seq_score12_ss_1.7_rlbd_1lou_IGNORE_THE_REST_DECOY_14923_2775

Exception:
failure to read decoy F_00023_0001585_0 from silent-file lr8_1lou.out

svincent

Joined: Dec 30 05
Posts: 202
ID: 44923
Credit: 4,404,794
RAC: 6,286
Message 63716 - Posted 16 Oct 2009 16:57:40 UTC

Task 288375137 (symm_lr8_seq_score12_ss_1.7_rlbd_1h75_IGNORE_THE_REST_DECOY_14923_2689_0) failed on Mac OS X 10.6

ERROR: !core::conformation::symmetry::is_symmetric( pose )
ERROR:: Exit from: src/core/optimization/AtomTreeMinimizer.cc line: 55
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish

</stderr_txt>
]]>
____________

Yifan Song
Forum moderator
Project administrator
Project developer
Project scientist

Joined: May 26 09
Posts: 62
ID: 318024
Credit: 7,322
RAC: 0
Message 63718 - Posted 16 Oct 2009 20:33:24 UTC

Hmm, I can't figure out right away where these errors could be coming from. These are rather old work units which have been running OK. The same type of jobs also passed the alpha test on ralph for v1.98. I'll do some more tests over the weekend.

Stephenish

Joined: Feb 26 06
Posts: 3
ID: 61716
Credit: 757,327
RAC: 0
Message 63721 - Posted 16 Oct 2009 22:21:39 UTC
Last modified: 16 Oct 2009 23:00:35 UTC

I had two WUs fail earlier also.

I think these two made computation errors:

10/16/2009 4:37:11 PM rosetta@home Output file denovo_design_rossmann2x3_flxbb_SAVE_ALL_OUT_r2x3_001_rbd_h15_001_0001.10_0001_15195_294_0_0 for task denovo_design_rossmann2x3_flxbb_SAVE_ALL_OUT_r2x3_001_rbd_h15_001_0001.10_0001_15195_294_0 absent

10/16/2009 12:57:47 PM rosetta@home Output file lr8_A_seq_score12_shake_ss1.7_rlbd_2apb_IGNORE_THE_REST_DECOY_14949_3396_0_0 for task lr8_A_seq_score12_shake_ss1.7_rlbd_2apb_IGNORE_THE_REST_DECOY_14949_3396_0 absent

And these two definitely had issues (computation errors):

10/16/2009 5:38:05 PM rosetta@home Output file mega_lr8_seq_score12_rlbd_1srr_IGNORE_THE_REST_DECOY_15198_4_0_0 for task mega_lr8_seq_score12_rlbd_1srr_IGNORE_THE_REST_DECOY_15198_4_0 absent

10/16/2009 5:38:10 PM rosetta@home Output file lr5_dun08_it01_A_rlbd_1gvp_SAVE_ALL_OUT_IGNORE_THE_REST_DECOY_15193_176_0_0 for task lr5_dun08_it01_A_rlbd_1gvp_SAVE_ALL_OUT_IGNORE_THE_REST_DECOY_15193_176_0 absent

I seem to be getting bad WUs back to back. Only on my Pentium D 3.2GHz machine, though.
____________

Jerry Goggin

Joined: Jun 7 06
Posts: 4
ID: 92225
Credit: 226,010
RAC: 0
Message 63724 - Posted 17 Oct 2009 2:12:47 UTC

This task appears to have hung

10/16/2009 10:02:03 PM rosetta@home task denovo_design_rossmann2x3_flxbb_SAVE_ALL_OUT_r2x3_001_rbd_h15_001_0001.82_0001_15195_60_0 suspended by user

so I am going to abort it. Task properties indicate the State "Waiting for memory" whatever that means. There is plenty of memory available. I can resume it, but it simply does not run.

10/16/2009 10:08:35 PM rosetta@home task denovo_design_rossmann2x3_flxbb_SAVE_ALL_OUT_r2x3_001_rbd_h15_001_0001.82_0001_15195_60_0 resumed by user

____________

Mod.Sense
Forum moderator
Project administrator

Joined: Aug 22 06
Posts: 3389
ID: 106194
Credit: 0
RAC: 0
Message 63726 - Posted 17 Oct 2009 3:12:25 UTC

Jerry, BOINC allows you to control the amount of memory used by BOINC. Check your settings to see what % of memory you allow when computer is in use, and when idle. Often more memory is allowed when idle, and so BOINC is waiting either for another task to complete, or for the machine to go idle again to resume work on the task.

So, what you are seeing is normal, not "hung".
____________
Rosetta Moderator: Mod.Sense

Jerry Goggin

Joined: Jun 7 06
Posts: 4
ID: 92225
Credit: 226,010
RAC: 0
Message 63727 - Posted 17 Oct 2009 12:07:57 UTC - in response to Message ID 63726.

I understand what you are saying, but think something else was going on. Based on my settings and running processes, task manager indicated over 500MB free physical memory available and considerably more swap space. Settings allow 50% when computer in use and 100% when computer idle, and Rosetta has had no problems during the years I've been running it. Thing I noticed was that Rosetta had disappeared from the process list -- it wasn't waiting for anything, it was gone.

____________

Mod.Sense
Forum moderator
Project administrator

Joined: Aug 22 06
Posts: 3389
ID: 106194
Credit: 0
RAC: 0
Message 63729 - Posted 17 Oct 2009 15:58:37 UTC

The amount free is not relevant. What counts is the total size times the 50% you allow when the computer is in use. And then, how many CPUs do you have?

It sounds like what you are seeing is that Rosetta often takes more memory to run than it used to, and they increased the recommended minimum system memory in the past year as well to reflect this.

I see you have 1 Windows machine, with 1 CPU and 768MB of memory. So 50% of that is less than the 512MB recommended. But when your machine goes idle and you allow 100%, the task is probably running. As soon as you use the machine to try and see it, the computer is in use and the task must be suspended.
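
The arithmetic can be written out as a small Python sketch. It is a simplification: the helper name boinc_memory_limit_mb is made up for this example, and the real BOINC client takes more into account than this single multiplication.

```python
def boinc_memory_limit_mb(total_mb, percent_allowed):
    """Memory BOINC may use, given total RAM and the allowed percentage."""
    return total_mb * percent_allowed / 100.0

TOTAL_MB = 768          # machine described in this post
RECOMMENDED_MB = 512    # recommended minimum mentioned in this post

in_use_limit = boinc_memory_limit_mb(TOTAL_MB, 50)    # limit while the computer is in use
idle_limit = boinc_memory_limit_mb(TOTAL_MB, 100)     # limit while the computer is idle

fits_when_in_use = in_use_limit >= RECOMMENDED_MB
fits_when_idle = idle_limit >= RECOMMENDED_MB
```

With these numbers the in-use limit is 384MB, below the 512MB recommendation, while the idle limit is the full 768MB. That matches a task that runs while the machine is idle and waits for memory as soon as someone sits down at it.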

With such settings, be sure you check the box to keep tasks in memory while suspended to avoid losing work every time you sit down to use your computer and more memory is required.

If BOINC just got started, and knows that the task was using 200MB when it was last running, and does not presently have 200MB to devote to it, then it may not start the task on the task list until sufficient memory is available.

Also, your original post said the task was waiting for memory, but the message you copied said that you (the user) had suspended it. Perhaps that was as you were preparing to abort it?

Anyway, next time, note the task name and check back in a day and see if it is still "hung". It should work its way through normally without any intervention.
____________
Rosetta Moderator: Mod.Sense

AMD_is_logical

Joined: Dec 20 05
Posts: 299
ID: 41207
Credit: 31,460,681
RAC: 0
Message 63731 - Posted 17 Oct 2009 19:09:38 UTC

I'm getting some errors with symm_lr8_seq_score12_ss_1.7_rlbd WUs.

ERROR: !core::conformation::symmetry::is_symmetric( pose )
ERROR:: Exit from: src/core/optimization/AtomTreeMinimizer.cc line: 55
BOINC:: Error reading and gzipping output datafile: default.out

http://boinc.bakerlab.org/rosetta/result.php?resultid=288650449
http://boinc.bakerlab.org/rosetta/result.php?resultid=288656157
http://boinc.bakerlab.org/rosetta/result.php?resultid=288660509

Jerry Goggin

Joined: Jun 7 06
Posts: 4
ID: 92225
Credit: 226,010
RAC: 0
Message 63732 - Posted 17 Oct 2009 23:21:43 UTC - in response to Message ID 63729.

Really appreciate the prompt feedback. The box to "keep tasks in memory while suspended" was not checked, so that got changed. This is a very old computer which needs replacing soon. Going from Win2K to Windows 7 will be interesting, I'm sure.
____________

Evan

Joined: Dec 23 05
Posts: 268
ID: 42505
Credit: 402,585
RAC: 0
Message 63736 - Posted 18 Oct 2009 9:43:35 UTC

Second chance for this one also proved a failure:

lr8_A_seq_score12_shake_ss1.7_rlbd_1c8c_IGNORE_THE_REST_DECOY_14949_2196

Repeated lines in the stderr txt point to this error

Exception:
failure to read decoy F_00018_0004416_0_0001 from silent-file lr8_shake_1c8c.out
[2009-10-17 20:10:35:] :: BOINC:: Initializing ... ok.

____________

Sid Celery

Joined: Feb 11 08
Posts: 806
ID: 241409
Credit: 10,030,156
RAC: 9,347
Message 63743 - Posted 18 Oct 2009 11:56:08 UTC

Over the last week I've had no errors at all. That's the second week running. Well done again guys.

Boinc 6.6.38
Processor: 4 AuthenticAMD AMD Phenom(tm) 9850 Quad-Core Processor [AMD64 Family 16 Model 2 Stepping 3]
OS: Microsoft Windows Vista: Home Premium x64 Edition, Service Pack 2, (06.00.6002.00)
Memory: 8.00 GB physical, 17.52 GB virtual

____________

Evan

Joined: Dec 23 05
Posts: 268
ID: 42505
Credit: 402,585
RAC: 0
Message 63744 - Posted 18 Oct 2009 12:57:55 UTC

Over the last week I've had no errors at all. That's the second week running. Well done again guys.


Yes, seconded! My last error report was an exception to the norm.

____________

der_Day

Joined: Apr 17 08
Posts: 1
ID: 253533
Credit: 192,713
RAC: 0
Message 63746 - Posted 18 Oct 2009 16:57:40 UTC

I've had a lot of problems since Friday. Most of my WUs broke after a few seconds, and some produced an error message:
Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x7C9201B3 write attempt to address 0x00BD97B2


See mega_lr8_seq_score12_rlbd_2hbo_IGNORE_THE_REST_DECOY_15198_2497
or mega_lr10_seq_score12_rlbd_1vkk_IGNORE_THE_REST_DECOY_15197_2829_0

Sometimes my wingmen can finish the WU, sometimes they fail too. What should I do?

Evan

Joined: Dec 23 05
Posts: 268
ID: 42505
Credit: 402,585
RAC: 0
Message 63748 - Posted 18 Oct 2009 19:38:30 UTC

Yes, seconded! My last error report was an exception to the norm.


Oh dear, I talked too soon!

symm_lr8_seq_score12_ss_1.7_rlbd_1ttz_IGNORE_THE_REST_DECOY_14923_2951

ERROR: !core::conformation::symmetry::is_symmetric( pose )
ERROR:: Exit from: ..\..\src\core\optimization\AtomTreeMinimizer.cc line: 55
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish

____________

Sid Celery

Joined: Feb 11 08
Posts: 806
ID: 241409
Credit: 10,030,156
RAC: 9,347
Message 63751 - Posted 18 Oct 2009 21:37:59 UTC - in response to Message ID 63748.

Oh dear, I talked too soon!

symm_lr8_seq_score12_ss_1.7_rlbd_1ttz_IGNORE_THE_REST_DECOY_14923_2951

ERROR: !core::conformation::symmetry::is_symmetric( pose )
ERROR:: Exit from: ..\..\src\core\optimization\AtomTreeMinimizer.cc line: 55
BOINC:: Error reading and gzipping output datafile: default.out

As did I...

symm_lr8_seq_score12_ss_1.7_rlbd_1h75_IGNORE_THE_REST_DECOY_14923_3191_1
____________

Nosferatu*

Joined: Nov 27 05
Posts: 1
ID: 22128
Credit: 1,004,857
RAC: 0
Message 63752 - Posted 18 Oct 2009 21:38:25 UTC

Don't know quite where to post this but since the change to 1.98 all workunits result in Computation error. 5.98 is running just fine as are all other projects in boinc.

zibou

Joined: Sep 9 09
Posts: 2
ID: 344397
Credit: 559,314
RAC: 309
Message 63759 - Posted 19 Oct 2009 23:18:48 UTC

I see two different behaviours with 1.98:

- On Windows XP, units fail after 5 minutes with a computation error.

- On Windows 2000, units run for 9 hours (normally 3) before my patience runs out; the percentage completed stays at 0%, and the time remaining does not move from its initial value. It does not leave other projects any time to run. Restarting BOINC Manager resets the clock to zero, and it starts all over again with the same behaviour. I had to suspend the project until there is a fix.
____________

spiceyux

Joined: Nov 9 05
Posts: 1
ID: 10656
Credit: 2,387,273
RAC: 2,467
Message 63760 - Posted 20 Oct 2009 1:28:07 UTC - in response to Message ID 63752.

Don't know quite where to post this but since the change to 1.98 all workunits result in Computation error. 5.98 is running just fine as are all other projects in boinc.


Just seconding this, I'm having the same behavior. Please advise if there is anything that I can report to help diagnose.
____________

borg

Joined: Dec 4 07
Posts: 3
ID: 224173
Credit: 142,556
RAC: 0
Message 63769 - Posted 21 Oct 2009 7:24:39 UTC

21.10.2009 9:14:49 rosetta@home Task lr8_A_seq_score12_shake_ss1.7_rlbd_1c8c_IGNORE_THE_REST_DECOY_14949_3992_0 exited with zero status but no 'finished' file
21.10.2009 9:14:49 rosetta@home If this happens repeatedly you may need to reset the project.

This was happening repeatedly. Finally I aborted the task.

AM

Joined: Jul 15 06
Posts: 7
ID: 100201
Credit: 318,158
RAC: 49
Message 63772 - Posted 21 Oct 2009 15:10:43 UTC

This and other mini 1.98 WU's have been resource hogs lately.

http://boinc.bakerlab.org/rosetta/result.php?resultid=287892484
____________

googloo

Joined: Sep 15 06
Posts: 105
ID: 112667
Credit: 6,304,481
RAC: 7,118
Message 63779 - Posted 23 Oct 2009 1:16:27 UTC

FYI

I aborted this one:
http://boinc.bakerlab.org/rosetta/result.php?resultid=289747880 because it had run nearly 30 minutes with 0% completed.

FWIW, my wingman had a Compute Error; see
http://boinc.bakerlab.org/rosetta/workunit.php?wuid=264327534

bradipopitt

Joined: Mar 4 06
Posts: 1
ID: 63544
Credit: 1,096,654
RAC: 0
Message 63793 - Posted 23 Oct 2009 13:08:37 UTC

Hi there,
I cannot run the new version anymore.
I always get an error like this:

<core_client_version>6.6.38</core_client_version>
<![CDATA[
<message>
- exit code -1073741819 (0xc0000005)
</message>
<stderr_txt>
[2009-10-23 9:33: 1:] :: BOINC:: Initializing ... ok.
[2009-10-23 9:33: 1:] :: BOINC :: boinc_init()
BOINC:: Setting up shared resources ... ok.
BOINC:: Setting up semaphores ... ok.
BOINC:: Updating status ... ok.
BOINC:: Registering timer callback... ok.
BOINC:: Worker initialized successfully.
Registering options..


Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x7C9101B3 write attempt to address 0x00BD0DA2

Engaging BOINC Windows Runtime Debugger...



Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x7C9101B3 write attempt to address 0x0045E5AE

Engaging BOINC Windows Runtime Debugger...


</stderr_txt>
]]>
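
The negative exit code and the hexadecimal value in parentheses are the same 32-bit number written two ways. A minimal Python sketch of the conversion, with to_unsigned32 as a helper name made up for this example:

```python
def to_unsigned32(code):
    """Reinterpret a signed 32-bit exit code as an unsigned value."""
    return code & 0xFFFFFFFF

exit_code = -1073741819                         # value from the <message> block
as_hex = f"0x{to_unsigned32(exit_code):08x}"    # same number, unsigned and in hex
```

On Windows, 0xc0000005 is the status code for an access violation, which agrees with the "Access Violation (0xc0000005)" lines in the stderr text.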

Do you have any idea how to overcome it?
For the moment I am just running Rosetta beta.

Have a good weekend
____________

Speedy

Joined: Sep 25 05
Posts: 159
ID: 1058
Credit: 521,019
RAC: 10
Message 63801 - Posted 23 Oct 2009 20:55:36 UTC
Last modified: 23 Oct 2009 21:01:06 UTC

http://boinc.bakerlab.org/rosetta/result.php?resultid=289275155 gave a compute error after running for roughly 1hr 06min, 3812.641 seconds. This is an lr8 task, task ID 289275155.
____________
Have a crunching good day!!

Cesium_133*

Joined: Dec 1 08
Posts: 28
ID: 290631
Credit: 113,039
RAC: 2
Message 63803 - Posted 24 Oct 2009 1:50:31 UTC

I'll join the chorus on error-halted WU's (I assume this is the place to give notice about such things) :

symm_lr8_seq_score12_ss_1.7_rlbd_1npu_IGNORE_THE_REST_DECOY_14923_2726_0

Computation error after 1m 13s...

If this isn't the venue to report such glitches, though it seems to be, please let me know :) Incidentally, what % of WU's are found to be corrupted/abortive/unfinishable, and can they be rehabilitated for purposes of this project? :)

The inquiring layman's mind wishes to know... :)
____________
The lovely lady you see isn't I, but Hayley Westenra, a classical crossover singer from Christchurch, NZ. There is no known voice as hers. Check her out- she's seraphic.

macko

Joined: Jun 25 09
Posts: 32
ID: 323638
Credit: 152,285
RAC: 0
Message 63804 - Posted 24 Oct 2009 9:14:47 UTC - in response to Message ID 63736.

Second chance for this one also proved a failure:

lr8_A_seq_score12_shake_ss1.7_rlbd_1c8c_IGNORE_THE_REST_DECOY_14949_2196

Repeated lines in the stderr txt point to this error

Exception:
failure to read decoy F_00018_0004416_0_0001 from silent-file lr8_shake_1c8c.out
[2009-10-17 20:10:35:] :: BOINC:: Initializing ... ok.


Hi

Similar error with this:
http://boinc.bakerlab.org/rosetta/workunit.php?wuid=263361396
Exception:
failure to read decoy F_00023_0001585_0 from silent-file lr8_1lou.out
CPU time 15.23438

And a different one with these:
http://boinc.bakerlab.org/rosetta/workunit.php?wuid=263361627
Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x00436BD8 read attempt to address 0x018BD000

Engaging BOINC Windows Runtime Debugger...

http://boinc.bakerlab.org/rosetta/workunit.php?wuid=263361627
Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x00436BD8 read attempt to address 0x018BD000

Engaging BOINC Windows Runtime Debugger...

After these 3 the program runs normal


____________

P . P . L .

Joined: Aug 20 06
Posts: 581
ID: 105843
Credit: 4,864,105
RAC: 0
Message 63813 - Posted 25 Oct 2009 5:42:22 UTC
Last modified: 25 Oct 2009 5:43:14 UTC

This one failed many times.

http://boinc.bakerlab.org/rosetta/workunit.php?wuid=262741617

Sun 25 Oct 2009 15:25:52 EST|rosetta@home|Output file lr8_score12_run03_rlbd_1ugh_IGNORE_THE_REST_DECOY_14712_835_2_0 for task absent

<message>
process exited with code 1 (0x1, -255)
</message>

ERROR: Illegal attempt to score with non-identical atom set between pose and etable
ERROR:: Exit from: src/core/scoring/etable/EtableEnergy.cc line: 72
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish
____________


Sid Celery

Joined: Feb 11 08
Posts: 806
ID: 241409
Credit: 10,030,156
RAC: 9,347
Message 63849 - Posted 27 Oct 2009 1:24:36 UTC
Last modified: 27 Oct 2009 1:30:02 UTC

Just one failure this week, but one with some errors I haven't seen reported before:

symm_lr8_seq_score12_A_rlbd_1lou_IGNORE_THE_REST_DECOY_14880_3610_0

Outcome Client error
Client state Compute error
Exit status 1 (0x1)
CPU time 2.948419

stderr out <core_client_version>6.6.38</core_client_version>

...
Unpacking data: ../../projects/boinc.bakerlab.org_rosetta/mtyka_symm_lr8_seq_score12_A.zip
error: cannot create ./dummy
error: cannot create ./1gvp.symm
error: cannot create ./1a68.pdb.symm
error: cannot create ./1b3a.pdb.symm
error: cannot create ./1bmg.pdb.symm
error: cannot create ./1cg5.pdb.symm
error: cannot create ./1e6i.pdb.symm
error: cannot create ./1ew4.pdb.symm
error: cannot create ./1lve.pdb.symm
error: cannot create ./1pxu.pdb.symm
error: cannot create ./1rki.pdb.symm
error: cannot create ./1t2j.pdb.symm
error: cannot create ./1tif.pdb.symm
error: cannot create ./1tza.pdb.symm
error: cannot create ./1urn.pdb.symm
error: cannot create ./1vie.pdb.symm
error: cannot create ./1who.pdb.symm
error: cannot create ./1wty.pdb.symm
error: cannot create ./1zd0.pdb.symm
error: cannot create ./2hl7.pdb.symm
error: cannot create ./2hsb.pdb.symm
error: cannot create ./2i9c.pdb.symm
error: cannot create ./2iiy.pdb.symm
Unpacking data: ../../projects/boinc.bakerlab.org_rosetta/lr8_1lou.out.zip
Setting database description ...

...

Many, many more errors of a similar type before it closes down after 3 seconds

Edit: wingman errored out after 11 seconds but with very different errors reported. Best to take a look yourself. Well beyond me.
____________

Mike

Joined: Jun 28 09
Posts: 1
ID: 324073
Credit: 6,427
RAC: 0
Message 63850 - Posted 27 Oct 2009 1:54:02 UTC
Last modified: 27 Oct 2009 1:57:21 UTC

I found the following error when I returned to my computer this afternoon:

Unhandled exception at 0x7c9101b3 in minirosetta_1.98_windows_intelx86.exe: 0xC0000005: Access violation writing location 0x00450eb

Actually I got a dialog offering to run the Visual Studio debugger, (BOINC is running on a development machine) and this is what the debugger identified as the error.

This has happened twice before, possibly in the same work unit.

Addendum - this happens whenever the work unit runs - aborting it
Work Unit: symm_lr13_seq_score12_A_rlbd_1bkr_IGNORE_THE_REST_DECOY_15334_130_1


Environment:
OS: Windows XP Professional Service Pack 3
Processor: Intel Core2 CPU 6420@ 2.13GHz (2 CPUs)
Memory: 3584MB RAM
Page File: 2359MB used, 6421MB available

1. Do you want more debugging info if it happens again?
2. Should I abort the work unit?

-- Mike --

D.J.Lankenau

Joined: May 26 06
Posts: 1
ID: 84670
Credit: 614,508
RAC: 1
Message 63858 - Posted 27 Oct 2009 18:14:39 UTC

I would like to know if there is a way to avoid units that run under 1.98.
Not only am I losing hours of crunching time for R@H, but all other projects are being affected. At present I am manually aborting all 1.98 units, but this is a hit-and-miss operation. The alternative is to suspend R@H until the problem is fixed.

Please advise
Doug Lankenau

Cesium_133*

Joined: Dec 1 08
Posts: 28
ID: 290631
Credit: 113,039
RAC: 2
Message 63859 - Posted 27 Oct 2009 18:44:18 UTC - in response to Message ID 63858.

Not only am I losing hours of crunching time for R@H but all other projects are being affected.


Happened to me too a while back... it's why I walked away from Rosetta and went to POEM. Their WU's don't foul up. I gave Rosetta time to fix their issues, and while my experience hasn't been as bad as the last poster, I have had 2-3 comp error aborts. Also, the graphics for 1.98 often won't show when prompted, a symptom of troubles generally in the code.

The alternative is to suspend R@H until the problem is fixed.


Managers of R@H, take note of that last line. The original poster, joined by myself, just said a mouthful... we value the use of our flops!
____________
The lovely lady you see isn't I, but Hayley Westenra, a classical crossover singer from Christchurch, NZ. There is no known voice as hers. Check her out- she's seraphic.

JeffT

Joined: Dec 1 06
Posts: 2
ID: 132581
Credit: 3,242,558
RAC: 0
Message 63864 - Posted 28 Oct 2009 2:12:05 UTC

All my machines were getting Computation Errors.

Here is the specifics on one:

Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x007480C0 read attempt to address 0x9800503B

Engaging BOINC Windows Runtime Debugger...

I pulled everything off of Rosetta.
____________

MikeMcC3

Joined: May 13 08
Posts: 2
ID: 258469
Credit: 501,309
RAC: 0
Message 63865 - Posted 28 Oct 2009 4:25:35 UTC

My computer has stopped DLing any new work from the rosetta project since
27 Oct 2009 3:20:11 UTC. I double-checked all my settings. Nothing has changed,
I'm just not receiving any new work. So what's happened now, any ideas?
____________

Mod.Sense
Forum moderator
Project administrator

Joined: Aug 22 06
Posts: 3389
ID: 106194
Credit: 0
RAC: 0
Message 63866 - Posted 28 Oct 2009 4:31:39 UTC - in response to Message ID 63865.

My computer has stopped DLing any new work from the rosetta project since
27 Oct 2009 3:20:11 UTC. I double-checked all my settings. Nothing has changed,
I'm just not receiving any new work. So what's happened now, any ideas?


Whatcha getting for messages?
____________
Rosetta Moderator: Mod.Sense

Sid Celery

Joined: Feb 11 08
Posts: 806
ID: 241409
Credit: 10,030,156
RAC: 9,347
Message 63869 - Posted 28 Oct 2009 14:24:37 UTC - in response to Message ID 63865.

My computer has stopped DLing any new work from the rosetta project since
27 Oct 2009 3:20:11 UTC. I double-checked all my settings. Nothing has changed,
I'm just not receiving any new work. So what's happened now, any ideas?

I see you're getting tasks through from Einstein ok. Could it be something to do with debt on one project compared to another? Or the split of work between the two? Or just Boinc messing up scheduling again.

Additional info on what Boinc is reporting under the messages tab would help to pin the reason down.
____________

Yifan Song
Forum moderator
Project administrator
Project developer
Project scientist

Joined: May 26 09
Posts: 62
ID: 318024
Credit: 7,322
RAC: 0
Message 63886 - Posted 30 Oct 2009 18:22:46 UTC

Mike has figured out where the bugs are and is currently working on them.
There seems to be a conflict between the symmetry code and disulphide, which is why a lot of symm runs are failing.
Also, there is an API bug for zip, which causes some of the I/O problems.

Hopefully we'll be able to update at the beginning of next week.
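
The actual zip bug is not described in this thread. Purely as an illustration of how unpacking failures like the "cannot create" messages reported earlier could be collected rather than ignored, here is a Python sketch using the standard zipfile module. The function name unpack_and_report and the demo archive are assumptions made for this example; this is not the code minirosetta uses.

```python
import os
import tempfile
import zipfile

def unpack_and_report(zip_path, dest_dir):
    """Extract every member and return the names that could not be created."""
    failed = []
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            try:
                archive.extract(name, dest_dir)
            except OSError:
                # For example: destination not writable, or disk full.
                failed.append(name)
    return failed

# Self-contained demonstration with a throwaway archive.
with tempfile.TemporaryDirectory() as work:
    zip_path = os.path.join(work, "demo.zip")
    with zipfile.ZipFile(zip_path, "w") as archive:
        archive.writestr("1gvp.symm", "placeholder contents\n")
    out_dir = os.path.join(work, "out")
    os.mkdir(out_dir)
    failures = unpack_and_report(zip_path, out_dir)
    extracted = sorted(os.listdir(out_dir))
```

Returning the list of failed members lets the caller decide whether to stop the task early instead of continuing with missing input files.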

Sid Celery

Joined: Feb 11 08
Posts: 806
ID: 241409
Credit: 10,030,156
RAC: 9,347
Message 63892 - Posted 31 Oct 2009 2:00:41 UTC - in response to Message ID 63869.

My computer has stopped DLing any new work from the rosetta project since
27 Oct 2009 3:20:11 UTC. I double-checked all my settings. Nothing has changed,
I'm just not receiving any new work. So what's happened now, any ideas?

I see you're getting tasks through from Einstein ok. Could it be something to do with debt on one project compared to another? Or the split of work between the two? Or just Boinc messing up scheduling again.

Additional info on what Boinc is reporting under the messages tab would help to pin the reason down.

Problem solved, I notice. And in a big way.

New problem - too many WUs! ;)
____________

Rob Lilley

Joined: Jan 11 06
Posts: 11
ID: 49465
Credit: 96,486
RAC: 21
Message 63893 - Posted 31 Oct 2009 9:59:40 UTC

Error on this WU after a couple of wasted hours of crunching, as follows:

Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x00436BD8 read attempt to address 0x016AE000

I was using this machine, running XP SP3 and using BOINC version 6.6.41

Will suspend work fetch and await developments...
____________

MarcoA

Joined: Sep 2 08
Posts: 9
ID: 276404
Credit: 777,433
RAC: 0
Message 63936 - Posted 3 Nov 2009 11:41:27 UTC

http://boinc.bakerlab.org/rosetta/result.php?resultid=292416155

Segfault...

Mod.Sense
Forum moderator
Project administrator

Joined: Aug 22 06
Posts: 3389
ID: 106194
Credit: 0
RAC: 0
Message 63969 - Posted 6 Nov 2009 5:25:52 UTC

Let's leave the sticky on this thread until the existing 1.98 WUs have had 10 days to reach their expiration.
____________
Rosetta Moderator: Mod.Sense

svincent

Joined: Dec 30 05
Posts: 202
ID: 44923
Credit: 4,404,794
RAC: 6,286
Message 64024 - Posted 11 Nov 2009 22:36:58 UTC

This may be ancient history with the release of 2.0, but I had several tasks with names like threading_bongs_pipeline_hb* hang under Windows 7 at random percentage completion values. Bringing up the graphics window simply resulted in a blank window: I had to abort the tasks.

Example: task 293254584

____________

svincent

Joined: Dec 30 05
Posts: 202
ID: 44923
Credit: 4,404,794
RAC: 6,286
Message 64025 - Posted 11 Nov 2009 22:37:33 UTC

This may be ancient history with the release of 2.0, but I had several tasks with names like threading_bongs_pipeline_hb* hang under Windows 7 at random percentage completion values. Bringing up the graphics window simply resulted in a blank window: I had to abort the tasks.

Example: task 293254584

____________

Mod.Sense
Forum moderator
Project administrator

Joined: Aug 22 06
Posts: 3389
ID: 106194
Credit: 0
RAC: 0
Message 64026 - Posted 11 Nov 2009 23:54:31 UTC

svincent, did you get a look at the task manager? Was the processing task getting CPU? It is possible that the graphic had a problem but the processing was continuing.
____________
Rosetta Moderator: Mod.Sense

svincent

Joined: Dec 30 05
Posts: 202
ID: 44923
Credit: 4,404,794
RAC: 6,286
Message 64027 - Posted 12 Nov 2009 1:11:50 UTC

svincent, did you get a look at the task manager? Was the processing task getting CPU? It is possible that the graphic had a problem but the processing was continuing.


Graphics worked OK for other workunits. I didn't take a look at the task manager (will do next time) but was going on the combination of the Progress and Elapsed Time fields in the Boinc Manager: the former was stuck and the latter kept going. I had one such task that went on over 25 hours before I aborted it: unfortunately my results page doesn't go back far enough to find it. On the other hand I did have some workunits named threading_bongs_* complete successfully.

Sorry about the previous double post.

____________

Sid Celery

Joined: Feb 11 08
Posts: 806
ID: 241409
Credit: 10,030,156
RAC: 9,347
Message 64029 - Posted 12 Nov 2009 3:27:40 UTC
Last modified: 12 Nov 2009 3:34:29 UTC

Two final 1.98 errors to report over the last week. The first on my main Vista desktop and one on my spiffy new W7 laptop.

threading_oct09_hb_t302__IGNORE_THE_REST_15390_247_1
{Edit: I was the wingman on this job after it failed for similar reasons on its first try}

Outcome Client error
Client state Compute error
Exit status -1073741819 (0xc0000005)
CPU time 15.8497

stderr out <core_client_version>6.6.41</core_client_version>
...
- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x005C28F0 read attempt to address 0x00000000


lr5_dun08_it02_B_rlbd_1eyv_SAVE_ALL_OUT_IGNORE_THE_REST_DECOY_15457_979_0
{Edit: the wingman failed on this job too, as a Mini 2.00 job}
Outcome Client error
Client state Compute error
Exit status 1 (0x1)
CPU time 10215.96

stderr out <core_client_version>6.10.17</core_client_version>
...
ERROR: Option file open failed for: relax_options_lr5_dun08_it02_B_yfsong


No errors at all on either machine on Beta 5.98 or MiniRosetta 2.00 WUs.
____________


Copyright © 2017 University of Washington

Last Modified: 10 Nov 2010 1:51:38 UTC