Posts by AMD_is_logical

21) Message boards : Number crunching : Minirosetta 1.98 (Message 63708)
Posted 16 Oct 2009 by AMD_is_logical
Post:
I'm getting some errors with lr8_score12_run03_rlbd WUs. They exit after a few seconds with the message:

ERROR: Illegal attempt to score with non-identical atom set between pose and etable
ERROR:: Exit from: src/core/scoring/etable/EtableEnergy.cc line: 72
BOINC:: Error reading and gzipping output datafile: default.out

Here are a few examples:
http://boinc.bakerlab.org/rosetta/result.php?resultid=288322249
http://boinc.bakerlab.org/rosetta/result.php?resultid=288322220
http://boinc.bakerlab.org/rosetta/result.php?resultid=288263880
http://boinc.bakerlab.org/rosetta/result.php?resultid=288239633
http://boinc.bakerlab.org/rosetta/result.php?resultid=288212752
22) Message boards : Number crunching : Minirosetta 1.97 (Message 63522)
Posted 29 Sep 2009 by AMD_is_logical
Post:
I've had a number of "frb" WUs run out of disk space with multi-gigabyte stderr.txt files. Those files were full of "bounds error" statements.

example:
http://boinc.bakerlab.org/rosetta/result.php?resultid=284162008
23) Message boards : Number crunching : Granted Credit taking forever.... (Message 63370)
Posted 15 Sep 2009 by AMD_is_logical
Post:
The Validators are finally catching up, and my pending list has been cut in half.

The home page still lists the TFLOPS as only 17. I don't think it's being calculated correctly in the new Validator configuration.
24) Message boards : Number crunching : SERVER PROBLEMS. (Message 63330)
Posted 14 Sep 2009 by AMD_is_logical
Post:
EDIT// Just a suggestion but you might have to stop sending out new tasks to let the validator catch up, other projects have done that.//

That wouldn't do any good. The Validator can't validate current tasks no matter how much time it is given. When I send back an old WU that the Validator can handle, it is promptly validated.

Example: http://boinc.bakerlab.org/rosetta/workunit.php?wuid=253620869

25) Message boards : Number crunching : SERVER PROBLEMS. (Message 63323)
Posted 14 Sep 2009 by AMD_is_logical
Post:
I got credit for a WU today:

http://boinc.bakerlab.org/rosetta/workunit.php?wuid=253191930

Note that this WU is an old one that was never returned the first time, and was eventually resent to me. I had some similar WUs yesterday.

All of the many current WUs that I've returned are still pending.

So somehow all the current WUs are different in a way that the Validator can't handle, but old WUs validate just fine.
26) Message boards : Number crunching : Minirosetta 1.97 (Message 63216)
Posted 9 Sep 2009 by AMD_is_logical
Post:
I currently have over 100 results in my "pending" list. Also, I notice that I have quite a few results taking around 2 hours or less. (My runtime setting is 12 hours.)
27) Message boards : Number crunching : Minirosetta 1.90 and 1.91 (Message 62870)
Posted 11 Aug 2009 by AMD_is_logical
Post:
I've had a few looprebuild_t374 WUs exit in zero time with the message:

ERROR: Option matching -in:detect_disulfides not found in command line top-level context

http://boinc.bakerlab.org/rosetta/result.php?resultid=272007811
http://boinc.bakerlab.org/rosetta/result.php?resultid=271996598
http://boinc.bakerlab.org/rosetta/result.php?resultid=271963149
28) Message boards : Number crunching : Minirosetta 1.90 and 1.91 (Message 62702)
Posted 1 Aug 2009 by AMD_is_logical
Post:
I got a few errors on lr5_combine_mods_run01_rlbn WUs.

http://boinc.bakerlab.org/rosetta/result.php?resultid=269713462
http://boinc.bakerlab.org/rosetta/result.php?resultid=269758962
http://boinc.bakerlab.org/rosetta/result.php?resultid=269787057
http://boinc.bakerlab.org/rosetta/result.php?resultid=269811876

They end after about 10 seconds with the error:

Native pose needed for OptionKeys::relax::constrain_relax_to_native_coords
ERROR:: Exit from: src/protocols/relax/ClassicRelax.cc line: 544
BOINC:: Error reading and gzipping output datafile: default.out
29) Message boards : Number crunching : Minirosetta 1.82/1.88 (Message 62607)
Posted 30 Jul 2009 by AMD_is_logical
Post:
I'm getting a lot of 1.88 WUs with compute errors after zero time. They have the error message: "ERROR: Option matching -detect_disulf not found in command line top-level context"

A few examples:
http://boinc.bakerlab.org/rosetta/result.php?resultid=268993566
http://boinc.bakerlab.org/rosetta/result.php?resultid=268991528
http://boinc.bakerlab.org/rosetta/result.php?resultid=268991527
http://boinc.bakerlab.org/rosetta/result.php?resultid=268990691
http://boinc.bakerlab.org/rosetta/result.php?resultid=268990689
http://boinc.bakerlab.org/rosetta/result.php?resultid=268987949
http://boinc.bakerlab.org/rosetta/result.php?resultid=268987910
http://boinc.bakerlab.org/rosetta/result.php?resultid=268985221
http://boinc.bakerlab.org/rosetta/result.php?resultid=268984719
30) Message boards : Number crunching : SERVER PROBLEMS. (Message 62513)
Posted 27 Jul 2009 by AMD_is_logical
Post:
I'm getting 1.87 tasks now, and they seem to be crunching.

The server is very slow at the moment. I expect it will catch up after a while and then things will be back to normal.
31) Message boards : Number crunching : Minirosetta 1.82/1.88 (Message 62424)
Posted 25 Jul 2009 by AMD_is_logical
Post:
The WU looprebuild_t374_decoy_1_12863_2519 had a validate error for both crunchers after crunching a short amount of time and producing 99 decoys.
32) Message boards : Number crunching : Minirosetta 1.82/1.88 (Message 62252)
Posted 16 Jul 2009 by AMD_is_logical
Post:
This WU, lr8_seq_score12_ss5.0_rlbd_1ptq_IGNORE_THE_REST_DECOY_14281_1507_0, had a segmentation violation after 5 seconds on one of my Linux systems.
33) Message boards : Number crunching : Problems with Minirosetta 1.80 (Message 62184)
Posted 11 Jul 2009 by AMD_is_logical
Post:
I'm getting a lot of errors from sel_core WUs. They crunch for over 16 hours (4 hours past my 12-hour preference), exit claiming 1 decoy (although I suspect they didn't produce any), and then error out with code -161 (file_xfer_error, probably because no decoys were generated).

http://boinc.bakerlab.org/rosetta/result.php?resultid=264546035
http://boinc.bakerlab.org/rosetta/result.php?resultid=264511109
http://boinc.bakerlab.org/rosetta/result.php?resultid=264503575
http://boinc.bakerlab.org/rosetta/result.php?resultid=264490536
http://boinc.bakerlab.org/rosetta/result.php?resultid=264476503
http://boinc.bakerlab.org/rosetta/result.php?resultid=264456236
http://boinc.bakerlab.org/rosetta/result.php?resultid=264403527
http://boinc.bakerlab.org/rosetta/result.php?resultid=264394564
34) Message boards : Number crunching : Help with Linux Installation (Message 62144)
Posted 9 Jul 2009 by AMD_is_logical
Post:
I did some simple math and 1200 seconds is 20 minutes, 10,000 divided by 20 equals 500. 3 times per hour, times 24 hours, equals 72 times per day, divided into 500 equals about 7 days. Which is how many times you can write before it becomes susceptible to dying, more or less.


Why did you divide 10,000 by 20?

10,000 writes divided by 72 writes per day gives 139 days.
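
To make the arithmetic explicit, here is a minimal sketch. It assumes the figures from the quoted post (a 10,000-write limit and one write every 1200 seconds); the variable names are just for illustration.

```python
# Assumed figures, taken from the quoted post: a 10,000-write endurance
# limit and one write every 1200 seconds (20 minutes).
WRITE_LIMIT = 10_000
WRITE_INTERVAL_SECONDS = 1200

SECONDS_PER_DAY = 24 * 60 * 60

# 3 writes per hour * 24 hours = 72 writes per day
writes_per_day = SECONDS_PER_DAY // WRITE_INTERVAL_SECONDS

# Divide the write limit by writes per day, not by the interval in minutes.
days_to_limit = WRITE_LIMIT / writes_per_day

print(writes_per_day)        # 72
print(round(days_to_limit))  # 139
```

The 20-minute interval is already accounted for in the 72 writes per day, so dividing 10,000 by 20 first counts it twice.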
35) Message boards : Number crunching : Problems with Minirosetta 1.80 (Message 62119)
Posted 7 Jul 2009 by AMD_is_logical
Post:
This one is taking 689MB of memory, peak was 986MB!
2a05_NN_DISCONTROL_BOINC_ABRELAX_SAVE_ALL_OUT_13840
It is 20hrs in to a 24hr runtime on Windows XP, under BOINC 6.6.20.


Here's a 2a05_NN_DISCONTROL_BOINC_ABRELAX_SAVE_ALL_OUT_13840 WU that ran on a single-core diskless Linux node with 1GB installed. It ended with a bad_alloc error, which means the node ran out of physical memory. I've had a number of bad_alloc errors on 512MB nodes (which I no longer crunch with), but now it seems 1GB/core may no longer be enough for Rosetta.
36) Message boards : Number crunching : Problems with Minirosetta 1.80 (Message 62022)
Posted 30 Jun 2009 by AMD_is_logical
Post:
I've now had quite a lot of WUs run for 4 hours over my run time of 12 hours, and then get ended by the watchdog. They always report one decoy being made, although, in fact, no decoys seem to have been produced. They then have a file xfer error (-161), presumably because there was no output file.

Here's yet another example: http://boinc.bakerlab.org/rosetta/result.php?resultid=262096625

Note that this ran over 16 hours on a Phenom II, yet produced no output.
37) Message boards : Number crunching : Problems with Minirosetta 1.80 (Message 61995)
Posted 29 Jun 2009 by AMD_is_logical
Post:
I've had a number of real_core_1.5_low200_beta_low200_start_ WUs go 4 hours past my runtime and they were presumably ended by the watchdog. They all claim 1 decoy and were marked invalid.

http://boinc.bakerlab.org/rosetta/result.php?resultid=261815816
http://boinc.bakerlab.org/rosetta/result.php?resultid=261768023
http://boinc.bakerlab.org/rosetta/result.php?resultid=261765649
http://boinc.bakerlab.org/rosetta/result.php?resultid=261722487
38) Message boards : Number crunching : Problems with Minirosetta v1.54 (Message 59645)
Posted 18 Feb 2009 by AMD_is_logical
Post:
I've had three ss-neg-1i17__7365 WUs fail with segmentation violations on three different Linux machines:

http://boinc.bakerlab.org/rosetta/result.php?resultid=229167706
http://boinc.bakerlab.org/rosetta/result.php?resultid=229161990
http://boinc.bakerlab.org/rosetta/result.php?resultid=229084435

(I notice that only the third number is different in the stack traces of the above three WUs.)
39) Message boards : Number crunching : new version of BOINC + CUDA support (Message 59403)
Posted 6 Feb 2009 by AMD_is_logical
Post:
For those that want to read the article dcdc was referring to: Intel will design PlayStation 4 GPU


Sony denies this rumor: http://www.t3.com/news/sony-denies-ps4-intel-gpu-rumour?=38046&cid=OTC-RSS&attr=T3-Main-RSS
40) Message boards : Number crunching : Tired of 11 hours (Message 59234)
Posted 2 Feb 2009 by AMD_is_logical
Post:
????

I looked at the WUs your computers completed and most are around 6 hours or less. (6 hours is 21,600 seconds.)





©2025 University of Washington
https://www.bakerlab.org