Posts by rhb

1) Message boards : Number crunching : Rosetta@Home version 3.31 (Message 73394)
Posted 8 Jul 2012 by rhb
Post:
There are apparently still some bad workunits in the system. I just ran:

Task ID 517983290
Name rb_07_06_32293_62980__t000__3_C1_SAVE_ALL_OUT_IGNORE_THE_REST_52024_1284_1
Workunit 471496494

which gave up after XML parsing errors:

http://boinc.bakerlab.org/rosetta/result.php?resultid=517983290

BOINC:: Worker startup.
Starting watchdog...
Watchdog active.
Tag::read - parse error, printing backtrace.

Tag::read - parse error - file:istream line:3 column:1 - </TASKOPERATIONS>
Tag::read - parse error - file:istream line:3 column:1 - ^

... etc.


Tag::read - parse error - file:istream line:1 column:1 - <dock_design>
Tag::read - parse error - file:istream line:1 column:1 - ^


ERROR: false
ERROR:: Exit from: src/utility/tag/Tag.cc line: 387
BOINC:: Error reading and gzipping output datafile: default.out

2) Message boards : Number crunching : WU sits idle (Message 73298)
Posted 18 Jun 2012 by rhb
Post:
I just aborted a task which sat for over a day using only a few minutes of cpu time.

rosetta@home 3.31 minirosetta rb_06_16_31956_62768__t000__0_C2_SAVE_ALL_OUT_IGNORE_THE_REST_51380_4798_0 00:04:59 (00:04:57) 6/18/2012 12:19:54 PM 99.33 Aborted by user scoot

Task ID 513743479
Name rb_06_16_31956_62768__t000__0_C2_SAVE_ALL_OUT_IGNORE_THE_REST_51380_4798_0
Workunit 467811038
Created 17 Jun 2012 18:36:53 UTC
Sent 17 Jun 2012 18:37:36 UTC
Received 18 Jun 2012 16:20:34 UTC
Server state Over
Outcome Client error
Client state Aborted by user
Exit status -197 (0xffffffffffffff3b)
Computer ID 1544897
Report deadline 27 Jun 2012 18:37:36 UTC
CPU time 297.5626
stderr out <core_client_version>6.12.33</core_client_version>
<![CDATA[
<message>
aborted by user
</message>
]]>
Validate state Invalid
Claimed credit 3.22860824380819
Granted credit 0
application version 3.31

I have had success with other wus, so I will try another one. In fact I have one running normally now.
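
A side note on the "Exit status" line above: the hex value in parentheses appears to be the same number, -197, printed as an unsigned 64-bit integer. Here is a minimal Python sketch to check that arithmetic. The helper name to_unsigned64 is made up for this illustration and is not part of BOINC.

```python
def to_unsigned64(value: int) -> int:
    """Reinterpret a signed integer as an unsigned 64-bit two's-complement value."""
    return value & 0xFFFFFFFFFFFFFFFF


exit_status = -197  # taken from the "Exit status" line of the task report
as_hex = hex(to_unsigned64(exit_status))
print(as_hex)  # 0xffffffffffffff3b

# Going the other way: subtract 2**64 to recover the signed value.
round_trip = int(as_hex, 16) - 2**64
print(round_trip)  # -197
```

So the two numbers on that line are one exit code shown two ways, not two separate codes.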

3) Message boards : Number crunching : Problems with Minirosetta 1.76 (Message 61863)
Posted 20 Jun 2009 by rhb
Post:
I have one task with a missing output file, the same problem as Message 61813.

20-Jun-2009 00:37:51 [rosetta@home] Computation for task wRMSF_1_5_core_jumps_mixcst2_hb_t369__IGNORE_THE_REST_12927_2156_0 finished
20-Jun-2009 00:37:51 [rosetta@home] Output file wRMSF_1_5_core_jumps_mixcst2_hb_t369__IGNORE_THE_REST_12927_2156_0_0 for task wRMSF_1_5_core_jumps_mixcst2_hb_t369__IGNORE_THE_REST_12927_2156_0 absent

http://boinc.bakerlab.org/rosetta/result.php?resultid=259829534

Task ID 259829534
Name wRMSF_1_5_core_jumps_mixcst2_hb_t369__IGNORE_THE_REST_12927_2156_0
Workunit 237149781

InternalDecoyCount: protocols::boinc::Boinc::decoy_count() (GZ)
======================================================
DONE :: 1 starting structures 25455.1 cpu seconds
This process generated 1 decoys from 1 attempts
======================================================
called boinc_finish

</stderr_txt>
<message>
<file_xfer_error>
<file_name>wRMSF_1_5_core_jumps_mixcst2_hb_t369__IGNORE_THE_REST_12927_2156_0_0</file_name>
<error_code>-161</error_code>
</file_xfer_error>

</message>
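
For anyone collecting these reports, the file name and error code can be pulled out of a <file_xfer_error> fragment like the one above with a short script. This is only a sketch: the function name transfer_errors and the regular expression are my own, and it assumes the fragment always lists <file_name> before <error_code>, as it does here.

```python
import re

# Fragment copied from the task's stderr output above.
message = """<file_xfer_error>
<file_name>wRMSF_1_5_core_jumps_mixcst2_hb_t369__IGNORE_THE_REST_12927_2156_0_0</file_name>
<error_code>-161</error_code>
</file_xfer_error>"""

# Assumes <file_name> comes before <error_code>; whitespace between tags is optional.
XFER_ERROR = re.compile(
    r"<file_xfer_error>\s*<file_name>(?P<name>[^<]+)</file_name>\s*"
    r"<error_code>(?P<code>-?\d+)</error_code>\s*</file_xfer_error>"
)


def transfer_errors(text: str) -> list[tuple[str, int]]:
    """Return (file name, error code) pairs for every transfer error found."""
    return [(m["name"], int(m["code"])) for m in XFER_ERROR.finditer(text)]


errors = transfer_errors(message)
for name, code in errors:
    print(f"{name}: error {code}")
```

A regular expression is used instead of an XML parser because the pasted fragment is not a complete, well-formed document.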

4) Message boards : Number crunching : Minirosetta v1.47 bug thread. (Message 58052)
Posted 20 Dec 2008 by rhb
Post:
I had a computation error. Running Ubuntu Linux 6.06, Boinc 5.4.9.
This is the first error I've seen in the last two weeks.

http://boinc.bakerlab.org/rosetta/result.php?resultid=215760302

Task ID 215760302
Name cs_noe_fullw_nolin_homo_bench_cs_noe_abrelax_cs_nsp1_olange_5608_24330_0
Workunit 196639962

<core_client_version>5.4.9</core_client_version>
<message>
process exited with code 193 (0xc1)
</message>
<stderr_txt>
*** glibc detected *** double free or corruption (!prev): 0x0bd2d980 ***
SIGABRT: abort called

5) Message boards : Number crunching : Download files -- Incomplete reads of < 5k (Message 45677)
Posted 1 Sep 2007 by rhb
Post:
I am getting incomplete reads of < 5k on files Boinc attempts to download, leading to file truncation, a checksum or signature error, and an error -200 wu failure. It appears that all files are failing.

I am transferring large files successfully at several projects, but I've also had trouble at several projects. Usually it's been a single type of file that consistently fails. Several other users have reported similar problems. The problem seems to be more severe here than in any other occurrence I've seen (apparently fewer bytes transferred over a longer time period).

I am running Ubuntu Linux 6.06, Boinc 5.4.9. I communicate via a wireless LAN with no external proxy, but there is an internet filter which could be involved.

Here is a sample of the messages:

Sat 01 Sep 2007 03:25:41 PM EDT|rosetta@home|Requesting 34560 seconds of new work
Sat 01 Sep 2007 03:26:34 PM EDT|rosetta@home|Scheduler request succeeded
Sat 01 Sep 2007 03:26:36 PM EDT|rosetta@home|Started download of file 1ffk.vall_torsions.gz
Sat 01 Sep 2007 03:26:36 PM EDT|rosetta@home|Started download of file avgE_from_pdb.gz
Sat 01 Sep 2007 03:26:56 PM EDT|rosetta@home|Incomplete read of less than 5KB for 1ffk.vall_torsions.gz - truncating
Sat 01 Sep 2007 03:26:56 PM EDT|rosetta@home|Incomplete read of less than 5KB for avgE_from_pdb.gz - truncating
Sat 01 Sep 2007 03:26:56 PM EDT|rosetta@home|Finished download of file 1ffk.vall_torsions.gz
Sat 01 Sep 2007 03:26:56 PM EDT|rosetta@home|Throughput 104 bytes/sec
Sat 01 Sep 2007 03:26:56 PM EDT|rosetta@home|Finished download of file avgE_from_pdb.gz
Sat 01 Sep 2007 03:26:56 PM EDT|rosetta@home|Throughput 105 bytes/sec
Sat 01 Sep 2007 03:26:56 PM EDT|rosetta@home|Started download of file bbdep02.May.sortlib.gz
Sat 01 Sep 2007 03:26:56 PM EDT|rosetta@home|Started download of file bb_hbW.gz
Sat 01 Sep 2007 03:26:56 PM EDT|rosetta@home|Checksum or signature error for 1ffk.vall_torsions.gz
Sat 01 Sep 2007 03:26:56 PM EDT|rosetta@home|Checksum or signature error for avgE_from_pdb.gz
Sat 01 Sep 2007 03:27:12 PM EDT|rosetta@home|Unrecoverable error for result 1gidA_BOINC_MG_CHAINBREAK5_SUBSET2_RNA_ABINITIO_RNA_CONTACT_RNA_LONG_RANGE_CONTACT_RNA_SASA-1gidA-_2042_43456_0 (WU download error: couldn't get input files:<file_xfer_error> <file_name>1ffk.vall_torsions.gz</file_name> <error_code>-200</error_code></file_xfer_error><file_xfer_error> <file_name>avgE_from_pdb.gz</file_name> <error_code>-200</error_code></file_xfer_error>)
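
To see at a glance which files are affected, the messages can be grouped by file name. The sketch below is a rough Python illustration, not a BOINC tool. It assumes each line has the form timestamp|project|message, as in the sample above, and the function name summarize is my own.

```python
from collections import defaultdict

# A few lines copied from the sample above.
log = """\
Sat 01 Sep 2007 03:26:56 PM EDT|rosetta@home|Incomplete read of less than 5KB for 1ffk.vall_torsions.gz - truncating
Sat 01 Sep 2007 03:26:56 PM EDT|rosetta@home|Incomplete read of less than 5KB for avgE_from_pdb.gz - truncating
Sat 01 Sep 2007 03:26:56 PM EDT|rosetta@home|Finished download of file 1ffk.vall_torsions.gz
Sat 01 Sep 2007 03:26:56 PM EDT|rosetta@home|Throughput 104 bytes/sec
Sat 01 Sep 2007 03:26:56 PM EDT|rosetta@home|Checksum or signature error for 1ffk.vall_torsions.gz
Sat 01 Sep 2007 03:26:56 PM EDT|rosetta@home|Checksum or signature error for avgE_from_pdb.gz
"""

TRUNCATED = "Incomplete read of less than 5KB for "
CHECKSUM = "Checksum or signature error for "


def summarize(text: str) -> dict[str, set[str]]:
    """Map each file name to the set of problems reported for it."""
    problems = defaultdict(set)
    for line in text.splitlines():
        parts = line.split("|", 2)  # timestamp, project, message
        if len(parts) != 3:
            continue  # skip anything that is not a client message line
        message = parts[2]
        if message.startswith(TRUNCATED):
            name = message[len(TRUNCATED):].removesuffix(" - truncating")
            problems[name].add("truncated")
        elif message.startswith(CHECKSUM):
            problems[message[len(CHECKSUM):]].add("checksum")
    return dict(problems)


summary = summarize(log)
for name, issues in sorted(summary.items()):
    print(name, sorted(issues))
```

On the lines shown, both files come out as truncated and failing the checksum, which matches the impression that every download is affected, not just one file type.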
