Message boards : Number crunching : Problems with Minirosetta Version 1.64/1.65
Author | Message |
---|---|
LizzieBarry Send message Joined: 25 Feb 08 Posts: 76 Credit: 201,862 RAC: 0 |
I've tracked down the crashes in the docking runs to 4 runs within the batch. I've now fixed the problem and should rerun those soon (first on ralph). I think I got one of those: Docking_benchmark_natives__2KAI.mppk.pdb.gzdock_score12_hi.xml_11809_6_0 It looks like it ran for 10 seconds, but in fact it ran for over two hours: 06/05/2009 08:45:01 rosetta@home Starting task Docking_benchmark_natives__2KAI.mppk.pdb.gzdock_score12_hi.xml_11809_6_0 using minirosetta version 165 06/05/2009 08:48:22 rosetta@home Restarting task Docking_benchmark_natives__2KAI.mppk.pdb.gzdock_score12_hi.xml_11809_6_0 using minirosetta version 165 06/05/2009 08:51:46 rosetta@home Restarting task Docking_benchmark_natives__2KAI.mppk.pdb.gzdock_score12_hi.xml_11809_6_0 using minirosetta version 165 [...] [...] 06/05/2009 10:53:38 rosetta@home Restarting task Docking_benchmark_natives__2KAI.mppk.pdb.gzdock_score12_hi.xml_11809_6_0 using minirosetta version 165 06/05/2009 10:57:02 rosetta@home Restarting task Docking_benchmark_natives__2KAI.mppk.pdb.gzdock_score12_hi.xml_11809_6_0 using minirosetta version 165 06/05/2009 11:02:22 rosetta@home Computation for task Docking_benchmark_natives__2KAI.mppk.pdb.gzdock_score12_hi.xml_11809_6_0 finished In the absence of a statement of which jobs they are, I'll be aborting all docking...11809_*.* jobs |
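The real wall-clock span of the task in the log above can be read off the first "Starting task" line and the final "finished" line. A minimal Python sketch, assuming the timestamps are day/month/year and using only the two entries quoted in the post (the variable names are illustrative, not part of BOINC):

```python
from datetime import datetime

# Assumption: the client log uses day/month/year, so 06/05/2009 is 6 May 2009.
FMT = "%d/%m/%Y %H:%M:%S"

# First "Starting task" entry and the final "finished" entry from the post.
first_start = datetime.strptime("06/05/2009 08:45:01", FMT)
finished = datetime.strptime("06/05/2009 11:02:22", FMT)

# Wall-clock time between the first start and completion.
elapsed = finished - first_start
print(elapsed)  # 2:17:21
```

This matches the "over two hours" estimate in the post; the 10 seconds shown for the task presumably reflects only the last restart.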
Sid Celery Send message Joined: 11 Feb 08 Posts: 2140 Credit: 41,518,559 RAC: 10,612 |
I've tracked down the crashes in the docking runs to 4 runs within the batch. I've now fixed the problem and should rerun those soon (first on ralph). If this was one of those jobs, why has it gone back out to another user? https://boinc.bakerlab.org/rosetta/workunit.php?wuid=227006233 |
TomaszPawel Send message Joined: 28 Apr 07 Posts: 54 Credit: 2,791,145 RAC: 0 |
Hi, I have my first error with 1.65: https://boinc.bakerlab.org/rosetta/result.php?resultid=248333213 <core_client_version>6.6.20</core_client_version> <![CDATA[ <message> - exit code -1073741680 (0xc0000090) </message> <stderr_txt> [2009- 5- 6 18: 7: 0:] :: BOINC:: Initializing ... ok. [2009- 5- 6 18: 7: 0:] :: BOINC :: boinc_init() BOINC:: Setting up shared resources ... ok. BOINC:: Setting up semaphores ... ok. BOINC:: Updating status ... ok. BOINC:: Registering timer callback... ok. BOINC:: Worker initialized successfully. Registering options.. Registered extra options. Initializing broker options ... Registered extra options. Initializing core... Initializing options.... ok Options::initialize() Options::adding_options() Options::initialize() Check specs. Options::initialize() End reached Loaded options.... ok Processed options.... ok Initializing random generators... ok Initialization complete. Setting WU description ... Unpacking zip data: ../../projects/boinc.bakerlab.org_rosetta/minirosetta_database_rev29025.zip Setting database description ... Setting up checkpointing ... Setting up folding (abrelax) ... Beginning folding (abrelax) ... BOINC:: Worker startup. Starting watchdog... Watchdog active. Starting work on structure: _U18X26X_00001 Starting work on structure: _U18X26X_00002 Starting work on structure: _U18X26X_00003 Starting work on structure: _U18X26X_00004 Starting work on structure: _U18X26X_00005 Starting work on structure: _U18X26X_00006 Starting work on structure: _U18X26X_00007 Starting work on structure: _U18X26X_00008 Starting work on structure: _U18X26X_00009 Starting work on structure: _U18X26X_00010 Starting work on structure: _U18X26X_00011 Unhandled Exception Detected... - Unhandled Exception Record - Reason: Float Invalid Operation (0xc0000090) at address 0x0050A2B9 Engaging BOINC Windows Runtime Debugger... </stderr_txt> ]]> 2.2 h lost....
and for 1.64 https://boinc.bakerlab.org/rosetta/result.php?resultid=247697671 <core_client_version>6.6.20</core_client_version> <![CDATA[ <message> - exit code -1073741680 (0xc0000090) </message> <stderr_txt> [2009- 5- 4 20:20:35:] :: BOINC:: Initializing ... ok. [2009- 5- 4 20:20:35:] :: BOINC :: boinc_init() BOINC:: Setting up shared resources ... ok. BOINC:: Setting up semaphores ... ok. BOINC:: Updating status ... ok. BOINC:: Registering timer callback... ok. BOINC:: Worker initialized successfully. Registering options.. Registered extra options. Initializing broker options ... Registered extra options. Initializing core... Initializing options.... ok Options::initialize() Options::adding_options() Options::initialize() Check specs. Options::initialize() End reached Loaded options.... ok Processed options.... ok Initializing random generators... ok Initialization complete. Setting WU description ... Unpacking zip data: ../../projects/boinc.bakerlab.org_rosetta/minirosetta_database_rev29025.zip Setting database description ... Setting up checkpointing ... Setting up folding (abrelax) ... Beginning folding (abrelax) ... BOINC:: Worker startup. Starting watchdog... Watchdog active. Starting work on structure: _U23X25X_00001 Starting work on structure: _U23X25X_00002 Starting work on structure: _U23X25X_00003 Starting work on structure: _U23X25X_00004 Starting work on structure: _U23X25X_00005 Starting work on structure: _U23X25X_00006 Starting work on structure: _U23X25X_00007 Starting work on structure: _U23X25X_00008 Starting work on structure: _U23X25X_00009 Unhandled Exception Detected... - Unhandled Exception Record - Reason: Float Invalid Operation (0xc0000090) at address 0x0050AEE9 Engaging BOINC Windows Runtime Debugger... </stderr_txt> ]]> almost 2h lost... WWW of Polish National Team - Join! Crunch! Win! |
ViVac Send message Joined: 10 Dec 08 Posts: 4 Credit: 117,352 RAC: 0 |
Another (0xc0000005) error here. https://boinc.bakerlab.org/rosetta/result.php?resultid=249318829 |
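The negative exit codes in the reports above are the same values as the hexadecimal codes next to them, just printed as signed 32-bit integers. A short Python sketch of the conversion; the helper name `to_ntstatus` and the lookup table are illustrative, and the table lists only the two Windows status codes that appear in these posts:

```python
# Windows status codes seen in the posts above.
KNOWN_STATUS = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC0000090: "STATUS_FLOAT_INVALID_OPERATION",
}

def to_ntstatus(exit_code: int) -> int:
    """Reinterpret a signed 32-bit exit code as an unsigned 32-bit value."""
    return exit_code & 0xFFFFFFFF

code = to_ntstatus(-1073741680)
print(hex(code), KNOWN_STATUS.get(code, "unknown"))
# 0xc0000090 STATUS_FLOAT_INVALID_OPERATION
```

The result agrees with the "Float Invalid Operation (0xc0000090)" line in the stderr output quoted earlier.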
svincent Send message Joined: 30 Dec 05 Posts: 219 Credit: 12,120,035 RAC: 0 |
task 249354645 failed on Mac during initialization. It's actually the 3rd time it was sent out (16th April was the first: suspect this predates version 1.65) Watchdog active. # cpu_run_time_pref: 14400 Hbond tripped: [2009- 5- 6 21:40:20:] ERROR: dis==0 in pairtermderiv! ERROR:: Exit from: src/core/scoring/methods/PairEnergy.cc line: 333 BOINC:: Error reading and gzipping output datafile: default.out called boinc_finish </stderr_txt> ]]> |
svincent Send message Joined: 30 Dec 05 Posts: 219 Credit: 12,120,035 RAC: 0 |
This task 249446890 (lr8_seq_score12_rlbd_1wd6_IGNORE_THE_REST_DECOY_11810_826_0) looks OK in the BOINC task pane on Mac but the graphics are strange. No protein is displayed and the step remains stuck at 0, although the task number slowly increments as usual. |
Snags Send message Joined: 22 Feb 07 Posts: 198 Credit: 2,888,320 RAC: 0 |
This task 249446890 (lr8_seq_score12_rlbd_1wd6_IGNORE_THE_REST_DECOY_11810_826_0) looks OK in the BOINC task pane on Mac but the graphics are strange. I have a couple of these exhibiting the same odd graphics on my Mac as well. lr8_seq_score12_rlbd_4ubp_IGNORE_THE_REST_DECOY_11810_866_0 has finished successfully albeit very quickly; it completed 99 models in 12737 seconds. Snags |
Speedy Send message Joined: 25 Sep 05 Posts: 163 Credit: 808,337 RAC: 1 |
lr8 tasks are known to finish well before the run time preference is met. This is perfectly normal; I have had a number of these on my Windows host. Have a crunching good day!! |
P . P . L . Send message Joined: 20 Aug 06 Posts: 581 Credit: 4,865,274 RAC: 0 |
Hi. I've noticed that too: for the tasks starting with lr8_seq_score12_rlbd_, the graphics are mostly blank. The only thing working is the time & models count; the stage says unknown! Otherwise they run O.K. pete. |
Betting Slip Send message Joined: 26 Sep 05 Posts: 71 Credit: 5,702,246 RAC: 0 |
https://boinc.bakerlab.org/rosetta/result.php?resultid=249163776 <core_client_version>6.6.20</core_client_version> <![CDATA[ <message> too many exit(0)s </message> <stderr_txt> [2009- 5- 6 8:46:50:] :: BOINC:: Initializing ... ok. [2009- 5- 6 8:46:50:] :: BOINC :: boinc_init() BOINC:: Setting up shared resources ... ok. BOINC:: Setting up semaphores ... ok. BOINC:: Updating status ... ok. BOINC:: Registering timer callback... ok. BOINC:: Worker initialized successfully. Registering options.. Registered extra options. Initializing broker options ... Registered extra options. Initializing core... Initializing options.... ok Options::initialize() Options::adding_options() Options::initialize() Check specs. Options::initialize() End reached Loaded options.... ok Processed options.... ok Initializing random generators... ok Initialization complete. Setting WU description ... Unpacking zip data: ../../projects/boinc.bakerlab.org_rosetta/minirosetta_database_rev29025.zip Setting database description ... Setting up checkpointing ... Setting up folding (abrelax) ... Beginning folding (abrelax) ... BOINC:: Worker startup. Starting watchdog... Watchdog active. 
Starting work on structure: _U6X12X_00001 # cpu_run_time_pref: 86400 Starting work on structure: _U6X12X_00002 Starting work on structure: _U6X12X_00003 Starting work on structure: _U6X12X_00004 Starting work on structure: _U6X12X_00005 Starting work on structure: _U6X12X_00006 Starting work on structure: _U6X12X_00007 Starting work on structure: _U6X12X_00008 Starting work on structure: _U6X12X_00009 Starting work on structure: _U6X12X_00010 Starting work on structure: _U6X12X_00011 Starting work on structure: _U6X12X_00012 Starting work on structure: _U6X12X_00013 Starting work on structure: _U6X12X_00014 Starting work on structure: _U6X12X_00015 Starting work on structure: _U6X12X_00016 Starting work on structure: _U6X12X_00017 Starting work on structure: _U6X12X_00018 Starting work on structure: _U6X12X_00019 Starting work on structure: _U6X12X_00020 Starting work on structure: _U6X12X_00021 Starting work on structure: _U6X12X_00022 Starting work on structure: _U6X12X_00023 Starting work on structure: _U6X12X_00024 Starting work on structure: _U6X12X_00025 Starting work on structure: _U6X12X_00026 Starting work on structure: _U6X12X_00027 Starting work on structure: _U6X12X_00028 Starting work on structure: _U6X12X_00029 Starting work on structure: _U6X12X_00030 Starting work on structure: _U6X12X_00031 Starting work on structure: _U6X12X_00032 Starting work on structure: _U6X12X_00033 Starting work on structure: _U6X12X_00034 Starting work on structure: _U6X12X_00035 Starting work on structure: _U6X12X_00036 Starting work on structure: _U6X12X_00037 Starting work on structure: _U6X12X_00038 Starting work on structure: _U6X12X_00039 [2009- 5- 7 18:23:19:] :: BOINC:: Initializing ... ok. [2009- 5- 7 18:23:19:] :: BOINC :: boinc_init() BOINC:: Setting up shared resources ... ok. BOINC:: Setting up semaphores ... ok. BOINC:: Updating status ... ok. BOINC:: Registering timer callback... ok. BOINC:: Worker initialized successfully. Registering options.. 
Registered extra options. Initializing broker options ... Registered extra options. Initializing core... Initializing options.... ok Options::initialize() Options::adding_options() Options::initialize() Check specs. Options::initialize() End reached Loaded options.... ok Processed options.... ok Initializing random generators... ok Initialization complete. Setting WU description ... Unpacking zip data: ../../projects/boinc.bakerlab.org_rosetta/minirosetta_database_rev29025.zip Setting database description ... Setting up checkpointing ... Setting up folding (abrelax) ... Beginning folding (abrelax) ... BOINC:: Worker startup. Starting watchdog... Watchdog active. # cpu_run_time_pref: 86400 Starting work on structure: _U6X12X_00039 Continuing computation from checkpoint: chk_S_U6X12X_00000039_ClassicAbinitio__stage_1 ... success! Continuing computation from checkpoint: chk_S_U6X12X_00000039_ClassicAbinitio__stage_2 ... success! Continuing computation from checkpoint: chk_S_U6X12X_00000039_ClassicAbinitio__stage_3_iter1_1 ... success! Continuing computation from checkpoint: chk_S_U6X12X_00000039_ClassicAbinitio__stage_3_iter1_2 ... success! Continuing computation from checkpoint: chk_S_U6X12X_00000039_ClassicAbinitio__stage_3_iter1_3 ... success! Continuing computation from checkpoint: chk_S_U6X12X_00000039_ClassicAbinitio__stage_3_iter1_4 ... success! Continuing computation from checkpoint: chk_S_U6X12X_00000039_ClassicAbinitio__stage_3_iter1_5 ... success! Continuing computation from checkpoint: chk_S_U6X12X_00000039_ClassicAbinitio__stage_3_iter1_6 ... success! Continuing computation from checkpoint: chk_S_U6X12X_00000039_ClassicAbinitio__stage_3_iter1_7 ... success! Continuing computation from checkpoint: chk_S_U6X12X_00000039_ClassicAbinitio__stage_3_iter1_8 ... success! Continuing computation from checkpoint: chk_S_U6X12X_00000039_ClassicAbinitio__stage_3_iter1_9 ... success! 
Continuing computation from checkpoint: chk_S_U6X12X_00000039_ClassicAbinitio__stage_3_iter1_10 ... success! Continuing computation from checkpoint: chk_S_U6X12X_00000039_ClassicAbinitio__stage4_kk_1 ... success! Continuing computation from checkpoint: chk_S_U6X12X_00000039_ClassicAbinitio__stage4_kk_2 ... success! Starting work on structure: _U6X12X_00040 [2009- 5- 7 19:14:22:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 19:15: 3:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 19:15:44:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 19:16:25:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 19:17: 7:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 19:17:48:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 19:18:29:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 19:19:10:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 19:19:51:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 19:32:22:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 19:33: 3:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 19:33:45:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 19:34:26:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 19:35: 7:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 19:35:48:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 19:36:29:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 19:37:11:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 19:37:52:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 19:47:39:] :: BOINC:: Initializing ... ok. 
Can't acquire lockfile - exiting [2009- 5- 7 19:48:21:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 19:49: 2:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 19:49:43:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 19:50:24:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 19:51: 6:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 19:51:47:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 19:52:28:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 19:53: 9:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 19:53:50:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 19:54:32:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 19:55:13:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 19:55:54:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 19:58:39:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 19:59:21:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 20: 1:41:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 20: 2:22:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 20:20:17:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 20:20:58:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 20:21:40:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 20:22:21:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 20:23: 2:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 20:23:43:] :: BOINC:: Initializing ... ok. 
Can't acquire lockfile - exiting [2009- 5- 7 20:24:24:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 20:25: 6:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 20:25:47:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 20:26:28:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 20:27: 9:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 20:27:50:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 20:28:32:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 20:29:13:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 20:29:54:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 20:30:35:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 20:31:16:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 20:31:58:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 20:32:39:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 20:33:20:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 20:34: 1:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 20:34:43:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 20:35:24:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 20:36: 5:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 20:36:46:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 20:37:28:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 20:40:35:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 20:41:16:] :: BOINC:: Initializing ... ok. 
Can't acquire lockfile - exiting [2009- 5- 7 20:41:57:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 20:42:39:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 20:45:45:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 20:46:26:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 20:47: 7:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 20:52:20:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 20:53: 0:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 20:53:42:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 20:54:23:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 20:55: 4:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 20:55:45:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 20:56:27:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 20:57: 8:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 20:57:49:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 20:58:31:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 20:59:12:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 20:59:53:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 21: 0:34:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 21: 1:16:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 21: 1:57:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 21: 2:38:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 21: 3:19:] :: BOINC:: Initializing ... ok. 
Can't acquire lockfile - exiting [2009- 5- 7 21: 4: 0:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 21:25:23:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 21:26: 4:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 21:26:45:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 21:27:27:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 21:28: 8:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 21:28:49:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 21:29:30:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 21:30:12:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 21:30:53:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 21:31:34:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 21:32:15:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 21:32:57:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 21:33:38:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 21:34:19:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting [2009- 5- 7 21:37: 4:] :: BOINC:: Initializing ... ok. Can't acquire lockfile - exiting </stderr_txt> ]]> Validate state Invalid Claimed credit 188.147749656091 Granted credit 0 application version 1.65 |
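The long run of "Can't acquire lockfile - exiting" entries above is easier to assess when counted and timed. A minimal Python sketch, assuming the stderr text is available as one string; the regular expression is written against the timestamp layout seen in this log, the sample holds only the first three failed starts, and the function name is illustrative:

```python
import re
from datetime import datetime

# Timestamp layout as it appears in this log, e.g. "[2009- 5- 7 19:15: 3:]".
STAMP = r"\[\s*(\d+)-\s*(\d+)-\s*(\d+)\s+(\d+):\s*(\d+):\s*(\d+):\]"
FAILED_START = re.compile(
    STAMP + r" :: BOINC:: Initializing \.\.\. ok\. Can't acquire lockfile - exiting"
)

def lockfile_failures(stderr_text):
    """Return the timestamp of every start that exited on the lockfile."""
    return [datetime(*map(int, m.groups()))
            for m in FAILED_START.finditer(stderr_text)]

# First three failed starts copied from the stderr output above.
sample = (
    "[2009- 5- 7 19:14:22:] :: BOINC:: Initializing ... ok. "
    "Can't acquire lockfile - exiting "
    "[2009- 5- 7 19:15: 3:] :: BOINC:: Initializing ... ok. "
    "Can't acquire lockfile - exiting "
    "[2009- 5- 7 19:15:44:] :: BOINC:: Initializing ... ok. "
    "Can't acquire lockfile - exiting"
)

failures = lockfile_failures(sample)
gaps = [(b - a).total_seconds() for a, b in zip(failures, failures[1:])]
print(len(failures), gaps)  # 3 [41.0, 41.0]
```

Run over the full stderr text, the same function would give the total number of failed starts that preceded the "too many exit(0)s" message.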
dgnuff Send message Joined: 1 Nov 05 Posts: 350 Credit: 24,773,605 RAC: 0 |
A question for all the "error experts". I've got a task running on one of my machines that appears to have stalled. Here's the line from the messages where it started: Silver rosetta@home 5/4/2009 18:05:05 Starting abinitio_norelax_homfrag_natfrag_129_B_2hkvA_SAVE_ALL_OUT_6252_7736_0 This is from BoincView, since that program allows me to monitor all my farm machines from one place. What's a concern is that after almost 4 days, it's still taking a slot up on the machine. CPU efficiency reports as zero (not a good sign), CPU consumed is 1:48:20, however the to completion time is off the scale: 140:06:02. What I'd like to know is the most useful things I can do to get information about this back to Bakerlab. That includes anything up to and including attaching a debugger to it and poking around inside the process. |
Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
dgnuff, what is the current status of the task? "Running"? What platform are you running on? What BOINC version? Rosetta Moderator: Mod.Sense |
svincent Send message Joined: 30 Dec 05 Posts: 219 Credit: 12,120,035 RAC: 0 |
I've just had a couple of 1.67 tasks (the first and only two I've received) fail on me in termination on Mac. This happened on Ralph 1.67 also: I reported it then and it should have been fixed. 249719353 249691873 Crashed executable name: minirosetta_1.67_i686-apple-darwin built using BOINC library version 6.5.0 Machine type Intel 80486 (32-bit executable) System version: Macintosh OS 10.4.11 build 8S2167 Fri May 8 14:26:41 2009 Thread 0 Crashed: 0 ...etta_1.67_i686-apple-darwin 0x00efc851 __ZN7utility7signals9SignalHubIvN4core12conformation7signals16DestructionEventEE11send_signalES5_ + 1593 1 ...etta_1.67_i686-apple-darwin 0x0002a4a7 __ZN4core12conformation12ConformationD1Ev + 7373 2 ...etta_1.67_i686-apple-darwin 0x000910d0 __ZN4core4pose4PoseD1Ev + 4652 3 ...etta_1.67_i686-apple-darwin 0x00518bdc __ZN9protocols3jd214JobDistributor2goEN7utility7pointer10owning_ptrINS_5moves5MoverEEE + 1730 4 ...etta_1.67_i686-apple-darwin 0x00b59c20 __ZN9protocols3jd219BOINCJobDistributor2goEN7utility7pointer10owning_ptrINS_5moves5MoverEEE + 42 5 ...etta_1.67_i686-apple-darwin 0x0013b068 __ZN9protocols8abinitio24Loopbuild_Threading_mainEv + 720 6 ...etta_1.67_i686-apple-darwin 0x00005db8 _main + 7640 7 ...etta_1.67_i686-apple-darwin 0x0000292e __start + 216 8 ...etta_1.67_i686-apple-darwin 0x00002855 start + 41 And could we have a new thread for 1.67 please? |
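The frame names in the trace above are mangled C++ symbols, and the simpler ones can be read by hand: each name component is prefixed by its length, and `D1` marks a destructor. The usual tool for this is `c++filt`; the Python sketch below is only a toy decoder for plain nested names, written to illustrate the encoding. The function name `demangle_simple` is hypothetical, and the decoder deliberately stops at the first template or argument list instead of handling the full mangling scheme.

```python
import re

def demangle_simple(symbol):
    """Decode only plain nested names such as __ZN4core4pose4PoseD1Ev."""
    s = symbol.lstrip("_")  # macOS symbols carry an extra leading underscore
    if not s.startswith("ZN"):
        return None
    pos, parts = 2, []
    # Each component is "<length><identifier>", e.g. "4core".
    while pos < len(s) and s[pos].isdigit():
        digits = re.match(r"\d+", s[pos:]).group()
        pos += len(digits)
        parts.append(s[pos:pos + int(digits)])
        pos += int(digits)
    marker = s[pos:pos + 2]
    if parts and marker in ("D0", "D1", "D2"):    # destructor variants
        parts.append("~" + parts[-1])
    elif parts and marker in ("C1", "C2", "C3"):  # constructor variants
        parts.append(parts[-1])
    return "::".join(parts) if parts else None

print(demangle_simple("__ZN4core4pose4PoseD1Ev"))
# core::pose::Pose::~Pose
print(demangle_simple("__ZN4core12conformation12ConformationD1Ev"))
# core::conformation::Conformation::~Conformation
```

Read this way, frames 1 and 2 are the `Conformation` and `Pose` destructors, which is consistent with the report that the crash happens at termination.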
nick n Send message Joined: 26 Aug 07 Posts: 49 Credit: 219,102 RAC: 0 |
Yeah we should start a 1.67 thread |
dgnuff Send message Joined: 1 Nov 05 Posts: 350 Credit: 24,773,605 RAC: 0 |
dgnuff, what is the current status of the task? "Running"? What platform are you running on? What BOINC version? Platform: Windows XP SP 3 BOINC client version: 6.4.7 Hardware: Intel Q6600 quad core, 2 GB memory Link to the host. While I'm at it, Task details and workunit details According to boincmgr, it's running, high priority. Keeping that in mind, I ran a VNC session to get to the desktop of the system in question. Task manager shows the process is present, but not using any CPU time. |
Snags Send message Joined: 22 Feb 07 Posts: 198 Credit: 2,888,320 RAC: 0 |
I just had one of these crash upon completion on my Mac as well: threading_lb_test1_hb_t373__IGNORE_THE_REST_11850_55_0 # cpu_run_time_pref: 36000 SIGBUS: bus error Crashed executable name: minirosetta_1.67_i686-apple-darwin built using BOINC library version 6.5.0 Machine type Intel 80486 (32-bit executable) System version: Macintosh OS 10.5.6 build 9G55 Fri May 8 23:50:54 2009 sh: /usr/bin/atos: No such file or directory 0 0x006c0345 SIGPIPE: write on a pipe with no reader 1 0x004a3d8e SIGPIPE: write on a pipe with no reader 2 0x90a9e2bb SIGPIPE: write on a pipe with no reader 3 0xffffffff SIGPIPE: write on a pipe with no reader 4 0x0002a4a7 SIGPIPE: write on a pipe with no reader 5 0x000910d0 SIGPIPE: write on a pipe with no reader 6 0x00518bdc SIGPIPE: write on a pipe with no reader 7 0x00b59c20 SIGPIPE: write on a pipe with no reader 8 0x0013b068 SIGPIPE: write on a pipe with no reader 9 0x00005db8 SIGPIPE: write on a pipe with no reader 10 0x0000292e SIGPIPE: write on a pipe with no reader 11 0x00002855 Thread 0 crashed with X86 Thread State (32-bit): eax: 0xffffffe1 ebx: 0x90a66802 ecx: 0xbfffc26c edx: 0x90a321c6 edi: 0x00000000 esi: 0x00000000 ebp: 0xbfffc2a8 esp: 0xbfffc26c ss: 0x0000001f efl: 0x00000206 eip: 0x90a321c6 cs: 0x00000007 ds: 0x0000001f es: 0x0000001f fs: 0x00000000 gs: 0x00000037 |
Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
dgnuff, when I've seen that before, it always seems it is with that era of BOINC Manager installed. It seems to lose track of the tasks sometimes and now you probably have an idle CPU. I suggest you suspend and release the task and see if it wakes up. If not, suspend and release it for a minute or so, and after 5 times the watchdog will kick it out for not making progress after 5 restarts and it will report back. Rosetta Moderator: Mod.Sense |
dgnuff Send message Joined: 1 Nov 05 Posts: 350 Credit: 24,773,605 RAC: 0 |
dgnuff, when I've seen that before, it always seems it is with that era of BOINC Manager installed. It seems to lose track of the tasks sometimes and now you probably have an idle CPU. I suggest you suspend and release the task and see if it wakes up. If not, suspend and release it for a minute or so, and after 5 times the watchdog will kick it out for not making progress after 5 restarts and it will report back. Thanks for the information. Suspending and resuming it a couple of times unlocked it. I'll keep an eye on it for now, since the completion time is totally wrong. However, the "completion percentage" is increasing at a rate that's correct for my standard run time of 12 hours. -- Edit -- You indicated that a 6.4.? client can cause this. I notice that the current version for Windows is 6.6.20. Do you think it would be worth my time to download and install that to avoid this problem in the future? |
Speedy Send message Joined: 25 Sep 05 Posts: 163 Credit: 808,337 RAC: 1 |
I'm not Mod.Sense, but in my view it's always best to update to the latest stable version of BOINC, especially if an earlier version is causing a problem, as was indicated in Mod.Sense's post above. Have a crunching good day!! |
Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
There have been other problems with the current BOINC release, so I wasn't trying to push you that direction. Just to say that the situation wasn't unique to your machine or the task, or Rosetta version. Yes, I wouldn't make too much of the completion estimate. It was thrown off by the period of time with no CPU. Now that it is crunching again, it should end in a normal time period for your tasks (I mean, based on your runtime preference plus up to 4 hours). Rosetta Moderator: Mod.Sense |
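The "runtime preference plus up to 4 hours" rule of thumb above translates into a simple upper bound on CPU time. A small Python sketch, assuming the 4-hour margin stated in the post and the 12-hour preference dgnuff mentioned; the constant and function names are illustrative, not BOINC or Rosetta settings:

```python
# Margin taken from the post above: "runtime preference plus up to 4 hours".
MARGIN_S = 4 * 3600

def latest_expected_cpu_time_s(cpu_run_time_pref_s):
    """Upper bound on CPU seconds before the task should have finished."""
    return cpu_run_time_pref_s + MARGIN_S

# dgnuff's standard run time of 12 hours, expressed in seconds.
bound = latest_expected_cpu_time_s(12 * 3600)
print(bound / 3600)  # 16.0
```

A task that has used well over this much CPU time without reporting is a better sign of trouble than the to-completion estimate, which the post notes was thrown off by the idle period.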
©2024 University of Washington
https://www.bakerlab.org