Message boards : Number crunching : Report stuck & aborted WU here please
Author | Message |
---|---|
Marie Lucie Send message Joined: 9 Dec 05 Posts: 5 Credit: 40,616 RAC: 0 |
Same problem. It reached about 26% after 41 minutes: 25/02/2006 12:33:35|rosetta@home|Pausing result HBLR_1.0_1n0u_321_977_0 (removed from memory) 25/02/2006 12:33:47|rosetta@home|Unrecoverable error for result HBLR_1.0_1n0u_321_977_0 ( - exit code -1073741819 (0xc0000005)) 25/02/2006 12:33:47||request_reschedule_cpus: process exited 25/02/2006 12:33:47|rosetta@home|Computation for result HBLR_1.0_1n0u_321_977_0 finished |
Insidious Send message Joined: 10 Nov 05 Posts: 49 Credit: 604,937 RAC: 0 |
"Same problem. It achieved +/- 26% after 41 minutes" I think you need to set your "leave in memory while suspended" preference to YES to help with this. -Sid Proudly crunching with TeAm Anandtech |
Moderator9 Volunteer moderator Send message Joined: 22 Jan 06 Posts: 1014 Credit: 0 RAC: 0 |
"Same problem. It achieved +/- 26% after 41 minutes" Your computers are "Hidden", so I cannot look at your WU errors to see what the problem might be. The advice from "Insidious" to set "keep application in memory during switches" = "YES" is the only advice I can give as well. Moderator9 ROSETTA@home FAQ Moderator Contact |
Steve Shedroff Send message Joined: 7 Nov 05 Posts: 11 Credit: 250,657 RAC: 0 |
This work unit was at 1% after 86 hours with 96 hours to go so I aborted it. 2/26/2006 1:15:09 PM|rosetta@home|Unrecoverable error for result PRODUCTION_ABINITIO_INCREASECYCLES50_2acy__317_316_0 (aborted via GUI RPC) Regards, Steve |
Dimitris Hatzopoulos Send message Joined: 5 Jan 06 Posts: 336 Credit: 80,939 RAC: 0 |
Workunit 9626703 (ABINITbg_hom023_1bgf__320_10) got stuck at Model 1, Step 20710 for 14 CPU hours, with 25+ to go, until I noticed. I exited and restarted BOINC; it started the WU all over from scratch, but after running for a few seconds it again got stuck at M1, S20710. I'm aborting it. Best UFO Resources Wikipedia R@h How-To: Join Distributed Computing projects that benefit humanity |
chris mathews Send message Joined: 21 Jan 06 Posts: 1 Credit: 1,192,890 RAC: 0 |
Had never received this error until yesterday: 2006-02-26 15:06:35 [rosetta@home] Unrecoverable error for result ABINITbq_hom014_1bq9A_320_100_0 ( - exit code -1073741811 (0xc000000d)) 2006-02-26 15:38:11 [rosetta@home] Unrecoverable error for result ABINITcg_hom025_1cg5B_320_2_0 ( - exit code -1073741811 (0xc000000d)) 2006-02-26 15:48:43 [rosetta@home] Unrecoverable error for result ABINITce_hom005_1cei__320_11_0 ( - exit code -1073741811 (0xc000000d)) 2006-02-26 15:59:32 [rosetta@home] Unrecoverable error for result ABINITce_hom008_1cei__320_15_0 ( - exit code -1073741811 (0xc000000d)) 2006-02-26 16:28:40 [rosetta@home] Unrecoverable error for result ABINITce_hom012_1cei__320_18_0 ( - exit code -1073741811 (0xc000000d)) 2006-02-26 18:18:13 [rosetta@home] Unrecoverable error for result ABINITct_hom006_1ctf__320_25_0 ( - exit code -1073741811 (0xc000000d)) 2006-02-26 18:29:54 [rosetta@home] Unrecoverable error for result ABINITce_hom002_1cei__320_51_0 ( - exit code -1073741811 (0xc000000d)) 2006-02-26 18:39:43 [rosetta@home] Unrecoverable error for result ABINITcc_hom003_1cc8A_320_54_0 ( - exit code -1073741811 (0xc000000d)) 2006-02-26 19:13:46 [rosetta@home] Unrecoverable error for result ABINITcc_hom016_1cc8A_320_64_0 ( - exit code -1073741811 (0xc000000d)) 2006-02-26 19:38:46 [rosetta@home] Unrecoverable error for result ABINITcg_hom028_1cg5B_320_74_0 ( - exit code -1073741811 (0xc000000d)) 2006-02-26 19:54:49 [rosetta@home] Unrecoverable error for result ABINITct_hom010_1ctf__320_81_0 ( - exit code -1073741811 (0xc000000d)) 2006-02-26 20:11:24 [rosetta@home] Unrecoverable error for result ABINITcg_hom015_1cg5B_320_85_0 ( - exit code -1073741811 (0xc000000d)) 2006-02-26 21:21:59 [rosetta@home] Unrecoverable error for result ABINITcg_hom021_1cg5B_320_93_0 ( - exit code -1073741811 (0xc000000d)) 2006-02-26 21:35:17 [rosetta@home] Unrecoverable error for result ABINITe6_hom021_1e6iA_320_10_0 ( - exit code -1073741811 (0xc000000d)) 2006-02-26 22:05:31 [rosetta@home] Unrecoverable error for result HBLR_1.0_1b72_321_3512_1 ( - exit code -1073741811 (0xc000000d)) 2006-02-26 22:18:07 [rosetta@home] Unrecoverable error for result ABINITce_hom022_1cei__320_33_0 ( - exit code -1073741811 (0xc000000d)) 2006-02-26 22:42:40 [rosetta@home] Unrecoverable error for result ABINITcg_hom005_1cg5B_320_56_0 ( - exit code -1073741811 (0xc000000d)) 2006-02-26 23:24:48 [rosetta@home] Unrecoverable error for result ABINITel_hom022_1elwA_320_11_0 ( - exit code -1073741811 (0xc000000d)) 2006-02-26 23:37:54 [rosetta@home] Unrecoverable error for result ABINITe6_hom025_1e6iA_320_23_0 ( - exit code -1073741811 (0xc000000d)) 2006-02-26 23:56:02 [rosetta@home] Unrecoverable error for result ABINITdh_hom016_1dhn__320_22_0 ( - exit code -1073741811 (0xc000000d)) 2006-02-27 00:03:35 [rosetta@home] Unrecoverable error for result ABINITct_hom015_1ctf__320_31_0 ( - exit code -1073741811 (0xc000000d)) 2006-02-27 00:21:59 [rosetta@home] Unrecoverable error for result ABINITdh_hom005_1dhn__320_34_0 ( - exit code -1073741811 (0xc000000d)) 2006-02-27 00:36:28 [rosetta@home] Unrecoverable error for result ABINITe6_hom030_1e6iA_320_38_0 ( - exit code -1073741811 (0xc000000d)) 2006-02-27 02:09:20 [rosetta@home] Unrecoverable error for result ABINITe6_hom026_1e6iA_320_41_0 ( - exit code -1073741811 (0xc000000d)) 2006-02-27 03:32:48 [rosetta@home] Unrecoverable error for result ABINITdh_hom030_1dhn__320_70_0 ( - exit code -1073741811 (0xc000000d)) |
Nicolas VC Send message Joined: 10 Nov 05 Posts: 1 Credit: 1,619,083 RAC: 0 |
I have had BOINC 5.2.13 running since Dec 2005. Windows XP detects my PC as a multiprocessor system. My value for the time to switch between apps has been 60 min from the beginning. But for the past four days all my results except one have been client errors. hostid=59738 12247891 9795546 1 Mar 2006 7:39:42 UTC 15 Mar 2006 7:39:42 UTC In Progress Unknown New --- --- --- 12228641 9791198 28 Feb 2006 22:13:42 UTC 1 Mar 2006 10:44:55 UTC Over Client error Computing 28,740.80 50.91 --- 12176920 9700235 27 Feb 2006 16:01:48 UTC 27 Feb 2006 18:02:38 UTC Over Client error Computing 3,552.36 6.29 --- 12131713 9716891 28 Feb 2006 14:31:53 UTC 28 Feb 2006 22:13:42 UTC Over Client error Computing 11,330.28 20.07 --- 12121508 9706950 27 Feb 2006 18:07:44 UTC 28 Feb 2006 14:31:53 UTC Over Client error Done 28,363.94 50.24 --- 12092713 9694375 25 Feb 2006 18:19:48 UTC 27 Feb 2006 11:51:23 UTC Over Client error Done 29,023.41 51.41 --- 12069367 9671614 25 Feb 2006 4:22:05 UTC 27 Feb 2006 11:51:23 UTC Over Client error Done 28,695.72 50.83 --- 12007623 9624382 26 Feb 2006 8:25:02 UTC 27 Feb 2006 16:01:48 UTC Over Client error Computing 6,715.63 11.90 --- 11879720 9572581 22 Feb 2006 10:02:46 UTC 25 Feb 2006 4:22:05 UTC Over Success Done 25,620.59 45.55 45.55 11879602 9572468 22 Feb 2006 10:01:08 UTC 8 Mar 2006 10:01:08 UTC In Progress Unknown New --- --- --- 11873799 9566928 22 Feb 2006 6:00:47 UTC 22 Feb 2006 9:58:35 UTC Over Client error Computing 5,748.20 10.22 --- 11840411 9535527 21 Feb 2006 1:30:19 UTC 21 Feb 2006 4:38:20 UTC Over Client error Computing 3,483.77 6.19 --- 11689498 9467768 18 Feb 2006 18:50:50 UTC 19 Feb 2006 9:52:24 UTC Over Client error Computing 9,606.52 16.99 --- 11655933 9449069 18 Feb 2006 7:35:25 UTC 18 Feb 2006 16:54:01 UTC Over Client error Computing 10,126.22 17.91 --- 11606229 9417205 17 Feb 2006 3:51:27 UTC 17 Feb 2006 18:25:28 UTC Over Client error Computing 11,394.44 20.16 --- 11606180 9417159 17 Feb 2006 3:51:27 UTC 17 Feb 2006 22:08:54 UTC Over Client error Computing 9,874.17 17.47 --- 11537668 9363268 16 Feb 2006 4:57:07 UTC 17 Feb 2006 3:51:27 UTC Over Client error Computing 14,039.75 24.83 --- 11537657 9363257 16 Feb 2006 4:57:07 UTC 16 Feb 2006 22:55:01 UTC Over Client error Computing 15,775.41 27.90 --- 11454642 9263916 15 Feb 2006 8:54:16 UTC 15 Feb 2006 14:19:01 UTC Over Client error Done 6,308.38 11.16 --- 11454641 9285436 15 Feb 2006 8:54:16 UTC 15 Feb 2006 14:19:01 UTC Over Client error Done 4,996.91 8.84 --- 11229047 5611704 14 Feb 2006 7:24:17 UTC 15 Feb 2006 8:54:16 UTC Over Client error Computing 25,129.64 44.45 --- 7867088 2157283 23 Jan 2006 7:45:08 UTC 23 Jan 2006 12:08:57 UTC Over Client error Computing 6,295.66 11.23 --- 7411296 5919940 24 Jan 2006 18:34:04 UTC 25 Jan 2006 4:06:00 UTC Over Client error Computing 18,074.89 32.23 --- As far as I can see, other computers can solve the same WUs. But if there isn't a workaround, I am thinking about suspending the Rosetta project until the client stabilizes. Maybe the next version. Nicolas Velazquez noquierocomprar@hotmail.com |
Osku87 Send message Joined: 1 Nov 05 Posts: 17 Credit: 280,268 RAC: 0 |
Got a stuck WU. It always sticks at step 20840 (1.0%), and when I restart the client the calculation starts all over. I tried restarting the client three times with the same effect. (That usually helps when it is stuck at 1.0%.) Now aborting. Result ID |
Team TMR Send message Joined: 2 Nov 05 Posts: 21 Credit: 1,583,679 RAC: 0 |
This one, WU 9696277, was stuck at 1% for 3 days! I've just aborted it. No wonder my daily points have taken a hit. Looking forward to getting the credit it... |
casio7131 Send message Joined: 10 Oct 05 Posts: 35 Credit: 149,748 RAC: 0 |
not too sure whether you're still concerned with these... ABINITgv_hom021_1gvp__322_53_0 https://boinc.bakerlab.org/rosetta/result.php?resultid=12134791 was stuck at 1% after >5 hours using boinc. [edit: i also checked the graphics to ensure that it actually was stuck.] now using the command line method, it is at 13.1% after 1h 18min. rosetta_4.82_windows_intelx86.exe xx 1gvp _ -output_silent_gz -silent -increase_cycles 10 -new_centroid_packing -no_filters -nstruct 10 -protein_name_prefix hom021_ -frags_name_prefix hom021_ -constant_seed -jran 3884548 |
ecafkid Send message Joined: 5 Oct 05 Posts: 40 Credit: 15,177,319 RAC: 0 |
This one had run 24+ hrs with 27 left to go. I hope to get some credit out of these failing WUs. 3/2/2006 5:22:32 PM|rosetta@home|Unrecoverable error for result ABINITew_hom002_1ew4A_322_56_0 (aborted via GUI RPC) |
AKH54 Send message Joined: 8 Dec 05 Posts: 4 Credit: 1,812,208 RAC: 0 |
I did set my target run time to 2 hours, but I have a WU that has been running for over 6 hrs and is still on 1%, and the completion time is increasing. Is this a duff WU or should I try and persevere? Alan. This is the second time I have posted this; the first time was in the wrong place. |
ecafkid Send message Joined: 5 Oct 05 Posts: 40 Credit: 15,177,319 RAC: 0 |
2 more gone. This is getting crazy. 3/3/2006 6:46:23 AM|rosetta@home|Unrecoverable error for result ABINITen_hom009_1enh__322_39_0 ( - exit code -1073741811 (0xc000000d)) 3/3/2006 6:46:26 AM|rosetta@home|Unrecoverable error for result HBLR_1.0_1mky_323_1748_0 ( - exit code -1073741811 (0xc000000d)) |
Moderator9 Volunteer moderator Send message Joined: 22 Jan 06 Posts: 1014 Credit: 0 RAC: 0 |
"I did set my target run time to 2 hours but I have a WU that has been running for over 6 hrs & still on 1% & the completion time is increasing" Every WU will complete at LEAST one model before reporting back. For some types of WU this can take 6-9 hours. There is an explanation of this in the FAQs thread. Moderator9 ROSETTA@home FAQ Moderator Contact |
David Baker Volunteer moderator Project administrator Project developer Project scientist Send message Joined: 17 Sep 05 Posts: 705 Credit: 559,847 RAC: 0 |
"2 more gone. This is getting crazy." Sorry! David is working on a general fix for this error and running lots of tests on ralph this week. There should be a solution soon ... |
vfrey Send message Joined: 17 Sep 05 Posts: 9 Credit: 690,341 RAC: 485 |
a WU stuck at 1 % https://boinc.bakerlab.org/rosetta/workunit.php?wuid=9740377 unfortunately it ran for more than 29 hours until I noticed it... |
ecafkid Send message Joined: 5 Oct 05 Posts: 40 Credit: 15,177,319 RAC: 0 |
2 more 3/3/2006 7:50:23 PM|rosetta@home|Unrecoverable error for result ABINITpt_hom015_1ptq__322_19_0 ( - exit code -1073741811 (0xc000000d)) 3/3/2006 7:50:25 PM|rosetta@home|Unrecoverable error for result ABINITpg_hom016_1pgx__322_6_0 ( - exit code -1073741811 (0xc000000d)) |
ecafkid Send message Joined: 5 Oct 05 Posts: 40 Credit: 15,177,319 RAC: 0 |
2 more 3/4/2006 7:43:47 AM|rosetta@home|Unrecoverable error for result ABINITa1_hom013_1a19A_320_84_1 ( - exit code -1073741811 (0xc000000d)) 3/4/2006 7:43:48 AM|rosetta@home|Unrecoverable error for result ABINITen_hom019_1enh__322_79_1 ( - exit code -1073741811 (0xc000000d)) |
simpe73 Send message Joined: 20 Feb 06 Posts: 4 Credit: 438,570 RAC: 0 |
Here are some. I only checked 3 of my 40 computers, so there are a lot more that will not be reported. It seems that my problems are related to pausing the WU. All computers stop crunching while the user is active. HOST 168941 4.3.2006 12:45:42||Suspending computation and network activity - user is active 4.3.2006 12:45:42|rosetta@home|Pausing result FAST_ABINITIO_DEFAULT_2acy__306_3270_2 (removed from memory) 4.3.2006 12:45:42|rosetta@home|Pausing result ABINITsc_hom005_1scjB_322_42_1 (removed from memory) 4.3.2006 12:45:43|rosetta@home|Unrecoverable error for result FAST_ABINITIO_DEFAULT_2acy__306_3270_2 ( - exit code -1073741819 (0xc0000005)) 4.3.2006 12:45:43|rosetta@home|Unrecoverable error for result ABINITsc_hom005_1scjB_322_42_1 ( - exit code -1073741819 (0xc0000005)) HOST 168943 4.3.2006 12:45:51|rosetta@home|Pausing result NEW_SOFT_CENTROID_PACKING_1mky_225_6294_2 (removed from memory) 4.3.2006 12:45:51|rosetta@home|Pausing result ABINITig_hom007_1ig5A_322_2_1 (removed from memory) 4.3.2006 12:45:53|rosetta@home|Unrecoverable error for result NEW_SOFT_CENTROID_PACKING_1mky_225_6294_2 ( - exit code -164 (0xffffff5c)) HOST 168960 4.3.2006 12:51:12||Suspending computation and network activity - user is active 4.3.2006 12:51:12|rosetta@home|Pausing result ABINITrn_hom025_1rnbA_322_79_0 (removed from memory) 4.3.2006 12:51:12|rosetta@home|Pausing result PRODUCTION_ABINITIO_INCREASECYCLES50_1cg5B_317_853_2 (removed from memory) 4.3.2006 12:51:14|rosetta@home|Unrecoverable error for result ABINITrn_hom025_1rnbA_322_79_0 ( - exit code -164 (0xffffff5c)) |
Stephen Miller Send message Joined: 18 Sep 05 Posts: 13 Credit: 16,294,215 RAC: 0 |
I have had two WUs stuck at 1% recently. This one wasted 50+ hours but finished after one BOINC restart: 2/25/2006 1:50:25 AM|rosetta@home|Resuming computation for result PRODUCTION_ABINITIO_DBFLAGS_1aiu__307_294_1 using rosetta version 482 This one wasted 15 hours before I restarted BOINC. Now it is at 45:00 minutes and still at 1% and showing 8:00:00 to complete: 3/4/2006 3:27:35 PM|rosetta@home|Resuming computation for result HB_BARCODE_30_2chf__347_425_0 using rosetta version 482 I plan to reboot and restart to see if it will complete. Update - It has now passed 1% and I expect it to finish. Stephen M |
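
A note on the exit codes quoted in the posts above: the client prints each one as a signed 32-bit number followed by the same value in hex. On Windows, 0xc0000005 is the NTSTATUS value STATUS_ACCESS_VIOLATION and 0xc000000d is STATUS_INVALID_PARAMETER. The code -164 (0xffffff5c) is not one of those, so it is left as "unknown" here. Below is a minimal Python sketch, not part of BOINC or Rosetta, that decodes the codes and tallies them from pasted log text. The names `to_unsigned32`, `describe`, and `tally_errors` are made up for illustration, and the regular expression only assumes the "Unrecoverable error for result ... ( - exit code ...)" wording seen in this thread.

```python
import re
from collections import Counter

# Windows NTSTATUS values; only the two that appear in this thread are listed.
NTSTATUS_NAMES = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC000000D: "STATUS_INVALID_PARAMETER",
}

# Matches lines as quoted in the thread, for example:
# "Unrecoverable error for result NAME ( - exit code -1073741819 (0xc0000005))"
# Lines such as "(aborted via GUI RPC)" carry no exit code and are skipped.
ERROR_RE = re.compile(
    r"Unrecoverable error for result (\S+) "
    r"\( - exit code (-?\d+) \(0x[0-9a-fA-F]+\)\)"
)


def to_unsigned32(code: int) -> int:
    """Reinterpret a signed 32-bit exit code as an unsigned value."""
    return code & 0xFFFFFFFF


def describe(code: int) -> str:
    """Return the hex form of an exit code plus a name when one is known."""
    unsigned = to_unsigned32(code)
    name = NTSTATUS_NAMES.get(unsigned, "unknown")
    return f"0x{unsigned:08x} ({name})"


def tally_errors(log_text: str) -> Counter:
    """Count unrecoverable errors in pasted log text, grouped by exit code."""
    counts = Counter()
    for _result_name, code in ERROR_RE.findall(log_text):
        counts[describe(int(code))] += 1
    return counts


if __name__ == "__main__":
    # Two lines copied from posts in this thread.
    sample = (
        "3/3/2006 6:46:23 AM|rosetta@home|Unrecoverable error for result "
        "ABINITen_hom009_1enh__322_39_0 ( - exit code -1073741811 (0xc000000d)) "
        "4.3.2006 12:45:53|rosetta@home|Unrecoverable error for result "
        "NEW_SOFT_CENTROID_PACKING_1mky_225_6294_2 ( - exit code -164 (0xffffff5c))"
    )
    for label, count in tally_errors(sample).items():
        print(count, label)
```

Grouping by code makes it easier to see whether a host is hitting one failure mode repeatedly or several different ones. It does not say anything about the cause.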
©2024 University of Washington
https://www.bakerlab.org