Message boards : Number crunching : Report stuck & aborted WU here please
Author | Message |
---|---|
Nicolas VC Joined: 10 Nov 05 Posts: 1 Credit: 1,619,083 RAC: 0 |
I have been running BOINC 5.2.13 since Dec 2005. Windows XP detects my PC as a multiprocessor system. My setting for switching between applications has been 60 minutes from the beginning. But for the past four days, all my results except one have been client errors. hostid=59738

| Result ID | Work unit ID | Sent | Time reported or deadline | Server state | Outcome | Client state | CPU time (sec) | Claimed credit | Granted credit |
|---|---|---|---|---|---|---|---|---|---|
| 12247891 | 9795546 | 1 Mar 2006 7:39:42 UTC | 15 Mar 2006 7:39:42 UTC | In Progress | Unknown | New | --- | --- | --- |
| 12228641 | 9791198 | 28 Feb 2006 22:13:42 UTC | 1 Mar 2006 10:44:55 UTC | Over | Client error | Computing | 28,740.80 | 50.91 | --- |
| 12176920 | 9700235 | 27 Feb 2006 16:01:48 UTC | 27 Feb 2006 18:02:38 UTC | Over | Client error | Computing | 3,552.36 | 6.29 | --- |
| 12131713 | 9716891 | 28 Feb 2006 14:31:53 UTC | 28 Feb 2006 22:13:42 UTC | Over | Client error | Computing | 11,330.28 | 20.07 | --- |
| 12121508 | 9706950 | 27 Feb 2006 18:07:44 UTC | 28 Feb 2006 14:31:53 UTC | Over | Client error | Done | 28,363.94 | 50.24 | --- |
| 12092713 | 9694375 | 25 Feb 2006 18:19:48 UTC | 27 Feb 2006 11:51:23 UTC | Over | Client error | Done | 29,023.41 | 51.41 | --- |
| 12069367 | 9671614 | 25 Feb 2006 4:22:05 UTC | 27 Feb 2006 11:51:23 UTC | Over | Client error | Done | 28,695.72 | 50.83 | --- |
| 12007623 | 9624382 | 26 Feb 2006 8:25:02 UTC | 27 Feb 2006 16:01:48 UTC | Over | Client error | Computing | 6,715.63 | 11.90 | --- |
| 11879720 | 9572581 | 22 Feb 2006 10:02:46 UTC | 25 Feb 2006 4:22:05 UTC | Over | Success | Done | 25,620.59 | 45.55 | 45.55 |
| 11879602 | 9572468 | 22 Feb 2006 10:01:08 UTC | 8 Mar 2006 10:01:08 UTC | In Progress | Unknown | New | --- | --- | --- |
| 11873799 | 9566928 | 22 Feb 2006 6:00:47 UTC | 22 Feb 2006 9:58:35 UTC | Over | Client error | Computing | 5,748.20 | 10.22 | --- |
| 11840411 | 9535527 | 21 Feb 2006 1:30:19 UTC | 21 Feb 2006 4:38:20 UTC | Over | Client error | Computing | 3,483.77 | 6.19 | --- |
| 11689498 | 9467768 | 18 Feb 2006 18:50:50 UTC | 19 Feb 2006 9:52:24 UTC | Over | Client error | Computing | 9,606.52 | 16.99 | --- |
| 11655933 | 9449069 | 18 Feb 2006 7:35:25 UTC | 18 Feb 2006 16:54:01 UTC | Over | Client error | Computing | 10,126.22 | 17.91 | --- |
| 11606229 | 9417205 | 17 Feb 2006 3:51:27 UTC | 17 Feb 2006 18:25:28 UTC | Over | Client error | Computing | 11,394.44 | 20.16 | --- |
| 11606180 | 9417159 | 17 Feb 2006 3:51:27 UTC | 17 Feb 2006 22:08:54 UTC | Over | Client error | Computing | 9,874.17 | 17.47 | --- |
| 11537668 | 9363268 | 16 Feb 2006 4:57:07 UTC | 17 Feb 2006 3:51:27 UTC | Over | Client error | Computing | 14,039.75 | 24.83 | --- |
| 11537657 | 9363257 | 16 Feb 2006 4:57:07 UTC | 16 Feb 2006 22:55:01 UTC | Over | Client error | Computing | 15,775.41 | 27.90 | --- |
| 11454642 | 9263916 | 15 Feb 2006 8:54:16 UTC | 15 Feb 2006 14:19:01 UTC | Over | Client error | Done | 6,308.38 | 11.16 | --- |
| 11454641 | 9285436 | 15 Feb 2006 8:54:16 UTC | 15 Feb 2006 14:19:01 UTC | Over | Client error | Done | 4,996.91 | 8.84 | --- |
| 11229047 | 5611704 | 14 Feb 2006 7:24:17 UTC | 15 Feb 2006 8:54:16 UTC | Over | Client error | Computing | 25,129.64 | 44.45 | --- |
| 7867088 | 2157283 | 23 Jan 2006 7:45:08 UTC | 23 Jan 2006 12:08:57 UTC | Over | Client error | Computing | 6,295.66 | 11.23 | --- |
| 7411296 | 5919940 | 24 Jan 2006 18:34:04 UTC | 25 Jan 2006 4:06:00 UTC | Over | Client error | Computing | 18,074.89 | 32.23 | --- |

As far as I can see, other computers can solve the same WUs. If there isn't a workaround, I am thinking about suspending the Rosetta project until the client stabilizes, maybe in the next version. Nicolas Velazquez noquierocomprar@hotmail.com |
Osku87 Joined: 1 Nov 05 Posts: 17 Credit: 280,268 RAC: 0 |
Got a stuck WU. It always gets stuck at step 20840 (1.0%), and when I restart the client the calculation starts all over. I tried restarting the client three times with the same result. (Restarting usually helps when a WU is stuck at 1.0%.) Now aborting. Result ID |
Team TMR Joined: 2 Nov 05 Posts: 21 Credit: 1,583,679 RAC: 0 |
This one, WU 9696277, was stuck at 1% for 3 days! I've just aborted it. No wonder my daily points have taken a hit. Looking forward to getting the credit it... |
casio7131 Joined: 10 Oct 05 Posts: 35 Credit: 149,748 RAC: 0 |
not too sure whether you're still concerned with these... ABINITgv_hom021_1gvp__322_53_0 https://boinc.bakerlab.org/rosetta/result.php?resultid=12134791 was stuck at 1% after >5 hours using boinc. [edit: i also checked the graphics to ensure that it actually was stuck.] now using the command line method, it is at 13.1% after 1h 18min. rosetta_4.82_windows_intelx86.exe xx 1gvp _ -output_silent_gz -silent -increase_cycles 10 -new_centroid_packing -no_filters -nstruct 10 -protein_name_prefix hom021_ -frags_name_prefix hom021_ -constant_seed -jran 3884548 |
ecafkid Joined: 5 Oct 05 Posts: 40 Credit: 15,177,319 RAC: 0 |
This one ran for 24+ hours with 27 left to go. I hope to get some credit out of these failing WUs. 3/2/2006 5:22:32 PM|rosetta@home|Unrecoverable error for result ABINITew_hom002_1ew4A_322_56_0 (aborted via GUI RPC) |
AKH54 Joined: 8 Dec 05 Posts: 4 Credit: 1,812,208 RAC: 0 |
I set my target run time to 2 hours, but I have a WU that has been running for over 6 hours and is still at 1%, and the completion time is increasing. Is this a duff WU, or should I try to persevere? Alan. This is the second time I have posted this; the first time was in the wrong place. |
ecafkid Joined: 5 Oct 05 Posts: 40 Credit: 15,177,319 RAC: 0 |
2 more gone. This is getting crazy. 3/3/2006 6:46:23 AM|rosetta@home|Unrecoverable error for result ABINITen_hom009_1enh__322_39_0 ( - exit code -1073741811 (0xc000000d)) 3/3/2006 6:46:26 AM|rosetta@home|Unrecoverable error for result HBLR_1.0_1mky_323_1748_0 ( - exit code -1073741811 (0xc000000d)) |
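The negative exit codes in these log lines and the hex values in parentheses are the same 32-bit number, printed once as a signed integer and once as unsigned hex. A minimal Python sketch of that conversion follows. The helper names `to_unsigned_hex` and `describe` and the small lookup table are illustrative assumptions, not part of BOINC or Rosetta; the two names in the table are standard Windows NTSTATUS codes, and which condition inside Rosetta actually triggered them is not something the logs show.

```python
# Hypothetical helpers for reading the exit codes reported in this thread.
# Nothing here is a BOINC or Rosetta API.

# Standard Windows NTSTATUS values matching two of the codes in the logs.
KNOWN_NTSTATUS = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC000000D: "STATUS_INVALID_PARAMETER",
}


def to_unsigned_hex(exit_code: int) -> str:
    """Reinterpret a signed 32-bit exit code as unsigned and format it as hex."""
    return f"0x{exit_code & 0xFFFFFFFF:08x}"


def describe(exit_code: int) -> str:
    """Return a one-line summary: signed value, hex value, and a name if known."""
    unsigned = exit_code & 0xFFFFFFFF
    name = KNOWN_NTSTATUS.get(unsigned, "not in this small table")
    return f"{exit_code} -> {to_unsigned_hex(exit_code)} ({name})"


if __name__ == "__main__":
    # Exit codes copied from log lines posted in this thread.
    for code in (-1073741811, -1073741819, -164):
        print(describe(code))
```

The mask with `0xFFFFFFFF` is what maps the signed value onto the hex shown by the BOINC client, so the same helper covers codes such as -164 that are not NTSTATUS values.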
David Baker Volunteer moderator Project administrator Project developer Project scientist Joined: 17 Sep 05 Posts: 705 Credit: 559,847 RAC: 0 |
"2 more gone. This is getting crazy." Sorry! David is working on a general fix for this error and running lots of tests on ralph this week. There should be a solution soon... |
vfrey Joined: 17 Sep 05 Posts: 9 Credit: 705,755 RAC: 266 |
A WU stuck at 1%: https://boinc.bakerlab.org/rosetta/workunit.php?wuid=9740377 Unfortunately, it ran for more than 29 hours before I noticed it... |
ecafkid Joined: 5 Oct 05 Posts: 40 Credit: 15,177,319 RAC: 0 |
2 more 3/3/2006 7:50:23 PM|rosetta@home|Unrecoverable error for result ABINITpt_hom015_1ptq__322_19_0 ( - exit code -1073741811 (0xc000000d)) 3/3/2006 7:50:25 PM|rosetta@home|Unrecoverable error for result ABINITpg_hom016_1pgx__322_6_0 ( - exit code -1073741811 (0xc000000d)) |
ecafkid Joined: 5 Oct 05 Posts: 40 Credit: 15,177,319 RAC: 0 |
2 more 3/4/2006 7:43:47 AM|rosetta@home|Unrecoverable error for result ABINITa1_hom013_1a19A_320_84_1 ( - exit code -1073741811 (0xc000000d)) 3/4/2006 7:43:48 AM|rosetta@home|Unrecoverable error for result ABINITen_hom019_1enh__322_79_1 ( - exit code -1073741811 (0xc000000d)) |
simpe73 Joined: 20 Feb 06 Posts: 4 Credit: 438,570 RAC: 0 |
Here are some. I only checked 3 of my 40 computers, so there are a lot more that will not be reported. It seems that my problems are related to pausing the WU. All computers stop crunching while the user is active. HOST 168941 4.3.2006 12:45:42||Suspending computation and network activity - user is active 4.3.2006 12:45:42|rosetta@home|Pausing result FAST_ABINITIO_DEFAULT_2acy__306_3270_2 (removed from memory) 4.3.2006 12:45:42|rosetta@home|Pausing result ABINITsc_hom005_1scjB_322_42_1 (removed from memory) 4.3.2006 12:45:43|rosetta@home|Unrecoverable error for result FAST_ABINITIO_DEFAULT_2acy__306_3270_2 ( - exit code -1073741819 (0xc0000005)) 4.3.2006 12:45:43|rosetta@home|Unrecoverable error for result ABINITsc_hom005_1scjB_322_42_1 ( - exit code -1073741819 (0xc0000005)) HOST 168943 4.3.2006 12:45:51|rosetta@home|Pausing result NEW_SOFT_CENTROID_PACKING_1mky_225_6294_2 (removed from memory) 4.3.2006 12:45:51|rosetta@home|Pausing result ABINITig_hom007_1ig5A_322_2_1 (removed from memory) 4.3.2006 12:45:53|rosetta@home|Unrecoverable error for result NEW_SOFT_CENTROID_PACKING_1mky_225_6294_2 ( - exit code -164 (0xffffff5c)) HOST 168960 4.3.2006 12:51:12||Suspending computation and network activity - user is active 4.3.2006 12:51:12|rosetta@home|Pausing result ABINITrn_hom025_1rnbA_322_79_0 (removed from memory) 4.3.2006 12:51:12|rosetta@home|Pausing result PRODUCTION_ABINITIO_INCREASECYCLES50_1cg5B_317_853_2 (removed from memory) 4.3.2006 12:51:14|rosetta@home|Unrecoverable error for result ABINITrn_hom025_1rnbA_322_79_0 ( - exit code -164 (0xffffff5c)) |
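One way to check the pausing hypothesis across many hosts is to scan each client log for results that fail within a few seconds of being paused. The sketch below is only an illustration under stated assumptions: it expects one log entry per line in the `day.month.year hour:minute:second|project|message` shape quoted above, and the function name `errors_after_pause` and the 5-second window are hypothetical choices, not anything BOINC provides.

```python
import re
from datetime import datetime, timedelta

# Assumed line shapes, taken from the log excerpts quoted in the post above.
PAUSE_RE = re.compile(r"^(?P<ts>[^|]+)\|[^|]*\|Pausing result (?P<name>\S+)")
ERROR_RE = re.compile(
    r"^(?P<ts>[^|]+)\|[^|]*\|Unrecoverable error for result (?P<name>\S+)"
    r".*exit code (?P<code>-?\d+)"
)
TS_FORMAT = "%d.%m.%Y %H:%M:%S"  # assumption: day.month.year, 24-hour clock


def errors_after_pause(lines, window=timedelta(seconds=5)):
    """Return (result name, exit code, seconds since pause) for every error
    logged within `window` after the same result was paused."""
    paused_at = {}
    hits = []
    for line in lines:
        m = PAUSE_RE.match(line)
        if m:
            paused_at[m["name"]] = datetime.strptime(m["ts"], TS_FORMAT)
            continue
        m = ERROR_RE.match(line)
        if m and m["name"] in paused_at:
            delta = datetime.strptime(m["ts"], TS_FORMAT) - paused_at[m["name"]]
            if timedelta(0) <= delta <= window:
                hits.append((m["name"], int(m["code"]), delta.total_seconds()))
    return hits


if __name__ == "__main__":
    # Two entries copied from the HOST 168941 excerpt above.
    sample = [
        "4.3.2006 12:45:42|rosetta@home|Pausing result "
        "ABINITsc_hom005_1scjB_322_42_1 (removed from memory)",
        "4.3.2006 12:45:43|rosetta@home|Unrecoverable error for result "
        "ABINITsc_hom005_1scjB_322_42_1 ( - exit code -1073741819 (0xc0000005))",
    ]
    for name, code, seconds in errors_after_pause(sample):
        print(name, code, seconds)
```

Logs that use a different timestamp style, such as the `3/3/2006 6:46:23 AM` form seen elsewhere in this thread, would need a different `TS_FORMAT`.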
Stephen Miller Joined: 18 Sep 05 Posts: 13 Credit: 16,294,215 RAC: 0 |
I have had two WUs stuck at 1% recently. This one wasted 50+ hours but finished after one BOINC restart: 2/25/2006 1:50:25 AM|rosetta@home|Resuming computation for result PRODUCTION_ABINITIO_DBFLAGS_1aiu__307_294_1 using rosetta version 482 This one wasted 15 hours before I restarted BOINC. Now it is at 45:00 minutes, still at 1%, and showing 8:00:00 to complete: 3/4/2006 3:27:35 PM|rosetta@home|Resuming computation for result HB_BARCODE_30_2chf__347_425_0 using rosetta version 482 I plan to reboot and restart to see if it will complete. Update: it has now passed 1% and I expect it to finish. Stephen M |
nmelhorn Joined: 16 Oct 05 Posts: 1 Credit: 177,616 RAC: 0 |
The following WU assigned to me: ResultID 12020507 WUID 9636787 Sent 26 Feb 2006 14:21:21 UTC still shows In Progress / Unknown / New on my Results page, though there's no record of it left on my machine. The adjacent WUs failed. I assume I should report it here so the WU can be quickly reassigned elsewhere. --regards, Nate |
OhioDude Joined: 11 Dec 05 Posts: 8 Credit: 4,056,499 RAC: 0 |
Stuck at 1% for 15 hours: ABINITvc_home007_1vcc_337_21_0 Visit my websites honoring some of America's heroes: USS Rich DE-695 USS Bunch DE-694 / APD-79 |
Stu D. Joined: 3 Mar 06 Posts: 8 Credit: 575,867 RAC: 0 |
3/4/2006 11:08:23 PM|rosetta@home|Unrecoverable error for result HOMSdt_homDB009_1dtj__340_108_0 (Incorrect function. (0x1) - exit code 1 (0x1)) |
Fardringle Joined: 22 Feb 06 Posts: 3 Credit: 5,487,674 RAC: 2 |
ABINITwi_hom007_1wit__337_79_0 is stuck at 1% after 11 hours. The system is an Athlon XP 2200+ running Windows 2000 with version 5.2.13 of the BOINC client. |
OhioDude Joined: 11 Dec 05 Posts: 8 Credit: 4,056,499 RAC: 0 |
"Stuck at 1% for 15 hours:" Got another one stuck at 1%: HB_BARCODE_30_1acf_347_958_0 |
OhioDude Joined: 11 Dec 05 Posts: 8 Credit: 4,056,499 RAC: 0 |
And one more: ABINITvi_hom020_2vik_337_83_0 |
Ib Rasmussen Joined: 27 Sep 05 Posts: 16 Credit: 211,416 RAC: 0 |
SSFEATURES_BARCODE_ABINITIO_1acf__334_321_0 was stuck at 1% for 57+ hours. I tried stopping and restarting BOINC, but it restarted the WU at 00:00:00, so I killed it. /Ib |
©2025 University of Washington
https://www.bakerlab.org