Message boards : Number crunching : Report stuck & aborted WU here please
Author | Message |
---|---|
Delk Joined: 20 Feb 06 Posts: 25 Credit: 995,624 RAC: 0 |
Work units aborted at 1%. Result IDs: 14740903 & 14592911 |
Team_Elteor_Borislavj~Intelligence Joined: 7 Dec 05 Posts: 14 Credit: 56,027 RAC: 0 |
HB_BARCODE_30_4ubpA_351_16734_0 still stuck at 1% after 9 hours of crunching with 100% load! |
mewbysea Joined: 29 Jan 06 Posts: 17 Credit: 15,917,465 RAC: 1,790 |
FA_RLXpt_hom004_1ptq_361_127_0 stuck at 83.81%. WU ID = 11670028; Result ID = 14405752; PC (153231) = Dell 8400, P4 (HT) 3.2 GHz (stock), WIN XP (SP2). Aborted after over 30 hours of crunching. |
Grutte Pier [Wa Oars]~Nemesis Joined: 8 Nov 05 Posts: 3 Credit: 386,730 RAC: 0 |
After a bogus WU on one of my PCs that cost me over 300 credits (it was hanging for a long time) I went through all of my WUs. This is a list of all my recent WUs that were aborted with an error:

Intel(R) Pentium(R) M processor 1.73GHz Microsoft Windows XP Professional Edition, Service Pack 2, (05.01.2600.00)
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=11410757
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=10507541
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=10454400
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=10309222
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=10644097

AuthenticAMD mobile AMD Athlon(tm) XP 2000+ Microsoft Windows XP Professional Edition, Service Pack 2, (05.01.2600.00)
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=11665942
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=11639527
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=11478408

Intel(R) Pentium(R) 4 CPU 1.60GHz (@2.40GHz) Microsoft Windows XP Professional Edition, Service Pack 2, (05.01.2600.00)
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=11045076
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=11068185
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=11008648
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=10993712
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=10976761
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=10961239
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=10961160
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=10931034
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=10928750
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=10808712

AuthenticAMD mobile AMD Athlon(tm) XP-M 2800+ (LV) Microsoft Windows XP Professional Edition, Service Pack 2, (05.01.2600.00)
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=10419709
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=10421027
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=10438624
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=10529395
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=10455024
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=10417302
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=10390604
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=10095664
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=10064299
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=10015662

AuthenticAMD AMD Sempron(tm) Processor 3000+ Microsoft Windows XP Professional Edition, Service Pack 2, (05.01.2600.00)
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=10387309
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=9956247

AuthenticAMD AMD Athlon(tm) 64 X2 Dual Core Processor 4400+ Microsoft Windows XP Professional Edition, Service Pack 2, (05.01.2600.00)
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=10629816
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=10459506
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=10283291
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=10059896
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=9544176
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=5796746
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=5733085

AuthenticAMD AMD Sempron(tm) 2400+ Microsoft Windows XP Professional Edition, Service Pack 2, (05.01.2600.00)
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=11452627
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=11439431
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=10345630
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=10727823
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=10548190

I'm running Rosetta for the medical purpose, but I think there's over 1000 credits in the list above... |
TA_GeoffS Joined: 16 Dec 05 Posts: 2 Credit: 704,640 RAC: 0 |
I'll try to be more vigilant with respect to the status of the WU when I killed it, but I don't think any of these were 1% issues... they were well into the WU and stuck (no progress over a 20 minute span, graphic not moving at all... should I be looking for something else?) All machines are dedicated crunchers with very little else being done on them.

https://boinc.bakerlab.org/rosetta/workunit.php?wuid=11711529 (68k CPU seconds, 358 pts)
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=11682684 (117k CPU seconds, 606 pts)
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=11011944 (142k CPU seconds, 732 pts)
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=11733000 (43k CPU seconds, 247 pts)
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=11732999 (71k CPU seconds, 406 pts)
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=11733500 (60k CPU seconds, 347 pts)
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=11580576 (3k CPU seconds, 19 pts) |
Rossmor35 Joined: 24 Sep 05 Posts: 4 Credit: 84,870 RAC: 0 |
This WU was stuck at 1% for 6.5 hrs before I aborted it. https://boinc.bakerlab.org/rosetta/workunit.php?wuid=11960938 |
Hoogie Joined: 4 Nov 05 Posts: 13 Credit: 1,572,894 RAC: 0 |
The following workunit 12125177, HB_BARCODE_30_1c8cA_351_20458, has stopped at Model 1 Step 20167. This is repeatable, and I have aborted it. |
anders n Joined: 19 Sep 05 Posts: 403 Credit: 537,991 RAC: 0 |
This WU https://boinc.bakerlab.org/rosetta/workunit.php?wuid=11694953 got stuck at Model 1, step 20690, at 1% with 100% CPU. After a restart it got stuck at the same place again. Aborted. Anders n |
Rich Joined: 30 Nov 05 Posts: 5 Credit: 594,384 RAC: 0 |
Good morning. Attached is a work unit I just aborted at 1% after 16 hrs or so. I assume you can pull up the result codes. Let me know if there is more information y'all usually collect and report. I just discovered this thread and will make an effort to report more often. Hope y'all find a solution. I get these about once every 2 weeks. What is really frustrating to me is to come home from travel and find several days wasted on a 1% work unit. However, I understand that it is a work in progress. Take care and have a good day. Rich Seyfert. Work unit name = FA_RLX56_hom014_256bA_362_392_0. Rich Seyfert, Eatontown, NJ, SeyfertR@att.net |
casio7131 Joined: 10 Oct 05 Posts: 35 Credit: 149,748 RAC: 0 |
stuck at 1% after 11h40min:
27/03/2006 10:32:47 PM|rosetta@home|Pausing task HB_BARCODE_30_5croA_351_23561_0 (left in memory)
https://boinc.bakerlab.org/rosetta/result.php?resultid=14998210

command executed:
projects/boinc.bakerlab.org_rosetta/rosetta_4.82_windows_intelx86.exe cc 5cro A -abrelax -stringent_relax -more_relax_cycles -output_chi_silent -vary_omega -rand_envpair_res_wt -rand_SS_wt -farlx -ex1 -ex2 -silent -barcode_from_fragments -new_centroid_packing -barcode_from_fragments_length 30 -ssblocks -barcode_mode 3 -omega_weight 0.5 -jitter_frag -jitter_variation gauss -output_silent_gz -nstruct 10 -paths ccfrags200.txt -relax_score_filter -filter1 -85 -filter2 -95 -short_range_hb_weight 0.50 -long_range_hb_weight 1.0 -increase_cycles 10 -cpu_run_time 7200 -constant_seed -jran 3349200

i've looked at it for a further 10-20 min and it didn't seem to have moved any more. i will restart boinc now and see what happens.
---
after restart, it has stuck again (at the same point). workunit aborted. |
Rich Joined: 30 Nov 05 Posts: 5 Credit: 594,384 RAC: 0 |
Workunit: FA_RLXub_hom008_4ubpA_362_450_0 stuck at 1% for 20 hrs. URL: https://boinc.bakerlab.org/rosetta/result.php?resultid=14787271. Rich Seyfert Eatontown, NJ SeyfertR@att.net |
[AF>Libristes>Jip] otax Joined: 25 Sep 05 Posts: 1 Credit: 312,969 RAC: 0 |
Hello, this is my list of WU client errors:
FA_RLX56_hom007_256bA_362_202
FA_RLXch_hom015_2chf__362_223
FA_RLXwi_hom026_1wit__362_411
FA_RLXac_hom021_2acy__362_430
FA_RLXch_hom017_2chf__362_264
FA_RLXci_hom024_2ci2I_362_380
FA_RLXpt_hom006_1ptq__361_347
FA_RLXpt_hom002_1ptq__361_380
FA_RLXwh_hom024_1who__362_476
FA_RLXwh_hom017_1who__362_476
For a total of about 60 hours... (on 3 PCs in 2 days). Otax. |
Brf Joined: 17 Jan 06 Posts: 1 Credit: 901,500 RAC: 0 |
I have FA_RLXai_hom028_1aiu_359_210_0 stuck at 46.06%. If I close BOINC or reboot, it starts up again, the CPU time resets to 55 minutes, and it runs until the CPU time is at 57 minutes and 57 seconds, where it gets stuck at Model 2, Step 21273. The CPU time continues counting up, but will rewind to 55 minutes if I restart BOINC. |
John Perko Joined: 1 Jan 06 Posts: 3 Credit: 604,568 RAC: 0 |
3/28/2006 4:17:39 PM|rosetta@home|Starting result HB_BARCODE_30_2chf__351_32846_0 using rosetta version 482 The above WU was running for 35 minutes (out of a total time of 2:35). At that point, I turned on the graphic and saw that it was stuck at 1%. A second later it jumped to 29.5% and started filling up the graphs in the graphic box, which were previously empty. |
TCU Computer Science Joined: 7 Dec 05 Posts: 28 Credit: 12,861,977 RAC: 0 |
The following were aborted today. All were stuck at 1.00% after running for 20+ hours:
ID=12326404 name = HB_BARCODE_30_1c8cA_351_32403
ID=12261321 name = HB_BARCODE_30_256bA_351_28680
ID=12034212 name = HB_BARCODE_30_1bk2__351_16205
ID=11076727 name = FA_RLXb3_hom001_1b3aA_359_347
ID=11972587 name = FA_RLXb3_hom010_2chf__362_384
ID=11761822 name = FA_RLXur_hom004_1urnA_362_308 |
David Baker Volunteer moderator Project administrator Project developer Project scientist Joined: 17 Sep 05 Posts: 705 Credit: 559,847 RAC: 0 |
[quote]The following were aborted today. All were stuck at 1.00% after running for 20+ hours[/quote] that is not good. with the jobs currently released, this problem should be greatly reduced, and from the "percent complete" we will be able to tell where the problem is. |
RC Joined: 27 Sep 05 Posts: 13 Credit: 262,048 RAC: 0 |
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=12388765 I suspended this unit after it remained at 1% for almost 4 hours. After suspending BOINC I tried running rosetta standalone for a while; it went to 17% within 15 minutes. When I restarted BOINC and resumed processing on this unit, it reset itself to zero, so I aborted it. |
Grutte Pier [Wa Oars]~Nemesis Joined: 8 Nov 05 Posts: 3 Credit: 386,730 RAC: 0 |
[quote]After a bogus WU on one of my PCs that cost me over 300 credits (it was hanging for a long time) I went through all of my WUs. This is a list of all my recent WUs that were aborted with an error:[/quote] I'm wondering if the claimed credits will be awarded for these bogus WUs? |
Laurenu2 Joined: 6 Nov 05 Posts: 57 Credit: 3,818,778 RAC: 0 |
[quote]that is not good. with the jobs currently released, this problem should be greatly reduced, and from the "percent complete" we will be able to tell where the problem is.[/quote] Yes, on the stuck units, if you restart BOINC, that resets the timer to 0. I aborted another 4 WUs today; that brings the total to 9 since Sunday. Sorry, I am not much good at gathering info. I just hope the returned WUs will help give you the info you need to stop this bug. If You Want The Best You Must forget The Rest ---------------And Join Free-DC---------------- |
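
Two of the checks people in this thread did by hand can be sketched in a few lines of Python. TA_GeoffS treated a task as stuck when it showed no progress over a 20 minute span, and casio7131's command line carries `-cpu_run_time 7200` on a task that had already run for 11h40min. This is only a rough sketch: the function names are made up for illustration and are not part of BOINC or Rosetta, the progress samples are assumed to be noted down by hand or by some script of your own, and reading `-cpu_run_time` as a number of seconds is an assumption.

```python
import shlex
from typing import Optional, Sequence, Tuple

# Hypothetical helpers for illustration only; not part of BOINC or Rosetta.


def is_stalled(samples: Sequence[Tuple[float, float]],
               window_s: float = 20 * 60) -> bool:
    """Return True if percent-complete has not changed for at least window_s.

    samples: (seconds_since_start, percent_complete) pairs, oldest first.
    """
    if not samples:
        return False
    last_t, last_pct = samples[-1]
    held_since = last_t
    # Walk backwards to find how long the latest value has been held.
    for t, pct in reversed(samples):
        if pct != last_pct:
            break
        held_since = t
    return last_t - held_since >= window_s


def target_run_time(command: str) -> Optional[int]:
    """Return the value following -cpu_run_time, or None if the flag is absent."""
    tokens = shlex.split(command)
    for flag, value in zip(tokens, tokens[1:]):
        if flag == "-cpu_run_time":
            return int(value)
    return None


def overrun_ratio(command: str, elapsed_s: float) -> Optional[float]:
    """Elapsed time divided by the target run time (assumed to be in seconds)."""
    target = target_run_time(command)
    if not target:
        return None
    return elapsed_s / target


if __name__ == "__main__":
    # Tail of the command line quoted in casio7131's post.
    cmd = "-increase_cycles 10 -cpu_run_time 7200 -constant_seed -jran 3349200"
    elapsed = 11 * 3600 + 40 * 60  # 11h40min, as reported
    print(overrun_ratio(cmd, elapsed))
    # Three readings ten minutes apart, all at 1%.
    print(is_stalled([(0, 1.0), (600, 1.0), (1200, 1.0)]))
```

A task that sits at the same percentage for the whole window and has also run several times its target run time is probably worth reporting here with its WU or result ID. As John Perko's post shows, a task can sit at 1% for a while and then jump ahead, so a check like this is a hint and not proof that the task is hung.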
©2025 University of Washington
https://www.bakerlab.org