Message boards : Number crunching : Report stuck & aborted WU here please
Author | Message |
---|---|
Rebirther Send message Joined: 17 Sep 05 Posts: 116 Credit: 41,315 RAC: 0 |
Damn, everything is going wrong here: another one fell back from 45% to 15% after 14h :(. I will cancel them all and wait for a fix. (Rosetta 4.82, BOINC 5.2.13). I have never had any problems before... 2 of 3 failed :o https://boinc.bakerlab.org/rosetta/result.php?resultid=11747939 https://boinc.bakerlab.org/rosetta/result.php?resultid=11748069 |
Stwainer Send message Joined: 9 Nov 05 Posts: 27 Credit: 4,406,829 RAC: 0 |
I had the following WU stuck at 1% for 2 hours: PRODUCTION_ABINITIO_INCREASECYCLES50_1dhn__312_608_0 |
Jon Kennedy Send message Joined: 1 Oct 05 Posts: 6 Credit: 418,027 RAC: 0 |
This workunit was stuck at 1% after 27h35m: https://boinc.bakerlab.org/rosetta/result.php?resultid=11510637 Nothing occurred on the machine to interrupt crunching. Message log: 2/19/2006 5:04:22 PM|rosetta@home|Starting result PRODUCTION_ABINITIO_RANDOMFRAG_1urnA_309_445_0 using rosetta version 481 2/19/2006 5:04:24 PM|rosetta@home|Started upload of PRODUCTION_ABINITIO_RANDOMFRAG_1ughI_309_445_0_0 2/19/2006 5:04:31 PM|rosetta@home|Finished upload of PRODUCTION_ABINITIO_RANDOMFRAG_1ughI_309_445_0_0 2/19/2006 5:04:31 PM|rosetta@home|Throughput 23263 bytes/sec 2/20/2006 8:26:06 PM||request_reschedule_cpus: project op 2/20/2006 8:26:10 PM|rosetta@home|Sending scheduler request to https://boinc.bakerlab.org/rosetta_cgi/cgi 2/20/2006 8:26:10 PM|rosetta@home|Reason: Requested by user 2/20/2006 8:26:10 PM|rosetta@home|Reporting 7 results 2/20/2006 8:26:15 PM|rosetta@home|Scheduler request to https://boinc.bakerlab.org/rosetta_cgi/cgi succeeded 2/20/2006 10:35:53 PM|rosetta@home|Unrecoverable error for result PRODUCTION_ABINITIO_RANDOMFRAG_1urnA_309_445_0 (aborted via GUI RPC) The next WU PRODUCTION_ABINITIO_RANDOMFRAG_2acy__309_389_0 is also seemingly stuck at 1% after 37+ minutes... <sigh> |
Snake Doctor Send message Joined: 17 Sep 05 Posts: 182 Credit: 6,401,938 RAC: 0 |
I have one stuck on a Mac G4 laptop running OS 10.4.5. The WU is here. The application version is 4.82. The previous "owner" had a client error on this WU. This will be the result ID if I can make it finish. The WU is stuck at 1% complete after 2:15 of CPU time. My time setting is 2 hours. It has completed 97345 steps but shows 1% complete. The WU name is -PRODUCTION_ABINITIO_QUADRUPLELONGRANGEANTIPARALLEL_1acf__311_807 Regards Phil We must look for intelligent life on other planets as it is becoming increasingly apparent we will not find any on our own. |
Snake Doctor Send message Joined: 17 Sep 05 Posts: 182 Credit: 6,401,938 RAC: 0 |
"This workunit was stuck at 1% after 27h35m:" I wonder if we actually have stuck WUs or if they are just some of the ones that used to take 30 hours. Both yours and mine are "PRODUCTION_ABINITIO_xxxx". I just watched the screen saver for a while and it is running over 100,000 steps on the first model, but it is running. It could be that it is just doing more steps per model and therefore taking longer to checkpoint, which would delay the percent complete. It has run for over 20 min, and all the WUs I have seen since the new version of the software was released have only taken about 5 min per model. |
Peter Ingham Send message Joined: 27 Sep 05 Posts: 14 Credit: 4,215,134 RAC: 0 |
FYI, I've just aborted a WU stuck at 1% after 175K seconds. Name: PRODUCTION_ABINITIO_RANDOMFRAG_1vcc__309_441 WU: 9337995 ResultID: 11582797 |
KwintenB Send message Joined: 24 Nov 05 Posts: 6 Credit: 183,329 RAC: 0 |
I've got a WU that has already been crunching for 51h, so I have now suspended it. Is there any chance that I'll get points for this job if I abort it? This is obviously a project fault. Details of the WU: 19/02/2006 04:12:00|rosetta@home|Starting result PRODUCTION_ABINITIO_DBFLAGS_1lis__307_738_0 using rosetta version 481 https://boinc.bakerlab.org/rosetta/workunit.php?wuid=9300013 https://boinc.bakerlab.org/rosetta/result.php?resultid=11470130 |
arklms Send message Joined: 17 Dec 05 Posts: 7 Credit: 177,488 RAC: 0 |
PRODUCTION_ABINITIO_INCREASECYCLES50_1tul__317_178_0 Appears stuck on 1%. Can't start it from the DOS window, it crashed. |
arklms Send message Joined: 17 Dec 05 Posts: 7 Credit: 177,488 RAC: 0 |
PRODUCTION_ABINITIO_INCREASECYCLES50_1tul__317_178_0 I just clicked on the Rosetta graphics, which crashed the computer. Upon reboot, it's 17% and ongoing. Strange, but true. |
Daral Send message Joined: 13 Jan 06 Posts: 13 Credit: 870,334 RAC: 0 |
Got a 1% error for 1 hr 21 minutes. Work Unit Production_Abinitio_increasecycles50_1ten_317_127_0 Running it from command line now with seed 1037999 seems to also get stuck on the first iteration. It's run over 512k steps and is still on the first model. |
Nico Send message Joined: 29 Sep 05 Posts: 1 Credit: 548,959 RAC: 0 |
PRODUCTION_ABINITIO_QUADRUPLELONGRANGEANTIPARALLEL_1tul__311_863 is stuck at 1% (I requested 2h WUs and this one has been running for more than 2h now and is still at 1%): http://666kb.com/i/117ucnv1ep5vl.gif |
O&O Send message Joined: 11 Dec 05 Posts: 25 Credit: 66,900 RAC: 0 |
Hello David PRODUCTION_ABINITIO_1acf__250_809_2 My computer did 13.32 hours on this WU ... before it errored out with -177 (0xffffff4f) Exit status and "Maximum CPU time exceeded". What about the ... 131.16 claimed credits? Regards, O&O |
Runaway1956 Send message Joined: 5 Nov 05 Posts: 19 Credit: 535,400 RAC: 0 |
Well, glad I stopped in to look around. After the upgrade to 4.81, it seemed that none of my previously downloaded WU wanted to run. Which was odd, as I'd already returned a number of similar WU from the same batch. I put those all on hold, and ran some of the newer WU, which said they were for 4.81. I got the 1% glitch on about 4 of them. Hit reset. Everything goes away, and BOINC downloads some new WU. Same thing. 1% lasts about 2 1/2 eternities. Was about to hit reset again, but decided to come here..... Thanks guys. I'll let the little monster run. "FYI, I've just aborted a WU stuck at 1% after 175K seconds" |
XS_team_germany Send message Joined: 2 Jan 06 Posts: 6 Credit: 1,469,591 RAC: 0 |
I uploaded these results today and I received no credit for them:
Host: https://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=166212 The above work units ran 7+ hours each. :( |
Marie Lucie Send message Joined: 9 Dec 05 Posts: 5 Credit: 40,616 RAC: 0 |
Hello, I made the change in Rosetta settings as requested and I got an error again. It ran for 53 minutes and then ... 25/02/2006 10:28:41|rosetta@home|Unrecoverable error for result HBLR_1.0_1hz6_321_998_0 ( - exit code -1073741819 (0xc0000005)) 25/02/2006 10:28:42||request_reschedule_cpus: process exited 25/02/2006 10:28:42|rosetta@home|Computation for result HBLR_1.0_1hz6_321_998_0 finished I have one WU remaining. We will see. |
Marie Lucie Send message Joined: 9 Dec 05 Posts: 5 Credit: 40,616 RAC: 0 |
Same problem. It reached about 26% after 41 minutes: 25/02/2006 12:33:35|rosetta@home|Pausing result HBLR_1.0_1n0u_321_977_0 (removed from memory) 25/02/2006 12:33:47|rosetta@home|Unrecoverable error for result HBLR_1.0_1n0u_321_977_0 ( - exit code -1073741819 (0xc0000005)) 25/02/2006 12:33:47||request_reschedule_cpus: process exited 25/02/2006 12:33:47|rosetta@home|Computation for result HBLR_1.0_1n0u_321_977_0 finished |
Insidious Send message Joined: 10 Nov 05 Posts: 49 Credit: 604,937 RAC: 0 |
"Same problem. It achieved +/- 26% after 41 minutes" I think you need to set "leave in memory while suspended" to YES in your preferences to help with this. -Sid Proudly crunching with TeAm Anandtech |
Steve Shedroff Send message Joined: 7 Nov 05 Posts: 11 Credit: 250,657 RAC: 0 |
This work unit was at 1% after 86 hours with 96 hours to go so I aborted it. 2/26/2006 1:15:09 PM|rosetta@home|Unrecoverable error for result PRODUCTION_ABINITIO_INCREASECYCLES50_2acy__317_316_0 (aborted via GUI RPC) Regards, Steve |
Dimitris Hatzopoulos Send message Joined: 5 Jan 06 Posts: 336 Credit: 80,939 RAC: 0 |
workunit 9626703 (ABINITbg_hom023_1bgf__320_10) got stuck at Model 1, Step 20710 for 14 CPU hours, with 25+ to go, until I noticed. I exited & restarted BOINC, it started the WU all over from scratch, but it again (after running for a few seconds) stuck at M1,S20710. I'm aborting it. Best UFO Resources Wikipedia R@h How-To: Join Distributed Computing projects that benefit humanity |
chris mathews Send message Joined: 21 Jan 06 Posts: 1 Credit: 1,192,890 RAC: 0 |
Had never received this error until yesterday: 2006-02-26 15:06:35 [rosetta@home] Unrecoverable error for result ABINITbq_hom014_1bq9A_320_100_0 ( - exit code -1073741811 (0xc000000d)) 2006-02-26 15:38:11 [rosetta@home] Unrecoverable error for result ABINITcg_hom025_1cg5B_320_2_0 ( - exit code -1073741811 (0xc000000d)) 2006-02-26 15:48:43 [rosetta@home] Unrecoverable error for result ABINITce_hom005_1cei__320_11_0 ( - exit code -1073741811 (0xc000000d)) 2006-02-26 15:59:32 [rosetta@home] Unrecoverable error for result ABINITce_hom008_1cei__320_15_0 ( - exit code -1073741811 (0xc000000d)) 2006-02-26 16:28:40 [rosetta@home] Unrecoverable error for result ABINITce_hom012_1cei__320_18_0 ( - exit code -1073741811 (0xc000000d)) 2006-02-26 18:18:13 [rosetta@home] Unrecoverable error for result ABINITct_hom006_1ctf__320_25_0 ( - exit code -1073741811 (0xc000000d)) 2006-02-26 18:29:54 [rosetta@home] Unrecoverable error for result ABINITce_hom002_1cei__320_51_0 ( - exit code -1073741811 (0xc000000d)) 2006-02-26 18:39:43 [rosetta@home] Unrecoverable error for result ABINITcc_hom003_1cc8A_320_54_0 ( - exit code -1073741811 (0xc000000d)) 2006-02-26 19:13:46 [rosetta@home] Unrecoverable error for result ABINITcc_hom016_1cc8A_320_64_0 ( - exit code -1073741811 (0xc000000d)) 2006-02-26 19:38:46 [rosetta@home] Unrecoverable error for result ABINITcg_hom028_1cg5B_320_74_0 ( - exit code -1073741811 (0xc000000d)) 2006-02-26 19:54:49 [rosetta@home] Unrecoverable error for result ABINITct_hom010_1ctf__320_81_0 ( - exit code -1073741811 (0xc000000d)) 2006-02-26 20:11:24 [rosetta@home] Unrecoverable error for result ABINITcg_hom015_1cg5B_320_85_0 ( - exit code -1073741811 (0xc000000d)) 2006-02-26 21:21:59 [rosetta@home] Unrecoverable error for result ABINITcg_hom021_1cg5B_320_93_0 ( - exit code -1073741811 (0xc000000d)) 2006-02-26 21:35:17 [rosetta@home] Unrecoverable error for result ABINITe6_hom021_1e6iA_320_10_0 ( - exit code -1073741811 (0xc000000d)) 2006-02-26 22:05:31 [rosetta@home] Unrecoverable error for result HBLR_1.0_1b72_321_3512_1 ( - exit code -1073741811 (0xc000000d)) 2006-02-26 22:18:07 [rosetta@home] Unrecoverable error for result ABINITce_hom022_1cei__320_33_0 ( - exit code -1073741811 (0xc000000d)) 2006-02-26 22:42:40 [rosetta@home] Unrecoverable error for result ABINITcg_hom005_1cg5B_320_56_0 ( - exit code -1073741811 (0xc000000d)) 2006-02-26 23:24:48 [rosetta@home] Unrecoverable error for result ABINITel_hom022_1elwA_320_11_0 ( - exit code -1073741811 (0xc000000d)) 2006-02-26 23:37:54 [rosetta@home] Unrecoverable error for result ABINITe6_hom025_1e6iA_320_23_0 ( - exit code -1073741811 (0xc000000d)) 2006-02-26 23:56:02 [rosetta@home] Unrecoverable error for result ABINITdh_hom016_1dhn__320_22_0 ( - exit code -1073741811 (0xc000000d)) 2006-02-27 00:03:35 [rosetta@home] Unrecoverable error for result ABINITct_hom015_1ctf__320_31_0 ( - exit code -1073741811 (0xc000000d)) 2006-02-27 00:21:59 [rosetta@home] Unrecoverable error for result ABINITdh_hom005_1dhn__320_34_0 ( - exit code -1073741811 (0xc000000d)) 2006-02-27 00:36:28 [rosetta@home] Unrecoverable error for result ABINITe6_hom030_1e6iA_320_38_0 ( - exit code -1073741811 (0xc000000d)) 2006-02-27 02:09:20 [rosetta@home] Unrecoverable error for result ABINITe6_hom026_1e6iA_320_41_0 ( - exit code -1073741811 (0xc000000d)) 2006-02-27 03:32:48 [rosetta@home] Unrecoverable error for result ABINITdh_hom030_1dhn__320_70_0 ( - exit code -1073741811 (0xc000000d)) |
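
Several reports above quote exit codes as a signed decimal number next to a hex value (for example -1073741819 and 0xc0000005). The following is a minimal Python sketch, not part of BOINC or Rosetta, of how the two forms relate and how pasted log lines could be tallied by exit code. The regular expression is an assumption based only on the log lines quoted in this thread, and the helper names (`to_unsigned32`, `describe`, `tally_errors`) are hypothetical. The two Windows labels are the standard NTSTATUS names for those values; the label for -177 only repeats the "Maximum CPU time exceeded" text reported above.

```python
import re
from collections import Counter

# Labels for the codes seen in this thread (keys are unsigned 32-bit values).
KNOWN_CODES = {
    0xC0000005: "STATUS_ACCESS_VIOLATION (Windows)",
    0xC000000D: "STATUS_INVALID_PARAMETER (Windows)",
    0xFFFFFF4F: "-177, reported above as 'Maximum CPU time exceeded'",
}

# Assumed pattern, modelled on lines such as:
#   Unrecoverable error for result NAME ( - exit code -1073741819 (0xc0000005))
ERROR_RE = re.compile(
    r"Unrecoverable error for result (?P<result>\S+) "
    r"\(\s*-\s*exit code (?P<code>-?\d+) \(0x[0-9a-fA-F]+\)\)"
)


def to_unsigned32(code: int) -> int:
    """Reinterpret a signed 32-bit exit code as an unsigned 32-bit value."""
    return code & 0xFFFFFFFF


def describe(code: int) -> str:
    """Return the hex form of an exit code plus a label if one is known."""
    unsigned = to_unsigned32(code)
    return f"0x{unsigned:08x} {KNOWN_CODES.get(unsigned, 'unknown')}"


def tally_errors(log_text: str) -> Counter:
    """Count 'Unrecoverable error' entries per unsigned exit code."""
    counts = Counter()
    for match in ERROR_RE.finditer(log_text):
        counts[to_unsigned32(int(match.group("code")))] += 1
    return counts


# Three lines copied from the posts above, used as sample input.
SAMPLE = "\n".join([
    "25/02/2006 10:28:41|rosetta@home|Unrecoverable error for result "
    "HBLR_1.0_1hz6_321_998_0 ( - exit code -1073741819 (0xc0000005))",
    "2006-02-26 15:06:35 [rosetta@home] Unrecoverable error for result "
    "ABINITbq_hom014_1bq9A_320_100_0 ( - exit code -1073741811 (0xc000000d))",
    "2006-02-26 22:05:31 [rosetta@home] Unrecoverable error for result "
    "HBLR_1.0_1b72_321_3512_1 ( - exit code -1073741811 (0xc000000d))",
])

if __name__ == "__main__":
    for code, count in tally_errors(SAMPLE).items():
        print(count, describe(code))
```

Lines that end in "(aborted via GUI RPC)" carry no exit code, so this pattern skips them on purpose; they would need a separate check if manual aborts should be counted too.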
©2025 University of Washington
https://www.bakerlab.org