Message boards : Number crunching : Problems with Minirosetta 1.80
Author | Message |
---|---|
Yifan Song Volunteer moderator Project developer Project scientist Send message Joined: 26 May 09 Posts: 62 Credit: 7,322 RAC: 0 |
In this version: New protein-protein docking protocol. New rotamer library. |
nick n Send message Joined: 26 Aug 07 Posts: 49 Credit: 219,102 RAC: 0 |
I am getting a lot of errors on my Mac. I have tried resetting, detaching, and reattaching, to no avail. Here are a few WU examples: https://boinc.bakerlab.org/rosetta/result.php?resultid=261129500 https://boinc.bakerlab.org/rosetta/result.php?resultid=261082154 https://boinc.bakerlab.org/rosetta/result.php?resultid=261052997 https://boinc.bakerlab.org/rosetta/result.php?resultid=261042205 https://boinc.bakerlab.org/rosetta/result.php?resultid=260869175 https://boinc.bakerlab.org/rosetta/result.php?resultid=260866803 https://boinc.bakerlab.org/rosetta/result.php?resultid=260840258 |
Bill Hepburn Send message Joined: 18 Sep 05 Posts: 14 Credit: 14,953,680 RAC: 584 |
I have had three now that came up with a "compute error" after they had almost finished. Don't think it is on my end. They were on two different computers (one XP Pro, one Win Server 2003). Two of them have been reissued and the second person errored out too. The last one just went out. Other 1.80 tasks run fine, other projects are running just fine. https://boinc.bakerlab.org/rosetta/workunit.php?wuid=238330815 https://boinc.bakerlab.org/rosetta/workunit.php?wuid=238113829 https://boinc.bakerlab.org/rosetta/workunit.php?wuid=238093549 |
RC Send message Joined: 27 Sep 05 Posts: 13 Credit: 262,048 RAC: 0 |
I have also had a couple of failures on a Mac. In both cases the run time was less than 10 minutes: https://boinc.bakerlab.org/rosetta/result.php?resultid=261100252 https://boinc.bakerlab.org/rosetta/result.php?resultid=261064311 |
Snags Send message Joined: 22 Feb 07 Posts: 198 Credit: 2,888,320 RAC: 0 |
lb_cutback_all_multi_hb_t290__IGNORE_THE_REST_1LOPA_7_12941_28_0: Outcome = Success and Validate state = valid, but CPU time = 1637.58 secs and no models appear in the stderr out. This does appear, though: Hbond tripped: [2009- 6-25 5:28: 3:] Snags |
slamb Send message Joined: 19 Oct 05 Posts: 2 Credit: 2,050,032 RAC: 0 |
Running out of work. Can't get any more work to download. |
nick n Send message Joined: 26 Aug 07 Posts: 49 Credit: 219,102 RAC: 0 |
Now just about everything is failing. I am going to leave for a while if this isn't fixed soon..... |
Greg_BE Send message Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
Running out of work. Can't get any more work to download. It seems the work server waits until you complete, or get a long way into, your last running tasks before it downloads new work. I have seen this happen a lot lately. I came down to my last 2 tasks (1 per core) and was running them when I got my huge quota (current + 5 days extra) of new work. See if that is happening on your system. |
TomaszPawel Send message Joined: 28 Apr 07 Posts: 54 Credit: 2,791,145 RAC: 0 |
I found a "bug". This WU made only 84.37 credit but was running 22,555.02 sec.... This WU made 84.34 credit and was running 10,652.67 sec.... So is it a bug, or is it normal that for a WU running 2x longer I get the same credit? WWW of Polish National Team - Join! Crunch! Win! |
P . P . L . Send message Joined: 20 Aug 06 Posts: 581 Credit: 4,865,274 RAC: 0 |
Hi. This one ran for over ten hours on my six hour runtime then fell over, NOT GOOD. https://boinc.bakerlab.org/rosetta/workunit.php?wuid=238272279 Fri 26 Jun 2009 14:59:27 EST|rosetta@home|Output file lb_cutback_all_multi_hb_t325__IGNORE_THE_REST_1ZZMA_12_12955_12_0_0 for task lb_cutback_all_multi_hb_t325__IGNORE_THE_REST_1ZZMA_12_12955_12_0 absent <error_code>-161</error_code> pete. |
adrianxw Send message Joined: 18 Sep 05 Posts: 653 Credit: 11,840,739 RAC: 2 |
I don't know if this is the right place, but I have set 6 hours as the target runtime and this WU has been running 54:05:12 now and claims to be 15.255% complete. I have suspended the task pending comment. Claims to have 88:17:24 to completion. <edit> Mini Rosetta 1.80, Windows XP, BOINC 6.6.20. Wave upon wave of demented avengers march cheerfully out of obscurity into the dream. |
Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
adrianxw, please click the task from the task list and click the properties button. Does it show more than 10 hours of CPU time as well? (because the task list now shows "elapsed time" with the new BOINC version). If you unsuspend the task (and get it running again, perhaps by suspending other tasks for a moment), is it using CPU time? If it has more than 10 hours of actual CPU time, I would suggest aborting the task. Rosetta Moderator: Mod.Sense |
Venturini Dario[VENETO] Send message Joined: 25 May 07 Posts: 22 Credit: 245,028 RAC: 0 |
adrianxw, please click the task from the task list and click the properties button. Does it show more than 10 hours of CPU time as well? (because the task list now shows "elapsed time" with the new BOINC version). I also have a WU that got stuck; luckily I noticed after just 4 hours. Here's a screenshot of the properties of that WU. As you can see, CPU time is just 1 hour + while Run time is 4 hours +. Suspending --> Resuming didn't work to "unstuck" it until I removed the flag from "keep WU's in memory when suspended". After that, suspending --> resuming made it work again from the percentage reached before the stop (43,43%) |
Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
Venturini, are you allowing BOINC to use 100% of CPU? And all of the available CPUs? Is the machine busy working on other applications that are running? Rosetta Moderator: Mod.Sense |
adrianxw Send message Joined: 18 Sep 05 Posts: 653 Credit: 11,840,739 RAC: 2 |
The "Properties" box shows "CPU Time" 00:58:34, the "CPU time at last checkpoint" also shows as 00:58:34 "Elapsed time" 54:05:12 and "Estimated time remaining" 88:17:24. Resuming the task, it started running in "High priority" mode. I think I would have noticed if it had really been sitting there for a couple of days. In the time it has taken to write this, the percentage complete has risen to 18.012% and the estimated completion dropped to 83:58:43. Something weird going on there. I'll leave it running for the moment at least. Wave upon wave of demented avengers march cheerfully out of obscurity into the dream. |
Venturini Dario[VENETO] Send message Joined: 25 May 07 Posts: 22 Credit: 245,028 RAC: 0 |
Venturini, are you allowing BOINC to use 100% of CPU? And all of the available CPUs? Is the machine busy working on other applications that are running? All of the cores (2) are dedicated to BOINC, both running 100%, and the only other application running is Word (I'm writing schemes for my next university exams) plus the background ones (antivirus and so on) ;) Plus, I have only Rosetta on this PC (and WCG, but it's set to no new tasks). OS is Windows Vista Home Premium, BOINC is 6.6.28, CPU is an Intel 7700. And, btw, call me Dario, Venturini is my surname ;) |
PinkPenguin Send message Joined: 26 Apr 09 Posts: 5 Credit: 280,676 RAC: 0 |
Reporting a couple of -161 errors encountered at the end of lb_cutback_all_multi_hb work units which appear to have completed OK. On Windows Vista (Intel Core Duo 2GHz) - BOINC 6.6.36 / Rosetta 1.80: https://boinc.bakerlab.org/rosetta/result.php?resultid=261371341 On Linux Fedora v10 (Intel Pentium 4 3.00GHz) - BOINC 6.4.7 / Rosetta 1.80: https://boinc.bakerlab.org/rosetta/result.php?resultid=261035946 In this case the other task with the same workunit (238257150) completed without errors. I noticed that there are similar reports earlier in this thread (see also message: 61948 from P.P.L.). This may be similar to a series of lb_thread_all_multi errors reported earlier this month. All the best, Richard |
Chris Down Send message Joined: 19 Jun 09 Posts: 1 Credit: 11,750 RAC: 0 |
Also experiencing some compute errors and strange completion times. Seems to be ignoring my settings, too. |
Venturini Dario[VENETO] Send message Joined: 25 May 07 Posts: 22 Credit: 245,028 RAC: 0 |
Venturini, are you allowing BOINC to use 100% of CPU? And all of the available CPUs? Is the machine busy working on other applications that are running? Here you go: completed, reported and validated successfully https://boinc.bakerlab.org/rosetta/result.php?resultid=261619500 |
Rayburner Send message Joined: 4 Oct 05 Posts: 32 Credit: 16,518,823 RAC: 0 |
compute error after 4 hours https://boinc.bakerlab.org/rosetta/result.php?resultid=261844121 real_core_1.5_low200_beta_low200_start_hb_t374__IGNORE_THE_REST_13119_137_0 |