1)
Message boards :
Number crunching :
RAC dropping, BOINC dropping comms
(Message 31703)
Posted 27 Nov 2006 by Team TMR Post: Is the boinc.exe task still running when this happens?
2)
Problems with Rosetta version 5.40
(Message 31118)
Posted 14 Nov 2006 by Team TMR Post: I woke up this morning to find that over 20 WUs failed overnight. It's good to see the cause has already been found though.
3)
Report Problems with Rosetta Version 5.25
(Message 21613)
Posted 2 Aug 2006 by Team TMR Post: I've had at least 5 WUs fail in the last week because they ran for over 12 hours (default 3 hour target CPU time in effect), and another is going to fail within the next hour (it's already up to 12.5 hours). They're not getting credit either.
4)
Report stuck & aborted WU here please - II
(Message 13381)
Posted 10 Apr 2006 by Team TMR Post: WU 16856997 was aborted after 7 hours, when it was stuck on about 1.36%. I have 3 more that seem to be stuck near 1% after an hour, but I won't abort them until they pass 2 hours or so.
5)
Report stuck & aborted WU here please
(Message 13078)
Posted 5 Apr 2006 by Team TMR Post: I've got two more somewhere that are stuck on 1.04% after 2 hours and 1.5 hours.
6)
Report stuck & aborted WU here please
(Message 13038)
Posted 4 Apr 2006 by Team TMR Post: This WU seems to be stuck: 13107954. Over 3 hours in and it's on 1.19%. Job CPU time is set to 2 hours.
Edit: Now at 4 hours and 1.30%.
7)
Report stuck & aborted WU here please
(Message 12497)
Posted 22 Mar 2006 by Team TMR Post: Another one, aborted after 16 hours stuck on 1%:
http://boinc.bakerlab.org/rosetta/workunit.php?wuid=11717839
This one was a bit different - it was stuck on 30.19% after 8 hours. After restarting BOINC, it reset back to 38 mins CPU time and 30.19% and got stuck again:
http://boinc.bakerlab.org/rosetta/workunit.php?wuid=11743501
It's getting increasingly frustrating having to babysit this project all the time. Fingers crossed for those working on a fix.
8)
Report stuck & aborted WU here please
(Message 12332)
Posted 20 Mar 2006 by Team TMR Post: And another.
http://boinc.bakerlab.org/rosetta/workunit.php?wuid=11627337
This and the 4 I mentioned below were all stuck on 1%.
9)
Report stuck & aborted WU here please
(Message 12322)
Posted 20 Mar 2006 by Team TMR Post: Just aborted 4 more. Really hope this gets fixed soon, we've just wasted over 5 days of CPU time! Good luck Rom.
8.8 hours: http://boinc.bakerlab.org/rosetta/workunit.php?wuid=11584596
18.2 hours: http://boinc.bakerlab.org/rosetta/workunit.php?wuid=11551106
41.4 hours: http://boinc.bakerlab.org/rosetta/workunit.php?wuid=11460182
71.0 hours: http://boinc.bakerlab.org/rosetta/workunit.php?wuid=11330309
10)
Report stuck & aborted WU here please
(Message 12089)
Posted 16 Mar 2006 by Team TMR Post: Another one:
http://boinc.bakerlab.org/rosetta/workunit.php?wuid=11167215
Stuck on 1% after 12 hours.
11)
Report stuck & aborted WU here please
(Message 12009)
Posted 14 Mar 2006 by Team TMR Post: Had 3 today that have been stuck on 1% after anything between 3-16 hours (runtime set to 2 hours).
3.1 hours: http://boinc.bakerlab.org/rosetta/workunit.php?wuid=11019262
4.3 hours: http://boinc.bakerlab.org/rosetta/workunit.php?wuid=11008654
16.9 hours: http://boinc.bakerlab.org/rosetta/workunit.php?wuid=10949277
All were aborted.
12)
Report stuck & aborted WU here please
(Message 11557)
Posted 2 Mar 2006 by Team TMR Post: This one, WU 9696277, was stuck on 1% for 3 days! I've just aborted it. No wonder my daily points have taken a hit. Looking forward to getting the credit it...
13)
Report stuck & aborted WU here please
(Message 11002)
Posted 20 Feb 2006 by Team TMR Post: Well it finished eventually, at 8hr 39mins. But it never did get off 1% as far as I could see.
14)
Report stuck & aborted WU here please
(Message 10997)
Posted 20 Feb 2006 by Team TMR Post: Another one, 9442770. Over 8 hours in and still stuck on 1%. It's running rosetta 4.82 too, so I guess that didn't fix the 1% problem then. Max CPU setting is 2 hours.
15)
Report Maximum CPU Time Exceeded WU HERE
(Message 10575)
Posted 8 Feb 2006 by Team TMR Post: And now the 3rd has failed: Result 8433350. I hope we're going to get credit for these!
16)
Report Maximum CPU Time Exceeded WU HERE
(Message 10574)
Posted 8 Feb 2006 by Team TMR Post: I also have 3 other ABINITIO WUs in progress that have been running over 12 hours (2 are on 2+ GHz PCs) which might be heading the same way.
One of them now has: Result 9027571
17)
Report Maximum CPU Time Exceeded WU HERE
(Message 10569)
Posted 8 Feb 2006 by Team TMR Post: This one just timed out: WU 5610404, Result 9094342. I also have 3 other ABINITIO WUs in progress that have been running over 12 hours (2 are on 2+ GHz PCs) which might be heading the same way. If these timed-out WUs are of use, are you still giving credit for them?
18)
Shorter WU deadlines
(Message 10007)
Posted 27 Jan 2006 by Team TMR Post: For us, we've had to rejig the projects that some PCs work on, taking some PCs off Rosetta completely. We had Rosetta running on a few slow PCs, which take 1-2 days to complete a WU. When you factor in that those PCs are only on for less than 8 hours a day, it becomes 4-6 days to complete - and then a weekend arrives (PCs are off) and it takes over a week to complete. If the short deadline WUs were smaller so they completed quicker, it wouldn't be such a problem.
19)
huge WU
(Message 9799)
Posted 25 Jan 2006 by Team TMR Post: Thanks! I obviously didn't click through enough pages.
20)
huge WU
(Message 9797)
Posted 25 Jan 2006 by Team TMR Post: I know these ABINITIO WUs are big and take longer, but is anyone else finding they disappear when they're reported? We've just uploaded two of these WUs, each took 18 hours to complete, but both are missing from our list of results. I've checked back, and of the 5 ABINITIO WUs that we've completed this morning only 1 is present in our results list. Every other type of WU seems to appear in the results list immediately. I'm fairly sure we're getting credit (our credit increased by the expected amount the last time we reported one of these WUs), I'm just curious why they're missing from the results list.