Message boards : Number crunching : Problems with Rosetta version 5.80
Author | Message |
---|---|
Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
I moved Markus' post here from Q&A boards. Sorry the post is so long. Looks like one of the CAPRI WUs may have caused the reported msgs. See Rhiju's post below/above. There have been some problems with these tasks on some machines and so they've stopped sending them out. Markus, could you post links to the two specific hosts (and the specific tasks if possible) where you had problems? Rosetta Moderator: Mod.Sense |
Feet1st Send message Joined: 30 Dec 05 Posts: 1755 Credit: 4,690,520 RAC: 0 |
Tell me there's some mistake here in the native structure shown (linky to screenshot with a rather straight native structure shown for a t015_1_NMRREF_1_t015_1_id_model_07_idlIGNORE_THE_REST_core_2097_3599_0 task). Add this signature to your EMail: Running Microsoft's "System Idle Process" will never help cure cancer, AIDS nor Alzheimer's. But running Rosetta@home just might! https://boinc.bakerlab.org/rosetta/ |
Conan Send message Joined: 11 Oct 05 Posts: 150 Credit: 4,244,078 RAC: 3,376 |
Result ID 106047623 Well Michael at least you got 20, mine ran for over 28,000 sec and got 5.38. |
teemac Send message Joined: 18 Jul 06 Posts: 1 Credit: 192,962 RAC: 0 |
I have 5 machines on Rosetta at present: 3x Intel E4300 Core2Duos - Kubuntu v7.04 (64bit) 1gb ram - these machines have been locking one core and sometimes both cores over the last day or so. I have aborted all WU's with the word CAPRI in them. I also currently have nearly all work units with 'IGNORE THE REST' in the unit's name also locking and freezing cores or completely locking machines with an error message saying something like 'if this keeps happening you may need to reset the project'. 1x AMD X2/4600 - Kubuntu v7.04 (64bit) 1gb ram - this machine is mostly ok - no locking but some errored WU's. 1x AMD 3200+ - Kubuntu v7.04 (32bit) 512mb ram - same as the 4600 machine above. I currently have 2 of the E4300's locked - no work ticking over for the last hour or so - one of the machines is totally locked and I am unable to use the OS at all - the other machine only has BOINC locked up, but I can use the OS. |
hugothehermit Send message Joined: 26 Sep 05 Posts: 238 Credit: 314,893 RAC: 0 |
I noticed that the ...CAPRI14_DOCK... native looked wrong; they were too far apart to be interacting at all, compared with what I have seen before. |
Wits End Send message Joined: 16 Apr 07 Posts: 4 Credit: 29,477 RAC: 0 |
Of the eight post-CAPRI WUs that I've returned, two produced "validate errors". I received credit for the other six but they all had "watchdog shutting down" notes, and one had "WARNING! Not sure non-ideal rotamers are compatible with symmetry yet..." What's going on?!? 107006854: Validate error 106890130: Watchdog notice 106794699: Watchdog and Warning notices 106724332: Validate error 106613376: Watchdog notice 106550676: Watchdog notice 106521483: Watchdog notice 106514350: Watchdog notice |
anders n Send message Joined: 19 Sep 05 Posts: 403 Credit: 537,991 RAC: 0 |
Of the eight post-CAPRI WUs that I've returned, two produced "validate errors". I received credit for the other six See this post about no more Capri for now. |
Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
RE: Watchdog notice Normal. Been that way since the watchdog was implemented. Since the watchdog runs in a separate thread, this message just confirms that the watchdog thread properly ended as the task was completed. So it is just saying everything ended normally, including the watchdog. Rosetta Moderator: Mod.Sense |
mdettweiler Send message Joined: 15 Oct 06 Posts: 33 Credit: 2,509 RAC: 0 |
RE: Watchdog notice When the watchdog has to end a task, is it of any use at all to the project scientifically, or is it practically aborted? I think I heard that the watchdog will abort a task if it runs a given number of times longer than your preferred runtime, regardless of whether the application is showing visible progress; is this true? If so, are those terminated results useful at all? |
Greg_BE Send message Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
RE: Watchdog notice oh...good question..has me wondering the same thing. |
M.L. Send message Joined: 21 Nov 06 Posts: 182 Credit: 180,462 RAC: 0 |
Result ID 106760639 Name NeT6__BOINC_SYMM_FOLD_AND_DOCK_RELAX-NeT6_-mfr__2100_7176_0 Workunit 96937615 Validate state Valid Claimed credit 91.5795040403045 Granted credit 48.2594968464685 application version 5.80 Never seen such a big difference between claimed and granted credits, unless the WU failed in some way, but I don't see any sign of that. Anyone got any ideas? |
Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
When the watchdog has to end a task, is it of any use at all to the project scientifically, or is it practically aborted? Results are always useful. I exchanged some EMails with Chu some time ago and collected some details on the watchdog. I'll compile them into the FAQ and post them shortly. Even knowing that a given approach does not function as expected is important. This is why Rosetta considers all results useful and meaningful, and attempts to issue credit to participants for their assistance in making such a determination. Rosetta Moderator: Mod.Sense |
Markus Schuhmacher Send message Joined: 29 May 06 Posts: 4 Credit: 1,455,542 RAC: 0 |
I moved Markus' post here from Q&A boards. Sorry the post is so long. Sorry, I've been wondering where my post had gone. The two machines are https://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=603857 https://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=509233 How can I figure out which workunit was in progress at the time? |
Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
|
Greg_BE Send message Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
serious problems with this WU from 5.80 non capri |
Jmarks Send message Joined: 16 Jul 07 Posts: 132 Credit: 98,025 RAC: 0 |
|
Andrii Muliar Send message Joined: 10 Nov 05 Posts: 12 Credit: 7,655,243 RAC: 0 |
I forgot to say: I have a Core Duo processor, an ADSL connection, and Windows XP SP2 as the operating system. |
svincent Send message Joined: 30 Dec 05 Posts: 219 Credit: 12,120,035 RAC: 0 |
beat__BOINC_JUMPRELAX_BARCODE2_CONSTRAINT-beat_-_1951_67075_0 ( workunit 98293407 ) stuck on 0% for >1 hour on an Intel iMac2 under OS X 10.4.10 with Boinc 5.10.20. Aborting. |
Paul Send message Joined: 29 Oct 05 Posts: 193 Credit: 66,470,630 RAC: 9,649 |
I noticed that all of my work units are now based on the 5.80 Beta again. The last few days, it looked like most of them were an older version of the application. This morning, I have 5 out of 6 units with Compute Error status. The computer ID is 43057 Can someone please look into this situation? It is very frustrating to have so much CPU time wasted. I just refocused 100% of this computer on R@H because it looked like the problems were fixed. If we are going back to compute errors, it would appear to be a better use of resources to focus this CPU on other projects until R@H is fixed. What is the problem with all the failed WUs and 5.8? Thx! Paul |
Nothing But Idle Time Send message Joined: 28 Sep 05 Posts: 209 Credit: 139,545 RAC: 0 |
beat__BOINC_JUMPRELAX_BARCODE2_CONSTRAINT-beat_-_1951_61847_0 WU 108207857 v.5.80 Ran 21% over my specified run time preference; never saw this before. |
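
Mod.Sense's "RE: Watchdog notice" reply in this thread says the watchdog runs in a separate thread and that the "watchdog shutting down" note only confirms the thread ended when the task completed. The following is a minimal sketch of that pattern, not Rosetta's actual code: the function name `run_task_with_watchdog`, the parameters `poll_interval` and `max_seconds`, and every log string other than "watchdog shutting down" are assumptions made for illustration. In this sketch the watchdog only records a timeout; it does not end the task.

```python
import threading
import time


def run_task_with_watchdog(work, poll_interval=0.05, max_seconds=5.0):
    """Run work() while a watchdog thread monitors elapsed time.

    Hypothetical illustration only. Returns the log messages in order.
    """
    log = []
    done = threading.Event()  # set by the main thread once work() returns
    start = time.monotonic()

    def watchdog():
        # Wake up every poll_interval seconds until the task reports completion.
        while not done.wait(poll_interval):
            if time.monotonic() - start > max_seconds:
                log.append("watchdog: time limit exceeded")
                break
        # Written on every exit path, including the normal one, which is
        # why the note shows up on tasks that finished without problems.
        log.append("watchdog shutting down")

    thread = threading.Thread(target=watchdog)
    thread.start()
    work()            # the actual computation
    done.set()        # tell the watchdog the task is complete
    thread.join()     # wait for the watchdog thread to end cleanly
    log.append("task finished")
    return log
```

Calling `run_task_with_watchdog(lambda: time.sleep(0.1))` yields a log in which the shutdown note appears even though nothing went wrong, which matches the explanation that the message alone does not indicate an error.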