1)
Message boards :
Number crunching :
New WUs failing
(Message 89702)
Posted 6 Oct 2018 by BelgianEnthousiast
Post: Is anyone else seeing faulty WUs on the Windows 10 platform?

Since April 1st I have crunched 366 WUs in total. Up until September 9th, 176 WUs completed without any errors. Since September 9th I have crunched 190 WUs and gradually racked up 33 failed WUs (as of today, Oct 6th).

I'm not sure how to find out which ones failed. Can anyone help? I'd like to dig a little deeper. Apart from that, any comments on why so many WUs are suddenly failing?

I'm running 2 cores for Rosetta and 5 cores for LHC (5-core WUs). I do not observe any issues on LHC, so I'm pretty sure it's not my rig that's having issues.

Many thanks for your advice!
BE.
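To put the counts from the post above in perspective, here is a minimal sketch of the failure-rate arithmetic. The numbers are the ones quoted in the post; the helper name `failure_rate` is made up for illustration and is not part of BOINC or Rosetta@home.

```python
# Counts quoted in the post above.
total_tasks = 366    # everything crunched since April 1st
clean_before = 176   # tasks up to September 9th, none failed
tasks_since = 190    # tasks crunched since September 9th
failed_since = 33    # failures among those 190 tasks


def failure_rate(failed: int, total: int) -> float:
    """Return the share of failed tasks as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * failed / total


# The two periods should add up to the overall total.
assert clean_before + tasks_since == total_tasks

rate_since = failure_rate(failed_since, tasks_since)
rate_overall = failure_rate(failed_since, total_tasks)

print(f"Failure rate since September 9th: {rate_since:.1f}%")
print(f"Failure rate since April 1st:     {rate_overall:.1f}%")
```

So roughly one task in six has failed since September 9th, compared with none in the 176 tasks before that date.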
2)
Problems and Technical Issues with Rosetta@home
(Message 79411)
Posted 13 Jan 2016 by BelgianEnthousiast
Post: Hi Timo, Steve,

Thanks for your quick reactions & advice!

To answer your questions:
1) I tried exiting "gracefully" from BOINC Mgr, hoping that it would pick things up and wrap up the WUs, but unfortunately to no avail. I also tried resetting the project, and it keeps giving these mixed results.
2) I do not hibernate my laptop or put it in (deep) sleep mode. I always shut it down fully. I tried that multiple times too, but again without much success...

To add to my initial post, I have another trio hanging at the moment:
rb_01_10_61977_106329_T000__1C1_SAVE_ALL_OUT etc. (running for 12h14, no remaining estimate and only at 27.492 %)
shrtNTF2_2_UM_1_N16E92S12_noCH_NTF2_bb-610__1_0001 etc. (running for 9h44, no remaining estimate, at 89.556 % but no longer moving)
and another nkid_1_2_2016 etc. (running for 9h45, no remaining estimate and only at 11.186 %)

All the while I have a tj_2016A_insert_X_DHR53_DHR18 etc. running just fine, showing 2h19, 3h47 remaining and at 34.973 %.

Nice evening!
B.E.
3)
Problems and Technical Issues with Rosetta@home
(Message 79406)
Posted 13 Jan 2016 by BelgianEnthousiast
Post: Hi All,

I've been running Rosetta for a while and am now encountering serious issues with near-endless or endless loops. The normal running time is 6 hours per task, and half of the WUs seem to adhere to that. The other half, however, shows some weird behaviour:

1. Running forever without any estimated time left, going on for 20+ hours. Examples:
nkid_1_3_2016_final3_0716_00058_0043.pdb343_TG_dez_fold_SAVE_ALL_OUT_322141_663_0
nkid_1_3_2016_final3_0692_00366_0042.pdb342_TG_dez_fold_SAVE_ALL_OUT_322134_678_0
2. Running forever, but with an estimated time left that keeps creeping up. I don't have examples here; I aborted them after 25+ hours of running.

This happens on a laptop. On my desktop it seems to work well, although I have other issues there with the scheduling of Rosetta.

Could you please investigate?

Many thanks in advance!
Kind Regards,
B.E.
4)
Downloading far too many WU's - too short period to complete them
(Message 79001)
Posted 28 Oct 2015 by BelgianEnthousiast
Post: Hi D & Mod.Sense,

Thanks for the quick response! I'll wait it out and see how it behaves.

K.
5)
Downloading far too many WU's - too short period to complete them
(Message 78996)
Posted 27 Oct 2015 by BelgianEnthousiast
Post: Hi,

When syncing Rosetta, the project automatically downloads a huge number of WUs at once:
- I now have 226 WUs waiting (3 hours each; they used to be 6 hours each).
- I'm currently processing 10 WUs on my system.
- The validity period is only 10 or 11 days.

I'm also participating in LHC, Atlas and Climateprediction, which means that I am forced to process Rosetta WUs continuously to meet the deadlines and not lose out on any jobs, effectively halting all other projects.

I have fiddled around with setting "Maintain enough work for an additional (Enforced by version 5.10+)" to 1 day, but Rosetta just seems to ignore this and keeps downloading this many jobs.

Is there any way I can prevent this from happening? I'm OK with downloading quite a few jobs and would prefer a buffer of, say, 3 to 4 days to ensure sufficient work even if a server goes down for a couple of days. But this many jobs is simply choking my system and hindering other projects.

Many thanks for looking into this!
Kind Regards,
K.
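As a rough check on the numbers in the post above, the backlog can be estimated with a few lines of arithmetic. This is only a sketch under stated assumptions: the machine is taken to run 24 hours a day, the 10 WUs being processed are taken to mean 10 concurrent slots, and the 25 % share (four projects splitting the machine equally) is a hypothetical figure, not something stated in the post. The helper `days_to_clear` is made up for illustration, and real BOINC scheduling is more involved than this.

```python
def days_to_clear(queued: int, hours_per_task: float, slots: int,
                  share: float = 1.0, hours_per_day: float = 24.0) -> float:
    """Estimate the days needed to finish a queue of equally sized tasks.

    share is the fraction of the machine given to this project
    (an assumption for illustration, not a BOINC setting name).
    """
    if slots <= 0 or not 0.0 < share <= 1.0 or hours_per_day <= 0:
        raise ValueError("slots, share and hours_per_day must be positive")
    cpu_hours = queued * hours_per_task
    return cpu_hours / (slots * share * hours_per_day)


# Figures from the post: 226 queued WUs, 3 hours each, 10 running at once.
rosetta_only = days_to_clear(226, 3, 10)               # Rosetta gets everything
equal_share = days_to_clear(226, 3, 10, share=0.25)    # hypothetical 4-way split

print(f"Rosetta only:      {rosetta_only:.1f} days")
print(f"25% share (guess): {equal_share:.1f} days")
```

Under these assumptions the queue clears in just under 3 days if Rosetta gets the whole machine, but takes longer than the 10-day validity period once Rosetta is limited to a quarter of it, which matches the complaint that the other projects end up being pushed aside.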
©2023 University of Washington
https://www.bakerlab.org