Posts by BelgianEnthousiast

1) Message boards : Number crunching : New WUs failing (Message 89702)
Posted 6 Oct 2018 by BelgianEnthousiast
Post:
Is anyone else seeing failing WUs on Windows 10?
Since April 1st I have crunched 366 WUs in total. Up until September 9th, 176 WUs completed without any errors.
Since September 9th I have crunched 190 WUs, gradually racking up 33 failed WUs (as of today, Oct 6th).
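
As a rough check on these numbers, the failure rate can be worked out as below. This is only a sketch: it assumes the 33 failed WUs are counted among the 190 crunched since September 9th, and the variable names are purely illustrative.

```python
# Counts taken from the post above; the names are illustrative only.
total_wus = 366      # crunched since April 1st
wus_before = 176     # up to September 9th, all without errors
wus_after = 190      # since September 9th
failed_after = 33    # failures so far (as of Oct 6th)

# Sanity check: the two periods add up to the stated total.
assert wus_before + wus_after == total_wus

# Assumption: the 33 failures are included in the 190 WUs.
failure_rate_after = failed_after / wus_after
failure_rate_overall = failed_after / total_wus

print(f"Failure rate since September 9th: {failure_rate_after:.1%}")
print(f"Failure rate since April 1st: {failure_rate_overall:.1%}")
```

Under that assumption, roughly 17% of the WUs since September 9th failed, or about 9% of everything crunched since April 1st.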

I'm not sure how to find out which ones failed. Can anyone help? I'd like to dig a little deeper.

Apart from that, any ideas as to why so many WUs are suddenly failing?

I'm running 2 cores for Rosetta and 5 cores for LHC (5-core WUs). I do not see any issues on LHC, so I'm pretty sure
it's not my rig that's at fault.

Many thanks for your advice!

BE.
2) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 79411)
Posted 13 Jan 2016 by BelgianEnthousiast
Post:
Hi Timo, Steve,

Thanks for your quick replies and advice!

To answer your questions:
1) I tried exiting "gracefully" from BOINC Manager, hoping that it would pick this up and wrap the WUs up,
but unfortunately to no avail.
I also tried resetting the project, but it keeps giving these mixed results.

2) I do not hibernate my laptop or put it in (deep) sleep mode. I always shut it down fully.
I tried that multiple times too, but again without much success...

To add to my initial post, I have another trio hanging at the moment:
rb_01_10_61977_106329_T000__1C1_SAVE_ALL_OUT etc. (running for 12h14, no remaining estimate and only at 27.492 %)
shrtNTF2_2_UM_1_N16E92S12_noCH_NTF2_bb-610__1_0001 etc. (running for 9h44, no remaining estimate, at 89.556 % but no longer progressing)
and another nkid_1_2_2016 etc. (running for 9h45, no remaining estimate and only at 11.186 %)

All the while, I have a tj_2016A_insert_X_DHR53_DHR18 etc. running just fine, showing 2h19 elapsed, 3h47 remaining, and at 34.973 %.

Have a nice evening!

B.E.
3) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 79406)
Posted 13 Jan 2016 by BelgianEnthousiast
Post:
Hi All,

I've been running Rosetta for a while and am now running into serious issues with near-endless or endless loops.
The normal run time is 6 hours per task, and about half of the WUs stick to that. The other half, however, show some weird behaviour:
1. Running forever without any estimated time left, going on for 20+ hours.
For example: nkid_1_3_2016_final3_0716_00058_0043.pdb343_TG_dez_fold_SAVE_ALL_OUT_322141_663_0
nkid_1_3_2016_final3_0692_00366_0042.pdb342_TG_dez_fold_SAVE_ALL_OUT_322134_678_0

2. Running forever, but with an estimated time left that keeps creeping up.
I don't have examples here; I aborted them after 25+ hours of running.

This happens on a laptop. On my desktop it seems to work well, although I have other issues there with the scheduling of Rosetta.

Could you please investigate?

Many thanks in advance!

Kind Regards,

B.E.
4) Message boards : Number crunching : Downloading far too many WU's - too short period to complete them (Message 79001)
Posted 28 Oct 2015 by BelgianEnthousiast
Post:
Hi D & Mod.Sense,

Thanks for the quick response!
I'll wait it out and see how it behaves.

K.
5) Message boards : Number crunching : Downloading far too many WU's - too short period to complete them (Message 78996)
Posted 27 Oct 2015 by BelgianEnthousiast
Post:
Hi,
When syncing Rosetta, the project automatically downloads a huge number of WUs at once:
- I now have 226 WUs waiting (3 hours each; they used to be 6 hours each)
- I'm currently processing 10 WUs on my system.
- The validity period is only 10 or 11 days.
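
To put those figures in perspective, here is a back-of-the-envelope estimate of how long the queue would take to clear. It is a sketch only: it assumes all 10 running slots stay on Rosetta around the clock, that every waiting WU takes exactly its nominal run time, and it ignores the 10 WUs already in progress; the variable names are illustrative.

```python
# Figures from the list above; the names are illustrative only.
queued_wus = 226        # WUs waiting
concurrent_wus = 10     # WUs processed at the same time

def days_to_clear(hours_per_wu: float) -> float:
    """Wall-clock days needed to finish the queue at full-time crunching."""
    core_hours = queued_wus * hours_per_wu
    wall_clock_hours = core_hours / concurrent_wus
    return wall_clock_hours / 24

days_at_3h = days_to_clear(3.0)   # current run time per WU
days_at_6h = days_to_clear(6.0)   # former run time per WU

print(f"Queue at 3 h per WU: {days_at_3h:.1f} days of full-time crunching")
print(f"Queue at 6 h per WU: {days_at_6h:.1f} days of full-time crunching")
```

Under these assumptions the queue amounts to 678 core-hours at 3 hours per WU, so any time given to other projects stretches the wall-clock figure accordingly.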

I'm also participating in LHC, Atlas and Climateprediction,
which means that I am forced to process Rosetta WUs continuously to meet the deadlines and not
lose out on any jobs, effectively halting all other projects.

I have fiddled around with setting "Maintain enough work for an additional (Enforced by version 5.10+)" to 1 day, but Rosetta just seems to ignore this and keeps downloading this many jobs.

Is there any way I can prevent this from happening?
I'm OK with downloading quite a few jobs and would prefer to have a buffer of, say, 3 to 4 days to ensure sufficient work even if a server goes down for a couple of days.

But this many jobs is simply choking my system & hindering other projects.

Many thanks for looking into this!

Kind Regards,

K.

©2024 University of Washington
https://www.bakerlab.org