Over aggressive work fetch?

Message boards : Number crunching : Over aggressive work fetch?

Divide Overflow

Joined: 17 Sep 05
Posts: 82
Credit: 921,382
RAC: 0
Message 66542 - Posted: 10 Jun 2010, 18:59:24 UTC

I’ve observed that my hosts have very aggressive work fetch tendencies here at Rosetta. Is the run time preference setting taken into consideration for host work scheduling?

My 8 core system has a cache setting for half a day of work, has a pretty stable result duration correction factor, and is almost always on and working. Based on the suggestions for CASP 9, my run time preference has been set to 12 hours. (No account managers running. Please save them for another discussion.)

With no other work on the machine, it contacts the project and requests work. Based on my settings, I’d expect it to be allocated no more than 16 tasks: 8 to work on now and 8 more as a cache of half a day’s worth of work to spare. Instead, I’m given 86 tasks before things settle down and it’s decided that I’ve had enough for now.
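For reference, the arithmetic behind the 16-task expectation can be sketched as follows. This is a simplified model written for illustration, not BOINC's actual work-fetch algorithm; the function name `expected_tasks` and its parameters are hypothetical.

```python
import math


def expected_tasks(cores: int, cache_days: float, task_hours: float) -> int:
    """Simplified model (an assumption, not BOINC's scheduler):
    one running task per core, plus enough tasks to fill the cache."""
    running = cores
    # Core-hours of work the cache setting asks for across all cores.
    buffer_core_hours = cores * cache_days * 24
    # Tasks needed to cover the buffer at the given run time per task.
    buffered = math.ceil(buffer_core_hours / task_hours)
    return running + buffered


# 8 cores, half-day cache, 12-hour run time preference.
print(expected_tasks(cores=8, cache_days=0.5, task_hours=12))
```

Under this model the 86 tasks actually received are a little over five times the expected count.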

There seem to be some logic checks missing in work allocation! Why am I given over five times the work that I should be?

With tighter turnaround times needed for CASP 9 work, I'd expect that this kind of bloat would be unwelcome and submit to you that the process is not working as it should.

Does anyone else experience something similar?
Sid Celery

Joined: 11 Feb 08
Posts: 1982
Credit: 38,463,172
RAC: 15,101
Message 66546 - Posted: 11 Jun 2010, 0:09:07 UTC

That sounds strange, but if everything is so settled, how come you received no tasks for nearly 6 days and returned 18 on the 4th, 9 on the 5th, 0 on the 6th, 0 on the 7th, 7 on the 8th and 2 on the 9th, with the last 8 tasks running between 4 & 8 hours? How long is the completion time of your unstarted tasks?

I don't mean to contradict you, but it doesn't sound like your DCF would be very stable to me.

Your Rosetta boincstats page
Divide Overflow

Joined: 17 Sep 05
Posts: 82
Credit: 921,382
RAC: 0
Message 66547 - Posted: 11 Jun 2010, 2:41:42 UTC
Last modified: 11 Jun 2010, 3:01:52 UTC

I believe I said that my result duration correction factor was stable, not "everything". The completion estimates for all of my Rosetta tasks are within a minute or so of my run time preference. That makes it all the more puzzling that preferences are ignored and a glut of work is handed out.

I'm not questioning why I didn't receive work in a steady manner. Tasks at Ralph explain that. I'm wondering why, when I did ask for work here at Rosetta, I got quite so much! :)

EDIT: I'm catching BOINC Manager asking for (and getting) even more work! I guess cache settings are ignored completely in deference to satisfying some other figure I have no control over or visibility into. Run time preferences are certainly not taken into account for work scheduling, at least not in a sensible way.
Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 66548 - Posted: 11 Jun 2010, 4:06:47 UTC

If you recently modified your runtime preference from the default 3 hours to the 12hr setting you now have, BOINC Manager takes a while to understand that. Is it possible it got all of those tasks before you saw that the time estimates are in line with your 12hr preference? Normally, based on your comment that the estimate is in line, I would say the fetch should be... well, much closer to your original expectation. But if it pulled all the work, then completed tasks and revised the DCF, it could explain what you are seeing.
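To illustrate the direction of that effect, here is a rough sketch of how a shorter per-task estimate inflates the number of tasks needed to fill the same cache. It is a toy model under stated assumptions (one running task per core plus a buffer sized by the estimate), not BOINC's real work-fetch logic, and both function names are hypothetical.

```python
import math


def tasks_fetched(cores: int, cache_days: float, estimated_hours: float) -> int:
    """Toy model (assumption): one task per core now, plus enough
    tasks to fill the cache at the client's current estimate."""
    buffer_core_hours = cores * cache_days * 24
    return cores + math.ceil(buffer_core_hours / estimated_hours)


def implied_estimate_hours(cores: int, cache_days: float, tasks: int) -> float:
    """Per-task estimate that would yield `tasks` under the same toy model."""
    buffer_core_hours = cores * cache_days * 24
    return buffer_core_hours / (tasks - cores)


# Estimates still based on the default 3-hour run time vs. the 12-hour setting.
print(tasks_fetched(cores=8, cache_days=0.5, estimated_hours=3))
print(tasks_fetched(cores=8, cache_days=0.5, estimated_hours=12))

# Estimate that the 86 tasks reported in this thread would imply.
print(round(implied_estimate_hours(cores=8, cache_days=0.5, tasks=86), 2))
```

In this model a stale 3-hour estimate alone does not account for 86 tasks; the estimate in effect at fetch time would have had to be shorter still, which is consistent with the suggestion that it was later revised.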

Bottom line is Rosetta doesn't define the work fetch. BOINC does, and unfortunately, it has been having a variety of issues in that regard the past year or so.

I suspect your machine will settle in once this backlog of tasks is cleared.
Rosetta Moderator: Mod.Sense
Bikermatt

Joined: 12 Feb 10
Posts: 20
Credit: 10,552,445
RAC: 0
Message 66567 - Posted: 13 Jun 2010, 14:03:06 UTC
Last modified: 13 Jun 2010, 14:04:39 UTC

This might not do anything, but here is what I would try if it does not start behaving properly in a few days.

- Select "No new tasks" on BOINC and let everything you have right now complete. You may have to manually update for the tasks to report.

- After everything is gone try resetting and detaching from the project. I've read in other forums that that can knock some sense into BOINC.

- If all else fails you could try a different (possibly older?) version of BOINC.

Good luck!

Matt



