no work units


ldzppln
Joined: 4 Jan 10
Posts: 1
Credit: 1,632,695
RAC: 0
Message 67352 - Posted: 26 Aug 2010, 11:59:04 UTC

My work units all disappeared at some point yesterday, and I haven't received any new ones. I've reset the project and that didn't help. What's going on?
ID: 67352 · Rating: 0
Evan
Joined: 23 Dec 05
Posts: 268
Credit: 402,585
RAC: 0
Message 67354 - Posted: 26 Aug 2010, 12:11:45 UTC

There has been a problem at source and work units have been hard to get. However, it looks like the situation is gradually getting resolved as I have just received another supply of work units.
ID: 67354 · Rating: 0
Jochen
Joined: 6 Jun 06
Posts: 133
Credit: 3,847,433
RAC: 0
Message 67355 - Posted: 26 Aug 2010, 12:47:27 UTC

Since my computers are currently running 24/7 anyway, I've increased the default running time to 24 hours.
I think this is the best way to deal with the lack of available WUs.
ID: 67355 · Rating: 0
Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 67357 - Posted: 26 Aug 2010, 14:01:24 UTC

Just be aware that when you change the target runtime like that, it affects all of the tasks that you have on board at the time you update to the project and your machine gets that setting. So you want to be sure you don't have a big cache full of tasks; otherwise you will end up with too much work to finish before the deadlines. It just depends on how frequently you are at the machine to switch things back, and how much you value not babysitting the machine compared to having some idle CPU time.
Rosetta Moderator: Mod.Sense
ID: 67357 · Rating: 0
HW&JC
Joined: 2 May 08
Posts: 20
Credit: 7,477,164
RAC: 2,485
Message 67362 - Posted: 26 Aug 2010, 16:42:54 UTC - in response to Message 67357.  

Just be aware that when you change the target runtime like that, it affects all of the tasks that you have on board at the time you update to the project and your machine gets that setting. So you want to be sure you don't have a big cache full of tasks; otherwise you will end up with too much work to finish before the deadlines...

What are the chances of that at the moment?! Tending rapidly to zero right now.
ID: 67362 · Rating: 0
Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 67364 - Posted: 26 Aug 2010, 17:30:03 UTC

I just didn't want others to read that post later and not be aware of how it works.
Rosetta Moderator: Mod.Sense
ID: 67364 · Rating: 0
HW&JC
Joined: 2 May 08
Posts: 20
Credit: 7,477,164
RAC: 2,485
Message 67381 - Posted: 26 Aug 2010, 23:21:45 UTC

Failed humour on my part. Sorry.

I'm running another project as back-up but I much prefer Rosetta. It looks like there are many tasks but the make_work server is the problem. Someone give it a kick. Quick.
ID: 67381 · Rating: 0
Jochen
Joined: 6 Jun 06
Posts: 133
Credit: 3,847,433
RAC: 0
Message 67387 - Posted: 27 Aug 2010, 10:20:21 UTC

Well, actually ModSense's advice wasn't even useless at this point. On one of my computers I got 80 (eighty!!!!) new WUs last night. I didn't expect this at all.

Since the BOINC manager hadn't yet registered that I had changed the running time from 6 to 24 hours, I now have a 10-day cache... Considering that some of the ProteinInterfaceDesign tasks will be long-running models, I might already have a problem...

I will keep an eye on this and in the worst case I will revert to 12 or 6 hours running time.

With not many WUs available, I felt rather safe increasing the cache size and default running time at the same time.
I know this is not a good idea if there's enough work available; I usually set the cache size to 0 and let the cache empty before increasing the running time. And of course the BOINC manager needs to register the change in running time before I increase the cache again.

My other two computers hardly get enough work to keep all cores busy...
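
For readers following the arithmetic in this post, here is a rough sketch of why raising the target runtime inflates an existing cache. It assumes the work is spread evenly over the cores and that every task runs for its full target runtime. The 8-core count and the function name `cache_days` are illustrative assumptions, chosen only so that 80 tasks at 24 hours come out at the 10 days mentioned above; BOINC's own estimates are more involved.

```python
def cache_days(tasks: int, runtime_hours: float, cores: int) -> float:
    """Wall-clock days needed to finish `tasks` tasks, each using
    `runtime_hours` of CPU time, spread evenly over `cores` cores."""
    return tasks * runtime_hours / (cores * 24)


# Hypothetical host: the core count is an assumption, not stated in the thread.
TASKS = 80  # number of WUs mentioned in the post
CORES = 8

for runtime in (6, 12, 24):
    days = cache_days(TASKS, runtime, CORES)
    print(f"{runtime:>2} h target runtime -> {days:.1f} days of work")
```

Under these assumptions the same 80 tasks amount to about 5 days of work at 12 hours, or 2.5 days at 6 hours, once the client has picked up the lower preference.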


ID: 67387 · Rating: 0
Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 67391 - Posted: 27 Aug 2010, 16:22:56 UTC - in response to Message 67387.  

Well, actually ModSense's advice wasn't even useless at this point.


TRY not to sound so surprised! :)

Yes, the good news is you can just watch how it is doing for a few days and, if needed, lower the runtime and update to the project to pick up the changed preference; the remaining tasks will then run with the lower setting.
Rosetta Moderator: Mod.Sense
ID: 67391 · Rating: 0
deesy58
Joined: 20 Apr 10
Posts: 75
Credit: 193,831
RAC: 0
Message 67393 - Posted: 27 Aug 2010, 17:47:58 UTC

I have received no new work units in the past 36 hours or so. I have rebooted my machine, and I even shut it down last night. Still no work units, even though the "Server Status" page indicates that all servers are up and running. So... what's wrong?

deesy
ID: 67393 · Rating: 0
Jochen
Joined: 6 Jun 06
Posts: 133
Credit: 3,847,433
RAC: 0
Message 67396 - Posted: 27 Aug 2010, 19:10:13 UTC - in response to Message 67391.  

TRY not to sound so surprised! :)


This was primarily directed to HW&JC.

As I said, I WAS totally surprised to get 80 WUs on that computer. The chances of this happening are probably as low as winning a once-in-a-century lottery jackpot. ;)

Actually, I'm just sitting here, trying to figure out whether I should fill in a lottery ticket. This week's jackpot is as high as 11,000,000 euros. I just don't know whether I'm on a run of luck, or whether I just wasted my lifetime's luck on getting Rosetta WUs. ;)

cu

Joe
ID: 67396 · Rating: 0
Jochen
Joined: 6 Jun 06
Posts: 133
Credit: 3,847,433
RAC: 0
Message 67397 - Posted: 27 Aug 2010, 19:21:05 UTC - in response to Message 67393.  

I have received no new work units in the past 36 hours or so. I have rebooted my machine, and I even shut it down last night. Still no work units, even though the "Server Status" page indicates that all servers are up and running. So... what's wrong?

deesy


There's nothing wrong on your side, and absolutely nothing you can do. Rosetta is only generating a limited number of WUs right now.

You could try one of those old pagan sacrificial offerings I've read about in one of the other posts... ;)

cu Joe
ID: 67397 · Rating: 0
deesy58
Joined: 20 Apr 10
Posts: 75
Credit: 193,831
RAC: 0
Message 67401 - Posted: 27 Aug 2010, 21:32:51 UTC

Do I understand this correctly? The problems are so big that only grid computing can solve them, but they are not big enough to keep the grid busy?

deesy
ID: 67401 · Rating: 0
Evan
Joined: 23 Dec 05
Posts: 268
Credit: 402,585
RAC: 0
Message 67402 - Posted: 27 Aug 2010, 21:50:06 UTC - in response to Message 67401.  

Do I understand this correctly? The problems are so big that only grid computing can solve them, but they are not big enough to keep the grid busy?

deesy

No, there is a problem with the servers. Look on the server status page and you will see they have taken nearly all of them offline while they make a proper fix.
ID: 67402 · Rating: 0
Murasaki
Joined: 20 Apr 06
Posts: 303
Credit: 511,418
RAC: 0
Message 67403 - Posted: 27 Aug 2010, 21:50:13 UTC - in response to Message 67401.  

Do I understand this correctly? The problems are so big that only grid computing can solve them, but they are not big enough to keep the grid busy?

deesy


Nope. More like "the problems are so big that only grid computing can solve them, but the servers are put under pressure 24 hours a day, every day of the year, so something is bound to break down every now and then".

There are years of work still to do on this project; we just have to wait until the current difficulties are resolved.
ID: 67403 · Rating: 0
deesy58
Joined: 20 Apr 10
Posts: 75
Credit: 193,831
RAC: 0
Message 67406 - Posted: 27 Aug 2010, 22:27:48 UTC

As I pointed out earlier, I checked the "Server Status" page, and it indicates that all servers are running. Have you checked that page?

If the servers are running, what next?

deesy
ID: 67406 · Rating: 0
Murasaki
Joined: 20 Apr 06
Posts: 303
Credit: 511,418
RAC: 0
Message 67407 - Posted: 27 Aug 2010, 22:34:26 UTC - in response to Message 67406.  

As I pointed out earlier, I checked the "Server Status" page, and it indicates that all servers are running. Have you checked that page?

If the servers are running, what next?

deesy


The servers are going up and down. There is obviously a problem that the team are trying to fix. Problems like this occur once every six to twelve months but are just temporary glitches that last a few days or a couple of weeks at most.

If you are worried about your computer sitting idle, feel free to crunch some work units for another project for a while.
ID: 67407 · Rating: 0
Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 67408 - Posted: 27 Aug 2010, 22:36:29 UTC - in response to Message 67406.  


If the servers are running, what next?



...wait patiently.

Rosetta Moderator: Mod.Sense
ID: 67408 · Rating: 0
Chris Holvenstot
Joined: 2 May 10
Posts: 220
Credit: 9,106,918
RAC: 0
Message 67410 - Posted: 28 Aug 2010, 0:21:38 UTC
Last modified: 28 Aug 2010, 0:22:31 UTC

You know, I think we all understand that the servers at Rosetta@home have been having some issues the past few days, but I think we all need to stop for a moment and reflect on what a "soft" outage this has been.

There have been times these past few days when I have not been able to keep all my cores "well fed", but I have never been flat out of work either.

I am not exactly sure what the nature of the problems is (I am curious), but as a systems guy for the last thirty years I am amazed at how the staff has minimized the issues. They have been able to keep a fair amount of support going throughout the week, when I am sure it would have been much easier to shut down to a cold-iron state and work the issues "offline".

I for one am impressed and don't mind saying so.
ID: 67410 · Rating: 0
Michael Gould
Joined: 3 Feb 10
Posts: 39
Credit: 14,508,627
RAC: 5,889
Message 67411 - Posted: 28 Aug 2010, 1:57:22 UTC

Yes, I agree with Chris; it is absolutely amazing how smoothly Rosetta runs considering the volume of users and work units. It seems that on the Rosetta home page, the "Server status" box almost always says "scheduler running," but when the "Estimated TeraFlops" figure dips significantly from its usual 100 or so, that indicates that the servers aren't running smoothly.

Time to crunch some malaria control WUs!
ID: 67411 · Rating: 0



©2024 University of Washington
https://www.bakerlab.org