Posts by Divide Overflow

21) Message boards : Number crunching : Pending Credit (Message 11274)
Posted 23 Feb 2006 by Divide Overflow
Post:
20 Feb 2006 17:47:51 UTC 23 Feb 2006 14:55:48 UTC Over Success Done 28,746.61 69.39(claimed) 5.75(granted)

Not strictly a pending question but I've never had credit other than claimed before except 0 when a client has errored.


That's because you were not the first host to be sent that WU. The first host to be given that unit returned a (late) result the day after you were given it. Since they returned their result first, the credit granted was the amount they claimed.

http://boinc.bakerlab.org/rosetta/workunit.php?wuid=8930476
22) Message boards : Number crunching : Won't connect to localhost (Message 11254)
Posted 23 Feb 2006 by Divide Overflow
Post:
23) Message boards : Number crunching : WU not downloading (Message 10736)
Posted 13 Feb 2006 by Divide Overflow
Post:
There must be some significant network problems going on. Rosetta keeps dropping off the radar of BOINC projects that are up, the database is hit or miss, and uploads, downloads, and scheduler updates are all having problems.
24) Message boards : Number crunching : Report Maximum CPU Time Exceeded WU HERE (Message 10400)
Posted 3 Feb 2006 by Divide Overflow
Post:
25) Message boards : Number crunching : Database server unhappy - Jan 29 11:50 UTC (Message 10213)
Posted 30 Jan 2006 by Divide Overflow
Post:
Something is going on and the validator might also be affected. For the first time ever, I've seen one of my returned results in a "pending" status.

EDIT: Currently the project is reporting as being down and the server status page is not available.
26) Message boards : Number crunching : @ Dave Baker (Deadlines causes EDF) (Message 10097)
Posted 28 Jan 2006 by Divide Overflow
Post:
Dave
Any chance of the deadlines going to 2 weeks like most other projects?

This LINK goes to the Shorter WU deadline thread about how easily some users were put into EDF with the one week deadlines.

Why? You seem to be overly concerned with the behavior of your short term queue. You might crunch some Rosetta for a while on EDF, but eventually the long term debt for the other projects you have active will prevent you from getting more Rosetta work and your focus will be on your other projects. The scheduler was designed to work with these variables to keep you as close as possible to your resource allocation over the long term.

Each project should keep the settings that best meet their own demands. BOINC should be flexible enough to roll with the punches.
27) Message boards : Number crunching : What's with this? (Message 9510)
Posted 21 Jan 2006 by Divide Overflow
Post:
Wow! I wish I had that much computing power available...

It certainly looks like this guy is having the most problems:
http://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=53940

I see a lot of your computers have a download error for their bad WU's. Are you having connectivity problems? Did you change any proxy addressing recently?

To answer your specific question, what's going on? When a host returns a result with errors, its daily quota is reduced. This is done to prevent unattended, misbehaving systems from draining the total number of WU's available to everybody. As soon as a system that was having temporary problems starts returning good results again, the quota available to it doubles with each good result it returns.
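For anyone who wants to see how quickly a host climbs back, here is a rough sketch of the quota behavior described above. It is not the project's actual server code. The class name `HostQuota`, the ceiling of 200 (borrowed from the "reached daily quota of 200 results" thread title), and the choice to lower the quota by one per errored result are all assumptions made for illustration; the only behavior taken from the post is "errors reduce the quota, each good result doubles it".

```python
from dataclasses import dataclass


@dataclass
class HostQuota:
    """Hypothetical model of a per-host daily quota (not BOINC's real code)."""

    max_quota: int = 200  # assumed project-wide ceiling
    current: int = 200    # quota the host currently has

    def record_error(self) -> None:
        # Assumption: each errored result lowers the quota by one, never below 1.
        self.current = max(1, self.current - 1)

    def record_valid(self) -> None:
        # From the post: each good result doubles the quota (capped at the ceiling).
        self.current = min(self.max_quota, self.current * 2)


def valid_results_to_recover(quota: HostQuota) -> int:
    """Count how many good results are needed to get back to the ceiling."""
    count = 0
    while quota.current < quota.max_quota:
        quota.record_valid()
        count += 1
    return count


if __name__ == "__main__":
    host = HostQuota()
    for _ in range(198):  # a long run of bad results
        host.record_error()
    print("quota after errors:", host.current)
    print("good results needed:", valid_results_to_recover(host))
```

Under these assumptions, a host that has fallen to a quota of 2 is back at the ceiling after only a handful of good results, which is why the doubling rule recovers so quickly once the underlying problem is fixed.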

I can't tell what is causing you to return bad results at this point, but the download errors are a good clue to start from.
28) Message boards : Number crunching : What's with this? (Message 9508)
Posted 21 Jan 2006 by Divide Overflow
Post:
Each valid result returned from that host will double your daily quota, so as soon as you get your problems sorted out, you should be back up to full quota again in next to no time. Good luck!
29) Message boards : Number crunching : What's with this? (Message 9505)
Posted 21 Jan 2006 by Divide Overflow
Post:
You have your computers hidden, so I can't be sure. It sounds like you returned a *LOT* of bad results from that machine, dropping your daily quota for that host down to just 2. The project won't let you have any more work until you return a valid result. Each valid result returned from that host will double your daily quota.

What is going on with that host?!?
30) Message boards : Number crunching : Report stuck & aborted WU here please (Message 9321)
Posted 19 Jan 2006 by Divide Overflow
Post:
Yet another ABINITIO that exceeded the maximum CPU time...

1/18/2006 3:16:11 PM|rosetta@home|Aborting result PRODUCTION_ABINITIO_1fkb__250_452_0: exceeded CPU time limit 32474.092756
1/18/2006 3:16:11 PM|rosetta@home|Unrecoverable error for result PRODUCTION_ABINITIO_1fkb__250_452_0 (Maximum CPU time exceeded)
31) Message boards : Number crunching : WU Started then switch to new computer no results posted (Message 9240)
Posted 18 Jan 2006 by Divide Overflow
Post:

Although out of the ordinary, should this have worked ( downloading to one, processing on another, and uploading by either )?

No.

You should have just let it crunch on the original machine. (If you didn't want any Rosetta work for that host, the command that you were looking for was the suspend button from the projects tab.) Don't worry about that WU any longer. It will eventually expire at its deadline and be sent out to another host. BOINC is not set up for transferring assigned and downloaded WU's from one host to another.

The new WU's appear to be checkpointing more often, which means smaller % complete increases and lots more of those "endpoints".
32) Message boards : Number crunching : Report stuck & aborted WU here please (Message 9238)
Posted 18 Jan 2006 by Divide Overflow
Post:
I just noticed two WU's that ran for just over 9 hours before aborting with the Maximum CPU time exceeded:

1/17/2006 12:20:28 PM|rosetta@home|Aborting result PRODUCTION_ABINITIO_2chf__250_242_0: exceeded CPU time limit 32474.092756
1/17/2006 12:20:28 PM|rosetta@home|Unrecoverable error for result PRODUCTION_ABINITIO_2chf__250_242_0 (Maximum CPU time exceeded)

1/17/2006 3:58:41 PM|rosetta@home|Aborting result PRODUCTION_ABINITIO_2vik__250_261_0: exceeded CPU time limit 32474.092756
1/17/2006 3:58:41 PM|rosetta@home|Unrecoverable error for result PRODUCTION_ABINITIO_2vik__250_261_0 (Maximum CPU time exceeded)

Are there more bad batches of WU's out there again?
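
As a quick sanity check on the "just over 9 hours" figure, the limit in the log lines can be converted to hours. This is only a sketch: the helper name `cpu_limit_hours` is made up for illustration, and it assumes the number after "exceeded CPU time limit" is in seconds, which is consistent with the run times seen here.

```python
import re

# One of the log lines quoted above.
LINE = (
    "1/17/2006 12:20:28 PM|rosetta@home|Aborting result "
    "PRODUCTION_ABINITIO_2chf__250_242_0: exceeded CPU time limit 32474.092756"
)


def cpu_limit_hours(line: str) -> float:
    """Extract the CPU time limit from a log line and convert it to hours.

    Assumption: the logged limit is expressed in seconds.
    """
    match = re.search(r"exceeded CPU time limit ([0-9.]+)", line)
    if match is None:
        raise ValueError("no CPU time limit found in line")
    return float(match.group(1)) / 3600.0


if __name__ == "__main__":
    print(f"{cpu_limit_hours(LINE):.2f} hours")
```

All three aborted results report the same limit value, so they were stopped at the same CPU time.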
33) Message boards : Rosetta@home Science : Top Results page ? (Message 9064)
Posted 15 Jan 2006 by Divide Overflow
Post:
Sorry for the delays. We have been really busy setting up a queueing and data management system. Once we have the system set up, we should have time to update the top predictions and results.


Any estimate on when there will be another update on predictions and some analysis on the latest results? There's been quite an increase to the diversity of WU's since the last update and I'm sure many of us are eager to hear how things are going!
34) Message boards : Number crunching : Upload errors (error 500???) (Message 9062)
Posted 15 Jan 2006 by Divide Overflow
Post:
I'm definitely seeing some network congestion and routing errors today. Perhaps it's the wet weather on the west coast, perhaps it's still issues with the campus network. In either case, be prepared for a struggle sending, receiving, and reporting work for a while.
35) Message boards : Number crunching : Is this real? (Message 8993)
Posted 14 Jan 2006 by Divide Overflow
Post:
Lucid, when first connecting to the project, there are several support files that are downloaded. What you're describing sounds about right, so I wouldn't be concerned. From here on out, you'll only typically need to be transferring the smaller WU's and result files.
36) Message boards : Number crunching : Maximum CPU time Exceeded...How about some granted credit! (Message 8820)
Posted 12 Jan 2006 by Divide Overflow
Post:
David has just finished awarding credits to recently returned jobs, and will have gone through all of the archived jobs within the next two days.

Another example of why I respect the management of this project so much. Thanks for the follow through!
37) Message boards : Rosetta@home Science : Screensaver needs to be more fluid please (Message 8465)
Posted 6 Jan 2006 by Divide Overflow
Post:
38) Message boards : Number crunching : Internet traffic and necessary data (Message 8298)
Posted 3 Jan 2006 by Divide Overflow
Post:
Each BOINC project will have different requirements, including network traffic volume. If Rosetta is asking for too much bandwidth for you, please consider one of the other projects out there.
39) Message boards : Number crunching : (reached daily quota of 200 results) (Message 8140)
Posted 1 Jan 2006 by Divide Overflow
Post:

I still think it's gremlins.

Best explanation I've heard so far. (Ok your other stuff was pretty good too...)
;)
40) Message boards : Number crunching : (reached daily quota of 200 results) (Message 8090)
Posted 1 Jan 2006 by Divide Overflow
Post:
And, I don't have ANY cache on this and several other machines which I have since shut down due to lack of viable work.

Sounds like you're the victim of quite a lot of ghost WU's. (The project thinks it's sent you work, but it never makes it across to your host.) I noticed that at least one of your machines had a large number of download errors. Are you having some network problems on your end that could be interfering with getting new work? If you leave the machines up, they should be able to make another attempt within 24 hours. Hopefully they will actually make it across to you this time!




