Message boards : Number crunching : Problems and Technical Issues with Rosetta@home
Author | Message |
---|---|
Sid Celery Joined: 11 Feb 08 Posts: 2141 Credit: 41,526,036 RAC: 10,392 |
"But, as you say, there appears to be nothing about it which I can actively solve, so I must just 'put up with it', unless I become a programmer on the Rosetta team, and that ain't about to happen anytime soon!" Yup, that's what I'd do just to tidy up the waiting tasks. If there's only a few percentage points of runtime left, it ought to go through quite quickly. BOINC (not Rosetta) has always had this kind of scheduling problem. It's actually got better with the latest public release, but it looks like there are still issues. Very annoying if you run more than one project. |
Sid Celery Joined: 11 Feb 08 Posts: 2141 Credit: 41,526,036 RAC: 10,392 |
The tasks that came down a few days ago (Monday) after the outage all seem to have run very short: 1-3 hours instead of my preferred 8-hour setting. I guess that's why we ran out again so quickly. Also, the validator seems to be way behind, as well as giving out some errors. See my results. Note: the requested runtime is 28800 seconds (8 hours), but 3,000-14,000 seconds is a typical outcome. Also, it would be worth the Rosetta guys stocking up well on tasks ahead of the Christmas period and rebooting the server in the last week, so that the holiday period goes as smoothly as possible. EVERY year there's a problem, so anything that can be done to pre-empt issues would be appreciated and would cut down on the regular whining that comes as a result. |
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
"The tasks that came down a few days ago (Monday) after the outage all seem to have run very short. 1-3 hours instead of my preferred 8 hour setting. I guess that's why we ran out again so quickly." My response to your Christmas idea is "HAH": won't happen. Never has, never will. But we can always hope. |
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
After a review of my tasks I found one that came up with "Maximum elapsed time exceeded". However, there was 0 CPU time and no debugger report. The name of the task was: ab_11_19__opt_T6041_opt_cst_pred_wt_14_03_09_35570_28_0 Result page: https://boinc.bakerlab.org/rosetta/result.php?resultid=465870741 Funny thing is, my wingman completed the task with no trouble on a 64-bit Vista machine. He has a slightly newer quad core than me. |
brilor Joined: 31 Mar 08 Posts: 9 Credit: 124,013 RAC: 0 |
My results show 7 completed tasks with "granted credit" equal to "pending" since 29-Nov-2011. The server status page shows all programs running, but I noticed both a "validator_mini" and a "validator_beta". Presumably this is a technical issue with the server(s), but any hints to the contrary would be welcome. |
Sid Celery Joined: 11 Feb 08 Posts: 2141 Credit: 41,526,036 RAC: 10,392 |
"My response to your Christmas idea is 'HAH' won't happen." Thank you for your contribution. The short runtimes seem to have been solved with the jobs that were sent from 2 Dec onwards. Thanks. |
Alan J Rodger Joined: 16 Oct 05 Posts: 7 Credit: 32,282 RAC: 0 |
Two more work units continuing to run with elapsed time and time to go increasing simultaneously. This has been going on too long - it doesn't seem as if Rosetta can manage their system - I quit! There are many more systems that run without problems! |
dcdc Joined: 3 Nov 05 Posts: 1832 Credit: 119,860,059 RAC: 7,494 |
"Two more work units continuing to run with elapsed time and time to go increasing simultaneously. This has been going on too long - it doesn't seem as if Rosetta can manage their system - I quit! There are many more systems that run without problems!" Alan, that isn't strictly a bug in Rosetta. It always happens with this project because BOINC struggles to calculate how much time is left, due to the way Rosetta packs multiple models into a task to meet the user's run-time preference. Time to complete increases steadily until a checkpoint, when it is recalculated and drops before beginning to climb again. Danny |
Bob Stryk Joined: 16 Jan 06 Posts: 1 Credit: 484,245 RAC: 0 |
Since re-installing Windows XP, most Rosetta@home work units end with a Computation error. The notation in the Events Log is of the form "Output file ... for task ... absent". Is this a problem with the work unit, or is something not quite right with the computer? |
Leland Kornhaus Joined: 16 Jul 06 Posts: 2 Credit: 1,793,841 RAC: 0 |
"Two more work units continuing to run with elapsed time and time to go increasing simultaneously. This has been going on too long - it doesn't seem as if Rosetta can manage their system - I quit! There are many more systems that run without problems!" I've got the same issue. First, I had a lot of 3-hour tasks that would take 7+ hours of processor time. Here's a nice example that was granted less than 1/6th of the claimed credit: https://boinc.bakerlab.org/rosetta/result.php?resultid=465189829 ... but at least it was granted credit! Since December 1st I believe I have 179 work units, with about 20,600 in claimed credits, that are still pending. If the work units aren’t even acknowledged, are they being used? I’d hate for this much electricity and processing to be wasted on random work that isn’t valued. |
Jesse Viviano Joined: 14 Jan 10 Posts: 42 Credit: 2,700,472 RAC: 0 |
"Two more work units continuing to run with elapsed time and time to go increasing simultaneously. This has been going on too long - it doesn't seem as if Rosetta can manage their system - I quit! There are many more systems that run without problems!" Normally the servers are able to keep up with the validation. However, there was a crash this week so there is a big post-crash backlog to handle. I do not know if the crash was a server crash or a networking hardware failure, but the results are the same either way. This project apparently has very little margin left and probably needs to upgrade its servers. |
Leland Kornhaus Joined: 16 Jul 06 Posts: 2 Credit: 1,793,841 RAC: 0 |
Thank you for the explanation. "Normally the servers are able to keep up with the validation. However, there was a crash this week so there is a big post-crash backlog to handle. I do not know if the crash was a server crash or a networking hardware failure, but the results are the same either way. This project apparently has very little margin left and probably needs to upgrade its servers." |
Michael Kingsford Gray Joined: 28 Nov 11 Posts: 3 Credit: 4,593,564 RAC: 0 |
Thanks for all of the very helpful suggestions. I may have an explanation for at least a part of the recent drought of jobs: my new multiprocessor workstations, which I am progressively enabling, might be gobbling them up. (More to come, as well!) Rosetta appears to have partially resolved my scheduling issue, at least to the extent that one of the 99%+ jobs has decided to complete at last! But its deadline was another 4 days away anyway (as are those of all the stalled jobs), so perhaps I am worrying about nothing? Certainly "GPUGrid" is far better behaved than Rosetta. Things can only get better. I assume that the Rosetta programmers are either volunteers, or academics of some sort? It may well be that I am demanding commercial performance from those who are effectively unpaid amateurs? Philosophy is Bunk - Richard P. Feynman |
P . P . L . Joined: 20 Aug 06 Posts: 581 Credit: 4,865,274 RAC: 0 |
Hi. Any chance of someone having a look at the validator? There are tasks that have been sitting waiting for over a week now. Thanks. |
luisr Joined: 11 Sep 06 Posts: 3 Credit: 1,039,929 RAC: 0 |
Same here. There are ~4200 credits claimed and awaiting validation from 2 December until today... 3 complete days of computing... |
[PUGLIA] kidkidkid3 Joined: 14 Sep 10 Posts: 11 Credit: 2,348,063 RAC: 0 |
Same problem for all of us ... since the first of December the validator hasn't worked correctly ... we are waiting for an administrator's help ... please tell us what's happening. Thanks in advance. I'm an old Italian programmer (do you know cards?). Now I recycle and repair my friends' old PCs, and they revive for research. A long trip begins with a little step ... |
Terianne929 Joined: 5 Oct 11 Posts: 1 Credit: 11,531 RAC: 0 |
I have 3 jobs that are not validating; all three names have similar beginnings: Name: T0569... Task ID: 466779186, WU ID: 425851216, completion time given as 03:02:48 but only ran 00:50:33. Name: T0540... Task ID: 467186228, WU ID: 426233861. Name: T0541... Task ID: 467186269, WU ID: 426233928. Also, I have had several jobs that ADDED time to the completion time (as much as 15 minutes) while elapsed time was still increasing. |
dcdc Joined: 3 Nov 05 Posts: 1832 Credit: 119,860,059 RAC: 7,494 |
"Also, have had several jobs that ADDED time to the completion time (as much as 15 minutes) while still elapsing time" Hi Terianne, that's normal for Rosetta. BOINC can't calculate the time remaining very accurately, so the estimate tends to increase slowly, then drop more sharply, and then repeat. |
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
Nobody from the team really reads these posts anymore. I sent a message to someone I had a conversation with once before, asking them to relay this issue to someone on the team. "Same problem for all of us ... from the first of december" |
Cutchet Salvador Joined: 1 Feb 10 Posts: 17 Credit: 10,690,439 RAC: 0 |
Greg, I am sorry to have to say it, but it is true. I sent a personal mail to the Manager and he has not answered me either. The R@H team really does ignore its volunteer contributors. There is only news from Mr. Baker and nothing more. It is painful and sad. Meanwhile, people stop working with R@H and the team stays unconcerned. Between the credits validated at 0 and the pending ones, we are being made to look like fools. Greetings and patience. |
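
dcdc's explanation above (time to go climbs until a checkpoint, then drops) can be illustrated with a toy model. This is only a sketch under stated assumptions, not BOINC's or Rosetta's actual estimator: it assumes progress is reported only when a whole model finishes, and that remaining time is extrapolated as `elapsed / fraction_done - elapsed`. The function name and all parameters are hypothetical, and the model exaggerates how steeply the estimate climbs.

```python
def remaining_estimates(model_seconds, target_seconds, step_seconds):
    """Return (elapsed, estimated_remaining) samples for a toy task.

    Assumption (hypothetical, not BOINC's real algorithm): fraction done
    only advances when a whole model completes, and remaining time is
    extrapolated linearly from elapsed time and fraction done.
    """
    samples = []
    elapsed = step_seconds
    while elapsed <= target_seconds:
        # Progress is frozen between model completions ("checkpoints").
        models_done = elapsed // model_seconds
        fraction_done = models_done * model_seconds / target_seconds
        if fraction_done > 0:
            # Same fraction, more elapsed time: the estimate grows.
            estimate = elapsed / fraction_done - elapsed
            samples.append((elapsed, estimate))
        elapsed += step_seconds
    return samples


# One-hour models packed into an 8-hour (28800 s) target, sampled every 10 minutes.
samples = remaining_estimates(model_seconds=3600, target_seconds=28800, step_seconds=600)
for elapsed, estimate in samples[:8]:
    print(f"elapsed {elapsed:>6} s  ->  estimated remaining {estimate:>8.0f} s")
```

In this toy run the estimate rises from 25200 s at the first completed model to 46200 s just before the second one finishes, then falls to 21600 s once progress is recalculated, which is the sawtooth pattern described in the thread.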
©2024 University of Washington
https://www.bakerlab.org