Message boards : Number crunching : Problems and Technical Issues with Rosetta@home
Author | Message |
---|---|
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
Bryn Mawr - added the half_life, will sit back and see what happens. Or just let things be until they have a chance to settle down- with 8 active projects, even with the changed half life value, I'd expect you're looking at a couple of weeks. One week bare minimum. I had more than a lump and a bump before I tried dividing up the computer. Like now, WCG is really really down, close to dead, and now that I opened things back up it is still down, but the results I checked are pending. So there is hope. |
Bryn Mawr Joined: 26 Dec 18 Posts: 400 Credit: 12,294,748 RAC: 6,222 |
Bryn Mawr - added the half_life, will sit back and see what happens. Or just let things be until they have a chance to settle down- with 8 active projects, even with the changed half life value, I'd expect you're looking at a couple of weeks. One week bare minimum. That’s the project, not your machine. I’ve just had two days of low WCG credits and the shortfall turned up this morning - c’est la vie. |
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
Bryn Mawr - added the half_life, will sit back and see what happens. Or just let things be until they have a chance to settle down- with 8 active projects, even with the changed half life value, I'd expect you're looking at a couple of weeks. One week bare minimum. I gave it 200% and now it's climbing like a jet plane. Just have to get LHC back up after WCG and then I think everything can go back to 100%. |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1725 Credit: 18,408,362 RAC: 20,061 |
I gave it 200% and now it's climbing like a jet plane. Just have to get LHC back up after WCG and then I think everything can go back to 100%. And then it will drop again. So you'll change it, and it will rise again. So you'll change it and it will fall again. So you'll change it, and it will rise again. So you'll change it and it will fall again. etc, etc. Most (if not all) of that rapid increase is not a result of your changes but for the reason Bryn posted- the Project had a delay in granting Credit, now it's all coming through. Hence the surge in Credit. RAC rises slowly, and falls quickly. The half_life change Bryn suggested should allow things to settle down sooner rather than later, but with the number of projects you have we're still talking weeks- not days. And as you change things, then change them back again, then change them, then change them again, it just keeps extending the time it will take for things to settle to actually meet whatever Resource share you finally leave things at for an extended period (i.e. over a few weeks). Grant Darwin NT |
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
I gave it 200% and now it's climbing like a jet plane. Just have to get LHC back up after WCG and then I think everything can go back to 100%. And then it will drop again. Yeah, I know it drops. So I'm just ramming it through to get it up, and later, when I go back to work, I'll drop it. Half life was changed last week. |
Sid Celery Joined: 11 Feb 08 Posts: 2141 Credit: 41,538,222 RAC: 10,691 |
Project was down a little earlier, apparently to do a quick filesystem switch, but it got delayed and they didn't start it back up, so people would've seen "Server error: feeder not running" and "Project requested delay of 3600 seconds". Quickly fixed after a nudge. Looks fine now. You didn't imagine it. |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1725 Credit: 18,408,362 RAC: 20,061 |
Quite a backlog of Validations now. Given that there is no longer any work for minirosetta, they could probably shut down all of the minirosetta processes, and make use of the freed up resources for a few more Rosetta Assimilators and Validators. From the Server Status page: rah_assimilator_rosetta1 (rosetta), rah_assimilator_rosetta2 (rosetta), rah_assimilator_rosetta3 (rosetta), rah_assimilator_rosetta4 (rosetta), rah_assimilator_rosetta5 (rosetta), rah_assimilator_mini1 (minirosetta), rah_assimilator_mini2 (minirosetta), rah_assimilator_mini3 (minirosetta), rah_assimilator_mini4 (minirosetta), rah_assimilator_mini5 (minirosetta), rah_validator_rosetta1 (rosetta), rah_validator_rosetta2 (rosetta), rah_validator_mini1 (minirosetta), rah_validator_mini2 (minirosetta). Grant Darwin NT |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1725 Credit: 18,408,362 RAC: 20,061 |
Validation backlog appears to be growing- now over 104,000. The Server Status for the Validators might be showing green, but they don't appear to be actually doing anything at present. Grant Darwin NT |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1725 Credit: 18,408,362 RAC: 20,061 |
Validation backlog appears to be growing- now over 104,000. Now over 114,000. Yep- it's broken. Grant Darwin NT |
robertmiles Joined: 16 Jun 08 Posts: 1234 Credit: 14,338,560 RAC: 2,014 |
A task running MUCH longer than the expected 8 hours: aaab_nNMALA_pp-SAR_pp-mPPS-BGLY_pp_2_2245795_6_1 (https://boinc.bakerlab.org/rosetta/result.php?resultid=1441862159), application rosetta python 1.03 vbox64, 2 days, 8 hours, 32 minutes so far. This is elapsed time, not the much shorter CPU time. |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1725 Credit: 18,408,362 RAC: 20,061 |
Now over 138k. Validation backlog appears to be growing- now over 104,000. Now over 114,000. Grant Darwin NT |
Bryn Mawr Joined: 26 Dec 18 Posts: 400 Credit: 12,294,748 RAC: 6,222 |
Now over 138k. Validation backlog appears to be growing- now over 104,000. Now over 114,000. And now over 176k but some must be getting through. Yesterday I dropped to 3k credits for the day as everything was pending but today I have 11k :-) |
Sid Celery Joined: 11 Feb 08 Posts: 2141 Credit: 41,538,222 RAC: 10,691 |
Now over 138k. Validation backlog appears to be growing- now over 104,000. Now over 114,000. Now up to 237k backlog, but I don't have any pending dated 28th Oct so some are going through, just nowhere near enough to keep up, let alone catch up. I sent a message about 11hrs ago and got a reply about 8hrs ago that it'd be looked at when they got in, which I'm guessing would be ~6hrs ago. That it's not fully fixed yet indicates it's not as straightforward as the feeder issue a few days before. I've heard nothing more since. It's been reported and acknowledged. That's all I can say. PS: On top of being away from home from yesterday until Sunday week (apart from 1.5 days), my email provider has had a major outage which looks like it'll take 2-3 days to fix, making matters worse. I will be able to check in here for 6 of the 9 days I'm away and I am using a backup email account if anything new comes up - hopefully I won't have to. When it rains it pours... Edit: When I started typing, my credits were 300 less than what was showing here, so I did a manual update and my credits were then 400 more than are showing here. Lots from 29th October updated, but in quite a funny order. Maybe things are moving much more rapidly right now? Fingers crossed. |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1725 Credit: 18,408,362 RAC: 20,061 |
Just checked my Tasks and a few from the 29th have come through, but the number of Pendings is still almost triple the number of Valids. Hopefully the life signs will continue to improve as the day goes on. Grant Darwin NT |
Sid Celery Joined: 11 Feb 08 Posts: 2141 Credit: 41,538,222 RAC: 10,691 |
Just checked my Tasks and a few from the 29th have come through, but the number of Pendings is still almost triple the number of Valids. Yeah, another look and I'm not buying my idea either tbh. Updated to 243k backlog - higher still, not lower. A watched pot never boils - I'll look again tomorrow |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1725 Credit: 18,408,362 RAC: 20,061 |
Luckily the Rosetta graphs also show the Validation numbers. It looks like the Validators have been having issues for a while now. Generally they've been averaging a backlog of around 600 or so. But since Wednesday of last week, there have been periods where they've been falling behind, then catching up, with the amount they fall behind getting larger each time, until they came good for a couple of days from late Sunday. Then they started falling behind again, more and more each time, until the present huge backlog. Compare that to over the last year. Grant Darwin NT |
robertmiles Joined: 16 Jun 08 Posts: 1234 Credit: 14,338,560 RAC: 2,014 |
A task running MUCH longer than the expected 8 hours: Now aborted after 3 days and 20 hours elapsed, less than 10 minutes CPU time. The python tasks need a major improvement in how they detect tasks taking too long to run. Could the current validator be written in Python, and be having this same problem? |
robertmiles Joined: 16 Jun 08 Posts: 1234 Credit: 14,338,560 RAC: 2,014 |
Rosetta@Home has a problem with how you recover after losing your password. The line where it asks you to enter your email address will not allow you to enter anything unless you first click in the right half of the line, which makes the box appear where the email address has to go. |
Sid Celery Joined: 11 Feb 08 Posts: 2141 Credit: 41,538,222 RAC: 10,691 |
Just checked my Tasks and a few from the 29th have come through, but the number of Pendings is still almost triple the number of Valids. Not getting any better - in fact much worse. I've sent another nudge with a request for a timescale. Combined with my entire email provider being down for 3 consecutive days, this is not what I want to see... |
TSD Joined: 10 Oct 08 Posts: 7 Credit: 2,189,714 RAC: 0 |
As usual there is no information about what is happening. I don't know what I am doing here. I'm running Folding@Home now. |
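
Editor's note on the half-life discussion earlier in the thread: the point that a burst of delayed credit produces a surge that then fades, and that a shorter half-life makes an average settle sooner, can be illustrated with a rough sketch. This is not BOINC's actual code. It is a generic exponentially weighted average, and the function names, the 10,000 credits/day host, the two-day validation stall, and the 7-day and 1-day half-lives are all hypothetical values chosen for illustration.

```python
import math


def decay_weight(elapsed_days: float, half_life_days: float) -> float:
    """Fraction of an old average that survives after elapsed_days."""
    return math.exp(-elapsed_days * math.log(2) / half_life_days)


def update_average(avg: float, credit_today: float, half_life_days: float) -> float:
    """Blend one day's granted credit into a running per-day average."""
    w = decay_weight(1.0, half_life_days)
    return avg * w + (1.0 - w) * credit_today


def simulate(daily_credit, half_life_days: float, start: float = 0.0):
    """Return the running average after each day in daily_credit."""
    avg = start
    history = []
    for credit in daily_credit:
        avg = update_average(avg, credit, half_life_days)
        history.append(avg)
    return history


if __name__ == "__main__":
    # Hypothetical host: steady 10,000/day, then two days where nothing
    # validates, then the whole backlog (30,000) is granted on one day.
    pattern = [10_000] * 3 + [0, 0, 30_000] + [10_000] * 7
    for half_life in (7.0, 1.0):  # assumed half-lives, in days
        history = simulate(pattern, half_life, start=10_000.0)
        print(f"half-life {half_life:g} days:",
              " ".join(f"{value:,.0f}" for value in history))
```

In this toy model the work done never changes; only the day on which credit is granted moves. The average still dips and then overshoots, which matches the observation above that the surge came from the project catching up on credit and not from any change in settings. A shorter half-life swings harder but returns to the steady value in fewer days.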
©2024 University of Washington
https://www.bakerlab.org