Message boards : Number crunching : Problems and Technical Issues with Rosetta@home
Author | Message |
---|---|
Sid Celery Joined: 11 Feb 08 Posts: 2140 Credit: 41,518,559 RAC: 10,612 |
Sorry to be a bit late on this, but I did notice around 13th March I had a task consuming 2.4Gb and 14Gb of my 16Gb (total) RAM being in use to run 8 tasks. So you've got your extra RAM installed already? If it was a RAM issue (with 8Gb) you'll be fine now. I was only indicating there were some rogue tasks around last week that may have tripped you up back then. Hopefully new tasks play nicer as standard. Your original question was to ask if there was anything you could do - there probably wasn't at that time and you've more than covered yourself now under normal conditions. |
Sid Celery Joined: 11 Feb 08 Posts: 2140 Credit: 41,518,559 RAC: 10,612 |
As of this AM I have 16gb on the ryzen and it's currently showing 81% free memory but that's with no Rosetta as no WUs have come down since early yesterday. Up to 10 minutes ago it was still showing those 8086 so that doesn't sound right. However, I'm here to say a whole load of tasks just came down and the server status page has just changed to show an additional 20k Rosetta tasks in progress and 15k still unsent. No idea how long that will last, but there is some progress. |
rjs5 Joined: 22 Nov 10 Posts: 273 Credit: 23,229,863 RAC: 6,747 |
I was watching when a couple of the Rosetta WU failed. They computed properly until the TIME REMAINING reached zero seconds, with a compute time of 8 hours and a few minutes. Instead of reporting completion, the WU was marked as WAITING with zero seconds remaining. When the WU restarted, it reported a COMPUTE ERROR with the message "finish file present too long". The 34 failing WUs all seemed to fail at the end, and all were 4.08 Linux WUs. https://boinc.bakerlab.org/rosetta/result.php?resultid=1063704662 |
Sid Celery Joined: 11 Feb 08 Posts: 2140 Credit: 41,518,559 RAC: 10,612 |
We had a good run, but no tasks left to download (and that mysterious 8086 ready to send again, whatever that is) |
Bryn Mawr Joined: 26 Dec 18 Posts: 398 Credit: 12,294,748 RAC: 6,222 |
"I was watching when a couple of the Rosetta WU failed." That sounds very similar to mine. I did notice that a few of mine showed n decoys and then appeared to restart and showed a session with 1 decoy before failing. |
bcavnaugh Joined: 7 Dec 13 Posts: 7 Credit: 2,389,640 RAC: 0 |
Not getting any Tasks on this Host https://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=3112116 A T630 Server but my other T630 is getting them fine https://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=3282035 Both running Server 2012 R2 |
bcavnaugh Joined: 7 Dec 13 Posts: 7 Credit: 2,389,640 RAC: 0 |
Not getting any Tasks on this Host https://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=3112116 Looks OK now https://boinc.bakerlab.org/rosetta/results.php?hostid=3112116 |
Bryn Mawr Joined: 26 Dec 18 Posts: 398 Credit: 12,294,748 RAC: 6,222 |
Not getting any Tasks on this Host https://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=3112116 I suspect that was the last splutterings as the pool was draining, project status is showing 0 tasks unsent (but, as has been said, 8086 tasks ready to send). |
Sid Celery Joined: 11 Feb 08 Posts: 2140 Credit: 41,518,559 RAC: 10,612 |
Not getting any Tasks on this Host https://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=3112116 Maybe they're tasks for pre-80386 machines? ... |
Bryn Mawr Joined: 26 Dec 18 Posts: 398 Credit: 12,294,748 RAC: 6,222 |
Despite having a 6 hour limit set I am currently processing a batch of Rosetta 4.08 WUs that have been running for 8 hours and are showing an estimated 2 hours remaining. They all have names starting :- rb_03_21_2022_2162_ab_t000__robetta_cstwt_5.0_FT Is this normal or are they likely to error out? |
Admin Project administrator Joined: 1 Jul 05 Posts: 5144 Credit: 0 RAC: 0 |
This seems odd but I would continue to let it run since it is a relatively large protein to model. |
Link Joined: 4 May 07 Posts: 356 Credit: 382,349 RAC: 0 |
"This seems odd but I would continue to let it run since it is a relatively large protein to model." Besides that, the limit is CPU-hours, so depending on what else the CPU has to do, the runtime can be a lot longer. |
Bryn Mawr Joined: 26 Dec 18 Posts: 398 Credit: 12,294,748 RAC: 6,222 |
"This seems odd but I would continue to let it run since it is a relatively large protein to model." After 10 hours (elapsed and CPU), 2 of them (1064201222 and 1064201281) errored out with the same symptoms I’ve been seeing. Interestingly, the 4 that succeeded (1064201216, 1064201223, 1064201224 and 1064201283) also had the "default.out.gz exist, stream information inconsistent" error, so that is also a red herring. |
Sid Celery Joined: 11 Feb 08 Posts: 2140 Credit: 41,518,559 RAC: 10,612 |
Validation seems to be offline for the last half hour |
Sid Celery Joined: 11 Feb 08 Posts: 2140 Credit: 41,518,559 RAC: 10,612 |
"Validation seems to be offline for the last half hour" And back about 30 mins ago, I think |
Bryn Mawr Joined: 26 Dec 18 Posts: 398 Credit: 12,294,748 RAC: 6,222 |
"Validation seems to be offline for the last half hour" And off again since 04:00 this morning |
Trevor ct Joined: 7 Oct 14 Posts: 2 Credit: 23,386,023 RAC: 0 |
Rosetta 4.07 work tasks are reporting 'computational error' immediately they are opened on one of my two computers. Only known difference is affected computer BOINC version is 7.14.2 and the non-affected is earlier version 7.6.33. I cannot discover how to revert version as a trial. Trevor ct |
Jim1348 Joined: 19 Jan 06 Posts: 881 Credit: 52,257,545 RAC: 0 |
"Rosetta 4.07 work tasks are reporting 'computational error' immediately they are opened on one of my two computers. Only known difference is affected computer BOINC version is 7.14.2 and the non-affected is earlier version 7.6.33." It is not the BOINC version. I see the errors too (one on each of two Ubuntu machines), and they are both running BOINC 7.14.2. |
LarryMajor Joined: 1 Apr 16 Posts: 22 Credit: 31,533,212 RAC: 0 |
Yeah, I got a bunch of these on different machines, and they all fail when they are resent to someone else. It's the work units, not your computer. |
Trevor ct Joined: 7 Oct 14 Posts: 2 Credit: 23,386,023 RAC: 0 |
Thank you. Comforting I am not running a rogue program. |
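
Link's point in the thread, that the runtime limit counts CPU-hours rather than elapsed hours, can be illustrated with a small sketch. This is not BOINC or Rosetta code; `measure` and its parameters are hypothetical names made up for the illustration. It simply shows that when a process spends part of its time not running on a core (simulated here with a sleep), elapsed time grows while CPU time does not.

```python
import time


def measure(busy_loops: int, idle_seconds: float) -> tuple[float, float]:
    """Return (cpu_seconds, elapsed_seconds) for some work plus idle time.

    Hypothetical helper for illustration only; not part of BOINC or Rosetta.
    """
    cpu_start = time.process_time()   # CPU time consumed by this process
    wall_start = time.perf_counter()  # elapsed (wall-clock) time

    total = 0
    for i in range(busy_loops):       # stands in for real computation
        total += i * i

    # Stands in for time during which the task is not running on a core,
    # for example because the CPU is busy with something else.
    time.sleep(idle_seconds)

    cpu = time.process_time() - cpu_start
    elapsed = time.perf_counter() - wall_start
    return cpu, elapsed


if __name__ == "__main__":
    cpu, elapsed = measure(busy_loops=200_000, idle_seconds=0.5)
    print(f"CPU time: {cpu:.2f}s, elapsed: {elapsed:.2f}s")
```

A task limited to a fixed amount of CPU time can therefore show a noticeably longer elapsed time on a busy machine. Note that this does not explain Bryn Mawr's case, where both elapsed and CPU time had reached 10 hours.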
©2024 University of Washington
https://www.bakerlab.org