Message boards : Number crunching : Problems and Technical Issues with Rosetta@home
Author | Message |
---|---|
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1636 Credit: 16,775,951 RAC: 9,341 |
The boinc-process host is down again, so no Validation for work being returned at this time. Grant Darwin NT |
Sid Celery Send message Joined: 11 Feb 08 Posts: 2082 Credit: 40,621,050 RAC: 4,056 |
The boinc-process host is down again, so no Validation for work being returned at this time. Sometimes the server page doesn't report accurately, so when I see some parts of boinc-process are running (some assimilators) I'm not sure what to think. Rosetta_beta and Rosetta_python validators were showing as running for a while, even when other parts weren't, but have now switched to not running again. Whatever's really happening, it all comes across as very flaky <sigh> |
TPCBF Send message Joined: 29 Nov 10 Posts: 111 Credit: 4,991,034 RAC: 516 |
Well, no new tasks during the day, nothing validated, and all assimilators/validators are still not running... :( |
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1636 Credit: 16,775,951 RAC: 9,341 |
The boinc-process host is back up again, although we now have an error message on the main page in the Server Status section: Notice: Undefined variable: stats in /projects/boinc/rosetta/html/user/index.php on line 81 Grant Darwin NT |
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1636 Credit: 16,775,951 RAC: 9,341 |
Another 600k or so Tasks just released. Hopefully things will stay up for a while. Grant Darwin NT |
Sid Celery Send message Joined: 11 Feb 08 Posts: 2082 Credit: 40,621,050 RAC: 4,056 |
Another 600k or so Tasks just released. I arrived at my PC that crashed every task it grabbed from the last batch, like yours did, last night and saw boinc-process was back an hour or two before you posted. It'd been back for some while already, going by how much the validation backlog had reduced. Now that tasks are available, let's see if it handles this new batch any better. I'm currently on another PC that crashed last Monday and missed the last batch altogether, but is rushing through its last few WCG tasks that are right up against their deadline, so I won't find out how this one goes until I get home tonight. Fingers crossed on them all. |
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1636 Credit: 16,775,951 RAC: 9,341 |
Server Status is showing all green, but a backlog is developing with the Assimilators. Grant Darwin NT |
Sid Celery Send message Joined: 11 Feb 08 Posts: 2082 Credit: 40,621,050 RAC: 4,056 |
Another 600k or so Tasks just released. Both running fine and running all tasks to completion. Not sure what the previous blip was about |
Bill F Send message Joined: 29 Jan 08 Posts: 44 Credit: 1,482,505 RAC: 535 |
Trying to get attention that the Stat's export for RALPH has not been updated in over 42 days and that the Posting on the RALPH Message Board is getting no attention. Respectfully Bill F In October 1969 I took an oath to support and defend the Constitution of the United States against all enemies, foreign and domestic; There was no expiration date. |
[VENETO] boboviz Send message Joined: 1 Dec 05 Posts: 1980 Credit: 9,197,551 RAC: 1,983 |
Trying to get attention that the Stat's export for RALPH has not been updated in over 42 days and that the Posting on the RALPH Message Board is getting no attention. After years on Ralph, i think it's a lost cause... I write, sometimes, in their forums, but i have not a lot of hope (not that the Rosetta forums are much better) |
Sid Celery Send message Joined: 11 Feb 08 Posts: 2082 Credit: 40,621,050 RAC: 4,056 |
Somehow grabbed 16 rb tasks this morning to fill my cache Just checked how many tasks were issued and it appears to be next to none - seems I was just lucky with the odd few Then I saw boinc-process is down again - not so lucky after all <sigh> |
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1636 Credit: 16,775,951 RAC: 9,341 |
A glitch is back on the main page: Notice: Undefined variable: stats in /projects/boinc/rosetta/html/user/index.php on line 81 Just under the Server Status heading. Grant Darwin NT |
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1636 Credit: 16,775,951 RAC: 9,341 |
Glitch fixed, and lo and behold- new work! And an interesting batch it is- It's Beta 6.06 work, but they're using 1.4 to 1.7GB of RAM each, and it looks like their target Runtime is 8 hours (unlike the usual 200-400MB of RAM and 3 hrs runtime of previous Beta Tasks). Grant Darwin NT |
Bill Swisher Send message Joined: 10 Jun 13 Posts: 28 Credit: 31,096,251 RAC: 14,187 |
Ouch! At first glance this beta does not seem to play well with my processors. I had two of them go slightly insane, so to speak, btw the only two to download the beta. Both the computers are running openSUSE Leap 15.6, the first (an old AMD Ryzen Threadripper 2950X) jumped up to 99 active users and started swapping like mad (32GB of memory). I had to hit the power button to get to where I could suspend boinc. The second (a newer AMD Ryzen 9 7950X) only jumped up to 64 users before I suspended boinc. I'm heading out of town for a couple of days so all the machines aren't getting any Rosetta until I get back and can watch them. |
Sid Celery Send message Joined: 11 Feb 08 Posts: 2082 Credit: 40,621,050 RAC: 4,056 |
Glitch fixed, and lo and behold- new work! I polled 2hrs before your msg and got nothing, then not again until 3hrs ago - and only just noticed. Argh Not many left to grab now either - a small batch. On runtime, I've said before I use a 12hr runtime, but didn't mention that the last batch of 16 I sneaked a few days ago only ran 8hrs too. Not sure if that's a coincidence as my 12hr setting usually overrides the default set to the individual tasks. Anyway, work is work. I'll take whatever I can get. |
Jean-David Beyer Send message Joined: 2 Nov 05 Posts: 186 Credit: 6,174,578 RAC: 3,700 |
Ouch! At first glance this beta does not seem to play well with my processors. The first three I got ran just fine. My machine is running Red Hat Enterprise Linux release 8.10 (Ootpa) using kernel 4.18.0-553.22.1.el8_10.x86_64. Task results (task ID, work unit ID, sent, reported, status, run time in seconds, CPU time in seconds, credit, application): 1583987342 1409332410 1 Oct 2024, 10:50:55 UTC 1 Oct 2024, 18:23:18 UTC Completed and validated 26,366.61 25,872.01 369.98 Rosetta Beta v6.06 x86_64-pc-linux-gnu; 1583987355 1409332393 1 Oct 2024, 10:50:55 UTC 1 Oct 2024, 18:26:58 UTC Completed and validated 27,345.06 26,815.76 383.71 Rosetta Beta v6.06 x86_64-pc-linux-gnu; 1583987363 1409332409 1 Oct 2024, 10:50:55 UTC 1 Oct 2024, 18:23:18 UTC Completed and validated 26,759.98 26,251.39 375.50 Rosetta Beta v6.06 x86_64-pc-linux-gnu |
Klimax Send message Joined: 27 Apr 07 Posts: 40 Credit: 2,623,807 RAC: 5,184 |
Glitch fixed, and lo and behold- new work! So that explains why WUs are failing on one of my computers. 20 threads and only 16GB of RAM and fairly small paging file. They could have warned us... |
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1636 Credit: 16,775,951 RAC: 9,341 |
So that explains why WUs are failing on one of my computers. 20 threads and only 16GB of RAM and fairly small paging file. They could have warned us... It's been the general rule of thumb since I've been here (a bit over 4 years)- 1.5GB of RAM per core/thread is needed in order to do Rosetta work. It's only been recently with the Beta application that Tasks have used less (there were batches of Rosetta 4.20 work that had Tasks that used 2-4GB each). Edit- Interestingly- on one system all running tasks are using up to 1.6GB of RAM, on the other only 2 are using more than 1GB of RAM, the rest 400-700MB. Grant Darwin NT |
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1636 Credit: 16,775,951 RAC: 9,341 |
Ouch! At first glance this beta does not seem to play well with my processors. The first three I got ran just fine. My machine is running Red Hat Enterprise Linux release 8.10 (Ootpa) 128GB of RAM on a 16 core/thread system leaves plenty of RAM available for the system even when all cores/threads are doing Rosetta work. Grant Darwin NT |
Klimax Send message Joined: 27 Apr 07 Posts: 40 Credit: 2,623,807 RAC: 5,184 |
So that explains why WUs are failing on one of my computers. 20 threads and only 16GB of RAM and fairly small paging file. They could have warned us... It's been the general rule of thumb since i've been here (a bit over 4 years)- 1.5GB of RAM per core/thread is needed in order to do Rosetta work. Argh. I just realized (after writing a reply) what happened. NumberFields uses OpenCL for multiprecision arithmetic, and the OpenCL compiler uses a lot of RAM during compilation (a sharp increase to a few GB, after which it drops back to a fairly small footprint). So I have enough virtual memory when it's not being exhausted by another project... |
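
The 1.5GB-per-thread rule of thumb quoted in the thread can be turned into a quick back-of-the-envelope check. The sketch below is only an illustration in Python, not anything the project provides: the function names and the 2GB set aside for the operating system (`reserve_gb`) are assumptions, and real tasks vary (1.4 to 1.7GB each was reported for this batch).

```python
import math


def ram_needed_gb(threads, ram_per_task_gb=1.5):
    """RAM needed to run one task on every thread, per the rule of thumb."""
    return threads * ram_per_task_gb


def max_concurrent_tasks(total_ram_gb, threads, ram_per_task_gb=1.5, reserve_gb=2.0):
    """Estimate how many tasks fit in RAM at once.

    reserve_gb is an assumed allowance for the OS and other programs;
    it is not a figure from the thread.
    """
    usable_gb = max(total_ram_gb - reserve_gb, 0.0)
    fit_in_ram = math.floor(usable_gb / ram_per_task_gb)
    # Never more tasks than threads, never more than RAM allows.
    return min(threads, fit_in_ram)


if __name__ == "__main__":
    # The 20-thread / 16GB machine mentioned in the thread.
    print(ram_needed_gb(20))             # RAM to load all 20 threads
    print(max_concurrent_tasks(16, 20))  # tasks that fit with 16GB
    # The 16-thread / 128GB machine mentioned in the thread.
    print(max_concurrent_tasks(128, 16))
```

Under these assumptions the 20-thread machine would need about 30GB to load every thread, while 16GB leaves room for roughly 9 tasks at once. Limiting the number of CPUs BOINC may use in the computing preferences is one way to stay under such a ceiling.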
©2024 University of Washington
https://www.bakerlab.org