Message boards : Number crunching : Problems and Technical Issues with Rosetta@home
Grant Morphett · Joined: 12 May 07 · Posts: 4 · Credit: 18,289,252 · RAC: 896
Agreed, but I don't want to have to monitor BOINC tasks and manage them. I just let it run. Thanks.
Grant Morphett · Joined: 12 May 07 · Posts: 4 · Credit: 18,289,252 · RAC: 896
> Those tasks have hung or the processing inside the VM crashed. Just abort the work unit(s).
Great - I have made that setting. We will see what happens. I really appreciate your advice. Thanks.
Greg_BE · Joined: 30 May 06 · Posts: 5691 · Credit: 5,859,226 · RAC: 0
> Those tasks have hung or the processing inside the VM crashed. Just abort the work unit(s).
Grant - you need to look at the memory size. The old Python tasks use 7.5 GB per process; the new tasks are 2 GB max. At 7 GB per task, that will chew through your 64 GB really fast. If a task does not finish within 12 hours of uninterrupted run time, then yes, something went wrong in the process and you will have to abort. Which VirtualBox are you using? Is that 5.4? Python does not like the newest VirtualBox.
robertmiles · Joined: 16 Jun 08 · Posts: 1234 · Credit: 14,338,560 · RAC: 2,014
I'm using VirtualBox 6.0 under Windows 10. The Python tasks run properly, with one oddity: they will only start if there is at least 7.45 GB of available memory, but if they reserve less, only the reserved amount counts against what is available for starting yet another Python task.
Greg_BE · Joined: 30 May 06 · Posts: 5691 · Credit: 5,859,226 · RAC: 0
> I'm using VirtualBox 6.0 under Windows 10. The Python tasks run properly, with one oddity: they will only start if there is at least 7.45 GB of available memory, but if they reserve less, only the reserved amount counts against what is available for starting yet another Python task.
Weird, because when I started with Python I was getting 3 at a time running (VirtualBox 5.4), and at around 7.38 GB each (call it 7.5) they were using up most of my 24 GB of memory; the rest went to the system and a few of the other CPU projects. But as you see, the multi-letter ones are the monsters. So just figure on 7.4-7.5 GB per task, and depending on your memory load from other stuff, you might only be able to run one. The cages work is only 686 MB, so maybe when those come your way you will see an increase in the number of cores used. I don't have any really good answers. About the only thing I can think of is that there might be something in the scheduler that gives out only a maximum of x units no matter the memory size. Weird... but an idea. Grant (SSSF) or Jim1348 might be able to help you better.
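The per-task budgeting described above can be sketched in a few lines. A minimal sketch: the 7.5 GB (old Python) and 2 GB (new task) figures come from the posts in this thread, while the system-overhead figure is an illustrative assumption, not a measured value.

```python
# Rough per-task memory budgeting for VM tasks. Per-task sizes are the
# figures quoted in the thread; the overhead for the OS and other
# projects is a hypothetical example value.

def max_concurrent_tasks(total_ram_gb, other_usage_gb, per_task_gb):
    """How many VM tasks fit in the RAM left over for BOINC."""
    spare = total_ram_gb - other_usage_gb
    return max(0, int(spare // per_task_gb))

# A 24 GB machine with ~1.5 GB used by the system and other projects:
print(max_concurrent_tasks(24, 1.5, 7.5))  # old Python tasks -> 3
print(max_concurrent_tasks(24, 1.5, 2.0))  # new tasks -> 11
```

This matches the report above of three old Python tasks running at once on a 24 GB machine.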
Jim1348 · Joined: 19 Jan 06 · Posts: 881 · Credit: 52,257,545 · RAC: 0
> I'm using VirtualBox 6.0 under Windows 10. The Python tasks run properly, with one oddity: they will only start if there is at least 7.45 GB of available memory, but if they reserve less, only the reserved amount counts against what is available for starting yet another Python task.
I also see that the latest pythons are now down to a reasonable size (686 MB). There are still a few of the 2861 MB ones around, but I don't see anything larger. That should change with the smaller ones, but all my machines have more memory, so I can't check it.
robertmiles · Joined: 16 Jun 08 · Posts: 1234 · Credit: 14,338,560 · RAC: 2,014
> I also see that the latest pythons are now down to a reasonable size (686 MB).
You can still check it, but expect more Python tasks to already be running when it decides whether to start another one. I should have written free memory instead of available memory. I observed it on a computer with 32 GB main memory.
Jim1348 · Joined: 19 Jan 06 · Posts: 881 · Credit: 52,257,545 · RAC: 0
> You can still check it, but expect more Python tasks to already be running when it decides whether to start another one. I should have written free memory instead of available memory. I observed it on a computer with 32 GB main memory.
That is quite possible, but I am running all my pythons on Ubuntu now. It seems that "free" and "available" memory have different meanings on Linux than on Windows anyway. Linux puts a lot of stuff in cache and buffers; I add to that with a large write cache. I expect most of it can be reclaimed for running apps as necessary, in addition to the "free" memory. Have you checked the memory settings in BOINC Manager?
robertmiles · Joined: 16 Jun 08 · Posts: 1234 · Credit: 14,338,560 · RAC: 2,014
> Have you checked the memory settings in BOINC Manager?
I've checked them many times, usually during the changes I made to allow more Python tasks to run at the same time.
Jim1348 · Joined: 19 Jan 06 · Posts: 881 · Credit: 52,257,545 · RAC: 0
> I've checked them many times, usually during the changes I made to allow more Python tasks to run at the same time.
I figured you had. It is such a strange problem that we can only hope it will go away. But if not, I would detach from Rosetta and try again. There may be something corrupted.
Jean-David Beyer · Joined: 2 Nov 05 · Posts: 196 · Credit: 6,613,600 · RAC: 6,755
> That is quite possible, but I am running all my pythons on Ubuntu now. It seems that "free" and "available" memory have different meanings on Linux than on Windows anyway. Linux puts a lot of stuff in cache and buffers; I add to that with a large write cache. I expect most of it can be reclaimed for running apps as necessary, in addition to the "free" memory.
I am running Red Hat Enterprise Linux release 8.5 (Ootpa). If I run the free program, it gives the following. I am running very few BOINC tasks at the moment.

    $ free
                  total        used        free      shared  buff/cache   available
    Mem:       65435808     6337708     2390400      142808    56707700    58224084
    Swap:      16375804      932608    15443196

So I have about 64 GBytes of RAM (total), 6.3 GBytes used, about 2.4 GBytes free, 142 MBytes shared (mostly shared libraries, I suppose), and 56 GBytes in output buffers and input cache. About 58 GBytes are available for new processes: the free memory plus the reclaimable part of the cache. The input cache can be reclaimed immediately, but the output buffers must be written out before they can be reused. Anyhow, the free and available figures are considerable at the moment; when running 8 BOINC tasks, they will change somewhat. I have about 15 GBytes of disk allocated for paging (swap), of which almost 1 GByte is used.
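The `free` columns quoted above can also be read programmatically. A minimal parsing sketch using the exact figures from the post (units are KiB, as `free` prints by default); the "available" column is the kernel's estimate of memory that can be handed to new processes (free pages plus reclaimable cache and buffers).

```python
# Parse the output of `free` (procps-ng) into a dict of dicts.
# FREE_OUTPUT reproduces the figures posted above.

FREE_OUTPUT = """\
              total        used        free      shared  buff/cache   available
Mem:       65435808     6337708     2390400      142808    56707700    58224084
Swap:      16375804      932608    15443196
"""

def parse_free(text):
    """Return {'Mem': {...}, 'Swap': {...}} keyed by column header."""
    lines = text.strip().splitlines()
    headers = lines[0].split()
    rows = {}
    for line in lines[1:]:
        label, *values = line.split()
        # zip() stops early for the Swap row, which has fewer columns.
        rows[label.rstrip(':')] = dict(zip(headers, map(int, values)))
    return rows

mem = parse_free(FREE_OUTPUT)['Mem']
print(mem['available'] // 1024 // 1024)  # -> 55 (GiB usable by new tasks)
```

The 55 GiB figure agrees with the ~58 GB (decimal) the poster reads off the same line.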
Jim1348 · Joined: 19 Jan 06 · Posts: 881 · Credit: 52,257,545 · RAC: 0
I have a Ryzen 3900X running all python boinc_cages on 23 cores at the moment under Ubuntu 20.04.3. It has 96 GB of memory.

    $ free
                  total        used        free      shared  buff/cache   available
    Mem:       98808092    35710244     3313196       18312    59784652    62096680
    Swap:       1952764        4096     1948668

About the only thing I learn from it is that I have much more memory than I need now. But it allows me to have a 20 GB write cache with a 12-hour latency. That should do it.
Grant Morphett · Joined: 12 May 07 · Posts: 4 · Credit: 18,289,252 · RAC: 896
I'm using VirtualBox 6.1. My job requires a lot of virtualisation, so I'm always running the latest stable VirtualBox. Interesting that Python doesn't like it. I'm running Python 3.9.7, which I am guessing isn't "old Python". Any legacy Rosetta tasks are running fine after I made the changes suggested above, so I'm happy for the moment. Again, BOINC for me is set and forget - the only thing I adjust is the %CPU it uses on my desktop while I'm working. Overnight it gets 100%; for email, documents, etc. it gets 80%; if I'm coding it gets none. Thanks, Grant.
Greg_BE · Joined: 30 May 06 · Posts: 5691 · Credit: 5,859,226 · RAC: 0
> I'm using VirtualBox 6.1. My job requires a lot of virtualisation, so I'm always running the latest stable VirtualBox.
Well... just let the old pythons pass through your system and let someone else running VirtualBox 5.xx do them. I am on VirtualBox 6.x now and am getting only the new work, which is stable.
Grant (SSSF) · Joined: 28 Mar 20 · Posts: 1725 · Credit: 18,391,654 · RAC: 19,096
Looks like we've finished the most recent batch of Rosetta 4.20 tasks.
Grant
Darwin NT
tullio · Joined: 10 May 20 · Posts: 63 · Credit: 630,125 · RAC: 0
I am running two tasks at a time on my PC with 12 GB RAM. A third one is waiting for more memory, but it has done some running too. A fourth one is ready to run.
Tullio
Paddles · Joined: 15 Mar 15 · Posts: 11 · Credit: 5,434,545 · RAC: 2,879
I too was having problems with too many of the vbox jobs trying to run at once, leaving too few resources for Rosetta and other projects' tasks to run, and lowering overall performance. In my case, lowering Rosetta's resource share from 200 (out of 500, a 40% share) to 100 (out of 400, a 25% share) is giving me 10-15% more work completed across all projects. I haven't had any vbox jobs that had to be aborted since then. BOINC is allowed 11 of 12 cores and 75% of 32 GB RAM. The other projects are WCG and Yoyo.
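The resource-share arithmetic above generalizes as below. A minimal sketch: the 200-of-500 and 100-of-400 totals are from the post, but the per-project split of the remaining 300 points (between WCG and Yoyo) is a hypothetical illustration.

```python
# BOINC resource share: each project's fraction of CPU time is its
# share number divided by the sum of shares across attached projects.
# The 150/150 split for the other two projects is hypothetical.

def share_percent(project_share, all_shares):
    """Percentage of BOINC work allocated to one project."""
    return 100.0 * project_share / sum(all_shares)

print(share_percent(200, [200, 150, 150]))  # before: 40.0
print(share_percent(100, [100, 150, 150]))  # after:  25.0
```

Note that lowering one project's share raises every other project's percentage automatically, since only the ratios matter.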
Scottie McKinley · Joined: 14 Jan 21 · Posts: 5 · Credit: 2,597,109 · RAC: 152
Yeah, looks like only vbox jobs are running. My 8 Python tasks show as using only 20 GB of RAM. It seems vbox reserves much more RAM than it actually uses.
.clair. · Joined: 2 Jan 07 · Posts: 274 · Credit: 26,399,595 · RAC: 0
I'm having great difficulty getting any work - I can't even get pythons on systems that are set up for them (and have been running them). Anyone else seeing this? I have seen 5000 pythons "ready to send" on the server status page, but most work requests get the "no tasks sent" message with no reason why.
Jim1348 · Joined: 19 Jan 06 · Posts: 881 · Credit: 52,257,545 · RAC: 0
I have been getting pythons regularly (most recently 3 hours ago) on two Ryzen 3900X's, so it must be a machine or network problem. But the errors ("0 CPU" and "Vm job unmanageable") are getting too annoying for me. I will wait until later to get more, and maybe they will have it fixed. Though they do not even indicate that they know there are problems, so it could be a while.
©2024 University of Washington
https://www.bakerlab.org