Message boards : Number crunching : Not getting any python work
Author | Message |
---|---|
G.L.I.S. Send message Joined: 25 Dec 08 Posts: 26 Credit: 2,227,945 RAC: 2,778 |
Update: "the workaround" for downloading the Python WUs no longer seems to work... Maybe something has changed on the server. If there are any positive updates, I will post the results. Sorry for the inconvenience. Byez |
mmstick Send message Joined: 4 Dec 12 Posts: 8 Credit: 606,792 RAC: 0 |
With as many issues as the Python tasks have; with half of them being unmanageable, or causing the system's OOM killer to assassinate them for using too much memory; not getting any should be considered a blessing. I've just opted to uninstall virtualbox on my Linux systems. There's simply no valid reason that BOINC projects should be using it on Linux. We all know that virtualization is largely an inefficient waste of resources. That's especially true for VirtualBox compared to the Linux kernel's KVM/QEMU support. There are better solutions that exist today that would provide the same benefits -- virtual environments, namespaces, and containers -- without having to emulate an entire virtual machine. I'd rather wait for BOINC projects to start using these solutions. You could argue about Python dependencies, but we live in an era where Python programmers have pip, virtualenv, and anaconda at their disposal. You could bundle your entire development environment into an OSTree or docker image, and execute it natively on a system using a bubblewrap chroot, or podman, so that the software runs in an isolated sandbox with no interference from the host OS. Root's not even required to achieve this. Of course, I'd also argue that Python itself is not the best tool for distributed computing. 100 computers running a Python application will get the same computational output as 1 computer running a Rust application. As far as super simple scripting languages go, I'd give more of a pass to Julia because it at least leverages the most performant mathematics libraries while also performing JIT compilation of its scripts to something that's close to optimized machine code. WASM would also be an excellent target with its ability to compile on any platform architecture and optimize for the system's native CPU. |
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1675 Credit: 17,738,371 RAC: 22,926 |
[quote]The standard Rosetta tasks also reserve 8gb of ram for each task,[/quote] No they don't. They request & release RAM as required- none is reserved. And apart from a batch of faulty Tasks some time back, the most i have seen used by a single Task was around 4GB. Generally the highest is around 1.3GB. The current batch of work is using between 700MB & 1GB each. Please stop making things up, it's not helpful. [quote]one way around this is a simple app_config file that limits the number of tasks running per project, like this:[/quote] Which sometimes results in Tasks continuously being downloaded without any chance of processing them due to a known bug with how BOINC handles that setting. Grant Darwin NT |
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1675 Credit: 17,738,371 RAC: 22,926 |
[quote]Update: "the workaround" to be able to download the wus pythons, it seems not to work anymore ...[/quote] Rosetta 4.20 Tasks are now available again. For several days there, they weren't (apart from the very occasional RB Task or a resend). Grant Darwin NT |
Greg_BE Send message Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
Yeah ok...but where in the file with all the other text? top of the pile or what? So what you are saying is the file name, that I get. The scheduling priority WUs: rah_make_work_rosetta_python_projects, goes inside this file along with the code? That's what I am getting at. |
G.L.I.S. Send message Joined: 25 Dec 08 Posts: 26 Credit: 2,227,945 RAC: 2,778 |
[quote][quote]Update: "the workaround" to be able to download the wus pythons, it seems not to work anymore ...[/quote] Rosetta 4.20 Tasks are now available again. For several days there, they weren't (apart from the very occasional RB Task or a resend).[/quote] Oh...ok, thanks |
G.L.I.S. Send message Joined: 25 Dec 08 Posts: 26 Credit: 2,227,945 RAC: 2,778 |
[quote]Yeah ok...but where in the file with all the other text? top of the pile or what?[/quote] ?? What exactly are you referring to with 'all the other text'? |
Greg_BE Send message Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
[quote]Yeah ok...but where in the file with all the other text? top of the pile or what?[/quote] Well..it just wasn't clear to me...but anyway. Will make the modification. |
mikey Send message Joined: 5 Jan 06 Posts: 1895 Credit: 9,152,433 RAC: 4,296 |
[quote][quote]The standard Rosetta tasks also reserve 8gb of ram for each task,[/quote] No they don't.[/quote] I didn't 'make it up'; it was my misunderstanding. I didn't know that the 4.20 tasks 'request & release ram as required'. I have only ever looked in the properties of a running task to see what it says, so I was going by that. |
G.L.I.S. Send message Joined: 25 Dec 08 Posts: 26 Credit: 2,227,945 RAC: 2,778 |
Yesterday I was unable to communicate with the server, today it started sending alerts again. In my experience (always regarding the topic of the original post) the best 'app_config.xml (*) (**)' is: ------------------------------------ <app_config> <app> <name>rosetta_python_projects</name> </app> <app_version> <app_name>rosetta_python_projects</app_name> <plan_class>vbox64</plan_class> </app_version> </app_config> ------------------------------------- I am not able to activate multithreading, and I always find that 2 physical cores of the processor are left free (the logical cores are not used). I assume (for example) that with an 8-core FX CPU, max 6 wus are processed simultaneously. I repeat: with Ryzen 3 3100, max 2 (python) wus; with Ryzen 5 3600, max 4 (python) wus, simultaneously. (*) Obviously, if you also want to download/modulate Rosetta 4.20, the file must be suitably integrated with the appropriate strings. In this case the wus should/could occupy the rest of the CPU's free cores/threads. (**) After each modification to the 'app_config.xml' file, save and refresh the page and pages back. Then click on 'Read configuration files', from the 'Options' menu of the BOINC client. Sometimes it may be necessary to exit BOINC and also terminate it from 'Task Manager', then restart the program. |
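A minimal sketch (not from the thread) of generating the app_config.xml quoted above with Python's standard library, so the file is always well-formed. The function name `build_app_config` and its arguments are illustrative assumptions. `<max_concurrent>` is a standard app_config.xml element, included here only as an optional extra; note the report earlier in the thread that limiting running tasks can sometimes cause work to keep downloading.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name="rosetta_python_projects",
                     plan_class="vbox64",
                     max_concurrent=None):
    """Return app_config.xml text for a single app.

    max_concurrent is optional; when None, the element is left out and the
    result matches the structure of the file quoted in the post above.
    """
    root = ET.Element("app_config")

    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    if max_concurrent is not None:
        ET.SubElement(app, "max_concurrent").text = str(max_concurrent)

    version = ET.SubElement(root, "app_version")
    ET.SubElement(version, "app_name").text = app_name
    ET.SubElement(version, "plan_class").text = plan_class

    # Pretty-print when available (ET.indent was added in Python 3.9).
    if hasattr(ET, "indent"):
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Print the file contents; save them as app_config.xml in the project's
    # folder inside the BOINC data directory, then use
    # 'Read configuration files' as described in the post.
    print(build_app_config(max_concurrent=2))
```

Calling `build_app_config()` with no arguments reproduces the elements from the post; passing `max_concurrent` is a hypothetical variation, not something the poster recommended.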
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1675 Credit: 17,738,371 RAC: 22,926 |
[quote]Yesterday I was unable to communicate with the server, today it started sending alerts again.[/quote] The reason it is "sending alerts" is because we are out of Rosetta 4.20 work again. That's all. If you had been unable to contact the server, you would have got a message stating that. Grant Darwin NT |
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1675 Credit: 17,738,371 RAC: 22,926 |
[quote]I didn't 'make it up' it was my misunderstanding that the 4.20 tasks 'request & release ram as required'. I have only ever looked in the properties of a running task and see what it says so was going by that.[/quote] If you had been looking at a Rosetta 4.20 Task there is no way you would have come up with it requiring 8GB of RAM. The memory column on the Process tab in Task Manager shows the amount of memory in use for the listed process/application. The most in use for a Rosetta 4.20 Task i've seen lately has been 3.3GB for a RB Task. All the rest were no more than 1.2GB. Python Tasks, and only Python Tasks, require 8GB of RAM. Edit- found a 3.3GB RB Task. Grant Darwin NT |
Michael Goetz Send message Joined: 17 Jan 08 Posts: 12 Credit: 179,114 RAC: 0 |
One problem solved. But it's not a very satisfying solution. For a while, I've been trying, and failing, to run the new Python VM tasks. I'm an experienced user; I'm actually a BOINC system admin and have modified both the BOINC client and server code when I needed to. Suffice it to say that I know my way around BOINC pretty well. And yet, for weeks, I've been unable to get the Rosetta server to send Python tasks to my main computer. It's got plenty of RAM. VBox is installed and virtualization is enabled in the BIOS. VBox apps from other projects run just fine on this computer. But the Rosetta server refused to send Python tasks, no matter what I did. It didn't matter which versions of VBox or BOINC I used, I could not get tasks. I've reset the Rosetta project multiple times. I've detached and attached it multiple times. Nothing worked. Finally, today, I got it working. But the solution isn't satisfying because it doesn't illuminate what caused the problem. Yes, I fixed it, but I have no idea why it's working now. That's what's so frustrating. What I did was to enable multiple BOINC instances in cc_config.xml, and set up a separate BOINC instance on the very same computer. Then I attached to Rosetta using the second BOINC instance. That did the trick. I have no clue why the second instance of BOINC works while the first one doesn't. They are using identical cc_config.xml files. The only differences are the location of the data directory and the RPC port number. Otherwise, it's the same computer and the same software. But the Rosetta server sends tasks to one instance but not the other. I can't explain why. I'm posting this because if you have the same problem, perhaps this may help. For reference, this is the original BOINC install and this is the second BOINC instance. Instructions for setting up a second BOINC instance. Want to find one of the largest known primes? Try PrimeGrid. Or help cure disease at WCG. |
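A rough sketch (not from the thread) of what starting such a second instance might look like, assuming the standard BOINC client options `--allow_multiple_clients`, `--dir` and `--gui_rpc_port`, and `boinccmd --host`. The helper name `second_instance_commands`, the data directory and the port number are hypothetical placeholders; the poster's actual setup is not shown in the thread. The code only builds the argument lists and does not launch anything.

```python
def second_instance_commands(data_dir, rpc_port,
                             client="boinc", cmd="boinccmd"):
    """Build argument lists for a second BOINC client and a status query.

    data_dir and rpc_port must differ from the first instance's values
    (the default GUI RPC port is 31416). Executable names are assumptions
    and vary by platform (e.g. boinc.exe on Windows).
    """
    # Start a client with its own data directory and RPC port.
    start = [client, "--allow_multiple_clients",
             "--dir", data_dir,
             "--gui_rpc_port", str(rpc_port)]
    # Talk to that specific client through its RPC port. A GUI RPC
    # password may also be required, which this sketch leaves out.
    status = [cmd, "--host", f"localhost:{rpc_port}",
              "--get_project_status"]
    return start, status


if __name__ == "__main__":
    # Hypothetical directory and port, for illustration only.
    start, status = second_instance_commands("/var/lib/boinc2", 31418)
    print(" ".join(start))
    print(" ".join(status))
```

The two values passed in are the same two differences the post mentions between the instances: the data directory and the RPC port.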
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1675 Credit: 17,738,371 RAC: 22,926 |
[quote]What I did was to enable multiple BOINC instances in cc_config.xml, and set up a separate BOINC instance on the very same computer. Then I attached to Rosetta using the second BOINC instance. That did the trick.[/quote] Ok, that is just plain weird, going all the way to ridiculous. As Greg_BE, the starter of this thread, posted, he was getting Python tasks, then he wasn't. Even when the project (as now) ran out of regular Rosetta 4.20 Tasks, it still wouldn't pick up Python Tasks, even though VirtualBox is working for Tasks on another BOINC project. [quote]The only differences are the location of the data directory and the RPC port number.[/quote] Firewall configuration issue??? Although how that would stop BOINC from asking for Python work... But then with Rosetta, there is no Python or Rosetta 4.20 work- it's all just Rosetta. There is no way to select one or the other. If your system can do it, you get it. If not, you just get the one you can do. You request more work from Rosetta and it's the luck of the draw as to which one you get if your system can do both. At present with no Rosetta 4.20 work, and my BOINC installation not including VirtualBox, i can't do Python work. So each work request just results in a "Vbox is not installed" message. If you've got Vbox, and you need Rosetta work, then you should be getting Python tasks. Does re-installing BOINC with VBox wipe the previous Vbox installation? Or are any files that are created when Vbox runs, but not part of the installation process, left there? ie Config files, failed VMs etc. That is the case for projects you are attached to when you re-install BOINC- eg upgrading versions. Would detaching from Rosetta, using the Add/remove programmes Windows installer to remove VirtualBox then manually making sure the Rosetta project folder & sub folders are all deleted, manually making sure Vbox and all its sub folders are deleted, then re-installing BOINC with Vbox support and re-attaching to Rosetta, possibly resolve the issue? Grant Darwin NT |
Sid Celery Send message Joined: 11 Feb 08 Posts: 2119 Credit: 41,179,074 RAC: 11,480 |
[quote]Does re-installing BOINC with VBox wipe the previous Vbox installation? Or are any files that are created when Vbox runs, but not part of the installation process, left there? ie Config files, failed VMs etc.[/quote] I tried this on the PC I installed VBox on last week. It doesn't remove Vbox. However, Greg_BE did send some instructions on how to uninstall Vbox, which I'll attempt to use when I get back down there tomorrow night |
Michael Goetz Send message Joined: 17 Jan 08 Posts: 12 Credit: 179,114 RAC: 0 |
[quote]Ok, that is just plain weird, going all the way to ridiculous.[/quote] My thoughts exactly. [quote]Firewall configuration issue???[/quote] It's definitely not the firewall. The RPC port is only used for controlling the BOINC client via the BOINC manager, boinccmd command line interface, or BOINCTasks. It's got nothing to do with communicating with the BOINC project server. Besides, the BOINC client was communicating just fine with the server. The message "I've got no tasks for you!" was getting through loud and clear. [quote]Does re-installing BOINC with VBox wipe the previous Vbox installation?[/quote] Yes, if you install the BOINC package containing VBox, it replaces any existing VBox installation. [quote]Or are any files that are created when Vbox runs, but not part of the installation process, left there? ie Config files, failed VMs etc.[/quote] I'm not sure if configuration files persist when VBox is reinstalled, either manually or as part of BOINC's installation process. [quote]Would detaching from Rosetta, using the Add/remove programmes Windows installer to remove VirtualBox then manually making sure the Rosetta project folder & sub folders are all deleted, manually making sure Vbox and all it's sub folders are deleted...[/quote] Tried that. Multiple times. No joy. [quote]... then re-install BOINC with Vbox support then re-attach to Rosetta, possibly resolve the issue?[/quote] I did reinstall BOINC; I even tried using a different BOINC version. No joy there either. I didn't try the exact sequence of steps you suggested, but I don't think it would have made any difference. Want to find one of the largest known primes? Try PrimeGrid. Or help cure disease at WCG. |
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1675 Credit: 17,738,371 RAC: 22,926 |
That is just beyond weird. On other projects you can select the type of work you do when there are multiple types, although on Seti at least the mechanism was broken for the last couple of years. Selecting one type (Multibeam), and the other (AstroPulse) only if there was none for the first type (Multibeam) (or not at all) worked that way for years without issue. Many would do just AP due to the high Credit payout, but if there was no AP work the systems would pick up MB till the next batch of AP was released- exactly as the settings were configured. Then there was an update to the Scheduler and it stopped working the way it previously did- you had to enable work for both types all the time, in order to reliably get any. Even if your system didn't support one of them. And from memory it didn't affect everyone that way, just some systems (which was still a lot due to the number of crunchers there). Rosetta doesn't have the option to select the type of work- if your system can't support it, you won't get it- but here we have a case of systems supporting it, capable of running it, but still not getting it. It could very well be that old issue coming into play here. I don't recall if anyone tried your multi-instance workaround (i'm pretty sure they didn't). Yep- Beyond weird, just ridiculous. Grant Darwin NT |
Michael Goetz Send message Joined: 17 Jan 08 Posts: 12 Credit: 179,114 RAC: 0 |
[quote]On other projects you can select the type of work you do when there are multiple types, although on Seti at least the mechanism was broken for the last couple of years.[/quote] (TO BE CLEAR... the content of this post is referring to BOINC in general, and NOT specifically to Rosetta. I am in no way implying anything about Rosetta or its management. The "don't know what they're doing" part obviously doesn't apply to Rosetta. But it frequently describes new projects that pop up. There's a big learning curve.) About this... some projects don't support user selection of apps. Sometimes this is intentional. Sometimes it's simply because the admins don't know what they're doing. People don't come out of the womb knowing how to run a BOINC project. In fact, the documentation is pretty poor and a lot of things are done by trial and error. BOINC has a configuration option to enable user app selection. You change one line in a PHP include file to turn it on or off. It's off by default. I guess the thinking is that most projects start with just a single app, so it would only confuse users to have a selection for just one app. The problem is that a new admin doesn't know they have to change this setting when they add more apps. Or they don't really understand why their users would even care which apps they run. Unfortunately, many projects don't have this option turned on. It sucks for their users, and many users leave because of that. For some projects which have both an abundance of users and a shortage of work, it doesn't matter. But for projects with sufficient work, this cuts down on the science that gets done. Everyone loses. Want to find one of the largest known primes? Try PrimeGrid. Or help cure disease at WCG. |
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1675 Credit: 17,738,371 RAC: 22,926 |
[quote]BOINC has a configuration option to enable user app selection. You change one line in a PHP include file to turn it on or off. It's off by default.[/quote] Even so, the Scheduler would still have the code to implement this ability. And as i mentioned, it became noticeably broken on Seti after a Scheduler update a year or two before they shut down. It's the only thing that comes to mind that might explain the present odd work allocation behaviour affecting a very few systems. For a project that has the multiple application option enabled, the Scheduler has to check the status of the flags for the different types of application when a host requests more work. But if the option isn't enabled, then how the Scheduler allocates work will have different (or no) default flags, so the behaviour may differ. And whatever causes the occasional failure of systems to get work with valid applications & application settings may be resulting in the present issue occurring where the feature hasn't been selected, due to the underlying code & default flags & values. Just a WAG (Wild Arse Guess). But whatever causes the issue to happen on so few systems would probably also explain why a multi-instance BOINC installation can have one instance getting work where the other doesn't. One of those things that would probably appear blindingly obvious- once the problem was found and resolved (i certainly had more than my fair share of those repairing electronics over the years). Grant Darwin NT |
[VENETO] boboviz Send message Joined: 1 Dec 05 Posts: 1994 Credit: 9,551,716 RAC: 6,403 |
[quote]The problem is that a new admin doesn't know they have to change this setting when they add more apps. Or they don't really understand why their users would even care which apps they run. Unfortunately, many projects don't have this option turned on.[/quote] This is correct for a new project. R@H has been in the "boinc world" since.... i don't remember when. I don't want to believe that their admins don't know how to change a simple php/xml file to activate the function. And yes, the documentation is not so large and precise, but there are some pages that can help and you can also participate in the BOINC newsletter/discussion group if you have configuration problems. So, for me, it's simply a problem of will. |
©2024 University of Washington
https://www.bakerlab.org