Message boards : Number crunching : Problems and Technical Issues with Rosetta@home
Author | Message |
---|---|
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
I give up on trying to get Python tasks, Falconet. I won't get any anyway; they blocked my system. Go look in my thread, where I just responded to Admin about a quick and dirty fix to a problem they found. Rather than communicate with the "volunteers" or start a thread about the problem, they wrote a quick and dirty script to block systems that kicked back certain errors. It seems my system kicked back one of those errors, so I am blocked. They can't keep 4.2 filled, so it is rather pointless to keep pinging their scheduler for work that is not there and work I cannot access. |
ProDigit Joined: 6 Dec 18 Posts: 27 Credit: 2,718,346 RAC: 0 |
I reattached Rosetta after I stopped getting any WUs, not only on my Android device but also on my Intel x86 units. |
robertmiles Joined: 16 Jun 08 Posts: 1232 Credit: 14,281,662 RAC: 1,150 |
Could it be that the Python tasks are only for teaching their younger students how to create tasks? So far, Rosetta@Home has not mentioned any other purpose for them. By the way, I tried a Google search for vbox64 instructions, to see how hard it would be to change the amount of reserved memory. I found instructions, but they did not mention the amount of reserved memory. Some of you could ask Oracle to add a feature for controlling the amount of reserved memory. |
Jim1348 Joined: 19 Jan 06 Posts: 881 Credit: 52,257,545 RAC: 0 |
"Could it be that the Python tasks are only for teaching their younger students how to create tasks?" The only thing that I recall is that one of their experimenters was using them, and I don't know for what purpose. It was not exactly a ringing endorsement that they would be widely used as a basis for new work, as we might hope. If anyone can find a mention of a use, maybe they could post it. |
FunkyKoval Joined: 25 Dec 07 Posts: 1 Credit: 10,684,108 RAC: 660 |
Hi guys, I do not know if this is the right thread, but I have had a problem for two days: I am no longer receiving the usual Rosetta units. I guess this means a definitive transition to units that run under VirtualBox? And here comes my problem: I installed BOINC with VirtualBox and it downloaded about 20 units. Unfortunately, it runs only 4 at a time and the other 16 show as ready to start. I have absolutely all the resources allocated to Rosetta, 12 threads and 18GB of free RAM. What can I do to make it run on all 12 threads? Since each process is now a virtual machine, I checked in VirtualBox: each machine has 6GB reserved, so the numbers add up: 6x4 = 24GB, and the next one does not fit because the system and my other apps use the rest. Why as much as 6GB per Linux image? Can this be set globally somewhere to 4GB per machine? This is some kind of massacre if you now need 128GB of RAM to use all threads on a stupid 5600X ... Thank you in advance for your help. Mirek |
robertmiles Joined: 16 Jun 08 Posts: 1232 Credit: 14,281,662 RAC: 1,150 |
Hi guys, The Python Rosetta@Home tasks now available reserve 7.45 GB of memory each, so you'll need that much memory per task plus a gigabyte or two for things outside the tasks. The tasks actually use closer to 100 MB, though. The usual Rosetta tasks are now in high demand from computers that don't have that much memory available, so you'll get rather few of them. You can join the effort to persuade Oracle, the company that provides VirtualBox, to add a few features such as the ability to set much lower amounts of reserved memory. I was able to increase the number of Python tasks that would run on my computer at the same time from 1 to 3 by increasing the fraction of the total memory BOINC could use, and then also increasing the fraction of the total disk space BOINC could use. It still won't use all 8 of the virtual cores I allow BOINC to use, since the computer has only 32 GB of main memory. I'm thinking of buying more memory if this problem lasts long enough, but I'm not in a hurry to do so. |
Tomcat雄猫 Joined: 20 Dec 14 Posts: 180 Credit: 5,386,173 RAC: 0 |
I've always had this problem, but it's far worse with Python tasks. aaam-NHM_pp-mTIQ_pp-SAR-AMACBEN2_pp_6_2539600_1_0 Instead of running this task, BOINC keeps downloading new Rosetta tasks and running those. Look at all the tasks on that computer and you'll see this has nothing to do with queue size. It's almost as if BOINC periodically forgets a task exists. It's worse with Python tasks because BOINC will download as many tasks as there are available threads but will be able to run at most 2 at a time. It's thus far more common to get clogs and "forgotten" tasks. |
Jim1348 Joined: 19 Jan 06 Posts: 881 Credit: 52,257,545 RAC: 0 |
"Instead of running this task, BOINC keeps downloading new Rosetta tasks and running those." You probably have an app_config.xml file with a project_max_concurrent entry. Get rid of it. BOINC has a bug. |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1686 Credit: 17,990,157 RAC: 23,506 |
"Instead of running this task, BOINC keeps downloading new Rosetta tasks and running those." Nah, that's where it keeps downloading more work over & over again, ignoring deadlines & cache settings. I suspect this one will be a Task that in the BOINC Manager will show as running, but if you look in Task Manager it will show 0 CPU usage. Some fail to start, while others start but never actually finish. Grant Darwin NT |
Jim1348 Joined: 19 Jan 06 Posts: 881 Credit: 52,257,545 RAC: 0 |
"Nah, that's where it keeps downloading more work over & over again ignoring deadlines & cache settings." That is the app_config bug. The 0 CPU usage bug is not actually 0, just very low. Such a task still occupies a slot, and you won't get one to replace it until you abort it. But I have seen Rosetta download too many Pythons even without either problem, so it is not all that reliable in the best of times. Just yesterday I had to set NNW after three days nine hours of download, though I had set the buffer to 1 + 1.5 days; I did not see any 0 resource ones, and I don't have an app_config any longer either. It may just be that the estimates are off, though they look OK at eight hours, which is actually a bit long. They usually run six hours or less on my machine. But who knows what value it is actually using. It may not be the one displayed. |
Falconet Joined: 9 Mar 09 Posts: 353 Credit: 1,227,479 RAC: 753 |
"Could it be that the Python tasks are only for teaching their younger students how to create tasks?" I believe this is what you are referring to. |
Jim1348 Joined: 19 Jan 06 Posts: 881 Credit: 52,257,545 RAC: 0 |
Thanks. That looks much more promising. I have ordered more memory, and will put another large machine on it if it is stable. But 128 GB may cause problems. You have to be careful. PS: I know I will still have at least the "Vm job unmanageable" problem, which prevents the further download of any work unit until it times out or you reboot. So I am going to try an automatic reboot. https://askubuntu.com/questions/13730/how-can-i-schedule-a-nightly-reboot But the first method shown did not work. Using `sudo gedit /etc/crontab` and adding the line `00 6 * * * root reboot` looks more promising. And I am hoping that they will solve the 0 CPU jobs, whenever that will be. |
StarCastle Joined: 25 Apr 20 Posts: 7 Credit: 1,023,276 RAC: 640 |
Even though I have the BOINC client set to use up to 6 cores, the Rosetta python project 1.03 tasks only ever run single-threaded, which reduces the amount of work the system can do. Is this a config issue? |
Jim1348 Joined: 19 Jan 06 Posts: 881 Credit: 52,257,545 RAC: 0 |
No, they are single-threaded. Each work unit occupies one virtual core. |
StarCastle Joined: 25 Apr 20 Posts: 7 Credit: 1,023,276 RAC: 640 |
But I have 6 virtual cores available, so why are there not 6 tasks running at the same time? I don't have this issue with other projects, so it is not a system issue. |
robertmiles Joined: 16 Jun 08 Posts: 1232 Credit: 14,281,662 RAC: 1,150 |
"Even though I have the BOINC client set to use up to 6 cores, the Rosetta python project 1.03 tasks only ever run single-threaded, which reduces the amount of work the system can do." No, it is due to the very large amount of memory each Python task reserves, 7.45 GB. They seldom actually use more than 100 MB, but that's not as important. Computers with only 16 GB of memory cannot fit in two of those tasks and also the other programs needed to keep BOINC running. |
Jim1348 Joined: 19 Jan 06 Posts: 881 Credit: 52,257,545 RAC: 0 |
Each of the Python work units (the only ones available at the moment) takes 8 GB of memory. They don't actually use that much, but they require it for download. Your machines have 16 GB each, so they could run only two at most. Do you have BOINC set to use 100% of the memory? (As robertmiles said, the OS takes some too. I am not sure whether BOINC takes that into account.) |
StarCastle Joined: 25 Apr 20 Posts: 7 Credit: 1,023,276 RAC: 640 |
OK, so that is the issue, I understand. Thanks a million for the feedback. Time to add more memory! |
WR-HW95 Joined: 5 Jan 06 Posts: 2 Credit: 8,086,818 RAC: 0 |
So, I started to crunch R@h again after a long break. What should the running time be for python 1.03 (Vbox64) units? At the moment I have 3 of those running on an AMD 5900X, and they are now at 99.996% after 57 hours. 2 more units are "Postponed: VM environment needed to be cleaned up." |
Jim1348 Joined: 19 Jan 06 Posts: 881 Credit: 52,257,545 RAC: 0 |
"At the moment I have 3 of those running on an AMD 5900X, and they are now at 99.996% after 57 hours." Congratulations. You have managed to hit both of the problems right off the bat. You could have aborted the 99.996% ones in the first five minutes, when they would show less than 1% CPU usage (I use BoincTasks to check that). And to fix the postponed ones, just reboot. Otherwise, you have to wait about a day for them to restart. You have made much more progress than most. |
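
The memory arithmetic discussed in this thread can be sketched in a few lines of Python. This is only a rough model, not BOINC's actual scheduling logic: the function names and the `os_overhead_gb` parameter are hypothetical, the 7.45 GB reservation is the figure robertmiles reports, and the 2 GB default overhead is an assumption loosely based on his "a gigabyte or two" estimate.

```python
import math

# Reservation per Python (VirtualBox) task, as reported in the thread.
RESERVED_PER_TASK_GB = 7.45


def max_concurrent_vm_tasks(total_ram_gb, boinc_ram_fraction, cpu_threads=None,
                            reserved_per_task_gb=RESERVED_PER_TASK_GB,
                            os_overhead_gb=2.0):
    """Estimate how many VM tasks fit in memory (illustrative model only).

    Usable memory is taken as the smaller of the share BOINC is allowed to
    use and what is left after the assumed OS/application overhead.
    """
    usable_gb = min(total_ram_gb * boinc_ram_fraction,
                    total_ram_gb - os_overhead_gb)
    by_memory = max(0, math.floor(usable_gb / reserved_per_task_gb))
    if cpu_threads is None:
        return by_memory
    # Each task is single-threaded, so threads cap the count as well.
    return min(by_memory, cpu_threads)


def ram_needed_gb(tasks, reserved_per_task_gb=RESERVED_PER_TASK_GB,
                  os_overhead_gb=2.0):
    """Memory needed to run `tasks` VM tasks at once under the same model."""
    return tasks * reserved_per_task_gb + os_overhead_gb


if __name__ == "__main__":
    # 32 GB machine, 8 threads, BOINC allowed 90% of memory (assumed setting).
    print(max_concurrent_vm_tasks(32, 0.9, cpu_threads=8))
    # 16 GB machine, BOINC allowed 100% of memory, 2 GB assumed overhead.
    print(max_concurrent_vm_tasks(16, 1.0, cpu_threads=6))
    # Memory needed to keep 12 threads busy.
    print(round(ram_needed_gb(12), 2))
```

Under these assumptions a 32 GB machine fits 3 tasks, which matches robertmiles's report, and a 16 GB machine fits only 1 once the assumed overhead is counted (2 if the overhead is ignored, which is Jim1348's upper bound).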
©2024 University of Washington
https://www.bakerlab.org