Problems and Technical Issues with Rosetta@home

Greg_BE
Joined: 30 May 06
Posts: 5664
Credit: 5,711,666
RAC: 1,686
Message 103437 - Posted: 19 Nov 2021, 17:09:20 UTC - in response to Message 103433.  

I give up on trying to get Python
Your app_config.xml file refers to an unknown application 'rosetta_python_projects'. Known applications: 'rosetta'

That's from that other guy's script in the python thread.
If they can't keep 4.2 running then I'm dumping this project after 15 years of crunching.



That message is because you haven't received a Python task yet.


Falconet, I won't get any anyway; they blocked my system.
Go look in my thread, I just responded back to Admin about a quick and dirty fix to a problem they found.
Rather than try to communicate with "volunteers" or make a thread about the problem, they just made a quick and dirty script to block systems that kicked back certain errors.

It seems my system kicked back one of those errors, so I am blocked.
They can't keep 4.2 filled, so it is rather pointless to keep pinging their scheduler for work that is not there and work I cannot access.
ID: 103437 · Rating: 0
ProDigit
Joined: 6 Dec 18
Posts: 27
Credit: 2,718,346
RAC: 0
Message 103448 - Posted: 21 Nov 2021, 1:02:22 UTC

I reattached Rosetta after I stopped getting any WUs,
not only on my Android device but also on my Intel x86 units.
ID: 103448 · Rating: 0
robertmiles
Joined: 16 Jun 08
Posts: 1225
Credit: 13,870,115
RAC: 2,772
Message 103449 - Posted: 21 Nov 2021, 1:29:25 UTC

Could it be that the Python tasks are only for teaching their younger students how to create tasks?

So far, Rosetta@Home has not mentioned any other purpose for them.

By the way, I tried a Google search for vbox64 instructions, to see how hard it would be to change the amount of reserved memory. I found instructions, but they did not mention the amount of reserved memory. Some of you could ask Oracle to add a feature for controlling the amount of reserved memory.
ID: 103449 · Rating: 0
Jim1348
Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 103450 - Posted: 21 Nov 2021, 1:47:47 UTC - in response to Message 103449.  

Could it be that the Python tasks are only for teaching their younger students how to create tasks?

The only thing that I recall is that one of their experimenters was using them, and I don't know for what purpose.
It was not exactly a ringing endorsement that they would be widely used as a basis for new work, as we might hope.

If anyone can find a mention of a use, maybe they could post it.
ID: 103450 · Rating: 0
dezajner
Joined: 25 Dec 07
Posts: 1
Credit: 10,547,932
RAC: 0
Message 103451 - Posted: 21 Nov 2021, 3:22:09 UTC
Last modified: 21 Nov 2021, 3:47:06 UTC

Hi guys,

I don't know if this is the right thread, but I have had a problem for two days: I am not receiving the usual Rosetta units. I guess this means a definitive transition to units that run under VirtualBox? And here comes my problem: I installed BOINC with VirtualBox and it downloaded about 20 units. Unfortunately, it runs only 4 at a time, and the other 16 sit as ready to start. I have allocated absolutely all resources to Rosetta: 12 threads and 18GB of free RAM. What can I do to make it run on all 12 threads?

Since each process is now a virtual machine, I checked in VirtualBox: each machine has 6GB reserved, so that adds up: 6x4 = 24GB, and the next one does not fit because the system and my other apps use memory too. Why as much as 6GB per Linux image? Can this be set globally somewhere to 4GB per machine? It is a massacre that you now need 128GB of RAM to use all threads on a stupid 5600X ...

Thank you in advance for your help.
Mirek
ID: 103451 · Rating: 0
robertmiles
Joined: 16 Jun 08
Posts: 1225
Credit: 13,870,115
RAC: 2,772
Message 103452 - Posted: 21 Nov 2021, 4:20:38 UTC - in response to Message 103451.  
Last modified: 21 Nov 2021, 5:15:16 UTC

Hi guys,

I don't know if this is the right thread, but I have had a problem for two days: I am not receiving the usual Rosetta units. I guess this means a definitive transition to units that run under VirtualBox? And here comes my problem: I installed BOINC with VirtualBox and it downloaded about 20 units. Unfortunately, it runs only 4 at a time, and the other 16 sit as ready to start. I have allocated absolutely all resources to Rosetta: 12 threads and 18GB of free RAM. What can I do to make it run on all 12 threads?

Since each process is now a virtual machine, I checked in VirtualBox: each machine has 6GB reserved, so that adds up: 6x4 = 24GB, and the next one does not fit because the system and my other apps use memory too. Why as much as 6GB per Linux image? Can this be set globally somewhere to 4GB per machine? It is a massacre that you now need 128GB of RAM to use all threads on a stupid 5600X ...

Thank you in advance for your help.
Mirek

The Python Rosetta@Home tasks now available reserve 7.45 GB of memory each, so you'll need that much memory per task plus a gigabyte or two for things outside the tasks. The tasks actually use closer to 100 MB, though.

The usual Rosetta tasks are now in high demand from computers that don't have that much memory available, so you'll get rather few of them.

You can join into the effort to persuade the Oracle company that provides VirtualBox to add a few features such as the ability to set much lower amounts of memory reserved.

I was able to increase the number of Python tasks that would run on my computer at the same time from 1 to 3 by increasing the fraction of the total memory BOINC could use, and then also increasing the fraction of the total disk space BOINC could use. It still won't use all 8 of the virtual cores I allow BOINC to use, since the computer has only 32 GB of main memory. I'm thinking of buying more memory if this problem lasts long enough, but I'm not in a hurry to do so.
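
The memory arithmetic in this post can be sketched in a few lines of Python. This is only a rough estimate, not how BOINC itself schedules work: the function names, the 0.9 memory fraction, and the 2 GB allowance for everything outside the tasks are assumptions chosen for illustration. The 7.45 GB reservation per task is the figure quoted above.

```python
import math

# Reservation per Python task, as quoted in the post above.
RESERVED_PER_TASK_GB = 7.45


def max_concurrent_tasks(total_ram_gb, boinc_fraction,
                         reserved_per_task_gb=RESERVED_PER_TASK_GB):
    """Estimate how many tasks fit in the RAM BOINC is allowed to use.

    boinc_fraction is the share of total memory granted to BOINC
    (hypothetical input; take it from your own computing preferences).
    """
    usable_gb = total_ram_gb * boinc_fraction
    return math.floor(usable_gb / reserved_per_task_gb)


def ram_needed_gb(tasks, reserved_per_task_gb=RESERVED_PER_TASK_GB,
                  other_gb=2.0):
    """Estimate the RAM needed to run `tasks` at once.

    other_gb is an assumed allowance for the OS and other programs.
    """
    return tasks * reserved_per_task_gb + other_gb


print(max_concurrent_tasks(32, 0.9))  # 3
print(max_concurrent_tasks(16, 0.9))  # 1
print(round(ram_needed_gb(8), 1))     # estimate for 8 tasks at once
```

With 32 GB and 90% granted to BOINC, the estimate gives 3 tasks, which is in line with the count reported above. The real client may apply further limits (disk space, for example) that this sketch ignores.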
ID: 103452 · Rating: 0
Tomcat雄猫
Joined: 20 Dec 14
Posts: 180
Credit: 5,386,173
RAC: 1,609
Message 103453 - Posted: 21 Nov 2021, 4:30:34 UTC - in response to Message 103452.  
Last modified: 21 Nov 2021, 4:33:10 UTC

I've always had this problem, but it's far worse with python tasks.

aaam-NHM_pp-mTIQ_pp-SAR-AMACBEN2_pp_6_2539600_1_0

Instead of running this task, BOINC keeps downloading new Rosetta tasks and running those. Look at all the tasks on that computer and you'll see this has nothing to do with queue size. It's almost as if BOINC periodically forgets a task exists. It's worse with Python tasks because BOINC will download the same number of tasks as the available threads but will be able to run at most 2 at a time. It's thus far more common to get clogs and "forgotten" tasks.
ID: 103453 · Rating: 0
Jim1348
Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 103454 - Posted: 21 Nov 2021, 5:03:19 UTC - in response to Message 103453.  

Instead of running this task, BOINC keeps downloading new Rosetta tasks and running those.

You probably have an app_config.xml file, with a project_max_concurrent entry. Get rid of it. BOINC has a bug.
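
To check whether an app_config.xml actually contains the entry mentioned here, something like the following sketch could be used. It relies only on Python's standard xml.etree module. The sample file content and the numbers in it are made up for illustration, and whether removing the entry cures the symptom is the poster's suggestion, not something this code verifies.

```python
import xml.etree.ElementTree as ET


def has_project_max_concurrent(app_config_xml: str) -> bool:
    """Return True if the config has a top-level <project_max_concurrent>."""
    root = ET.fromstring(app_config_xml)
    return root.find("project_max_concurrent") is not None


# Hypothetical file content; the values are illustrative only.
SAMPLE = """<app_config>
  <app>
    <name>rosetta</name>
    <max_concurrent>4</max_concurrent>
  </app>
  <project_max_concurrent>6</project_max_concurrent>
</app_config>"""

print(has_project_max_concurrent(SAMPLE))  # True
```

For a real file, read its text from the project's directory and pass it to the function.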
ID: 103454 · Rating: 0
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1494
Credit: 14,713,799
RAC: 15,870
Message 103455 - Posted: 21 Nov 2021, 5:13:11 UTC - in response to Message 103454.  

Instead of running this task, BOINC keeps downloading new Rosetta tasks and running those.

You probably have an app_config.xml file, with a project_max_concurrent entry. Get rid of it. BOINC has a bug.
Nah, that's where it keeps downloading more work over & over again ignoring deadlines & cache settings.
I suspect this one will be a Task that in the BOINC Manager will show as running, but if you look in Task Manager it will show 0 CPU usage. Some fail to start, while others start but never actually finish.
Grant
Darwin NT
ID: 103455 · Rating: 0
Jim1348
Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 103456 - Posted: 21 Nov 2021, 5:17:46 UTC - in response to Message 103455.  
Last modified: 21 Nov 2021, 5:34:43 UTC

Nah, that's where it keeps downloading more work over & over again ignoring deadlines & cache settings.
I suspect this one will be a Task that in the BOINC Manager will show as running, but if you look in Task Manager it will show 0 CPU usage. Some fail to start, while others start but never actually finish.

That is the app_config bug. The 0 CPU usage bug is not actually 0, just very low. It will actually occupy a slot and you won't get one to replace it until you abort it.

But I have seen Rosetta download too many pythons even without either problem. So it is not all that reliable in the best of times. Just yesterday I had to set NNW (no new work) after it had downloaded three days nine hours of work, though I had set the buffer to 1 + 1.5 days, did not see any 0 resource ones, and don't have an app_config any longer either.

It may just be that the estimates are off, though they look OK at eight hours, which is actually a bit long. They usually run six hours or less on my machine. But who knows what value it is actually using. It may not be the one displayed.
ID: 103456 · Rating: 0
Falconet
Joined: 9 Mar 09
Posts: 350
Credit: 1,017,068
RAC: 183
Message 103457 - Posted: 21 Nov 2021, 9:45:49 UTC - in response to Message 103450.  

Could it be that the Python tasks are only for teaching their younger students how to create tasks?

The only thing that I recall is that one of their experimenters was using them, and I don't know for what purpose.
It was not exactly a ringing endorsement that they would be widely used as a basis for new work, as we might hope.

If anyone can find a mention of a use, maybe they could post it.



I believe this is what you are referring to.
ID: 103457 · Rating: 0
Jim1348
Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 103458 - Posted: 21 Nov 2021, 13:52:27 UTC - in response to Message 103457.  
Last modified: 21 Nov 2021, 14:01:22 UTC

Thanks. That looks much more promising.
I have ordered more memory, and will put another large machine on it if it is stable. But 128 GB may cause problems. You have to be careful.

PS: I know I will still have at least the "Vm job unmanageable" problem, which prevents the further download of any work unit until it times out or you reboot. So I am going to try an automatic reboot.
https://askubuntu.com/questions/13730/how-can-i-schedule-a-nightly-reboot

But the first method shown did not work.
Using sudo gedit /etc/crontab
00 6 * * * root reboot
looks more promising.

And I am hoping that they will solve the 0 CPU jobs, whenever that will be.
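
For readers unfamiliar with the /etc/crontab entry quoted above: a system crontab line has five time fields, then a user, then the command. The sketch below just splits such a line into named fields so the schedule is easier to read. The class and function names are invented for this example, and it does not validate the field values.

```python
from typing import NamedTuple


class SystemCronEntry(NamedTuple):
    minute: str
    hour: str
    day_of_month: str
    month: str
    day_of_week: str
    user: str
    command: str


def parse_system_crontab_line(line: str) -> SystemCronEntry:
    """Split one /etc/crontab line into its seven fields.

    The command keeps any spaces it contains, because splitting
    stops after the sixth field.
    """
    parts = line.split(None, 6)
    if len(parts) != 7:
        raise ValueError("expected 5 time fields, a user, and a command")
    return SystemCronEntry(*parts)


entry = parse_system_crontab_line("00 6 * * * root reboot")
print(entry.hour, entry.minute, entry.user, entry.command)  # 6 00 root reboot
```

Read this way, the quoted entry runs reboot as root at 06:00 every day.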
ID: 103458 · Rating: 0
StarCastle
Joined: 25 Apr 20
Posts: 7
Credit: 789,365
RAC: 710
Message 103460 - Posted: 22 Nov 2021, 2:35:27 UTC

Even though I have the BOINC client set to use up to 6 cores, the Rosetta python project 1.03 tasks only ever run single-threaded, which reduces the amount of work the system can do.

Is this a config issue?
ID: 103460 · Rating: 0
Jim1348
Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 103461 - Posted: 22 Nov 2021, 2:38:43 UTC - in response to Message 103460.  
Last modified: 22 Nov 2021, 2:39:12 UTC

No, they are single-threaded. Each work unit occupies one virtual core.
ID: 103461 · Rating: 0
StarCastle
Joined: 25 Apr 20
Posts: 7
Credit: 789,365
RAC: 710
Message 103462 - Posted: 22 Nov 2021, 2:41:46 UTC - in response to Message 103461.  

But I have 6 virtual cores available so why are there not 6 tasks running at the same time?

I don't have this issue with other Projects so it is not a system issue
ID: 103462 · Rating: 0
robertmiles
Joined: 16 Jun 08
Posts: 1225
Credit: 13,870,115
RAC: 2,772
Message 103463 - Posted: 22 Nov 2021, 2:44:44 UTC - in response to Message 103460.  

Even though I have the BOINC client set to use up to 6 cores, the Rosetta python project 1.03 tasks only ever run single-threaded, which reduces the amount of work the system can do.

Is this a config issue?

No, it is due to the very large amount of memory each Python task reserves, 7.45 GB. They seldom actually use more than 100 MB, but that's not as important.

Computers with only 16 GB of memory cannot fit in two of those tasks and also the other programs needed to keep BOINC running.
ID: 103463 · Rating: 0
Jim1348
Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 103464 - Posted: 22 Nov 2021, 2:44:57 UTC - in response to Message 103462.  
Last modified: 22 Nov 2021, 2:47:37 UTC

Each of the python work units (the only ones available at the moment) takes 8 GB of memory. They don't actually use that much, but they require it for download.
Your machines have 16 GB each, so they could run only two at most. Do you have BOINC set to use 100% of the memory?
(As robertmiles said, the OS takes some too. I am not sure whether BOINC takes that into account.)
ID: 103464 · Rating: 0
StarCastle
Joined: 25 Apr 20
Posts: 7
Credit: 789,365
RAC: 710
Message 103465 - Posted: 22 Nov 2021, 2:46:58 UTC - in response to Message 103464.  

OK, so that is the issue, I understand.

Thanks a million for the feedback.

Time to add more memory!
ID: 103465 · Rating: 0
WR-HW95
Joined: 5 Jan 06
Posts: 2
Credit: 8,086,818
RAC: 0
Message 103468 - Posted: 22 Nov 2021, 15:29:20 UTC

So, I started crunching R@h again after a long break.
What should the running time be for python 1.03 (Vbox64) units?
At the moment I have 3 of those running on an AMD 5900X, and they are now at 99.996% after 57 hours.
2 more units are "Postponed:VM environment needed to be cleaned up."
ID: 103468 · Rating: 0
Jim1348
Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 103469 - Posted: 22 Nov 2021, 16:15:41 UTC - in response to Message 103468.  

At the moment I have 3 of those running on an AMD 5900X, and they are now at 99.996% after 57 hours.
2 more units are "Postponed:VM environment needed to be cleaned up."

Congratulations. You have managed to hit both of the problems right off the bat.

You could have aborted the 99.996% ones in the first five minutes, when they would show less than 1% CPU usage (I use BoincTasks to check that).
And to fix the postponed ones, just reboot. Otherwise, you have to wait about a day for them to restart.

You have made much more progress than most.
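
The rule of thumb in this reply (a task that still shows under 1% CPU usage after about five minutes is probably stuck) can be written down as a small check. Everything here is a sketch: the TaskSample type, the task names, and the readings are hypothetical, and the numbers would have to come from whatever monitoring tool you use. The thresholds are the ones mentioned above.

```python
from dataclasses import dataclass


@dataclass
class TaskSample:
    name: str               # hypothetical task label
    elapsed_minutes: float  # how long the task has been running
    cpu_percent: float      # CPU usage reported by your monitoring tool


def likely_stalled(sample, min_elapsed_minutes=5.0, cpu_threshold_percent=1.0):
    """Flag a task that has run long enough but uses almost no CPU."""
    return (sample.elapsed_minutes >= min_elapsed_minutes
            and sample.cpu_percent < cpu_threshold_percent)


# Made-up readings for illustration.
samples = [
    TaskSample("python_task_a", elapsed_minutes=12.0, cpu_percent=0.3),
    TaskSample("python_task_b", elapsed_minutes=12.0, cpu_percent=97.0),
    TaskSample("python_task_c", elapsed_minutes=2.0, cpu_percent=0.0),
]

for s in samples:
    print(s.name, likely_stalled(s))
```

A task flagged this way is only a candidate for aborting; a task that has just started can legitimately show low CPU usage, which is why the elapsed-time check comes first.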
ID: 103469 · Rating: 0

©2024 University of Washington
https://www.bakerlab.org