Problems and Technical Issues with Rosetta@home

Message boards : Number crunching : Problems and Technical Issues with Rosetta@home

Jim1348
Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 103230 - Posted: 13 Nov 2021, 13:33:05 UTC - in response to Message 103229.  

I wish they'd use KVM/QEMU instead of VirtualBox for Linux.

I am sure it would work better, since it can't work worse. It is getting practically impossible to run the pythons. The first problem is "VM job unmanageable" suspensions, which occur on all of my machines no matter what steps I take (mainly limiting cores) to prevent them. You need to either wait a long time or reboot to fix it.

But now the problem is that about half the pythons won't run at all. They get stuck at less than 1% CPU utilization, and I have to abort them.
I am moving away from interventionist projects on my machines, and the pythons are the next ones to go.

mmstick
Joined: 4 Dec 12
Posts: 8
Credit: 606,792
RAC: 0
Message 103231 - Posted: 13 Nov 2021, 14:07:08 UTC - in response to Message 103230.  

I do constantly get the issue of having to abort Python units at 99.996% completion, even on my Ryzen 5700g desktop with 64 GB RAM, which seems to be good enough for running 8 python units simultaneously on each physical core. I tried limiting the number of Python work units to 4, just in case, so that I could run 12 normal tasks in addition to that, but apparently using an app_config.xml to define max_concurrent causes BOINC to repeatedly ask for 12 work units every 30 seconds, so I had to abort that attempt.

Jim1348
Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 103232 - Posted: 13 Nov 2021, 14:19:06 UTC - in response to Message 103231.  

I do constantly get the issue of having to abort Python units at 99.996% completion, even on my Ryzen 5700g desktop with 64 GB RAM, which seems to be good enough for running 8 python units simultaneously on each physical core.

It isn't a problem of memory, and you don't need to go to 99%.
If they are at less than 1% CPU utilization after the first five minutes, you can abort them. I use BoincTasks to monitor that.
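
That rule of thumb can be written down as a small check. This is only a sketch: `looks_stalled` is a hypothetical helper, not part of BOINC or BoincTasks, and it assumes you read the elapsed time and CPU time (both in seconds) off the task's properties yourself.

```python
def looks_stalled(elapsed_s: float, cpu_s: float,
                  min_elapsed_s: float = 300.0,
                  min_utilization: float = 0.01) -> bool:
    """Return True if a task has run at least five minutes at under 1% CPU."""
    if elapsed_s < min_elapsed_s:
        return False  # too early to judge the task
    # CPU utilization = CPU time consumed / wall-clock time elapsed
    return (cpu_s / elapsed_s) < min_utilization


# Five minutes elapsed, two seconds of CPU time: well under 1%.
print(looks_stalled(elapsed_s=300, cpu_s=2))
```

The thresholds are the ones from the post (five minutes, 1%); both are parameters, so they are easy to loosen if a task is slow to start.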

doug
Joined: 28 Mar 20
Posts: 8
Credit: 1,638,060
RAC: 1,418
Message 103233 - Posted: 13 Nov 2021, 16:04:36 UTC - in response to Message 103223.  

Thanks for the reply.

I have not done that, nor have I ever had to do it in the past. I'm running Win10 with all the latest updates. In Task Manager, on the second (Performance) tab, at the bottom with all the CPU info, it says "Virtualization: Enabled". Does that address what you are asking about? If not, do you know where in Windows I can find the info you are asking for?

Thanks.

Doug

Falconet
Joined: 9 Mar 09
Posts: 354
Credit: 1,276,393
RAC: 2,018
Message 103234 - Posted: 13 Nov 2021, 16:43:14 UTC - in response to Message 103233.  
Last modified: 13 Nov 2021, 16:44:38 UTC

Deleted.

Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 103236 - Posted: 13 Nov 2021, 19:39:03 UTC
Last modified: 13 Nov 2021, 19:39:49 UTC

Maybe try what I do for LHC ATLAS, which is a very picky project and has a hard time running on single cores and such.

In the past I wrote an app_config that forced it to run on just 4 cores and 1 task at a time.
Now I can set that in the web preferences of that project.

So maybe you can try that for Python. But since it falls under "Rosetta", it will apply to all tasks from RAH.
That's another stupid thing about this project, and you cannot set this in the web preferences here either.

.clair.
Joined: 2 Jan 07
Posts: 274
Credit: 26,399,595
RAC: 0
Message 103239 - Posted: 13 Nov 2021, 21:52:12 UTC
Last modified: 13 Nov 2021, 22:07:51 UTC

I tried to get RPP to run multithreaded with this app config :-

<app_config>
  <app>
    <name>rosetta_python_projects</name>
  </app>
  <app_version>
    <app_name>rosetta_python_projects</app_name>
    <plan_class>vbox64</plan_class>
    <avg_ncpus>5</avg_ncpus>
  </app_version>
</app_config>

But even though it shows in BOINC Manager as `Running (5 CPUs)`
and each RPP task runs 25 threads in total, the CPU graphs show that it doesn't actually use them, unless the data they are crunching is very linear.
Any ideas as to what else could go in an app_config to force it to run multithreaded?
Or could it be hard coded in the VM not to?
Or am I wasting my time trying :(

I changed it around from the one I use at cosmology@home

<app_config>
  <app>
    <name>camb_boinc2docker</name>
    <max_concurrent>2</max_concurrent>
  </app>
  <app_version>
    <app_name>camb_boinc2docker</app_name>
    <plan_class>vbox64_mt</plan_class>
    <avg_ncpus>7</avg_ncpus>
  </app_version>
</app_config>
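
Either config above can be checked for XML mistakes before it goes into the project folder. A minimal sketch using Python's standard library; `summarize_app_config` is a hypothetical helper name, and it only checks that the file parses and reports what each <app_version> asks for. It says nothing about whether BOINC or the project's VM will honor the settings.

```python
import xml.etree.ElementTree as ET


def summarize_app_config(xml_text: str) -> list[tuple[str, str, float]]:
    """Return (app_name, plan_class, avg_ncpus) for each <app_version> entry."""
    root = ET.fromstring(xml_text.strip())  # raises ET.ParseError if malformed
    if root.tag != "app_config":
        raise ValueError("root element should be <app_config>")
    rows = []
    for version in root.findall("app_version"):
        rows.append((
            version.findtext("app_name", default=""),
            version.findtext("plan_class", default=""),
            float(version.findtext("avg_ncpus", default="1")),
        ))
    return rows


EXAMPLE = """
<app_config>
  <app>
    <name>rosetta_python_projects</name>
  </app>
  <app_version>
    <app_name>rosetta_python_projects</app_name>
    <plan_class>vbox64</plan_class>
    <avg_ncpus>5</avg_ncpus>
  </app_version>
</app_config>
"""

print(summarize_app_config(EXAMPLE))
```

A missing or mistyped closing tag makes `ET.fromstring` raise `ParseError` with a line and column number, which is usually quicker to act on than the message in the event log.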

robertmiles
Joined: 16 Jun 08
Posts: 1234
Credit: 14,338,560
RAC: 2,014
Message 103240 - Posted: 13 Nov 2021, 22:43:41 UTC - in response to Message 103239.  
Last modified: 13 Nov 2021, 22:46:18 UTC

I tried to get RPP to run multithreaded with this app config :-

[snip]

It's rare that you can make a program run multithreaded unless it's written to know how to do so.

Changing the app config file isn't enough if that's all you do.

Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 103241 - Posted: 14 Nov 2021, 0:37:20 UTC
Last modified: 14 Nov 2021, 0:38:58 UTC

<name>rosetta_python_projects</name>

As far as I know, that is an internal name for the type of task.
As far as I know, all tasks fall under "rosetta".

I have not found a way to isolate Python tasks.

.clair.
Joined: 2 Jan 07
Posts: 274
Credit: 26,399,595
RAC: 0
Message 103242 - Posted: 14 Nov 2021, 0:54:20 UTC
Last modified: 14 Nov 2021, 1:46:48 UTC

I decided to have another go at it and give the computer a full reboot [not out the door],
even if it is / was a case of knowing just enough to make a big mess of it.
I did get some XML errors noted in the event log;
I just keep bashing away at it till something happens :)
Well,
it did something . . . . .
I know it sounds like something from a Frankenstein video,
because one of the `vboxheadless.exe` instances in the Win7 Resource Monitor is using 22% of CPU on a 16-core CPU [one core is only 6.25%].
Could someone be mad enough to try it @home and see what happens?
Only new tasks downloaded AFTER the app_config is in place will get the new settings.
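
For anyone comparing numbers, the figures above work out as follows. This is plain arithmetic, assuming Resource Monitor reports a process's CPU use as a percentage of all 16 logical CPUs; `busy_cores` is just an illustrative name.

```python
def busy_cores(process_cpu_percent: float, logical_cpus: int) -> float:
    """Convert a whole-machine CPU percentage into an equivalent number of cores."""
    per_core_percent = 100.0 / logical_cpus  # 6.25% per core on a 16-core machine
    return process_cpu_percent / per_core_percent


# 22% of a 16-core machine is roughly three and a half cores' worth of work.
print(round(busy_cores(22, 16), 2))
```

So a single VM instance at 22% is doing more than one core could, though less than the 5 CPUs the app_config asked for.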

mmstick
Joined: 4 Dec 12
Posts: 8
Credit: 606,792
RAC: 0
Message 103243 - Posted: 14 Nov 2021, 1:11:05 UTC

Using an app_config to set the max_concurrent value will cause your system to endlessly request work until you've fully depleted the server of work units. I don't recommend doing so until this issue is fixed: https://github.com/BOINC/boinc/issues/4322

.clair.
Joined: 2 Jan 07
Posts: 274
Credit: 26,399,595
RAC: 0
Message 103244 - Posted: 14 Nov 2021, 1:32:00 UTC - in response to Message 103243.  
Last modified: 14 Nov 2021, 1:33:02 UTC

Using an app_config to set the max_concurrent value will cause your system to endlessly request work until you've fully depleted the server of work units. I don't recommend doing so until this issue is fixed: https://github.com/BOINC/boinc/issues/4322

I have not run Cosmology@home for several months; the endless work fetch was stopped there by a server-side limit on the number of work units anyone was allowed to have.
I have been reading the threads here on R@H with interest about that work fetch problem

Jim1348
Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 103245 - Posted: 14 Nov 2021, 2:54:04 UTC - in response to Message 103244.  

I have been reading the threads here on R@H with interest about that work fetch problem

I first ran into it several years ago on WCG. More recently, we had a discussion of it on LHC.
https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5720&postid=45308#45308

Also:
https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5726&postid=45384#45384

It has been reported to BOINC.
https://github.com/BOINC/boinc/issues/4322

Sid Celery
Joined: 11 Feb 08
Posts: 2141
Credit: 41,534,176
RAC: 10,708
Message 103250 - Posted: 14 Nov 2021, 5:29:49 UTC

After getting a nudge <cough> from Grant I asked about all this. The reply as follows

There is no server issue failing to create Rosetta 4.20 tasks. We have run out.
All those 2.3 million queued tasks on the front page really are all Python tasks. In their words "a huge queue"

However, regarding the shortage and re-supply of Rosetta 4.20 tasks:

"This will be temporary since there will be many more protein protein interaction Rosetta design jobs loaded into the queue"

That comes across to me like they're preparing them now and it won't be too much longer before we see them, though no actual ETA, nor any clue how many "many" is.

So, don't panic, more work will arrive before too long.

Famous last words...

Also, while people do disappear around the bigger holiday periods like Christmas and New Year, I've often had replies on Saturdays and Sundays, so things do happen at weekends.
Just that I think it's unreasonable to expect that will always be the case.

Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 103251 - Posted: 14 Nov 2021, 9:06:16 UTC - in response to Message 103250.  
Last modified: 14 Nov 2021, 9:07:51 UTC

After getting a nudge <cough> from Grant I asked about all this. The reply as follows

[snip]

If there are so many Python tasks, then why can't I get them?
I've monkeyed around with all the parameters in BOINC and I get nothing.
Right now, due to the last round of monkeying around, the queue for each project is all messed up, so I am playing catch-up.

About the only thing I have not done is remove the project from BOINC, do a system clean and reinstall RAH on BOINC.

Grant (SSSF)
Joined: 28 Mar 20
Posts: 1725
Credit: 18,399,907
RAC: 19,807
Message 103252 - Posted: 14 Nov 2021, 10:43:23 UTC - in response to Message 103251.  

If there are so many Python tasks, then why can't I get them?
One last wild guess- Clean up any left over VMs that are gumming things up.


To completely delete any virtual machine from VirtualBox on Mac, Windows, or Linux, simply do the following:

1. Open VirtualBox and go to the VM VirtualBox Manager screen
2. Select the virtual machine and OS you want to delete (quit the VM if it’s currently active first)
3. Right-click on the virtual machine name in the list and choose “Remove”, or optionally pull down the “Machine” menu and choose “Remove”
4. To completely delete the operating system and virtual machine from VirtualBox, choose “Delete all files” *
5. Repeat with other virtual machines to delete them as needed

* If you choose “Remove only” then the virtual machine is simply removed from the VirtualBox VM manager, but none of the actual files or associated VM, OS, VDI, or anything else is deleted. Thus if you actually want to delete the VM and associated files, choose “Delete all files”
Remove an OS and Delete a Virtual Machine in VirtualBox
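
The same cleanup can be done from the command line with VBoxManage, which ships with VirtualBox: `VBoxManage list vms` prints one line per registered VM in the form `"name" {uuid}`, and `VBoxManage unregistervm <uuid> --delete` unregisters a VM and deletes its files. The sketch below only parses a listing and prints the commands it would run; it does not execute anything. The `boinc_` name prefix is an assumption based on how the VMs show up in BOINC task logs, the sample listing is made up, and the function names are hypothetical.

```python
import re

# One line of `VBoxManage list vms` output looks like: "vm name" {uuid}
VM_LINE = re.compile(r'^"(?P<name>.*)" \{(?P<uuid>[0-9a-fA-F-]+)\}$')


def parse_vm_list(listing: str) -> list[tuple[str, str]]:
    """Return (name, uuid) pairs from `VBoxManage list vms` output."""
    vms = []
    for line in listing.splitlines():
        match = VM_LINE.match(line.strip())
        if match:
            vms.append((match.group("name"), match.group("uuid")))
    return vms


def removal_commands(listing: str, prefix: str = "boinc_") -> list[list[str]]:
    """Build (but do not run) one unregister-and-delete command per matching VM."""
    return [
        ["VBoxManage", "unregistervm", uuid, "--delete"]
        for name, uuid in parse_vm_list(listing)
        if name.startswith(prefix)  # assumed naming of BOINC-created VMs
    ]


# Made-up sample listing: one BOINC-style VM and one personal VM.
SAMPLE = (
    '"boinc_abea2fc7e66074d6" {11111111-2222-3333-4444-555555555555}\n'
    '"My Ubuntu" {aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee}'
)

for command in removal_commands(SAMPLE):
    print(" ".join(command))
```

Printing the commands first and running them by hand keeps any personal VMs out of harm's way if the prefix guess turns out to be wrong.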
Grant
Darwin NT

Bryn Mawr
Joined: 26 Dec 18
Posts: 399
Credit: 12,294,748
RAC: 6,222
Message 103253 - Posted: 14 Nov 2021, 11:04:25 UTC - in response to Message 103250.  

After getting a nudge <cough> from Grant I asked about all this. The reply as follows

[snip]

Many thanks for the update. At the rate it’s going it will take years to get through a couple of million Python tasks so the best of luck with that, I’ll wait for the normal Rosetta tasks.

.clair.
Joined: 2 Jan 07
Posts: 274
Credit: 26,399,595
RAC: 0
Message 103256 - Posted: 14 Nov 2021, 12:11:00 UTC
Last modified: 14 Nov 2021, 12:16:26 UTC

Have a look at the VM CPU count; my Frankenstein app_config for RPP does do `something`:
https://boinc.bakerlab.org/rosetta/result.php?resultid=1449470255
Have a look at my valid RPP tasks [I didn't get it working on all of them], or whatever it is doing:
https://boinc.bakerlab.org/rosetta/results.php?userid=139198&offset=0&show_names=0&state=4&appid=9

<stderr_txt>
-snip-
2021-11-14 04:18:14 (1236): Create VM. (boinc_abea2fc7e66074d6, slot#8)
2021-11-14 04:18:14 (1236): Setting Memory Size for VM. (6144MB)
2021-11-14 04:18:15 (1236): Setting CPU Count for VM. (5)
2021-11-14 04:18:15 (1236): Setting Chipset Options for VM.
2021-11-14 04:18:15 (1236): Setting Boot Options for VM.
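
To check a batch of results without reading each log by eye, the two settings can be pulled out of the stderr text with a couple of regular expressions. A minimal sketch; `vm_settings` is a hypothetical helper, and the patterns assume the log wording stays exactly as in the excerpt above.

```python
import re


def vm_settings(stderr_text: str) -> dict:
    """Extract the VM memory size (MB) and CPU count from a vboxwrapper stderr log."""
    memory = re.search(r"Setting Memory Size for VM\. \((\d+)MB\)", stderr_text)
    cpus = re.search(r"Setting CPU Count for VM\. \((\d+)\)", stderr_text)
    return {
        "memory_mb": int(memory.group(1)) if memory else None,
        "cpu_count": int(cpus.group(1)) if cpus else None,
    }


LOG = """\
2021-11-14 04:18:14 (1236): Create VM. (boinc_abea2fc7e66074d6, slot#8)
2021-11-14 04:18:14 (1236): Setting Memory Size for VM. (6144MB)
2021-11-14 04:18:15 (1236): Setting CPU Count for VM. (5)
"""

print(vm_settings(LOG))
```

A task created before the app_config was in place should report a different CPU count, which makes the two kinds of result easy to tell apart.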

Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 103257 - Posted: 14 Nov 2021, 13:16:28 UTC - in response to Message 103252.  

One last wild guess- Clean up any left over VMs that are gumming things up.

[snip]

I knew how to remove VMs. Maybe I will remove Oracle VirtualBox (via Revo Uninstaller), remove Rosetta from the list, clean my drive with CCleaner and Wise Care 365, and then reinstall VirtualBox and add Rosetta back to the list. If this fails, I have no freaking idea what to do other than complete all my work, uninstall BOINC (via Revo), clean the drive and reinstall it. Then it's fresh.

Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 103259 - Posted: 14 Nov 2021, 15:01:14 UTC
Last modified: 14 Nov 2021, 15:04:28 UTC

11/14/2021 3:58:02 PM | Rosetta@home | update requested by user
11/14/2021 3:58:06 PM | Rosetta@home | Sending scheduler request: Requested by user.
11/14/2021 3:58:06 PM | Rosetta@home | Requesting new tasks for CPU
11/14/2021 3:58:08 PM | Rosetta@home | Scheduler request completed: got 0 new tasks
11/14/2021 3:58:08 PM | Rosetta@home | No tasks sent
11/14/2021 3:58:08 PM | Rosetta@home | Project requested delay of 31 seconds


Tasks ready to send 5000
rosetta python projects 5000 23096

RAH was removed and added back.
VirtualBox was removed, registry cleaned, drive cleaned, rebooted after install.
No VMs are active at this time. No LHC running.
No app_config.
No cc_config limiting RAH.
Just a GPU restriction to put PrimeGrid on my 1080.
Preferences set for "Home"; all settings are default.

There is NOTHING on my end restricting RAH from doing anything.
Yet all it wants is 4.20.
So what the %$#&%& is wrong with RAH?

©2024 University of Washington
https://www.bakerlab.org