Problems and Technical Issues with Rosetta@home

Mr P Hucker
Joined: 12 Aug 06
Posts: 1479
Credit: 7,259,548
RAC: 10,935
Message 107024 - Posted: 2 Oct 2022, 9:30:50 UTC - in response to Message 107022.  

Hello, these are from me. I thought I'd double check how things looked on the forums after submitting such a massive run.

This is part of a massive sampling of small cyclic peptides generated using some pretty cool tweaks to Alphafold 2. The term hallucinated refers to protein structures generated using this method: https://www.nature.com/articles/s41586-021-04184-w

These jobs are generating energy landscapes for structurally diverse cluster centers. I'm super excited about these results and really appreciate the incredible amount of computation available from Rosetta@home. The info we get from this will help us put together a massive set of peptide backbones for binder design projects in the future.

I hope to be submitting more... Assuming I'm not breaking any rules by monopolizing this resource.

Thank you all!

-Stephen
Thank you for taking the time to come in here and speak to us. We like to know more about the science our machines are running. And submit as much as you like, Rosetta has been pretty quiet for years. I'll stick my 10 machines on it right away.

Sid Celery
Joined: 11 Feb 08
Posts: 1905
Credit: 35,167,433
RAC: 9,651
Message 107025 - Posted: 2 Oct 2022, 9:39:19 UTC - in response to Message 107023.  

I hope to be submitting more... Assuming I'm not breaking any rules by monopolizing this resource.
Nope.
Many of the systems that used to process Rosetta work have moved to other projects because there has been no Rosetta 4.20 work here for some time. Very few systems met the requirements of the Python tasks, and many of those that did chose not to run them, because problems with the application and some tasks often needed manual intervention to sort out.

The more Rosetta 4.20 work there is, the more people will come to process it, and the sooner it will be returned.

And are running successfully, fully and with no errors on Intel, AMD and Android as well, which can't be said for everything.

However large the batch submitted, the appetite to run them is huge.
In the past, even if the queue was 4 times as large, there'd be concern it was running a little low.

Mr P Hucker
Joined: 12 Aug 06
Posts: 1479
Credit: 7,259,548
RAC: 10,935
Message 107026 - Posted: 2 Oct 2022, 9:43:07 UTC - in response to Message 107025.  

And are running successfully, fully and with no errors on Intel, AMD and Android as well, which can't be said for everything.
My 2 Androids (OS 7 and 11) are getting no tasks.

Greg_BE
Joined: 30 May 06
Posts: 5652
Credit: 5,622,096
RAC: 32
Message 107029 - Posted: 2 Oct 2022, 13:45:02 UTC - in response to Message 107022.  
Last modified: 2 Oct 2022, 13:45:26 UTC

Bring anything you want done on a PC here; I will be glad to help process it for you.
I take it these are complex models, as they do not run as fast as the old stuff.
At 1 hr 50 min run time, CPU time is just over 30 minutes and CPU % usage is down in the 30s. Also a bit odd. Maybe my system?


Hello, these are from me. I thought I'd double check how things looked on the forums after submitting such a massive run.

This is part of a massive sampling of small cyclic peptides generated using some pretty cool tweaks to Alphafold 2. The term hallucinated refers to protein structures generated using this method: https://www.nature.com/articles/s41586-021-04184-w

These jobs are generating energy landscapes for structurally diverse cluster centers. I'm super excited about these results and really appreciate the incredible amount of computation available from Rosetta@home. The info we get from this will help us put together a massive set of peptide backbones for binder design projects in the future.

I hope to be submitting more... Assuming I'm not breaking any rules by monopolizing this resource.

Thank you all!

-Stephen

Mr P Hucker
Joined: 12 Aug 06
Posts: 1479
Credit: 7,259,548
RAC: 10,935
Message 107030 - Posted: 2 Oct 2022, 14:01:41 UTC - in response to Message 107029.  
Last modified: 2 Oct 2022, 14:06:42 UTC

At 1 hr 50 min run time, CPU time is just over 30 minutes and CPU % usage is down in the 30s. Also a bit odd. Maybe my system?
Something is up with your system; all 8 of mine are running them at 100% CPU usage. Is your CPU doing other things? What's the total CPU usage?

Your previous Rosetta 4.20 tasks are showing the correct flops for your CPU.

kotenok2000
Joined: 22 Feb 11
Posts: 183
Credit: 103,927
RAC: 2
Message 107031 - Posted: 2 Oct 2022, 14:04:13 UTC - in response to Message 107030.  

My Task Manager showed the CPU loaded at 50% when the voltage regulation module on my board thermally throttled and dropped the voltage and CPU frequency.

Mr P Hucker
Joined: 12 Aug 06
Posts: 1479
Credit: 7,259,548
RAC: 10,935
Message 107032 - Posted: 2 Oct 2022, 14:07:38 UTC - in response to Message 107031.  

My Task Manager showed the CPU loaded at 50% when the voltage regulation module on my board thermally throttled and dropped the voltage and CPU frequency.
I do not permit throttling, I add more fans! 7 of my 8 systems are in the garage and can make as much noise as they like!

Sid Celery
Joined: 11 Feb 08
Posts: 1905
Credit: 35,167,433
RAC: 9,651
Message 107033 - Posted: 2 Oct 2022, 15:23:24 UTC - in response to Message 107026.  

And are running successfully, fully and with no errors on Intel, AMD and Android as well, which can't be said for everything.
My 2 Androids (OS 7 and 11) are getting no tasks.

I was sure I had a few earlier in the week, but my host isn't showing recent completions so you may be right

Sid Celery
Joined: 11 Feb 08
Posts: 1905
Credit: 35,167,433
RAC: 9,651
Message 107034 - Posted: 2 Oct 2022, 15:29:45 UTC - in response to Message 107030.  
Last modified: 2 Oct 2022, 15:30:26 UTC

At 1 hr 50 min run time, CPU time is just over 30 minutes and CPU % usage is down in the 30s. Also a bit odd. Maybe my system?
Something is up with your system; all 8 of mine are running them at 100% CPU usage. Is your CPU doing other things? What's the total CPU usage?

Your previous Rosetta 4.20 tasks are showing the correct flops for your CPU.

Mine are all running 100%, but typically 5h 2m CPU time, 5h 25m elapsed time, but it's the PC I'm regularly using for all sorts of things.
A laptop I'm not using is at 5h 34m v 5h 43m
Maybe a little more drop-off than usual but nothing surprising or concerning
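
For anyone wanting to put a number on that drop-off, here is a rough sketch (not anything BOINC itself provides) that divides CPU time by elapsed time. The helper names are made up for illustration, and the 30-minute figure for the slow task is only the approximate value quoted above.

```python
def to_seconds(hours: int, minutes: int, seconds: int = 0) -> int:
    """Convert an h/m/s duration to seconds."""
    return hours * 3600 + minutes * 60 + seconds


def cpu_efficiency(cpu_seconds: int, elapsed_seconds: int) -> float:
    """CPU time as a percentage of elapsed (wall-clock) time."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return 100.0 * cpu_seconds / elapsed_seconds


# Figures quoted in this thread.
desktop = cpu_efficiency(to_seconds(5, 2), to_seconds(5, 25))
laptop = cpu_efficiency(to_seconds(5, 34), to_seconds(5, 43))
# Approximate: "just over 30 minutes" of CPU in 1 hr 50 min of run time.
slow_task = cpu_efficiency(to_seconds(0, 30), to_seconds(1, 50))

print(f"desktop:   {desktop:.1f}%")    # 92.9%
print(f"laptop:    {laptop:.1f}%")     # 97.4%
print(f"slow task: {slow_task:.1f}%")  # 27.3%
```

Anything in the 90s is the usual small loss to other activity on the machine; a task down around 30-50% is being starved of CPU by something.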

Greg_BE
Joined: 30 May 06
Posts: 5652
Credit: 5,622,096
RAC: 32
Message 107035 - Posted: 2 Oct 2022, 16:05:04 UTC - in response to Message 107030.  
Last modified: 2 Oct 2022, 16:19:13 UTC

That's odd.
I have 8 physical cores, 16 virtual.
BOINC is told to use 95% of them, i.e. 15 cores, in order to leave one free for GPU and general-purpose things.
We have had this discussion in the past already.

I have FAH running, but GPU only.

Now BOINC has allocated 4 new slots for working on these new tasks.

The first value for each task is from when I started composing this post; the second is from 5 minutes later.

Task           CPU %   Clock/CPU time (+/-)   5 minutes later
#1 (original)  39.8    4 hrs / 1:38
#2             47.9    1:13 / 35              went up to 48
#3             47.5    57 / 27                dropped to 48
#4             51.5    54 / 27                went up to 52.2
#5             50.9    28 / 14                dropped to 48

Only using 32% of all physical memory
Task Manager shows the 4.2 tasks each using only 2-5% CPU, out of the 66-70% of total CPU that BOINC is using.

Mr P Hucker
Joined: 12 Aug 06
Posts: 1479
Credit: 7,259,548
RAC: 10,935
Message 107036 - Posted: 2 Oct 2022, 16:19:20 UTC - in response to Message 107035.  
Last modified: 2 Oct 2022, 16:20:24 UTC

To drop to 50% is very odd indeed. So you have 15 Rosetta 4.2 tasks running at once, yes? And the 16th thread helps the GPU with Folding? Folding hasn't stolen the CPU, has it? Folding often does whatever it likes and ignores what you say. I do similar things and get almost 100% usage in Rosetta. Use the task manager to find out what % of your CPU each program is getting. With 16 threads, you should be getting about 6% usage for each Rosetta task.

kotenok2000
Joined: 22 Feb 11
Posts: 183
Credit: 103,927
RAC: 2
Message 107037 - Posted: 2 Oct 2022, 16:22:15 UTC

6.25

Mr P Hucker
Joined: 12 Aug 06
Posts: 1479
Credit: 7,259,548
RAC: 10,935
Message 107038 - Posted: 2 Oct 2022, 16:42:38 UTC - in response to Message 107037.  

6.25
I said about 6 because the OS etc etc steals a bit.
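
The arithmetic behind both numbers, as a small sketch. The function names are invented for illustration, and the truncating division is only an assumption about how the "use at most N% of the CPUs" preference turns into a whole number of cores; it is not taken from the BOINC source.

```python
def per_task_share(logical_cpus: int) -> float:
    """Share of total CPU (as Task Manager reports it) that one
    single-threaded task gets when it fully occupies one logical CPU."""
    return 100.0 / logical_cpus


def usable_cpus(logical_cpus: int, use_percent: int) -> int:
    """Whole number of CPUs left after applying a 'use at most N%' limit.
    Assumption: the result is truncated, with a floor of 1."""
    return max(1, logical_cpus * use_percent // 100)


print(per_task_share(16))   # 6.25
print(usable_cpus(16, 95))  # 15
```

So on a 16-thread CPU each healthy Rosetta task should show close to 6.25% in the task manager, a little less once the OS takes its cut. A task showing 2-5% is not getting a full thread.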

Greg_BE
Joined: 30 May 06
Posts: 5652
Credit: 5,622,096
RAC: 32
Message 107039 - Posted: 2 Oct 2022, 17:29:28 UTC - in response to Message 107036.  
Last modified: 2 Oct 2022, 17:32:11 UTC

To drop to 50% is very odd indeed. So you have 15 Rosetta 4.2 tasks running at once, yes? And the 16th thread helps the GPU with Folding? Folding hasn't stolen the CPU, has it? Folding often does whatever it likes and ignores what you say. I do similar things and get almost 100% usage in Rosetta. Use the task manager to find out what % of your CPU each program is getting. With 16 threads, you should be getting about 6% usage for each Rosetta task.



Well, I did a deep clean with Wisecare, cleaned up the system and rebooted.
The CPU split is 15 cores for BOINC and 1 core for everything else (GPU support, web browsing, YouTube if I am on it, etc.).

BOINC is now running 1 GPU Grid, 1 Milkyway (1 CPU core for both) and 14 cores of Rosetta 4.2, for a total of 15 cores for BOINC.

Percentages have evened out now on the new work, with the exception of one task that is at 5 hrs clock time with only 2+ hrs of CPU time and only 46% CPU. Its progress is only 38%.
STDERR on these tasks does not offer any information.

2nd task runs at 51% CPU running 2:13/1:08 and 17.7 done
3rd task is 52% 1:58/1:02 and 15.4
4 and 5 show similar behavior

All the rest show normal behavior 30 clock and 20 cpu with 62-65%
I have one running at 94%!

I'm about ready to kill that 5hr one because all the other tasks will finish before it even gets close to finishing. This comes from BoincTasks:

Rosetta@home 4.20 Rosetta 10mer_af_hallucinated_163_6_best_SAVE_ALL_OUT_2918052_367_0 05:15:14 (02:32:10) 48.2 41.242 04:42:02 10/5/2022 1:45:59 PM Running 273.99 MB 249.61 MB DESKTOP-LFM92VN [14] 00:00:12

kotenok2000
Joined: 22 Feb 11
Posts: 183
Credit: 103,927
RAC: 2
Message 107040 - Posted: 2 Oct 2022, 17:34:33 UTC - in response to Message 107039.  

Do you run Python workunits for GPUGrid? Python workunits use several CPU cores to run the Python machine learning environment, so they conflict with CPU workunits.
For GPUGrid you can tail the wrapper_run.out file.
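
If you are on Windows and have no tail command handy, a few lines of Python will do the same job. This is only a sketch: pass it the path to wrapper_run.out yourself, since where that file sits depends on your BOINC data directory and which slot the task is in.

```python
import sys
from collections import deque


def tail(path: str, n: int = 20) -> list[str]:
    """Return the last n lines of a text file, without trailing newlines."""
    with open(path, "r", encoding="utf-8", errors="replace") as fh:
        # deque with maxlen keeps only the most recent n lines as it reads.
        return [line.rstrip("\n") for line in deque(fh, maxlen=n)]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python tail.py <path to wrapper_run.out> [number of lines]
    count = int(sys.argv[2]) if len(sys.argv) > 2 else 20
    for line in tail(sys.argv[1], count):
        print(line)
```

Run it again after a minute or two to see whether the log is still moving.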

Mr P Hucker
Joined: 12 Aug 06
Posts: 1479
Credit: 7,259,548
RAC: 10,935
Message 107041 - Posted: 2 Oct 2022, 17:46:18 UTC - in response to Message 107039.  

There's something terribly wrong with your system if you're getting 16 things getting half a core each when you have 16 cores.

Greg_BE
Joined: 30 May 06
Posts: 5652
Credit: 5,622,096
RAC: 32
Message 107042 - Posted: 2 Oct 2022, 19:26:38 UTC - in response to Message 107040.  
Last modified: 2 Oct 2022, 19:31:29 UTC

Do you run Python workunits for GPUGrid? Python workunits use several CPU cores to run the Python machine learning environment, so they conflict with CPU workunits.
For GPUGrid you can tail the wrapper_run.out file.



Well, that's interesting... In Task Manager there is 1 python process that uses about 29.x% max and goes down to 16%, plus a slew of others (I can't count them, maybe 30 processes) at only 0.1% each. So that grabs what... 31% or something?

I had a look while suspending GPU. While that takes the python processes out of the system resources, the Rosetta@home CPU % in BOINC does not change.

I have WCG running on GPU, but nothing else besides FAH is running GPU.

So it really doesn't look like python is affecting anything.
I'll run the task out and put that project as no new tasks and see what happens.
Due to energy costs, I cannot run this system like I used to, so there will be a delay in getting started again (10 hrs or more, but less than a day).

Mr P Hucker
Joined: 12 Aug 06
Posts: 1479
Credit: 7,259,548
RAC: 10,935
Message 107043 - Posted: 2 Oct 2022, 19:42:35 UTC

Is there any CPU time free in task manager? If yes, something very weird is wrong. If no, get rid of whatever is using it up.

Greg_BE
Joined: 30 May 06
Posts: 5652
Credit: 5,622,096
RAC: 32
Message 107044 - Posted: 2 Oct 2022, 20:34:34 UTC - in response to Message 107043.  

Is there any CPU time free in task manager? If yes, something very weird is wrong. If no, get rid of whatever is using it up.



I killed some non-Windows background stuff that I don't use.
It did not make any difference.
I'll try a "repair" of BOINC and see if that does anything; otherwise tomorrow night (EU time) I'll start up again and see what happens.

Mr P Hucker
Joined: 12 Aug 06
Posts: 1479
Credit: 7,259,548
RAC: 10,935
Message 107045 - Posted: 2 Oct 2022, 20:39:34 UTC

Show us what you see in your task manager. What is the free % CPU usage? What CPU usage does each Rosetta task get? We can't help you if you give no information.



©2022 University of Washington
https://www.bakerlab.org