Posts by Greg_BE

21) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 107554)
Posted 22 Oct 2022 by Profile Greg_BE
Post:
Now you have seen every angle and data point as to how the GPUs operate with MOO and FAH working simultaneously.
There is no OC on this. Forgot to turn that on. So this is default mode.
22) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 107553)
Posted 22 Oct 2022 by Profile Greg_BE
Post:



23) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 107549)
Posted 22 Oct 2022 by Profile Greg_BE
Post:
Rule 1: If heat is pouring off the chip, it's doing a lot of work.
The fans get their workout every day. At the moment they are just doing the easy stuff with FAH. Apparently TN needs all of the CPU, though I suppose I can write a script to put that down to 14 or something.
24) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 107546)
Posted 22 Oct 2022 by Profile Greg_BE
Post:
But MOO is sharing the cards with FAH on my system.
So maybe that downgrades the time a bit?

And I don't have the fastest cards.
A 1080 plain and a 1050TI
25) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 107539)
Posted 22 Oct 2022 by Profile Greg_BE
Post:
So what's happening then, in plain terms, is that the MOO task is huge and complex and splits itself among the two GPUs and the CPUs to be more efficient in getting the work done?
26) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 107534)
Posted 21 Oct 2022 by Profile Greg_BE
Post:
There are generally two ways to distribute computation across multiple devices:

Data parallelism, where a single model gets replicated on multiple devices or multiple machines. Each of them processes different batches of data, then they merge their results. There exist many variants of this setup, that differ in how the different model replicas merge results, in whether they stay in sync at every batch or whether they are more loosely coupled, etc.

Model parallelism, where different parts of a single model run on different devices, processing a single batch of data together. This works best with models that have a naturally-parallel architecture, such as models that feature multiple branches.

This guide focuses on data parallelism, in particular synchronous data parallelism, where the different replicas of the model stay in sync after each batch they process. Synchronicity keeps the model convergence behavior identical to what you would see for single-device training.

Specifically, this guide teaches you how to use the tf.distribute API to train Keras models on multiple GPUs, with minimal changes to your code, in the following two setups:

On multiple GPUs (typically 2 to 8) installed on a single machine (single host, multi-device training). This is the most common setup for researchers and small-scale industry workflows.
On a cluster of many machines, each hosting one or multiple GPUs (multi-worker distributed training). This is a good setup for large-scale industry workflows, e.g. training high-resolution image classification models on tens of millions of images using 20-100 GPUs.

More at: https://keras.io/guides/distributed_training/
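To make the quoted description concrete, here is a toy sketch of synchronous data parallelism in plain Python. It does not use the tf.distribute API from the guide; the model (y = w * x), the function names and the numbers are all made up for illustration. Each "replica" stands in for one GPU: it computes a gradient on its own shard of the batch, the gradients are averaged, and every replica applies the same update.

```python
# Toy illustration of synchronous data parallelism (NOT the tf.distribute API).
# Model: y = w * x with mean squared error. All names here are hypothetical.

def gradient(w, batch):
    """Gradient of the mean squared error for y = w * x over one batch."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)

def split(batch, replicas):
    """Split a batch into equal shards, one per replica (e.g. one per GPU).
    Assumes len(batch) is divisible by replicas."""
    size = len(batch) // replicas
    return [batch[i * size:(i + 1) * size] for i in range(replicas)]

def sync_step(w, batch, replicas, lr=0.01):
    """One synchronous step: per-shard gradients are averaged, then the
    same update is applied everywhere, so all replicas stay in sync."""
    shard_grads = [gradient(w, shard) for shard in split(batch, replicas)]
    avg_grad = sum(shard_grads) / replicas
    return w - lr * avg_grad

batch = [(float(x), 3.0 * x) for x in range(1, 9)]  # 8 samples, true w = 3
w_single = sync_step(0.0, batch, replicas=1)  # single-device step
w_double = sync_step(0.0, batch, replicas=2)  # two replicas, 4 samples each
print(w_single, w_double)
```

With equal-sized shards the averaged gradient matches the full-batch gradient, which is why the guide says synchronous training keeps convergence behavior the same as on a single device.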
27) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 107533)
Posted 21 Oct 2022 by Profile Greg_BE
Post:
What other graphs does it support?



What do you mean?
28) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 107531)
Posted 21 Oct 2022 by Profile Greg_BE
Post:



So I am showing you the 1080 startup and then the 1050 running
So as you see the copy box is active on both.

Again...refer to the text in the previous post.
29) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 107530)
Posted 21 Oct 2022 by Profile Greg_BE
Post:
I did have a look at the time.
They were mirrored just as it says in the processes.
The text below all the images I posted explains it.

A GPU engine represents an independent unit of silicon on the GPU that can be scheduled and can operate in parallel with one another. For example, a copy engine may be used to transfer data around while a 3D engine is used for 3D rendering. While the 3D engine can also be used to move data around, simple data transfers can be offloaded to the copy engine, allowing the 3D engine to work on more complex tasks, improving overall performance. In this case both the copy engine and the 3D engine would operate in parallel.
30) Message boards : Cafe Rosetta : Off topic stuff from technical discussion (Message 107526)
Posted 21 Oct 2022 by Profile Greg_BE
Post:
Policing is not a non political department.
Google translate..... "Policing is a political department".

Their funding depends on politics, their distribution depends on politics. Poor areas get less and rich areas get more and the rest of us get whatever is left after the rich.
I disagree, they put more in the poor areas so they can arrest more people and get commission.

Maybe over there, not in the US

Traffic patrol is about "safety" but it also has to be the funding department. That's why there are so many campaigns and undercover cars and cameras everywhere to record your speed between two points. But it is under the guise of "Safety".
It's not safe when you're watching your speedometer and looking out for police instead of watching the road. It's nearly caused me to crash twice.

Yeah well, you are supposed to know by feel and sound how fast you are going no matter what car you drive.

Trucks get hit with "inspections" that nitpick over the smallest details. You're a half inch over the limit for driving without oversize signs and they hit the trucker for driving without proper signage.
Those are absurd, I'm not blind, I can tell if the truck is big without having to read a sign.

But you have to have the sign, does not matter if you can see it or not.

Weight limits and badly secured loads I can understand. But I watched a show where they picked out a truck that was with large load that looked to be within the lane limits, but was long. Carnival ride or something. They got out the tape and measured him in all three dimensions and then fined him for not having signs and lights for oversize because he was just an inch or so over length for driving without.
OCD is wrong and should be punishable by death.

Hahaha..yeah...but the cops here on traffic are beyond OCD, they are 4th degree OCD. They know every rule in the book by memory.

Or the undercover cars, that to me is just harassment.
Indeed, but I'm damn good at talking my way out of them. You have to either embarrass, confuse, amuse, or distract them.

Not going to happen here in BE world. You're holding the phone, you get fined. No ifs, buts or whatevers. The only time they might let you go is if your wife is in labor. Then they will not ticket you. But they usually run your plates and license anyway to check.

Then there is the driving above the speed limit to come and pull a car over for driving with a phone in the hand. Who was more dangerous? Them speeding to go get the guy or the guy on the phone?
That's one way I got let off. I was going 95 in a 70, but he was doing "well over 100 to catch up with you". When I asked if he was allowed to do that without blue lights on, he didn't know!

Here and in the US that does not matter, you were at fault, they have to get you even if they exceed the speed limit by 20 or more whichever units you want to use.

Same in the US for speeding, who's more dangerous, me going 10 over or the cop coming from behind the bridge piling at high speed for 3 minutes to come chase me down for doing 10 over? Oh...and privacy. You have a right to privacy, yet the cops set up cameras on every corner to watch you. Ok, it does help solve crime, but where is your privacy?
Agreed.
31) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 107525)
Posted 21 Oct 2022 by Profile Greg_BE
Post:


And then FAH GPU


But what is interesting is Windows shows only 20% usage of the GPU but if you look at MSI Afterburner it shows 98%



A GPU engine represents an independent unit of silicon on the GPU that can be scheduled and can operate in parallel with one another. For example, a copy engine may be used to transfer data around while a 3D engine is used for 3D rendering. While the 3D engine can also be used to move data around, simple data transfers can be offloaded to the copy engine, allowing the 3D engine to work on more complex tasks, improving overall performance. In this case both the copy engine and the 3D engine would operate in parallel.
32) Message boards : Cafe Rosetta : Off topic stuff from technical discussion (Message 107523)
Posted 21 Oct 2022 by Profile Greg_BE
Post:
poop? really? Must be a crowd of PC people.
Shit is so common in American that it's ridiculous.
Move that shit, see that shit over there, dumb shit, I got to take a shit, etc.
I guess my work in the stage industry and other places I have worked are rough and tough.
I don't know anyone that says poop.
They all say it in forums because they want to avoid saying shit. If I have to avoid shit I say crap or faeces or excrement, but never poop!

Well, bullfighting is animal cruelty in today's society because the bull is injured and is forced to continue until he's dead. Whereas with meat, you shoot them in the head and they are gone in an instant.
Death is a million times worse than torture, because it's final.

But I didn't think it was bull meat, I thought it was more the cows that aren't producing calves anymore.
Then you have to watch out for downer cows that they shove through the system anyway, the cows with brain eating stuff in them, the contaminated foot and mouth cows. Though all this seems to have disappeared.
I don't have to worry as I don't like the taste of meat! But I do have an odd sense of smell, I think X smells like Y yet others think I'm being daft.

And yeah you country Brits swear more than a drunk American.
I noticed Ramsay manages to keep it at PG level, but then when you watch the police interception shows, wow! Some really hardcore mouths there. An American cop would be putting the cuffs on and stuffing them in the car for that kind of behavior.
We don't let our police get quite so out of control, but there are still too many of them doing us for pointless little things.


Policing is not a non political department. Their funding depends on politics, their distribution depends on politics. Poor areas get less and rich areas get more and the rest of us get whatever is left after the rich.
Traffic patrol is about "safety" but it also has to be the funding department. That's why there are so many campaigns and undercover cars and cameras everywhere to record your speed between two points. But it is under the guise of "Safety". Trucks get hit with "inspections" that nitpick over the smallest details. You're a half inch over the limit for driving without oversize signs and they hit the trucker for driving without proper signage. Weight limits and badly secured loads I can understand. But I watched a show where they picked out a truck that was with large load that looked to be within the lane limits, but was long. Carnival ride or something. They got out the tape and measured him in all three dimensions and then fined him for not having signs and lights for oversize because he was just an inch or so over length for driving without.

Or the undercover cars, that to me is just harassment. Then there is the driving above the speed limit to come and pull a car over for driving with a phone in the hand. Who was more dangerous? Them speeding to go get the guy or the guy on the phone? Same in the US for speeding, who's more dangerous, me going 10 over or the cop coming from behind the bridge piling at high speed for 3 minutes to come chase me down for doing 10 over? Oh...and privacy. You have a right to privacy, yet the cops set up cameras on every corner to watch you. Ok, it does help solve crime, but where is your privacy?
33) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 107520)
Posted 21 Oct 2022 by Profile Greg_BE
Post:
I would like to see what actual use is in resource/ task monitor or the like



When MOO is up to run again, I will grab a screen shot.
Right now Einstein and Prime are running.
34) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 107519)
Posted 21 Oct 2022 by Profile Greg_BE
Post:
I've asked here if AMDs can do so: https://moowrap.net/forum_thread.php?id=647

Did you do anything special to combine cards on the project?



No...everything is standard.
I don't mess around with stuff like that.
All projects are default.
35) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 107516)
Posted 20 Oct 2022 by Profile Greg_BE
Post:
It's ok now since it started a new task. So now it's 2+2 on MOO, 4 LHC (like normal) and 9 single tasks (15) and then 16 is for Windows and other stuff.
I've never seen a task (including Moo) use two GPUs. I don't have any machines with two identical cards to test it right now, but I'm sure I have in the past and I just got 2 Moo tasks, one on each card. Maybe only Nvidias do it? (I only have AMDs) Are your GPUs identical?





1080 and a 1060TI
36) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 107511)
Posted 20 Oct 2022 by Profile Greg_BE
Post:
I use that as well...more for boosting the GPU temp range to run it a bit harder.
I find the manufacturer's range lets it go up to 90C and crash. I reduce it to 50C no fan, 70C half fan, 80C full fan. I also reduce clocks on old cards that are tired and crash a lot, but I never increase clocks as that just causes unreliability and busted cards. I also boosted the max power consumption on my Fury from normal to +50% (!) to stop it reducing the clock when thinking hard and it seems happy with it, apart from I melted the power connector, so I soldered the wires directly on, it's working now. I was going to boost it even higher, but if I select the option to exceed manufacturers overclocking specs, it crashes the other card in that machine, even though I'm not overpowering that one.

But something isn't adding up:
10/20/2022 2:32:52 PM | Moo! Wrapper | Found app_config.xml
10/20/2022 2:32:52 PM | | Config: use all coprocessors
I see nothing wrong there.

The percentages look the same and the core usage configuration looks the same still.
It should have noticed your Moo task needs 2 CPU cores, and should have reduced the number of CPU tasks running. The CPU will presumably still be maxed out, but doing the correct number of things, so the CPU tasks should drift up to nearly 100% each.

Note: it will still display the same 0.2C+2NV until you download new Moo tasks, then it will show 2C+2NV. It's doing what you asked, just not updating the display on the tasks you already have. Yet another Boinc bug.


It's ok now since it started a new task. So now it's 2+2 on MOO, 4 LHC (like normal) and 9 single tasks (15) and then 16 is for Windows and other stuff.
37) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 107509)
Posted 20 Oct 2022 by Profile Greg_BE
Post:
Then all is well. I use MSI afterburner which gives lovely graphs. I can see temperature and usage of CPU and GPU all at once and adjust Boinc until everything is running most efficiently.


I use that as well...more for boosting the GPU temp range to run it a bit harder.
But I see the usage graph now as well. I'm pretty much maxed out. 98% on both cards.

I forgot it looks at CPU as well.
Based on that info, the system is working just under maximum. 98% seems to be a common number.
CPU temp with a radiator is also just under max.

That's what I built this system to do.
Gaming board, hardcore but not pricey CPU.
GPUs are second-hand. I got a good deal on the 1080 a long time ago, that group sold off all their cards really fast. I'm ok with a 1060 and a 1080.
I've spent all the money I am going to on this system and I will just keep it this way until it dies.

App_config I have used in the past to limit LHC ATLAS. But then they changed their coding and the app_config was no longer needed.

But something isn't adding up:
10/20/2022 2:32:52 PM | Moo! Wrapper | Found app_config.xml
10/20/2022 2:32:52 PM | | Config: use all coprocessors


The percentages look the same and the core usage configuration looks the same still.


That's good to see. They are looking at that idea for more specific control for idle and non-idle systems.
38) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 107507)
Posted 20 Oct 2022 by Profile Greg_BE
Post:
Also noticed that BOINC Tasks says MOO is using between 180-190% CPU, and maybe TN with Rosetta and FAH is chewing up too much capacity?

A couple of TN's are 40 minutes behind in CPU time vs run time.
Makes me wonder if TN needs more power.
Rosetta is fine, running about 15 minutes behind.
Maybe drop TN then. For now, setting it to no new tasks to see what happens after it has completed all the stuff in queue.
Your computer's running fine. Your GPU task is getting the 2 CPU cores it needs. The GPUs are most important since they're much more powerful and should always be kept busy. Your remaining CPU cores are shared between the CPU tasks.

Either leave it as it is, since presumably (you haven't told me) your GPU is flat out and so is your CPU.

Or use app_config in Moo to tell it that the task requires 2 CPU cores, so Boinc will run fewer CPU tasks at the same time.

Or decrease the number of CPU cores Boinc is allowed to use in preferences, but this won't be very automatic, since if you change your GPU to do something other than Moo, it won't be set correctly.


HWINFO says the GPUs are at 98% capacity. 15/16 cores are used in BOINC. 1 core is left free for general non-BOINC stuff. Though if there were a way to use all 16 when I am not on the computer and then release one when the computer is in use, it would be a more efficient use of my system. So I could configure MOO to do like you said for now. I will have to look up the command to do that.
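For reference, a rough sketch of what such an app_config.xml could look like, built here with a small Python script. The element names (app_config, app, name, gpu_versions, gpu_usage, cpu_usage) follow the BOINC client configuration format as I understand it. The app name "APP_NAME_HERE" is a placeholder, not the real Moo app name (the real one would have to be looked up, for example in client_state.xml), and the 2.0 values are assumptions taken from the "2 CPU cores" and "2NV" figures discussed in this thread.

```python
import xml.etree.ElementTree as ET

def build_app_config(app_name, cpu_usage, gpu_usage):
    """Build the text of a BOINC app_config.xml for one GPU application.

    app_name is a placeholder; cpu_usage is the number of CPU cores BOINC
    should budget per task, gpu_usage the number of GPUs per task."""
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    gpu = ET.SubElement(app, "gpu_versions")
    ET.SubElement(gpu, "gpu_usage").text = str(gpu_usage)
    ET.SubElement(gpu, "cpu_usage").text = str(cpu_usage)
    return ET.tostring(root, encoding="unicode")

# Placeholder app name and assumed values; check both against the project.
xml_text = build_app_config("APP_NAME_HERE", cpu_usage=2.0, gpu_usage=2.0)
print(xml_text)
```

The resulting file would normally be saved as app_config.xml in that project's folder inside the BOINC data directory and picked up with Options > Read config files in BOINC Manager.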
39) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 107504)
Posted 19 Oct 2022 by Profile Greg_BE
Post:
Also noticed that BOINC Tasks says MOO is using between 180-190% CPU, and maybe TN with Rosetta and FAH is chewing up too much capacity?

A couple of TN's are 40 minutes behind in CPU time vs run time.
Makes me wonder if TN needs more power.
Rosetta is fine, running about 15 minutes behind.
Maybe drop TN then. For now, setting it to no new tasks to see what happens after it has completed all the stuff in queue.
40) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 107503)
Posted 19 Oct 2022 by Profile Greg_BE
Post:
FAH both GPU's
BOINC 1 GPU at the moment, 15 cores in total allocated to BOINC
TN got more work, so more cores, as it's a new project.
MOO is also "new"
Must have been a learning thing. CPU's in the 90s now on TN
Either that or you're running more things than cores. You have 16 cores, but you're using 17 cores (Moo is using two cores, as indicated by task manager saying 12%). The CPU doesn't care, but you should watch your GPU usage, because if the CPU is overloaded, it might not feed the GPU fast enough (depends on the project).



Yeah, I keep forgetting the GPU uses CPU, but then again I figured BOINC or the system would take care of managing that.
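The core counting in the quoted reply can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes the 15 cores allocated to BOINC are all busy with single-threaded CPU tasks and that the Moo GPU task takes 2 more cores on top of that. The function name is made up for illustration.

```python
def cpu_budget(total_cores, cpu_tasks, gpu_task_cores):
    """Return (cores in use, cores over budget) for a simple core count.

    Assumes each CPU task uses exactly one core."""
    used = cpu_tasks + gpu_task_cores
    return used, used - total_cores

# 16-core CPU, 15 single-threaded CPU tasks, Moo GPU task using 2 cores.
used, over = cpu_budget(total_cores=16, cpu_tasks=15, gpu_task_cores=2)

# Share of the whole CPU that 2 of 16 cores represents, in percent.
moo_share = 2 / 16 * 100

print(used, over, moo_share)
```

Two of 16 cores is 12.5% of the CPU, which lines up with Task Manager showing about 12% for the Moo task.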





©2024 University of Washington
https://www.bakerlab.org