21)
Message boards :
Number crunching :
Problems and Technical Issues with Rosetta@home
(Message 107554)
Posted 22 Oct 2022 by Greg_BE Post: Now you have seen every angle and data point on how the GPUs operate with MOO and FAH working simultaneously. There is no overclock (OC) on this; I forgot to turn that on, so this is default mode. |
22)
Message boards :
Number crunching :
Problems and Technical Issues with Rosetta@home
(Message 107553)
Posted 22 Oct 2022 by Greg_BE Post: |
23)
Message boards :
Number crunching :
Problems and Technical Issues with Rosetta@home
(Message 107549)
Posted 22 Oct 2022 by Greg_BE Post: Rule 1: If heat is pouring off the chip, it's doing a lot of work. The fans get their workout every day. At the moment they are just doing the easy stuff with FAH. Apparently TN needs everything CPU, though I suppose I can write a script to put that down to 14 or something. |
24)
Message boards :
Number crunching :
Problems and Technical Issues with Rosetta@home
(Message 107546)
Posted 22 Oct 2022 by Greg_BE Post: But MOO is sharing the cards with FAH on my system. So maybe that downgrades the time a bit? And I don't have the fastest cards: a plain 1080 and a 1050 Ti. |
25)
Message boards :
Number crunching :
Problems and Technical Issues with Rosetta@home
(Message 107539)
Posted 22 Oct 2022 by Greg_BE Post: So what's happening then, in plain terms, is that the MOO task is huge and complex and splits itself among the two GPUs and the CPUs to be more efficient in getting the work done? |
26)
Message boards :
Number crunching :
Problems and Technical Issues with Rosetta@home
(Message 107534)
Posted 21 Oct 2022 by Greg_BE Post: There are generally two ways to distribute computation across multiple devices:

Data parallelism, where a single model gets replicated on multiple devices or multiple machines. Each of them processes different batches of data, then they merge their results. There exist many variants of this setup, that differ in how the different model replicas merge results, in whether they stay in sync at every batch or whether they are more loosely coupled, etc.

Model parallelism, where different parts of a single model run on different devices, processing a single batch of data together. This works best with models that have a naturally-parallel architecture, such as models that feature multiple branches.

This guide focuses on data parallelism, in particular synchronous data parallelism, where the different replicas of the model stay in sync after each batch they process. Synchronicity keeps the model convergence behavior identical to what you would see for single-device training.

Specifically, this guide teaches you how to use the tf.distribute API to train Keras models on multiple GPUs, with minimal changes to your code, in the following two setups:

On multiple GPUs (typically 2 to 8) installed on a single machine (single host, multi-device training). This is the most common setup for researchers and small-scale industry workflows.

On a cluster of many machines, each hosting one or multiple GPUs (multi-worker distributed training). This is a good setup for large-scale industry workflows, e.g. training high-resolution image classification models on tens of millions of images using 20-100 GPUs.

More at: https://keras.io/guides/distributed_training/ |
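To make the "replicas stay in sync after each batch" idea concrete, here is a minimal sketch in plain Python. It is not the tf.distribute API and not how MOO or any BOINC project works; the one-weight model, the function names and the toy data are all assumptions made up for illustration. Each "replica" computes a gradient on its own shard of the batch, the gradients are averaged, and every replica applies the same update.

```python
# Synchronous data parallelism, sketched without TensorFlow.
# Model: y = w * x, loss: mean squared error. All names here are illustrative.

def gradient(w, batch):
    """Gradient of the mean squared error for y = w * x over one batch."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)

def single_device_step(w, batch, lr):
    """One ordinary gradient-descent step on the full batch."""
    return w - lr * gradient(w, batch)

def data_parallel_step(w, batch, lr, num_replicas):
    """Split the batch across replicas, average their gradients, update once.

    Assumes len(batch) is divisible by num_replicas so all shards are equal.
    """
    shard_size = len(batch) // num_replicas
    shards = [batch[i * shard_size:(i + 1) * shard_size]
              for i in range(num_replicas)]
    # Every replica holds the same weight w but sees a different shard.
    replica_grads = [gradient(w, shard) for shard in shards]
    # Synchronisation point: merge the results before anyone updates.
    merged = sum(replica_grads) / num_replicas
    return w - lr * merged

batch = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]  # toy data, y = 2x
w_single = 0.0
w_parallel = 0.0
for _ in range(50):
    w_single = single_device_step(w_single, batch, lr=0.05)
    w_parallel = data_parallel_step(w_parallel, batch, lr=0.05, num_replicas=2)

print(w_single, w_parallel)
```

With equal-sized shards the averaged gradient equals the full-batch gradient, which is why the quoted guide can say convergence behaves like single-device training.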
27)
Message boards :
Number crunching :
Problems and Technical Issues with Rosetta@home
(Message 107533)
Posted 21 Oct 2022 by Greg_BE Post: What other graphs does it support? What do you mean? |
28)
Message boards :
Number crunching :
Problems and Technical Issues with Rosetta@home
(Message 107531)
Posted 21 Oct 2022 by Greg_BE Post: So I am showing you the 1080 starting up and then the 1050 running. As you can see, the copy box is active on both. Again...refer to the text in the previous post. |
29)
Message boards :
Number crunching :
Problems and Technical Issues with Rosetta@home
(Message 107530)
Posted 21 Oct 2022 by Greg_BE Post: I did have a look at the time. They were mirrored just as it says in the processes. The text below all the images I posted explains it. A GPU engine represents an independent unit of silicon on the GPU that can be scheduled and can operate in parallel with one another. For example, a copy engine may be used to transfer data around while a 3D engine is used for 3D rendering. While the 3D engine can also be used to move data around, simple data transfers can be offloaded to the copy engine, allowing the 3D engine to work on more complex tasks, improving overall performance. In this case both the copy engine and the 3D engine would operate in parallel. |
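As a rough illustration of why a separate copy engine helps, the sketch below compares one engine doing everything back to back against a copy engine and a 3D engine working in parallel. It is only a timing model; the chunk durations are hypothetical numbers and nothing here queries a real GPU.

```python
def serial_time(chunks):
    """One engine does every copy and every compute step back to back."""
    return sum(copy + compute for copy, compute in chunks)

def overlapped_time(chunks):
    """A copy engine and a 3D engine run in parallel, pipelined per chunk."""
    copy_done = 0      # time at which the copy engine finishes its latest chunk
    compute_done = 0   # time at which the 3D engine finishes its latest chunk
    for copy, compute in chunks:
        copy_done += copy
        # Compute starts once the data has arrived and the 3D engine is free.
        compute_done = max(copy_done, compute_done) + compute
    return compute_done

# Hypothetical (copy, compute) durations in milliseconds for three chunks.
chunks = [(2, 5), (2, 5), (2, 5)]

print("one engine:", serial_time(chunks), "ms")
print("two engines:", overlapped_time(chunks), "ms")
```

In this toy case only the first copy is exposed; the later copies hide behind compute work, which is the overlap the quoted text describes.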
30)
Message boards :
Cafe Rosetta :
Off topic stuff from technical discussion
(Message 107526)
Posted 21 Oct 2022 by Greg_BE Post: Policing is not a non political department. Google Translate... "Policing is a political department". |
31)
Message boards :
Number crunching :
Problems and Technical Issues with Rosetta@home
(Message 107525)
Posted 21 Oct 2022 by Greg_BE Post: And then FAH GPU. But what is interesting is that Windows shows only 20% usage of the GPU, while MSI Afterburner shows 98%. A GPU engine represents an independent unit of silicon on the GPU that can be scheduled and can operate in parallel with one another. For example, a copy engine may be used to transfer data around while a 3D engine is used for 3D rendering. While the 3D engine can also be used to move data around, simple data transfers can be offloaded to the copy engine, allowing the 3D engine to work on more complex tasks, improving overall performance. In this case both the copy engine and the 3D engine would operate in parallel. |
32)
Message boards :
Cafe Rosetta :
Off topic stuff from technical discussion
(Message 107523)
Posted 21 Oct 2022 by Greg_BE Post: poop? really? Must be a crowd of PC people. They all say it in forums because they want to avoid saying shit. If I have to avoid shit I say crap or faeces or excrement, but never poop! Policing is not a non political department. Their funding depends on politics, their distribution depends on politics. Poor areas get less and rich areas get more and the rest of us get whatever is left after the rich. Traffic patrol is about "safety" but it also has to be the funding department. That's why there are so many campaigns and undercover cars and cameras everywhere to record your speed between two points. But it is under the guise of "Safety". Trucks get hit with "inspections" that nitpick over the smallest details. You're a half inch over the limit for driving without oversize signs and they hit the trucker for driving without proper signage. Weight limits and badly secured loads I can understand. But I watched a show where they picked out a truck with a large load that looked to be within the lane limits, but was long. Carnival ride or something. They got out the tape and measured him in all three dimensions and then fined him for not having signs and lights for oversize because he was just an inch or so over the length for driving without. Or the undercover cars, that to me is just harassment. Then there is the driving above the speed limit to come and pull a car over for driving with a phone in the hand. Who was more dangerous? Them speeding to go get the guy or the guy on the phone? Same in the US for speeding, who's more dangerous, me going 10 over or the cop coming from behind the bridge piling at high speed for 3 minutes to come chase me down for doing 10 over? Oh...and privacy. You have a right to privacy, yet the cops set up cameras on every corner to watch you. Ok, it does help solve crime, but where is your privacy? |
33)
Message boards :
Number crunching :
Problems and Technical Issues with Rosetta@home
(Message 107520)
Posted 21 Oct 2022 by Greg_BE Post: I would like to see what actual use is in resource/task monitor or the like. When MOO is up to run again, I will grab a screenshot. Right now Einstein and Prime are running. |
34)
Message boards :
Number crunching :
Problems and Technical Issues with Rosetta@home
(Message 107519)
Posted 21 Oct 2022 by Greg_BE Post: I've asked here if AMDs can do so: https://moowrap.net/forum_thread.php?id=647 No...everything is standard. I don't mess around with stuff like that. All projects are default. |
35)
Message boards :
Number crunching :
Problems and Technical Issues with Rosetta@home
(Message 107516)
Posted 20 Oct 2022 by Greg_BE Post: It's ok now since it started a new task. So now it's 2+2 on MOO, 4 LHC (like normal) and 9 single tasks (15) and then 16 is for Windows and other stuff. I've never seen a task (including Moo) use two GPUs. I don't have any machines with two identical cards to test it right now, but I'm sure I have in the past and I just got 2 Moo tasks, one on each card. Maybe only Nvidias do it? (I only have AMDs) Are your GPUs identical? 1080 and a 1060TI |
36)
Message boards :
Number crunching :
Problems and Technical Issues with Rosetta@home
(Message 107511)
Posted 20 Oct 2022 by Greg_BE Post: I use that as well...more for boosting the GPU temp range to run it a bit harder. I find the manufacturer's range lets it go up to 90C and crash. I reduce it to 50C no fan, 70C half fan, 80C full fan. I also reduce clocks on old cards that are tired and crash a lot, but I never increase clocks as that just causes unreliability and busted cards. I also boosted the max power consumption on my Fury from normal to +50% (!) to stop it reducing the clock when thinking hard and it seems happy with it, apart from I melted the power connector, so I soldered the wires directly on, it's working now. I was going to boost it even higher, but if I select the option to exceed manufacturers overclocking specs, it crashes the other card in that machine, even though I'm not overpowering that one. It's ok now since it started a new task. So now it's 2+2 on MOO, 4 LHC (like normal) and 9 single tasks (15) and then 16 is for Windows and other stuff. |
37)
Message boards :
Number crunching :
Problems and Technical Issues with Rosetta@home
(Message 107509)
Posted 20 Oct 2022 by Greg_BE Post: Then all is well. I use MSI Afterburner which gives lovely graphs. I can see temperature and usage of CPU and GPU all at once and adjust BOINC until everything is running most efficiently. I use that as well...more for boosting the GPU temp range to run it a bit harder. But I see the usage graph now as well. I'm pretty much maxed out. 98% on both cards. I forgot it looks at CPU as well. Based on that info, the system is working just under maximum. 98% seems to be a common number. CPU temp with a radiator is also just under max. That's what I built this system to do. Gaming board, hardcore but not pricey CPU. GPUs are second hand. I got a good deal on the 1080 a long time ago, that group sold off all their cards really fast. I'm ok with a 1060 and a 1080. I've spent all the money I am going to on this system and I will just keep it this way until it dies. I have used app_config in the past to limit LHC ATLAS. But then they changed their coding and the app_config was no longer needed. But something isn't adding up:
10/20/2022 2:32:52 PM | Moo! Wrapper | Found app_config.xml
10/20/2022 2:32:52 PM | | Config: use all coprocessors
The percentages look the same and the core usage configuration still looks the same. It's good to see they are looking at that idea of more specific control for idle and non-idle systems. |
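For anyone wanting to see what such an app_config.xml might contain, here is a small Python sketch that generates one. The element names (app_config, app, name, gpu_versions, gpu_usage, cpu_usage) follow the BOINC client configuration format, but the app name "dnetc" and the 1.0 values are assumptions for illustration; check the project's applications page for the real short name. As far as I understand it, cpu_usage only tells the BOINC scheduler how many CPUs to budget per GPU task; it does not throttle the application itself.

```python
import xml.etree.ElementTree as ET

def build_app_config(app_name, gpu_usage, cpu_usage):
    """Return app_config.xml text budgeting GPU and CPU shares for one app."""
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    gpu_versions = ET.SubElement(app, "gpu_versions")
    # Fraction of a GPU each task is assumed to use (1.0 = one task per GPU).
    ET.SubElement(gpu_versions, "gpu_usage").text = str(gpu_usage)
    # Number of CPUs the scheduler should reserve for each GPU task.
    ET.SubElement(gpu_versions, "cpu_usage").text = str(cpu_usage)
    return ET.tostring(root, encoding="unicode")

# "dnetc" is an assumed app name, not confirmed anywhere in this thread.
xml_text = build_app_config("dnetc", gpu_usage=1.0, cpu_usage=1.0)
print(xml_text)
```

The file would go in the project's folder under the BOINC data directory, and the client should report "Found app_config.xml" in the event log, as in the lines quoted above, once it has re-read its config files.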
38)
Message boards :
Number crunching :
Problems and Technical Issues with Rosetta@home
(Message 107507)
Posted 20 Oct 2022 by Greg_BE Post: Also noticed that BOINC tasks says MOO is using between 180-190% CPU and maybe TN with Rosetta and FAH is chewing up too much capacity? Your computer's running fine. Your GPU task is getting the 2 CPU cores it needs. The GPUs are most important since they're much more powerful and should always be kept busy. Your remaining CPU cores are shared between the CPU tasks. HWINFO says the GPUs are at 98% capacity. 15/16 cores are used in BOINC. 1 core is left free for general non-BOINC stuff. Though if there was a way to use all 16 when I am not on the computer and then release one when the computer is in use, it would be a more efficient use of my system. So I could configure MOO to do like you said for now. I will have to look up the command to do that. |
39)
Message boards :
Number crunching :
Problems and Technical Issues with Rosetta@home
(Message 107504)
Posted 19 Oct 2022 by Greg_BE Post: Also noticed that BOINC tasks says MOO is using between 180-190% CPU and maybe TN with Rosetta and FAH is chewing up too much capacity? A couple of TNs are 40 minutes behind in CPU time vs run time. Makes me wonder if TN needs more power. Rosetta is fine, running about 15 minutes behind. Maybe drop TN then. For now, setting it to no new tasks to see what happens after it has completed all the stuff in queue. |
40)
Message boards :
Number crunching :
Problems and Technical Issues with Rosetta@home
(Message 107503)
Posted 19 Oct 2022 by Greg_BE Post: Either that or you're running more things than cores. You have 16 cores, but you're using 17 cores (Moo is using two cores, as indicated by task manager saying 12%). The CPU doesn't care, but you should watch your GPU usage, because if the CPU is overloaded, it might not feed the GPU fast enough (depends on the project). FAH both GPUs. Must have been a learning thing. CPUs in the 90s now on TN. Yeah, I keep forgetting the GPU uses the CPU, but then again I figured BOINC or the system would take care of managing that.
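The core arithmetic in this thread can be checked with a few lines of Python. This is only a sketch of the bookkeeping; the function names are made up, and the task counts are the ones quoted in the posts (2 cores for MOO, 4 for LHC, 9 single-core tasks on a 16-core CPU).

```python
TOTAL_CORES = 16

def task_manager_percent(cores, total=TOTAL_CORES):
    """Share of the whole CPU that a task using `cores` cores shows in Task Manager."""
    return 100.0 * cores / total

def boinc_percent_to_cores(percent):
    """BOINC reports CPU use per core, so 190% means about 1.9 cores."""
    return percent / 100.0

def is_oversubscribed(cores_used, total=TOTAL_CORES):
    """True when more cores are requested than the machine has."""
    return cores_used > total

# Layout described in the 20 Oct posts: CPU side of MOO, LHC, single-core tasks.
layout = {"MOO": 2, "LHC": 4, "single-core tasks": 9}
used = sum(layout.values())

print(task_manager_percent(2))        # two cores out of 16, the "12%" in Task Manager
print(boinc_percent_to_cores(190))    # the 180-190% reading is just under two cores
print(used, is_oversubscribed(used))  # 15 cores, one left free for Windows
print(is_oversubscribed(17))          # the 17-core situation quoted above
```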