Problems and Technical Issues with Rosetta@home

Greg_BE Joined: 30 May 06 Posts: 5770 Credit: 6,139,760 RAC: 0	Message 107553 - Posted: 22 Oct 2022, 22:56:44 UTC Last modified: 22 Oct 2022, 23:01:47 UTC

Greg_BE Joined: 30 May 06 Posts: 5770 Credit: 6,139,760 RAC: 0	Message 107554 - Posted: 22 Oct 2022, 22:58:43 UTC
Now you have seen every angle and data point as to how the GPUs operate with MOO and FAH working simultaneously. There is no OC on this. Forgot to turn that on. So this is default mode.

Link Joined: 4 May 07 Posts: 356 Credit: 382,349 RAC: 0	Message 107555 - Posted: 23 Oct 2022, 10:36:44 UTC - in response to Message 107550.
> Isn't it possible to write the program so it does as much DP as possible on the GPU, but if that's not enough, use the CPU as well?
Possible: maybe, depending on whether the calculation can be split into a few smaller ones. Rational: very likely no. You would need the DP part twice in your application (OK, Einstein has that) AND a scheduler that monitors the performance and assigns parts to the CPU and GPU. That would always cost performance; you would eventually need to assign more CPU cores to the task because of that (especially if they should be mt like you said before), and at the end of the day you might have sped up some tasks while slowing down the overall production of the computer. And that for some rare hardware configurations with higher DP performance on CPU than on GPU?
> OPNG shouldn't need it IIRC, no idea about the others, but unless they implemented it like Einstein, if they need DP, the app will not run on SP cards, just like Milkyway.
> I'm sure I saw someone mention there's some DP in it.
It was a bug in the beta.
> I would assume a decent program would use the CPU for that bit if the card was SP only (those actually exist? I thought all cards had at least a tiny bit of DP).
So Milkyway isn't a decent program? No, if the program requires something from the hardware, it will simply crash if it's missing. That's the same for CPUs: a program that requires, for example, SSE2 won't run if the CPU is missing it. And no, not all cards have DP. Well, the newer ones I think do, but on older generations only the high-end ones had it; otherwise there would never have been any question about it on Milkyway, it would always run on that "tiny bit", even if slow.
> Anyway, judging by the speed OPNG runs on my different cards, there's DP in it.
As far as I can tell, a lot of it runs on the CPU, so people need up to 16 instances to load their GPUs. That does not indicate any DP requirement, it seems more like the hybrid app we had once for Astropulse on SETI.
> It will still run on GPU, the app doesn't know if the CPU is faster: https://einsteinathome.org/content/fgrp5-cpu-and-fgrpb1g-gpu-why-does-crunching-seem-pause-90
> Surely the app can get the benchmark data from Boinc?
No idea if it theoretically could, but it's not doing it. If DP is available on the card, it will use it, because according to the devs anything else would be nonsense. BOINC has just the SP flops anyway + DP yes/no, see coproc_info.xml.
> Never seen that either.
> Have you tried Moo on two Nvidia cards?
No. But I meant it more like "never heard of it".
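
To make the trade-off in the post above concrete, here is a rough sketch of the kind of splitting a hybrid CPU/GPU scheduler would have to do. It is not taken from any BOINC project: `split_work`, `ideal_speedup` and the throughput figures are hypothetical, and the model deliberately ignores the monitoring and coordination overhead that the post argues would eat into the gain.

```python
def split_work(total_items, gpu_rate, cpu_rate):
    """Split work items in proportion to measured throughput (items/s)."""
    if gpu_rate <= 0 or cpu_rate < 0:
        raise ValueError("GPU rate must be positive and CPU rate non-negative")
    gpu_items = round(total_items * gpu_rate / (gpu_rate + cpu_rate))
    return gpu_items, total_items - gpu_items


def ideal_speedup(gpu_rate, cpu_rate):
    """Best-case gain over GPU-only, before any scheduling overhead."""
    return (gpu_rate + cpu_rate) / gpu_rate


# Hypothetical throughputs: the GPU is 9x faster than the assisting CPU cores.
print(split_work(1000, gpu_rate=90.0, cpu_rate=10.0))
print(ideal_speedup(90.0, 10.0))
```

In this made-up case the best possible gain is about 11%, and that is before counting the CPU cores taken away from other tasks.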

Link Joined: 4 May 07 Posts: 356 Credit: 382,349 RAC: 0	Message 107556 - Posted: 23 Oct 2022, 10:43:07 UTC - in response to Message 107554.
> Now you have seen every angle and data point as to how the GPUs operate with MOO and FAH working simultaneously.
For comparison, it would be interesting to see what the GPU load looks like with just Moo running and with just FAH running.

Mr P Hucker Joined: 12 Aug 06 Posts: 1602 Credit: 13,011,009 RAC: 6	Message 107557 - Posted: 23 Oct 2022, 10:49:10 UTC - in response to Message 107555.
> And that for some rare hardware configurations with higher DP performance on CPU than on GPU?
I thought that was common, as CPUs are damn good at everything, and Nvidia sorely lack DP.
> It was a bug in the beta.
That doesn't say much about it. The fact is it needs DP. Perhaps it was causing an error if there weren't enough, but now it makes do.
> So Milkyway isn't a decent program? No, if the program requires something from the hardware, it will simply crash if it's missing.
But it isn't missing from the computer, it's on the CPU. A GPU task actually runs on the CPU and passes relevant parts to the GPU, which is sometimes all of it, and sometimes half chunks of it. And if DP is required when you have an SP card it should just run those bits on the CPU.
> As far as I can tell, a lot of it runs on the CPU, so people need up to 16 instances to load their GPUs. That does not indicate any DP requirement, it seems more like the hybrid app we had once for Astropulse on SETI.
That's nothing like what I see here, I've not even had to double them up on my GPUs. And I have some pretty shit CPUs.
> No idea if it theoretically could, but it's not doing it. If DP is available on the card, it will use it, because according to the devs anything else would be nonsense. BOINC has just the SP flops anyway + DP yes/no, see coproc_info.xml.
Yeah that was a bit daft of me to assume Boinc would be even remotely that clever.

Mr P Hucker Joined: 12 Aug 06 Posts: 1602 Credit: 13,011,009 RAC: 6	Message 107558 - Posted: 23 Oct 2022, 10:52:25 UTC - in response to Message 107556.
> Now you have seen every angle and data point as to how the GPUs operate with MOO and FAH working simultaneously.
> For comparison, it would be interesting to see what the GPU load looks like with just Moo running and with just FAH running.
I gave up on Folding at Home. Until they join Boinc they can get lost. It's ridiculous trying to use both at once because they aren't aware of what each other is doing, so it's impossible to fully load my computers. Also their scheduler is even stupider than Boinc.

Link Joined: 4 May 07 Posts: 356 Credit: 382,349 RAC: 0	Message 107559 - Posted: 23 Oct 2022, 12:02:25 UTC - in response to Message 107557. Last modified: 23 Oct 2022, 12:09:43 UTC
> And that for some rare hardware configurations with higher DP performance on CPU than on GPU?
> I thought that was common, as CPUs are damn good at everything, and Nvidia sorely lack DP.
See the thread from Einstein, which I posted above; there are run times for CPU vs. GPU for the DP stage.
> It was a bug in the beta.
> That doesn't say much about it. The fact is it needs DP. Perhaps it was causing an error if there weren't enough, but now it makes do.
"Double precision is not a requirement. During BETA, some hosts (mostly Intel IIRC) were getting some errors due to a lack of double precision floats but those errors were fixed in the application code and haven't been reported since." From the link I posted above. That sounds pretty clear to me; feel free to post something that states the opposite. BTW, "not enough DP" does not exist; like every other instruction set, it's either available or not.
> So Milkyway isn't a decent program? No, if the program requires something from the hardware, it will simply crash if it's missing.
> But it isn't missing from the computer, it's on the CPU. A GPU task actually runs on the CPU and passes relevant parts to the GPU, which is sometimes all of it, and sometimes half chunks of it. And if DP is required when you have an SP card it should just run those bits on the CPU.
It doesn't; Milkyway won't run on an SP card. It's possible the way you describe it, if the application supports it, i.e. has both paths in its code like Einstein's FGRPB1G. But then it does not require DP, it's optional.
> As far as I can tell, a lot of it runs on the CPU, so people need up to 16 instances to load their GPUs. That does not indicate any DP requirement, it seems more like the hybrid app we had once for Astropulse on SETI.
> That's nothing like what I see here, I've not even had to double them up on my GPUs. And I have some pretty shit CPUs.
No idea, it's just what I've read on the forums; however, it might not apply to all cards.
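
The difference between required and optional DP described in the post above can be sketched roughly as follows. This is only an illustration, not code from Milkyway, Einstein or OPNG: the `Device` class and both `plan_*` functions are hypothetical. The one real identifier is `cl_khr_fp64`, the OpenCL extension name a device reports when it supports double precision.

```python
from dataclasses import dataclass


@dataclass
class Device:
    """Hypothetical description of a GPU as an application might see it."""
    name: str
    extensions: str  # space-separated OpenCL extension string


def has_fp64(device: Device) -> bool:
    # cl_khr_fp64 is the OpenCL extension name for double-precision support.
    return "cl_khr_fp64" in device.extensions.split()


def plan_dp_required(device: Device) -> str:
    """App with only a GPU DP path: it cannot run on an SP-only card."""
    if not has_fp64(device):
        raise RuntimeError(f"{device.name}: double precision not supported")
    return "gpu"


def plan_dp_optional(device: Device) -> str:
    """App that ships both code paths: the CPU is the fallback for DP."""
    return "gpu" if has_fp64(device) else "cpu"


# Made-up devices for illustration only.
sp_card = Device("sp-only card", "cl_khr_icd")
dp_card = Device("dp card", "cl_khr_icd cl_khr_fp64")

print(plan_dp_optional(sp_card))  # cpu
print(plan_dp_optional(dp_card))  # gpu
```

The fallback only exists because `plan_dp_optional` was written with a second path; nothing moves the DP work to the CPU on its own.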

Mr P Hucker Joined: 12 Aug 06 Posts: 1602 Credit: 13,011,009 RAC: 6	Message 107560 - Posted: 23 Oct 2022, 12:12:17 UTC - in response to Message 107559.
> "Double precision is not a requirement. During BETA, some hosts (mostly Intel IIRC) were getting some errors due to a lack of double precision floats but those errors were fixed in the application code and haven't been reported since." From the link I posted above. That sounds pretty clear to me; feel free to post something that states the opposite. BTW, "not enough DP" does not exist; like every other instruction set, it's either available or not.
I have run it on many different cards, and comparing the time per task with the DP and SP speed of each card, I can see it's using some DP. It might not have to, but it can. My cards with more DP do the tasks faster than they should be able to.
> No idea, it's just what I've read on the forums; however, it might not apply to all cards.
I've seen lots of forums with people complaining about high CPU usage on Nvidia cards. Cuda often needs a lot more CPU help than OpenCL.
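
The comparison described above (time per task against each card's SP and DP speed) could be made explicit with a simple model. The sketch below assumes run time is just SP work divided by SP speed plus DP work divided by DP speed, which ignores memory bandwidth, CPU involvement and everything else that affects real tasks. The function name and all numbers are invented for illustration; they are not measurements from any card or project.

```python
def estimate_work(cards):
    """Solve t = w_sp / sp + w_dp / dp for (w_sp, w_dp) using two cards.

    cards: two (sp_gflops, dp_gflops, seconds_per_task) tuples.
    """
    (sp1, dp1, t1), (sp2, dp2, t2) = cards
    a1, b1 = 1.0 / sp1, 1.0 / dp1
    a2, b2 = 1.0 / sp2, 1.0 / dp2
    det = a1 * b2 - a2 * b1
    # Cards with the same SP:DP ratio give no way to separate the two terms.
    if abs(det) <= 1e-12 * max(abs(a1 * b2), abs(a2 * b1)):
        raise ValueError("cards have the same SP:DP ratio")
    w_sp = (t1 * b2 - t2 * b1) / det  # Cramer's rule
    w_dp = (a1 * t2 - a2 * t1) / det
    return w_sp, w_dp


# Invented example: same SP speed, very different DP speed.
cards = [(4000.0, 125.0, 140.0), (4000.0, 1000.0, 105.0)]
w_sp, w_dp = estimate_work(cards)
print(round(w_sp), round(w_dp))
```

Under this model, a DP estimate near zero would fit "no DP in it", while a clearly positive one would fit the observation that cards with more DP finish faster than their SP speed alone predicts.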

robertmiles Joined: 16 Jun 08 Posts: 1264 Credit: 14,421,737 RAC: 0	Message 107563 - Posted: 24 Oct 2022, 5:00:52 UTC
WCG now appears to be trying to get more useful work done by sending mostly tasks with small total sizes of the input files, such as tasks for the OPN1 subproject. Is that what others are also seeing? I don't expect Krembil to like this, since it means little work for the MCM1 subproject they are especially interested in. In other words, this may change soon. They were previously sending so many MCM1 tasks that the download server was often slow to respond, although it tended to make it easier to download large input files than small ones.

robertmiles Joined: 16 Jun 08 Posts: 1264 Credit: 14,421,737 RAC: 0	Message 107564 - Posted: 24 Oct 2022, 5:21:30 UTC - in response to Message 107560. Last modified: 24 Oct 2022, 5:22:14 UTC
> I've seen lots of forums with people complaining about high CPU usage on Nvidia cards. Cuda often needs a lot more CPU help than OpenCL.
Depends on how it's written. CUDA allows more of the work that would normally be done on the CPU to be done on the GPU instead, but using the GPU clock instead of the CPU clock. If it's something that cannot be done in parallel, this usually means that the GPU will take about four times as long to do it. Such complaints could mean that there is only one version of the application, which does all DP work on the CPU, even on GPUs that could do DP themselves if there were a separate version of the application for them. Moving DP work between the CPU and the GPU is NOT automatic - the application or applications must be written so that they know how to do so.

Mr P Hucker Joined: 12 Aug 06 Posts: 1602 Credit: 13,011,009 RAC: 6	Message 107565 - Posted: 24 Oct 2022, 6:18:30 UTC - in response to Message 107563. Last modified: 24 Oct 2022, 6:21:37 UTC
> WCG now appears to be trying to get more useful work done by sending mostly tasks with small total sizes of the input files, such as tasks for the OPN1 subproject. Is that what others are also seeing? I don't expect Krembil to like this, since it means little work for the MCM1 subproject they are especially interested in. In other words, this may change soon. They were previously sending so many MCM1 tasks that the download server was often slow to respond, although it tended to make it easier to download large input files than small ones.
I was doing WCG a day or three ago and was getting 90% cancer and 10% CPU COVID. I was getting no GPU COVID. Why don't they put all the COVID onto GPU and leave the CPUs free for cancer? Or do they have different types of tasks and some need a CPU? EDIT: I've just switched it back on and one of my computers requested work and got a tonne of CPU COVID, and no cancer. I want the GPU work. Even my phone's got some COVID work. What happened to the rainfall project? I know they had difficulties with the input data before the move to Krembil, but I thought that was all sorted out, and when Krembil started it up, I got lots of rainfall.

Mr P Hucker Joined: 12 Aug 06 Posts: 1602 Credit: 13,011,009 RAC: 6	Message 107566 - Posted: 24 Oct 2022, 7:36:03 UTC - in response to Message 107563. Last modified: 24 Oct 2022, 7:36:33 UTC
> I don't expect Krembil to like this, since it means little work for the MCM1 subproject they are especially interested in.
Then why don't Krembil put some more money into their falling apart server? I think we should start calling them Crumble.

robertmiles Joined: 16 Jun 08 Posts: 1264 Credit: 14,421,737 RAC: 0	Message 107567 - Posted: 24 Oct 2022, 15:54:27 UTC - in response to Message 107565.
> [snip] They were previously sending so many MCM1 tasks that the download server was often slow to respond, although it tended to make it easier to download large input files than small ones.
> I was doing WCG a day or three ago and was getting 90% cancer and 10% CPU COVID. I was getting no GPU COVID. Why don't they put all the COVID onto GPU and leave the CPUs free for cancer? Or do they have different types of tasks and some need a CPU? EDIT: I've just switched it back on and one of my computers requested work and got a tonne of CPU COVID, and no cancer. I want the GPU work. Even my phone's got some COVID work. What happened to the rainfall project? I know they had difficulties with the input data before the move to Krembil, but I thought that was all sorted out, and when Krembil started it up, I got lots of rainfall.
Making the changes you suggest sounds like a good way to increase the load on their download server, and therefore make the problems worse. I got some rainfall project work last night, although not at good times to see the size of their downloads. Hopefully, the size is enough less than for the cancer work that they can increase that work soon. I'd also like more GPU work.

robertmiles Joined: 16 Jun 08 Posts: 1264 Credit: 14,421,737 RAC: 0	Message 107568 - Posted: 24 Oct 2022, 15:58:34 UTC - in response to Message 107566.
> I don't expect Krembil to like this, since it means little work for the MCM1 subproject they are especially interested in.
> Then why don't Krembil put some more money into their falling apart server? I think we should start calling them Crumble.
Putting more money into their server is likely to be a slow process, even if that's something they're doing. They aren't saying if they're doing it or not.

Mr P Hucker Joined: 12 Aug 06 Posts: 1602 Credit: 13,011,009 RAC: 6	Message 107569 - Posted: 24 Oct 2022, 16:05:15 UTC - in response to Message 107568.
> I don't expect Krembil to like this, since it means little work for the MCM1 subproject they are especially interested in.
> Then why don't Krembil put some more money into their falling apart server? I think we should start calling them Crumble.
> Putting more money into their server is likely to be a slow process, even if that's something they're doing. They aren't saying if they're doing it or not.
I can buy a computer in 20 seconds with my bank card. It would get delivered in a couple of days, then my 200 tech staff (that Krembil have) would get it up and running in a week. Where the fuck is my last reply to you?

robertmiles Joined: 16 Jun 08 Posts: 1264 Credit: 14,421,737 RAC: 0	Message 107570 - Posted: 24 Oct 2022, 16:42:51 UTC - in response to Message 107569.
> [snip] Putting more money into their server is likely to be a slow process, even if that's something they're doing. They aren't saying if they're doing it or not.
> I can buy a computer in 20 seconds with my bank card. It would get delivered in a couple of days, then my 200 tech staff (that Krembil have) would get it up and running in a week. Where the fuck is my last reply to you?
Server grade equipment is harder to find and therefore takes longer to order. Or would you prefer to have them use the more common grade of equipment and therefore have frequent outages? Could your last reply be SLOWLY making its way across the Atlantic? Or could Krembil have at least one employee who is reading your messages and is able to slow them down or delete them, but isn't able to do anything for their download server?

Mr P Hucker Joined: 12 Aug 06 Posts: 1602 Credit: 13,011,009 RAC: 6	Message 107571 - Posted: 24 Oct 2022, 16:54:02 UTC - in response to Message 107570.
> Server grade equipment is harder to find and therefore takes longer to order.
Not four months. And I used to order server parts in under a week. They're not hard to find, there are many servers in the world and many manufacturers making the parts.
> Or would you prefer to have them use the more common grade of equipment and therefore have frequent outages?
You can't have an outage from bugger all. A 486 in a basement would do better than their cobbled together crap.
> Could your last reply be SLOWLY making its way across the Atlantic?
No I fired it with a trebuchet.
> Or could Krembil have at least one employee who is reading your messages and is able to slow them down or delete them, but isn't able to do anything for their download server?
That would be against the law, since this message is to the Rosetta server.

Greg_BE Joined: 30 May 06 Posts: 5770 Credit: 6,139,760 RAC: 0	Message 107572 - Posted: 24 Oct 2022, 17:09:42 UTC
This is all I get these days:
10/24/2022 2:50:37 PM | World Community Grid | No tasks are available for OpenPandemics - COVID 19
10/24/2022 2:50:37 PM | World Community Grid | No tasks are available for OpenPandemics - COVID-19 - GPU
10/24/2022 2:50:37 PM | World Community Grid | No tasks are available for Help Stop TB
10/24/2022 2:50:37 PM | World Community Grid | No tasks are available for Mapping Cancer Markers
10/24/2022 2:50:37 PM | World Community Grid | Tasks for AMD/ATI GPU are available, but your preferences are set to not accept them
10/24/2022 2:50:37 PM | World Community Grid | Tasks for Intel GPU are available, but your preferences are set to not accept them
I have no idea what the brain farts is going on over there.

Mr P Hucker Joined: 12 Aug 06 Posts: 1602 Credit: 13,011,009 RAC: 6	Message 107573 - Posted: 24 Oct 2022, 17:11:41 UTC - in response to Message 107572. Last modified: 24 Oct 2022, 17:12:20 UTC
> This is all I get these days:
> 10/24/2022 2:50:37 PM | World Community Grid | No tasks are available for OpenPandemics - COVID 19
> 10/24/2022 2:50:37 PM | World Community Grid | No tasks are available for OpenPandemics - COVID-19 - GPU
> 10/24/2022 2:50:37 PM | World Community Grid | No tasks are available for Help Stop TB
> 10/24/2022 2:50:37 PM | World Community Grid | No tasks are available for Mapping Cancer Markers
> 10/24/2022 2:50:37 PM | World Community Grid | Tasks for AMD/ATI GPU are available, but your preferences are set to not accept them
> 10/24/2022 2:50:37 PM | World Community Grid | Tasks for Intel GPU are available, but your preferences are set to not accept them
> I have no idea what the brain farts is going on over there.
So you ask for CPU work and it says you can only have GPU work, yet I ask for GPU work and it says I can only have CPU work. Krembil are not the sharpest knives in the drawer.

Greg_BE Joined: 30 May 06 Posts: 5770 Credit: 6,139,760 RAC: 0	Message 107574 - Posted: 24 Oct 2022, 18:07:13 UTC - in response to Message 107573.
> This is all I get these days:
> 10/24/2022 2:50:37 PM | World Community Grid | No tasks are available for OpenPandemics - COVID 19
> 10/24/2022 2:50:37 PM | World Community Grid | No tasks are available for OpenPandemics - COVID-19 - GPU
> 10/24/2022 2:50:37 PM | World Community Grid | No tasks are available for Help Stop TB
> 10/24/2022 2:50:37 PM | World Community Grid | No tasks are available for Mapping Cancer Markers
> 10/24/2022 2:50:37 PM | World Community Grid | Tasks for AMD/ATI GPU are available, but your preferences are set to not accept them
> 10/24/2022 2:50:37 PM | World Community Grid | Tasks for Intel GPU are available, but your preferences are set to not accept them
> I have no idea what the brain farts is going on over there.
> So you ask for CPU work and it says you can only have GPU work, yet I ask for GPU work and it says I can only have CPU work. Krembil are not the sharpest knives in the drawer.
And no NVIDIA work... just AMD/ATI and Intel? What the +-*/ ! Come on....