Message boards : Number crunching : Rosetta support OpenCL ?
Author | Message |
---|---|
Joined: 18 Sep 05 Posts: 40 Credit: 7,487,314 RAC: 0 |
http://www.youtube.com/watch?v=r1sN1ELJfNo&eurl=http://www.generation-3d.com/actualite-La-premiere-demo-OpenCL-sur-nVidia,ac14322.htm @+ *_* |
Joined: 1 Dec 05 Posts: 2124 Credit: 12,426,657 RAC: 2,579 |
http://www.youtube.com/watch?v=r1sN1ELJfNo&eurl=http://www.generation-3d.com/actualite-La-premiere-demo-OpenCL-sur-nVidia,ac14322.htm http://www.opencldev.com/ I hope this helps. |
Joined: 22 Jan 08 Posts: 1 Credit: 752,135 RAC: 0 |
When one has a hammer, everything looks like a nail, but the reality is that there is a proper tool for each type of job. While you can pound a screw in with a hammer with some success, a screwdriver works much better for that job. You can also use the handle of a screwdriver to pound in a nail, but a hammer is better suited for that task. And good luck cutting a 2"x4" board with either a hammer or a screwdriver! Some projects will run better on a CPU and others are better suited for GPUs. Collatz runs well on GPUs because I wanted to learn about GPU programming and therefore chose an unsolved math problem that I knew would run well on a GPU. Collatz needs very little external data and does the same relatively small equation over and over and over again. Since none of the results affect the next iteration of results, it runs in parallel very well. In comparison, Rosetta uses rather large data files, and loading and unloading them from the GPU memory would likely end up making the GPU app run as slow as or even slower than Rosetta's CPU apps. The key to fast GPU apps is to keep all those stream processors busy and not have them waiting for data to be copied to/from the GPU. GPUs may have double precision capabilities, but they don't necessarily round the same way as a CPU does, and the number of digits of precision may or may not be the same. Collatz didn't have to worry about that because it uses only integer math (192-bit integers at present, but integers nonetheless). Collatz was also partly chosen so people without the latest and greatest GPU hardware would still be able to utilize their GPUs while crunching with their CPUs on projects such as Rosetta. Depending upon the project, a CPU may be the best tool for the job regardless of how highly skilled the project developers are. It sounds like that is the case here. |
Mod.Sense Volunteer moderator Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
This is where you have to think outside the box, slicker :) ...if you combine the hammer and the screwdriver, you can make it through a 2x4 in no time! Rosetta Moderator: Mod.Sense |
Joined: 1 Dec 05 Posts: 2124 Credit: 12,426,657 RAC: 2,579 |
"This is where you have to think outside the box slicker :) ...if you combine the hammer and the screwdriver, you can make it through a 2x4 in no time!" Rosetta may not use OpenCL at all, but...a new version of OpenCL is in town!! OpenCL 1.1 |
Joined: 16 Oct 05 Posts: 711 Credit: 26,694,507 RAC: 0 |
"In comparison, Rosetta uses rather large data files and loading and unloading them from the GPU memory would likely end up making the GPU app run as slow or even slower than Rosetta's CPU apps. The key to fast GPU apps is to keep all those stream processors busy and not have them waiting for data to be copied to/from the GPU. GPUs may have double precision capabilities, but they don't necessarily round the same way as a CPU does and the number of digits of precision may or may not be the same. Collatz didn't have to worry about that because it uses only integer math (192 bit integers at present, but integers none the less). Collatz was also partly chosen so people without the latest and greatest GPU hardware would still be able to utilize their GPUs while crunching with their CPUs on projects such as Rosetta." This is what I did. Went up a few hundred positions in BOINC combined thanks to Collatz (doing around 7,500 RAC, and I have another ATI HD 2600 waiting to crunch) |
Jochen Joined: 6 Jun 06 Posts: 133 Credit: 3,847,433 RAC: 0 |
"This is what I did. Went up a few hundred positions in BOINC combined thanks to collatz (doing around 7,500 RAC, and I have another ATI HD 2600 waiting to crunch)" This is what I did, at least until I realized how much the extra electrical power costs. One 5870 plus a GTX280 = an increase in the electricity bill of 70 Euro per month... (running 24/7) Just my 7000 cents ;) Jochen |
mikey Joined: 5 Jan 06 Posts: 1898 Credit: 12,723,752 RAC: 682 |
"This is what I did. Went up a few hundred positions in BOINC combined thanks to collatz (doing around 7,500 RAC, and I have another ATI HD 2600 waiting to crunch)" Yes, using a GPU is not free, but it is so much FUN!! |
Joined: 15 Aug 09 Posts: 5 Credit: 572,579 RAC: 0 |
Total N00B, so please be gentle when you impale me with my ignorance. I was just curious whether it is not so much a right-tool-for-the-job issue as a question of the right way of going about the issue. Suppose the problem is that you bought a screw & a hammer. You knew, or ought to have known, that it wasn't a good idea before you even started hammering that screw. So if the research isn't planned in a way that is suited for the hardware, then it's not the hardware that is to blame. So now that the future is parallel, why fight it instead of embracing it? If the research isn't possible in the way it's done now, how about finding a way to make the research possible in the way that has the best potential? If I wanted to go from A to B, I could take a car, or I could take a bus. I wanted to go from A to B, but so did a lot of others, unless B wasn't that interesting a place to go. If that be the case, maybe I should go to where everybody else was going? All this is abstract & unspecific. But general rules do apply, even where specific tasks are involved. It would be nice if Rosetta could take the leap of faith, instead of coming up with reasons not to... You could have a really nice car, but you're a cab driver. Instead of making that same trip 30 times in 30 different directions. Those you transport aren't that interested in going sightseeing in your cab, they just want to go to where they have to go, & that's the same place. If you were a bus driver, you could have taken all 30 in one go. If this example is relevant, then it's more a question of logistics & conservative programming. A bus has to have a predefined route, a fixed schedule, & other buses ready to make that same trip so that everybody doesn't depend only on you. It's not even a question of RISC vs. CISC. But even if that be the case, the iPad is doing quite well, and Ubuntu also supports ARM. Not that you would use one for BOINC, but if ARM served the same role as Intel/AMD, it would have to. 
A math professor could outwit a room full of students. But if it were simple math & lots of it, the room full of students would overwhelm the professor, simply because it all had to be written down & the professor only has two hands. Even if he were the fastest cowboy math teacher in the west, he'd get tired. |
Mod.Sense Volunteer moderator Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
I don't think anyone is blaming hardware or research methodology. I believe it is simply a matter of existing, established and tested code and systems as compared to a new environment. If you can do 100 hours of work and begin testing on a new protocol under the existing system, or 1000 hours of work to rearchitect all of the underlying code to run on a new environment, you will need some pretty compelling reasons to make such a leap. The bulk of the Rosetta code is reused again and again as new protocols are developed. And so this existing legacy code creates a hurdle. This is the way essentially all software systems work and a decision that all IT folks are constantly weighing. Not a problem unique to Rosetta by any means. If you rework all of the old routines, you have to debug, retest, correct initial errors, retrain your developers and perhaps even retool your development environments. And that just gets you on to a new platform. No one likes to go to the boss and say "...well, it's been 6 months of work and all the old routines are now running the way they always used to". The boss tends to say "...but what about the new functionality we've had on our 'to-do' list?" Rosetta is very memory intensive, and GPUs are typically very fast, but do not have large memory spaces. And so GPU coding requires careful use and optimization of memory or you are unable to utilize all of the hardware resource. Hence the major coding effort required. Rosetta Moderator: Mod.Sense |
Joined: 15 Aug 09 Posts: 5 Credit: 572,579 RAC: 0 |
"I don't think anyone is blaming hardware or research methodology. I believe it is simply a matter of existing, established and tested code and systems as compared to a new environment. If you can do 100 hours of work and begin testing on a new protocol under the existing system, or 1000 hours of work to rearchitect all of the underlaying code to run on a new environment, you will need some pretty compelling reasons to make such a leap." Point taken, & the reasons are few, possibly vague. But here they are: A) 10x the initial time invested would quickly be worthwhile if the increase in speed is between 10 and 100 times. B) GPUs are more watt-efficient than CPUs. Even though Nvidia Fermi consumes a lot of power, the amount of work achieved for the power used to feed the beast is far greater than with a CPU. This in itself isn't an argument for those who run the DC project, but rather for those who contribute to the DC project. Those who use ATI Cypress, even more so. C) Both Nvidia & ATI are still in their infancy, but the time is ripe to take the leap & get on board, as this is how baby giants learn to speak, & you'd want them to speak English. But just as there are pros, here are cons: A) Rosetta@home isn't big, neither is BOINC. There is a vast ocean of CPUs & BOINC itself is just a pond. WCG is even going the other way around, from small to micro WUs via a Facebook app. It's not hard to install the BOINC client, but that requires effort, & even more effort for those interested in contributing to GPU DC projects. B) The iPad is out, & there'll be more to come, running either Mac or Ubuntu. These low-power devices will be running light, there will be many, & cloud computing will grow. So if DC projects have the choice, via money donations, would they choose to spend their cash on GRID or cloud? C) If GPGPU is the future, power-efficient, & affordable supercomputing, 
projects like Rosetta might choose to rely only on money donations & do all the work themselves on their own HPC run by Nvidia or ATI GPUs. |
Joined: 15 Aug 09 Posts: 5 Credit: 572,579 RAC: 0 |
Yet another reason to go GPU is better explained here: http://www.isgtw.org/?pid=1002557 Which made me curious. If parallel programming offered greater support for GPUs, could it also be used in the same way as the WCG app, which sends lots of micro WUs out & back, provided it were made easy & broadband speeds increase & latency decreases as a result of the need for speed in cloud computing? |
Joined: 2 May 10 Posts: 220 Credit: 9,106,918 RAC: 0 |
Pardon me if I'm wrong, as I am speaking beyond the boundaries of my knowledge, but I was always under the impression that the current generation of GPUs were less precise in their floating point operations than traditional CPUs are. I also understood that, like the vector processors on older mainframes, GPUs were great for processing arrays of independent data, but when serialization came into play things really slowed down. A really, really simple example to try and illustrate my crippled thought process: First step: A + B = C. Second step: C - Z = Y. Since the second step of the process depends on the result of the first step, you lose a certain amount of parallelism. Further, if they are like the old vector processors, a specific cell in the array could be either an input cell or an output cell, and transferring data between them was not real fast. But I could be completely wrong, and often am. (oh I really miss 80 column punch cards and the rooms full of ladies who created them ...) |
Joined: 15 Aug 09 Posts: 5 Credit: 572,579 RAC: 0 |
"Pardon me if I'm wrong as I am speaking beyond the boundaries of my knowledge but I was always under the impression that the current generation of GPUs were less precise in their floating point operations than traditional CPUs are." Ouch! That was a painful impalement! Love it when that happens! Wasn't my intent to let that happen, but it did. My GPU doesn't use the PCI, but it could. Sorry for being the N00B I said I was. A brain cell died, so did I get a wrong answer, or did I just die? |
Joined: 2 May 10 Posts: 220 Credit: 9,106,918 RAC: 0 |
Hey, there is no ouch to be had. I don't know if you are right or wrong - I was trying to relate my understanding of the situation with regard to GPU processing - and communicating thoughts clearly is often difficult for me. Please note the emphasis is on "my understanding" - heck, you could be a lot closer to the bull's eye than I was. However, I think that there is one concept that we both can agree on, and that is the concept of pragmatism. Go with what works. I think that if GPU/stream/array processing fit the mathematics and the processing paradigm at Rosetta, we would have seen it implemented already. For you and me the project is a worthy outlet for the CPU cycles generated by our hobby, or maybe a way of showing support for a cause that is important to us. However, for the good folks at the lab (I assume most of them are good folks, but you know about the word assume) this is their life's calling, something they make sacrifices for beyond eating day-old doughnuts. And I am confident that if the state of the art was such that they could get the data they needed quicker, faster, and cheaper using GPU processing, we would already have it. But that's just my opinion, and the cliche covering the word "opinion" is almost as good as the one for "assume" |
Joined: 3 Nov 05 Posts: 1834 Credit: 124,260,318 RAC: 9 |
"or did I just die?" I'm sure you'll be alive again ready for next week ;) |
Joined: 15 Aug 09 Posts: 5 Credit: 572,579 RAC: 0 |
"or did I just die?" Not if U keep on killing me. I'm not immortal, so implying that my single cell just died was a pun & a slap. I wasn't talking about sending WUs out in parallel w/o checking for errors, across different PCs running on different platforms on different hardware, with a different rate of return. I was just talking about micro WUs taking very little time to process, making full use of increased broadband speed & low latency. It was the sending, checking, sorting, & return on the part of Rosetta@home I was talking about, & whether they could share it with us if they themselves use GPUs. Just like the serial PCI-e lane & the GPU. As said, I'm a N00B. Please don't slap me in the face when I say something stupid. To me, it's a PC. I bought it because I had the perception that what I bought wasn't a bad deal at the time. If I got cheated, I'd like to know why, so I don't do it again. Most commercials I watch just tell me to buy, because it's THAT GOOD! Posting on this forum, I get to ask why my CPU is used 24/7 if I also have a GPU, & even add, with my limited misconceptions, some why, how, & really??? |
mikey Joined: 5 Jan 06 Posts: 1898 Credit: 12,723,752 RAC: 682 |
"or did I just die?" Right now Aqua is the only BOINC project I know of that has gotten true parallel processing to work, and I am not sure it is true parallel processing. They use all of your CPUs, up to a max of 8, on one workunit all at once. All other BOINC projects crunch one workunit per CPU. Now there is one BOINC project, I think, that uses more than one GPU at the same time, if you have them, but it is limited to a max of 2 right now: DNETC. However, that being said, I am not sure how 'parallel' the processing really is. It could be fully independent or it could be dependent on the previous answer to get the next one. Heck, even Photoshop will use more than one of your CPUs, if you have them, to process its data. As for Rosetta and using a GPU, they have looked at the process and decided that a full rewrite of the project would be required, and then of course the requisite testing, rewriting, retesting etc., to move to a GPU-based workunit project. This would involve untold thousands of dollars, maybe even hundreds of thousands, and is not cost effective right now. What they have works, and GPU crunching is still very cutting edge. Over 95% of people still do not crunch with their GPU, so spending the money to make the 5% or fewer happy is not cost effective either. They have said that if the situation warrants it they will revisit the issue, but for now it is what it is and what they have works fine. As for brain cells, we all make and lose them every day. Unfortunately we don't make as many as we are losing!! And as we get older that process gets even more dramatic. If we could turn that around...well, the possibilities are endless!! |
Mtitu Joined: 2 Dec 05 Posts: 2 Credit: 284,489 RAC: 0 |
When will it be possible to use OpenCL with ATI GPUs? |
Joined: 16 Oct 05 Posts: 711 Credit: 26,694,507 RAC: 0 |
"This is what I did. Went up a few hundred positions in BOINC combined thanks to collatz (doing around 7,500 RAC, and I have another ATI HD 2600 waiting to crunch)" Wow. Forced the GPU and CPU to run on my laptop and then unplugged it. 20 MINUTES LEFT @ 98% charge! Talk about power hungry. |
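
For anyone who wants to see the "independent work vs. dependent steps" point from this thread in code: slicker's post says Collatz repeats the same small integer calculation and that no result affects the next, while the later "A + B = C, then C - Z = Y" post describes steps that must wait on each other. Below is a rough Python sketch of both cases. It is only an illustration and is not taken from the Collatz or Rosetta code. The function names (`collatz_steps`, `steps_for_range`, `dependent_chain`) are made up for this example, and a thread pool is just a stand-in for a GPU's stream processors.

```python
from concurrent.futures import ThreadPoolExecutor


def collatz_steps(n: int) -> int:
    """Count how many Collatz steps it takes n to reach 1 (integer math only)."""
    steps = 0
    while n != 1:
        # Halve even numbers, otherwise apply 3n + 1.
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps


def steps_for_range(start: int, stop: int) -> list[int]:
    """Compute step counts for start..stop-1.

    Each starting value is independent of every other one, so the work
    can be handed to workers in any order. pool.map still returns the
    results in input order.
    """
    with ThreadPoolExecutor() as pool:
        return list(pool.map(collatz_steps, range(start, stop)))


def dependent_chain(a: int, b: int, z: int) -> int:
    """The two-step example from the thread: step 2 needs step 1's result."""
    c = a + b  # First step: A + B = C
    y = c - z  # Second step: C - Z = Y, cannot start until C is known
    return y


print(steps_for_range(1, 6))
print(dependent_chain(2, 3, 1))
```

The first function pair can be split across as many workers as you have, because no worker needs another worker's answer. The second cannot be split at all: the subtraction has nothing to work on until the addition finishes.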
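
A second sketch, on the point slicker and Mod.Sense both make above: copying large data files to and from GPU memory can eat up whatever the GPU gains in raw speed. The toy model below assumes GPU time is simply the CPU time divided by a kernel speedup, plus a fixed transfer cost. That model and every number in it are assumptions for illustration only. Nothing here was measured on Rosetta or any real GPU, and `effective_speedup` is a hypothetical helper name.

```python
def effective_speedup(cpu_seconds: float,
                      kernel_speedup: float,
                      transfer_seconds: float) -> float:
    """Overall speedup once data-copy time is added to the GPU run.

    Assumed model: gpu_time = cpu_time / kernel_speedup + transfer_time.
    """
    gpu_seconds = cpu_seconds / kernel_speedup + transfer_seconds
    return cpu_seconds / gpu_seconds


# Hypothetical task: 100 s on the CPU, compute part 50x faster on the GPU.
for transfer in (0.0, 98.0, 198.0):
    print(transfer, effective_speedup(100.0, 50.0, transfer))
```

With no copying the full 50x shows up. With 98 seconds of copying the GPU version only breaks even with the CPU, and with 198 seconds it runs at half the CPU's speed, which is the "as slow or even slower" outcome described in the thread.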
©2025 University of Washington
https://www.bakerlab.org