ROSETTA MUST ENABLE GPU UNITS

Message boards : Number crunching : ROSETTA MUST ENABLE GPU UNITS

Murasaki
Joined: 20 Apr 06
Posts: 303
Credit: 511,418
RAC: 0
Message 76913 - Posted: 29 Jun 2014, 14:59:09 UTC - in response to Message 76911.  
Last modified: 29 Jun 2014, 15:03:22 UTC

You have hit the nail on the head right there. Rosetta tasks can take anything from 100MB to 1,000MB+ per core to process. There is no way the current generation of GPUs can provide that much dedicated memory per core.


All the dedicated/discrete (not the ones that come with the motherboard) GPUs these days come with at least 1GB of dedicated graphics memory - even the cheapest ones for $30 or so.


Per core?

GPUs have thousands of cores, so a 1GB per core board would require at least 1TB of RAM. I don't think we are there yet with the current generation.

If you look in detail at the specs for current GPUs you will probably find about 0.01 to 2.00 MB per core. Nowhere near enough for what Rosetta appears to need.
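The arithmetic behind the per-core figures can be sketched in a few lines of Python. This is only a back-of-the-envelope illustration: the core counts and card sizes are hypothetical round numbers, not the specs of any particular GPU, and it assumes (as the argument above does) that board memory is split evenly with one task per core.

```python
# Back-of-the-envelope memory-per-core arithmetic (illustrative figures only).
MIB = 1024 ** 2
GIB = 1024 ** 3


def memory_per_core_mib(total_bytes: int, cores: int) -> float:
    """Board memory available to each core, in MiB, if split evenly."""
    return total_bytes / cores / MIB


def board_memory_needed_gib(per_core_bytes: int, cores: int) -> float:
    """Total board memory, in GiB, needed to give every core per_core_bytes."""
    return per_core_bytes * cores / GIB


# Hypothetical card: 1 GiB of memory shared by 2,048 cores.
print(memory_per_core_mib(1 * GIB, 2048))        # 0.5

# Giving each of 1,024 cores its own 1 GiB needs 1,024 GiB (1 TiB) on the board.
print(board_memory_needed_gib(1 * GIB, 1024))    # 1024.0

# The low-end Rosetta figure quoted above: 100 MiB per core across 2,048 cores.
print(board_memory_needed_gib(100 * MIB, 2048))  # 200.0
```

Even at the low end of the 100MB to 1,000MB+ range, the hypothetical 2,048-core card would need about 200 GiB to run one task per core.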


even though this may be true...

...
...

However, I'm no expert in this arena and could only contribute small findings such as these from a Google search.


Sorry if I am sounding harsh here, but I have read hundreds (thousands?) of posts on this forum asking/begging/demanding that the Rosetta team develop a GPU application. Often these requests come from laymen who have heard that another project has managed to get good results from GPU processing and expect everyone else can do the same.

The Rosetta scientists seem to review new advances in processor technology every couple of years. At the last review it was found that the technology will not be able to give the performance that Rosetta needs.

I for one am willing to trust the word of the scientists who have investigated the technology and discounted it, over the long stream of people who have only researched performance statistics on the internet.
CZ
Joined: 4 Apr 14
Posts: 17
Credit: 78,584
RAC: 0
Message 76916 - Posted: 29 Jun 2014, 18:24:54 UTC - in response to Message 76913.  
Last modified: 29 Jun 2014, 18:43:53 UTC



The Rosetta scientists seem to review new advances in processor technology every couple of years. At the last review it was found that the technology will not be able to give the performance that Rosetta needs.


Until they publish what steps they have taken to investigate the possibility of porting their code (or portions of it) to CUDA or OpenCL and the reasons for concluding that it's impossible, I will tend to believe they simply have come to the conclusion that they have enough people already doing the work on CPUs - why should we bother spending time and money on developing and testing new code that may or may not work on a GPU.

At the machine level it is all processed in a binary language of ones and zeros, and the GPU can perform these calculations much faster than CPUs. The whole problem lies in developing higher-level language code to do this, but ATI and NVidia provide tons of resources to help developers optimise applications for GPUs. I bet the Rosetta team haven't even bothered to have a good look at the basic tools available to them, such as the ones NVidia describes here: "Performance analysis tools are also great for identifying performance bottlenecks in your CPU code that can be eliminated by moving computationally intensive algorithms to the GPU" https://developer.nvidia.com/performance-analysis-tools

The code for CPU and GPU processing appears similar enough http://www.nvidia.com/object/cuda-parallel-computing-platform.html

There are many tools available to Rosetta project developers (just from NVidia alone): https://developer.nvidia.com/cuda-zone and https://developer.nvidia.com/cuda-tools-ecosystem.

I for one am willing to trust the word of the scientists who have investigated the technology and discounted it, over the long stream of people who have only researched performance statistics on the internet


Some posts from the Rosetta team about GPU processing and how it applies to the Rosetta project:

22nd of May 2013 - Lucas from Baker Lab "GPU computing is much closer to reality for Rosetta, as it is completely routine for many other compute tasks. There is actually some preliminary work on this for Rosetta from a couple of developers as many tasks are amenable to GPU algorithms. It is just preliminary, proof-of-concept, and the fact is that as an academic group there is not a ton of man-power around to devote to the coding task of fully implementing this work. Could happen soon though" https://boinc.bakerlab.org/forum_thread.php?id=6185&nowrap=true#75653

12th of June 2013 - "Hi, I'm one of the rosetta devs Lucas mentioned above. Every time I post on forums I seem to get myself in trouble (probably because I do things like show people slides, which I'm doing again... I never learn...), so please be nice!

Rosetta for GPU in general is pretty tough for a few reasons, mainly that it's tough to write large general-purpose GPU code -- and calculations we do in rosetta-land are incredibly diverse. A while back I'd figured out how to do a "refold," one of many basic underpinnings of rosetta modeling, efficiently on the GPU. Here's a link to some slides (towards the end, I think) for the curious.

https://www.dropbox.com/s/6qkp6spzpz0cfs4/11_08_13_GPU.pdf

To date, we've had only one "application" of GPU calculations that I'm aware of, doing some entropic analysis of the effect of varying the size of floppy "linkers" attaching large protein components together. Here's some slides going over that:

https://www.dropbox.com/s/9gs20q4kaa0e8wp/12_04_05_GM_pub.pdf

I should note that the task in the above slides is very very simple and special-purpose... for that exact purpose the GPU turned out to work pretty well, but adapting the code to do more proved difficult enough that I'm not personally using it anymore. I've done similar tasks using CPU-based algorithms more recently and have gotten close to the GPU performance through various computational trickery which is impossible on current GPUs.

12th of June 2013 (another thread) - "We are primarily engaged in a few tasks here, all of which we use boinc for:
1) Making better algorithms to predict structures. Mod.Sense pointed this out. Much of the use of boinc is to test out variants of algorithms, totally new ideas, etc.
2) Improving the scoring functions in those algorithms. This gets pretty technical, but you can think of Rosetta software as a search algorithm -- it needs to look around (sampling) and it needs to evaluate what it finds (scoring). Boinc is used to test new methods of scoring, aka new ways to evaluate structures. These methods help structure prediction (1, above) and sequence design (3) below.
3) Design of new proteins for new tasks. This is the inverse of problem (1) where we know the sequence and are predicting the structure. Here we have a structure, or multiple structure ideas, in mind and we want to design a protein that takes on that structure. The structure could be an influenza binder, or a new enzyme to treat a disease. We run Rosetta algorithms to design new sequences for a given structure, and often run that on boinc.
4) When we make a new design, how do we know that it will look the way we want it to? Well, we put it back into step (1) on boinc, to test if it is at least self-consistent. If boinc doesn't give us back the structure we are trying to make, we might be in trouble.

The majority of folks here in the lab are working on (3) and some are doing (4), and many of us use boinc as a vital tool to make design and design evaluation possible" https://boinc.bakerlab.org/rosetta/forum_thread.php?id=5185&nowrap=true#75755

26th June 2014 - "boboviz to answer your question: GPU code for Rosetta is not being actively pursued in our lab. Will posted lots of stuff about his work for GPUs a few days ago, but as he says what he did ends up being very specialized and is not of general use. We are pursuing lots of other algorithm improvements, including Will's work, that use standard CPUs and we think that in many cases we are more limited by our ability to come up with correct and fast algorithms than we are by anything that could be solved with a GPU. That being said, rosettaathome is an incredibly powerful tool for us to develop new algorithms with, and many of us are working on new algorithm development" https://boinc.bakerlab.org/forum_thread.php?id=6185&nowrap=true#75809


In the May 2013 post they say GPU is a real possibility (and that it is routine in other projects) but from the last post on the 26th of June 2014 it is clear that they are not even looking into possibilities of GPU processing for the Rosetta project (because it would appear they have enough trouble working on the CPU code and developing new algorithms for the CPU).
Murasaki
Joined: 20 Apr 06
Posts: 303
Credit: 511,418
RAC: 0
Message 76918 - Posted: 29 Jun 2014, 20:35:35 UTC - in response to Message 76916.  
Last modified: 29 Jun 2014, 20:48:34 UTC

Until they publish what steps they have taken to investigate the possibility of porting their code (or portions of it) to CUDA or OpenCL and the reasons for concluding that it's impossible, I will tend to believe they simply have come to the conclusion that they have enough people already doing the work on CPUs - why should we bother spending time and money on developing and testing new code that may or may not work on a GPU.


I don't see how your post differs much from what I have said already. They have looked at it and have decided that the benefits they may be able to gain do not justify the effort. The key difference seems to be that you want the scientists to prove that the benefit is small where I just say to trust them and let them get on with their jobs.


At the machine level it is all processed in a binary language of ones and zeros, and the GPU can perform these calculations much faster than CPUs. The whole problem lies in developing higher-level language code to do this, but ATI and NVidia provide tons of resources to help developers optimise applications for GPUs. I bet the Rosetta team haven't even bothered to have a good look at the basic tools available to them, such as the ones NVidia describes here: "Performance analysis tools are also great for identifying performance bottlenecks in your CPU code that can be eliminated by moving computationally intensive algorithms to the GPU"


A rather big generalisation there. A GPU only performs faster than a CPU when it can resolve the calculation within its own resources. If you have a calculation that requires the CPU to allocate system memory to the GPU on a regular basis you can end up slower than a dedicated CPU calculation. It all depends on the type of calculation being performed.
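A toy cost model shows how this trade-off can go either way. It is a minimal sketch with made-up numbers: the function names, the bandwidth, and the per-transfer latency are assumptions for illustration, not measurements of Rosetta or of any real card.

```python
def gpu_total_seconds(compute_s: float, bytes_moved: float,
                      bandwidth_bytes_per_s: float,
                      transfers: int, latency_s: float) -> float:
    """GPU compute time plus the cost of moving data between host and device."""
    transfer_s = bytes_moved / bandwidth_bytes_per_s + transfers * latency_s
    return compute_s + transfer_s


def gpu_is_faster(cpu_s: float, **gpu_costs: float) -> bool:
    """True when the GPU run, including transfers, beats the CPU-only run."""
    return gpu_total_seconds(**gpu_costs) < cpu_s


# Hypothetical job: 10 s on the CPU, 1 s of pure compute on the GPU,
# an 8 GB/s link, and 1,000 transfers costing 10 microseconds each.
link = dict(compute_s=1.0, bandwidth_bytes_per_s=8e9,
            transfers=1000, latency_s=1e-5)

# The working set mostly stays on the card: 8 GB moved in total.
print(gpu_is_faster(10.0, bytes_moved=8e9, **link))   # True

# The CPU keeps feeding the card from system memory: 80 GB moved in total.
print(gpu_is_faster(10.0, bytes_moved=80e9, **link))  # False
```

With the same tenfold compute advantage, the second case loses to the CPU purely because of the data it has to shuttle across the bus.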


In the May 2013 post they say GPU is a real possibility (and that it is routine in other projects) but from the last post on the 26th of June 2014 it is clear that they are not even looking into possibilities of GPU processing for the Rosetta project (because it would appear they have enough trouble working on the CPU code and developing new algorithms for the CPU).


Or to put it another way, they have reviewed and tested work with GPUs and concluded that currently the technology does not benefit the majority of their work. As pointed out several times in the threads you quoted, the GPU work only seems to benefit a limited field of specialised tasks.
CZ
Joined: 4 Apr 14
Posts: 17
Credit: 78,584
RAC: 0
Message 76919 - Posted: 29 Jun 2014, 21:14:33 UTC - in response to Message 76897.  
Last modified: 29 Jun 2014, 21:15:47 UTC



I assume that when these work units are validated (by another user)...


To maximise the search of the billions of possible results there is no cross-validation of tasks in Rosetta. Instead, if several results return low-energy points with a similar structure, the scientists conduct a more detailed calculation in that area. The second set of results then provides more detailed answers to either validate or discount the original results.


Wow - then how can they be sure people are not abusing the system and claiming credit for WUs not performed?

All one would need to do is to send a whole heap of fabricated results back to the Rosetta server where the energy levels were never low enough to be investigated and validated - no real work would need to be performed on the units at all.

Worse still - IF this happened, then the whole Rosetta science project could be affected by a whole heap of false negatives.
Murasaki
Joined: 20 Apr 06
Posts: 303
Credit: 511,418
RAC: 0
Message 76920 - Posted: 29 Jun 2014, 21:42:34 UTC - in response to Message 76919.  
Last modified: 29 Jun 2014, 22:04:00 UTC

Wow - then how can they be sure people are not abusing the system and claiming credit for WUs not performed?

All one would need to do is to send a whole heap of fabricated results back to the Rosetta server where the energy levels were never low enough to be investigated and validated - no real work would need to be performed on the units at all.

Worse still - IF this happened, then the whole Rosetta science project could be affected by a whole heap of false negatives.


Deliberate falsification of data is perhaps a risk, but there are several factors that would count against it:

1) The person falsifying the data would have to understand the returning data well enough to use a format that wouldn't trigger an error report. With several different experiments and hundreds of thousands of different protein structures getting fired out to participants each day I would guess that would be a difficult job.

2) The Rosetta team refuses to get into an "arms race" with the other BOINC projects to artificially inflate the credit they award. The other projects would only increase their own awards in response. As such, Rosetta gives on average a lower credit return per CPU time than most other BOINC projects. If someone is going to submit fraudulent data to boost their credit, I expect they would target one of the more rewarding projects.

3) It would be almost impossible for someone to get all the tasks searching in a particular structural region. As such, any false results would stand out among the general energy levels reported by other users for that region. A few erroneous results wouldn't stand out, but if you are pumping through enough false results to make the fraudulent credit claim worthwhile, I expect the scientists will spot something fishy fairly quickly.

4) Some of the Rosetta work involves testing improvements to the algorithms against known structures (elements 1 & 2 in the 12th of June 2013 post you quoted above). Any mass falsification of data where the proper results are already known would get flagged up fairly quickly.

5) BOINC records the performance specifications of the computers it is linked to. If someone does spoof their data they will have to make sure they don't send back more than their computer could handle in a particular time period.
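Points 3 and 5 can be sketched as two simple plausibility checks. To be clear, this is not how Rosetta or BOINC actually screen results; the function names, the median-based outlier rule, and the threshold are all assumptions chosen only to illustrate the idea.

```python
from statistics import median


def flag_energy_outliers(energies, threshold=5.0):
    """Return indices of energies far from the median of their region.

    Distance is measured in median absolute deviations (MADs); the
    threshold of 5 is an arbitrary illustrative choice.
    """
    med = median(energies)
    mad = median(abs(e - med) for e in energies)
    if mad == 0:
        # All typical values are identical, so anything different stands out.
        return [i for i, e in enumerate(energies) if e != med]
    return [i for i, e in enumerate(energies)
            if abs(e - med) / mad > threshold]


def exceeds_plausible_throughput(claimed_cpu_seconds, wall_seconds, cores):
    """True if a host claims more CPU time than its cores could have supplied."""
    return claimed_cpu_seconds > wall_seconds * cores


# Hypothetical energies reported by different users for one structural region.
region = [-120.0, -118.5, -121.2, -119.8, -40.0, -120.5]
print(flag_energy_outliers(region))                    # [4]

# A hypothetical 4-core host cannot supply more than 14,400 CPU-seconds per hour.
print(exceeds_plausible_throughput(20000, 3600, 4))    # True
print(exceeds_plausible_throughput(12000, 3600, 4))    # False
```

A single odd value like the -40.0 above could just as easily be an honest bad trajectory, which is why a few erroneous results wouldn't raise suspicion while a steady stream of them from one host would.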
Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 76924 - Posted: 30 Jun 2014, 1:14:44 UTC

There are 150 developers in 23 universities and laboratories working on various aspects of the coding (Rosetta Commons). So, the idea has been considered from many many perspectives. And, as you quoted, some serious efforts have been made to utilize GPUs.

No one is dismissing the value of GPU to computing. And with more and more on-chip memory their versatility improves. But even those making and selling GPUs do not attempt to argue that computing is all just ones and zeros, and therefore GPUs should be used for everything. They have a specific subset of compute tasks that they do very very well... such as graphics. But the "GP" does not stand for "General Purpose".

In past posts I have likened the comparison between CPU and GPU as being similar to the difference between a school bus and a race car. The race car's acceleration and top speed are certainly superior, but that doesn't mean it can be used effectively to bring all of the children to school if that is the task at hand.
Rosetta Moderator: Mod.Sense
CZ
Joined: 4 Apr 14
Posts: 17
Credit: 78,584
RAC: 0
Message 76925 - Posted: 30 Jun 2014, 1:33:09 UTC - in response to Message 76924.  
Last modified: 30 Jun 2014, 1:57:45 UTC

Deliberate falsification of data is perhaps a risk, but there are several factors that would count against it ...


and

There are 150 developers in 23 universities and laboratories working on various aspects of the coding (Rosetta Commons). So, the idea has been considered from many many perspectives. And, as you quoted, some serious efforts have been made to utilize GPUs.

No one is dismissing the value of GPU to computing. And with more and more on-chip memory their versatility improves. But even those making and selling GPUs do not attempt to argue that computing is all just ones and zeros, and therefore GPUs should be used for everything. They have a specific subset of compute tasks that they do very very well... such as graphics. But the "GP" does not stand for "General Purpose".

In past posts I have likened the comparison between CPU and GPU as being similar to the difference between a school bus and a race car. The race car's acceleration and top speed are certainly superior, but that doesn't mean it can be used effectively to bring all of the children to school if that is the task at hand.


Thank you for your feedback, I feel that reasonable consideration has been given to the GPU processing issue (for whatever my opinion is worth).

I am now happy again to support Rosetta@home with my CPU time (whether or not GPU processing is enabled).

I wish the scientists, and all involved in the project, the best of luck - I feel the project is one of the most worthwhile of all scientific efforts in the distributed computing space, and I am once again happy to devote as much of my computing effort as I can to the tasks Rosetta@home has set for our CPUs.
©2021 University of Washington
https://www.bakerlab.org