Rosetta@home

Welcome from David Baker



Welcome to the Rosetta@home distributed computing project and thank you for joining!


You will be helping us to solve one of the longest-standing problems in molecular biology: the "protein folding" problem.

Proteins are the miniature machines that carry out almost all the important functions in your body. As with any machine, understanding how proteins work requires understanding what their structures are. It has been known for over 40 years that the structures of proteins are completely determined by their amino acid sequences, and we know the amino acid sequences of all proteins in the human genome thanks to the recently completed human genome project. However, until very recently, it has seemed nearly impossible to compute the structures of proteins from their amino acid sequences, and solving this problem has been something of a scientific "Holy Grail".

As you can read in the news releases and in Science magazine, we have made significant progress in the past six months, and for the first time it appears possible to compute protein structures from their sequences. Success with this would have a huge impact on our understanding of how biology works, and even more importantly, could lead to new therapies and vaccines to help cure disease. The major stumbling block is the very large amount of computing time required to solve the problem.

I can explain the computing problem with an analogy. Suppose you are a space explorer and have found a new planet, and have been told that what you have always been looking for lies at the bottom of the deepest valley on the surface of the planet. How do you find this lowest point? One possibility would be to land somewhere on the planet, and search from there. However, if the planet is very large, you are unlikely to have landed close enough to this deep valley to find it. For example, if you landed on Earth, you would be unlikely to land close enough to the shore of the Dead Sea to stumble across it during your exploration--you would most likely be on a different continent, perhaps exploring the Himalayas or the Sahara desert. But what if you had 10,000 dedicated explorers, who would each parachute down to a random position on the planet, search around for the lowest elevation point in the region where they landed, and report back to you the elevation of the lowest point they found? Your chances of finding what you are looking for would be very much larger, and the more explorers you can send out, the greater the chance of success.

Now in our case, the space being searched is not the surface of a planet, but the set of all possible structures that a protein can have. There is a very large number of possible structures because there are over one hundred different places where the protein can bend or twist in different ways. Remarkably, despite the very large number of possibilities, proteins fold up into single, well-defined structures which allow them to carry out their biological functions. The special property of these "folded" structures is that they have lower energy than any other structures the protein could adopt. So rather than searching for the lowest elevation location, we are searching for the lowest energy structure, but conceptually the problem is very similar to the example I gave in the preceding paragraph.

So you can think of what your computer will be doing in the following way. At the beginning of the calculation, it will parachute onto a randomly chosen region of the energy landscape, and then hunt for the lowest energy point in the neighborhood. At the end, it will send the lowest energy structure that it found back to our server, along with the energy of this structure. Our server will compare the energies of all the low energy structures found by all of the participating processors, and the lowest energy structure overall will be identified.
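
For readers who like to see things in code, the explorer picture can be sketched as a toy program. This is only an illustration under stated assumptions, not the Rosetta software: the `energy` function is a made-up two-dimensional landscape standing in for the real protein energy, and the names `explore` and `search` are hypothetical. Each explorer lands at a random point and only ever steps downhill, and the "server" keeps the lowest energy reported by any of them.

```python
import math
import random


def energy(x, y):
    """Toy 2-D landscape with many valleys (a hypothetical stand-in
    for the real protein energy, chosen only for illustration)."""
    return 0.1 * (x * x + y * y) + math.sin(3 * x) * math.cos(3 * y)


def explore(rng, steps=2000, step_size=0.1, span=10.0):
    """One explorer: land at a random point, then accept only
    small random steps that lower the energy."""
    x, y = rng.uniform(-span, span), rng.uniform(-span, span)
    best = energy(x, y)
    for _ in range(steps):
        nx = x + rng.uniform(-step_size, step_size)
        ny = y + rng.uniform(-step_size, step_size)
        e = energy(nx, ny)
        if e < best:  # move only if the new point is lower
            x, y, best = nx, ny, e
    return best, (x, y)


def search(num_explorers, seed=0):
    """The 'server': collect every explorer's report and keep the
    lowest energy found overall."""
    rng = random.Random(seed)
    reports = [explore(rng) for _ in range(num_explorers)]
    return min(reports)


if __name__ == "__main__":
    for n in (1, 10, 100):
        best, where = search(n)
        print(n, "explorers -> lowest energy found:", round(best, 4))
```

With a fixed seed, the explorers in a smaller run are a subset of those in a larger run, so sending out more explorers can never give a worse answer in this sketch, and in practice it usually gives a better one.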

Initially, we will take advantage of the fact that the lowest energy structures have already been determined for some proteins using complicated, expensive, and laborious experimental techniques I won't try to explain here. We will compare the lowest energy structure found overall to the experimentally determined structure, and see if they are the same. Once we have figured out how much we need to search (how many processors for how long) to be sure to find the lowest energy structure, we will use the approach to compute structures of important proteins with unknown structures. You will have helped to achieve this "Holy Grail" of biological research.

Now, if you have followed this illustration, you will realize that the ultimate solution--the lowest energy structure--will have been found by a single processor. This is like winning a lottery, since the space is so big and there are so many possibilities. As in a lottery, the more time your computer spends searching, the more likely it is to win. We will be keeping track of the lucky winning computer for each of the prediction problems, and the owner will get special notice and credits. See our Top Predictions page for more information.

So have fun, and tell all your friends and relations to join up--this is one of the most important open questions in science today that can potentially be solved by large-scale distributed computing.

Thanks again for helping with our project!!

David Baker

Professor of Biochemistry at the University of Washington
Howard Hughes Medical Institute investigator






    Copyright © 2016 University of Washington

    Last Modified: 10 Nov 2007 5:01:25 UTC