Comments/questions on Rosetta@home journal

Author	Message
eberndl Send message Joined: 17 Sep 05 Posts: 47 Credit: 3,539,274 RAC: 2,140	Message 15532 - Posted: 4 May 2006, 20:39:12 UTC Hugo, From what I understand, the whole point is for Rosetta to be able to predict the structure WITHOUT using the X-ray data. As it stands, that data (or data from protein NMR) is what provides the images in the "Native" box of the screen saver. Although being able to start with a good approximation is very important, it's also important to be able to start from scratch, because most proteins don't have homologues that have been solved yet. Questions? Try the Wiki! Take a look inside my brain ID: 15532 · Rating: 0 · rate: /

tralala Send message Joined: 8 Apr 06 Posts: 376 Credit: 581,806 RAC: 0	Message 15534 - Posted: 4 May 2006, 21:03:09 UTC - in response to Message 15532. Hugo, From what I understand, the whole point is for Rosetta to be able to predict the structure WITHOUT using the X-ray data. As it stands, that data (or data from protein NMR) is what provides the images in the "Native" box of the screen saver. Although being able to start with a good approximation is very important, it's also important to be able to start from scratch, because most proteins don't have homologues that have been solved yet. But as more proteins get solved, the more homologues you will find. Somewhere they mentioned that they can perhaps use "cheap" experimental data, such as visual microscopic data, to get a rough estimate of what the protein will look like. If this computational approach is ever to be useful, they need to be able to predict the structure with limited computing power, either by finding more folding "rules" or algorithms that restrain the conformational space, or by combining it with easy-to-obtain experimental data. ID: 15534 · Rating: 0 · rate: /

TioSuper Send message Joined: 2 May 06 Posts: 17 Credit: 164 RAC: 0	Message 15535 - Posted: 4 May 2006, 21:07:56 UTC - in response to Message 14638. How embarrassing... both the watchdog "killing" and get_the_hell_out() are my silly recent contributions. It's not representative of the rest of the code -- I didn't expect these error messages and functions to get broadcast to such a wide audience! I assure you that the rest of the code is totally dry and scientific. We'll change "killing" to "ending" in the release after the next (I read the message too late). Dry and scientific usually is boring. Anyhow, a quirky and sometimes macabre sense of humour is a sign of genius. :) Keep up the good work and please don't let your humorous side be destroyed by science. (Or science as dry and unhumorous people think it should be.) ID: 15535 · Rating: 0 · rate: /

Feet1st Send message Joined: 30 Dec 05 Posts: 1755 Credit: 4,690,520 RAC: 0	Message 15537 - Posted: 4 May 2006, 22:20:16 UTC - in response to Message 15532. From what I understand, the whole point is for Rosetta to be able to predict the structure WITHOUT using the X-ray data. Hugo made reference to ...early data from x-ray crystallography, or its chemical reactions etc. I read an article which I wanted to link, but still haven't found it. It explained more about the X-ray process. The nutshell I got out of it is that there really is no "early" or "preliminary data". You either have done the complete X-ray study, or you haven't. Once you get the thing to crystallize (which is the hard part I guess, can take months and is considered more of an art than a science), then you take it to a one-of-a-kind huge X-ray device with superpowers and supercomputers, and you scan it from all angles all in an afternoon. The huge expense is that this is no simple hospital X-Ray machine. And it still takes hours of scans, I think the article said they scan every 2 degrees of rotation, to get the 3D image. So, even if you had the BILLION proteins all crystallized, it would STILL take WAY too long (billions of hours) to X-Ray all of them. As for chemical reactions... I believe that is how they know the sequence that the amino acids comprising the protein are in. But, ya, perhaps they could do some other chemistry on it to infer some broader info. to get clues as to how they are arranged. Add this signature to your EMail: Running Microsoft's "System Idle Process" will never help cure cancer, AIDS nor Alzheimer's. But running Rosetta@home just might! https://boinc.bakerlab.org/rosetta/ ID: 15537 · Rating: 0 · rate: /

eberndl Send message Joined: 17 Sep 05 Posts: 47 Credit: 3,539,274 RAC: 2,140	Message 15593 - Posted: 5 May 2006, 21:15:19 UTC So, even if you had the BILLION proteins all crystallized, it would STILL take WAY too long (billions of hours) to X-Ray all of them. Feet1st, you're right it does take hours to Xray each crystal, but it can take weeks or MONTHS to figure out how to crystallize a protein for the first time. And once you have the xrays... you have to run that through a massive computer program (another couple hours), and what that will give is an electron density map, which looks like this: [image] But all those solid yellow lines have to be added BY HAND to the density map. Once this first estimate of the protein is obtained, they do a theoretical Xray on their model protein, and look at the differences between the actual (from the crystal) and theoretical (from the model) scatter graphs, and then they go through and try to reduce the differences as best they can. Again, they have to do this by hand. As for chemical reactions... I believe that is how they know the sequence that the amino acids comprising the protein are in. But, ya, perhaps they could do some other chemistry on it to infer some broader info. to get clues as to how they are arranged. There are 2 main ways to figure out the sequence of a protein: Get the cDNA and convert it to protein, or take the protein and do tandem mass spectrometry on it, which can give you the actual protein sequence (except that it can't tell leucine and isoleucine apart due to their identical masses). Questions? Try the Wiki! Take a look inside my brain ID: 15593 · Rating: 0 · rate: /

Feet1st Send message Joined: 30 Dec 05 Posts: 1755 Credit: 4,690,520 RAC: 0	Message 15594 - Posted: 5 May 2006, 22:58:05 UTC Last modified: 5 May 2006, 22:58:51 UTC Always good stuff from eberndl. I always learn more from your posts. Thanks. If anyone else remembers the article, it talked about an entrepreneur that's working on methods of doing the x-ray crystallography FASTER. And how each protein can cost $100,000s of USD. Either way, existing approaches will take many thousands of years at the current pace, and after that, you still don't really know what you need to know to DESIGN a protein that will target HIV or Alzheimer's. You'd have to experimentally create a bunch of proteins, then do the crystallography and then see if they happen to be a shape that will dock with your target protein. This is why the TOP7 is such a unique thing. An artificial protein. They actually planned it out, and BUILT a protein, and (I think) it took the shape they had predicted (or very close). So, if the prediction mechanism were perfect... then you could use the mass spectroscopy techniques to get the sequence of amino acids, calculate the shape, model another protein that will "dock" with the target, and then produce that new modelled protein. (Hence the REST of the R@H logo about protein design and docking). This is also what CASP is all about. You don't KNOW what it really looks like, how close can you predict based solely on the sequence of amino acids. Some predict by hand, some predict via distributed computing ;) ...net result, when you discover a new virus (SARS, bird flu, HIV...), you could readily create a protein treatment that targets it. And if you've got an existing database of the other billion proteins in the body, then you could create a treatment with minimal side effects (i.e. one that will NOT dock with other proteins in your body, just the diseased ones).
Add this signature to your EMail: Running Microsoft's "System Idle Process" will never help cure cancer, AIDS nor Alzheimer's. But running Rosetta@home just might! https://boinc.bakerlab.org/rosetta/ ID: 15594 · Rating: 0 · rate: /

Dimitris Hatzopoulos Send message Joined: 5 Jan 06 Posts: 336 Credit: 80,939 RAC: 0	Message 15612 - Posted: 6 May 2006, 16:35:40 UTC Last modified: 6 May 2006, 17:01:41 UTC In any event, if you are interested in finding more about our research, and can stand the ums, you can find it at: http://norfolk.cs.washington.edu/htbin-post/unrestricted/colloq/details.cgi?id=449 Just a quick note that even the "regular" (not high-bandwidth version) 320x240 movie available for download from the abovementioned URL is 127MBytes and a bit over 1hr long. I have just watched random 5-10min pieces of the video, but I found it quite understandable (having no background at all in the field). Absolutely worth watching. Best UFO Resources Wikipedia R@h How-To: Join Distributed Computing projects that benefit humanity ID: 15612 · Rating: 0 · rate: /

BennyRop Send message Joined: 17 Dec 05 Posts: 555 Credit: 140,800 RAC: 0	Message 15618 - Posted: 6 May 2006, 18:47:34 UTC Last modified: 6 May 2006, 18:53:52 UTC I remember watching a video around 1980 in high school which was a presentation by the Kodiak Electrical Association about the proposed hydroelectric dam at Terror Lake. It discussed the effect on the bears that lived in the valley that would be turned into a larger lake behind the dam. They had the manager of the local electrical plant on the bank of Terror Lake, being eaten alive by mosquitos, and using a slightly distracting amount of "ums" in the conversation. A couple of years ago, I visited a museum in Pensacola, Florida, and a Floridian that had visited Terror Lake some time in the past had killed a pair of Kodiak bears, had the taxidermist mount them, and sent them to Pensacola where his family donated them to the museum. The point is that I remember when I saw that video.. By halfway through, the "ums" have really calmed down - so if you want to improve the presentation that you'll be making, see if you can do it in ... 5 minute segments. On the first pass, use your hand to hold your lips together on one side of your mouth, and talk with a ridiculous accent. Relax, and then do it again - correctly. (Perhaps there's better advice from those that deal with videotaping.. have attended ToastMaster presentations, etc.) As a benefit to the project, you might be able to use the ridiculous accent version of the video to bribe teams like The Knights Who Say Neee! into doubling their production for a few months, for the right to be the first team to get to see the silly version of the presentation. :) As others have said, the material in the whole 1 hour presentation seemed really easy to understand, especially with description in layman's English about what most of the technobabble was.
(I was informed by a client that she didn't understand a single thing I was telling her about why her computer died..) ID: 15618 · Rating: 1 · rate: /

rbpeake Send message Joined: 25 Sep 05 Posts: 168 Credit: 247,828 RAC: 0	Message 15627 - Posted: 6 May 2006, 21:04:25 UTC - in response to Message 15612. In any event, if you are interested in finding more about our research, and can stand the ums, you can find it at: http://norfolk.cs.washington.edu/htbin-post/unrestricted/colloq/details.cgi?id=449 Absolutely worth watching. I agree, and believe that you are your own worst critic (all of us are), and my focus on your presentation was on what you had to say. I found it fascinating, and a means to get to know you better. So David, I would suggest featuring the link on the project main page. Very worthwhile! Regards, Bob P. ID: 15627 · Rating: 0 · rate: /

Hoelder1in Send message Joined: 30 Sep 05 Posts: 169 Credit: 3,915,947 RAC: 0	Message 15642 - Posted: 7 May 2006, 8:40:39 UTC - in response to Message 15627. Last modified: 7 May 2006, 8:42:06 UTC http://norfolk.cs.washington.edu/htbin-post/unrestricted/colloq/details.cgi?id=449 Absolutely worth watching. I agree, and believe that you are your own worst critic (all of us are), and my focus on your presentation was on what you had to say. I found it fascinating, and a means to get to know you better. So David, I would suggest featuring the link on the project main page. Very worthwhile! Indeed! There is one specific question I wanted to ask, though: right at the end of the lecture there was a statement which seemed to say that the work about cutting DNA at specific sites (as described on the "Disease Related Research" page) was already under way. I would definitely be interested to hear how far this work has proceeded (e.g., in terms of developing this for therapeutic purposes) and what is planned for the future. Thanks, -H. Team betterhumans.com - discuss and celebrate the future - hoelder1in.org ID: 15642 · Rating: 0 · rate: /

Dimitris Hatzopoulos Send message Joined: 5 Jan 06 Posts: 336 Credit: 80,939 RAC: 0	Message 15648 - Posted: 7 May 2006, 15:18:56 UTC Last modified: 7 May 2006, 15:19:40 UTC A question I have (maybe it's answered somewhere on the site and I missed it) is about the current focus. Are there still ongoing changes to the way the Rosetta energy function is computed, or is that part basically finished and nowadays all efforts are towards better / more efficient methods to sample the conformational space? To rephrase it, would the recent difficulties with 1tul be due to the fact that even 1 million structures still won't get one of our "explorers" near the energy minimum? Or was "1tul" one rare case where the energy function didn't work so well? It would be nice to also plot the native structure's Rosetta energy function value on those charts, as you did in the past in turquoise colour. Best UFO Resources Wikipedia R@h How-To: Join Distributed Computing projects that benefit humanity ID: 15648 · Rating: 0 · rate: /

David Baker Volunteer moderator Project administrator Project developer Project scientist Send message Joined: 17 Sep 05 Posts: 705 Credit: 559,847 RAC: 0	Message 15655 - Posted: 7 May 2006, 16:54:19 UTC - in response to Message 15648. A question I have (maybe it's answered somewhere on the site and I missed it) is about the current focus. Are there still ongoing changes to the way the Rosetta energy function is computed, or is that part basically finished and nowadays all efforts are towards better / more efficient methods to sample the conformational space? To rephrase it, would the recent difficulties with 1tul be due to the fact that even 1 million structures still won't get one of our "explorers" near the energy minimum? Or was "1tul" one rare case where the energy function didn't work so well? It would be nice to also plot the native structure's Rosetta energy function value on those charts, as you did in the past in turquoise colour. For 1tul it is very clearly a sampling problem--the native structure has much lower energy but no explorers got anywhere close. We have a number of improved "jumping" methods for larger beta-sheet-containing proteins that we are continuing to develop. Work in the group is currently directed at improving both the energy function and the sampling method, but it is clear that sampling is a far greater problem, particularly for larger, more complex proteins. We should still be plotting the native protein energies, as you point out. ID: 15655 · Rating: 0 · rate: /
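The distinction drawn above between a sampling failure and an energy-function failure can be made concrete with a small sketch. This is illustrative only and is not Rosetta code: the function name `diagnose`, the `near_rmsd` cutoff of 2.0 angstroms, and the example numbers are all assumptions made for this example.

```python
def diagnose(native_energy, decoys, near_rmsd=2.0):
    """Classify a prediction run from (rmsd_to_native, energy) pairs.

    near_rmsd is a hypothetical cutoff (in angstroms) for "close to native".
    Lower energy is better.
    """
    far_energies = [e for r, e in decoys if r > near_rmsd]
    has_near = any(r <= near_rmsd for r, _ in decoys)

    # A far-from-native model scoring better than the native structure
    # means the energy function itself is misleading the search.
    if far_energies and min(far_energies) < native_energy:
        return "energy function"
    # The native scores best, but no explorer ever got close to it.
    if not has_near:
        return "sampling"
    return "consistent"


# Made-up numbers resembling the 1tul description: the native is much
# lower in energy, yet every model is far away in RMSD.
print(diagnose(-150.0, [(8.0, -110.0), (6.5, -120.0)]))  # prints "sampling"
```

Plotting the native structure's energy alongside the models, as requested in the thread, is what makes this kind of check possible in the first place.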

David Baker Volunteer moderator Project administrator Project developer Project scientist Send message Joined: 17 Sep 05 Posts: 705 Credit: 559,847 RAC: 0	Message 15656 - Posted: 7 May 2006, 16:58:05 UTC - in response to Message 15642. http://norfolk.cs.washington.edu/htbin-post/unrestricted/colloq/details.cgi?id=449 Absolutely worth watching. I agree, and believe that you are your own worst critic (all of us are), and my focus on your presentation was on what you had to say. I found it fascinating, and a means to get to know you better. So David, I would suggest featuring the link on the project main page. Very worthwhile! Indeed! There is one specific question I wanted to ask, though: right at the end of the lecture there was a statement which seemed to say that the work about cutting DNA at specific sites (as described on the "Disease Related Research" page) was already under way. I would definitely be interested to hear how far this work has proceeded (e.g., in terms of developing this for therapeutic purposes) and what is planned for the future. Thanks, -H. We have worked out the science/technology needed to create new DNA cleaving enzymes (if you ever look at the scientific journal "Nature" you will see our paper on this in a few weeks). We are currently hard at work trying to create enzymes that cleave within genes that cause disease and within pathogens. Once we have these they can be tested as therapeutics, but this is still a step or two away. ID: 15656 · Rating: 0 · rate: /

BennyRop Send message Joined: 17 Dec 05 Posts: 555 Credit: 140,800 RAC: 0	Message 15708 - Posted: 9 May 2006, 12:03:29 UTC You might also want to think up a way to encourage a production increase in the smaller teams, and the vast horde of those who don't belong to a team at all. Perhaps look for increases in models/month; and weigh those that have added an extra machine or 5 a little higher than those that increase Rosetta's Boinc share from 10% to 100%. (i.e. grab a couple of both). Perhaps picking out one at random so a non teamed member that runs a second cpu for all of Casp7 has a chance of being picked. ID: 15708 · Rating: 0 · rate: /

rbpeake Send message Joined: 25 Sep 05 Posts: 168 Credit: 247,828 RAC: 0	Message 15710 - Posted: 9 May 2006, 12:27:44 UTC - in response to Message 15708. Last modified: 9 May 2006, 12:28:35 UTC You might also want to think up a way to encourage a production increase in the smaller teams, and the vast horde of those who don't belong to a team at all. I agree! I'll bet the vast majority of the computing power comes from the "all others". Don't forget the little guy in your efforts to recruit the big guys! There should be some mechanism by which everyone feels they have a shot at some sort of recognition for their efforts. :) Regards, Bob P. ID: 15710 · Rating: 0 · rate: /

TioSuper Send message Joined: 2 May 06 Posts: 17 Credit: 164 RAC: 0	Message 15711 - Posted: 9 May 2006, 12:28:17 UTC - in response to Message 15708. You might also want to think up a way to encourage a production increase in the smaller teams, and the vast horde of those who don't belong to a team at all. Perhaps look for increases in models/month; and weigh those that have added an extra machine or 5 a little higher than those that increase Rosetta's Boinc share from 10% to 100%. (i.e. grab a couple of both). Perhaps picking out one at random so a non teamed member that runs a second cpu for all of Casp7 has a chance of being picked. Hey, something like the NCAA: a competition within the big teams, the middle teams and the small teams, and a competition for the independents. But let's make this clear: the main goal is to motivate production; the competition cannot be allowed to degenerate into name calling and all the worst things that competition brings out. ID: 15711 · Rating: 0 · rate: /

David Baker Volunteer moderator Project administrator Project developer Project scientist Send message Joined: 17 Sep 05 Posts: 705 Credit: 559,847 RAC: 0	Message 15721 - Posted: 9 May 2006, 15:28:52 UTC - in response to Message 15710. You might also want to think up a way to encourage a production increase in the smaller teams, and the vast horde of those who don't belong to a team at all. I agree! I'll bet the vast majority of the computing power comes from the "all others". Don't forget the little guy in your efforts to recruit the big guys! There should be some mechanism by which everyone feels they have a shot at some sort of recognition for their efforts. :) Hi Bob, any suggestions? we can also have an award for the lowest energy model for each target (we can't do this for the low rmsd model as we are now, because we won't know the true structure!). Rom is starting now to look at the credits issue--I'll direct him to the discussion here. ID: 15721 · Rating: 0 · rate: /
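The "lowest energy model for each target" award suggested above amounts to a per-target minimum. A minimal sketch follows, assuming each returned model is recorded with a target name, a user name, and an energy; the `Model` record, the function name, and the sample data are hypothetical and do not reflect how the project actually stores results.

```python
from typing import NamedTuple


class Model(NamedTuple):
    target: str    # prediction target identifier (hypothetical format)
    user: str      # volunteer whose machine produced the model
    energy: float  # energy of the model; lower is better


def lowest_energy_per_target(models):
    """Return {target: Model} holding the lowest-energy model per target."""
    best = {}
    for m in models:
        if m.target not in best or m.energy < best[m.target].energy:
            best[m.target] = m
    return best


results = [
    Model("target_A", "alice", -101.5),
    Model("target_A", "bob", -120.3),
    Model("target_B", "alice", -88.0),
]
for target, m in sorted(lowest_energy_per_target(results).items()):
    print(target, m.user, m.energy)
```

Unlike an RMSD-based award, this needs no knowledge of the true structure, which is why it is usable while the targets are still unsolved.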

rbpeake Send message Joined: 25 Sep 05 Posts: 168 Credit: 247,828 RAC: 0	Message 15724 - Posted: 9 May 2006, 16:10:34 UTC - in response to Message 15721. Hi Bob, any suggestions? we can also have an award for the lowest energy model for each target (we can't do this for the low rmsd model as we are now, because we won't know the true structure!). Sounds like a good idea! Thinking out loud, I thought in addition perhaps a lottery would be nice, with each "lottery ticket" a structure prediction, or something like that. You would pick winners at random from the pool of structure predictions (whether correct or not is not the relevant issue, the purpose of this lottery is to obtain as many structure predictions as possible). Thus, like with lottery tickets, the more structure prediction "entries" one has, the greater chance one has of winning! Others may have suggestions as well.... Regards, Bob P. ID: 15724 · Rating: 0 · rate: /
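The lottery idea above, where each returned structure prediction counts as one ticket, can be sketched as a weighted random draw. This is only an illustration under assumed inputs: the function name `draw_winners`, the dictionary of per-user model counts, and the rule that a user can win at most once per draw are all choices made for this example, not project policy.

```python
import random


def draw_winners(models_per_user, n_winners, seed=None):
    """Draw up to n_winners distinct users; each model is one ticket."""
    rng = random.Random(seed)
    # Users with no returned models hold no tickets.
    remaining = {u: n for u, n in models_per_user.items() if n > 0}
    winners = []
    while remaining and len(winners) < n_winners:
        users = list(remaining)
        weights = [remaining[u] for u in users]
        pick = rng.choices(users, weights=weights, k=1)[0]
        winners.append(pick)
        del remaining[pick]  # a user can win only once per draw
    return winners


# Hypothetical counts: a large contributor, a small one, and an idle host.
counts = {"big_farm": 900, "single_pc": 100, "idle_host": 0}
print(draw_winners(counts, n_winners=2, seed=2006))
```

Because the chance of winning is proportional to the number of models returned, the small contributor still has a real chance, while every additional model improves anyone's odds.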

Feet1st Send message Joined: 30 Dec 05 Posts: 1755 Credit: 4,690,520 RAC: 0	Message 15746 - Posted: 10 May 2006, 2:45:00 UTC - in response to Message 15594. Last modified: 10 May 2006, 2:45:44 UTC If anyone else remembers the article, it talked about an entrepreneur that's working on methods of doing the x-ray crystallography FASTER. And how each protein can cost $100,000s of USD. OK, I finally found it. I guess it was 10s of thousands, not 100s... either way multiply it by a billion proteins and it's outta MY budget ;) This article in Wired discusses what some others are doing in the pursuit of protein folding. Add this signature to your EMail: Running Microsoft's "System Idle Process" will never help cure cancer, AIDS nor Alzheimer's. But running Rosetta@home just might! https://boinc.bakerlab.org/rosetta/ ID: 15746 · Rating: 0 · rate: /

David Baker Volunteer moderator Project administrator Project developer Project scientist Send message Joined: 17 Sep 05 Posts: 705 Credit: 559,847 RAC: 0	Message 15750 - Posted: 10 May 2006, 4:06:40 UTC - in response to Message 15739. To Dr. Baker: Don't they release the native structure of the protein we're working on - sometime after we're no longer allowed to submit data for that protein to Casp? i.e. Can't we have an RMSD result before the year end comparison data is released? (Just to sate our curiosity?) Yes, we will probably get the solutions to the prediction problems, the native structures, in September or October, and we can definitely post the winners for each target retrospectively. ID: 15750 · Rating: 0 · rate: /
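For readers wondering what the RMSD result mentioned above measures, here is a minimal sketch of the root-mean-square deviation between a model and the native structure. It assumes the two coordinate lists hold the same atoms in the same order and have already been superimposed; a real comparison would first compute an optimal superposition, which is omitted here. The function name and sample coordinates are made up for illustration.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) coordinates.

    Assumes the structures are already superimposed (no fitting is done).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Every atom of the model is displaced by (3, 4, 0) from the native,
# i.e. by a distance of 5, so the RMSD is 5.
native = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
model = [(3.0, 4.0, 0.0), (4.0, 4.0, 0.0)]
print(rmsd(model, native))  # 5.0
```

Since the calculation needs the native coordinates, it can only be run once the experimental structures are released, which is why the winners can only be posted retrospectively.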