Suggestions and Questions

Author	Message
Jocelyn Larouche Joined: 9 Nov 05 Posts: 11 Credit: 6,994 RAC: 0	Message 4690 - Posted: 29 Nov 2005, 16:03:01 UTC Last modified: 29 Nov 2005, 16:03:16 UTC
What I remember from my biology classes is that proteins are made of RNA pairs which are split by enzymes, and the protein then starts to fold. It will split the first pair and the rest works like a chain reaction. The angles between the amino acids are standard until the protein folds back on itself, which causes irregular bends. So why not start from there?

dgnuff Joined: 1 Nov 05 Posts: 350 Credit: 24,773,605 RAC: 0	Message 4703 - Posted: 29 Nov 2005, 18:30:31 UTC - in response to Message 4690. Last modified: 29 Nov 2005, 18:31:59 UTC
That does raise an interesting flock of questions. If any of the team happen by, perhaps they can clear this up. Is it true that proteins are built in the same order every time? Does the folding happen during the build process, or later? What DOES the protein do while it's being built, or more accurately, does it fold when it's partially constructed? If it does, how much does this fold vary from the final shape?
I have a pretty strong suspicion that the shape of a protein will be quite different if you trim one third of the amino acids off one end of it. However, that does raise the question: "If we know the shape that this chain of N amino acids makes, how much work is it to adjust when we add one more?"
I think that where Jocelyn is trying to go with this is whether we can do something using an incremental approach: determine the lowest energy as we add each amino acid, and then refine. This would (of course) require a complete rework of the algorithm, and we'd pretty much have to be processing two or three proteins (or more) in parallel during production runs. Fire out a bunch of WUs for protein A at stage N. We grind on them and get results back. Meanwhile send out B at stage O. Then C at stage P. By the time the C's are on the way out, most of the results from A at N are back, allowing progression to A at N+1. And so on.

Jocelyn Larouche Joined: 9 Nov 05 Posts: 11 Credit: 6,994 RAC: 0	Message 4725 - Posted: 29 Nov 2005, 20:44:04 UTC Last modified: 29 Nov 2005, 20:45:46 UTC
"Is it true that proteins are built in the same order every time?"
Genes have markers that initiate the build sequence, a start and a finish, so a protein cannot be built backward. And yes, dgnuff, you definitely got my point.

FZB Joined: 17 Sep 05 Posts: 84 Credit: 4,948,999 RAC: 0	Message 4750 - Posted: 30 Nov 2005, 0:59:57 UTC
If you mean "in the same order" as in "for a given mRNA", then yes, it always translates into the same protein (assuming there is no error during synthesis, and simplifying a bit). As the protein grows, it starts to fold, though it takes some time until it reaches its final form (different enzymes can attach to the protein and modify it).
The growth happens like this: an mRNA is built in the cell nucleus and transferred out of the nucleus to a ribosome, which encloses it (the ribosome consists of several subunits). Beginning from a start codon (three nucleotide bases, e.g. guanine), tRNAs with a matching anticodon (again three bases, which have to match, i.e. bind with the codon) bind to the mRNA/ribosome. Those tRNAs carry the amino acids that are then attached to the growing protein/start amino acid.
The already combined part has no specific mechanism I know of that would prevent it from folding, so it should start right away.
-- Florian www.domplatz1.de

David Baker Volunteer moderator Project administrator Project developer Project scientist Joined: 17 Sep 05 Posts: 705 Credit: 559,847 RAC: 0	Message 4771 - Posted: 30 Nov 2005, 6:54:57 UTC - in response to Message 4703.
This would be a great approach to solving the problem, as it would reduce the combinatorial complexity enormously. It turns out, however, that even though proteins are synthesized one amino acid at a time, folding probably does not occur until most of the protein is made, because special helper proteins called "chaperones" prevent folding from occurring prematurely. The problem is that most pieces of proteins do not fold into stable structures (this is part of why the prediction problem is difficult!), and so the growing peptide chain would likely make incorrect and possibly dangerous interactions with other molecules unless kept in check by the chaperones. So unfortunately, a chain-growing algorithm would probably not work well here: the global minimum is only clearly defined in the context of the whole protein chain.
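As a toy illustration of the last point (not Rosetta's method), the sketch below compares greedy chain growth with an exhaustive search on a made-up energy function. Everything in it is hypothetical: the three states per residue, the `local_energy` and `contact_energy` terms, and the chain length. The long-range contact term between the two chain ends only exists once the chain is complete, so a residue-by-residue search cannot see it while it is growing the chain.

```python
from itertools import product

STATES = (0, 1, 2)   # hypothetical backbone states per residue
N = 6                # hypothetical chain length


def local_energy(prev, cur):
    # Made-up nearest-neighbour term: neighbours prefer to share a state.
    return 0.0 if prev == cur else 1.0


def contact_energy(chain):
    # Made-up long-range term, defined only for the complete chain:
    # the two ends are stabilised when they are in different states.
    return -5.0 if chain[0] != chain[-1] else 0.0


def total_energy(chain):
    local = sum(local_energy(a, b) for a, b in zip(chain, chain[1:]))
    return local + contact_energy(chain)


def grow_greedily(n):
    # Add one residue at a time, always choosing the state that gives
    # the lowest energy for the partial chain built so far.
    chain = [STATES[0]]
    for _ in range(n - 1):
        chain.append(min(STATES, key=lambda s: local_energy(chain[-1], s)))
    return tuple(chain)


def search_exhaustively(n):
    # Evaluate every complete chain; only feasible for tiny toy systems.
    return min(product(STATES, repeat=n), key=total_energy)


greedy = grow_greedily(N)
best = search_exhaustively(N)
print("greedy:    ", greedy, total_energy(greedy))
print("exhaustive:", best, total_energy(best))
```

In this toy, the greedy chain settles on the locally cheapest choice at every step and misses the lower-energy conformation that only becomes visible when the whole chain is scored at once.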

FZB Joined: 17 Sep 05 Posts: 84 Credit: 4,948,999 RAC: 0	Message 4777 - Posted: 30 Nov 2005, 10:44:12 UTC
"The already combined part has no specific mechanism I know of that would prevent it from folding, so it should start right away."
Well, I guess I was wrong here... ("chaperones")
-- Florian www.domplatz1.de

dgnuff Joined: 1 Nov 05 Posts: 350 Credit: 24,773,605 RAC: 0	Message 4829 - Posted: 30 Nov 2005, 21:20:06 UTC - in response to Message 4771.
"It turns out, however, that even though proteins are synthesized one amino acid at a time, folding probably does not occur until most of the protein is made, because special helper proteins called "chaperones" prevent folding from occurring prematurely."
Going completely off topic now. Re: chaperones, you learn something new every day. I must admit that I envy you a little, David. When you step back from the analysis we're trying to do, and just look at this in operation, it's a beautiful system. The whole thing: DNA, RNA, 3 radicles selecting each amino acid. Holding the protein in check till it's all built, then "chaperones release" and let it fold.
Here's some other interesting food for thought. Since DNA radicles pair up (A/T and G/C), there has to be an even number of them. To select from 20 amino acids, if there were just two kinds of radicle, we'd need 5 radicles per amino acid; and to get down to two radicles per amino acid, we'd need a total of 6 different ones to choose from. So the current system, groups of 3 from the set of 4 available, is about the most efficient, even though at first sight it looks very wasteful: 64 possible combinations of three radicles, over three times as many as are needed. A quote I once heard comes to mind: "The universe is as simple as it can be, and still exist."
I'm still thinking about this, but now you've got me interested in these chaperones. Do they always "bond" to the protein at the same places, and if so, is this caused by some other controlling agent (e.g. a special code in the DNA sequence)? Also, do they hold the chain more or less straight? If (and that's a big IF) they attach at the same place every time, can I ask a favor? Assuming 1ogw (the first bit of the WU name) identifies the protein, can I get some idea where the chaperones attach to this guy?
Even something as simple as a list of percentages along the protein chain (red end is 0%, blue end is 100%) would be enough.
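The codon arithmetic in the post above can be checked with a few lines of Python. This is a minimal sketch under the post's own assumptions: it counts only the 20 amino acids (no stop signals) and requires an even number of base types because they pair up. The function names are made up for illustration.

```python
AMINO_ACIDS = 20


def min_codon_length(alphabet_size, needed=AMINO_ACIDS):
    # Smallest codon length L such that alphabet_size ** L >= needed.
    length = 1
    while alphabet_size ** length < needed:
        length += 1
    return length


def min_even_alphabet(codon_length, needed=AMINO_ACIDS):
    # Smallest even number of base types k such that
    # k ** codon_length >= needed (even because bases come in pairs).
    k = 2
    while k ** codon_length < needed:
        k += 2
    return k


for bases in (2, 4, 6):
    length = min_codon_length(bases)
    print(f"{bases} base types -> {length} per codon, "
          f"{bases ** length} combinations")
```

With 2 base types the code needs 5 positions (32 combinations), with 4 it needs 3 (64 combinations), and 2 positions only suffice once there are 6 base types (36 combinations), which matches the figures in the post.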

Vanita Joined: 21 Oct 05 Posts: 43 Credit: 0 RAC: 0	Message 4835 - Posted: 30 Nov 2005, 22:00:58 UTC Last modified: 30 Nov 2005, 22:02:23 UTC
A few partial answers to questions raised below:
64 codons in the genetic code is the correct number, as you deduced below. It turns out nature is not wasteful, as you rightly point out. Three of the codons are used as "stop" codons, to tell the protein synthesis machinery when the end of the protein has been reached, and to release the protein. The rest of the redundancy in the genetic code can actually be put to good use by the cell. By controlling the amount of available activated amino acids associated with each codon, the cell can provide another level of regulation for how much protein gets made and how fast. Proteins that are encoded by "high usage" codons are more easily synthesized than ones with similar amino acid composition but encoded by "low usage" codons.
Just so we are on the same page terminology-wise (jargon is annoying but necessary sometimes), the sequence of DNA is usually called a sequence of nucleotides, or sequence of bases (I haven't heard the term radicles used). The bases do pair up, but one strand of the DNA is called the template strand, and this strand is used to make the mRNA. The mRNA has the same sequence of bases as the other DNA strand, which is called the coding strand.
As for chaperones, their mode of action is an area of ongoing study. The particular chaperones that bind 1ogw are not necessarily known, but I'll see if I can find some references on chaperones in general. In the meantime, check out this page for info on one type of chaperone.
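The template/coding strand relationship described above can be sketched in a few lines. The DNA sequence here is a made-up example, both strands are written 5' to 3', and the helper names are hypothetical. The one detail the sketch adds is that RNA uses uracil (U) where DNA has thymine (T), so the mRNA matches the coding strand with T replaced by U.

```python
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
RNA_COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}
STOP_CODONS = {"UAA", "UAG", "UGA"}   # the three standard stop codons


def template_strand(coding):
    # The template strand is the reverse complement of the coding strand.
    return "".join(DNA_COMPLEMENT[base] for base in reversed(coding))


def transcribe(template):
    # The mRNA is built complementary to the template strand.
    return "".join(RNA_COMPLEMENT[base] for base in reversed(template))


def split_codons(mrna):
    # Split the mRNA into consecutive three-base codons.
    usable = len(mrna) - len(mrna) % 3
    return [mrna[i:i + 3] for i in range(0, usable, 3)]


coding = "ATGGCCTGA"                  # hypothetical coding strand
template = template_strand(coding)
mrna = transcribe(template)
codons = split_codons(mrna)

print("template:", template)
print("mRNA:    ", mrna)
print("codons:  ", codons, "| ends with stop:", codons[-1] in STOP_CODONS)
```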

Hoelder1in Joined: 30 Sep 05 Posts: 169 Credit: 3,915,947 RAC: 0	Message 4865 - Posted: 1 Dec 2005, 10:45:09 UTC
I am not sure this is the right thread, but I thought some of you may be interested in this: I found this link which gives a few details on the strategies Rosetta uses to find the global energy minimum (the first part of the code). It is also good to know that there seems to be some collaboration with the UW Math department going on to make Rosetta even cleverer than it already is. The description may also be of interest to those who have observed some strange behaviour when the protein wiggles around in the graphics.
I wonder whether the all-atom-relax part of the code works along the same lines or whether a completely different algorithm is used. Since most of the computing time is spent on this second part of the code, input from clever mathematicians might perhaps also be beneficial.

David Baker Volunteer moderator Project administrator Project developer Project scientist Joined: 17 Sep 05 Posts: 705 Credit: 559,847 RAC: 0	Message 4876 - Posted: 1 Dec 2005, 15:59:01 UTC - in response to Message 4865.
Hi Hermann,
Paul is actually describing the second, all-atom relax stage. For experts, the approach is Monte Carlo with minimization: each attempted move consists of a random perturbation to a randomly selected subset of the phi and psi angles, followed by updating the sidechain conformations, and finally gradient-based (quasi-Newton) minimization; the move is accepted if the energy decreases (standard Metropolis criterion).
The initial fast low-resolution search is traditional Monte Carlo without minimization; each move consists of substituting a randomly selected short fragment of a protein of known structure at a randomly selected position in the protein (this in general causes a big change in the conformation, which is why the movies are so jumpy).
(Paul's text is a bit out of date: Rosetta was ported to C++ from Fortran a couple of years ago.)
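For readers who want to see the shape of a Monte Carlo with minimization loop, here is a minimal sketch on a toy problem. It is not Rosetta's code: the `energy` function is a made-up rugged landscape over a handful of angles, plain gradient descent stands in for the quasi-Newton minimizer, the sidechain update step is omitted, and every parameter value is an assumption. The acceptance test is the usual Metropolis form, which always accepts a move that lowers the energy and accepts an uphill move with probability exp(-dE/T).

```python
import math
import random


def energy(angles):
    # Hypothetical rugged energy over torsion-like angles (radians).
    return sum(1.0 - math.cos(a) + 0.3 * (1.0 - math.cos(5.0 * a))
               for a in angles)


def gradient(angles):
    # Analytic derivative of the toy energy with respect to each angle.
    return [math.sin(a) + 1.5 * math.sin(5.0 * a) for a in angles]


def minimize(angles, step=0.02, iterations=200):
    # Plain gradient descent, standing in for a quasi-Newton minimizer.
    current = list(angles)
    for _ in range(iterations):
        current = [a - step * g for a, g in zip(current, gradient(current))]
    return current


def metropolis_accept(delta, temperature, rng):
    # Downhill moves are always accepted; uphill moves sometimes.
    if delta <= 0.0:
        return True
    return rng.random() < math.exp(-delta / temperature)


def monte_carlo_minimization(n_angles=8, n_moves=200, temperature=0.5, seed=0):
    rng = random.Random(seed)
    start = [rng.uniform(-math.pi, math.pi) for _ in range(n_angles)]
    current = minimize(start)
    current_e = energy(current)
    best, best_e = current, current_e
    for _ in range(n_moves):
        trial = list(current)
        # Randomly perturb a randomly selected subset of the angles.
        for i in rng.sample(range(n_angles), k=rng.randint(1, 3)):
            trial[i] += rng.gauss(0.0, 0.5)
        # Minimize before the acceptance test is applied.
        trial = minimize(trial)
        trial_e = energy(trial)
        if metropolis_accept(trial_e - current_e, temperature, rng):
            current, current_e = trial, trial_e
            if current_e < best_e:
                best, best_e = current, current_e
    return best, best_e


best_angles, best_energy = monte_carlo_minimization()
print("lowest energy found:", round(best_energy, 4))
```

Minimizing before the acceptance test means the search effectively compares local minima with each other rather than raw perturbed conformations.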

Paul D. Buck Joined: 17 Sep 05 Posts: 815 Credit: 1,812,737 RAC: 0	Message 4926 - Posted: 2 Dec 2005, 9:15:42 UTC - in response to Message 4876.
"For experts, the approach is Monte Carlo with minimization: each attempted move consists of a random perturbation to a randomly selected subset of the phi and psi angles, followed by updating the sidechain conformations, and finally gradient-based (quasi-Newton) minimization; the move is accepted if the energy decreases (standard Metropolis criterion)."
I am not sure your random is random enough though ... see my notes for some thoughts on this. Again, I only have 4-5 hours of watching, but to my biased observation the changes *seem* to happen an awful lot in the same places all the time. Are you actually tracking the change distribution to see if your attempts ARE random?
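One simple way to track a change distribution like the one asked about above is to tally which chain position each attempted move touches and compare the tally with a uniform expectation. The sketch below is hypothetical: it draws positions from Python's own generator, whereas a real check would read them from a log of attempted moves, and the chain length and move count are arbitrary.

```python
import random
from collections import Counter


def tally_move_positions(n_positions, n_moves, rng):
    # Count how often each chain position is picked for a move.
    counts = Counter(rng.randrange(n_positions) for _ in range(n_moves))
    return [counts.get(i, 0) for i in range(n_positions)]


def chi_square_vs_uniform(counts):
    # Pearson chi-square statistic against equal expected counts.
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


rng = random.Random(42)
counts = tally_move_positions(n_positions=50, n_moves=10_000, rng=rng)
statistic = chi_square_vs_uniform(counts)
print("min/max count per position:", min(counts), max(counts))
print("chi-square vs uniform (49 degrees of freedom):", round(statistic, 1))
```

A statistic far above the number of degrees of freedom would suggest the positions are not being chosen uniformly. Note that uniformly proposed moves can still look clustered on screen, because only accepted moves change the displayed structure.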

Jocelyn Larouche Joined: 9 Nov 05 Posts: 11 Credit: 6,994 RAC: 0	Message 4978 - Posted: 2 Dec 2005, 19:42:33 UTC - in response to Message 4771.
Hi David,
Is it possible to have a structure with low energy and low RMSD that is different from the original one? Or would such structures be unstable at some point?