Message boards : Rosetta@home Science : Folding of known structures
Author | Message |
---|---|
Mark Joined: 10 Nov 13 Posts: 40 Credit: 397,847 RAC: 0 |
Hi, I was wondering why so much folding is being done on structures with a known native solution. I can see that it is sometimes useful as a metric to test Rosetta's folding capabilities, and to develop new efficient folding movers to include in Rosetta, but it seems that 60-70% of the WUs I get are folding to a known solution, which seems awfully high. The only other thing I can think of is that you are trying to find the sequence for a computed binder structure. Is that it? |
AtHomer Joined: 26 Jan 10 Posts: 13 Credit: 7,145,229 RAC: 0 |
I am curious about this as well. By the way, how can you tell that the folding being done is of structures with a known solution? |
Mark Joined: 10 Nov 13 Posts: 40 Credit: 397,847 RAC: 0 |
AtHomer wrote: "I am curious about this as well." The screensaver has 4 boxes, and the fourth one is called "native". The native one is the actual solution, probably derived from X-ray crystallography. There is also an RMSD figure/graph, which is the "difference" between the native structure and the structure the computer has just folded. |
Mod.Sense Volunteer moderator Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
I am just a volunteer and not in the BakerLab, but in very general terms, if you were to develop an algorithm that is to predict structures, and you test it by creating a pile of possible models of things of unknown structure... then what have you learned about how to improve your algorithm? Running it against known structures is how you prove your algorithm is working well, or comes up with the same correct answer, or an answer with even better precision, or uses dramatically less compute power. I guess perhaps the perspective you are missing is that the aim of the project is not to "solve unknown structures", but to "develop generalized computer algorithms that are able to accurately solve unknown structures". (I'm not quoting anyone there, I'm just trying to denote a possible title that summarizes things) Rosetta Moderator: Mod.Sense |
Mark Joined: 10 Nov 13 Posts: 40 Credit: 397,847 RAC: 0 |
Mod.Sense wrote: "I am just a volunteer and not in the BakerLab, but in very general terms, if you were to develop an algorithm that is to predict structures, and you test it by creating a pile of possible models of things of unknown structure... then what have you learned about how to improve your algorithm? Running it against known structures is how you prove your algorithm is working well, or comes up with the same correct answer, or an answer with even better precision, or uses dramatically less compute power." Yes, I get that point, but as I said in the original post, it seems that about 70% of the time you are folding to a known solution. If you multiply that up over the number of participants, that's an awful lot of testing data. With that much data to go on, you would be able to make lots of changes to the algorithms. However, the minirosetta program gets updated infrequently, which brings me back to the original question.... |
Murasaki Joined: 20 Apr 06 Posts: 303 Credit: 511,418 RAC: 0 |
Hi, Here is a description of the breakdown of Rosetta work units given in June 2013. gregorio wrote: This is Lucas from the Baker lab... Some of those tasks will obviously use native structures while others won't. What proportion of tasks fall into each category and how many use a native structure? Only the project scientists could answer, but I suspect it fluctuates wildly over time. For example, in the run-up to the CASP experiments I would expect a high level of testing against native structures to check that their latest algorithms are working. During the CASP experiments the structures are unknown, so there would be fewer native structure tasks. Added to that is the fact that scientists all around the world can submit tasks to the Robetta server, and some of the Robetta work gets passed to BOINC. The proportion of native structures appearing in Robetta tasks will be completely outside the control of the Baker lab team. I am guessing that your observation of 60-70% of tasks using native structures is based solely on the selection of tasks you receive. Have you gathered those observations over time, or is it just a rough estimate based on a small sample? With 5 million tasks in the queue at any one time there could (hypothetically) be just 1 million tasks with a native structure (20%), but you are "lucky" enough to download a higher proportion of those, which makes it seem to you like 60-70%. |
Mod.Sense Volunteer moderator Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
A good portion of how a work unit runs is determined by parameters passed to the base program. So, even without an application update, numerous variations can be configured into groups of WUs for study. Rosetta Moderator: Mod.Sense |
Christopher Bahl Volunteer moderator Project developer Project scientist Joined: 7 Feb 13 Posts: 9 Credit: 801,638 RAC: 0 |
Hi Mark et al. I think I understand the confusion here: we aren't commonly using Rosetta@home to predict the structure of proteins where we already know the structure (the closest we come to this is CASP). However, we are heavily utilizing Rosetta@home to validate and test de novo designed proteins. In these cases, we've designed a totally new structure and protein sequence that has never existed before in nature. Then, we take the amino acid sequence which codes for our de novo protein and ask Rosetta@home how this sequence will fold. Finally, we compare each Rosetta@home model to our designed model and ask how well they match up, which I think is what you're seeing as a "known native solution." In reality, this isn't a "known" structure; rather, it's one we're attempting to make. Evaluation with Rosetta@home is currently the most important verification criterion we use to assess the quality of our de novo designed proteins prior to testing in the lab. As always, many thanks for donating your CPU hours; you make de novo protein design possible! Cheers, -Chris |
Mark Joined: 10 Nov 13 Posts: 40 Credit: 397,847 RAC: 0 |
Christopher Bahl wrote: "Hi Mark et al." Ah, thx Chris, that makes sense! Yes, in that case it is the "known" solution that is confusing. Thx again for clarifying. Mark |
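
For readers wondering what the RMSD figure mentioned in the thread measures, here is a minimal sketch of the usual root-mean-square deviation formula, RMSD = sqrt((1/N) * sum of squared distances between matched atoms). It is an illustration only, not Rosetta's own code: the function name `rmsd` and the three-atom coordinates are invented for the example, and it assumes the two structures are already superimposed and list their atoms in the same order. Real tools typically also compute an optimal superposition first, which this sketch skips.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z) points.

    Assumes the structures are already superimposed and the atoms are in
    matching order; no alignment step is performed here.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("structures must have the same number of atoms")
    if not coords_a:
        raise ValueError("structures must contain at least one atom")
    # Sum of squared distances between each matched pair of atoms.
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Hypothetical three-atom example (coordinates in angstroms, invented for illustration).
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
model = [(0.0, 0.0, 0.0), (1.5, 1.0, 0.0), (3.0, 0.0, 2.0)]
print(round(rmsd(reference, model), 3))
```

A value of 0 means the model matches the reference exactly; larger values mean the folded model sits further from the "native" (or, per Chris's reply, designed) structure.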
©2024 University of Washington
https://www.bakerlab.org