What IS the difference between Predictor@home and Rosetta@home

Author	Message
robetta Volunteer moderator Project administrator Joined: 19 Aug 05 Posts: 1 Credit: 51,094 RAC: 0	Message 14 - Posted: 16 Sep 2005, 23:20:29 UTC Both projects are similar in that both are trying to improve methods for protein structure prediction. Rosetta@home includes protein design and prediction of complexes. Rosetta@home uses a software package called Rosetta, which has been proven to be one of the best methods out there for protein structure prediction in repeated CASP experiments (see our About page). Rosetta is also being used for the Human Proteome Folding Project, which is trying to predict folds for many proteins in the human genome. While they, in collaboration with us, are applying Rosetta to the human genome and other genomes like malaria (P. falciparum), we are conducting research to make it better. David Baker's work has been published in today's issue of Science. It is exciting work. Thank you for your interest in helping our and similar projects, like Predictor@home!

Yin Gang Joined: 17 Sep 05 Posts: 13 Credit: 63,992 RAC: 0	Message 174 - Posted: 19 Sep 2005, 6:11:43 UTC Do you have any suggestions for PP@H users like me on how to decide which project to join, since the two have some research content in common? Q1. Stay with PP@H or switch to R@H? Q2. Is R@H a non-profit or a for-profit project? I can't find any related explanation on your site. Welcome To Team China!

David E K Volunteer moderator Project administrator Project developer Project scientist Joined: 1 Jul 05 Posts: 1480 Credit: 4,334,829 RAC: 0	Message 215 - Posted: 20 Sep 2005, 0:30:52 UTC - in response to Message 174. Last modified: 20 Sep 2005, 0:33:01 UTC It is up to the volunteer to do research and decide. Optimally, you can help both projects... or all projects for that matter! Both projects are doing research toward similar goals but using different methods. Helping both projects would benefit science, not just one or the other. Q1 - both! Q2 - R@H is non-profit and for science; all results will be available to the public. The Rosetta application is licensed to pharma, but most of the proceeds go to further research. For example, it funds an annual Rosetta development conference that brings researchers from various universities together. The application is available for free to academic users.

Yin Gang Joined: 17 Sep 05 Posts: 13 Credit: 63,992 RAC: 0	Message 216 - Posted: 20 Sep 2005, 1:48:10 UTC I got it, thanks, David ;) Welcome To Team China!

[BAT]Krikke Joined: 20 Sep 05 Posts: 5 Credit: 670,766 RAC: 0	Message 316 - Posted: 22 Sep 2005, 10:30:23 UTC Do you interact with the Predictor@home project? I mean, since you are working toward a mutual interest, can you benefit from each other's work, or is each project totally separate? It would be a waste of resources if you guys were doing double work; that's why I'm asking.

David E K Volunteer moderator Project administrator Project developer Project scientist Joined: 1 Jul 05 Posts: 1480 Credit: 4,334,829 RAC: 0	Message 336 - Posted: 22 Sep 2005, 18:53:40 UTC - in response to Message 316. We are not currently interacting with the Predictor team. There is definitely no waste of resources. They are working on other methods and there is no doubling of work.

AZAMAZA Joined: 17 Sep 05 Posts: 12 Credit: 26,841 RAC: 0	Message 513 - Posted: 26 Sep 2005, 4:32:22 UTC You can compare Rosetta@home and Predictor@home performance in the recent international double-blind test of structure prediction methods by going to http://www2.predictioncenter.org/casp/casp6/public/cgi-bin/groups.cgi (Rosetta is the Baker group and Predictor is the CLB3 group). By the way, go to www.azamaza.com

TestPilot Joined: 23 Sep 05 Posts: 30 Credit: 419,033 RAC: 0	Message 526 - Posted: 26 Sep 2005, 6:17:09 UTC Last modified: 26 Sep 2005, 6:51:16 UTC Azamaza, thanks for the hint. I checked that page and did some calculations in Excel. The most important parameter is GlobalDistanceTest_TotalScore (GDT_TS). The second parameter I took for comparison is RMS (root-mean-square deviation for the entire target structure or a subdomain). I'd prefer to use RMSD, but could not find it in those tables. I calculated averages of those. GDT_TS: bigger is better. RMS: smaller is better.
Top 100 models:
Rosetta@home (Baker): GDT_TS = 80.24, RMS = 2.86
Predictor@home (CLB3 group): GDT_TS = 67.32, RMS = 4.25
Baker-Robetta: GDT_TS = 78.31, RMS = 3.20
Baker-Robetta4: GDT_TS = 79.56, RMS = 3.02
Since the Baker group submitted more models, it would be fair to mention the total averages as well.
All models:
Rosetta@home (Baker): GDT_TS = 50.56, RMS = 10.06, 439 models
Predictor@home (CLB3 group): GDT_TS = 36.77, RMS = 11.52, 378 models
Baker-Robetta: GDT_TS = 47.80, RMS = 10.23, 449 models
Baker-Robetta4: GDT_TS = 48.51, RMS = 17.90, 450 models
So overall, Rosetta@home's predictions are way better and more accurate compared to the CLB3 group's work! Robetta is an automated prediction server, based on software developed by the same group which developed Rosetta. Robetta4 must be a new version of Robetta (my guess). It gives some HUGE RMS values for some low-ranked models, which is why its RMS is so big in the totals, but it still looks better compared to Robetta1. AZAMAZA, could you tell me which group has the best results, so we can compare Rosetta@home's results with the best ones? There are too many groups there; it would take too much time to browse and evaluate them one by one. TestPilot, AKA Administrator
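The averaging described in Message 526 can be sketched roughly as follows. This is a minimal sketch, not TestPilot's actual spreadsheet: the post does not say how the "top 100" models were selected, so ranking by GDT_TS is an assumption here, and `summarize` and `toy_models` are hypothetical names with made-up scores that are not CASP6 results.

```python
from statistics import mean


def summarize(models, top_n=None):
    """Average GDT_TS and RMS over one group's models.

    models: list of (gdt_ts, rms) tuples, one per submitted model.
    top_n:  if given, keep only the top_n models ranked by GDT_TS
            (assumed ranking; higher GDT_TS is better) before averaging.
    """
    ranked = sorted(models, key=lambda m: m[0], reverse=True)
    if top_n is not None:
        ranked = ranked[:top_n]
    return {
        "models": len(ranked),
        "gdt_ts": round(mean(m[0] for m in ranked), 2),
        "rms": round(mean(m[1] for m in ranked), 2),
    }


# Made-up scores for illustration only; these are NOT CASP6 results.
toy_models = [(80.0, 3.0), (60.0, 5.0), (40.0, 10.0), (20.0, 18.0)]

top2 = summarize(toy_models, top_n=2)   # best two models only
overall = summarize(toy_models)         # every submitted model
```

Averaging only the best models hides the weak tail, which is why the post reports both a top-100 figure and a total figure with the model count next to it.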

David Baker Volunteer moderator Project administrator Project developer Project scientist Joined: 17 Sep 05 Posts: 705 Credit: 559,847 RAC: 0	Message 577 - Posted: 27 Sep 2005, 7:05:00 UTC I think the best group may be Ginalski. Also, you might get a clearer picture by using the "rank" statistic.

Paul D. Buck Joined: 17 Sep 05 Posts: 815 Credit: 1,812,737 RAC: 0	Message 619 - Posted: 27 Sep 2005, 14:51:05 UTC Color me stupid, but I am not sure I have it yet ... I know of three projects in this area: Rosetta@Home, Predictor@Home, and Folding@Home. Obviously, looking at the long list of entries in the CASP study link, there are many other projects. At the moment, the point seems to be, from my reading below, to find the "best" method (or methods) of calculating the "folding" activity of the sequences. From a project perspective, you are all working on the "same" known/unknown structures and comparing notes as to what you are finding. Do I have it right? Or missed the boat completely? (punishable under the UCMJ as missing movement, which is worse than AWOL and only slightly worse than desertion)

David Baker Volunteer moderator Project administrator Project developer Project scientist Joined: 17 Sep 05 Posts: 705 Credit: 559,847 RAC: 0	Message 684 - Posted: 28 Sep 2005, 5:06:40 UTC - in response to Message 619. You are pretty close. Predictor@home and Rosetta@home both seek to predict the three-dimensional structure of proteins from their amino acid sequences. The CASP international double-blind test evaluates the methods of participants by giving out the amino acid sequences of proteins for which the three-dimensional structures have been solved experimentally, but not yet published. The different groups have three months to predict the structures and then send the results back to the organizers, who assess the performance of the different groups (as summarized at the site discussed below). The CASP tests are held once every two years, and afterwards there is a meeting to discuss the results, which often involves a fair bit of drama, as you can imagine. The most recent meeting was held south of Rome last December. Folding@home does not attempt to predict protein structure from sequence, but to simulate the process of folding given knowledge of the true structure.

Paul D. Buck Joined: 17 Sep 05 Posts: 815 Credit: 1,812,737 RAC: 0	Message 694 - Posted: 28 Sep 2005, 14:18:41 UTC Ok, I think I have it now ... But in an article referenced elsewhere there was a statement that this work will only take a year. So, we are only going to have work for a year or so? Hardly seems time to get out of Beta testing ...

David Baker Volunteer moderator Project administrator Project developer Project scientist Joined: 17 Sep 05 Posts: 705 Credit: 559,847 RAC: 0	Message 696 - Posted: 28 Sep 2005, 14:58:04 UTC For better or for worse, this project will take far longer than a year. There are over 40,000 proteins in the human genome alone, with lengths ranging from 50 amino acids to 1000 amino acids. In our paper in Science, we showed that with 400 CPU days of computing each, we could predict accurate structures for 6 of 16 proteins shorter than 85 amino acids, but for the other ten much more time was necessary. The size of the space that has to be searched grows roughly exponentially with sequence length, so orders of magnitude more computing power will be necessary to find the lowest energy structures for larger proteins.
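To get a feel for "grows roughly exponentially with sequence length", here is a small counting illustration. It assumes each residue can take a fixed number of local conformations; the default of 3 is a hypothetical round figure chosen for the example, not a number from the post or from Rosetta, and the function name is made up. It only counts possibilities and says nothing about how the search is actually carried out.

```python
import math


def log10_conformations(n_residues, states_per_residue=3):
    """Return log10 of states_per_residue ** n_residues.

    Working in log10 avoids printing astronomically large integers.
    states_per_residue is an assumed, illustrative value.
    """
    return n_residues * math.log10(states_per_residue)


# Lengths mentioned in the thread: 50 and 1000 (range), 85 (Science benchmark).
for length in (50, 85, 1000):
    exponent = log10_conformations(length)
    print(f"{length:>4} residues: about 10^{exponent:.1f} conformations")

# Each added residue multiplies the space by states_per_residue, so doubling
# the length from 85 to 170 residues squares the number of conformations.
doubled = log10_conformations(170)
```

Under this assumption an 85-residue protein already has on the order of 10^40 conformations, which is why longer chains need orders of magnitude more computing.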

Paul D. Buck Joined: 17 Sep 05 Posts: 815 Credit: 1,812,737 RAC: 0	Message 703 - Posted: 28 Sep 2005, 16:31:12 UTC Ok, still swimming hard, but not sure my nose is above water yet ... :) I did not follow your numbers (sorry, physics is my "thing", though I can hope you will figure out a magic pill for mental illness ...) Ok, we have 40,000 proteins, each of which can have 50 to 1,000 amino acids, giving us a space of about 21,000,000 cells (40K proteins * 525 AA) to be computed. Now, you lost me with the 6/16 & 85 AA, takes 400 CPU days? What are we doing per work unit ... sorry for being dense ... but, I don't know why, Rosetta@Home seems to have an attraction for me that I never got from Predictor@Home, though I do admit to still doing work for them ... :) While I am at it ... I am working on work unit naming (LHC@Home just gave an explanation for theirs, so I have just added theirs and found the old data for SETI@Home ... a Wiki is never done ...). Anyway, next question after you straighten me out from the above ... is, "What does the name 'aa1pvaA03_05.200_v1_3' mean?"

Keith E. Laidig Volunteer moderator Project developer Joined: 1 Jul 05 Posts: 154 Credit: 117,189,961 RAC: 0	Message 705 - Posted: 28 Sep 2005, 17:12:05 UTC - in response to Message 703. Last modified: 28 Sep 2005, 17:12:37 UTC If I may be so bold: "the 6/16 & 85 AA, takes 400 CPU days..." What the group has found is that, given a set of 16 typical 'target' proteins, each under 85 amino acids in length, 400 CPU days of computing per 'target' allows for the prediction of the correct structure 6 times. While this sounds pretty weak, I can assure you that prior to the recent advances NO amount of computing power would predict results anywhere near as good. Within the context of the 'protein folding world', I feel these kinds of results amount to a stunning success. "What does the name 'aa1pvaA03_05.200_v1_3' mean?" The naming of work units is somewhat arbitrary. What we should do is get one of the scientists to write up the overall process of structure prediction, including the naming of the results and why, so that things become clearer to all.
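For anyone else who got lost in the numbers, the arithmetic from Messages 696, 703 and 705 can be written out as below. The inputs are the figures quoted in the thread; the variable names are made up for this sketch, and the midpoint length of 525 is the rough estimate from Message 703, not a measured average.

```python
# Benchmark described in Messages 696 and 705.
targets = 16                # small proteins in the benchmark set
cpu_days_per_target = 400   # computing spent on each target
solved = 6                  # targets predicted accurately

total_cpu_days = targets * cpu_days_per_target   # whole-benchmark cost
success_rate = solved / targets                  # fraction predicted accurately
total_cpu_years = total_cpu_days / 365           # same cost in CPU years

# Rough size estimate from Message 703: number of proteins times the
# midpoint of the 50..1000 residue range.
proteins = 40_000
midpoint_length = (50 + 1000) // 2               # 525 amino acids
total_residues = proteins * midpoint_length

print(f"benchmark cost: {total_cpu_days} CPU days (~{total_cpu_years:.1f} CPU years)")
print(f"success rate:   {success_rate:.1%}")
print(f"residue count:  {total_residues:,}")
```

Note that the residue count understates the work involved, since the search space grows roughly exponentially with length rather than in proportion to it.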

Paul D. Buck Joined: 17 Sep 05 Posts: 815 Credit: 1,812,737 RAC: 0	Message 707 - Posted: 28 Sep 2005, 17:35:36 UTC Ok, I am not sure I got it fully, but can't think of a better way of asking ... so, moving along ... As far as name data goes, it does not have to be that hard or that deep. For the two examples I have now, you can see them at Work Unit Name where I have what I know of for SETI@Home and LHC@Home ... Like most of the rest of the Wiki, I take what I can get where I can get it ... :) The troubles are that if I do not know where to put some things I know I miss things ... ah well, if it was easy, we would have 25 Wikis out there ... :)