Message boards : Rosetta@home Science : WorldCommunityGrid - HPF2 (uses Rosetta) Phase 2 started
FluffyChicken (Joined: 1 Nov 05, Posts: 1260, Credit: 369,635, RAC: 0)
I noticed today that World Community Grid's Human Proteome Folding Phase 2 has started, but what I noticed more was the optimisation of Rosetta. Is their optimisation related to the Rosetta@home optimisation? (I'm not asking about the relatedness of the projects or their history, as I already know all about that :-), just about the types of optimisation they are trying to do to the software.)

2006-06-23: World Community Grid - The Human Proteome Folding - Phase 2 project has been launched

Team mauisun.org
Feet1st (Joined: 30 Dec 05, Posts: 1755, Credit: 4,690,520, RAC: 0)
The article says they plan to further develop Rosetta, not optimize it. So, just as the Baker lab has devised the jumping method to try to save time by assuming some known segments will take a given known shape, it sounds like they are planning to add some new methods of their own to see if they can improve the predictions made, or reduce the time it takes to reach an accurate prediction. In other threads where "optimization" is discussed, people are referring to running Dr. Baker's lab's version of Rosetta... but faster, by using the computers more efficiently. It looks like WCG, on the other hand, is going to modify the methods used and see if they can produce better protein models.

Add this signature to your EMail: Running Microsoft's "System Idle Process" will never help cure cancer, AIDS or Alzheimer's. But running Rosetta@home just might! https://boinc.bakerlab.org/rosetta/
Dimitris Hatzopoulos (Joined: 5 Jan 06, Posts: 336, Credit: 80,939, RAC: 0)
I try to refrain from being critical, but take a good look at WCG's initial replication and quorum settings. Then have a look at our discussion in the "Validation (not for credits, but for scientific reasons)" thread.

While I certainly believe HPF is a very important project, I am not happy with the level of redundancy WCG (IBM) runs it at. Maybe FAAH really needs this level of redundancy and they can't differentiate between WUs/apps? (Again, I need to look at the BOINC server code sometime.) Unless of course I'm missing something obvious.

Best UFO Resources Wikipedia R@h How-To: Join Distributed Computing projects that benefit humanity
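For anyone not familiar with the settings being pointed at here: a BOINC workunit is sent out with an initial replication (how many copies go to different hosts) and a minimum quorum (how many returned results must agree before the workunit validates). Below is a minimal sketch of what that means for the share of donated CPU time that produces unique results; the numbers are hypothetical and do not reflect WCG's or Rosetta@home's actual configuration.

```python
# Illustrative only: how initial replication relates to the share of donated
# CPU time that produces unique results. The settings below are hypothetical
# and do not reflect any project's real configuration.

def unique_work_fraction(initial_replication: int) -> float:
    """Fraction of returned results that correspond to distinct workunits."""
    return 1.0 / initial_replication

for name, replication, quorum in [
    ("no redundancy (quorum of 1)", 1, 1),
    ("double redundancy",           2, 2),
    ("heavy redundancy",            5, 3),
]:
    frac = unique_work_fraction(replication)
    print(f"{name}: initial_replication={replication}, min_quorum={quorum}, "
          f"~{frac:.0%} of results are unique workunits")
```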
FluffyChicken (Joined: 1 Nov 05, Posts: 1260, Credit: 369,635, RAC: 0)
Feet1st, I call modifying the code to make it work better, faster, or more accurately "optimisation". I mention it here as the Baker Lab are the original developers, and David & Lars are mentioned as two of the three people behind HPF2, Dr. Richard Bonneau being the other person and the main leader.

As for the redundancy on the results, well, I don't think they changed it from the default BOINC settings ;-). Also, since it goes into a database and they are actually generating 'results' (rather than R@H's more developmental setting), it needs to be double-checked at least. That's also the way it works with the Grid.org software...

Team mauisun.org
Dimitris Hatzopoulos (Joined: 5 Jan 06, Posts: 336, Credit: 80,939, RAC: 0)
Quote: "As for the redundancy on the results, well, I don't think they changed it from the default BOINC settings ;-). Also, since it goes into a database and they are actually generating 'results' (rather than R@H's more developmental setting), it needs to be double-checked at least. That's also the way it works with the Grid.org software..."

Fluffy, CASP7 is actually very much "production" (vs "development" / "experimental"), and yet R@H doesn't need to run redundant WUs.

To reiterate the points in the validation thread: looking at the predictions of 2chf (from the Top Predictions page), we're all looking for a needle in a haystack: the lowest-energy structure, highlighted in blue at the bottom-left of the graph. If R@H were sending out the same WU 5 times, i.e. re-calculating every red dot plotted above 5 times, we'd be able to calculate only 1/5 as many dots overall (unique predicted protein structures) and quite possibly miss the elusive blue dot. One can also see that there is no clustering of results in the bottom-left corner of the RMSD-energy graph.

Just because grid.org and WCG run it this way doesn't make it right. The only reason would be for calibrating credits, but not for the science, IMHO.

Best UFO Resources Wikipedia R@h How-To: Join Distributed Computing projects that benefit humanity
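A back-of-the-envelope version of that argument: if each independent trajectory has some tiny chance of landing on the near-native "blue dot", cutting the number of unique trajectories to one fifth noticeably cuts the chance of ever sampling it. Both numbers below are invented purely for illustration; they are not measured Rosetta figures.

```python
# Rough illustration of the needle-in-a-haystack argument.
# p_near_native is an invented per-model probability of landing near the
# native structure; it is NOT a measured Rosetta number.

p_near_native = 1e-5     # hypothetical chance per independent model
budget = 500_000         # hypothetical number of results the grid can return

def chance_of_at_least_one_hit(n_unique_models: int, p: float) -> float:
    return 1.0 - (1.0 - p) ** n_unique_models

no_redundancy = chance_of_at_least_one_hit(budget, p_near_native)
five_fold     = chance_of_at_least_one_hit(budget // 5, p_near_native)

print(f"unique models with no redundancy : {budget:>7} -> "
      f"P(find needle) ~ {no_redundancy:.1%}")
print(f"unique models with 5x redundancy : {budget // 5:>7} -> "
      f"P(find needle) ~ {five_fold:.1%}")
```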
FluffyChicken (Joined: 1 Nov 05, Posts: 1260, Credit: 369,635, RAC: 0)
Quote: "As for the redundancy on the results, well, I don't think they changed it from the default BOINC settings ;-). Also, since it goes into a database and they are actually generating 'results' (rather than R@H's more developmental setting), it needs to be double-checked at least. That's also the way it works with the Grid.org software..."

Actually I would class it as experimental: we are testing the quality of Rosetta@home's predictions and abilities (quoting the first lines of the CASP7 website's Goals). I understand why we do not specifically need the redundancy (I've been here for a long time now :-D). I do not know why WCG uses it (or really care); it's just speculation, as I don't know enough about what they are really doing... and it has little to do with the original question ;-)

Team mauisun.org
Keith Akins (Joined: 22 Oct 05, Posts: 176, Credit: 71,779, RAC: 0)
As a result of an unrelated question that I asked, David Baker responded in his journal with an answer that I think might be relevant here.

*********

This question came up on the number crunching boards: With the new methodologies being developed, will there be a point at which we go beyond the needle-in-the-haystack decoys and start clustering around the actual structures of proteins?

Answer: Correct models will always be a very small fraction of the structures generated, just because there are so many alternative conformations for a protein chain. But to have confidence in a prediction, there must be convergence of the lowest-energy conformations on a single structure. As our methods improve and sampling (CPU power) increases, correct models will remain a "needle in a haystack" in the overall population, but dominant in the population of lowest-energy models.

And is this the goal before (from what I understand) the project moves into the design/docking phase?

Answer: No. While this is the solution to the structure prediction problem, it is not necessary for successful design and docking (certainly, though, more accurate prediction methods would impact both areas). We have had considerable success with both design and docking already. After CASP we will start running both docking and design calculations on rosetta@home, as well as continuing to improve our structure prediction methods.

*********

From what I gather, the more models generated the better. Redundancy would reduce the models generated by half if dual redundancy were required to validate results. This seems a lot like trying to hit a match tip with a shotgun at 40 feet: the more buckshot in the barrel, the more likely you are to hit the target.
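Dr. Baker's criterion (correct models staying rare overall but dominating the low-energy tail) can be turned into a concrete check: sort the decoys by energy, take the best one percent, and ask whether most of them sit close to a single structure. The sketch below does that on made-up (energy, RMSD) pairs; it is not Rosetta code and the numbers are invented.

```python
# Generic sketch of the "convergence of the lowest-energy models" check
# described above. The decoy data is random and purely illustrative;
# real decoys would come from Rosetta trajectories.

import random

random.seed(0)

# Fake decoy set: (energy, rmsd_to_native). Most decoys are higher-energy and
# far from native; 1% are low-energy and near-native.
decoys = [(random.uniform(-150, -50), random.uniform(4.0, 20.0)) for _ in range(9900)]
decoys += [(random.uniform(-210, -180), random.uniform(0.5, 2.5)) for _ in range(100)]

# Take the lowest-energy 1% of models.
lowest = sorted(decoys, key=lambda d: d[0])[: len(decoys) // 100]

# If the prediction has "converged", most of the low-energy tail sits near
# one structure (here: within 3 Å of the native).
near_native = sum(1 for _, rmsd in lowest if rmsd < 3.0)
print(f"{near_native}/{len(lowest)} of the lowest-energy models are within 3 Å of the native structure")
```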