Message boards : Rosetta@home Science : Where we are and where we are going.
Greg_BE Send message Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
Any new information to share? I see we are running on 5.7 now, with lots of new stuff. Perhaps someone would like to take the time to update this thread?
Greg_BE Send message Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
*bump* No answers?
proxima Send message Joined: 9 Dec 05 Posts: 44 Credit: 4,148,186 RAC: 0 |
Agreed, it seems to have gone a bit quiet - any chance of a description of the "big picture" as it is at the moment, what we're working on, etc?

Alver Valley Software Ltd - Contributing ALL our spare computing power to BOINC, 24x365.
David Baker Volunteer moderator Project administrator Project developer Project scientist Send message Joined: 17 Sep 05 Posts: 705 Credit: 559,847 RAC: 0 |
Sorry, I just got back from a several-week family vacation a few hours ago. Will give you an update by early next week (I need to catch up on the current status of the projects as well!).
Tom Philippart Send message Joined: 29 May 06 Posts: 183 Credit: 834,667 RAC: 0 |
> Sorry, I just got back from a several-week family vacation a few hours ago. Will give you an update by early next week (I need to catch up on the current status of the projects as well!).

Thanks in advance!!
proxima Send message Joined: 9 Dec 05 Posts: 44 Credit: 4,148,186 RAC: 0 |
Any chance of another brief update in layman's terms (for example, for someone who doesn't know the difference between DNA and RNA!) as to where we are going at the moment? What are the main areas currently being looked into, etc? Thanks in advance.

Alver Valley Software Ltd - Contributing ALL our spare computing power to BOINC, 24x365.
Jmarks Send message Joined: 16 Jul 07 Posts: 132 Credit: 98,025 RAC: 0 |
> Sorry, I just got back from a several-week family vacation a few hours ago. Will give you an update by early next week (I need to catch up on the current status of the projects as well!).

Still waiting!

Jmarks
Starsystem017 Send message Joined: 6 Jul 06 Posts: 4 Credit: 11,699 RAC: 0 |
Any news? I'm also waiting!
David Baker Volunteer moderator Project administrator Project developer Project scientist Send message Joined: 17 Sep 05 Posts: 705 Credit: 559,847 RAC: 0 |
Sorry, for recent updates please see my Rosetta@home journal.
Greg_BE Send message Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
Bump. Any new information for this thread?
ISTVAN BAJKAI Send message Joined: 31 Aug 08 Posts: 1 Credit: 510,938 RAC: 0 |
May I get some information about the results achieved so far, and about the projects currently running? Thanks! Istvan
monk_duck Send message Joined: 17 Nov 09 Posts: 11 Credit: 284,039 RAC: 0 |
I was just reading David's 3rd November post on his journal https://boinc.bakerlab.org/rosetta/forum_thread.php?id=1177&nowrap=true#63932 and it struck me there might be a possible way around the age-old problem of... "Can we have a percentage of overall progress?"

Obviously it would be somewhat impractical / redundant to say how many proteins had been looked at compared with the total number of proteins we know of. To be fair, the current idea of looking at the number of publications is a reasonable way of deciding if the research is working, although it could be quite hard for a layman to quantify the success of the papers (are all papers equal...?).

What I was thinking is that the CASP results from each year already provide quantified data: http://predictioncenter.org/casp8/groups_analysis.cgi?target_type=0&gr_type=server&domain_classifications_id=1,2,3,4&field=sum_z_gdt_ts_server_pos

My next thought was that surely there are already internal systems that compare the generated results with the experimental results (that's my understanding of the project anyway). Could we not get a page where some fixed number of proteins is recomputed once a quarter with the latest version of the project and re-compared? You would end up with results something like:

Jan: 60% accurate over 40 proteins
Apr: 61% accurate over 40 proteins
Jul: 60% accurate over 60 proteins
Oct: 62% accurate over 60 proteins

I have no idea how practical this would be, although my guess is most of the systems are already in place, or how much CPU time would be required - I assume a lot. It would, however, give a very nice view of how the project is improving, if nothing else...
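To show the shape of the report I have in mind, here is a rough sketch in Python; the protein names, RMSD values, and the 5.0 Angstrom "accurate" cutoff are all invented for illustration, not real Rosetta@home data:

```python
# Rough sketch of the quarterly report described above; all names, RMSD
# values and the 5.0 Angstrom "accurate" cutoff are invented placeholders.
from collections import defaultdict

# (quarter, protein, best RMSD in Angstroms from that quarter's re-run)
benchmark_runs = [
    ("Jan", "protein_01", 3.2), ("Jan", "protein_02", 7.8),
    ("Apr", "protein_01", 2.9), ("Apr", "protein_02", 6.5),
    ("Jul", "protein_01", 2.7), ("Jul", "protein_03", 9.1),
]

ACCURATE_RMSD = 5.0  # arbitrary cutoff for counting a prediction as "accurate"

per_quarter = defaultdict(list)
for quarter, protein, rmsd in benchmark_runs:
    per_quarter[quarter].append(rmsd)

for quarter, rmsds in per_quarter.items():
    accurate = sum(1 for r in rmsds if r <= ACCURATE_RMSD)
    print(f"{quarter}: {100 * accurate / len(rmsds):.0f}% accurate over {len(rmsds)} proteins")
```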
Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
...an interesting idea, Monk Duck. The gauge of accuracy you are looking for here is the RMSD. This is the most common form used to compare a predicted structure with the experimentally determined structure. An RMSD of zero would be a perfect match to the native structure. It is more difficult to achieve an accurate prediction for a large protein than a smaller one, so you'd always have to compare some fixed set of proteins each time, and not expect to find low RMSDs for the larger ones, but rather hope to see the RMSD of the best predictions getting smaller over time.

The only hitch I see is that computing resources would then have to be used periodically to run the benchmarks, even if that is not in line with the current research objectives. And if significant effort were devoted to tackling a new class of proteins (such as those with zinc, or those that are symmetric, for example), and great progress made, it would not be reflected in your benchmark, because no such proteins were in the benchmark when it was established. In such a case the addition of new entries to the table being benchmarked should be the measure of progress, and then, over time, one would hope to see RMSDs reduce for that new protein type.

I believe what you would find is that the pace of such demonstrable progress would appear rather slow. It is extremely difficult to tackle new classes of proteins, or to start modeling how two proteins will dock rather than just one. And so the entire change from one quarter to the next would often be the addition of a new line to the benchmark, with what may seem to be an unimpressive RMSD compared to the rest of the benchmark, which may tend to remain little changed. The thing that should be seen as impressive is that this type of protein can be modeled with any degree of accuracy at all.

When impressive results are attained, in successfully modeling new types of proteins or making significant improvements in RMSD, papers are written. So I believe that is where Dr. Baker is coming from. They don't write papers detailing how they toiled away for 3 months and found nothing new :) So each paper documents some significant advance, even if you and I, as lay-persons, often cannot decipher exactly what the specific advance is.

Rosetta Moderator: Mod.Sense
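For readers curious what that number actually measures, here is a minimal NumPy sketch of an RMSD calculation between a predicted and a native structure. This is not Rosetta's own code: the superposition uses the standard Kabsch algorithm, and the coordinates are random placeholders.

```python
# Minimal sketch of an RMSD calculation between a predicted and a native
# structure, using the Kabsch algorithm for optimal superposition.
# Coordinates here are random placeholders, not real protein data.
import numpy as np

def kabsch_rmsd(predicted, native):
    """RMSD (in the same units as the coordinates) after optimal superposition.

    predicted, native: (N, 3) arrays of matched atom coordinates (e.g. C-alpha).
    """
    # Center both coordinate sets on their centroids.
    p = predicted - predicted.mean(axis=0)
    q = native - native.mean(axis=0)

    # Optimal rotation via SVD of the 3x3 covariance matrix (Kabsch algorithm).
    u, _, vt = np.linalg.svd(p.T @ q)
    # Correct for a possible reflection so we get a proper rotation.
    d = np.sign(np.linalg.det(vt.T @ u.T))
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T

    # Root-mean-square deviation of the superposed coordinates.
    diff = (rot @ p.T).T - q
    return np.sqrt((diff ** 2).sum() / len(p))

# Placeholder example: a prediction identical to the native gives RMSD ~ 0.
coords = np.random.rand(50, 3) * 10
print(kabsch_rmsd(coords, coords))
```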
dcdc Send message Joined: 3 Nov 05 Posts: 1832 Credit: 119,860,059 RAC: 4,566 |
I really like the idea. I understand it would be difficult to provide meaningful quantitative data like that, but I think it would be possible: we could have a table containing a list of proteins, starting with simple ones and working through to complex ones (e.g. larger ones or ones that contain different metals, including ones that Rosetta can't yet handle), and a month-by-month accuracy % against each if they've been worked on in that month. That would certainly be a great way to demonstrate to people what's being worked on and to show some of the progress being made (I know there's more to Rosetta than just prediction though).

Do we already have all the data we need? We have the RMSD values... maybe we could do something like that without needing the project team to help?

What would be relatively easy is a project- or mod-only thread that states the month and a very brief list of what's being worked on, e.g.:

Oct 09: Working on improving zinc accuracy (xyz1 tasks). Bug fixes on...
mikey Send message Joined: 5 Jan 06 Posts: 1895 Credit: 9,214,786 RAC: 932 |
> ...an interesting idea Monk Duck. The gauge of accuracy you are looking for here is the RMSD. This is the most common form used to compare a predicted structure with the experimentally determined structure. An RMSD of zero would be a perfect match to the native structure. And it is more difficult to achieve an accurate prediction for a large protein than a smaller one, so you'd always have to compare some fixed set of proteins each time, and not expect to find low RMSDs for the larger ones, but rather hope to see the RMSD of the best predictions getting smaller over time.

Make it a part of BOINC and farm it out to us once a month.

> I believe what you would find is that the pace of such demonstrable progress would appear rather slow. It is extremely difficult to tackle new classes of proteins, or start modeling how two proteins will dock rather than just one. And so the entire change from one quarter to the next would often be the addition of a new line to the benchmark, with what may seem to be an unimpressive RMSD as compared to the rest of the benchmark which may tend to remain little changed. The thing that should be seen as impressive is that this type of protein can be modeled with any degree of accuracy at all.

Ah, but results that find nothing new are still results, at least to us lay persons. So if, say, you guys worked 3 months on something and couldn't find what you were looking for, it still means work was done, but a new way is needed to tackle the problem. Kind of like when all the naysayers say the World Trade Center buildings were brought down intentionally: it took the investigators months to finally prove that they came down as a result of the stresses involved. That meant many days of number crunching to pin down the answer in a complete and documented way. Obviously some of the work turned up results that indicated something else; those then had to be verified and validated, and either accepted or rejected based on further evaluation. I can imagine that your protein research is similar... focus on something and then try to prove the results beyond any speculation, whatever the final outcome is.

My son is a Chemistry PhD student and often complains that his experiment doesn't prove what he wants it to. But he fails to realize that negative results are indeed results, and just indicate the wrong path, or the wrong something or other. It could be the order in which things were combined, it could be the amounts, it could be a million different things that didn't provide the results he was looking for. But it is still an indication of a path tried and found to be a dead end. When we talk I always have to try and bring him back to reality and the fact that it is an 'experiment', because otherwise the answer would already be known and there would be no point! Before you ask, I have no idea what he is trying to do, but it is Organic Chemistry, and it is waaaay beyond making a volcano in the kitchen!
Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
Point taken, mikey. I was taking the perspective of the original post, which was not as interested in things that did not show positive results.

To further describe what mikey is saying: if you set out with the thought that doing x, y, z will lead to better predictions for proteins with zinc, run some studies, and don't see improved results, then it is perfectly legitimate to write a paper, document what you did and what your observations were, and basically indicate that no improvement was found. This is useful to other researchers: if they have a similar idea, they can review your paper and study, and either attempt to replicate your results (if they believe you may have done something incorrectly or reached an incorrect conclusion), or define specific elements of their own work which will differ from those described in the paper and perhaps make all the difference.

This is the approach Thomas Edison took in making the light bulb. We've all heard about the laboratory notebook with over 1,000 things that DON'T work as a filament for a light bulb... including coffee grounds, and a few other recreational diversions in their relentless efforts and long hours.

Rosetta Moderator: Mod.Sense
David Baker Volunteer moderator Project administrator Project developer Project scientist Send message Joined: 17 Sep 05 Posts: 705 Credit: 559,847 RAC: 0 |
This is a really interesting discussion, and you have very insightfully hit on the major issues. We are running benchmarks on a set of test proteins once every two weeks or so to test the new methods we are continually developing. We could post the results, but as Mod.Sense suggests, progress would not appear to be fast (the problem is hard, so most new ideas don't end up being big steps forward). As suggested above, when we do make a big advance it is documented in a scientific paper, but I appreciate these are probably not very accessible to a general audience.

Here is a possibility: in preparation for CASP next summer, we are trying to assemble all of our new methods into a structure prediction pipeline that we are testing on a set taken from previous CASP experiments. Once this is in place, I can report to you how the performance of the new combined method compares to the CASP8 approach, and try to break this down to identify what the main advances were.
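As a purely illustrative picture of what such a report might look like once the pipeline is in place, here is a tiny sketch; the target names and GDT_TS-style scores are invented, not real benchmark results:

```python
# Purely illustrative sketch of a "new pipeline vs. CASP8 approach" summary;
# target names and GDT_TS-style scores (higher is better) are invented.
casp8_scores = {"target_A": 42.1, "target_B": 55.0, "target_C": 31.7}
new_pipeline_scores = {"target_A": 47.3, "target_B": 54.2, "target_C": 40.5}

deltas = {t: new_pipeline_scores[t] - casp8_scores[t] for t in casp8_scores}

for target, delta in sorted(deltas.items()):
    print(f"{target}: {delta:+.1f} GDT_TS vs. the CASP8 approach")

improved = sum(1 for d in deltas.values() if d > 0)
mean_change = sum(deltas.values()) / len(deltas)
print(f"Improved on {improved}/{len(deltas)} targets, mean change {mean_change:+.1f}")
```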