Dr. Baker's journal archive 2009

Message boards : Rosetta@home Science : Dr. Baker's journal archive 2009

David Baker
Volunteer moderator
Project administrator
Project developer
Project scientist

Joined: 17 Sep 05
Posts: 705
Credit: 559,847
RAC: 0
Message 58751 - Posted: 12 Jan 2009, 6:11:25 UTC

The presentations made by the CASP8 assessors are now available at http://predictioncenter.org/casp8/docs.cgi?view=presentations.

We apologize for the outages over the past several weeks. The very good news is that Mike Tyka, Andrew Leaver-Fay and David Kim, in an all-out effort over the past two weeks, have identified the causes of many of the remaining errors some of you have been getting with the new Rosetta code, and we hope that the error rate will now be much lower. I am very excited to see the results of the next weeks of calculations on rosetta@home--we are investigating some very fundamental issues. I'll give you a full report when your results are back in and we have had a chance to think about their implications.
ID: 58751 · Rating: 0
David Baker
Volunteer moderator
Project administrator
Project developer
Project scientist

Joined: 17 Sep 05
Posts: 705
Credit: 559,847
RAC: 0
Message 59031 - Posted: 26 Jan 2009, 7:57:55 UTC

We are very excited--the new improved mini Rosetta code should now be running on your computers, hopefully very smoothly, and we have really exciting fundamental science questions that you will be helping us tackle this week. High on the list are the comparison of Rosetta low energy structures to native structures I mentioned recently (might the Rosetta models in some cases be more accurate??), and a comprehensive test of new comparative modeling methodologies we have developed after CASP8 last summer based on what we learned from your CASP8 results. We have developed several different approaches which use the information from already solved related structures to different extents, and we are excited to learn how the new methods compare to our CASP8 approach and which of the new methods is the best.
ID: 59031 · Rating: 0
David Baker
Volunteer moderator
Project administrator
Project developer
Project scientist

Joined: 17 Sep 05
Posts: 705
Credit: 559,847
RAC: 0
Message 59083 - Posted: 28 Jan 2009, 4:36:39 UTC

Mike has worked wonders! Here is his summary of what he and others have done to make mini Rosetta run smoothly on your computers:


Hello All!

We're ready for a new update. I want to thank all of you who have helped over the last months to find and fix errors in minirosetta. A particular thank you goes to those who have donated their time over on RALPH and helped with their active feedback - we managed to find a number of difficult and rare bugs and put some new features into minirosetta that should help conserve computer time. Read about it here: http://ralph.bakerlab.org/forum_thread.php?id=431
and here: http://ralph.bakerlab.org/forum_thread.php?id=432
I should add that work over there will continue, but now supplemented with information from Rosetta@HOME.

This update is highly focused on bugfixing and stability issues - we have virtually no new science in it, but we will hopefully now be able to run the science projects that have been in the pipeline waiting for BOINC - we're expecting quite a bit of work to go out very soon indeed. See Dr. Baker's journal for more details.


Features/Fixes:
1.54 Release CHANGELOG


Faster loop closing in FoldCST/Abinitio (affects cc_* cc2_* cs_* WUs), should help with overrunning WUs.

Bug fix concerning intermittent crashes in relax benchmark (_rlbd_) jobs - caused by a buggy input file reader.

Bug fix for a potential instability in handling text files (affects all types of WUs).

Bug fix in checkpointing machinery: states were not being correctly restored, probably contributing to long runtimes. (affects cc_* cc2_* cs_* WUs)

Increased the density of checkpoints to lose less time on restarts and address the weird "backjumping" of the time reported in this thread. This will still happen, but the jumps should be much smaller (at most as long as the time between checkpoints).

Added checkpointing to Loopclosing part of FoldCST. (affects cc_* cc2_* cs_* WUs)

Added checkpointing to Looprelax.

The Watchdog has been checked and improved, now returning information on the aborted jobs to help us figure out how the remaining long running models come about. The watchdog will now abort if the runtime exceeds your preferred runtime + 4 hours. In other words, the WUs should not overrun by more than around 4 hours. If they do, please let us know!

Added a limit on the number of decoys per WU: 99. The WU will end gracefully after that and give full credit. This should address the problems with excessive uploads.

Fixed a bug in the BOINC API concerned with unzipping the input data. (I will let the BOINC guys know about this)

Fixed a strange problem in the options system leading to early crashes on some systems.

Two nasty instabilities fixed deep in the FoldConstraints/abinitio protocol (cc_* tasks and other homology modelling tasks)

Generally implemented much better error reporting - many potential problems will now show up as meaningful error messages and not random segmentation faults.
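
The watchdog and decoy-limit rules in this changelog can be illustrated with a minimal sketch. This is not the minirosetta code: the function and constant names below are hypothetical, and only the two values (4 hours of allowed overrun, 99 decoys per WU) come from the changelog itself.

```python
# Hypothetical sketch of the two limits described in the changelog.
# Only the values (4 hours, 99 decoys) come from the post; all names are illustrative.

WATCHDOG_GRACE_HOURS = 4.0   # allowed overrun beyond the user's preferred runtime
MAX_DECOYS_PER_WU = 99       # the WU ends gracefully once this many decoys exist


def watchdog_should_abort(elapsed_hours: float, preferred_hours: float) -> bool:
    """Return True once a work unit has run longer than preferred runtime + 4 hours."""
    return elapsed_hours > preferred_hours + WATCHDOG_GRACE_HOURS


def should_finish_gracefully(decoys_written: int) -> bool:
    """Return True once the per-WU decoy limit has been reached."""
    return decoys_written >= MAX_DECOYS_PER_WU
```

With a preferred runtime of 3 hours, for example, this check would only trigger after 7 hours of elapsed time.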



NOTE: This new version still contains a lot of debug output. You will see that the stderr fills up with stuff - that is ok. It does not slow down the program nor cause much extra upload - but it tells us a lot about where things can still go wrong.


Despite all these fixes there are, I'm sure, many problems left. Most of them occur extremely rarely now, though, or are highly specific to particular machines. Thus we have decided to move the current version over from RALPH to Rosetta@HOME and give it a go on a much larger scale. Our efforts to keep the failure rate down will continue, and your time donations over on RALPH as well as error reports are still highly appreciated.

Please let us know how things work out there. In particular, I'd like to know about:


Stuck workunits
Overrunning workunits (WUs should now, due to the new watchdog, never run more than 4 hours longer than the preferred user time)
Problems with checkpointing.
Any other strange behaviour.



Happy crunching - I'm very excited to see how this new version will pan out.

Mike


ID: 59083 · Rating: 0
David Baker
Volunteer moderator
Project administrator
Project developer
Project scientist

Joined: 17 Sep 05
Posts: 705
Credit: 559,847
RAC: 0
Message 59084 - Posted: 28 Jan 2009, 4:49:08 UTC

Following up on my last post, here is an email I was very happy to get today!:

Wow!! I don't think I've ever seen a new release go for 24hrs without a single problem reported in the "problems with..." thread. Congratulations, what a great turn around!

Not only no problems reported, but not even a "...but what about the problem I had, that no one else ever saw before or since?" post.

3 cheers for the R@h team! Kudos all 'round.

100 TFLOPS here we come!!!

ID: 59084 · Rating: 0
David Baker
Volunteer moderator
Project administrator
Project developer
Project scientist

Joined: 17 Sep 05
Posts: 705
Credit: 559,847
RAC: 0
Message 60010 - Posted: 7 Mar 2009, 5:03:08 UTC

The wonderful work that all of you are doing is keeping me very busy! We are now writing up manuscripts describing the many exciting results that you have obtained. In my next few posts I'll give you a brief overview of each of these.

The first manuscript is called "Simultaneous prediction of protein folding and docking at high resolution". It reports your exciting results with the "fold and dock" runs set up by Rhiju and Ingemar several months ago. Here is the abstract:
Despite recent successes in high resolution de novo modeling, computational methods have yet to achieve blind predictions of proteins in their most commonly occurring functional forms, symmetric homomultimers. Building on the Rosetta framework, we present a general method to simultaneously model the folding and docking arrangements of multiple chains. A benchmark study on large alpha-helical bundles, interlocking beta sandwiches, and interleaved alpha/beta motifs demonstrates the method’s generality, near-atomic accuracy, and potential use in molecular replacement phasing. Further, we present blind tests on a crystallized coiled-coil as well as two dimers with more complex geometries solved by NMR. These results indicate that high-resolution modeling of multimers is within the reach of the structure prediction community and may have immediate practical use for crystallographic phasing and the rapid structure determination of multimers with limited NMR information.


ID: 60010 · Rating: 0
David Baker
Volunteer moderator
Project administrator
Project developer
Project scientist

Joined: 17 Sep 05
Posts: 705
Credit: 559,847
RAC: 0
Message 60012 - Posted: 7 Mar 2009, 6:55:10 UTC

The second manuscript I am working on is titled "Alteration of Enzyme Specificity by Computational Loop Remodeling and Design". It describes a new approach using Rosetta to redesign enzymes in the human body to catalyze new reactions. Graduate student Paul Murphy not only developed the new method, but also in the manuscript shows how it can be used to create a new enzyme for gene therapy.

Here is the idea. Suppose a patient needs a transfusion of cells from another person to recover from a disease. There is a small chance that these cells will, rather than helping you, actually cause some new problem. In this case, it is important to be able to selectively kill these cells. For this purpose, special drugs have been developed which are not themselves toxic, but become toxic when broken down by a particular enzyme. If this enzyme is put into the introduced cells, then they can be killed if necessary by giving the drug to the patient.

The problem is--where does this enzyme come from? If it is a human enzyme, then the patient will convert the drug to the toxic compound in his/her normal tissue, which would be very bad. If it is an enzyme that humans don't have, from bacteria for example, the patient will be safe from the drug. However, our bodies are made to destroy anything that looks foreign, and this bacterial protein certainly will look foreign.

Our solution is to take a human enzyme, and keep the outside the same, so the patient's immune system thinks it is a human protein and doesn't destroy it, but change the catalytic site on the inside so that it converts the drug to the toxic compound. In the manuscript, Paul shows how a human enzyme that deaminates guanine can be redesigned by remodeling not only the sidechains but also the protein backbone to deaminate cytosine. While not quite ready for prime time, with some optimization this enzyme could be used in gene therapy as described above.

In this case I don't have much more work to do--the manuscript is already accepted for publication in the Proceedings of the National Academy of Sciences, but it is a bit over their length limit so we have to figure out how to cut out a few words.
ID: 60012 · Rating: 0
David Baker
Volunteer moderator
Project administrator
Project developer
Project scientist

Joined: 17 Sep 05
Posts: 705
Credit: 559,847
RAC: 0
Message 60027 - Posted: 8 Mar 2009, 17:14:16 UTC

The third manuscript I am working on is called "Blind docking of pharmaceutically relevant compounds using RosettaLigand". This describes the current state of our efforts towards improving methods for designing new small molecule drugs to combat diseases. The challenge addressed by this paper is to predict how drug molecules bind to proteins--determining how they bind is necessary to improve their efficacy and to identify good candidate molecules in the first place. In two earlier papers we had described the incorporation of small molecule modeling into Rosetta to create the new RosettaLigand drug docking methods, and tested the method on publicly available data sets. However, many of you will not be surprised that much of current drug discovery is being carried out in pharmaceutical companies, and the compounds they are testing are generally not made public (they are akin to trade secrets). We wanted to learn whether RosettaLigand was successful in docking drug compounds in actual drug discovery efforts, and were very fortunate when a large drug company agreed to let Ian Davis, who is working on the project, visit one of their company sites and analyze the results they had gotten with RosettaLigand on their private set of compounds. The results are described in this paper and are very encouraging: not only does RosettaLigand appear to be one of the best current methods for this crucial step in drug design, but there are also some clear avenues for improvement (which weren't evident with the public datasets) which we are now starting to work on.

ID: 60027 · Rating: 0
David Baker
Volunteer moderator
Project administrator
Project developer
Project scientist

Joined: 17 Sep 05
Posts: 705
Credit: 559,847
RAC: 0
Message 60035 - Posted: 9 Mar 2009, 6:19:24 UTC

The fourth manuscript describes rosetta@home predictions in CASP8. As you know, with your great help we were again the top ranked group in CASP this year. The manuscript begins:

The CASP8 experiment provided an invaluable opportunity to extensively benchmark the Rosetta protein structure prediction method on a wide range of comparative modeling and free modeling targets. For the targets for which a sequence-detectable structural template existed, target-template sequence alignments were generated, and the Rosetta rebuild and refine protocol was used to generate low energy models. For the small number of targets for which a reliable template could not be identified modeling was carried out using the Rosetta de novo modeling protocol. All targets were subjected to extensive high resolution refinement with the physically realistic Rosetta all-atom forcefield using Rosetta@home.

This one has a due date--March 15--so we have to hurry up a bit!
ID: 60035 · Rating: 0
David Baker
Volunteer moderator
Project administrator
Project developer
Project scientist

Joined: 17 Sep 05
Posts: 705
Credit: 559,847
RAC: 0
Message 60053 - Posted: 10 Mar 2009, 3:59:00 UTC

The fifth manuscript describes a very exciting recent discovery made in collaboration with another research group. As many of you know, a large number of diseases, including Alzheimer's, are associated with the folding of a protein not into its normal active state but instead into long repeating fibrils. These are called "amyloid fibrils" and the associated diseases are referred to as "amyloid" diseases. We used Rosetta to design molecules predicted to block the formation of the fibrils, and, strikingly, when these molecules are added to the disease state protein under conditions where it normally forms fibrils, none are formed. We are very excited about this amyloid blocker, but as always I must stress that there are a lot of steps before a finding like this leads to an actually distributed drug.
ID: 60053 · Rating: 0
David Baker
Volunteer moderator
Project administrator
Project developer
Project scientist

Joined: 17 Sep 05
Posts: 705
Credit: 559,847
RAC: 0
Message 60070 - Posted: 11 Mar 2009, 5:24:53 UTC

The sixth manuscript describes our recent results on designing new DNA cutting enzymes that can cut the DNA double helix at very specific target sites. The enzymes we design only cut at one particular 20 base sequence. Remember that DNA is made of four different bases, A T G and C so a particular 20 base site might for example be AGAATAGGATCCAGATCGTC. This long a sequence is extremely unlikely to occur more than once in the human genome, so if we design an enzyme to cut at a particular place in the genome it is likely to cut only at that site (of course we can check this because the sequence of the human genome is known).
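
As a rough illustration of why a 20 base site is so unlikely to recur by chance, here is a back-of-the-envelope calculation. It is only a sketch: the genome size (about 3.1 billion base pairs) is an approximate figure that is not from the post, and the model assumes bases are independent and equally likely, which real genomes (with their repeats) are not.

```python
# Back-of-the-envelope estimate under simplifying assumptions.
# The genome size is an approximation and is not a figure from the post.

SITE_LENGTH = 20
GENOME_BASE_PAIRS = 3.1e9      # assumed approximate haploid human genome size
STRANDS = 2                    # a site can be read on either strand

possible_sites = 4 ** SITE_LENGTH          # number of distinct 20-base sequences
positions = GENOME_BASE_PAIRS * STRANDS    # rough number of places a site could start

# Expected number of chance matches if bases were independent and equally likely.
expected_chance_matches = positions / possible_sites

print(f"distinct 20-base sequences: {possible_sites:,}")
print(f"expected chance matches:    {expected_chance_matches:.4f}")
```

Under these assumptions the expected number of accidental matches is well below one, which is why checking the design against the known genome sequence remains the real test.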

Why would we want to do this? Suppose you have a mutation in a gene and it is making you sick. If we can make an enzyme which cuts near the site of mutation and introduce this enzyme and a "correct" copy of the gene into your cells, the enzyme will cut the DNA and the cut will likely be repaired by copying the correct version we have added (cells repair breaks in DNA by copying the most closely related piece of DNA around). Thus, with such enzymes we can potentially cure diseases by "gene therapy".

The manuscript describes progress towards this long range goal. We show we can design new enzymes that cut at new sites, and by examining how they work in detail we reveal some pretty big surprises in how the naturally occurring enzyme works, which are fascinating in their own right and provide considerable insight into how to achieve the longer term goals of using redesigned enzymes for gene therapy.
ID: 60070 · Rating: 0
David Baker
Volunteer moderator
Project administrator
Project developer
Project scientist

Joined: 17 Sep 05
Posts: 705
Credit: 559,847
RAC: 0
Message 60083 - Posted: 12 Mar 2009, 3:52:11 UTC

The seventh paper is also related to your CASP8 predictions. The assessors asked us and a couple of other groups who made models that were geometrically and energetically nearly indistinguishable from native structures to write a paper describing how we were able to consistently generate physically plausible structure models. This was much less work for me than the preceding six manuscripts--I had only to describe the key concepts behind rosetta@home and the search for very low energy structures that are familiar to those of you who follow the screen saver from time to time.
ID: 60083 · Rating: 0
David Baker
Volunteer moderator
Project administrator
Project developer
Project scientist

Joined: 17 Sep 05
Posts: 705
Credit: 559,847
RAC: 0
Message 60156 - Posted: 14 Mar 2009, 4:31:31 UTC
Last modified: 15 Mar 2009, 4:27:44 UTC

Not all the writing I do is research papers; I have to write grants to get research funding as well (unfortunately). Today I'm putting together a proposal to use Rosetta to computationally design compounds which block the formation of the amyloid fibrils which accumulate in a number of different diseases (this is a followup on the fifth manuscript described in an earlier post). My collaborator and I will send this proposal to the recently announced NIH "Challenge Grant" program, part of the federal stimulus package.
ID: 60156 · Rating: 0
David Baker
Volunteer moderator
Project administrator
Project developer
Project scientist

Joined: 17 Sep 05
Posts: 705
Credit: 559,847
RAC: 0
Message 60267 - Posted: 22 Mar 2009, 5:47:20 UTC

Rosetta@home has received a substantial monetary contribution from an anonymous donor! Following the suggestion of the donor, the University of Washington has used the money to start a special “Rosetta@home fund” that will be used to pay part of David Kim’s salary (David is the architect of Rosetta@home and the person who keeps the project running), upgrade the servers as needed, and allow us to make more rapid progress on the disease-related research Rosetta@home is carrying out. If you would like to make a (tax-deductible) contribution to the project, the link is Rosetta@home fund. David will be adding a link to this from the Rosetta@home home page in the next day or two. Thank you for your contributions to the project!


ID: 60267 · Rating: 0
David Baker
Volunteer moderator
Project administrator
Project developer
Project scientist

Joined: 17 Sep 05
Posts: 705
Credit: 559,847
RAC: 0
Message 60609 - Posted: 12 Apr 2009, 7:20:35 UTC

I'd like to thank rosetta@home participant Michael G R for finding and posting on this article on protein design and rosetta@home:

http://seedmagazine.com/content/article/protein_power/

I talk to reporters fairly often, but I usually don't see the results (sometimes I don't want to!). This article I think is pretty good.



ID: 60609 · Rating: 0
David Baker
Volunteer moderator
Project administrator
Project developer
Project scientist

Joined: 17 Sep 05
Posts: 705
Credit: 559,847
RAC: 0
Message 60783 - Posted: 23 Apr 2009, 2:29:52 UTC

WIRED has a feature article on fold.it this week. It was originally a 3500 word piece that had much more about rosetta@home and CASP, but unfortunately most of this got cut when they shrank it to 2500 words at the end.
ID: 60783 · Rating: 0
Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 60786 - Posted: 23 Apr 2009, 4:30:18 UTC

A link to the article: Gamers Unravel the Secret Life of Protein
(be sure to click through to the next pages).

and a few quotes:

Baker was the Most Valuable Player in the protein chemistry world's biennial World Series, a competition to see who can predict the shape a protein will fold into, knowing nothing more than the sequence of its constituent parts. It's called the Community-Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction, or CASP.


Of the 15 Foldit solutions that Baker submitted to CASP, seven had finished in the money...One of their solutions even took first place. A band of gamer nonscientists had beaten the best biochemists...when they turned to Cheese and asked him how he knew the way to tweak the proteins—for example, by orienting hydrophobic sidechains toward the protein core—he shrugged and said, "It just looks right."


He (Baker) and Popović have given the players a challenge: Design a new protein...These proteins could actually have therapeutic value in the real world, outside the game. And if they do, the Foldit players will share the credit. It might be the first time that a computer game's high score is a Nobel Prize.

Rosetta Moderator: Mod.Sense
ID: 60786 · Rating: 0
David Baker
Volunteer moderator
Project administrator
Project developer
Project scientist

Joined: 17 Sep 05
Posts: 705
Credit: 559,847
RAC: 0
Message 60827 - Posted: 26 Apr 2009, 4:48:50 UTC - in response to Message 60786.  


Baker was the Most Valuable Player in the protein chemistry world's biennial World Series, a competition to see who can predict the shape a protein will fold into, knowing nothing more than the sequence of its constituent parts. It's called the Community-Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction, or CASP.


"Most valuable player" is definitely a misnomer! Much more accurate (given that I did virtually nothing directly myself) would be coach or general manager of the winning team, the key players of which include the students and postdocs in my group who did lots of hard work, and all of you for crunching during the summer on the CASP targets!
ID: 60827 · Rating: 0
David Baker
Volunteer moderator
Project administrator
Project developer
Project scientist

Joined: 17 Sep 05
Posts: 705
Credit: 559,847
RAC: 0
Message 60978 - Posted: 4 May 2009, 4:57:34 UTC

We just found a source of flu surface protein closely related to that on the strain making headlines in the last few days, and are starting to design tight binding inhibitors that target a site where the flu virus can't change. We will keep you posted as these design work units are sent out on rosetta@home. Fortunately, it seems that there may have been a bit of a false alarm with the flu in this case, but it is a good test case for us as we ultimately aim, with your help, to be able to design blockers to new pathogens within a relatively short time.
ID: 60978 · Rating: 0
David Baker
Volunteer moderator
Project administrator
Project developer
Project scientist

Joined: 17 Sep 05
Posts: 705
Credit: 559,847
RAC: 0
Message 61374 - Posted: 26 May 2009, 5:49:13 UTC

Tonight I will describe how we are going about designing inhibitors for the flu surface protein.

We start from the crystal structure of the key flu virus surface protein and focus on a particular area of the surface of this protein that does not change from strain to strain (as you know, we keep getting the flu because it changes rapidly and new versions can slip by our immune systems because they look different from the versions we fought off in the past). By focusing on an invariant region of the surface we hope to generate broadly neutralizing inhibitors.

The next step is to use Rosetta to dock disembodied amino acid sidechains onto this portion of the virus surface to identify particularly tight low energy interactions. For example, we might find that a tryptophan residue fits snugly into one pocket on the virus surface, and a tyrosine residue into another. Given these maps of tight binding interactions of isolated amino acids with the virus surface, we then design a protein scaffold which can position as many of these binding "hotspot" residues as possible with orientations correct for binding the virus simultaneously. The next step is to optimize the remaining residues of the protein we are designing to interact as tightly as possible with the virus.

We plan to experimentally test around 20 of the novel proteins generated using the above procedure. We will of course pick the designs which have the strongest computed interactions with the virus. We will experimentally test more than one design because our computational design methods, while powerful, are still not perfect, so we don't expect every design predicted to bind tightly to actually bind tightly.

To experimentally test the designed inhibitors, we begin by synthesizing artificial genes encoding the new proteins. We simply use the genetic code in reverse: for each amino acid in the computer generated protein sequence we insert the appropriate DNA base "codon" until we have covered the whole sequence. Once we have the synthetic genes, we carry out two types of experimental tests. I'll describe these in my next post.
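
To make "using the genetic code in reverse" concrete, here is a minimal sketch of reverse translation. It is an illustration only, not the lab's actual gene design procedure: the function name and the choice of one fixed codon per amino acid are assumptions, and real synthetic genes would typically also account for the codon preferences of the organism used to produce the protein.

```python
# Illustrative reverse translation: one fixed codon per amino acid.
# The codon choices are arbitrary picks from the standard genetic code;
# real gene design would also consider the host organism's codon usage.

CODON_FOR = {
    "A": "GCT", "R": "CGT", "N": "AAT", "D": "GAT", "C": "TGT",
    "Q": "CAA", "E": "GAA", "G": "GGT", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}
STOP_CODON = "TAA"


def reverse_translate(protein: str, add_stop: bool = True) -> str:
    """Build a DNA sequence encoding the given one-letter protein sequence."""
    codons = []
    for residue in protein.upper():
        if residue not in CODON_FOR:
            raise ValueError(f"unknown amino acid: {residue!r}")
        codons.append(CODON_FOR[residue])
    if add_stop:
        codons.append(STOP_CODON)
    return "".join(codons)


print(reverse_translate("MWY"))  # ATGTGGTATTAA
```

Because most amino acids have several possible codons, many different DNA sequences encode the same protein; the table above simply picks one of them.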

ID: 61374 · Rating: 0
David Baker
Volunteer moderator
Project administrator
Project developer
Project scientist

Joined: 17 Sep 05
Posts: 705
Credit: 559,847
RAC: 0
Message 62045 - Posted: 2 Jul 2009, 5:45:23 UTC

Manuel asked in the discussion thread about the manuscripts I described a couple of months ago. The second manuscript has already been published in a journal with the excellent policy of making all articles free to the public; you can take a look at

http://www.pnas.org/content/106/23/9215.long

The first manuscript was just accepted for publication in the same journal (Proceedings of the National Academy of Sciences). The third and fourth manuscripts have also been accepted but not yet published.

The sixth manuscript got very enthusiastic reviews in the widely read journal Nature and hopefully you will be able to read about it there in not too long.

While we are all most interested in developing cures for diseases, aging and other problems, scientific papers are still a good way of benchmarking progress towards these goals. Your contributions to rosetta@home are certainly having a big impact on this short term measure of progress, and we will keep working together towards the longer term goals!




ID: 62045 · Rating: 0



©2024 University of Washington
https://www.bakerlab.org