Posts by James Thompson

1) Message boards : Number crunching : Minirosetta 3.14 (Message 70623)
Posted 22 Jun 2011 by James Thompson
It looks like the ilv_fgf2_all_boinc units have a problem. Another one failed after 2.1 seconds with the same error as before.

ERROR: Cannot open PDB file "2ilaA.pdb"
ERROR:: Exit from: src/core/import_pose/ line: 199
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish

Something is wrong with those workunits. I'll remove them now.

2) Message boards : Number crunching : minirosetta 2.17 (Message 70145)
Posted 27 Apr 2011 by James Thompson
Thanks svincent. This is another input file issue, this time from a different user. The jobs have been removed, and we're working on the problem right now.

3) Message boards : Number crunching : Compute error (Message 70136)
Posted 27 Apr 2011 by James Thompson
Greg - this is one of the errors people have been trying to get the sysadmins to address for several weeks. As you noted, there has been no status given by the project.

This is fixed now, and more detail is in this thread.
4) Message boards : Number crunching : could not open file cs_frags.9mers.gz (Message 70135)
Posted 27 Apr 2011 by James Thompson
Thanks for your reply James.

You're very welcome, it's the least that I can do. I mean that literally, because we're trying to make this kind of mistake difficult to make in the future.

We're currently discussing automated options for detecting this kind of problem and notifying developers, as the human component (me in this case) failed here, and we do not want this to happen in the future. Once we decide on a solution I'll post it on the forum.

Once more, I'm very sorry for wasting your time. I'm going to try and encourage my colleagues to post around here and let you know what we're doing. We really appreciate all of your efforts and will try to communicate better.


5) Message boards : Number crunching : could not open file cs_frags.9mers.gz (Message 70132)
Posted 26 Apr 2011 by James Thompson
Hi everyone,

This job is my fault. Mod.Sense e-mailed me over the weekend, and the last vestiges of the jobs should have run their course. While there's no excuse for letting this problem go on for so long without being caught, I'd like to offer an apology to anyone who feels like their time is being wasted. I'd also like to explain the problem in more detail below and mention what actions we're taking to prevent this in the future.

The fundamental problem with these jobs is that the .zip files sent out to everyone's computers were missing a file. This means that Rosetta failed as soon as the jobs tried to start. These jobs were part of a very large batch, and only some of the jobs were failing. As many of the jobs were successful, I didn't realize that this was going on until Mod.Sense e-mailed me. Even worse, I originally misdiagnosed which job was causing the problem, and removed jobs from the queue that were actually succeeding. Now that we have the true culprit, the job success rates should return to previous levels.
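
To illustrate the kind of failure described above, here is a minimal sketch of a pre-submission check that lists which required files are absent from a .zip archive. It is only an illustration under stated assumptions, not the project's actual tooling: the function name `missing_inputs` and the file name `input.fasta` are hypothetical, and `cs_frags.9mers.gz` is borrowed from the thread title as an example.

```python
import io
import zipfile

def missing_inputs(archive, required):
    """Return the required member names that are absent from the zip archive."""
    with zipfile.ZipFile(archive) as zf:
        present = set(zf.namelist())
    return sorted(name for name in required if name not in present)

# Build a small in-memory archive that lacks one required file.
# The file names here are illustrative only.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("input.fasta", ">query\nMKV\n")

# Any non-empty result means the archive should not be sent out.
print(missing_inputs(buf, ["input.fasta", "cs_frags.9mers.gz"]))
```

Running a check like this on every archive before it is queued would turn an instant failure on volunteers' machines into an error on the submitter's side.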

The work that these jobs are doing is actually very important and exciting; I'll explain it in a separate post very soon. We're currently involved in a worldwide competition where people try to determine protein structures with limited experimental data, and our preliminary results are very promising.

In order to prevent this happening in the future, I'll no longer be submitting jobs in such large batches, so that jobs causing errors will be more obvious. I'm very sorry for the mistake, and even more sorry that I've managed to upset some of you. We're testing new and experimental methods all of the time with Rosetta, which makes it very unique and exciting, but mistakes like this are simply not acceptable even in testing. My sincerest apologies for the mistake, and I hope that you'll continue giving us your interest and your time.
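
When only some jobs in a very large mix fail, an overall success rate can hide the problem, while a per-batch failure rate exposes it. The sketch below shows that idea in a few lines. It is an assumption-laden illustration, not the project's monitoring code: the function `failing_batches`, the batch names, and the `threshold` and `min_jobs` parameters are all hypothetical.

```python
from collections import defaultdict

def failing_batches(results, threshold=0.5, min_jobs=5):
    """results: iterable of (batch_name, succeeded) pairs.

    Return {batch_name: failure_rate} for batches with at least min_jobs
    results whose failure rate exceeds threshold.
    """
    totals = defaultdict(int)
    failures = defaultdict(int)
    for batch, succeeded in results:
        totals[batch] += 1
        if not succeeded:
            failures[batch] += 1
    flagged = {}
    for batch, n in totals.items():
        rate = failures[batch] / n
        if n >= min_jobs and rate > threshold:
            flagged[batch] = rate
    return flagged

# Hypothetical results: batch_a mostly succeeds, batch_b always fails.
results = [("batch_a", True)] * 9 + [("batch_a", False)] + [("batch_b", False)] * 6
print(failing_batches(results))
```

In this made-up data the combined failure rate is under 50%, yet grouping by batch singles out `batch_b` immediately.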
6) Message boards : Rosetta@home Science : Stories from CASP8 (Message 55731)
Posted 13 Sep 2008 by James Thompson
Here are some quick responses to questions people have asked me about CASP8:

Is the comparative modeling what "rebuild" is doing in the game? If the structure is unknown, how do you come up with the constraints? And how do you know (or at least gain some confidence that) your chosen constraints are not omitting the native structure?

Comparative modeling is an approach to protein structure prediction that uses information from known structures in the Protein Data Bank that are likely to be similar to the protein being predicted. The rebuild tool in Fold.It is part of our comparative modeling protocol, but it's not the entire protocol, and it's not the only approach that people have! Other things that we try are small random perturbations to the structure followed by a gradient-based energy minimization, which is the same minimization done by the "wiggle" tool in Fold.It.
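
The perturb-then-minimize idea can be sketched on a toy problem. This is a minimal illustration only: the two-variable quadratic `energy` below is a stand-in and has nothing to do with Rosetta's energy function, and the function names, step size, and perturbation scale are all assumptions chosen for the example.

```python
import random

def energy(x):
    """Toy energy: a quadratic bowl with its minimum at (1, -2)."""
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2

def gradient(x):
    """Analytic gradient of the toy energy."""
    return [2.0 * (x[0] - 1.0), 2.0 * (x[1] + 2.0)]

def minimize(x, step=0.1, iters=200):
    """Plain gradient descent with a fixed step size."""
    x = list(x)
    for _ in range(iters):
        g = gradient(x)
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x

def perturb_and_minimize(start, rounds=10, scale=0.5, seed=0):
    """Apply a small random perturbation, minimize, and keep the
    result only if its energy is lower than the best seen so far."""
    rng = random.Random(seed)
    best = minimize(start)
    for _ in range(rounds):
        trial = [xi + rng.gauss(0.0, scale) for xi in best]
        trial = minimize(trial)
        if energy(trial) < energy(best):
            best = trial
    return best

best = perturb_and_minimize([5.0, 5.0])
print(best, energy(best))
```

On a single smooth bowl like this one the perturbations add nothing, since gradient descent alone reaches the minimum. They matter on rugged landscapes, where a random kick can move the structure out of one local minimum so that the next minimization settles into a lower one.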

In comparative modeling, we only need to know one structure of a similar sequence to our sequence in order to derive constraints, so we don't need to know the answer beforehand. However, if there are no available similar structures (similar structures are called templates), then we use our abinitio approaches to structure prediction, which are unconstrained and rely only on fragment assembly and Rosetta's energy function.

Your last question is a very piercing one - a comparative model built from known structures always has some features right and some features wrong. Knowing which things to move and how to move them is the most important question in refining comparative models! My project in the lab right now is to use statistics on the template structures and their relationship to the query to pick which constraints to turn on and off. The abstract idea is that based on the shape of the template structures and their relationship to the query, we believe some things more strongly than others. For example, even two proteins with very similar sequences will have minor differences in their structures. These differences are more likely to be in the periphery of the protein than in the core, as we know from experience that the core is usually conserved among similar proteins. Another example is that some pieces of the sequence you're trying to predict don't even line up with anything in the template structure - this means that those pieces and the pieces close to them are going to be less conserved! Formalizing these intuitively obvious statements about the relationships between proteins is my current project, and it's a very interesting one. At the very least, it's interesting to me. :)

David has suggested that several other members of our group post in this thread as well to talk about what we did, how we did it, and tell you some hopefully amusing anecdotes about our work during CASP8. Expect those as well in the coming weeks, along with more discussions of protocols, results, and future goals for the project.

The CASP 8 predictions from all participating servers are available at If you're using Windows, the *.tar.gz files there can be unzipped and untarred (i.e., accessed) by using a program like 7-Zip; Mac and Linux users have built in utilities to deal with .tar.gz files.
At that page, there are four servers clearly belonging to the Baker group: BAKER-DP_HYBRID, BAKER-GINZU, BAKER-ROBETTA and BAKER-ROSETTADOM. Which of these used Rosetta@home resources? It would also be helpful to know those servers' ID numbers, which I imagine are needed to figure out who made what predictions in the files I linked to.

The only automated server that submitted 3D models for CASP8 was the Baker-Robetta server, and the models are named BAKER-ROBETTA_TS1.pdb through BAKER-ROBETTA_TS5.pdb, so you shouldn't need the ID numbers. You can view them in a program like Rasmol or Pymol, which I think people can find through Google. You can also find some of the native structures on this page:

An example is target T0473, where the native structure is PDB ID 2k53.

Robetta used Rosetta@Home for its abinitio predictions - these are the ones for which there is no obvious comparative modeling template available. The comparative modeling stuff was done in-house, because our comparative modeling approach for Robetta has lots of steps and doesn't easily translate to something as parallel and powerful as Rosetta@Home. Our comparative modeling benchmark is directed towards figuring out which of our many approaches to comparative modeling will go onto Robetta in the next few months. So pay attention! That model your computer is making could be important to the future of the project. :)

As always, thanks for crunching. CASP8 was a great time for all of us, and everyone in the Baker Lab really appreciates your contributions.
7) Message boards : Number crunching : Minirosetta v1.34 bug thread (Message 55711)
Posted 12 Sep 2008 by James Thompson
Please post bugs/issues with minirosetta v1.34 here. This has several new scientific updates that David has mentioned in his journal. The basic idea is that we ran new code within the lab during CASP8, and we'd like to take the successful approaches we've found and port them to Rosetta@Home.

See this thread for more information on what we're trying to do. Thanks!
8) Message boards : Rosetta@home Science : Stories from CASP8 (Message 55710)
Posted 12 Sep 2008 by James Thompson
The purpose of this thread is to talk about what we tried during CASP8 and what we learned. I'll try to update this once a week with successful predictions, stories, and lessons that we learned from CASP8. I'll start this week out with an overview of one of the lessons that we learned.

One of the most successful approaches we tried is based on exploiting knowledge of known protein structures in modeling. There are on the order of 4 million protein sequences, and we know the structure of about 50,000 proteins in the Protein Data Bank. Some of the sequences that we looked at during CASP8 were very similar to proteins of known structure, and these similar structures that we know can be very useful as starting points for refinement. Utilizing information from known structures to build a model of a protein with unknown structure is called comparative modeling, and it's a very interesting problem in the field.

Included in the minirosetta v1.34 application are some updates to our protocols for comparative modeling. These all try to steal features of one (or more!) known structures to predict the structure of a protein with unknown structure. These include:

- assembling a protein from an extended chain in the presence of constraints that exist between atoms, and also other geometric features of the protein.
- starting with a conformation derived from a protein of known structure and attempting to remodel it, both with and without constraints.

I'll try and make sure that workunits for these tasks have descriptive names so that everyone can better understand how your computers are helping us. Thanks for crunching!

Also, what do people want to see? Do you like seeing superpositions of successful predictions? Do you want to hear more about the open problems in structure prediction and what we're pursuing? We also have some other anecdotes and stories from CASP8 if people want to hear that as well.
9) Message boards : Number crunching : minirosetta v1.25 bug thread (Message 53327)
Posted 25 May 2008 by James Thompson
Please post minirosetta v1.25 bugs/issues here.
10) Message boards : Number crunching : minirosetta v1.19 bug thread (Message 52876)
Posted 6 May 2008 by James Thompson
We have an updated version of minirosetta v1.19 which should fix some of the stability issues with v1.15. Post minirosetta v1.19 bugs here.
11) Message boards : Number crunching : minirosetta v1.15 bug thread (Message 52799)
Posted 30 Apr 2008 by James Thompson
Hi everyone,

I just wanted to post again and let you know that we're in the process of debugging minirosetta. Thank you all for your input, we're taking the errors from this application very seriously.

We're enlisting the help of Rom Walton, one of the BOINC developers, to help us debug some of the trickier problems with minirosetta v1.15, so expect a new release up on Ralph very soon. Rom is a very talented programmer, and has helped us a great deal in the past with the rosetta_beta app. We hope to have a new version of minirosetta (v1.16) on Ralph by tomorrow that should address some of the problems people have been having.

We've also fixed the problem with validating the results from some of our minirosetta test jobs, so please let us know if that happens in the future. This is a result of trying some new protocols for the next CASP, which we'll describe in detail in the Science threads as we apply these methods to CASP8 targets.

This is all very exciting for us, and thank you for crunching. CASP8 starts on Monday, and I'm very much looking forward to it. Cheers,

12) Message boards : Rosetta@home Science : Tryptophan (Message 52798)
Posted 30 Apr 2008 by James Thompson
Hi Hugo,

This is a very good observation, and many people have looked at this in the past. On average, tryptophans tend to be buried in the center of proteins due to their hydrophobicity. However, this isn't always true; there are many proteins that have tryptophans or other hydrophobic amino acids on their surface. Much of the time these hydrophobic amino acids are on the surface for a reason, and one such reason is that tryptophans can sometimes form binding interfaces for proteins to connect to one another.

The idea that some amino acids like to be more on the surface and others like to be more buried is behind one of the terms in our energy function. Good job figuring this out on your own!


13) Message boards : Rosetta@home Science : When can we see Rosetta project result? (Message 52720)
Posted 25 Apr 2008 by James Thompson
Einstein@home project releases periodic results. Does Rosetta do same or...?

Hi Orgil,

We've published several papers on a variety of subjects using results generated on Rosetta@Home. In David's Journal, there are references to several papers in which credit is given to volunteers (and their team, if applicable!) from the Rosetta@Home project. Here's an example figure from a recent paper from our lab in PNAS:

PNAS Figure

Here's a news article in the journal Nature that talks about Rosetta@Home:

Nature Article

We think it's important to acknowledge that our crunchers are helping our scientific research by donating computing time, and it's our opinion that this acknowledgment should continue for as long as we are running Rosetta@Home.

I hope that this answers your question. Cheers!
14) Message boards : Number crunching : Rosetta Application Version Release Log (Message 52673)
Posted 23 Apr 2008 by James Thompson
Version 1.15 of minirosetta is now released, which includes protocols for doing very fast optimization of multiple starting structures, improved methods for modeling protein structures based on constraints, and faster versions of our all-atom refinement protocol.
15) Message boards : Number crunching : minirosetta v1.15 bug thread (Message 52672)
Posted 23 Apr 2008 by James Thompson
Workunits for minirosetta v1.15 are going to be sent out in slowly increasing batch sizes over the next two days. Please report application bugs in this thread.
16) Message boards : Rosetta@home Science : What are those curls, whirls or coils ? (Message 43985)
Posted 20 Jul 2007 by James Thompson
I've been wondering about this for a while.

I understand the representation of molecules with "sticks" where sticks represent a binding between atoms.
But when I look at the graphics of Rosetta, I also see these curls (I hope I use the right word, I don't speak English natively). What are these a representation of?
If it would be a bunch of sticks curled up, no problem, but they are shown with a surface, like a ribbon of some sort.

(They look related to the Helix from DNA, but I don't know either in DNA why it is represented like that. In other words, if you see such a curl representation, can one be sure of the exact combination of molecules inside? For example, for DNA, it can be either of the 4 acids (A,T,G,C) about anywhere, so this representation does not reveal all, just a possibility. Is this the same?)

Please release my brain, someone.. ;-)

Mod.Sense is absolutely right, and I'm going to elaborate on his answer a bit.

I think that the curls you refer to are alpha helices, which are a type of secondary structure element found in proteins. The coils that you see in the BOINC graphics are termed alpha helices, and the arrow like structures you see are called beta strands. Each individual arrow is a beta strand, and a series of the arrows bonded together is called a beta sheet. Nearly any amino acid can be a part of any of the secondary structure elements, because the secondary structure elements involve atoms common to all amino acids (called backbone atoms).

The basic idea is that atoms within the backbone of the polypeptide chain can form hydrogen bonds to other backbone atoms, and these can give rise to alpha helices and beta strands, which are called secondary structure elements. These elements are often found within the final native structure of proteins. The ribbons and sheets aren't any wider or larger than other regions of the polypeptide chain, but they are usually important in understanding the final structure of a protein, which is why we highlight them within Rosetta's graphics. They also look very pretty to my eye.

Hopefully this answers your question. Cheers!
17) Message boards : Rosetta@home Science : Can I predict the 3D structure of my own protein sequence (Message 40219)
Posted 2 May 2007 by James Thompson

I'm a new Rosetta@home subscriber, and I'm wondering about how this works.

I understand the way the CPU's change datas, but I do not understand if it is possible to predict the 3D structure of my own protein sequence.

And if it is possible, how can I do that?

Thank you for your attention

Predicting protein structure is a very computationally intensive process, so it could be difficult on your own. Check out our web service for structure prediction here if you're interested:
18) Message boards : Rosetta@home Science : Predictor of the day (Message 40025)
Posted 29 Apr 2007 by James Thompson
It's broken again. Same result has been posted four times now.

Fixed again! Thanks for the heads-up.
19) Message boards : Rosetta@home Science : Predictor of the day (Message 39326)
Posted 12 Apr 2007 by James Thompson
Predictor of the day is NOT working. Script is not inserting names.

Thanks for the heads-up. This should be fixed very soon.

It's back up and running. Our database server still had a couple of issues related to the power outage on April 10th, which have now been fixed.
20) Message boards : Rosetta@home Science : Predictor of the day (Message 39320)
Posted 12 Apr 2007 by James Thompson
Predictor of the day is NOT working. Script is not inserting names.

Thanks for the heads-up. This should be fixed very soon.

©2023 University of Washington