Comments/questions on Rosetta@home journal

David Baker
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 17 Sep 05
Posts: 705
Credit: 559,847
RAC: 0
Message 11592 - Posted: 3 Mar 2006, 7:00:43 UTC

Hope that wasn't too long a first post! Looking forward to discussing issues with people here. David
ID: 11592 · Rating: 0

Cureseekers~Kristof
Joined: 5 Nov 05
Posts: 80
Credit: 689,603
RAC: 0
Message 11597 - Posted: 3 Mar 2006, 10:08:22 UTC

I am wondering why, after 3 hours, there still isn't a comment on your post, David. Everyone was screaming for this news, and now that it's here, it looks like no one wants to comment...

Although I'm not someone who is at home in this kind of science, your post is quite clear to me. As a member of DPC, I'll surely pass this on to the rest of our team, so everyone knows about it. In our team, results and feedback from the project are very important.
At this point I want to thank you for the effort, and of course I hope that you'll keep adding your comments, results, news items, etc. to this thread.

One remark: maybe it's better to make that thread read-only, so we can find only your (or your colleagues') posts in there.

"so we need more cpu power! it is kind of amazing that solving such a long standing scientific problem depends so crucially on the efforts of volunteers like yourselves! I don't know how much more cpu power it will take, but if you can each recruit ten friends or relations ..."

That's the spirit :-D

Member of Dutch Power Cows
ID: 11597 · Rating: 0

KSMarksPsych
Joined: 15 Oct 05
Posts: 199
Credit: 22,337
RAC: 0
Message 11600 - Posted: 3 Mar 2006, 12:10:23 UTC

I just saw your post David and I'm floored.

This is awesome news. What you wrote was clear, concise, and to the point (unlike most of my posts :))

I want to congratulate you and your whole team, as well as all of our crunchers here.

This is going to make a huge difference to humanity. I'm just happy I get to be a part of it.

Wonderful!

Kathryn
Kathryn :o)
The BOINC FAQ Service
The Unofficial BOINC Wiki
The Trac System
More BOINC information than you can shake a stick of RAM at.
ID: 11600 · Rating: 0

rbpeake
Joined: 25 Sep 05
Posts: 168
Credit: 247,828
RAC: 0
Message 11602 - Posted: 3 Mar 2006, 12:50:38 UTC - in response to Message 11592.  

Hope that wasn't too long a first post! Looking forward to discussing issues with people here. David


I think this is a great public outreach, and may also help young people get a sense of the excitement and life of a research scientist. Hopefully it may encourage some people to pursue a career in science! :)

Regards,
Bob P.
ID: 11602 · Rating: 0

Moderator9
Volunteer moderator
Joined: 22 Jan 06
Posts: 1014
Credit: 0
RAC: 0
Message 11603 - Posted: 3 Mar 2006, 13:06:01 UTC - in response to Message 11597.  

I am wondering why, after 3 hours, there still isn't a comment on your post, David. Everyone was screaming for this news, and now that it's here, it looks like no one wants to comment...

Although I'm not someone who is at home in this kind of science, your post is quite clear to me. As a member of DPC, I'll surely pass this on to the rest of our team, so everyone knows about it. In our team, results and feedback from the project are very important.
At this point I want to thank you for the effort, and of course I hope that you'll keep adding your comments, results, news items, etc. to this thread.

One remark: maybe it's better to make that thread read-only, so we can find only your (or your colleagues') posts in there.

"so we need more cpu power! it is kind of amazing that solving such a long standing scientific problem depends so crucially on the efforts of volunteers like yourselves! I don't know how much more cpu power it will take, but if you can each recruit ten friends or relations ..."

That's the spirit :-D


While it is not possible in the current forum environment to create a read-only thread, that should not be an issue. The moderators will keep the other thread clear of any user postings by moving any that occur to this thread. As a Rosetta user myself, I would echo the praise for the greater information flow, and the excitement over what Dr. Baker has reported.

Moderator9
ROSETTA@home FAQ
Moderator Contact
ID: 11603 · Rating: 0

bruce boytler
Joined: 17 Sep 05
Posts: 68
Credit: 3,565,442
RAC: 0
Message 11604 - Posted: 3 Mar 2006, 13:21:06 UTC

Hi David,

Truly amazing news. The protein prediction problem was first set forth by Linus Pauling, I believe in 1935! Correct me if I am wrong.

If the problem is solved through DC, I believe it will go down as a major 21st-century achievement.

"it is clear for the still large number of proteins for which we are failing that the problem is not enough sampling, even with 100,000 independent folding runs we are not coming close enough to the native structure to land in its energy minimum. so we need more cpu power! it is kind of amazing that solving such a long standing scientific problem depends so crucially on the efforts of volunteers like yourselves! I don't know how much more cpu power it will take, but if you can each recruit ten friends or relations ..."

If the 20 terabyte DC system is all that we end up having, can this level of compute, given enough time, solve the proteins mentioned in this part of the journal post? Or is time spent on the compute not a factor, so that, say, another 60 terabytes is needed or the proteins can't be dealt with?

Thanks loads for the good news.....cheers
ID: 11604 · Rating: 0

Los Alcoholicos~Megaflix
Joined: 10 Nov 05
Posts: 24
Credit: 77,199
RAC: 0
Message 11612 - Posted: 3 Mar 2006, 16:50:17 UTC

Even for a science noob like myself it's clear information. Thanks! I'll be reading the journal regularly.
ID: 11612 · Rating: 0

BennyRop
Joined: 17 Dec 05
Posts: 555
Credit: 140,800
RAC: 0
Message 11621 - Posted: 3 Mar 2006, 20:41:36 UTC

While I read the message during its first hour online, it took a while to make sense of a few of the comments so I could make a useful comment.

Elsewhere, there were statements of 10,000 models being produced for each WU; and yet you mention needing more cpu power since 100,000 models is not enough. Am I correct in assuming that you're referring to the group of 10 or more WUs with the strange names such as "DementedOnlineSalesPitches_With_no_Fragging"? If so, would you mind describing what each of those strange names refers to, along with its best score, and comparing that to the normal client's best score on 10,000 models? Secondly (if we've done such), what's the improvement we can expect from running 100,000 models instead of 10,000? (Not that there's an option at present to run everything with 100,000 models per WU.)

In two places today, you've referred to David Kim as merely "David" - and in the second message, you had not previously mentioned his full name. It gets awfully confusing reading a David state, "David has been working on.." and brings to mind someone that ventures into a dark room, pulls out a ring, and starts chanting, "my presh... yusssss..."

Your first post really does a great job of conveying your infectious optimism and excitement. (Although you do need to stay away from engraved golden rings.. *grin* ) Thanks.

As for more users - get rid of the collection of annoying bugs so the client becomes even more reliable; reduce the bandwidth usage of the client even further, change the stats to a format that gives equal credit to all 10,000 models of a given WU, while improving the client so it can get a RMSD score of less than 3 Angstroms for a 160 molecule chain - and we shouldn't have problems attracting more cpus.

(Today's mention of Stderr.txt filling 68Gigs of an 80Gig HD could explain the loss of lots of unattended machines..)
ID: 11621 · Rating: 0

Hoelder1in
Joined: 30 Sep 05
Posts: 169
Credit: 3,915,947
RAC: 0
Message 11678 - Posted: 5 Mar 2006, 9:35:57 UTC
Last modified: 5 Mar 2006, 9:39:57 UTC

I will definitely be a regular reader of the new journal which due to the time zone difference conveniently gets delivered right in time for breakfast (more interesting than the morning papers). Since David asked for suggestions on things to cover, here are two questions that come to mind:

In his first journal report David mentioned "evolutionarily related homologs". I am guessing that homologues may be proteins where the amino acids are replaced by other, similar amino acids ? But I am completely in the dark how evolution might come into the picture. Also, assuming that such homologues are available for all proteins of interest (?), why would it be a problem to use this kind of information in Rosetta ?

My second question is sort of a placeholder for my general interest in the algorithmic details and tricks employed by Rosetta: The following information appears on the 'Result' page of the new fixed-runtime WUs:

# DONE :: 1 starting structures built 69 (nstruct) times
# This process generated 69 decoys from 69 attempts

The number in front of (nstruct) seems to be the number of structures that are determined within the specified run time, but what are decoys and attempts ? I have seen rare cases where the number of attempts was _much_ larger than the number of structures.

Anyway, I probably could think of many more questions should David run out of things to discuss, but I have the impression that he will know best what will be suitable for the journal. I am very much looking forward to the future installments of the new Rosetta (breakfast-) journal/seminar/lecture...
ID: 11678 · Rating: 0

Robert Everly
Joined: 8 Oct 05
Posts: 27
Credit: 665,094
RAC: 0
Message 11685 - Posted: 5 Mar 2006, 15:06:46 UTC

Thanks for the news. While 99.9% of the science info is over my head, it is nice to know that progress is being made.
ID: 11685 · Rating: 0

rbpeake
Joined: 25 Sep 05
Posts: 168
Credit: 247,828
RAC: 0
Message 11686 - Posted: 5 Mar 2006, 15:59:14 UTC
Last modified: 5 Mar 2006, 15:59:49 UTC

This may be too personal, or not of interest to others or to David (and I completely respect that), but as a thought I toss it out:

David, how did you become interested in pursuing a career in science, and how did you end up doing what you are doing? Not too personal, but I find the journey to what one is doing today interesting, and perhaps it might be inspiring to younger people seeking a future path.
Regards,
Bob P.
ID: 11686 · Rating: 0

hugothehermit
Joined: 26 Sep 05
Posts: 238
Credit: 314,893
RAC: 0
Message 11703 - Posted: 6 Mar 2006, 6:52:29 UTC

Must say, Dr D.B, great journal. I log on just to read it, as I'm doing now, no results to report or anything :). On reading your journal I've decided this truly seems to be a solvable problem. When someone said something about 500 dimensions and I thought about all those atoms and electrons, I must admit I thought "this isn't happening in my lifetime", but now I'm not that sure; it seems that brains (you and your team) working with brawn (us) may actually make this giant leap. Truly amazing.

Now a question for you:
In this work unit, I have the minimum RMSD by the look of it, but too high an energy. I know that you had both the minimum energy and the minimum RMSD in the stats. I was wondering why the RMSD matters at all: should a protein be both the smallest it can be and have the least energy? Or should it just have the least energy? Is RMSD just a shortcut to working out the energy?

BTW congrats to, I assume, Jack. I'm not sure if anybody else has noticed but the ribbons into pipes in the graphics was a nice touch that didn't go unnoticed ;).

ID: 11703 · Rating: 0

uioped1
Joined: 9 Feb 06
Posts: 15
Credit: 1,058,481
RAC: 0
Message 11738 - Posted: 6 Mar 2006, 23:36:36 UTC

The homolog insight is truly a leap forward. That is the sort of insight that comes along rarely and can be quite beneficial to search problems of this magnitude! Congratulations!
I wonder if it is required that the homologs be evolutionarily related, or if we could generate homologs that would serve the same purpose for the search. This might have the advantage of strengthening the value of the results (just like using the RMSD as a heuristic would invalidate them)

In answer to hugo the hermit:
[quote]In this work unit, I have the minimum RMSD by the look of it, but too high an energy. I know that you had both the minimum energy and the minimum RMSD in the stats. I was wondering why the RMSD matters at all: should a protein be both the smallest it can be and have the least energy? Or should it just have the least energy? Is RMSD just a shortcut to working out the energy?[/quote]

RMSD is a measure of the difference between two foldings of a specific protein. I'd venture an educated guess that it stands for "[square] Root Mean Squared Deviation". For the runs we've done thus far, we've known the natural folding, so we can calculate the RMSD and use it to evaluate how well the algorithm works. For the application to be useful, however, it can't look at the RMSD to decide how to fold the protein, because we can't calculate it for proteins whose natural structures are unknown. I'm not a chemist, so I can't state with 100% certainty, but I think that the natural structures will have the lowest energies possible; thus in some sense we are trying to get to where we can use the energy of a structure as an approximation of the RMSD.
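
As a rough illustration of the RMSD idea described above, here is a minimal Python sketch. It assumes the two structures list the same atoms in the same order and have already been superimposed; real comparisons first align the structures, which this sketch skips. The function name and the example coordinates are made up for illustration and are not taken from Rosetta.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z) points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("structures must be non-empty and have the same number of atoms")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared distance between matching atoms in the two structures.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Hypothetical three-atom "native" structure and a slightly displaced model.
native = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
model = [(0.0, 0.0, 0.0), (1.5, 1.0, 0.0), (3.0, 0.0, 2.0)]
print(rmsd(native, model))
```

A structure compared with itself gives an RMSD of zero, and the value grows as the model drifts away from the native fold.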
ID: 11738 · Rating: 0

hugothehermit
Joined: 26 Sep 05
Posts: 238
Credit: 314,893
RAC: 0
Message 11747 - Posted: 7 Mar 2006, 7:17:13 UTC
Last modified: 7 Mar 2006, 7:30:41 UTC

From uioped1
RMSD is a measure of the difference between two foldings of a specific protein. I'd venture an educated guess that it stands for "[square] Root Mean Squared Deviation". For the runs we've done thus far, we've known the natural folding, so we can calculate the RMSD and use it to evaluate how well the algorithm works. For the application to be useful, however, it can't look at the RMSD to decide how to fold the protein, because we can't calculate it for proteins whose natural structures are unknown. I'm not a chemist, so I can't state with 100% certainty, but I think that the natural structures will have the lowest energies possible; thus in some sense we are trying to get to where we can use the energy of a structure as an approximation of the RMSD.


Thanks for the information. I seem to remember reading something like that some time ago (I'd forgotten), so I'm sure you are right.

But that raises the question (now I'm not sure how low the RMSD is on the WU picture I posted, as I wasn't looking at it at the time, but it was pretty low):

Why, when the ab initio result is that close to the original, is the energy so high (does the full atom relax need tweaking)? Or is being just a little out (in ab initio) energetically costly?


ID: 11747 · Rating: 0

David Baker
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 17 Sep 05
Posts: 705
Credit: 559,847
RAC: 0
Message 11748 - Posted: 7 Mar 2006, 7:54:59 UTC - in response to Message 11747.  

From uioped1
RMSD is a measure of the difference between two foldings of a specific protein. I'd venture an educated guess that it stands for "[square] Root Mean Squared Deviation". For the runs we've done thus far, we've known the natural folding, so we can calculate the RMSD and use it to evaluate how well the algorithm works. For the application to be useful, however, it can't look at the RMSD to decide how to fold the protein, because we can't calculate it for proteins whose natural structures are unknown. I'm not a chemist, so I can't state with 100% certainty, but I think that the natural structures will have the lowest energies possible; thus in some sense we are trying to get to where we can use the energy of a structure as an approximation of the RMSD.


Thanks for the information. I seem to remember reading something like that some time ago (I'd forgotten), so I'm sure you are right.

But that raises the question (now I'm not sure how low the RMSD is on the WU picture I posted, as I wasn't looking at it at the time, but it was pretty low):

Why, when the ab initio result is that close to the original, is the energy so high (does the full atom relax need tweaking)? Or is being just a little out (in ab initio) energetically costly?



The energy can be high even a little bit away from the correct solution because small displacements can put atoms too close to each other, which is very costly energetically.
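
To make that concrete, here is a small Python sketch. It uses a generic Lennard-Jones 12-6 pair potential purely as a stand-in; that choice, the function name and the units are assumptions for illustration and not Rosetta's actual energy function. It shows how quickly the energy climbs once two atoms are pushed only slightly closer than their preferred separation.

```python
def lennard_jones(r, epsilon=1.0, sigma=1.0):
    """12-6 Lennard-Jones pair energy at separation r (arbitrary units)."""
    if r <= 0:
        raise ValueError("separation must be positive")
    s6 = (sigma / r) ** 6
    return 4.0 * epsilon * (s6 * s6 - s6)

# With sigma = 1 the lowest energy (-epsilon) occurs at r = 2**(1/6).
r_min = 2 ** (1 / 6)

# Shrink the separation by 10%, 20% and 30% and watch the energy rise.
for r in (r_min, 0.9 * r_min, 0.8 * r_min, 0.7 * r_min):
    print(f"r = {r:.3f}  energy = {lennard_jones(r):8.2f}")
```

The repulsive term grows with the twelfth power of the inverse distance, so a model that is nearly right overall can still score badly if a handful of atoms overlap.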
ID: 11748 · Rating: 0

BennyRop
Joined: 17 Dec 05
Posts: 555
Credit: 140,800
RAC: 0
Message 11750 - Posted: 7 Mar 2006, 10:53:57 UTC

The energy can be high even a little bit away from the correct solution because small displacements can put atoms too close to each other, which is very costly energetically.


After the client hits a fairly low RMSD - have you tried adjusting each atom in the chain one at a time - for even lower energy placements? Calculate its energy some tiny distance up, down, in, out, right, left. Move it to the location with the lowest energy, and then repeat another 9-19 times? And then move on to the next atom in the chain. Kind of like a zipper.. lock one set into place and move onto the next set of teeth. Or is it better to start in the middle of the chain and work out to both ends of the chain?
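
The atom-by-atom idea above could be sketched roughly as follows in Python. Everything here is hypothetical: the toy pair energy, the step size and the function names are assumptions made for illustration, and this is not how the Rosetta client is implemented.

```python
import math

def pair_energy(r):
    """Toy 12-6 pair energy with its minimum at r = 1 (arbitrary units)."""
    s6 = (1.0 / r) ** 6
    return s6 * s6 - 2.0 * s6

def total_energy(atoms):
    """Sum the toy pair energy over every pair of atoms."""
    energy = 0.0
    for i in range(len(atoms)):
        for j in range(i + 1, len(atoms)):
            energy += pair_energy(math.dist(atoms[i], atoms[j]))
    return energy

def zipper_refine(atoms, step=0.01, repeats=10):
    """Nudge one atom at a time along +/- x, y, z, keeping only moves that lower the energy."""
    atoms = [list(a) for a in atoms]
    moves = [(step, 0, 0), (-step, 0, 0), (0, step, 0),
             (0, -step, 0), (0, 0, step), (0, 0, -step)]
    for atom in atoms:                      # work along the chain, one atom after another
        for _ in range(repeats):
            best_energy = total_energy(atoms)
            best_move = None
            for move in moves:              # try each of the six small displacements
                for axis in range(3):
                    atom[axis] += move[axis]
                trial = total_energy(atoms)
                if trial < best_energy:
                    best_energy, best_move = trial, move
                for axis in range(3):       # undo the trial move
                    atom[axis] -= move[axis]
            if best_move is None:           # no neighbouring position is better
                break
            for axis in range(3):
                atom[axis] += best_move[axis]
    return [tuple(a) for a in atoms]

start = [(0.0, 0.0, 0.0), (0.9, 0.0, 0.0), (1.8, 0.1, 0.0)]
refined = zipper_refine(start)
print(total_energy(start), total_energy(refined))
```

Because only downhill moves are accepted, the energy never goes up, but moving one atom at a time can get stuck when several atoms would need to shift together.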




ID: 11750 · Rating: 0

David Baker
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 17 Sep 05
Posts: 705
Credit: 559,847
RAC: 0
Message 11769 - Posted: 8 Mar 2006, 4:25:58 UTC - in response to Message 11750.  

The energy can be high even a little bit away from the correct solution because small displacements can put atoms too close to each other, which is very costly energetically.


After the client hits a fairly low RMSD - have you tried adjusting each atom in the chain one at a time - for even lower energy placements? Calculate its energy some tiny distance up, down, in, out, right, left. Move it to the location with the lowest energy, and then repeat another 9-19 times? And then move on to the next atom in the chain. Kind of like a zipper.. lock one set into place and move onto the next set of teeth. Or is it better to start in the middle of the chain and work out to both ends of the chain?





We do something like this: we move everything in the direction of the steepest decrease in the energy (by computing the derivative of the energy with respect to all of the degrees of freedom and then moving in this direction).
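
A minimal Python sketch of that kind of steepest-descent step is below. It is only an illustration under stated assumptions: the degrees of freedom are taken to be plain Cartesian coordinates, the derivative is estimated numerically, and the toy energy, step size and function names are invented here rather than taken from Rosetta.

```python
import math

def energy(coords):
    """Toy energy of a flat list [x0, y0, z0, x1, ...]: 12-6 pair terms, minimum at distance 1."""
    atoms = [coords[i:i + 3] for i in range(0, len(coords), 3)]
    total = 0.0
    for i in range(len(atoms)):
        for j in range(i + 1, len(atoms)):
            s6 = (1.0 / math.dist(atoms[i], atoms[j])) ** 6
            total += s6 * s6 - 2.0 * s6
    return total

def gradient(f, x, h=1e-6):
    """Central-difference estimate of the derivative of f for every degree of freedom in list x."""
    grad = []
    for k in range(len(x)):
        up = x[:k] + [x[k] + h] + x[k + 1:]
        down = x[:k] + [x[k] - h] + x[k + 1:]
        grad.append((f(up) - f(down)) / (2 * h))
    return grad

def steepest_descent(f, x, rate=1e-3, steps=200):
    """Repeatedly move all coordinates together a small step against the gradient."""
    x = list(x)
    for _ in range(steps):
        g = gradient(f, x)
        x = [xi - rate * gi for xi, gi in zip(x, g)]
    return x

# Three atoms placed slightly too close together.
start = [0.0, 0.0, 0.0, 0.95, 0.0, 0.0, 1.9, 0.1, 0.0]
relaxed = steepest_descent(energy, start)
print(energy(start), energy(relaxed))
```

Unlike nudging one atom at a time, every coordinate moves at once here, so atoms that need to shift together can do so in a single step.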
ID: 11769 · Rating: 0

thom217
Joined: 29 Oct 05
Posts: 12
Credit: 182
RAC: 0
Message 11778 - Posted: 8 Mar 2006, 12:32:28 UTC
Last modified: 8 Mar 2006, 12:36:10 UTC

Hi Dr. Baker,

I think that if you keep informing people that at least one or two of the proteins you are folding have a direct link to some disease, it will help to bring more people to Rosetta.

Another idea to try is to create a GPU client for Rosetta.

Thanks.
ID: 11778 · Rating: 0

Dimitris Hatzopoulos
Joined: 5 Jan 06
Posts: 336
Credit: 80,939
RAC: 0
Message 11799 - Posted: 8 Mar 2006, 23:39:02 UTC - in response to Message 11778.  
Last modified: 8 Mar 2006, 23:49:13 UTC

I think that if you keep informing people that at least one or two of the proteins you are folding have a direct link to some disease, it will help to bring more people to Rosetta.


Perhaps someone else can explain it better (or correct me if I'm wrong), but until then, let me try:

Rosetta's main goal is to eventually identify UNKNOWN proteins, i.e. those whose 3D shape, and thus biological function, is currently unknown. Only after the 3D shape has been determined can biologists and biomed scientists study those shapes to decide that e.g. protein X is related to disease Y.

E.g. the HPF2 project run on IBM's World Community Grid is applying Rosetta software (the one that we help develop here) to "cancer biomarkers" (proteins found at dramatically increased levels in cancer tissues), human secreted proteins and malaria (see HPF update 16-jan-06).

Here we're helping improve the algorithms used by projects such as HPF. Here's some more (from HPF):


"The Human Proteome Folding project is basic medical research. We are given some fundamental components of cells (proteins of unknown function) and we try to deduce their shape, then from this deduce which other proteins they interact with, and how. It is like pouring the components of an "Erector set" (editor: construction toy) onto the living room floor and trying to figure out what goes with what. The goal is to figure out the functional networks that drive basic cell processes. Once you have identified the function of a protein you can:

1. Select it as a target for a drug to interfere with its function
2. Figure out how it works and design a drug to duplicate the effect
3. Develop a diagnostic test to detect the concentration of that protein in order to measure the level of activity

Without the protein information, these three things are very important objectives that can only be accomplished by mass screening of a vast number of chemical compounds, hoping for a lucky breakthrough. Even with this information, a great deal of work, skill and luck is required to develop a drug.

The HPF project can provide very useful information for drug development, but it is aimed at basic understanding that can then be used to develop drugs. We are providing the shape information. Scientists studying the databases with this structural information will predict the function of the proteins (annotate the proteins)."


Have a look at the URL in my sig, I provide many links with info about the various life-science projects.
Best UFO Resources
Wikipedia R@h
How-To: Join Distributed Computing projects that benefit humanity
ID: 11799 · Rating: 0

hugothehermit
Joined: 26 Sep 05
Posts: 238
Credit: 314,893
RAC: 0
Message 11813 - Posted: 9 Mar 2006, 7:54:37 UTC
Last modified: 9 Mar 2006, 8:16:01 UTC

(2). I'm a bit disappointed that the total cpu power has remained constant for the past weeks rather than increasing as it had up until recently. More users and hosts are joining every day, but this is not translating into increased computing capability. We aim to extend our calculations to larger proteins based on the success we have been having with proteins under 100 amino acids, but this will require a significant increase in computing power. Please let me know if there is anything I can do to help with recruiting.


Dr D.B, I've sent a request to the Australian Broadcasting Corporation (ABC, the Australian version of the BBC or PBS (?)) for a documentary on Rosetta@Home, to try to increase the number of participants in the project. Like all publicly funded TV/radio stations it's short of money, so I can't promise anything, but even if they just do a radio programme it could make a difference. The way I see it, 1000 users = about 1 million dollars' worth of hardware, so every bit helps.

I just thought I should let you know, so that if anybody bothers ringing or e-mailing you from ABC Australia you can have a chat with them :).

PS: I'm going to send you the request, so don't worry about getting an e-mail from hugothehermit.XXXXX, it's just me ;)

Edit: I should at least be able to spell Australia you'd think :)
ID: 11813 · Rating: 0

©2024 University of Washington
https://www.bakerlab.org