Comments/questions on Rosetta@home journal

Message boards : Rosetta@home Science : Comments/questions on Rosetta@home journal

rbpeake
Joined: 25 Sep 05
Posts: 168
Credit: 247,828
RAC: 0
Message 13164 - Posted: 7 Apr 2006, 11:18:17 UTC - in response to Message 13151.  

...we are already pushing the limits of public distributed computing with the moderate size proteins running now.

With new public machines having at least a gig of memory, perhaps you could establish a "Rosetta II" or something like that as well, whose minimum memory requirements are a gig. I don't know if this doubling of the memory requirement helps you, but more and more memory is being packed into the newer machines...

Regards,
Bob P.
ID: 13164 · Rating: 0
Andrew
Joined: 19 Sep 05
Posts: 162
Credit: 105,512
RAC: 0
Message 13168 - Posted: 7 Apr 2006, 12:48:37 UTC - in response to Message 13164.  
Last modified: 7 Apr 2006, 12:49:31 UTC

With new public machines having at least a gig of memory, perhaps you could establish a "Rosetta II" or something like that as well, whose minimum memory requirements are a gig. I don't know if this doubling of the memory requirement helps you, but more and more memory is being packed into the newer machines...


Seasonal Attribution Project has a requirement of 1G memory, and according to BOINCstats there are about 580 users and 700 hosts.

I think a Rosetta II could be one solution, but another could be an opt-in setting in our preferences for crunching large WUs (as discussed elsewhere in the forums).
ID: 13168 · Rating: 0
dag
Joined: 16 Dec 05
Posts: 106
Credit: 1,000,020
RAC: 0
Message 13184 - Posted: 7 Apr 2006, 19:05:38 UTC - in response to Message 13151.  

Quoting from the journal:
Question 2: from the science point of view, 2 hour and 8 hour work units are equivalent. however, 8 hour work units cut down on network traffic, and so are more optimal PROVIDED THAT the computer has a relatively low error rate. we are now cautiously increasing the default work unit length from 2 to 4 hours and we will see how this goes. but users who rarely encounter errors should definitely choose the 8 hour work unit option all other things being equal.


If I knew more about the checkpointing alg., I wouldn't have to ask this question. I have a portable that I move a couple of times per work day and I always do a nice suspend and wait until it clears from memory before I turn off the box. Even so, every turn-off probably loses the work that was done since the last checkpoint. Would an 8 hour WU cause more wasted cpu time than a 2 hour WU in this usage model?

dag
dag
--Finding aliens is cool, but understanding the structure of proteins is useful.
ID: 13184 · Rating: 0
anders n
Joined: 19 Sep 05
Posts: 403
Credit: 537,991
RAC: 0
Message 13185 - Posted: 7 Apr 2006, 19:42:46 UTC - in response to Message 13184.  

Would an 8 hour WU cause more wasted cpu time than a 2 hour WU in this usage model?

dag



No, the models have the same length with 8 h as with 2 h.

The computer just does a lot more of them.
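
To put rough numbers on that, here is a small sketch. It assumes (my assumption, not something stated in this thread) that progress is saved each time a model completes, so a shutdown only loses the partly finished model. The function name and the 20-minute model length are made up for illustration.

```python
def lost_minutes(wu_hours, model_minutes, stop_after_minutes):
    """Minutes of work lost when the machine is shut down mid-run.

    Assumes a checkpoint is written whenever a model finishes
    (hypothetical; the real checkpoint interval may differ).
    """
    wu_minutes = wu_hours * 60
    if stop_after_minutes >= wu_minutes:
        return 0  # the work unit finished before the shutdown
    # Only the model in progress is lost, whatever the WU length.
    return stop_after_minutes % model_minutes


# Same shutdown time, same loss, for a 2 hour and an 8 hour work unit.
print(lost_minutes(2, 20, 90))
print(lost_minutes(8, 20, 90))
```

Under that assumption the loss per shutdown depends on the model length, not on whether the work unit is set to 2 or 8 hours.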

Anders n
ID: 13185 · Rating: 0
Feet1st
Joined: 30 Dec 05
Posts: 1755
Credit: 4,690,520
RAC: 0
Message 13189 - Posted: 7 Apr 2006, 21:37:00 UTC - in response to Message 13184.  

I always do a nice suspend and wait until it clears from memory


dag was talking about suspending the projects in the BOINC manager, I presume. You might want to try using the hibernate shutdown on your portable. This basically pushes everything in memory out to disk, leaves the programs active, and when you power up again, everything is where you left off. Not sure how that would work for BOINC. But it sure works great for word processors, browsers, etc.

Add this signature to your EMail:
Running Microsoft's "System Idle Process" will never help cure cancer, AIDS nor Alzheimer's. But running Rosetta@home just might!
https://boinc.bakerlab.org/rosetta/
ID: 13189 · Rating: 0
dag
Joined: 16 Dec 05
Posts: 106
Credit: 1,000,020
RAC: 0
Message 13245 - Posted: 8 Apr 2006, 15:58:20 UTC - in response to Message 13189.  

I always do a nice suspend and wait until it clears from memory


...
You might want to try using the hibernate shutdown on your portable.
...


Putting it in standby mode may work for a Linux system, but we're talking about Windoz here. I depend on the daily reboot to clear the accumulated cruft. Also, it doesn't seem to do well with going to sleep on a docking station with wired LAN and waking up all alone with wireless LAN, different keyboard, mouse, etc., and vice versa. (XP Pro w/SP2+)

dag
dag
--Finding aliens is cool, but understanding the structure of proteins is useful.
ID: 13245 · Rating: 0
BennyRop
Joined: 17 Dec 05
Posts: 555
Credit: 140,800
RAC: 0
Message 13365 - Posted: 9 Apr 2006, 21:30:38 UTC

Would it be possible to run sample batches of new WUs through Ralph, and to post a notice here that a test run on Ralph is happening - i.e. its own moderator-only thread (so those that shut off Ralph when they see no work can turn it back on)?

But these failures bring up a science related question or two. When looking at last month's WUs that ended up failing in less than a minute on 2 GHz machines, almost all of the ones reported failed 3 times and were discarded. At least one of them got picked up after two Windows client failures and finished fine on a Linux client. Was this mostly because the Win clients so far outnumber the Linux clients, or more because the Linux clients were actually producing results for 2+ hours, while the Win clients were failing almost instantly and grabbing more WUs to fail?

What's different with the clients that allows Mac and Linux clients to handle these WUs that almost instantly fail on Windows clients?

What's different with the WUs themselves? I assume that biology doesn't allow amino acid/protein chains #87 and #88 in a 109 AA protein chain to be encrypted with "Bill Gates Suxx0rs" :) It'd be funny if it could be proven, but not likely. *grin* Are these larger or smaller than the WUs we've been working on - or have some unique visual feature? i.e. it looks like two plates held together in the middle with a few atoms that look like a rod - a yo-yo. (Yes, I know that's probably not even possible.)

You've mentioned that for some WUs that we've turned in 10,000 models for, that we haven't come close to the native structure. (You mentioned approaches that will allow us to go through the process twice.. and hopefully get close to those structures without having to run through 100,000 models/decoys or more..) I see at C562_EColi (the last result on the home page) that it lists the size as 106 AA; which is how we got used to having protein's size described at DF. At what size does the Rosetta client fail to get close enough to the native structure with 10,000 models/decoys? Or is it just that the longer the protein chain is, the more likely it has complex structures?
ID: 13365 · Rating: 0
Moderator9
Volunteer moderator
Joined: 22 Jan 06
Posts: 1014
Credit: 0
RAC: 0
Message 13368 - Posted: 9 Apr 2006, 22:36:38 UTC - in response to Message 13365.  
Last modified: 9 Apr 2006, 22:44:31 UTC

Would it be possible to run sample batches of new WUs through Ralph, and to post a notice here that a test run on Ralph is happening - i.e. its own moderator-only thread (so those that shut off Ralph when they see no work can turn it back on)?


The Work units are tested on RALPH. Most people do not actually suspend RALPH when there is no work, but run it at a reduced priority. That way when work is available, it loads automatically.


But these failures bring up a science related question or two. When looking at last month's WUs that ended up failing in less than a minute on 2 GHz machines, almost all of the ones reported failed 3 times and were discarded. At least one of them got picked up after two Windows client failures and finished fine on a Linux client. Was this mostly because the Win clients so far outnumber the Linux clients, or more because the Linux clients were actually producing results for 2+ hours, while the Win clients were failing almost instantly and grabbing more WUs to fail?

Both really. When the WUs fail the system will of course load new ones. There are more Windows machines than any other platform, so a lot of work units were reissued after failing on a Windows machine. Most of the failures are memory related. So environments that use memory more efficiently, or systems that are dedicated to running one project, can make more of the memory on board available for use by Rosetta and they have fewer problems.


What's different with the clients that allows Mac and Linux clients to handle these WUs that almost instantly fail on Windows clients?

As I said, many of the issues are memory related. Most Macs come with at least 512 MB of memory when they are sold, and the Mac OS manages memory better than Windows. Also the PowerPC chip-set is more flexible than the Intel chip-set internally and has a more robust instruction set. The way caching is done on most Macs is also an advantage.

In any case, Rosetta is a very intensive application. It is much more like heavy graphics calculation and image rendering than many other BOINC projects. The Mac is designed for this type of application and Linux provides many of these same features for the Intel systems. The Windows/Intel system is more of a generalist system. So when pushed, the Mac OS (and Linux) tend to be more robust, and have more self correcting features. Underneath the Mac OS it is Unix. You can actually restart the Mac OS Finder with applications running, and those applications will keep running undisturbed. You cannot do that with Windows. Most of the Macs on the Rosetta and Ralph projects are running almost trouble free, as are a number of the Linux systems as you point out.


What's different with the WUs themselves? I assume that biology doesn't allow amino acid/protein chains #87 and #88 in a 109 AA protein chain to be encrypted with "Bill Gates Suxx0rs" :) It'd be funny if it could be proven, but not likely. *grin* Are these larger or smaller than the WUs we've been working on - or have some unique visual feature? i.e. it looks like two plates held together in the middle with a few atoms that look like a rod - a yo-yo. (Yes, I know that's probably not even possible.)

The newest batch of work units is working on much larger protein structures. Some of these are the largest I have seen to date. This means they will take a LOT longer to run than the smaller WUs, and they require more memory. Many people assume that since the work unit is running longer than they think it should, it must be having a problem, and they cancel it. Usually, unless there is an obvious problem, if these are left alone to run, they will complete, but they will take more time than the user adjustable time setting. But the biggest difference in the work units is the size of the protein involved. There have also been some changes in the way the application approaches building the models.


You've mentioned that for some WUs that we've turned in 10,000 models for, that we haven't come close to the native structure. (You mentioned approaches that will allow us to go through the process twice.. and hopefully get close to those structures without having to run through 100,000 models/decoys or more..) I see at C562_EColi (the last result on the home page) that it lists the size as 106 AA; which is how we got used to having protein's size described at DF. At what size does the Rosetta client fail to get close enough to the native structure with 10,000 models/decoys? Or is it just that the longer the protein chain is, the more likely it has complex structures?

This last one I will leave for the science team to answer in detail, but clearly as the protein size grows the model complexity grows geometrically. So it takes more models to cover all the possible energy domains.
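
As a rough illustration of what "grows geometrically" means here, the sketch below counts possible chain conformations if every amino acid could take a fixed number of states. The figure of 3 states per residue and the function name are assumptions picked only to show the scaling; they are not Rosetta's actual numbers.

```python
def conformations(n_residues, states_per_residue=3):
    """Count conformations if each residue independently takes one of
    `states_per_residue` states (a toy assumption for illustration)."""
    return states_per_residue ** n_residues


# Adding just six residues multiplies the search space by 3**6.
print(conformations(106) // conformations(100))
```

So even a small increase in chain length multiplies the number of possibilities many times over, which is why more models are needed for larger proteins.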

Moderator9
ROSETTA@home FAQ
Moderator Contact
ID: 13368 · Rating: 0
R/B
Joined: 8 Dec 05
Posts: 195
Credit: 28,095
RAC: 0
Message 13449 - Posted: 11 Apr 2006, 12:51:57 UTC

I read a post over in setiathome about Rosetta being mentioned in the newsletter of 'Livescience'. Good job, guys.
Founder of BOINC GROUP - Objectivists - Philosophically minded rational data crunchers.


ID: 13449 · Rating: 0
dgnuff
Joined: 1 Nov 05
Posts: 350
Credit: 24,773,605
RAC: 0
Message 13580 - Posted: 12 Apr 2006, 21:20:04 UTC

Mostly to David Baker, but if you're looking in, the last couple of posts you've made in your journal cover 95% of the reason I'm here. There's something I really like about being in right at startup, and knowing that the crunching I'm doing is helping to actually fine tune the tools that someday might find a cure for HIV, or Malaria.

Keeping that in mind, in your last note you talk about sidechain sampling. Is it possible to explain in layman's terms what this is? So far, you've been able to do a great job of explaining the search methodologies, talking about explorers on a planet, or explaining how the HIV search will go: trying to design a custom protein that will allow our immune system to generate antibodies effective against the "static" portions of the virus. Etc.

Any chance of an explanation of that sort for sidechains? Enquiring minds want to know. :)
ID: 13580 · Rating: 0
David Baker
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 17 Sep 05
Posts: 705
Credit: 559,847
RAC: 0
Message 13618 - Posted: 13 Apr 2006, 5:08:06 UTC - in response to Message 13580.  

Mostly to David Baker, but if you're looking in, the last couple of posts you've made in your journal cover 95% of the reason I'm here. There's something I really like about being in right at startup, and knowing that the crunching I'm doing is helping to actually fine tune the tools that someday might find a cure for HIV, or Malaria.

Keeping that in mind, in your last note you talk about sidechain sampling. Is it possible to explain in layman's terms what this is? So far, you've been able to do a great job of explaining the search methodologies, talking about explorers on a planet, or explaining how the HIV search will go: trying to design a custom protein that will allow our immune system to generate antibodies effective against the "static" portions of the virus. Etc.

Any chance of an explanation of that sort for sidechains? Enquiring minds want to know. :)



Sure (and thanks for the words of encouragement!). sorry about the jargon. proteins are chains of amino acids. there are 20 types of amino acids. they have a common backbone, but differ in the "sidechains". when they are spliced together, the resulting protein has a linear "backbone" with "sidechains" coming off, one for each amino acid. these sidechains can adopt a number of different possible conformations (shapes). the improvement is to try out many different possibilities for each of these shapes at each step in our sampling process. as I said in the journal, this helps the search, but before we can deploy this on rosetta@home we need to track down and fix the windows specific problem.
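
A toy sketch of the idea, for anyone who prefers code: at each position in the chain, try every candidate sidechain shape and keep the one that scores best. Everything here is a simplified assumption for illustration (the function names, the angle values, and the stand-in energy function); the real Rosetta scoring also accounts for how neighboring sidechains interact, which this sketch ignores.

```python
def pick_sidechains(candidates, energy):
    """For each chain position, keep the candidate shape with the lowest energy.

    candidates: list (one entry per position) of lists of candidate shapes.
    energy: function (position, shape) -> number; lower is better.
    Positions are treated independently, which is a simplification.
    """
    return [
        min(shapes, key=lambda shape: energy(pos, shape))
        for pos, shapes in enumerate(candidates)
    ]


# Hypothetical example: shapes are single angles in degrees, and the
# stand-in energy is the distance from a made-up preferred angle.
preferred = [50, 170]
candidates = [[-60, 60, 180], [-60, 60, 180]]
print(pick_sidechains(candidates, lambda pos, angle: abs(angle - preferred[pos])))
```

Trying more candidate shapes per position costs more computing per step, but it gives the search a better chance of finding a good fit.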
ID: 13618 · Rating: 0
Grutte Pier [Wa Oars]~MAB The Frisian
Joined: 6 Nov 05
Posts: 87
Credit: 497,588
RAC: 0
Message 13635 - Posted: 13 Apr 2006, 12:37:16 UTC - in response to Message 13619.  
Last modified: 13 Apr 2006, 12:39:14 UTC

Good news today:

second, David Kim has awarded credits to those who lost valuable time during the problems last weekend.


Suppose it will be handled just as fast as those credits for time-exceeding WUs, and in that case: just don't bother, as I will not bother to report them anymore.
Already written those off.
Counting the days "17".
ID: 13635 · Rating: -2
Moderator9
Volunteer moderator
Joined: 22 Jan 06
Posts: 1014
Credit: 0
RAC: 0
Message 14437 - Posted: 23 Apr 2006, 3:00:30 UTC

Nice peek over the "cubicles" in the communications between programmers. Brings back a lot of nightmares ... Uh I mean memories. But for some of the younger readers that might be interested in a future in this kind of work, it provides a little insight into the coordination that goes on every day on a programming team.
Moderator9
ROSETTA@home FAQ
Moderator Contact
ID: 14437 · Rating: 0
Cureseekers~Kristof
Joined: 5 Nov 05
Posts: 80
Credit: 689,603
RAC: 0
Message 14470 - Posted: 23 Apr 2006, 11:54:46 UTC

I'm curious to see the source code.
When I read in the e-mails the names of the functions [get_the_hell_out()]
I guess the complete source code must be an equivalent :p
Member of Dutch Power Cows
ID: 14470 · Rating: 0
Feet1st
Joined: 30 Dec 05
Posts: 1755
Credit: 4,690,520
RAC: 0
Message 14481 - Posted: 23 Apr 2006, 14:51:39 UTC

Could the watchdog message be revised?
Rather than:
Rosetta score stayed the same too long. Watchdog is killing the run!

How about:
Rosetta score stayed the same too long. Watchdog is ending the run!

or, keeping inline with the programmer humor going here, perhaps:
Watchdog is barking. Postman ending delivery.
Add this signature to your EMail:
Running Microsoft's "System Idle Process" will never help cure cancer, AIDS nor Alzheimer's. But running Rosetta@home just might!
https://boinc.bakerlab.org/rosetta/
ID: 14481 · Rating: 0
Rhiju
Volunteer moderator
Joined: 8 Jan 06
Posts: 223
Credit: 3,546
RAC: 0
Message 14638 - Posted: 26 Apr 2006, 7:51:50 UTC - in response to Message 14481.  

How embarrassing... both the watchdog "killing" and get_the_hell_out() are my silly recent contributions. It's not representative of the rest of the code -- I didn't expect these error messages and functions to get broadcast to such a wide audience! I assure you that the rest of the code is totally dry and scientific. We'll change "killing" to "ending" in the release after the next (I read the message too late).
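
For readers wondering what a watchdog like this does, here is a minimal sketch of the general idea: end the run once the score has stayed the same for too many checks in a row. This is not the actual Rosetta code. The function name and the threshold are hypothetical, and it counts checks rather than wall-clock time to keep the example short.

```python
def watchdog(scores, max_unchanged):
    """Return the index of the check at which the run would be ended,
    or None if the score never stalls for `max_unchanged` checks in a row."""
    unchanged = 0
    last = None
    for step, score in enumerate(scores):
        if last is not None and score == last:
            unchanged += 1  # score has not moved since the previous check
        else:
            unchanged = 0   # score changed, so reset the counter
            last = score
        if unchanged >= max_unchanged:
            return step
    return None


# The score stalls at 4.0, so the run is ended on the check with index 4.
print(watchdog([5.0, 4.0, 4.0, 4.0, 4.0], max_unchanged=3))
```

The point of such a check is that a stuck run gets ended and reported instead of tying up the machine indefinitely.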


Could the watchdog message be revised?
Rather than:
Rosetta score stayed the same too long. Watchdog is killing the run!

How about:
Rosetta score stayed the same too long. Watchdog is ending the run!

or, keeping inline with the programmer humor going here, perhaps:
Watchdog is barking. Postman ending delivery.


ID: 14638 · Rating: 0
BennyRop
Joined: 17 Dec 05
Posts: 555
Credit: 140,800
RAC: 0
Message 14639 - Posted: 26 Apr 2006, 8:20:47 UTC

I had a friend that worked in tech support at Microsoft - who had a noose dangling from the ceiling and inside the noose was a picture frame. It was in reference to "My windows is hung."

Now I can picture Rhiju's workspace with a 12 inch blade dangling from the ceiling and stained in ketchup.. :)

In reference to "terminating a process with extreme prejudice".. *grin*

Here's hoping the code helps eliminate most of the remaining bugs that are frustrating the users.. so we can run through CASP with as few problems as possible.


ID: 14639 · Rating: 0
Cureseekers~Kristof
Joined: 5 Nov 05
Posts: 80
Credit: 689,603
RAC: 0
Message 14640 - Posted: 26 Apr 2006, 8:22:31 UTC

Why embarrassing? I like the name of the function ;)

Member of Dutch Power Cows
ID: 14640 · Rating: 0
tralala
Joined: 8 Apr 06
Posts: 376
Credit: 581,806
RAC: 0
Message 14736 - Posted: 27 Apr 2006, 8:16:31 UTC

Just a comment on David's last entry. I would strongly suggest delaying the release of the new app till Monday and using the weekend for thorough testing. Failing WUs are a real burden for the project since they are sent out again for several weeks. We all hope that this won't happen with the new watchdog, but you never know.
ID: 14736 · Rating: 0
Cureseekers~Kristof
Joined: 5 Nov 05
Posts: 80
Credit: 689,603
RAC: 0
Message 14738 - Posted: 27 Apr 2006, 9:39:31 UTC

I agree with tralala!
If anything goes wrong now, the project will lose more power than with a 2 day delay of the new version...

Member of Dutch Power Cows
ID: 14738 · Rating: 0



©2024 University of Washington
https://www.bakerlab.org