Improvements to Rosetta@home based on user feedback

Message boards : Number crunching : Improvements to Rosetta@home based on user feedback

Win2Kuser
Joined: 2 Nov 05
Posts: 7
Credit: 2,372,223
RAC: 0
Message 12938 - Posted: 2 Apr 2006, 9:45:08 UTC

Well, I've written a small app in VB6 that will scan a network of Windows machines, and send the abort command to any jobs that have stalled.

It isn't pretty, and I can't guarantee that it will work in all setups, but it should at the very least bring your attention to a stalled job.

You can read about it here

Hope it helps :)

JChojnacki
Joined: 17 Sep 05
Posts: 71
Credit: 10,038,829
RAC: 5,754
Message 14199 - Posted: 20 Apr 2006, 21:46:17 UTC
Last modified: 20 Apr 2006, 21:47:30 UTC

Hi,

I was just looking at the Server Status on the Rosetta@home front page. While reading the number of WUs that had been successfully processed in the last 24 hours, I had a thought. Would it be possible to add another line item showing the number of models processed in the last 24 hours?

I understand it is not technically part of the standard BOINC server functions as such, but I think it would be interesting to see that information. Could it be pulled from the Result ID pages for a given 24-hour period? If the information is stored in some other database, that could work as well.

Anyway, just a thought.

Joel~


The Bee
Joined: 26 Sep 05
Posts: 9
Credit: 41,925
RAC: 0
Message 14209 - Posted: 21 Apr 2006, 1:04:41 UTC

I'd like to see on the graphics page a value for the lowest energy figure calculated so far, in addition to the current "Accepted Energy" & "Accepted RMSD" values.

Sorry if I've posted this in the wrong place, but I couldn't find a thread for suggestions.

Feet1st
Joined: 30 Dec 05
Posts: 1755
Credit: 4,690,520
RAC: 0
Message 14220 - Posted: 21 Apr 2006, 3:14:06 UTC - in response to Message 14199.  
Last modified: 21 Apr 2006, 3:14:47 UTC

Good point. The models quantify the science being done, not the WUs or the credits.

I'd like to see the WU grouping names, and the project's goal for the number of models to crunch against each, with a progress bar on how close we are to the goal. Something like:
WU Name                   Models processed     Model goal      % of goal 

VP_PRODUCTION                   12,345          50,000             25%
VP_PRODUCTION_truncate           6,543          25,000             26%
PROD_ABINITIO                   15,678          95,000             17%
FACONTACTS_RECENTER_NOFILTERS      100           5,000              2%
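
A table like this is cheap to produce once per-batch model counts are available. A minimal sketch in Python, assuming a hypothetical list of (batch name, models processed, model goal) tuples; the numbers are the illustrative ones from the table above, not real project data, and percentages are rounded to the nearest whole percent:

```python
def percent_of_goal(processed, goal):
    """Return progress toward the goal as a whole percent (rounded to nearest)."""
    if goal <= 0:
        raise ValueError("goal must be positive")
    return round(100 * processed / goal)


def format_progress(batches):
    """Build a plain-text progress table from (name, processed, goal) tuples."""
    header = f"{'WU Name':<32}{'Models processed':>18}{'Model goal':>14}{'% of goal':>12}"
    lines = [header, ""]
    for name, processed, goal in batches:
        pct = percent_of_goal(processed, goal)
        lines.append(f"{name:<32}{processed:>18,}{goal:>14,}{pct:>11}%")
    return "\n".join(lines)


# Hypothetical input: illustrative figures only.
BATCHES = [
    ("VP_PRODUCTION", 12345, 50000),
    ("VP_PRODUCTION_truncate", 6543, 25000),
    ("PROD_ABINITIO", 15678, 95000),
    ("FACONTACTS_RECENTER_NOFILTERS", 100, 5000),
]

print(format_progress(BATCHES))
```

If the counts were refreshed by a periodic job rather than on every page view, the page itself would only need to print a stored table.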

Add this signature to your EMail:
Running Microsoft's "System Idle Process" will never help cure cancer, AIDS nor Alzheimer's. But running Rosetta@home just might!
https://boinc.bakerlab.org/rosetta/

JChojnacki
Joined: 17 Sep 05
Posts: 71
Credit: 10,038,829
RAC: 5,754
Message 14262 - Posted: 21 Apr 2006, 14:16:41 UTC - in response to Message 14220.  

Good point. The models quantify the science being done, not the WUs or the credits.

I'd like to see the WU grouping names, and the project's goal for the number of models to crunch against each, with a progress bar on how close we are to the goal. Something like:
WU Name                   Models processed     Model goal      % of goal 

VP_PRODUCTION                   12,345          50,000             25%
VP_PRODUCTION_truncate           6,543          25,000             26%
PROD_ABINITIO                   15,678          95,000             17%
FACONTACTS_RECENTER_NOFILTERS      100           5,000              2%

I kind of like this idea as well. Very Nice.

Though I would think the list of WU names may be rather long, so maybe it could go on a separate page, with a link to it right next to the models processed in the last 24 hours, under the server status. Does that sound reasonable?

Joel~

Knorr
Joined: 18 Feb 06
Posts: 21
Credit: 373,953
RAC: 0
Message 14492 - Posted: 23 Apr 2006, 17:48:11 UTC

It would be nice if you could see the result of your WU.

That is the lowest energy and RMSD YOU have predicted in the WU.
The info could maybe be added to the stderr out data?

Would be great to see how far off you were from winning the top prediction of that protein... ;)

Moderator9
Volunteer moderator
Joined: 22 Jan 06
Posts: 1014
Credit: 0
RAC: 0
Message 14510 - Posted: 23 Apr 2006, 23:42:07 UTC - in response to Message 14492.  
Last modified: 23 Apr 2006, 23:48:46 UTC

It would be nice if you could see the result of your WU.

That is the lowest energy and RMSD YOU have predicted in the WU.
The info could maybe be added to the stderr out data?

Would be great to see how far off you were from winning the top prediction of that protein... ;)

Actually this exists, but it has not been updated in a while (I will ask them to do that when they have some time).

The page is here but be aware it takes a few moments to load.

Looking at your signup date, unless one of the proteins in the list has been worked on since you started the project, you would not be in the list. You can check the protein list here and see if you can find a current WU type that is also on that list. (I think I will ask them to make that linkage as well.)

Thank you for the idea.

Moderator9
ROSETTA@home FAQ
Moderator Contact

David Baker
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 17 Sep 05
Posts: 705
Credit: 559,847
RAC: 0
Message 14522 - Posted: 24 Apr 2006, 6:01:08 UTC - in response to Message 14510.  

It would be nice if you could see the result of your WU.

That is the lowest energy and RMSD YOU have predicted in the WU.
The info could maybe be added to the stderr out data?

Would be great to see how far off you were from winning the top prediction of that protein... ;)

Actually this exists, but it has not been updated in a while (I will ask them to do that when they have some time).

The page is here but be aware it takes a few moments to load.

Looking at your signup date, unless one of the proteins in the list has been worked on since you started the project, you would not be in the list. You can check the protein list here and see if you can find a current WU type that is also on that list. (I think I will ask them to make that linkage as well.)

Thank you for the idea.


We wanted to resume this reporting, but David Kim found that the load it puts on the database server is currently too large.

BennyRop
Joined: 17 Dec 05
Posts: 555
Credit: 140,800
RAC: 0
Message 14524 - Posted: 24 Apr 2006, 7:40:23 UTC
Last modified: 24 Apr 2006, 7:42:58 UTC

Would having a pre-rendered .gif file of all our results in red, followed by a green overlay with one point for our lowest RMSD and one point for our lowest energy, reduce the load on the db server? [So that only the green overlay .gif needs to be rendered when we request the image?]

And if not.. then how about releasing an app with downloadable zips of databases of userids and RMSD and energy scores? Let us move up and down the list of databases we've downloaded, and pull our userid (and computer id if we have more than one system) out of the Rosetta directory, and show where we would end up on the plot.

After all, it's a lot more interesting to see that spot or spots, than to read a line that states that the lowest RMSD was 2.1 Angstroms, the highest was 12 Angstroms, the median/average was ?, and ours was.. ??.
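
The offline lookup described above could be sketched roughly as follows in Python. Everything here is an assumption for illustration: the record layout (user id, RMSD, energy), the function names, and the sample values are made up and do not describe any real Rosetta@home download format.

```python
from typing import NamedTuple


class Result(NamedTuple):
    user_id: int
    rmsd: float    # distance from the native structure, in Angstroms (lower is closer)
    energy: float  # score assigned to the model (lower is better)


def rank_of(value, all_values):
    """1-based rank of value among all_values, where smaller is better."""
    return 1 + sum(1 for v in all_values if v < value)


def summarize_user(results, user_id):
    """Find one user's best RMSD and best energy and where they rank overall."""
    mine = [r for r in results if r.user_id == user_id]
    if not mine:
        return None  # this user has no results for this protein
    best_rmsd = min(r.rmsd for r in mine)
    best_energy = min(r.energy for r in mine)
    return {
        "results": len(mine),
        "best_rmsd": best_rmsd,
        "rmsd_rank": rank_of(best_rmsd, [r.rmsd for r in results]),
        "best_energy": best_energy,
        "energy_rank": rank_of(best_energy, [r.energy for r in results]),
        "total": len(results),
    }


# Made-up sample data for one protein.
SAMPLE = [
    Result(1, 2.1, -150.0),
    Result(2, 12.0, -90.0),
    Result(1, 5.5, -160.0),
    Result(3, 3.0, -155.0),
]

print(summarize_user(SAMPLE, 3))
```

The two best points returned for a user are the ones that would be drawn in green over the pre-rendered plot of everyone's results.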

Feet1st
Joined: 30 Dec 05
Posts: 1755
Credit: 4,690,520
RAC: 0
Message 14540 - Posted: 24 Apr 2006, 14:47:52 UTC

So that only the green overlay .gif needs to be rendered when we request the image?


I agree!! Seeing is believing. Certainly not vital to the science, but will keep users interested.

Sounds like a great idea to ease the load. You might have several green markers for your results, and various machines.

It would also be cool to see all the results of your team.

One of the suggestions we got from younger crunchers was:
Appearances... It has to look exciting to catch the first interest. Then it should be easy to figure out what the whole thing is.


This results page is a great step in that direction.

senatoralex85
Joined: 27 Sep 05
Posts: 66
Credit: 169,644
RAC: 0
Message 15009 - Posted: 29 Apr 2006, 16:10:43 UTC - in response to Message 14522.  

It would be nice if you could see the result of your WU.

That is the lowest energy and RMSD YOU have predicted in the WU.
The info could maybe be added to the stderr out data?

Would be great to see how far off you were from winning the top prediction of that protein... ;)

Actually this exists, but it has not been updated in a while (I will ask them to do that when they have some time).

The page is here but be aware it takes a few moments to load.

Looking at your signup date, unless one of the proteins in the list has been worked on since you started the project, you would not be in the list. You can check the protein list here and see if you can find a current WU type that is also on that list. (I think I will ask them to make that linkage as well.)

Thank you for the idea.


We wanted to resume this reporting, but David Kim found that the load it puts on the database server is currently too large.



I took a look at the graphs and the axes do not have any units on them. Can this be updated? Also, how come there is not a quorum of, say, 2 to make sure that there are no errors? All of the workunits I have gotten so far have a quorum of 1.

AMD_is_logical
Joined: 20 Dec 05
Posts: 299
Credit: 31,460,681
RAC: 0
Message 15032 - Posted: 29 Apr 2006, 18:05:44 UTC - in response to Message 15009.  

how come there is not a quorum of say 2 to make sure that there are no errors. All of the workunits I have gotten so far have a quorum of 1.


A quorum of 2 would cut the amount of useful work being crunched in half. And there is no reason to do that. While it takes a while to crunch a WU, verifying that the result is valid can be done very quickly.
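
The arithmetic behind this can be sketched in a few lines of Python. It assumes that redundant copies of a WU add no new science and it ignores the (small) cost of validation; the 30 TFLOPS figure is a made-up example, not a project statistic.

```python
def useful_fraction(quorum):
    """Fraction of crunching that yields new results when each WU is run `quorum` times."""
    if quorum < 1:
        raise ValueError("quorum must be at least 1")
    return 1 / quorum


def effective_tflops(raw_tflops, quorum):
    """Raw computing power scaled down by the redundancy overhead."""
    return raw_tflops * useful_fraction(quorum)


RAW = 30.0  # hypothetical raw project throughput in TFLOPS
for q in (1, 2, 3):
    print(f"quorum {q}: {effective_tflops(RAW, q):.1f} TFLOPS of useful work")
```

With a quorum of 2 half of the work is a repeat, and with a quorum of 3 two-thirds of it is.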

Rusty Lafavour
Joined: 12 Apr 06
Posts: 4
Credit: 26,391
RAC: 0
Message 15039 - Posted: 29 Apr 2006, 19:06:41 UTC

I started before 5.01. After I got over some start-up errors (mine), 5.01 loaded and all the WUs completed successfully running it. I missed 5.06 as I had a couple of days of work downloaded to go through. Now running 5.07. Running GREAT!!!! Keep up the good work as we go through many errors before perfection. Just putting in my two cents.

Feet1st
Joined: 30 Dec 05
Posts: 1755
Credit: 4,690,520
RAC: 0
Message 15046 - Posted: 29 Apr 2006, 22:06:04 UTC - in response to Message 15009.  

how come there is not a quorum of say 2 to make sure that there are no errors. All of the workunits I have gotten so far have a quorum of 1.

Someone else stated it pretty well. Something about 95% of the models attempted are not fruitful, and you don't have to do them all again to prove that.

In other projects, my understanding is that the primary reason for quorums is to validate the credit claims. That's right, cut the effective TFLOPS by 2/3rds just to be sure everyone gets fair credit.

Moderator9
Volunteer moderator
Joined: 22 Jan 06
Posts: 1014
Credit: 0
RAC: 0
Message 15051 - Posted: 30 Apr 2006, 0:21:05 UTC - in response to Message 15046.  
Last modified: 30 Apr 2006, 0:29:19 UTC

how come there is not a quorum of say 2 to make sure that there are no errors. All of the workunits I have gotten so far have a quorum of 1.

Someone else stated it pretty well. Something about 95% of the models attempted are not fruitful, and you don't have to do them all again to prove that.

In other projects, my understanding is that the primary reason for quorums is to validate the credit claims. That's right, cut the effective TFLOPS by 2/3rds just to be sure everyone gets fair credit.



Sorry for the long post;

Unlike many other projects, Rosetta really only has one correct result for each protein. Either the protein model correctly reflects the structure of the protein being tested or it does not. While more than one computer in a Work Unit run might find that answer (that would be good) it is currently very likely that none of them will. Hence the importance of the protein of the day posted on the front page of the project.

In most cases for the current batches of Work Units the structure is known. You can see that in the graphic as the "Native" structure. Since what we are working on here is developing an accurate modeling software, the Work Units that do not yield the correct structure are a valuable guide to what did not work.

Similar to a pile of small magnets pulling and pushing on each other, the molecules in a protein push and pull on each other with a force referred to as "energy" by the project team. Since in most known cases stable protein structures fall in the lowest possible energy categories for a given combination of molecules, the Work Unit/Application combination is seeking that low energy point (where all those magnets can form a stable shape) in the search for the correct structure.

So if we can find an application that will successfully and consistently find the lowest energy structures, it can be assumed that among those structures will be found the correct one. What Rosetta hopes to find is an approach that is not simply a brute force computing approach. So the applications and work units change in an attempt to find new and better ways to do the calculating.

You can begin to see the problem on the larger proteins. If you look at the graphic, every one of those small "kinks" in the images is a possible location for a bend, or angle, in the shape, and each bend has an energy. There can be billions of combinations of these bends, but only a very few combinations represent stable low energy structures.
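
To make the idea of searching bend angles for a low-energy shape concrete, here is a toy sketch in Python. It is not Rosetta's algorithm or energy function: the energy formula, the step size, and the temperature are all made up, and the search is a plain Metropolis Monte Carlo loop chosen only because it is short.

```python
import math
import random


def toy_energy(angles):
    """Made-up energy: zero (the minimum) when every bend angle is 60 degrees."""
    return sum(1.0 - math.cos(math.radians(a - 60.0)) for a in angles)


def metropolis_search(n_bends, steps, temperature=0.5, seed=0):
    """Randomly perturb one bend at a time, keeping track of the lowest energy seen."""
    rng = random.Random(seed)
    angles = [rng.uniform(-180.0, 180.0) for _ in range(n_bends)]
    current = toy_energy(angles)
    best, best_angles = current, list(angles)
    for _ in range(steps):
        i = rng.randrange(n_bends)
        old = angles[i]
        angles[i] = old + rng.gauss(0.0, 20.0)  # nudge one bend
        trial = toy_energy(angles)
        # Always accept downhill moves; accept uphill moves with a
        # probability that shrinks as the energy increase grows.
        if trial <= current or rng.random() < math.exp((current - trial) / temperature):
            current = trial
            if current < best:
                best, best_angles = current, list(angles)
        else:
            angles[i] = old  # reject the move and restore the bend
    return best, best_angles


best, shape = metropolis_search(n_bends=10, steps=5000)
print(f"lowest toy energy found: {best:.3f}")
```

Each seed gives a different trajectory through the possible shapes, which is loosely analogous to many independent runs exploring different parts of the search space.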

So for now what is important to the project is direct computing power to apply to looking at all those possible shapes, searching for the correct (and possible) shapes for structures we already know. If a reliable approach is found for known shapes then it "should" work for those we do not know. When we have that, then we can really get down to finding the causes of protein-related disease, and then the cures.

So for what we are doing here, the power (no redundancy=more power) is far more important than any particular single result answer. All the answers are of value. And when the science guys get all this right, we will all know it because suddenly a very large number of results (perhaps all of them) will have the same answers (structures), and they will match the known structure. Moreover, it will be repeatable when they run it again. Later, when the structure is not known, the modeling software will help by eliminating a lot of the wrong answers, allowing experimental people to locate the right structure more easily.

In any case the comment that at this point redundancy is only necessary for credits is true. And soon that will not be necessary either. On the SETI Beta project they are using a different credit system, that would not require redundancy and it seems to be working.

Moderator9

BennyRop
Joined: 17 Dec 05
Posts: 555
Credit: 140,800
RAC: 0
Message 15059 - Posted: 30 Apr 2006, 2:30:01 UTC - in response to Message 15046.  

how come there is not a quorum of say 2 to make sure that there are no errors. All of the workunits I have gotten so far have a quorum of 1.

Someone else stated it pretty well. Something about 95% of the models attempted are not fruitful, and you don't have to do them all again to prove that.

In other projects, my understanding is that the primary reason for quorums is to validate the credit claims. That's right, cut the effective TFLOPS by 2/3rds just to be sure everyone gets fair credit.

It was discussed in a thread on Validation here:
Validation thread
A couple of posts after Dimtris posted the graphic of our recent results, I pointed out how few of those 10,000 points (results) were in the same ballpark in terms of low RMSD or low energy as the best results. When asking for a quorum of 2 or 3.. take a look at our results and you'll hopefully see what a ridiculous waste of cpu time you're proposing.
In case neither the posts in the validation thread nor this one (including Moderator9's latest post) touch on it clearly enough: we're not testing for cures here, and we're not testing radio signals, etc., where missing a positive result would destroy the whole value of the project. A corrupt result is not going to mean that we're missing a radio signal from the general direction of Antares.. and we're not going to miss the fact that a phytoplankton enzyme is a perfect protection against AIDS. At worst, we lose the lowest RMSD point.. or the lowest Energy point. If Nec had managed to corrupt Rosetta that week by using his system to edit videos of the Teletubbies (not implying that Nec would perform such a vile task) - the worst that would have happened would have been that we'd have lost the lowest RMSD point (and the FADBeens would have harassed him about his bad taste). The lowest RMSD would be a little higher.. and we'd still be looking for ways of improving our results so we can get closer to the experimental native forms of these proteins. We're not looking for a needle in a haystack here. If we lose the best result.. bummer. But it's a 1 in 10,000 event.
And while every point... even those with the absolute highest RMSD have great value to the project, I still look forward to the time when the lowest RMSD and lowest energy points are the same point.. and it is overlaid with 10-50 others with the exact same readings.

Visual aids really help - so being able to see how close you came to the RMSD - or how close all the members of your team came to the RMSD - could help foster competition.. but having an application so we could see the best results from all our machines (for the multi computer crowd) or from our team can spark more interest.


Working to reduce client errors, and testing the new approaches proposed by the various Rosetta staff to see which actually improve our results, is the way I'd go to improve the usefulness of this project.

dag
Joined: 16 Dec 05
Posts: 106
Credit: 1,000,020
RAC: 0
Message 15686 - Posted: 8 May 2006, 16:16:53 UTC

A couple of weeks ago something about RAH or BOINC jammed one of my hosts. I had to do a reinstall to clear it. Now I have two extra copies of that host in my computer list. It sure would be nice to be able to merge the several thousand pts. into one and to delete the trash.
dag
--Finding aliens is cool, but understanding the structure of proteins is useful.

Astro
Joined: 2 Oct 05
Posts: 987
Credit: 500,253
RAC: 0
Message 15688 - Posted: 8 May 2006, 16:26:13 UTC - in response to Message 15686.  

A couple of weeks ago something about RAH or BOINC jammed one of my hosts. I had to do a reinstall to clear it. Now I have two extra copies of that host in my computer list. It sure would be nice to be able to merge the several thousand pts. into one and to delete the trash.

All the points from those hosts are still under your total credit. You won't lose them. For housekeeping's sake it would be nice to have "merge" and "delete" back, but it really doesn't affect you other than being "untidy". Well, it does throw off total host count numbers at stats sites.

UBT - Halifax--lad
Joined: 17 Sep 05
Posts: 157
Credit: 2,687
RAC: 0
Message 15842 - Posted: 10 May 2006, 21:18:35 UTC

I would like to know when AMS support will be made available on RALPH & Rosetta. This would mean upgrading the server to enable support for the AMS.

The majority of projects have already upgraded, and only a small minority have not done so yet, this project & RALPH being among them.

So please do this upgrade ASAP!

Thanks
Dave

Moderator9
Volunteer moderator
Joined: 22 Jan 06
Posts: 1014
Credit: 0
RAC: 0
Message 15859 - Posted: 11 May 2006, 0:51:19 UTC - in response to Message 15842.  

I would like to know when AMS support will be made available on RALPH & Rosetta, this would mean upgrading the server to enable support for the AMS.

The majority of projects have already upgraded and there is only a small minority who have not done so as yet, this project & RALPH been one of them.

So please do this upgrade ASAP!

Thanks
Dave

While certainly circumstances may dictate otherwise over time, I would not expect to see any server upgrades during CASP without a very compelling reason.
Moderator9

©2024 University of Washington
https://www.bakerlab.org