Optimized Rosetta App...

Message boards : Number crunching : Optimized Rosetta App...

Rafael Cesare Valente "CrackBoy"
Joined: 19 Jun 06
Posts: 1
Credit: 19,153
RAC: 0
Message 19879 - Posted: 7 Jul 2006, 11:36:53 UTC

Hi all...

I come from S@H and E@H, where some apps have code optimized for specific processor features (SSE, SSE2, SSE3, MMX). Does Rosetta have the same kind of app, or is there only the stock app?

Best Regards,

Rafael Valente
ID: 19879 · Rating: 0
dcdc
Joined: 3 Nov 05
Posts: 1829
Credit: 115,876,923
RAC: 60,948
Message 19880 - Posted: 7 Jul 2006, 11:50:35 UTC

There's only the stock app at the moment. There are a few discussions on the subject, but it appears that SSE support etc might not increase output anyway. There's a recent discussion on 64 bit, which might even slow the app down.

HTH
Danny
ID: 19880 · Rating: 0
Mats Petersson
Joined: 29 Sep 05
Posts: 225
Credit: 951,788
RAC: 0
Message 19882 - Posted: 7 Jul 2006, 13:02:11 UTC
Last modified: 7 Jul 2006, 13:02:27 UTC

Rafael,

You may want to have a look at the discussion under the title of "64-bit" in this section of the message-board. [it's just a few subjects down from the top].

A summary is as follows:
1. 64-bit is unlikely to give MUCH, if any, performance benefit, and MAY possibly be detrimental to performance.

2. SSE-optimisation would probably work if someone sat down and rewrote the right section(s) of code - but at the moment, the code is in flux, so it's quite possible that such work is going to end up being wasted, since the next version isn't doing the same calculations...

3. It's not fair to compare, for example, the benefit S@H gets from optimizing with R@H, since R@H is doing a completely different set of math, and the benefit of optimizing it would be small indeed.

Currently, R@H is not available to the public in source form, so it's not possible to make simple tests (like throwing different optimization switches at the compiler and seeing what it makes of them).

It is fairly clear, however, that Rosetta is mainly doing FPU math, and very little else (about 2/3 of the instructions executed in Rosetta are FPU operations). So using SSE instructions would be one way to improve the FPU throughput. As Leonard points out in the 64-bit thread, it could also affect the results; I find that unlikely, but not impossible. So careful checking of the results from the optimized application would be necessary for sure.
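To make the concern about changed results concrete, here is a minimal sketch in Python (not Rosetta code; the function names and the test data are made up for illustration). An SSE version of a summation loop typically keeps four partial sums, one per lane, and combines them at the end, so the additions happen in a different order than in the scalar loop. Floating-point addition is not associative, so the two orders can round differently:

```python
import math


def sum_scalar(values):
    """Plain left-to-right accumulation, as a scalar FPU loop would do it."""
    total = 0.0
    for v in values:
        total += v
    return total


def sum_four_lanes(values):
    """Imitates a 4-wide SIMD loop: one partial sum per lane, combined at the end."""
    lanes = [0.0, 0.0, 0.0, 0.0]
    for i, v in enumerate(values):
        lanes[i % 4] += v
    return (lanes[0] + lanes[1]) + (lanes[2] + lanes[3])


# Deliberately ill-conditioned input: large values that cancel, plus small ones.
data = [1e16, 1.0, -1e16, 1.0]

print(sum_scalar(data))      # 1.0
print(sum_four_lanes(data))  # 0.0
print(math.fsum(data))       # 2.0 (exactly rounded reference)
```

The input here was chosen to exaggerate the effect. On well-behaved data the discrepancy tends to be far smaller, which is why an optimized build would have to be compared against the stock app within a tolerance rather than bit for bit.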

Please feel free to do your own research - together we may be able to help out.

Of course, one other problem with optimized code is that the share of machines supporting each successive extension (SSE, SSE2, SSE3, etc.) gets smaller and smaller compared to the entire set of machines available. So although the code may run faster, only a limited number of machines can actually run it faster... Going with plain SSE would be OK I think, as that's been around for some time, and there's probably a good number of machines with this feature. Ideally we'd have multiple implementations in a single binary, so there's no separate optimized version, but all are available in the same executable...
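A minimal sketch of the "multiple implementations in a single binary" idea, written in Python for brevity. Everything here is hypothetical: the function names, the `IMPLEMENTATIONS` table and the way `cpu_features` is supplied are assumptions for illustration, and `dot_sse_like` only imitates the four-at-a-time access pattern of an SSE loop. In a native build the feature set would come from the CPUID instruction at startup, and the selected routine would be called through a function pointer:

```python
def dot_generic(a, b):
    """Baseline path: must work on every machine."""
    total = 0.0
    for x, y in zip(a, b):
        total += x * y
    return total


def dot_sse_like(a, b):
    """Stand-in for an SSE path: four partial sums, combined at the end."""
    lanes = [0.0, 0.0, 0.0, 0.0]
    for i, (x, y) in enumerate(zip(a, b)):
        lanes[i % 4] += x * y
    return (lanes[0] + lanes[1]) + (lanes[2] + lanes[3])


# Ordered from most to least specialised. The last entry has no
# requirement, so selection always succeeds.
IMPLEMENTATIONS = [
    ("sse", dot_sse_like),
    (None, dot_generic),
]


def select_dot(cpu_features):
    """Pick the first implementation whose required feature is available."""
    for required, fn in IMPLEMENTATIONS:
        if required is None or required in cpu_features:
            return fn
    raise RuntimeError("no implementation available")


# Done once at startup; the hot loop then calls `dot` without re-checking.
dot = select_dot({"mmx", "sse"})   # hypothetical feature set
print(dot.__name__)                # dot_sse_like
print(select_dot(set()).__name__)  # dot_generic
```

Selecting once at startup keeps the cost down to an indirect call per invocation, while older machines still get a working binary.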

--
Mats
ID: 19882 · Rating: 0
Feet1st
Joined: 30 Dec 05
Posts: 1755
Credit: 4,690,520
RAC: 0
Message 19884 - Posted: 7 Jul 2006, 16:04:29 UTC

Hey Rafael, welcome to Rosetta.

Basically, Rosetta is cutting edge science. As such, they are constantly changing and improving it, and making modifications to try out new ideas for predicting the protein structures... and so the application doesn't remain unchanged long enough to be optimized. I mean you'd have to perform any hand optimizations over and over every time they improve the science... and do it for each specific processor class of optimization you've got out there. So, I think it is the project's stance to produce and support one executable for each operating system, that works regardless of specific CPU model.

Performing such optimizations for many processor-specific environments would take time that could be better spent devising new methods of attacking the protein structure prediction problem we're all working to solve. So, they seem to have adopted the approach of investing the time in the science rather than the optimizations. If the science improves, it will easily leapfrog the benefits of any optimizations. I mean if they devise a new approach that works, it could easily cut the number of models needed for accurate prediction in half. ...and then another new approach after that could cut it in half again. So, time spent on the science is the way to go.

The search here isn't as cut and dried as a radio frequency pulse where the challenge is to scan spectrum in detail. We're searching the "universe" of possible atom configurations, and measuring each potential structure to find a structure with the best (lowest) energy levels. If we somehow get smarter about that search and can successfully eliminate potential structures that we've learned will not produce a low energy result, then we cut the "search" time dramatically.

With Rosetta, there is no "does it exist" question. We KNOW a native protein structure exists. The question is WHERE? To draw a SETI analogy... if I told you that ET exists, and he's broadcasting a beacon on frequency X MHz... it would REALLY cut down the computing you have to do to perform the search! NOW your challenge would be to gather the radio data from more points in space to find WHERE he exists.

Over time, Rosetta will "learn" which structures to eliminate from the search space. To use the analogy... we will find the proper frequency to be searching.
Add this signature to your EMail:
Running Microsoft's "System Idle Process" will never help cure cancer, AIDS nor Alzheimer's. But running Rosetta@home just might!
https://boinc.bakerlab.org/rosetta/
ID: 19884 · Rating: 0
Mats Petersson
Joined: 29 Sep 05
Posts: 225
Credit: 951,788
RAC: 0
Message 19887 - Posted: 7 Jul 2006, 17:00:27 UTC - in response to Message 19884.  

Feet1st:

Don't take this reply as a "You're wrong, I'm right" type of answer. I do understand the situation - I'm just arguing that there may be a case for doing SOME optimization. Even if it only gives 20-30% faster calculations, would it not be worth it to get more work done in the same amount of time? [And yes, 20-30% shouldn't be hard to manage, having looked a bit at the code].



Hey Rafael, welcome to Rosetta.

Basically, Rosetta is cutting edge science. As such, they are constantly changing and improving it, and making modifications to try out new ideas for predicting the protein structures... and so the application doesn't remain unchanged long enough to be optimized. I mean you'd have to perform any hand optimizations over and over every time they improve the science... and do it for each specific processor class of optimization you've got out there. So, I think it is the project's stance to produce and support one executable for each operating system, that works regardless of specific CPU model.

There are methods to solve the "One binary - Optimized for several CPU models" problem - many of the applications available today do this. It loses the last 1-2% of optimization, but that's a small loss compared to the benefit.

Performing such optimizations for many processor-specific environments would take time that could be better spent devising new methods of attacking the protein structure prediction problem we're all working to solve. So, they seem to have adopted the approach of investing the time in the science rather than the optimizations. If the science improves, it will easily leapfrog the benefits of any optimizations. I mean if they devise a new approach that works, it could easily cut the number of models needed for accurate prediction in half. ...and then another new approach after that could cut it in half again. So, time spent on the science is the way to go.

Of course, the first optimization is to find the right algorithm to solve the problem - optimizing before that point is kind of useless, as changing the algorithm may give the 10-100x improvement that makes any small optimization effort pretty much useless.

On the other hand, if some of the people here who are willing to offer their spare time made optimizations to specific parts of the code that are known not to change very much, that might help get the code working faster without losing any of the scientists' time on the project - that, I think, is a Win-Win situation.
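To put rough numbers on the argument above, here is a back-of-the-envelope sketch, assuming the speedups are independent and simply multiply. The 1.25x figure is an assumed midpoint of the 20-30% mentioned earlier in the thread, and 2x stands for a science improvement that halves the number of models needed; neither is a measured value.

```python
def relative_runtime(speedups):
    """Runtime relative to the baseline (1.0) after applying each speedup factor."""
    runtime = 1.0
    for factor in speedups:
        runtime /= factor
    return runtime


code_tuning = 1.25   # assumed: midpoint of "20-30% faster calculations"
better_science = 2.0  # assumed: half as many models needed

print(relative_runtime([code_tuning]))                  # 0.8
print(relative_runtime([better_science]))               # 0.5
print(relative_runtime([code_tuning, better_science]))  # 0.4
```

The gains stack rather than compete: under these assumptions the tuned build finishes the halved workload in 40% of the original time instead of 50%.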


The search here isn't as cut and dried as a radio frequency pulse where the challenge is to scan spectrum in detail. We're searching the "universe" of possible atom configurations, and measuring each potential structure to find a structure with the best (lowest) energy levels. If we somehow get smarter about that search and can successfully eliminate potential structures that we've learned will not produce a low energy result, then we cut the "search" time dramatically.

With Rosetta, there is no "does it exist" question. We KNOW a native protein structure exists. The question is WHERE? To draw a SETI analogy... if I told you that ET exists, and he's broadcasting a beacon on frequency X MHz... it would REALLY cut down the computing you have to do to perform the search! NOW your challenge would be to gather the radio data from more points in space to find WHERE he exists.

Over time, Rosetta will "learn" which structures to eliminate from the search space. To use the analogy... we will find the proper frequency to be searching.


I completely agree with this - people have been improving FFT calculations for many years, and it's pretty easy to find ready-made optimized FFT code somewhere for processor X model N.

--
Mats

ID: 19887 · Rating: -1
Feet1st
Joined: 30 Dec 05
Posts: 1755
Credit: 4,690,520
RAC: 0
Message 19888 - Posted: 7 Jul 2006, 17:27:41 UTC

On the other hand, if some of the people here who are willing to offer their spare time made optimizations to specific parts of the code that are known not to change very much, that might help get the code working faster without losing any of the scientists' time on the project - that, I think, is a Win-Win situation.


Mats, I agree. Could be helpful, the price is right (i.e. free from volunteers), and very little time requirement on the project team. However, it doesn't appear to be the approach to the issue that they've chosen to take, and I respect their decision, even if I do not completely understand the basis for it.

I was more focused on helping Rafael (and future readers) understand Rosetta as compared to other projects, because I cannot influence the decision the project team has made about how they handle release of their source code.

In order to try and channel your skills towards other things that can help Rosetta, even if they choose not to make source code available... would you be interested in helping with the minor coding needed for an easy-to-install CD? This could bring thousands more users to Rosetta. Read more about this idea here, and if you can offer help, or other ideas to help Rosetta, post in that thread.
ID: 19888 · Rating: -1
Dimitris Hatzopoulos
Joined: 5 Jan 06
Posts: 336
Credit: 80,939
RAC: 0
Message 19898 - Posted: 7 Jul 2006, 19:04:13 UTC
Last modified: 7 Jul 2006, 19:05:21 UTC

IMHO people with expertise in code optimisation should get in contact with the project directly and obtain the source code.

Doing it akosf-style (working on the binary .exe with TurboDebugger) won't be very useful, because the project changes the code very frequently (even for the public R@H app, 2 updates per month isn't uncommon).
ID: 19898 · Rating: 0
tralala
Joined: 8 Apr 06
Posts: 376
Credit: 581,806
RAC: 0
Message 19900 - Posted: 7 Jul 2006, 19:16:02 UTC - in response to Message 19898.  
Last modified: 7 Jul 2006, 19:16:17 UTC

IMHO people with expertise in code optimisation should get in contact with the project directly and obtain the source code.


That's currently being negotiated. I hope to have the source in a few days and start looking at optimizations together with some experts (Mats, for example).

ID: 19900 · Rating: 1

©2024 University of Washington
https://www.bakerlab.org