Posts by Mark

1) Message boards : Number crunching : I've got (Message 80579)
Posted 30 Aug 2016 by Mark
Post:
thought you were going to say this...

https://www.youtube.com/watch?v=btEpF334Rtc
2) Message boards : Rosetta@home Science : Genetic Algorithm (Message 80318)
Posted 29 Jun 2016 by Mark
Post:
I'm a noob at this, but I'm thinking that a lower-energy fold minimum may in some cases be possible, but the protein may first need to cross a 'higher energy' barrier to reach that even lower energy minimum. If this is true, I'd guess nature may after all settle for a 'higher energy' minimum? lol


The answer to that is "maybe" and "it depends". It's certainly possible there is an activation-energy-type barrier blocking its access to a lower overall energy state. More likely, though, is that there are multiple low-energy states and that biology in fact utilises these various states, with their associated conformational changes, for biological purposes. Haemoglobin is the first that springs to mind, although there are thousands of examples.

Don't think Rosetta is really geared up to identify multiple states, but I may be wrong...
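
To make the barrier idea concrete, here is a minimal toy sketch. It is not Rosetta code: the one-dimensional `energy` function and the `descend` helper are made up for illustration. A purely downhill search ends up in whichever minimum lies on its own side of the barrier, even when the other minimum is lower.

```python
def energy(x):
    """Toy 1-D 'energy landscape': two minima separated by a barrier near x = 0."""
    return (x * x - 1.0) ** 2 + 0.3 * x


def gradient(x):
    """Derivative of energy(x) with respect to x."""
    return 4.0 * x * (x * x - 1.0) + 0.3


def descend(x, step=0.01, iterations=5000):
    """Plain downhill search: always move against the gradient, never uphill."""
    for _ in range(iterations):
        x -= step * gradient(x)
    return x


trapped = descend(0.5)    # starts on the right-hand side of the barrier
relaxed = descend(-0.5)   # starts on the left-hand side of the barrier

print(f"start  0.5 -> x = {trapped:.3f}, energy = {energy(trapped):.3f}")
print(f"start -0.5 -> x = {relaxed:.3f}, energy = {energy(relaxed):.3f}")
print(f"energy at x = 0 (on the barrier) = {energy(0.0):.3f}")
```

The right-hand start settles in the shallower well because getting to the deeper one would mean going uphill first, which is the activation-energy situation described above.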
3) Message boards : Number crunching : GPU computing (Message 80092)
Posted 17 May 2016 by Mark
Post:
I get all the points made here. I'm not saying it's easy. The original question is at what point the difference between CPU and GPU performance becomes so great that the conversion project becomes worthwhile.

The Rosetta code got upgraded in a big project (to C++ I think) a while back. I am talking about a similar effort. Yes, I realise it will take time and ultimately money. Anyone have a feel for how much it would cost? I plucked a figure out of the air a bit, was it reasonable?
4) Message boards : Number crunching : GPU computing (Message 80082)
Posted 14 May 2016 by Mark
Post:
I've just been looking at the performance of the new GTX 1080 and for DOUBLE precision calculations it does 4 TFlops!!!! For comparison, a relatively high-performance chip like an overclocked 5820K will do maybe 350 GFlops. So we are talking an order of magnitude difference. In addition, the Tesla HPC version will probably be double that at 8 TFlops. (Edit: Looks like it is actually 5.3 TFlops.) The Volta version of the GTX 1080 (next gen on, due in about 18 months' time) is rumoured to be 7 TFlops FP64 in the consumer version.

There is no way that conventional processors can keep up with that level of calculation. How big does the gap between serial CPU and parallel GPU have to be before the project leaders decide they cannot afford NOT to invest in recoding for parallel processing? Because in 2 years' time, HPC GPUs will be around 35 times faster than CPUs. How much will it cost to rewrite the code, $100-150K maybe?? Isn't that worth paying for such a huge step up?

With that kind of performance increase, you can make calcs more accurate. You no longer have to use approximations like LJ potentials; you can calculate the energy accurately and get a better answer in a quicker time than now. What's not to like?

As with so many projects, it seems everyone is comfortable with what they are doing now. Revolution has been forsaken for evolution. Understandable, but is it the best way to do things?

Be bold and take the leap!
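
For anyone who wants to redo the sums, a quick sketch of the ratios. It takes the peak figures quoted in this post at face value (they are not independently verified), and the dictionary labels and the `speedup_over_cpu` helper are just names made up for the example.

```python
# Peak double-precision figures as quoted in the post above (unverified), in GFlops.
QUOTED_GFLOPS = {
    "overclocked 5820K (CPU)": 350.0,
    "GTX 1080 (as quoted)": 4000.0,
    "Tesla HPC version (as quoted)": 5300.0,
    "Volta consumer card (rumoured)": 7000.0,
}


def speedup_over_cpu(figures, cpu_key="overclocked 5820K (CPU)"):
    """Return each entry's peak-throughput ratio relative to the CPU entry."""
    cpu = figures[cpu_key]
    return {name: gflops / cpu for name, gflops in figures.items()}


ratios = speedup_over_cpu(QUOTED_GFLOPS)
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}x")
```

These are ratios of peak throughput only, so they are an upper bound; the speedup actually achieved would depend on how much of the code can run in parallel.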
5) Message boards : Number crunching : 300+ TeraFLOPS sustained! (Message 80061)
Posted 10 May 2016 by Mark
Post:

Interesting but going slow.


I would also like to add my thanks to you for your efforts. It appears you could make a real impact far beyond running your own crunching.
6) Message boards : Rosetta@home Science : zero gravity, hi-magnetisim (Message 79735)
Posted 8 Mar 2016 by Mark
Post:
Well since astronauts survive in space I would assume their proteins are folding OK. The hydrophobic forces/hydrogen bonds/charges remain the same and so the fold is probably the same. Gravity changes what is up and down, not a concern in protein folding.

As for magnetic fields, there may be an issue with metal co-factors, but otherwise I wouldn't expect much difference.
7) Message boards : Number crunching : Raspberry Pi 3 (Message 79687)
Posted 2 Mar 2016 by Mark
Post:
Clusters of ARM chips would not match top-tier Haswell or Skylake on GFLOPS versus energy requirements (GFLOPS per watt).


This. I had a similar thought and did some rough calcs comparing a cluster of Pis and an overclocked 5820K. The Intel chip won by a factor of about 50 IIRC (on gflops for the equivalent cost).....
8) Message boards : Rosetta@home Science : Genetic Algorithm (Message 79686)
Posted 2 Mar 2016 by Mark
Post:
Yes! On robetta we send off many individual jobs to create a population of models. These models are then recombined to create new population(s) of models iteratively.


Not sure this sounds like a "proper" GA. Do you have a paper/documentation on this?

Thanks



I actually think this is a "proper" GA. Appears they seek a "folding model" and the best way is to create a population of models. Models that perform the best are kept and mutated to see if the mutation is better or worse. Appears the goal is to find a best fit model until near perfection is met. They will probably use it across a set of proteins to determine if the model can reach lowest energy state in every set. Not a bad approach. Happy to be crunching with PhD Baker.


Yes, that is a description of how GAs work. However, in the case of Rosetta they use simulated annealing methods (rather than GA methods) to find local (and hopefully global) minima. The end results of all these folds are amalgamated to produce a graph like this:-
(snipped)

So it hinges on what is meant by "These models are then recombined to create new population(s) of models iteratively." Like I said, doesn't sound "GA - y"



Maybe they should try a real GA rather than simulated annealing. The problem though with GA in this case is the "cost function". How are they going to define a cost function for a model?


Crudely speaking, you use the energy that Rosetta calculates for that particular fold. However, in practice this doesn't work perfectly, as the energy calculations are, in places, approximated (for speed) and also not fully understood. Proteins can also have multiple low-energy states in different shapes, and things like pH can cause conformational change. Lastly, to illustrate: I was discussing this at the GECCO conference in Madrid last summer, and I was told of a time that Rosetta was given a sequence and a fold (derived from crystallography) and calculated that it could lower the energy further. This is probably wrong, as you can be pretty sure the natural fold is the lowest energy, which points to the likely possibility that the calculations are imperfect.
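
As a rough illustration of "use the energy as the cost function", here is a toy GA sketch. It has nothing to do with Rosetta's actual code: the one-dimensional `energy` function stands in for a real fold score, and `run_ga` with its parameters is invented for the example. Each "model" is a single number, the lower-energy half of the population survives, and children are made by recombining two parents and adding a small mutation.

```python
import random


def energy(x):
    """Toy stand-in for the energy score of a model (lower is better)."""
    return (x * x - 1.0) ** 2 + 0.3 * x


def run_ga(pop_size=20, generations=50, seed=0):
    """Minimise energy() with selection, recombination and mutation."""
    rng = random.Random(seed)
    population = [rng.uniform(-2.0, 2.0) for _ in range(pop_size)]
    initial_best = min(population, key=energy)
    for _ in range(generations):
        # Selection: keep the lower-energy half unchanged (elitism).
        population.sort(key=energy)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Recombination (average of two parents) plus Gaussian mutation.
            children.append(0.5 * (a + b) + rng.gauss(0.0, 0.1))
        population = parents + children
    return initial_best, min(population, key=energy)


initial_best, final_best = run_ga()
print(f"best initial model: x = {initial_best:.3f}, energy = {energy(initial_best):.3f}")
print(f"best final model:   x = {final_best:.3f}, energy = {energy(final_best):.3f}")
```

Because the surviving parents are carried over untouched, the best energy in the population can never get worse from one generation to the next. The caveats above still apply: the search is only as good as the energy function it is fed.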
9) Message boards : Rosetta@home Science : Genetic Algorithm (Message 79681)
Posted 2 Mar 2016 by Mark
Post:
Yes! On robetta we send off many individual jobs to create a population of models. These models are then recombined to create new population(s) of models iteratively.


Not sure this sounds like a "proper" GA. Do you have a paper/documentation on this?

Thanks



I actually think this is a "proper" GA. Appears they seek a "folding model" and the best way is to create a population of models. Models that perform the best are kept and mutated to see if the mutation is better or worse. Appears the goal is to find a best fit model until near perfection is met. They will probably use it across a set of proteins to determine if the model can reach lowest energy state in every set. Not a bad approach. Happy to be crunching with PhD Baker.


Yes, that is a description of how GAs work. However, in the case of Rosetta they use simulated annealing methods (rather than GA methods) to find local (and hopefully global) minima. The end results of all these folds are amalgamated to produce a graph like this:-



So it hinges on what is meant by "These models are then recombined to create new population(s) of models iteratively." Like I said, doesn't sound "GA - y"
10) Message boards : Rosetta@home Science : Genetic Algorithm (Message 79657)
Posted 29 Feb 2016 by Mark
Post:
Does Rosetta use a genetic algorithm to find a lowest energy state? See "Genetic Algorithms in Search, Optimization and Machine Learning" by David E. Goldberg.


Yes! On robetta we send off many individual jobs to create a population of models. These models are then recombined to create new population(s) of models iteratively.


Not sure this sounds like a "proper" GA. Do you have a paper/documentation on this?

Thanks
11) Message boards : Rosetta@home Science : has r@h folded the dna yet? (Message 79266)
Posted 21 Dec 2015 by Mark
Post:
https://www.sciencenews.org/article/art-dna-folding

could there possibly be n possible ways to fold it as well?

:o :D lol

Not really the same thing. DNA uses histones for folding, protein units that the DNA wraps around. These "beads on a string" arrangements then use more histones to form 30 nm wide chromatin. This chromatin then itself coils and folds.

The whole process is dynamic: the DNA that is exposed is transcribed, and the cell has different needs at different times, so DNA is folded up and exposed depending on environmental cues. In addition, modifications (e.g. methylation) turn genes off and on. For example, once you start growing a pair of eyes, you probably want to turn off the gene that starts the process!

Proteins, on the other hand, are machines whose function depends on shape. After translation, the protein sequence from the ribosome folds into its functional shape. It is this process that is hard to predict, and it is what Rosetta tries to predict (in nature, you sometimes need chaperone proteins to make the newly formed protein fold correctly).

Hope that helps
12) Message boards : Rosetta@home Science : Update from IPD (Message 79251)
Posted 19 Dec 2015 by Mark
Post:
As prompt updates of Rosetta news are a bit of a "development area" at the moment, I just thought I would point people to the IPD website where there is an update of progress

http://www.ipd.uw.edu/big-moves-in-protein-structure-prediction-and-design/
13) Message boards : Cafe Rosetta : desktop-gene-editing-lab (Message 79204)
Posted 13 Dec 2015 by Mark
Post:


http://www.popularme...ne-editing-lab/

Link not working, I assume you mean http://www.popularmechanics.com/science/health/a18487/biorealize-desktop-gene-editing-lab/?
14) Message boards : Rosetta@home Science : Designing proteins from scratch thanks toy you! (Message 79185)
Posted 11 Dec 2015 by Mark
Post:
This should be posted on the front page so people can see the research this project is doing thanks to the volunteers.


+1

another +1

The Rosetta team are really missing a trick by not giving this type of feedback more widely. It doesn't really cost anything and yet delivers for free. JFDI!!!
15) Message boards : Number crunching : First Skylake CPUs hit the streets (Message 78565)
Posted 8 Aug 2015 by Mark
Post:
Also, the Haswell generation (22nm) i7-5820k is a comparative bargain now at £250 for 6 cores so I'd go for that if I were in the market at the moment.


Really? I'd be interested at that price, where did you see it?

Newegg, but having looked further, it doesn't include VAT :(
http://www.newegg.com/global/uk/Product/Product.aspx?Item=N82E16819117402

Still, at £300 I'd go for it over the alternatives at the moment:
https://uk.pcpartpicker.com/part/intel-cpu-bx80648i75820k
http://www.aria.co.uk/Products/Components/Processors/Intel+CPUs/Core+i7+-+Socket+2011-v3+X99/Intel+i7-5820K+3.30GHz+%28Haswell-E%29+x6+Core+Processor+?productId=61708&source=pcpartpicker


Hmmm, the best I could find was 270 at http://aqalabs.com/epages/950018197.sf/en_GB/?ObjectPath=/Shops/950018197/Products/2107 so I was surprised to see your post. Hopefully the Skylake introduction will drop the price. We'll see.
16) Message boards : Number crunching : First Skylake CPUs hit the streets (Message 78557)
Posted 7 Aug 2015 by Mark
Post:
Also, the Haswell generation (22nm) i7-5820k is a comparative bargain now at £250 for 6 cores so I'd go for that if I were in the market at the moment.


Really? I'd be interested at that price, where did you see it?
17) Message boards : Rosetta@home Science : new supercomputer (Message 78379)
Posted 30 Jun 2015 by Mark
Post:
http://fossbytes.com...beam-splitting/

The link is broken, I assume you mean http://fossbytes.com/light-speed-computers-possible-with-beam-splitting/ ?
18) Message boards : Number crunching : R@H Scientists/Coders: An analysis of the Rosetta binaries... (Message 78362)
Posted 28 Jun 2015 by Mark
Post:
If you're going to examine this area, another option is llvm/clang which is at http://llvm.org/.

Sounds like you need an experienced computer scientist's input, if you don't mind me saying...
19) Message boards : Rosetta@home Science : Not a lot of updates in the past couple of months (Message 78228)
Posted 27 May 2015 by Mark
Post:
I have noticed that there have not been a lot of updates to the website's news or Twitter feed in the past couple of months. I always find it interesting to read what is going on behind the scenes but haven't been able to read it in a while. I was just wondering what kind of progress Rosetta is making? How much has actually been discovered? Can we expect a lot of new protein materials or drugs to be coming in the near future as a result of Rosetta?

Don't mean to be bothersome, would just like to know.


Agreed. You can sometimes find new Rosetta-related news at http://www.ipd.uw.edu/. For example, there is a paper on a new enzyme that was created and added into a pathway.
20) Message boards : Number crunching : Rosetta@home using AVX / AVX2 ? (Message 78200)
Posted 15 May 2015 by Mark
Post:
The executing code seems to be compiled for an i386 and uses the 387 floating-point 8-register stack model. The code (on my machine) spends about 5% of the time waiting for the "fmul st0,st1" ("====" below) to complete.

minirosetta_3.54_windows_x86_64.exe

Rosetta instruction clip ...

address instruction
0x6b3d82 add ebx, ecx
0x6b3d84 lea ebx, ptr [edi+ebx*8]
0x6b3d87 fld st0, qword ptr [edi+eax*8]
0x6b3d8a mov eax, dword ptr [ebp-0x20]
0x6b3d8d mov edi, dword ptr [ebp-0x14]
0x6b3d90 fmul st0, st1
0x6b3d92 inc ecx =========================
0x6b3d93 add eax, 0x8
0x6b3d96 fsubr st0, qword ptr [ebx]
0x6b3d98 add edx, 0x8


All CPUs from the Pentium 4 onwards (Nov. 2000 and newer) support the SSE2 register model. Simply adding the SSE2 target option to the builds would require the machines to have been made this century, but would use the SSE registers. The 16 directly addressable registers would reduce register stores to the stack and ease code scheduling (less shuffling of data around and more computation).

A simple recompile should make a noticeable difference without any side effects. If you compile for anything newer than SSE2, or for GPUs, you have to start worrying about and managing the population of target machines you deliver workloads to.

Beyond that, the developers would need to look more closely at the code.


Interesting. Which tool did you use to get that info may I ask?





©2024 University of Washington
https://www.bakerlab.org