Posts by Laurent

1) Message boards : Number crunching : The most efficient cruncher rig possible (Message 95204)
Posted 23 Apr 2020 by Laurent
They can and do boot without one. If a board doesn't boot without a GPU, something in the BIOS setup is asking for VRAM. Finding that setting can be tricky, but it's worth it.

Example: Intel's KF-series CPUs boost higher than the K series, even if you pair a K with a dedicated GPU. Even a disabled iGPU still eats into something.
2) Message boards : Number crunching : "Rosetta v4.12 i686-pc-linux-gnu" : fixed 20 h CPU time, fixed 20 credits (Message 94412)
Posted 13 Apr 2020 by Laurent
OK - then i686 = 32-bit and x64 = 64-bit?

But the needed solution is still: run only the 64-bit WUs on AMD CPUs.

Why is the 20-year-old i686 code still in use if the problem is known?

All current Intel and AMD CPUs for PCs (Windows) are based on the x86 architecture, introduced in 1978. i686 is the sixth generation and works just fine.

x86-64, also called AMD64, is the 64-bit extension of x86. It was invented by AMD and cross-licensed to Intel. All current PC CPUs, even the ones from Intel, include that extension, as well as all the previous ones (i286, i386, i486, ... up to roughly generation 12, depending on how you count the generations).

There is no problem with i686; there is a problem with Rosetta. You are barking up the wrong tree. Don't blame Intel or AMD.
3) Message boards : Number crunching : "Rosetta v4.12 i686-pc-linux-gnu" : fixed 20 h CPU time, fixed 20 credits (Message 94376)
Posted 13 Apr 2020 by Laurent

I have some broken WUs on my PC, new version 4.15
Why are INTEL686 WUs sent to AMD-PCs?
The x86 ones run perfectly; please keep these i686 WUs away from non-Intel PCs.

All AMD CPUs starting with the Athlon (K7) implement i686. That's ~2000, or 20 years ago. How old are your computers?

It is called i686 because Intel created the instruction set. AMD can run it (and Windows) because they bought the right to use the platform from Intel.

The SSSE3 problem popping up is related to 64-bit AMD CPUs not implementing a part of Intel's instruction set.
4) Message boards : Number crunching : Rosetta@home using AVX / AVX2 ? (Message 94085)
Posted 10 Apr 2020 by Laurent
Rjs5 said that introducing SSE2 is not so difficult (recompilation with some tricks), but I don't know if it is true

It is.

The keyword is auto-vectorization. It was already available in most of the better compilers somewhere around 2000-2005. I remember it kicking in for the Pentium's MMX extensions... Just as a reminder, that's the Pentium I in today's numbering.

Now it is often faster to just write clean code without any extras and tell the compiler to do the magic than to attempt the AVX/SSE/whatever magic yourself. Even the free Visual Studio tiers can do that. Bonus: compilers usually emit code that runs on ALL CPUs, unless you screw up the parameters. The code contains fallback paths that run when an extension isn't there. The only real advantage of dedicated exes for AVX, SSE, ... is slightly smaller exes (come on, we all download WUs way bigger than the exes...).

It's a different story for GPUs. Compilers are not yet smart enough to do that level of vectorization.
5) Message boards : Number crunching : GPU WU's (Message 92531)
Posted 29 Mar 2020 by Laurent
Have you seen the posts by rjs5? He is an expert on parallelism (AVX, etc.) and has been trying to help them along for years, but it is slow progress.

Yes, I have seen them, and the history (the no-SSE thread in 2015) makes me shiver. They could push through a lot more data just by using a competent compiler. Who knows how fast this thing would be on a good platform.

I'm an OpenCL developer and have already done such ports of scientific code. I offered help, they declined politely. That's life.
6) Message boards : Number crunching : GPU WU's (Message 92511)
Posted 29 Mar 2020 by Laurent
The short version: they are sitting on a huge code base, licensed in a way that limits open-source efforts and originally written to be highly modular. The single-threaded code used in BOINC can also run in MPI mode on clusters. Multiple universities have extended the high-level code parts, sometimes exploiting the inner mechanics of the low-level code.

Porting that without breaking code will not be easy. I'm totally with you regarding runtime and it being worth the effort. But this calls for full-time developers with formal training, not scientists doing development on the side. I'm not sure how much manpower Rosetta actually has for this, and I'm also not sure whether the commercial side of Rosetta has an interest in doing it.
7) Message boards : News : Rosetta's role in fighting coronavirus (Message 92487)
Posted 28 Mar 2020 by Laurent
Let's be honest:

Best case, they find something where the drug already exists. In that case, it will take at least one month (usually more) to get from the in-silico result to a WHO guideline and general availability of the drug. Most countries are way past that window of opportunity and will go through the first wave of infections without such a drug.

Second-best case: they find something that can be used to create a drug. Even using an emergency protocol (direct testing on humans), it will be at least 3 months, more likely 6, before the general population gets that shot.

This Pandora's box is wide open, and even knowing this, some countries still treat it like a seasonal flu. Germany currently expects roughly 70% of the population to get infected by the end of the year (if we stay in the current lockdown mode). The death rate is somewhere between 1% and 5%, depending on who (or WHO) you trust. That's somewhere between 0.5 million and 2.8 million COVID-19 deaths in Germany. Now look at the US and ask yourself what will happen worldwide after COVID-19 has become endemic there.

The next few Pandora's boxes are already waiting for us. We had one every few years (SARS, MERS, Ebola, ...). Those upcoming ones might be even scarier, faster, deadlier. Example: mumps in adult males causes the sperm count to drop permanently (anti-vaxxers might take notes now). The effect is minor, usually not enough to make males infertile. Now assume a variant causes a bit more damage, and in females too. Can you imagine a world after something sweeps whole continents mostly uncontrolled / unnoticed because nobody has a test for it?

Science is working on the next Pandora's boxes and should hopefully stay at least a few years ahead of the curve. The corona family was known beforehand; that's why we can test for infections, and that's why we can track it (20-40% of cases are asymptomatic). That's also why switching to a COVID-19-only mode is not a good idea, even if it keeps some users around longer.

Science needs time and resources. Swamping science with resources AFTER the fact doesn't work.
8) Message boards : News : Rosetta's role in fighting coronavirus (Message 92349)
Posted 26 Mar 2020 by Laurent
As an advise from an internet stranger: pick the most commonly us

I cannot clearly understand this...

I typed a lengthy post, hit "submit" and got a server error. I checked whether the post made it, saw nothing, so I repeated like 10x. Still got errors, lost interest, aborted. You see the end result: 5 partial posts (if an admin could please trim the mess down to one post...?)

The short version of the end:

Pick the most commonly used class combinations and build a monolithic GPU implementation (like the one used for most Rosetta BOINC WUs). Don't care too much about making modules, but do care a lot about data layout on the GPU. Modules and such come later, but getting to a data layout that is actually efficient (a low number of copies between GPU and host) is hard. Making the data layout modular in the first go will bite you hard later.
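To make the data-layout point concrete, a toy sketch (all names are made up, nothing to do with Rosetta's actual classes): the array-of-structures form is convenient on the host, but the structure-of-arrays form lets neighbouring GPU threads read neighbouring memory (coalesced access) and lets you move each field to the device in one large transfer instead of many small scattered ones.

```c
#define N_ATOMS 1024

/* AoS: natural on the CPU, scattered access pattern on the GPU */
struct atom { float x, y, z, charge; };

/* SoA: one flat buffer per field, GPU-friendly */
struct atoms_soa {
    float x[N_ATOMS], y[N_ATOMS], z[N_ATOMS], charge[N_ATOMS];
};

/* One-time repacking on the host before uploading to the device. */
void aos_to_soa(const struct atom *in, struct atoms_soa *out) {
    for (int i = 0; i < N_ATOMS; ++i) {
        out->x[i]      = in[i].x;
        out->y[i]      = in[i].y;
        out->z[i]      = in[i].z;
        out->charge[i] = in[i].charge;
    }
}
```

Deciding up front which fields live permanently on the GPU (and repacking once, as above) is exactly the kind of layout decision that is painful to retrofit into a "modular" design later.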

I also wrote something about not aiming for MPI and multiple GPUs in the first go. I have seen a few do-it-all port attempts fail hard because taking a rare corner case into consideration made the data layout on the GPU really, really awful. KISS (keep it simple and stupid) is king in GPU land. Adding multiple GPUs / nodes later and using them in a task-parallel way is very often easier and more efficient (in coding time and runtime).

BTW, and very off-topic: I saw your post about tn-grid and GPUs. Thanks for the tip, that one looks doable in 2 weeks. I contacted valterc (I picked the last admin posting anything GPU-related), hoping to get a bit of feedback on the attempt made by Daniel.
9) Message boards : News : Rosetta's role in fighting coronavirus (Message 92298)
Posted 25 Mar 2020 by Laurent
Unfortunately I don't think so. This is very much a long slog by the same people that wrote Rosetta the first two times. Much of the challenge here is trying to figure out how to redo the core algorithms, but in a parallel fashion (with performance being #1).

Pity, it sounded like a fun project. But I understand the answer after reading a bit on Rosetta Commons.

I wondered about the internal structure and found the page. The concept of interchangeable scoring functions and, I assume, interchangeable minimization procedures is a tough call for porting to the GPU. An efficient GPU implementation needs all those parts (Pose, MoveMap, Packer, scoring and the min procedure) on the GPU, with a data structure adapted to GPU systems. Doing that is easy if all components are known and well maintained and no third party can add stuff. I can guess where the code comes from based on the papers, and having worked with high-performance code written by science people, that will not be fun.

That GPU port will need to cut corners in some places. They probably cannot get it all running in one go. As an advise from an internet stranger: pick the most commonly us
10) Message boards : News : Rosetta's role in fighting coronavirus (Message 92284)
Posted 25 Mar 2020 by Laurent
But yes, we want GPUs just as bad as you do. And while we are making progress with the GPU Rosetta, they've only recreated about 1% of the whole infrastructure so far.

I do OpenCL coding for a living and I'm between jobs.

Any 2-3 weeks tasks?

©2024 University of Washington