Rosetta@home using AVX / AVX2 ?

[VENETO] boboviz
Joined: 1 Dec 05
Posts: 1640
Credit: 6,465,179
RAC: 256
Message 90824 - Posted: 6 Jun 2019, 7:20:02 UTC

GIMPS project introduces AVX512 support

:-P
ID: 90824 · Rating: 0
rjs5
Joined: 22 Nov 10
Posts: 272
Credit: 19,170,835
RAC: 399
Message 90832 - Posted: 10 Jun 2019, 20:44:48 UTC - in response to Message 90824.  

GIMPS project introduces AVX512 support

:-P


PrimeGrid uses the GIMPS code. I am seeing the benefit on my 9980XE machine. I have not measured it accurately, but it looks like about a 30% improvement on PrimeGrid LLR. When a dense AVX application starts crunching, the CPU throttles its clock back because of the higher power usage.

This highlighted an issue with some CPUs. Some CPUs are designed with one AVX unit per core instead of one AVX unit per thread, so only one of the threads on that core can use the AVX unit at a time. On such systems, the resulting bottleneck can make the application run slower than it would without AVX-512.

"Performance" is a fickle thing.
ID: 90832 · Rating: 0
G.L.I.S.
Joined: 25 Dec 08
Posts: 23
Credit: 1,170,926
RAC: 0
Message 90835 - Posted: 11 Jun 2019, 20:29:55 UTC - in response to Message 90698.  

There is also a PrimeGrid app; the catch is that only Intel CPUs can run it, and at much higher CPU power consumption.
Mind you, an SSE2 / SSE3 build would be fine for me too ... even Phenom (K10) CPUs can handle those.
ID: 90835 · Rating: 0
G.L.I.S.
Joined: 25 Dec 08
Posts: 23
Credit: 1,170,926
RAC: 0
Message 90836 - Posted: 11 Jun 2019, 20:42:23 UTC - in response to Message 90835.  
Last modified: 11 Jun 2019, 21:06:42 UTC

There is also a PrimeGrid app; the catch is that only Intel CPUs can run it, and at much higher CPU power consumption.
https://blog.cloudflare.com/on-the-dangers-of-intels-frequency-scaling/

Mind you, an SSE2 / SSE3 build would be fine for me too ... even Phenom (K10) CPUs can handle those.

Of course, less time but the same credit score (imho).
ID: 90836 · Rating: 0
[VENETO] boboviz
Joined: 1 Dec 05
Posts: 1640
Credit: 6,465,179
RAC: 256
Message 90838 - Posted: 12 Jun 2019, 21:55:05 UTC

Ryzen 3xxx will have 256-bit AVX units (Ryzen 1xxx and 2xxx have only 128-bit ones):
AMD has improved IPC by roughly 15% (though that can vary by workload) doubled the L3 cache size to keep data as close to the execution units as possible, and doubled floating point performance by stepping up to two 256-bit floating point units (FPUs) that enable support for AVX2 instructions.

ID: 90838 · Rating: 0
mmonnin
Joined: 2 Jun 16
Posts: 51
Credit: 9,748,798
RAC: 36,091
Message 90841 - Posted: 14 Jun 2019, 1:52:42 UTC

To clarify, Zen 2 will execute a 256-bit AVX2 operation in a single cycle. Zen/Zen+ can run AVX2 too, but splits each 256-bit operation into two 128-bit halves, so it needs 2 cycles to complete it.
ID: 90841 · Rating: 0
G.L.I.S.
Joined: 25 Dec 08
Posts: 23
Credit: 1,170,926
RAC: 0
Message 91292 - Posted: 20 Oct 2019, 13:24:12 UTC

Greetings to all. I believe the applications that deal with proteins in Folding@home have fast SIMD code paths.
Personally, I hope (for the sake of the whole project) that we will not have to get through all of 2020 without Rosetta@home providing them properly.

Byez
ID: 91292 · Rating: 0
G.L.I.S.
Joined: 25 Dec 08
Posts: 23
Credit: 1,170,926
RAC: 0
Message 91295 - Posted: 20 Oct 2019, 22:26:13 UTC - in response to Message 91292.  

I would add that you could start with Linux, and also take the ARM platform into account,
since these crunching devices are spreading as well:

https://www.google.it/search?q=arm+hardkernel+odroid-n2&newwindow=1&sxsrf=ACYBGNRTcA7SJ1QcOdNKgjTC9WppFijsNg:1571610265481&source=lnms&tbm=isch&sa=X&ved=0ahUKEwj8s-W88KvlAhXko4sKHd9zBCwQ_AUIFCgD&biw=1485&bih=929
ID: 91295 · Rating: 0
[VENETO] boboviz
Joined: 1 Dec 05
Posts: 1640
Credit: 6,465,179
RAC: 256
Message 91412 - Posted: 28 Nov 2019, 9:08:39 UTC

The new Threadrippers also have full AVX2 support.

P.S.
New 3960X and 3970X are monsters
ID: 91412 · Rating: 0
[VENETO] boboviz
Joined: 1 Dec 05
Posts: 1640
Credit: 6,465,179
RAC: 256
Message 91689 - Posted: 13 Feb 2020, 9:59:10 UTC - in response to Message 91412.  

New 3960X and 3970X are monsters

And this is a super-monster.
ID: 91689 · Rating: 0
[VENETO] boboviz
Joined: 1 Dec 05
Posts: 1640
Credit: 6,465,179
RAC: 256
Message 91758 - Posted: 20 Feb 2020, 9:37:40 UTC
Last modified: 20 Feb 2020, 9:37:53 UTC

C++20 is ready and will be published in a few months!!
C++20, the most impactful revision of C++ in a decade, is done!
At the ISO C++ Committee meeting in Prague, hosted by Avast, we completed the C++20 Committee Draft and voted to send the Draft International Standard (DIS) out for final approval and publication.
The following notable features are in C++20:
Modules
Coroutines
Concepts
Ranges
and a lot of other features

ID: 91758 · Rating: 0
[VENETO] boboviz
Joined: 1 Dec 05
Posts: 1640
Credit: 6,465,179
RAC: 256
Message 92035 - Posted: 17 Mar 2020, 16:56:22 UTC

ID: 92035 · Rating: 0
[VENETO] boboviz
Joined: 1 Dec 05
Posts: 1640
Credit: 6,465,179
RAC: 256
Message 93714 - Posted: 7 Apr 2020, 7:20:35 UTC

It now seems that version 4.12 is a native 64-bit version for Windows.
Great!
It's time for SSEx/Avx support??
ID: 93714 · Rating: 0
Jan Vaclavik
Joined: 26 Sep 05
Posts: 5
Credit: 340,797
RAC: 51
Message 93872 - Posted: 8 Apr 2020, 15:12:37 UTC - in response to Message 93714.  

It now seems that version 4.12 is a native 64-bit version for Windows.
Great!
It's time for SSEx/Avx support??
Are there even any x86-64 CPUs without at least SSE2 support?
ID: 93872 · Rating: 0
[VENETO] boboviz
Joined: 1 Dec 05
Posts: 1640
Credit: 6,465,179
RAC: 256
Message 93912 - Posted: 8 Apr 2020, 21:13:01 UTC - in response to Message 93872.  
Last modified: 8 Apr 2020, 21:13:19 UTC

It's time for SSEx/Avx support??

Are there even any x86-64 CPUs without at least SSE2 support?


SSE2 from Wiki:
Introduced by Intel with the initial version of the Pentium 4 in 2000... AMD added support for SSE2 in 2003

ID: 93912 · Rating: 0
Klimax
Joined: 27 Apr 07
Posts: 35
Credit: 1,712,432
RAC: 0
Message 93952 - Posted: 9 Apr 2020, 9:31:16 UTC - in response to Message 93912.  

It's time for SSEx/Avx support??

Are there even any x86-64 CPUs without at least SSE2 support?


SSE2 from Wiki:
Introduced by Intel with the initial version of the Pentium 4 in 2000... AMD added support for SSE2 in 2003

It should be noted, that x86-64 mandates SSE2 support and as such any 64-bit CPU supports it.
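
As a small illustration of that point, a build can verify the SSE2 baseline at compile time. This is only a sketch: the helper names `targets_x86_64` and `built_with_sse2` are made up for the example, and it relies on the predefined macros `__x86_64__` / `__SSE2__` (GCC, Clang) and `_M_X64` / `_M_IX86_FP` (MSVC).

```cpp
// Hypothetical helpers; only the predefined compiler macros are real.

// True when the compiler is generating code for a 64-bit x86 target.
constexpr bool targets_x86_64() {
#if defined(__x86_64__) || defined(_M_X64)
    return true;
#else
    return false;
#endif
}

// True when the compiler is allowed to emit SSE2 instructions.
// MSVC does not define __SSE2__, so its own macros are checked as well.
constexpr bool built_with_sse2() {
#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
    return true;
#else
    return false;
#endif
}

// On an x86-64 target, SSE2 is part of the baseline, so this should never fire
// unless SSE2 was explicitly disabled on the command line.
static_assert(!targets_x86_64() || built_with_sse2(),
              "x86-64 target without SSE2 enabled");
```

On a 32-bit x86 build the assertion is vacuous, which is exactly the case where a separate SSE2 build would still make a difference.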
ID: 93952 · Rating: 0
Jan Vaclavik
Joined: 26 Sep 05
Posts: 5
Credit: 340,797
RAC: 51
Message 93956 - Posted: 9 Apr 2020, 9:53:23 UTC - in response to Message 93912.  

SSE2 from Wiki:
Introduced by Intel with the initial version of the Pentium 4 in 2000... AMD added support for SSE2 in 2003

I know, but back in the day there were more manufacturers, such as VIA, and Intel sometimes released CPUs, such as the Atom line, that did not support all the instruction sets.
But it seems you are right: all x86-64 CPUs support SSE2 and, except for the first AMD K8, they also support SSE3.
ID: 93956 · Rating: 0
[VENETO] boboviz
Joined: 1 Dec 05
Posts: 1640
Credit: 6,465,179
RAC: 256
Message 93957 - Posted: 9 Apr 2020, 10:20:53 UTC - in response to Message 93956.  
Last modified: 9 Apr 2020, 10:21:22 UTC

But it seems like you are right - all x86-64 CPUs support SSE2 and except the first AMD K8 they support SSE3.

As I said in the past, even if an SSE2 version gives only 0.5% more computational power, the gain on the top ten systems alone will exceed the contribution of all the remaining old systems that don't support this extension.
Rjs5 said that introducing SSE2 is not so difficult (recompilation with some tricks), but I don't know if that is true.
ID: 93957 · Rating: 0
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1366
Credit: 13,624,788
RAC: 0
Message 93960 - Posted: 9 Apr 2020, 10:40:21 UTC - in response to Message 93957.  

Rjs5 said that introducing SSE2 is not so difficult (recompilation with some tricks), but i don't know if it is true
I'm thinking different compilation flags, making sure mathsafe or similar is used, then checking that the resulting output of the application is as expected.
Grant
Darwin NT
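
The "check the output" step can be sketched roughly like this. It assumes the main risk is floating-point reordering: a vectorized sum keeps several partial sums (one per SIMD lane) and combines them at the end, which rounds differently from a strict left-to-right sum. Options that relax floating-point rules, such as GCC's `-ffast-math`, are what permit that reordering for reductions. All function names here are hypothetical, and the tolerance is an arbitrary choice for the example, not a Rosetta value.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Reference: strict left-to-right summation, the order plain scalar code uses.
float sum_sequential(const float* v, std::size_t n) {
    float acc = 0.0f;
    for (std::size_t i = 0; i < n; ++i) acc += v[i];
    return acc;
}

// Mimics a 4-lane SIMD reduction: four partial sums, combined at the end.
float sum_four_lanes(const float* v, std::size_t n) {
    float lane[4] = {0.0f, 0.0f, 0.0f, 0.0f};
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4)
        for (int k = 0; k < 4; ++k) lane[k] += v[i + k];
    float acc = (lane[0] + lane[1]) + (lane[2] + lane[3]);
    for (; i < n; ++i) acc += v[i];  // leftover elements
    return acc;
}

// Accept results that agree within a relative tolerance instead of bit-for-bit.
bool close_enough(double a, double b, double rel_tol) {
    double scale = std::fmax(std::fmax(std::fabs(a), std::fabs(b)), 1.0);
    return std::fabs(a - b) <= rel_tol * scale;
}

// Example validation step: both summation orders must agree within tolerance.
void check_sums_agree(const float* v, std::size_t n) {
    assert(close_enough(sum_sequential(v, n), sum_four_lanes(v, n), 1e-4));
}
```

For a project that validates results across hosts, the real question is whether the validator's tolerance already covers differences of this size; that is something only the project can answer.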
ID: 93960 · Rating: 0
Laurent
Joined: 15 Mar 20
Posts: 14
Credit: 88,800
RAC: 0
Message 94085 - Posted: 10 Apr 2020, 15:34:10 UTC - in response to Message 93957.  

Rjs5 said that introducing SSE2 is not so difficult (recompilation with some tricks), but i don't know if it is true


It is.

The keyword is auto-vectorization. It was already available in most of the better compilers sometime around 2000-2005. I remember it kicking in for the Pentium MMX extensions... Just as a reminder, that's Pentium I in today's numbering.
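
To make "auto-vectorization" concrete, here is a minimal sketch of the kind of loop compilers tend to vectorize on their own: independent iterations, unit-stride access, no function calls. The function name `saxpy` is just the conventional name for this operation, not anything from Rosetta. With GCC, vectorization is typically enabled at `-O3` (or with `-ftree-vectorize`) and `-fopt-info-vec` reports which loops were vectorized; Clang has `-Rpass=loop-vectorize` and MSVC has `/Qvec-report:1` for the same purpose.

```cpp
#include <cassert>
#include <cstddef>

// y[i] = a * x[i] + y[i]
// Each iteration is independent of the others, so the compiler is free to
// process several elements per instruction (4 floats with SSE, 8 with AVX).
// x and y could overlap, so compilers typically add a run-time overlap check
// and keep a scalar version of the loop for that case.
void saxpy(float a, const float* x, float* y, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}

// Small self-check with values that are exact in binary floating point.
void saxpy_self_check() {
    float x[4] = {1.0f, 2.0f, 3.0f, 4.0f};
    float y[4] = {10.0f, 20.0f, 30.0f, 40.0f};
    saxpy(2.0f, x, y, 4);
    assert(y[0] == 12.0f && y[1] == 24.0f && y[2] == 36.0f && y[3] == 48.0f);
}
```

Which instruction set the vectorized loop uses still depends on the target flags (for example `-msse2` versus `-mavx2`), so the source stays the same while the build decides the SIMD level.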

Now it is often faster to just write clean code without any extras and tell the compiler to do the magic than to attempt the AVX/SSE/whatever magic yourself. Even the free Visual Studio tiers can do that. Bonus: compilers usually emit code that runs on ALL CPUs, unless you screw it up in the parameters. The code contains fall-back paths that run if an extension is not there. The only real advantage of dedicated exes for AVX, SSE,... is slightly smaller exes (come on, we all download WUs way bigger than the exes...)
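
One hedge on the fall-back point: as far as I know, GCC and Clang build for a single baseline target by default, and a multi-path binary is something you opt into (function multiversioning, or a manual check like the one below). The sketch assumes GCC or Clang on x86 and uses their `target` attribute and `__builtin_cpu_supports`; the function names and the `HAVE_AVX2_PATH` macro are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Portable baseline: runs on any CPU the binary was built for.
static void add_arrays_baseline(const float* a, const float* b, float* out,
                                std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) out[i] = a[i] + b[i];
}

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#define HAVE_AVX2_PATH 1
// Same source, but the compiler may use AVX2 instructions in this function only.
__attribute__((target("avx2")))
static void add_arrays_avx2(const float* a, const float* b, float* out,
                            std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) out[i] = a[i] + b[i];
}
#endif

// Choose the implementation at run time from what the CPU reports.
void add_arrays(const float* a, const float* b, float* out, std::size_t n) {
#ifdef HAVE_AVX2_PATH
    if (__builtin_cpu_supports("avx2")) {
        add_arrays_avx2(a, b, out, n);
        return;
    }
#endif
    add_arrays_baseline(a, b, out, n);
}

// Both paths must give the same answer; element-wise addition is exact here.
void add_arrays_self_check() {
    const float a[4] = {1.0f, 2.0f, 3.0f, 4.0f};
    const float b[4] = {4.0f, 3.0f, 2.0f, 1.0f};
    float out[4] = {0.0f, 0.0f, 0.0f, 0.0f};
    add_arrays(a, b, out, 4);
    for (int i = 0; i < 4; ++i) assert(out[i] == 5.0f);
}
```

With this layout a single exe serves both old and new CPUs, at the cost of carrying two copies of the hot loop, which matches the size trade-off mentioned above.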

It's a different thing for GPUs. Compilers are not yet smart enough to do that level of vectorization.
ID: 94085 · Rating: 0

©2022 University of Washington
https://www.bakerlab.org