My PC RAC > 3000

Message boards : Number crunching : My PC RAC > 3000

Michael G.R.

Joined: 11 Nov 05
Posts: 264
Credit: 11,247,510
RAC: 0
Message 33436 - Posted: 26 Dec 2006, 6:57:29 UTC

On FUD, I'm absolutely with you, Who?; it sucks and shouldn't exist.

But unfortunately, it seems to come from all sides (both Intel and AMD, depending on who's leading at the time and in what area), probably because they know that the general public doesn't know sh*t and they'll swallow anything. Even journalists keep repeating FUD without checking. Sad but true.
FluffyChicken
Joined: 1 Nov 05
Posts: 1260
Credit: 369,635
RAC: 0
Message 33443 - Posted: 26 Dec 2006, 9:29:26 UTC

Sorry I had to make you spell it out Who? Just thought it would calm the atmosphere in here :-D Which it has.

I also just thought the Max Payne part was a bit of fun because your names (you and Mat) are next to each other on the imdb for it.

Merry Christmas and have a good new year.


Team mauisun.org
Who?
Joined: 2 Apr 06
Posts: 213
Credit: 1,366,981
RAC: 0
Message 33630 - Posted: 28 Dec 2006, 17:27:06 UTC - in response to Message 33443.  

Sorry I had to make you spell it out Who? Just thought it would calm the atmosphere in here :-D Which it has.

I also just thought the Max Payne part was a bit of fun because your names (you and Mat) are next to each other on the imdb for it.

Merry Christmas and have a good new year.




So happy! I am in the Top 20 contributors using only 4 machines :)

As Hothardware.com put it, "the Core 2 Extreme QX6700 is easily the fastest processor on the planet in today's multithreaded applications and is well poised to retain that title with the multitude of multithreaded applications on the horizon. Although expensive, there is no other processor on the planet which can match this product's speed. If you have the cash, accept no substitutes."

Easily ... I guess the FSB is really not saturating ;-)

who?
paulcsteiner
Joined: 15 Oct 05
Posts: 19
Credit: 3,132,449
RAC: 504
Message 33635 - Posted: 28 Dec 2006, 17:57:16 UTC - in response to Message 33630.  

So happy! I am in the Top 20 contributors using only 4 machines :)

As Hothardware.com put it, "the Core 2 Extreme QX6700 is easily the fastest processor on the planet in today's multithreaded applications and is well poised to retain that title with the multitude of multithreaded applications on the horizon. Although expensive, there is no other processor on the planet which can match this product's speed. If you have the cash, accept no substitutes."

Easily ... I guess the FSB is really not saturating ;-)

who?

Amazing, congrats, Who?! They must be paying you well to afford that hot rod!
Have you thought about having flames painted on the sides of the case, or maybe some racing stripes? A question for you though: is this monster fitted with a heatsink and fan combo, or is there some liquid cooling going on? I'm one for standing around looking at what's under the hood, so post a pic if you could; I'll provide the beer.
Rock on, Who?
PCS
The_Bad_Penguin
Joined: 5 Jun 06
Posts: 2751
Credit: 4,271,025
RAC: 0
Message 33637 - Posted: 28 Dec 2006, 18:09:33 UTC - in response to Message 33635.  

Pics would be cool!

so post a pic if you could; I'll provide the beer.
Rock on, Who?
PCS

paulcsteiner
Joined: 15 Oct 05
Posts: 19
Credit: 3,132,449
RAC: 504
Message 33666 - Posted: 29 Dec 2006, 3:00:43 UTC

Mmmmm Beer,...
Who?
Joined: 2 Apr 06
Posts: 213
Credit: 1,366,981
RAC: 0
Message 33667 - Posted: 29 Dec 2006, 4:29:35 UTC - in response to Message 33666.  



Stay tuned. I think one of my evangelist buddies can put together a little page somewhere with some pics; it just needs to get approved ... I am an engineer; when it comes to marketing, I am far from being smart about it ... ;-) and it was not the goal of my PC.


who?
Who?
Joined: 2 Apr 06
Posts: 213
Credit: 1,366,981
RAC: 0
Message 33878 - Posted: 1 Jan 2007, 20:39:35 UTC - in response to Message 33667.  

What about a RAC of 3600 on the top 1 machine?

who?
zombie67 [MM]
Joined: 11 Feb 06
Posts: 316
Credit: 6,589,590
RAC: 198
Message 33895 - Posted: 2 Jan 2007, 5:10:25 UTC - in response to Message 33878.  

What about a RAC of 3600 on the top 1 machine?


Sweet! I am hoping Apple will release a dual C2Q soon. I will be first in line!
Reno, NV
Team: SETI.USA
MikeMarsUK

Joined: 15 Jan 06
Posts: 121
Credit: 2,637,872
RAC: 0
Message 34075 - Posted: 4 Jan 2007, 12:44:25 UTC

I have a performance question for Who?, although it's not related to Rosetta :

The CPDN coupled model uses Intel Fortran 9.0 for the Windows version, and 9.1 for the Intel Mac version in beta testing, but AFAIK everything is the same in terms of compiler switches and so forth.

The model is very demanding on the system memory, much more so than Rosetta (latency and bandwidth affect model performance significantly). I understand the Macs tend to have better memory than the typical PC?

The Mac version performs significantly faster than the Windows version, around 20-25%. Overall it looks like a factory-standard Intel Mac performs better than a top-spec overclocked Core2 Windows PC with fancy memory. Do you think this is more likely to be due to changes in the compiler, or due to a possibly-superior Mac memory architecture?

Quad thread including Intel Mac post:
http://www.climateprediction.net/board/viewtopic.php?p=55450#55450

Intel Mac post:
http://www.climateprediction.net/board/viewtopic.php?p=55399#55399

Who?
Joined: 2 Apr 06
Posts: 213
Credit: 1,366,981
RAC: 0
Message 34122 - Posted: 4 Jan 2007, 22:15:43 UTC - in response to Message 34075.  
Last modified: 4 Jan 2007, 22:17:00 UTC

I have a performance question for Who?, although it's not related to Rosetta :

The CPDN coupled model uses Intel Fortran 9.0 for the Windows version, and 9.1 for the Intel Mac version in beta testing, but AFAIK everything is the same in terms of compiler switches and so forth.

The model is very demanding on the system memory, much more so than Rosetta (latency and bandwidth affect model performance significantly). I understand the Macs tend to have better memory than the typical PC?

The Mac version performs significantly faster than the Windows version, around 20-25%. Overall it looks like a factory-standard Intel Mac performs better than a top-spec overclocked Core2 Windows PC with fancy memory. Do you think this is more likely to be due to changes in the compiler, or due to a possibly-superior Mac memory architecture?

Quad thread including Intel Mac post:
http://www.climateprediction.net/board/viewtopic.php?p=55450#55450

Intel Mac post:
http://www.climateprediction.net/board/viewtopic.php?p=55399#55399


First, this is my personal answer; I am not representing my company here.

The S5000 chipset has very good memory bandwidth, and with the help of the large L2 cache, the latencies are hidden. FB-DIMM has a slightly longer latency than DDR2-800.
On Mac, the compiler is probably a derivative of the Intel compiler, and on Windows you have to deal with "old" CPUs that do not support SSE2. As a consequence, under Windows you have to compile with more conservative flags: you have to go back to Katmai (PIII) to make everybody happy, while on Mac all computers support SSE2 (Yonah is the minimum on Intel Macs).
This may not sound like an advantage, but it is. Apple is a very smart company, and their compiler generates tons of SSE2 code by default; they live on the edge and support any novel idea. I wish MS could behave this way :)

To answer you, it is a little of both: the Xeon platform has a better memory subsystem when the prefetchers are successful, and the compiler helps too!

By the way, my dual Xeon on S5000 just passed 3800 RAC average :-)

who?
PS: please do not ask me too often to comment about Mac. I would prefer that an Apple person do it; they know their product very well. The only thing I can tell you is that I love their product. I am learning the Mac OS API, and I love it!
MikeMarsUK

Joined: 15 Jan 06
Posts: 121
Credit: 2,637,872
RAC: 0
Message 34160 - Posted: 5 Jan 2007, 12:58:17 UTC


That's extremely interesting, thank you :-)

I think Intel Fortran 9.0 can do SSE2 if available, but IIRC the CPDN project had to turn it off since the climate model became unrealistic after a long run time. (A climate model is an iterative process; even the slightest difference in result after one timestep will be gradually magnified by the time the model finishes: the so-called 'butterfly effect'. There are over 4 million iterations before the model finishes, and this takes 2 months of CPU time on a Core 2 box.)
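
To illustrate the sensitivity described above, here is a minimal Python sketch. It does not use the CPDN model; it swaps in the logistic map, a standard toy chaotic system, and the parameter values (r = 3.9, a starting difference of 1e-12, 200 steps) are assumptions chosen only for illustration. The point is that two runs whose state differs by a rounding-sized amount end up on visibly different trajectories after enough iterations.

```python
def logistic_step(x, r=3.9):
    """One iteration of the logistic map, a toy stand-in for a model timestep."""
    return r * x * (1.0 - x)


def run(x0, steps, r=3.9):
    """Iterate the map and return the full trajectory, including the start."""
    trajectory = [x0]
    for _ in range(steps):
        trajectory.append(logistic_step(trajectory[-1], r))
    return trajectory


def divergence(x0=0.4, delta=1e-12, steps=200):
    """Absolute gap, step by step, between two runs that start delta apart."""
    a = run(x0, steps)
    b = run(x0 + delta, steps)
    return [abs(u - v) for u, v in zip(a, b)]


if __name__ == "__main__":
    gaps = divergence()
    for step in (1, 25, 50, 100, 200):
        print(f"step {step:3d}: gap = {gaps[step]:.3e}")
```

The gap starts far below anything visible and grows until the two runs are unrelated, which is the toy analogue of a slightly different rounding mode changing a long climate run.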

But I guess Mac's 9.1 compiler won't have the overhead of detecting which instruction sets are available, and swapping between the two alternative code fragments (the SSE2 compiled model could run OK on non-SSE2 boxes).

It'll be interesting to see if SSE4 can help :-)

Who?
Joined: 2 Apr 06
Posts: 213
Credit: 1,366,981
RAC: 0
Message 34223 - Posted: 6 Jan 2007, 17:43:44 UTC - in response to Message 34160.  


That's extremely interesting, thank you :-)

I think Intel Fortran 9.0 can do SSE2 if available, but IIRC the CPDN project had to turn it off since the climate model became unrealistic after a long run time. (A climate model is an iterative process; even the slightest difference in result after one timestep will be gradually magnified by the time the model finishes: the so-called 'butterfly effect'. There are over 4 million iterations before the model finishes, and this takes 2 months of CPU time on a Core 2 box.)

But I guess Mac's 9.1 compiler won't have the overhead of detecting which instruction sets are available, and swapping between the two alternative code fragments (the SSE2 compiled model could run OK on non-SSE2 boxes).

It'll be interesting to see if SSE4 can help :-)


Many people use the flag /QaxW or /QaxP. If they want their code to be SSE2 only, they should be using /QxW; the "a" guarantees full compatibility with previous processors and AMD "features".
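
The difference between the two flag styles, as described in this post, can be pictured as runtime dispatch: a binary that carries both a generic path and an SSE2 path and picks one at startup, versus a binary that contains only the SSE2 path. The Python sketch below only illustrates that idea; it is not how the Intel compiler implements it, and every name in it (`select_kernel`, the feature strings, the two kernels) is hypothetical.

```python
def dot_generic(a, b):
    """Baseline path: stands in for code any x86 CPU can run."""
    total = 0.0
    for x, y in zip(a, b):
        total += x * y
    return total


def dot_sse2(a, b):
    """Stand-in for a vectorized path that keeps two partial sums.

    Plain Python, not real SSE2; it only mimics a different code path
    (and a different summation order) for the same computation.
    """
    even = sum(x * y for x, y in zip(a[0::2], b[0::2]))
    odd = sum(x * y for x, y in zip(a[1::2], b[1::2]))
    return even + odd


def select_kernel(cpu_features, allow_fallback=True):
    """Pick a kernel from the (hypothetical) set of detected CPU features."""
    if "sse2" in cpu_features:
        return dot_sse2
    if allow_fallback:
        return dot_generic
    raise RuntimeError("this build requires SSE2 and has no generic path")
```

With `allow_fallback=True` the sketch behaves like the dual-path build; with `allow_fallback=False` it behaves like the SSE2-only build and refuses to run on an older CPU. The two paths add the products in a different order, which is one way results can differ slightly between code paths.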

Do you know if they threaded the workload properly?
80-core machines are coming, giving 1 teraflop per socket; this kind of problem should be running on that kind of machine.
The pure SSE4 binaries will for sure impress the community. Feel free to dig for my email; whoever wants to optimize a BOINC project is welcome to contact me. I am very proud of never slowing down AMD when I touch code.


who?
May the Core be with you! ;)
DHG
Joined: 18 Dec 06
Posts: 11
Credit: 8,277
RAC: 0
Message 34224 - Posted: 6 Jan 2007, 18:10:34 UTC - in response to Message 34223.  



Many people use the flag /QaxW or /QaxP. If they want their code to be SSE2 only, they should be using /QxW; the "a" guarantees full compatibility with previous processors and AMD "features".

Do you know if they threaded the workload properly?
80-core machines are coming, giving 1 teraflop per socket; this kind of problem should be running on that kind of machine.
The pure SSE4 binaries will for sure impress the community. Feel free to dig for my email; whoever wants to optimize a BOINC project is welcome to contact me. I am very proud of never slowing down AMD when I touch code.


who?
May the Core be with you! ;)


Ah, the TeraScale idea; it's the platform for 2015, if I recall correctly. I can't wait to see that come into common hands, instead of just prototypes and white papers (by this I mean being able to read architecture-specific material rather than things like those "flyers"). Also, that 80-core thing, wasn't it more of an interconnect 'study' than a real CPU design?

But I wonder why we can't have different versions of the application, like Simap? They have a SSE and a non-SSE version. If the Rosetta dev team would expand on that idea and make SSE -> SSSE3 versions, then all CPUs would perform the best they can, wouldn't they? Perhaps I'm assuming that it wouldn't take too much work, but since I'm more into hardware than software, it's easy for me to take those things a bit too lightly.
Who?
Joined: 2 Apr 06
Posts: 213
Credit: 1,366,981
RAC: 0
Message 34233 - Posted: 6 Jan 2007, 19:05:06 UTC - in response to Message 34224.  



Ah, the TeraScale idea; it's the platform for 2015, if I recall correctly. I can't wait to see that come into common hands, instead of just prototypes and white papers (by this I mean being able to read architecture-specific material rather than things like those "flyers"). Also, that 80-core thing, wasn't it more of an interconnect 'study' than a real CPU design?

But I wonder why we can't have different versions of the application, like Simap? They have a SSE and a non-SSE version. If the Rosetta dev team would expand on that idea and make SSE -> SSSE3 versions, then all CPUs would perform the best they can, wouldn't they? Perhaps I'm assuming that it wouldn't take too much work, but since I'm more into hardware than software, it's easy for me to take those things a bit too lightly.


In the case of Rosetta, they need some serious validation; you don't want to miss a cure because the computer that was in charge of calculating the miracle skipped it.
I will try to work with them and offer to help validate it properly by developing a validation tool. They should be the only ones providing an optimized version, I guess. It is all about discussion and quantifying the return on investment for them: running an optimized Rosetta could increase the overall processing power of Rosetta by 30%. I am sure Matt from AMD could provide his version for his platform, making it fair for everybody.
After CES (www.cesweb.org), I'll try to speak seriously with Matt and David Baker and see if we can find a way to get better throughput for Rosetta on Intel and AMD platforms.

(Again, this is volunteer work here, not involving my company)
who?
Feet1st
Joined: 30 Dec 05
Posts: 1755
Credit: 4,690,520
RAC: 0
Message 34259 - Posted: 7 Jan 2007, 4:07:06 UTC - in response to Message 34224.  

But I wonder why we can't have different versions of the application, like Simap? They have a SSE and a non-SSE version.


The idea has been discussed before. Technically, it is possible, but in practice the Rosetta program is changing too frequently to make multiple versions practical. Note how often new releases are coming out, and note that each release is NEW science code. New ideas being studied.

This means that such a port would not be a one-time occurrence. You'd have to do it every time the science is changed. The Baker team seems very focused on spending that software time to add more science, rather than maintain more versions.

Add this signature to your EMail:
Running Microsoft's "System Idle Process" will never help cure cancer, AIDS nor Alzheimer's. But running Rosetta@home just might!
https://boinc.bakerlab.org/rosetta/
FluffyChicken
Joined: 1 Nov 05
Posts: 1260
Credit: 369,635
RAC: 0
Message 34270 - Posted: 7 Jan 2007, 10:44:18 UTC - in response to Message 34259.  

But I wonder why we can't have different versions of the application, like Simap? They have a SSE and a non-SSE version.


The idea has been discussed before. Technically, it is possible, but in practice the Rosetta program is changing too frequently to make multiple versions practical. Note how often new releases are coming out, and note that each release is NEW science code. New ideas being studied.

This means that such a port would not be a one-time occurrence. You'd have to do it every time the science is changed. The Baker team seems very focused on spending that software time to add more science, rather than maintain more versions.


Though with Matt's and Who?'s input, along with anyone else involved, once set up it should be much simpler, and then it probably comes down to compile switches and keeping to a code style.

Also, since the Intel Macs are SSE2 as a minimum, their compile can certainly have the switches flicked once the code can make use of it.

But it will take people migrating to the 5.8.x series before the project can easily and properly target releases at specific instruction sets, since that version records the CPU's capabilities.

Just look at Docking@home for the problems they are seeing trying to validate between different processor types.


Team mauisun.org
Astro
Joined: 2 Oct 05
Posts: 987
Credit: 500,253
RAC: 0
Message 34272 - Posted: 7 Jan 2007, 11:07:20 UTC

Protein Predictor @ home (PPAH) was the first to deploy "homogeneous redundancy". They sent copies of results to users with processors from the same manufacturer. They did this to avoid rounding differences between Intel and AMD processors.
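
A rough sketch of the idea, assuming a scheduler that knows each host's CPU vendor: all copies of a work unit go to hosts of one vendor, so the returned results can be compared directly. This is not BOINC's actual scheduler code; the host records, the vendor strings, and the `assign_replicas` function are hypothetical.

```python
from collections import defaultdict


def assign_replicas(hosts, replicas):
    """Choose `replicas` hosts that all share one CPU vendor.

    `hosts` is a list of (host_id, vendor) pairs in scheduler order.
    The vendor class is fixed by the first host whose class is large
    enough to take every replica. Returns a list of host ids, or None
    if no vendor class has enough hosts.
    """
    by_vendor = defaultdict(list)
    for host_id, vendor in hosts:
        by_vendor[vendor].append(host_id)

    for _, vendor in hosts:
        candidates = by_vendor[vendor]
        if len(candidates) >= replicas:
            return candidates[:replicas]
    return None


if __name__ == "__main__":
    hosts = [(1, "AMD"), (2, "Intel"), (3, "Intel"), (4, "AMD"), (5, "Intel")]
    print(assign_replicas(hosts, 3))
```

The trade-off is that a work unit may wait longer for enough hosts of one class, in exchange for results that can be validated by straightforward comparison.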

Also, if memory serves, one of the scientists moved from PPAH to Docking and is probably aware of this.


bruce boytler
Joined: 17 Sep 05
Posts: 68
Credit: 3,565,442
RAC: 0
Message 34276 - Posted: 7 Jan 2007, 13:17:30 UTC - in response to Message 34269.  

As I understand it rosetta@home is mainly used to develop the software needed. In that case it might be better to optimize the applications which are using the "final versions". WCG uses a Rosetta application, don't they?


Yes, WCG used Rosetta in both HPF and now recently in HPF2. In HPF the WCG techs optimized Rosetta with a heuristic which stabilized the work units on my computer at about 6 hours each. Before they did this, my computer had work units ranging from 2 to 96 hours.

With HPF2 they ran into a problem with Rosetta's newest version, which stopped processing the work unit but not the hour meter recording the processing time. These same techs fixed that bug too, or at least troubleshot it.
zombie67 [MM]
Joined: 11 Feb 06
Posts: 316
Credit: 6,589,590
RAC: 198
Message 34303 - Posted: 7 Jan 2007, 17:07:38 UTC - in response to Message 34272.  

Also, if memory serves, one of the scientists moved from PPAH to Docking and is probably aware of this.


Yes, Docking uses homogeneous redundancy.
Reno, NV
Team: SETI.USA

©2024 University of Washington
https://www.bakerlab.org