QX6700 or Xeons: what should you choose for crunching?

Who? (Joined: 2 Apr 06, Posts: 213, Credit: 1,366,981, RAC: 0)
Message 31382 - Posted: 18 Nov 2006, 17:39:28 UTC - Last modified: 18 Nov 2006, 17:46:03 UTC

For the crazy overclocker, I am using a Core 2 Extreme QX6700 with Mach II fridge cooling, some awesome OCZ DDR2-1066, and a BadAxe II motherboard. Right now I am using a CrossFire X1900XT setup from ATI, but I plan to replace it soon with a G80 from NVIDIA, which is much faster in my favorite game (Splinter Cell: Double Agent). I can't afford a second one; otherwise my wife will kill me ;) I get a good discount on the processors: my employer was nice enough to establish a program for us 2 years ago.

The QX6700 is running at 4 GHz and has been stable for more than 2 weeks running Rosetta, with 2 reboots due to maintenance. The "error workloads" disappeared after I adjusted the core voltage several times.

For the rising monster, I used an S5000NVX workstation. It comes with an awesome RAID 10, so I put in 5 drives of 300 GB each (they were on sale at Fry's Electronics for $75 each a few weeks ago). On top of this, it is running an NVIDIA G80. I got it at Central Computer in Santa Clara on Stevens Creek; they still have them in stock... I got the 768 MB one. What a nice baby: fast, not too loud...

Of course, I am planning to overclock the Xeons; it gets a little more tricky there. For the moment I use 2 standard copper Xeon heatsinks, but they will not allow more than 3.5 GHz. (I am still running at 2.6 GHz; I want to see the max RAC before I start overclocking.) So I am thinking about putting in 2 huge water-cooling systems with a very big radiator and fan.

A good trick for overclockers: the car industry is full of radiators, so don't pay a premium for them. You can get one from Kragen for $40, big enough to cool the transmission of a Ford F-350... In the past I used radiators from Derale; the Series 7000 was overkill for the Extreme Edition 975 at 3.73 GHz. (For Silly-con Valley citizens, you can find the best choice of coolers at Winchester Autopart, on Winchester Blvd. Use the transmission coolers; they have the right pipe diameter.)

I am also planning water cooling for the G80; the turbine is kind of annoying at night when you try to enjoy your game. Since I am planning to overclock those 2 Xeons, I got a 1000-watt power supply. Probably not required, but you never know.

With 8 cores, I am playing Splinter Cell at a perfect frame rate while crunching Rosetta, with the requested 8 Rosetta tasks running at the same time. On most of the workloads I tried, the front-side bus never saturated, thanks to the prefetchers and the large cache. If somebody tells you that you need HyperTransport... that needs to be proven with real data, not PowerPoint slides. Even on workloads that are a little less cache friendly, the Xeons come out ahead of their aging challengers.

If you want to check your cache success rate, a little tool can help you: CPUID perfMon. In a few minutes you can understand your L2 cache success rate. Pretty cool! If you get addicted to understanding those parameters, you can then install VTune.

Notice that the Xeons are still far from their maximum RAC.

PS: About overclocking, my employer disagrees with me :-P ...
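For readers wondering what an L2 "success rate" boils down to, here is a minimal sketch of the arithmetic, assuming you have already read a hit count and a miss count from a profiling tool. The function name and the sample numbers are made up for illustration; they are not taken from CPUID perfMon or VTune.

```python
def l2_hit_rate(hits: int, misses: int) -> float:
    """Return the fraction of L2 accesses that were served from the cache."""
    total = hits + misses
    if total == 0:
        raise ValueError("no L2 accesses recorded")
    return hits / total


# Hypothetical counter values, for illustration only.
rate = l2_hit_rate(hits=9_500, misses=500)
print(f"L2 hit rate: {rate:.1%}")
```

A workload that stays close to 100% rarely goes out to the front-side bus, which is the effect described in the post above.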

Michael G.R. (Joined: 11 Nov 05, Posts: 264, Credit: 11,247,510, RAC: 0)
Message 31385 - Posted: 18 Nov 2006, 18:35:49 UTC

Wow, lots of power at your fingertips! Thanks for using it for important science. I wish all cutting-edge early adopters and gamers like you did the same.

BennyRop (Joined: 17 Dec 05, Posts: 555, Credit: 140,800, RAC: 0)
Message 31403 - Posted: 19 Nov 2006, 3:54:29 UTC

It's nice to see some competition between Intel and AMD again to help spur innovation. Wonder what will be available from AMD when these chips eventually migrate down to the Dell home desktop machines.

Who?
Message 31407 - Posted: 19 Nov 2006, 4:29:23 UTC - in response to Message 31403.

> It's nice to see some competition between Intel and AMD again to help spur innovation. Wonder what will be available from AMD when these chips eventually migrate down to the Dell home desktop machines.

They already did, no? So, do you suggest that Dell will build better AMD machines than HP or Gateway? I hope they will; I need some challenging machines to convince my managers to include more of my stuff in the next CPU :)

Who?

FluffyChicken (Joined: 1 Nov 05, Posts: 1260, Credit: 369,635, RAC: 0)
Message 31419 - Posted: 19 Nov 2006, 15:25:37 UTC - in response to Message 31407. Last modified: 19 Nov 2006, 15:27:00 UTC

> They already did, no? So, do you suggest that Dell will build better AMD machines than HP or Gateway? I hope they will; I need some challenging machines to convince my managers to include more of my stuff in the next CPU :)

Quick question: does the dual setup still have to have both processors running at the same frequency, or can they run at different ones?

I assume SSE4 is coming in the next parts; how would that help out? <Maybe I should just go and read what instructions will be added, doh!>

Team mauisun.org

Who?
Message 31426 - Posted: 19 Nov 2006, 17:53:58 UTC - in response to Message 31419.

> Quick question: does the dual setup still have to have both processors running at the same frequency, or can they run at different ones? I assume SSE4 is coming in the next parts; how would that help out?

For the moment, Windows Vista and Windows XP are unable to support a different speed for each core. Whoever says they have it running is doing good marketing, but in fact you get nice blue screens. It is the same for Linux and BSD-based OSes, except that the screen is not blue... hehehe (I don't know about Mac OS).

SSE4 will be available next year (in Penryn); what you have in Core 2 is SSSE3. The SSE4 instruction set is very useful for processing and crunching: it helps the compiler vectorize algorithms automatically, and it has instructions that help avoid branches when using SIMD (Single Instruction, Multiple Data).

SSSE3, the instruction set in Core 2, is more of a final tuning of SSE3. For example, it has an integer multiply on 8-element vectors, which dramatically helps MPEG-4 video encoding.

Who?

PS: Information about SSE4 is openly available on the Intel web site.

FluffyChicken
Message 31435 - Posted: 19 Nov 2006, 19:38:39 UTC - in response to Message 31426. Last modified: 19 Nov 2006, 19:39:39 UTC

> For the moment, Windows Vista and Windows XP are unable to support a different speed for each core. Whoever says they have it running is doing good marketing, but in fact you get nice blue screens. It is the same for Linux and BSD-based OSes, except that the screen is not blue... hehehe (I don't know about Mac OS).

How does that work with SpeedStep or Cool'n'Quiet?

P.S. By "core", are you referring to the whole processor (as in a CPU) or to the individual cores on a multicore CPU (one of the dual or quad)? I thought the unused cores on a single CPU were supposed to step down and even power off (well, at least on the new-generation AMD Turions).

I'm a little vague on most of the Core 2 and Athlon 64 happenings; I just never got around to reading ;-). I've been more interested in the epitaxy and microstepper/lithography side of things for the past few years :-)

Team mauisun.org

Who?
Message 31438 - Posted: 19 Nov 2006, 20:38:08 UTC - in response to Message 31435.

> How does that work with SpeedStep or Cool'n'Quiet? P.S. By "core", are you referring to the whole processor (as in a CPU) or to the individual cores on a multicore CPU (one of the dual or quad)? I thought the unused cores on a single CPU were supposed to step down and even power off (well, at least on the new-generation AMD Turions).

The CPU you are speaking about can do it. But it does not :) See what I mean? Marketing... marketing...

Core 2 changes the speed of the 2 cores at the same time.

Who?

Mats Petersson (Joined: 29 Sep 05, Posts: 225, Credit: 951,788, RAC: 0)
Message 31461 - Posted: 20 Nov 2006, 14:20:46 UTC

For the AMD processors, it's not possible to run the current generation of dual cores at different speeds. Whether that blue-screens or not isn't the issue: there's only one shared set of registers for the two cores, so you get whatever was set last (if CPU0 sets 1.4 GHz and CPU1 sets 1.8 GHz immediately after, both cores will run at 1.8 GHz, for example).

This is a design decision, and there's probably nothing technically preventing the individual cores from running at different speeds at the same time, but I agree with Who? that it's probably not such a great idea to set them to different speeds, as the OSes that can run on the machine will most likely not like it very much; they tend to get "upset" if the machine isn't going at the same speed on all cores. Also, most process scheduling would have to be modified if the OS had to take into account whether a processor is running fast or not before scheduling a task on it...

-- Mats
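The "you get what you set last" behaviour can be sketched as a toy Python model. This only illustrates the idea of one register shared by both cores; the class and method names are hypothetical and do not correspond to any real AMD register interface or driver API.

```python
class SharedClockControl:
    """Toy model: one frequency register shared by every core in the package."""

    def __init__(self, num_cores: int, initial_mhz: int) -> None:
        self.num_cores = num_cores
        self._shared_mhz = initial_mhz  # a single register, not one per core

    def _check(self, core_id: int) -> None:
        if not 0 <= core_id < self.num_cores:
            raise IndexError(f"no such core: {core_id}")

    def request(self, core_id: int, mhz: int) -> None:
        """A core asks for a speed; the write lands in the shared register."""
        self._check(core_id)
        self._shared_mhz = mhz  # the last writer wins for all cores

    def frequency_of(self, core_id: int) -> int:
        """Every core reads back the same shared value."""
        self._check(core_id)
        return self._shared_mhz


ctl = SharedClockControl(num_cores=2, initial_mhz=2000)
ctl.request(0, 1400)  # CPU0 asks for 1.4 GHz
ctl.request(1, 1800)  # CPU1 asks for 1.8 GHz immediately after
frequencies = [ctl.frequency_of(core) for core in range(ctl.num_cores)]
print(frequencies)  # both cores report 1800
```

In this model there is no way to express two different speeds at all, which matches the point that the limit comes from the register layout rather than from the OS crashing.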

River~~ (Joined: 15 Dec 05, Posts: 761, Credit: 285,578, RAC: 0)
Message 31478 - Posted: 20 Nov 2006, 20:46:18 UTC - in response to Message 31461.

> ... there's probably nothing technically preventing the individual cores from running at different speeds at the same time ...

First point: there needs to be a clock signal running at the designated speed, so to run one core at 1.4 GHz and the other at 1.8 GHz you would need two separate oscillators to produce the signals. This is not the case if you turn one core off, since a core that is off doesn't care whether its clock is still running. So running at diverse speeds needs more hardware to be burned onto the chip than running all cores at just one speed, and that is more than just an extra register.

Second point: the clock signal synchronises various parts of the chip so that they run in step. Even when one part of a chip runs at some multiple of another part, the ratios are usually simple (integers, or ratios of small integers). This means that when one signal interferes with another, it always happens at defined points in the cycle, which in turn means that random errors are reduced.

Third point: some designs allow cores to share resources (whether sharing CPU components, as the threads of an Intel HT core do, or sharing the same on-chip cache, or even exchanging data between the caches or CPUs without going through the external bus, as AMD are experimenting with). All these forms of data sharing are much easier to design if you can be sure that both CPUs are at the same point in their clock cycle at all times, i.e. running on the same clock.

So it is a design decision, but not one that is likely to be changed by either AMD or Intel.

This is my layman's view, based on a fairly simplistic understanding of how the processors are designed. I'd welcome correction where Who? knows better.

R~~

Mats Petersson
Message 31506 - Posted: 21 Nov 2006, 12:26:38 UTC

1. Both cores have individual PLLs (phase-locked loops), each of which takes an input of (say) 200 MHz and "multiplies" it by some number. So from that perspective, there's nothing preventing them from running at different speeds.
2. The clock signals do indeed sync things within the chip, but a dual core processor does actually have two individual cores with no direct interaction. There is a bus interface that would need to sync the interaction between the cores, but I believe this uses some sort of FIFO to signal things anyway, so it shouldn't be a problem there...
3. Well, there's not much shared stuff between the cores, but a shared cache would have to run at a "shared speed", of course.

I used to work on a graphics chip where the two major components ran at different speeds, about 50% higher on one half than on the other. It works fine as long as you have a simple FIFO circuit to sync the two parts.

But I agree, it's a design decision, and it's unlikely to change anytime soon.

-- Mats
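The PLL arithmetic in point 1 is just a multiplication of the reference clock. A minimal sketch follows, assuming the 200 MHz input mentioned above; the multipliers 7 and 9 are chosen only so the results match the 1.4 GHz and 1.8 GHz figures used earlier in the thread, and the function name is made up.

```python
REFERENCE_MHZ = 200  # input clock from the example in the post


def core_clock_mhz(multiplier: float, reference_mhz: float = REFERENCE_MHZ) -> float:
    """Return the core clock a PLL would produce for a given multiplier."""
    if multiplier <= 0:
        raise ValueError("multiplier must be positive")
    return reference_mhz * multiplier


# Hypothetical per-core multipliers: one PLL per core, so each could differ.
for core, multiplier in enumerate([7, 9]):
    print(f"core {core}: {core_clock_mhz(multiplier):.0f} MHz")
```

With one PLL per core, nothing in this arithmetic forces the two results to be equal; the constraint discussed in the thread comes from how the multiplier is programmed and from the OS, not from the clock generation itself.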

River~~
Message 31527 - Posted: 21 Nov 2006, 19:01:56 UTC - in response to Message 31506.

> 2. The clock signals do indeed sync things within the chip, but a dual core processor does actually have two individual cores with no direct interaction. There is a bus interface that would need to sync the interaction between the cores, but I believe this uses some sort of FIFO to signal things anyway, so it shouldn't be a problem there...

The phrase "a dual core processor does actually have" should possibly read "Intel dual core processors actually have".

If I understand correctly what I have read, AMD have (announced? released?) dual core processors where there is communication between the cores without going via RAM. In which case my comments apply.

One of the criticisms AMD seem to have made about the Intel dual core products is exactly that they are "only" two unconnected cores. (How far this is marketing hype and how far reality is another matter.)

If the cores are truly separate electrically, as in the Intel designs, then you are right that it is only the bus interface that will need to cope. And as you rightly say, the bus interface needs some way of resolving contention between the two cores anyway, so speed mismatches would not add a lot to the complexity there.

R~~

Mats Petersson
Message 31533 - Posted: 21 Nov 2006, 20:07:31 UTC - in response to Message 31527.

> If I understand correctly what I have read, AMD have (announced? released?) dual core processors where there is communication between the cores without going via RAM. In which case my comments apply.

OK, we're probably splitting hairs here, but the CORE ITSELF doesn't communicate with the other CORE, even on AMD processors. There is a special portion of the processor logic that deals with talking between the cores, which is the bus unit. In the AMD case it has a direct connection to the other core, but traffic still has to pass through the bus unit.

Most likely, one core isn't exactly in sync with the other core anyway, so you need some sort of syncing mechanism between them to make it work. By that, I mean there is some sort of FIFO/latch that uses one core's clock domain on one side and the other core's clock domain on the other side. Otherwise the clocks would have to be perfectly in sync, and as I understand it, they are not, even on a single-chip dual core processor...

-- Mats

River~~
Message 31534 - Posted: 21 Nov 2006, 20:18:10 UTC - in response to Message 31533.

> Otherwise the clocks would have to be perfectly in sync, and as I understand it, they are not, even on a single-chip dual core processor...

Thanks, Mats. This was an important detail I had not realised before.

If the clocks are not quite in sync anyway, then you're right: there would be no particular advantage in having them run at the same speed.

R~~

Mats Petersson
Message 31535 - Posted: 21 Nov 2006, 20:24:38 UTC - in response to Message 31534.

> If the clocks are not quite in sync anyway, then you're right: there would be no particular advantage in having them run at the same speed.

Yet, if you have a BIG difference in speed (rather than +/- a clock cycle or so), you'll need longer FIFOs to avoid stalling on the communications link. I don't know exactly how all of these things work, but I bet that you'd have stalls more frequently if you run at different speeds, and those are avoided by running at the same speed... Stalls aren't too big a problem if the data is unidirectional, but if you send something and expect a reply, then the stalls are really annoying (especially if you stall in both directions on the same transaction).

-- Mats
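The FIFO argument can be explored with a small simulation. This is a rough sketch under simplifying assumptions: time advances in abstract ticks, each side acts once per its own clock period, and a "stall" is counted whenever the sending side finds the FIFO full. The function name, periods, and depths are invented for illustration and do not model any real bus unit.

```python
from collections import deque


def count_producer_stalls(prod_period: int, cons_period: int, depth: int, ticks: int) -> int:
    """Count how often a sender finds the FIFO full over a simulated run."""
    if prod_period <= 0 or cons_period <= 0 or depth <= 0:
        raise ValueError("periods and depth must be positive")
    fifo = deque()
    stalls = 0
    for tick in range(1, ticks + 1):
        # The receiving clock domain drains one entry per consumer cycle.
        if tick % cons_period == 0 and fifo:
            fifo.popleft()
        # The sending clock domain tries to push one entry per producer cycle.
        if tick % prod_period == 0:
            if len(fifo) < depth:
                fifo.append(tick)
            else:
                stalls += 1  # FIFO full: the sender has to wait
    return stalls


# Sender 50% faster than receiver (period 4 vs. 6), with two FIFO depths.
for depth in (2, 8):
    stalls = count_producer_stalls(prod_period=4, cons_period=6, depth=depth, ticks=120)
    print(f"depth {depth}: {stalls} producer stalls")

# Matched clocks for comparison.
print("matched clocks:", count_producer_stalls(5, 5, depth=2, ticks=120), "stalls")
```

In this toy model a deeper FIFO postpones the stalls but cannot remove them when one side is faster for the whole run, while matched clocks never fill the FIFO at all.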

Who?
Message 31667 - Posted: 25 Nov 2006, 22:16:05 UTC - in response to Message 31535. Last modified: 25 Nov 2006, 22:17:51 UTC

> Yet, if you have a BIG difference in speed (rather than +/- a clock cycle or so), you'll need longer FIFOs to avoid stalling on the communications link. I don't know exactly how all of these things work, but I bet that you'd have stalls more frequently if you run at different speeds, and those are avoided by running at the same speed...

The synchronization of cores is a much bigger problem than you explained here. The real issue is getting the load and store buffers of each core to unlock the data for the other core. During thread migration, you need to release the cache lines of your L1 and your load/store buffers without delay. In the case of slowed cores, you'll wait longer for the release. (Some AMD fanboys think that NUMA fixed this; it actually made it worse, because you have one more latency between memory controllers, stalling your load buffers during the snooping loop by a few more cycles.) In the case of Core 2, thread migration uses the L2 cache: an ownership bit is changed in the L2 cache, and that's it :)

Windows does what is called "thread migration", meaning that Windows makes your threads go around your processor cores in the logical order exposed by your BIOS. If you try to slow down one core compared to another, Windows will wake it up every other migration in the dual-core case. As long as MS does not change this, per-core power management will be useless. BSD, Linux, and SCO Unix are all symmetrical OSes; they all behave like Windows. In the case of multi-package computers with 2 sockets, you still have to keep all CPUs at the same speed for this reason.

AMD is marketing per-core power management as a major feature, probably because they have already figured out that they will not beat Core 2 and need features to market. In the meantime this feature is useless, because OSes will not support it. This may show up when EFI becomes more widely available: the EFI "BIOS" has the capability to enumerate CPUs on the fly, which was designed by Intel for hot-swap CPUs. Maybe AMD will use it.

Who?

Again: my employer is not responsible for my posting here; I post on my own, using the First Amendment :)

River~~
Message 31676 - Posted: 26 Nov 2006, 9:41:45 UTC - in response to Message 31667.

> ... If you try to slow down one core compared to another, Windows will wake it up every other migration in the dual-core case. As long as MS does not change this, per-core power management will be useless. BSD, Linux, and SCO Unix are all symmetrical OSes; they all behave like Windows...

In the case of Linux this is chicken and egg. Linux tends to get written in response to what is needed. At one time Linux did not have support for multiple CPUs at all, because PCs did not have them. Then the first multi-socket boards came out, and the motherboards were made so that the cores had to run at the same speed (no doubt for the reasons you explain), so the "SMP" (symmetric multiprocessing) code was written for the Linux kernel by some group of volunteers who had both the skill and the motivation to do it.

The same will happen with AMP (asymmetric multiprocessing). When someone has a board or a chip that will do it, and the skill to write extensions to the Linux kernel (not me, I'm afraid), then Linux will have it.

> AMD is marketing per-core power management as a major feature, ... In the meantime this feature is useless, because OSes will not support it. ...

One of the non-technical contrasts between Intel and AMD is their respective attitudes to Linux. AMD tend to be much more co-operative than Intel in sharing tech detail with the Linux community (so much so that one Linux-derived project, LinuxBIOS, has called for a boycott of Intel!). AMD are willing to try things anyway, and let the Linux community show that it works (or doesn't).

In my analysis, one reason AMD are pushing this may be to get the geeky end of the Linux community to show that it works. This is not empty marketing. It is the classic response of the firm with the lower market share: to take bigger risks than the market leader and to be more flexible in its willingness to work with unconventional partners. Similar things happen in other markets too, as in Avis's line "We're no 2, we try harder".

Until (unless) Linux takes the bait, you are right that the ordinary overclocker who does not have the skills to roll their own kernel cannot make use of it, far less so the ordinary PC user. But I think you are mistaken about AMD's motivation. AMD know that Linux won't start to write for hardware that does not exist, so they need to put the hardware out first.

I am glad that not every chip designer waits for Windows support before trying something. If we always waited for Bill Gates to spot a good idea we would not have the internet, for example ;-)

OK, I am making a friendly dig at Intel in this post. In fairness though, if there were an AMD staff member rubbishing Intel's motives, I'd be saying similar stuff to them. Both Intel and AMD have learned from, and taken advantage of, each other's mistakes in the past. The divergence in the two firms' technical and non-technical strategies is actually good for the end customer.

I certainly appreciate your willingness to come here and explain some of the tech issues in current CPU design.

R~~

FluffyChicken
Message 31677 - Posted: 26 Nov 2006, 11:12:43 UTC

All in all, though, HyperTransport this, that, and the other is kind of irrelevant for Rosetta. Powering down cores is too (hey, they should be crunching ;-)). So is who makes them (unless you are employed by them, since we all need money).

It has to be said that these Core 2 processors are damn good at crunching Rosetta work :-) Which is also very good for all the Mac/x86 users; I've never seen so many near the top of the RAC charts.

Now if only they were affordable... (by me).

Apple may be the people to take the lead on the powering-down problems. It would be a good selling point for their computers.

Team mauisun.org

Who?
Message 32499 - Posted: 12 Dec 2006, 7:42:51 UTC

I upgraded my V8 (cores) computer to Windows Vista. No slowdown :)

The monster running Vista is getting back to the number one position; it will beat my overclocked quad core overnight, I think.

Still no saturation of the front-side buses... that does not happen, Mister Ruiz! (Yep! I've got 2 FSBs on this motherboard! The snoop filters are more efficient than HyperTransport's aging protocol.)

Who?

Mats Petersson
Message 32522 - Posted: 12 Dec 2006, 16:05:46 UTC - in response to Message 32499.

> Still no saturation of the front-side buses... that does not happen, Mister Ruiz! (Yep! I've got 2 FSBs on this motherboard! The snoop filters are more efficient than HyperTransport's aging protocol.)

And again, Rosetta isn't going to saturate any bus on a machine with a decent L2 cache, so why are you going on about FSB vs. HyperTransport? It's quite clear from your previous posts that you DO understand that this is not an issue. It may be an issue for some other applications, but not for Rosetta, so Rosetta makes a very poor example for comparing these things, right?

-- Mats