Message boards : Number crunching : Intel i7 CPU
Author | Message |
---|---|
FoldingSolutions Joined: 2 Apr 06 Posts: 129 Credit: 3,506,690 RAC: 0 |
As previously, HT might help other threads keep out of the way of Rosetta, but for Rosetta throughput the critical factor will be whether the increased use of the L3 (with its latency penalty) will be compensated for by the interleaving of the threads under HT. I agree that HT will have to share the L2 cache and that this could be detrimental. However, as the L2 in Core 2 Duos can be as much as 3MB/core, I don't think that 128KB going amiss in an i7 is going to affect it too much. And the quicker access to main memory should even out any shortfalls. But you could be right that disabling HT could improve performance, as Rosetta mostly uses the FPU, so two threads using the same unit could get messy. But I did hear somewhere that Intel had looked at this and that the new HT is different to the old P4 version. Time will tell :) |
Mod.Sense Volunteer moderator Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
Just think of it as 2/3 + 2/3 = 4/3. So even if each "hyperthreaded" core is only 2/3 as powerful as a full core You can also think of it as a 10 year old being 2/3rds of a baker... now picture 8 of them in your kitchen and what will happen to your cookies? Faster? or slower? Either outcome is possible. Time will tell. Rosetta Moderator: Mod.Sense |
FoldingSolutions Joined: 2 Apr 06 Posts: 129 Credit: 3,506,690 RAC: 0 |
Just think of it as 2/3 + 2/3 = 4/3. So even if each "hyperthreaded" core is only 2/3 as powerful as a full core Probably not the best analogy to compare a child to a piece of silicon. I think Intel designed HT to improve efficiency, not to create more of a weaker version of the previous model. The maths of 2 + 2 = 4 is very different to 8 children + cookies = no cookies ;) |
Mod.Sense Volunteer moderator Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
8 children + cookies = no cookies Funny! My point was simply that "it depends". ...if the kitchen is large enough, it can work more efficiently. But even if floor space (L2 cache) is not the problem, other key resources such as the mixing bowl (bus) and the oven (math coprocessor) may become bottlenecks. Rosetta Moderator: Mod.Sense |
FoldingSolutions Joined: 2 Apr 06 Posts: 129 Credit: 3,506,690 RAC: 0 |
8 children + cookies = no cookies OK, I see your point. Also, if there isn't enough flour (data) to go round, then some of those resources are wasted. And Hyperthreading can provide a little extra flour :) |
AlphaLaser Joined: 19 Aug 06 Posts: 52 Credit: 3,327,939 RAC: 0 |
We will most likely see the L2 cache size problem addressed in Westmere. AnandTech published an article where Intel recognized that the current cache situation is not ideal. A larger L2 size on the current process is hard to justify because that would decrease the effective size of the L3 cache, due to its inclusiveness, as Mastergee pointed out. A larger L2 would have to come with a larger L3, which can be made possible with the extra transistors in 32 nm. What would be interesting is whether, if HT for Rosetta is limited by the shared use of the FPU, there would be less of a performance drop when running an integer-heavy BOINC project on the other logical core. If this were true, then the ideal case is for BOINC or the OS to set process affinity such that each physical CPU runs an ideal mix of work. |
FoldingSolutions Joined: 2 Apr 06 Posts: 129 Credit: 3,506,690 RAC: 0 |
We will most likely see the L2 cache size problem addressed in Westmere. AnandTech published an article where Intel recognized that the current cache situation is not ideal. A larger L2 size on the current process is hard to justify because that would decrease the effective size of the L3 cache, due to its inclusiveness, as Mastergee pointed out. A larger L2 would have to come with a larger L3, which can be made possible with the extra transistors in 32 nm. That is perfectly true. If one math unit is being used by one application and the other by another application, then HT is being utilised to near 100% effectiveness (probably never actually as good as 2 physical cores, but nearly). HT is mostly useful in enterprise applications (data mining, 3D rendering) where all units are being used. However, in scientific apps like Rosetta or Folding@home, I'm not too sure if there is much use of the integer unit (ALU) as opposed to the FPU. |
mikey Joined: 5 Jan 06 Posts: 1898 Credit: 12,723,752 RAC: 682 |
I thought BOINC only ran one task per physical CPU now? If not, we should be telling people to do that for better throughput... No, Boinc uses all CPUs the BIOS tells it are there. I have an older dual core Intel P4 with HT on each core. The BIOS reports 4 CPUs, and Boinc runs on all 4 at once. Now there is a performance hit because it is not 4 physical cores and there is some memory swapping going on. Here is a link to it: http://abcathome.com/show_host_detail.php?hostid=65041 But it does do more work using 4 CPUs than only using 2 when crunching under Boinc. It is a Boinc-only machine now; it used to be my main machine but it had some "issues" and now runs just fine. |
Joined: 16 Oct 05 Posts: 711 Credit: 26,694,507 RAC: 0 |
I thought BOINC only ran one task per physical CPU now? If not, we should be telling people to do that for better throughput... Is it a dual CPU motherboard? Or is it a single chip with two cores, each with HT on? |
Joined: 3 Nov 05 Posts: 1834 Credit: 124,260,318 RAC: 9 |
I thought BOINC only ran one task per physical CPU now? If not, we should be telling people to do that for better throughput... When you say 'BOINC', do you mean Rosetta? There'll be big differences between projects! |
Joined: 21 Sep 05 Posts: 56 Credit: 575,419 RAC: 0 |
According to most reports, hyperthreading on the Core i7 is much more efficient than prior implementations. While I can't confirm for Rosetta, on most other DC projects and most applications, it does improve efficiency, not decrease it. When calculating credit per day, it is important to treat it as 8 processors, not just 4, which someone mentioned earlier. Really is a darn fast machine for most things. Will know how it does on Rosetta soon enough. As dcdc mentioned earlier I've seen a flaw in my previous logic though - my original calc was per core, but of course there are 8 logical cores here, and each is getting slightly less credit than my Q6600 per cycle, but as there are two threads (it looks like that machine is running 8 threads from the sum of the time on all tasks), then that's quite impressive. I'd be interested to see one with and without HT for comparison as it might be even quicker with only four Rosetta threads running. Team MacNN - The best Macintosh team ever. |
mikey Joined: 5 Jan 06 Posts: 1898 Credit: 12,723,752 RAC: 682 |
I thought BOINC only ran one task per physical CPU now? If not, we should be telling people to do that for better throughput... It is a single chip with 2 cores, each with HT on it. 3 years ago the CPU alone was $1000.00 US. |
mikey Joined: 5 Jan 06 Posts: 1898 Credit: 12,723,752 RAC: 682 |
I thought BOINC only ran one task per physical CPU now? If not, we should be telling people to do that for better throughput... I mean Boinc the program; I have the PC on ABC at home right now. And yes, I guess there is a difference in how the different projects will use the L2 cache, etc., which will make a difference in how fast the CPU is overall. |
Joined: 16 Oct 05 Posts: 711 Credit: 26,694,507 RAC: 0 |
I thought BOINC only ran one task per physical CPU now? If not, we should be telling people to do that for better throughput... Those must've (<- why does Firefox say that's spelled wrong?) been the very first dual-cores then... |
mikey Joined: 5 Jan 06 Posts: 1898 Credit: 12,723,752 RAC: 682 |
I thought BOINC only ran one task per physical CPU now? If not, we should be telling people to do that for better throughput... I think it was close. My wife got tired of me being retired so set me up with an interview at her work and I got the job. She bought me the new PC as payment. It cost $2000.00 at a mom and pop PC store. It had the CPU, motherboard, 1 gig of memory, a SATA DVD burner, an 80 gig hard drive, a 256 meg video card, Windows XP Pro and the case. I brought it home and added 3 more 250 gig or bigger hard drives and made it my main PC. It is now a Boinc-only machine running Linux because under Windows it will not keep all 4 cores running. It does fine for a few days and then drops back to 2 cores and then 1 core. Under Ubuntu it is running all 4 cores just fine. It may be getting old, but it still is crunching great AND is actually faster than my much newer AMD 9850 true quad core!!! |
Paul Joined: 29 Oct 05 Posts: 193 Credit: 68,143,842 RAC: 1 |
Do we have enough stats on the i7 to know if it is good for R@H? I looked at some of the BOINC Stats pages but I don't understand what they are telling me. The Q6600 is a workhorse and the Q9450 should be even faster with 45nm technology and an expanded cache. What is the sweet spot for processors on R@H? Clearly faster processors are better but I am looking for the best RAC to $ ratio. Thx! Paul |
Joined: 21 Sep 05 Posts: 56 Credit: 575,419 RAC: 0 |
Do we have enough stats on the i7 to know if it is good for R@H? Don't know yet, but I just brought my i7 online today. It will be 100% Rosetta for a little while, not shared, so should eventually have a good idea how it does. https://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=974720 Team MacNN - The best Macintosh team ever. |
mikey Joined: 5 Jan 06 Posts: 1898 Credit: 12,723,752 RAC: 682 |
Do we have enough stats on the i7 to know if it is good for R@H? That's 1,000 credits since YESTERDAY!!!! Wow that is fast!! I have an Intel P4 T2300 running at 1.66 GHz and your i7 measures twice as fast in both floating and integer scores. You are also doing units in about an hour or so; mine takes over twice as long! |
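
The "2/3 + 2/3 = 4/3" reasoning and the advice to count 8 processors rather than 4 when estimating credit per day can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the function names, the per-thread efficiency, and the credit and run-time figures are hypothetical and are not measurements from any host in this thread. It also assumes every logical CPU runs one task at a time and that all tasks take the same time.

```python
from fractions import Fraction


def ht_throughput_ratio(per_thread_efficiency, threads_per_core=2):
    """Throughput of one core with HT on, relative to the same core
    running a single thread at full speed (hypothetical model)."""
    return per_thread_efficiency * threads_per_core


def host_credit_per_day(credit_per_task, hours_per_task, logical_cpus):
    """Estimated credit per day when each logical CPU runs one task at a time."""
    tasks_per_day_per_cpu = Fraction(24) / hours_per_task
    return credit_per_task * tasks_per_day_per_cpu * logical_cpus


if __name__ == "__main__":
    # Each hyperthread at 2/3 of a full core: the pair together beats one core.
    print(ht_throughput_ratio(Fraction(2, 3)))
    # Break-even: each hyperthread at exactly half speed gains nothing.
    print(ht_throughput_ratio(Fraction(1, 2)))
    # Made-up figures: 20 credits per task, 2 hours per task.
    print(host_credit_per_day(20, 2, logical_cpus=8))
    print(host_credit_per_day(20, 2, logical_cpus=4))
```

In this model HT only pays off when each thread keeps more than half of a full core's speed, which is the "it depends" point made above. Whether Rosetta reaches that on an i7 is exactly what the thread is waiting on real host data to show.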
©2025 University of Washington
https://www.bakerlab.org