Strange problem with dual Xeon machine

Message boards : Number crunching : Strange problem with dual Xeon machine


Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 52256 - Posted: 5 Apr 2008, 2:02:06 UTC - in response to Message 52248.  

Mod.Sense - can you have a look at this as the scores don't look right. Any ideas?


I don't have any additional means to look into this either. But let's compare your 2.66 GHz Xeon with your 3 GHz P4.

I found WUs with the same name and batch number in the results list for each.

Xeon WU CPU seconds: 8,074 Models: 2 claimed: 33.27 granted: 10.39
P4 WU CPU seconds: 9,430 Models: 3 claimed: 20.60 granted: 15.57

Credit is granted based on the completed models. The Xeon completed only two, and so received 2/3 of the credit of the P4, which completed three.

But your Xeon is running 8 CPUs, and the P4 only 2. So if you calculate credit per hour across all the CPUs in each machine from the figures above, the Xeon is pulling in much more credit per hour (37.06 vs 11.89). And once it has been reporting work consistently for 2 weeks, you will see this reflected in the RAC for the Xeon.
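
The per-hour figures can be reproduced from the two result lines. This is a minimal sketch, assuming the rate is granted credit per CPU-second, scaled to an hour and multiplied by the number of CPUs on the host. The function name is made up for illustration, and it assumes every CPU earns at the same rate as the one sampled WU.

```python
def credit_per_hour(granted, cpu_seconds, cpus):
    """Machine-wide credit per hour, assuming all CPUs earn at the same rate."""
    per_cpu_hour = granted / cpu_seconds * 3600  # credit per CPU-hour
    return per_cpu_hour * cpus

xeon = credit_per_hour(granted=10.39, cpu_seconds=8074, cpus=8)
p4 = credit_per_hour(granted=15.57, cpu_seconds=9430, cpus=2)
print(f"Xeon: {xeon:.2f}  P4: {p4:.2f}")  # Xeon: 37.06  P4: 11.89
```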
Rosetta Moderator: Mod.Sense
ID: 52256 · Rating: 0
Dusty

Joined: 1 Mar 08
Posts: 41
Credit: 2,667,354
RAC: 0
Message 52257 - Posted: 5 Apr 2008, 2:03:04 UTC - in response to Message 52254.  

Thanks; regarding the memory, that's a big relief. I keep thinking that something is grossly wrong in my CMOS, but for the most part I have the defaults set. I do have it set so that the CPU doesn't slow down if it gets hot, and I installed a 4-fan controller and am keeping the case fans on max.

I still can't get over why the Integer speed is so low.

Thanks for the link to CPUID! Running that, it says the CPU is running at 2.66 GHz, but I was surprised that the max bandwidth of the FBDIMM PC2-5300 DDR2 667 RAM is only 333 MHz. I guess they split the bandwidth for each CPU? I have one 4-Gig RAM chip in Slot 0 of each of 4 banks. This RAM was recommended by ASUS as being compatible, but I just couldn't afford faster RAM.


Your RAM is fast enough. CPU-Z reports 333 MHz because it's DDR (double data rate): the clock runs at 333 MHz and data moves twice per clock, which gives the 667 rating. So 333 MHz is correct, and it will have very little effect as it's pretty quick anyway, and the large cache on those CPUs makes a big difference too.

I'd leave CPU-Z running for a while and monitor the CPU speed, just in case it is dropping the CPU speed. It will do that if the CPUs get too hot.

HTH
Danny

ID: 52257 · Rating: 0
Dusty

Joined: 1 Mar 08
Posts: 41
Credit: 2,667,354
RAC: 0
Message 52259 - Posted: 5 Apr 2008, 2:09:08 UTC - in response to Message 52256.  

Thank you very, very much! I'm sure that my tweaking BIOS settings a couple of times a day isn't helping at all, so I'm trying to do it as little as possible.

I really appreciate you taking the time to search for identical work units across my systems and compare them. I hadn't thought of that!

ID: 52259 · Rating: 0
dcdc

Joined: 3 Nov 05
Posts: 1829
Credit: 117,426,647
RAC: 50,397
Message 52261 - Posted: 5 Apr 2008, 12:00:11 UTC
Last modified: 5 Apr 2008, 12:00:29 UTC

Those Xeons are Core2/Penryn based and should get at least 90% of the credit per core that my Core 2 Duo gets per hour (my C2D is 3.2GHz, but has only 2/3 of the cache per core and is the slightly slower Conroe rather than Penryn). As it stands, it is getting around 10 credits per 10000-second task where it should be getting more like 55 credits.

Something's not right... I think that either the Sandra benchmarks will flag it up, or maybe there's something wrong with your BOINC/Rosetta installation?
ID: 52261 · Rating: 0
Dusty

Joined: 1 Mar 08
Posts: 41
Credit: 2,667,354
RAC: 0
Message 52262 - Posted: 5 Apr 2008, 12:33:07 UTC - in response to Message 52252.  
Last modified: 5 Apr 2008, 12:35:00 UTC

I downloaded Sandra Lite this morning, and the benchmark test is coming up with the same Integer speed as the Rosetta benchmarks: approx 48563 Dhrystone MIPS. Strangely, one of the comparison CPUs in Sandra is a Clovertown 2.33 GHz model (E5345) which shows 84426 MIPS. Whetstone is similar: 41932 MFLOPS for mine (2.66 GHz E5430 Harpertown) and 58749 MFLOPS for the 2.33 GHz Clovertown. Since the 2.33 GHz Clovertown was actually cheaper than my Harpertown processors, I'm scratching my head at these results....


Could the difference be because he's using XP PRO/64 while I'm using Pro/32?

Nice machine :D

No - 64 bit isn't an advantage for BOINC/Rosetta at the moment (might be in the future). Even with those benchmarks you should be getting higher scores for your granted credit... What are your preferences set as - i.e. do you have 'use at least 8 CPUs' and use 100% of CPU?
You could download Sandra Lite and run a few benchmarks, but it sounds like it's running fine. Are there 8 Rosetta processes each using ~12% CPU utilisation? (The easiest way to check is to get Task Manager up, show the 'CPU time' column, and sort by that column so the longest-running processes are at the top.)
ID: 52262 · Rating: 0
Ingleside

Joined: 25 Sep 05
Posts: 107
Credit: 1,514,472
RAC: 0
Message 52263 - Posted: 5 Apr 2008, 12:40:56 UTC - in response to Message 52249.  

Further, comparing it to another 2.66 Dual CPU Xeon X5355 machine (although not the same exact model) on BOINC showed that computer (owned by ROBiie) showed a FPS of 2531.02 but an astounding 8193.16 Integer speed!

That's the performance I was looking for--especially since my Xeon processors are the 45nm Hi-K models Intel's been ranting about. Could the difference be because he's using XP PRO/64 while I'm using Pro/32?

While win-64 will probably have little or no effect in Rosetta@home, didn't you say your system has 16 GB RAM? Meaning, over 75% of the installed memory can't be used as long as it runs a 32-bit OS...

The BOINC-benchmark seems to be higher on 64-bit, especially integer, but this isn't a good indication of actual performance-increase.


I've no idea if Rosetta@home is influenced by cache size and memory bandwidth, so it's possibly memory-bandwidth-limited when it tries to run 8 instances, even with the large cache size... One method to test this would be to see how it performs when running only 1 instance, 2 instances, 3 ... up to 8. But, with the large variations in Rosetta WUs, you should preferably test this by running the exact same WU on all cores...

Hmm, it would be possible to test if single/dual-channel memory has any effect, even if it runs 8 different WUs...


"I make so many mistakes. But then just think of all the mistakes I don't make, although I might."
ID: 52263 · Rating: 0
Dusty

Joined: 1 Mar 08
Posts: 41
Credit: 2,667,354
RAC: 0
Message 52264 - Posted: 5 Apr 2008, 13:24:04 UTC - in response to Message 52263.  

Thank you!! I completely forgot about the memory ceiling for a 32-bit OS. That explains why it only shows 4 gig of memory even though I have 16 gig installed.
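
The 4 gig ceiling follows from the address width. This is a quick sketch, assuming a plain 32-bit address space with no PAE-style extensions. It ignores the address range reserved for devices, which is why the usable figure typically ends up a bit under 4 GB.

```python
GIB = 2**30
addressable = 2**32        # bytes reachable with a 32-bit address
installed = 16 * GIB       # the 4 x 4 GB sticks mentioned in the thread

print(addressable // GIB)                     # 4
print(f"{1 - addressable / installed:.0%}")   # 75%
```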

I was looking in my BIOS to configure the memory. They have several different options: Rank Interleaving 1:1, 2:1 and 4:1. The default is 4:1, and that's where I left it. It also allows Branch Sparing, but its default is disabled, so that's where I kept it too.

I'll have to see whether I get more credit per WU by running fewer WUs at once, but in the long run I wonder if it would just balance out--less WUs complete, but fewer ones completed faster for more credit per WU....



ID: 52264 · Rating: 0
Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 52265 - Posted: 5 Apr 2008, 13:45:45 UTC

I wonder if it would just balance out--less WUs complete, but fewer ones completed faster for more credit per WU....


I think the point was that if a test running, say, 2 WUs shows better credit per hour per core (with 2 cores) than when running 8 WUs at once, it would basically prove that memory is constrained, and therefore it would indicate it might be worth installing an OS that can support all of your memory. Then you could expect to run all 8 at once and yield the same (better) credit per core you saw while running 2.
Rosetta Moderator: Mod.Sense
ID: 52265 · Rating: 0
Dusty

Joined: 1 Mar 08
Posts: 41
Credit: 2,667,354
RAC: 0
Message 52266 - Posted: 5 Apr 2008, 14:09:26 UTC - in response to Message 52265.  

I see. I didn't catch that. Thank you! I'll give it a try right away.

ID: 52266 · Rating: 0
Ingleside

Joined: 25 Sep 05
Posts: 107
Credit: 1,514,472
RAC: 0
Message 52268 - Posted: 5 Apr 2008, 14:29:43 UTC - in response to Message 52265.  
Last modified: 5 Apr 2008, 14:32:44 UTC

Yes, if running 1 instance shows, for example, 50 credits/hour per core and 8 instances show 30 credits/hour per core, it's a clear indication your computer is memory-bandwidth-limited in Rosetta@home. In this example Rosetta@home is likely maxed out at 5 or 6 cores, meaning it's probably possible to find another BOINC project that is not memory-bandwidth-limited and run it for 25% of the time "for free", since Rosetta@home will get the same credit/day regardless of whether it uses all 8 cores or only 6 of them...

Or, switching to faster memory would increase Rosetta@home production...

If, on the other hand, 1 instance gives, for example, 50 credits/hour per core and 8 instances give 47 credits/hour per core, it indicates Rosetta@home is not memory-bandwidth-limited, and there's likely another reason for your computer's mediocre Rosetta@home production...
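
The two hypothetical cases can be compared with a little arithmetic. This is only a sketch using the made-up example rates from this post (50, 30 and 47 credits/hour per core), not measured data. The function name is invented, and "equivalent cores" is just the total divided by the single-instance rate.

```python
def scaling_report(rate_1, rate_n, n):
    """Compare per-core credit/hour with 1 instance vs n instances."""
    efficiency = rate_n / rate_1        # 1.0 would mean perfect scaling
    total = rate_n * n                  # whole-machine credit/hour at n instances
    equivalent_cores = total / rate_1   # cores' worth of full-speed work
    return efficiency, total, equivalent_cores

for rate_8 in (30, 47):
    eff, total, cores = scaling_report(rate_1=50, rate_n=rate_8, n=8)
    print(f"8 x {rate_8}: efficiency {eff:.0%}, total {total}/h, "
          f"about {cores:.1f} full-speed cores")
```

With 8 instances at 30 credits/hour the total is 240 credits/hour, the same as 4.8 cores running at the single-instance rate, which is roughly in line with the "maxed out at 5 or 6 cores" estimate.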


As for switching to a 64-bit OS, this should be done regardless of whatever test results you're getting, since running an OS that can't use 78% of installed memory doesn't make much sense...

I wouldn't expect the OS switch to change anything significantly for Rosetta@home...
... Except...
I've no idea if it's true or not for some mainboards, but if you're running 4x 4 GB memory sticks, it's maybe possible your mainboard somehow only uses the 1st memory stick in win32, so in practice it's single-channel mode in win32...
"I make so many mistakes. But then just think of all the mistakes I don't make, although I might."
ID: 52268 · Rating: 0
Dusty

Joined: 1 Mar 08
Posts: 41
Credit: 2,667,354
RAC: 0
Message 52270 - Posted: 5 Apr 2008, 15:23:58 UTC - in response to Message 52265.  
Last modified: 5 Apr 2008, 15:36:29 UTC

Running the memory Bandwidth test in Sandra revealed a memory issue I'm trying to figure out. There were two notes after running the benchmark (my system was at the bottom of the chart as far as performance goes). The first note says, "System bandwidth appears FSB limited. Attempt to increase FSB." The second note says, "Low bandwidth efficiency (advanced). Check memory timings and settings." I changed the memory configuration from Branch Sequencing to Branch Interleaving under the Northbridge Chipset Configuration menu, and I no longer get the "Low bandwidth efficiency" error message, but I still get the FSB Limited error message.

Where do I change FSB? In the BIOS under Advanced CPU Settings, there is no option for changing FSB. There IS an option for Ratio CMOS Setting, and I've set it to the max of 8. It was originally the default of 6. I also have Virtualization Technology disabled. I don't know what would be limiting my FSB. All BIOS options to slow the CPU down for overheating (CPU TM, Speedstep) are disabled.

There is also a Rank Interleaving option under the Northbridge Chipset Configuration. The default is 4:1, and that's what mine is set at. There is also a 1:1 and a 2:1 option, but I haven't tried those options yet. Would changing this improve performance, or is it already set at the best setting?

My CPUs are 1333 FSB capable, and I'm sure the ASUS DSEB-D16/SAS Mobo is also. There is a note in the Mobo manual that says, "The FBDIMM 800 Mhz has to work with the 1600FSB CPU or above. Otherwise, the memory module downgrades and runs at the speed of 667Mhz." I'm using PC2-5300 DDR667 FBDIMMs anyway, and since the Harpertown CPUs only run at 1333FSB, the memory shouldn't be downgrading. Sandra shows my memory timings at 5.0-5-5-15.

One final note. After changing the Northbridge chipset from Branch Sequencing to Branch Interleaving, and forcing the BIOS Ratio CMOS setting from 6 to 8, I re-ran the Processor Arithmetic Test on Sandra, and this time my 2.66 GHz Harpertown is beating the 2.33 GHz Clovertown comparison CPU. My numbers are now Dhrystone ALU 89834 MIPS and Whetstone iSSE3 77535 MFLOPS. I'm still getting the bandwidth FSB Limited error message under the Memory Bandwidth test, however.
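
For a sense of scale, the new Sandra numbers can be set against the earlier ones reported in Message 52262 (48563 MIPS and 41932 MFLOPS). This is a rough sketch. The comparison with the multiplier is an assumption: it treats "Ratio CMOS Setting" as the CPU clock multiplier, which the thread does not confirm.

```python
before = {"Dhrystone MIPS": 48563, "Whetstone MFLOPS": 41932}
after = {"Dhrystone MIPS": 89834, "Whetstone MFLOPS": 77535}

# Speedup of each benchmark after the BIOS changes.
speedups = {name: after[name] / before[name] for name in before}
for name, s in speedups.items():
    print(f"{name}: {s:.2f}x")

# Assumption: if the ratio setting is the CPU multiplier, 6 -> 8 alone
# would raise the core clock by this factor.
clock_factor = 8 / 6
print(f"multiplier change alone: {clock_factor:.2f}x")
```

Both benchmarks come out at about 1.85x, more than the 1.33x that a 6 to 8 multiplier change would give by itself, so something else in the changes is probably contributing. The thread doesn't pin down what.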

ID: 52270 · Rating: 0
Dusty

Joined: 1 Mar 08
Posts: 41
Credit: 2,667,354
RAC: 0
Message 52272 - Posted: 5 Apr 2008, 15:33:52 UTC - in response to Message 52268.  
Last modified: 5 Apr 2008, 15:48:34 UTC

Wow, so much to learn... Many thanks to everyone for helping answer my myriad questions regarding system performance!!!

I'm running two WU's right now and have about 3 hours till they are completed. Then I'll see what credit I received for them.

Yes, I was considering removing the other 3 sticks of 4-gig RAM; it's just that I don't have another FBDIMM-capable motherboard laying around....However, if it's somehow slowing me down leaving them in there, then I'll remove them!

Please see my rather lengthy post on Sandra testing this morning. I got some significant improvements in CPU performance; although memory bandwidth due to FSB is still a problem. I'm still trying to figure out how to increase FSB in my BIOS. I think it's the Ratio CMOS setting, and that's now set to the max of 8 (it was at 6). Actually, according to Sandra's Computer Overview section, my FSB is running at 1.33Ghz. Why I'm still getting an FSB-limited error under Memory Bandwidth testing is very confusing.



ID: 52272 · Rating: 0
Dusty

Joined: 1 Mar 08
Posts: 41
Credit: 2,667,354
RAC: 0
Message 52275 - Posted: 5 Apr 2008, 16:42:35 UTC - in response to Message 52268.  
Last modified: 5 Apr 2008, 16:56:05 UTC

I just noticed that my memory Bandwidth Efficiency % in Sandra is only 52.48!!! The program mentioned the risks of sharing bandwidth with the on-board VGA adapter (just like Paul D. Buck mentioned earlier in this thread), so when the new PCIExpress-2.0 card arrives Monday it will be interesting to see how it impacts the memory bandwidth efficiency. I was really shocked to see how low it really is on my machine!

I removed two banks of 4G FBDIMMs, making sure they were not Slots 0 & 1. My Bandwidth Efficiency dropped to 37.03%!!

So, even though XP Pro/32 cannot utilize anything above 4 gig, apparently the system can use the DIMMs to increase bandwidth. Interesting!!!


ID: 52275 · Rating: 0
Ingleside

Joined: 25 Sep 05
Posts: 107
Credit: 1,514,472
RAC: 0
Message 52277 - Posted: 5 Apr 2008, 17:18:08 UTC - in response to Message 52272.  

Wow, so much to learn... Many thanks to everyone for helping answer my myriad questions regarding system performance!!!

I'm running two WU's right now and have about 3 hours till they are completed. Then I'll see what credit I received for them.

Yes, I was considering removing the other 3 sticks of 4-gig RAM; it's just that I don't have another FBDIMM-capable motherboard laying around....However, if it's somehow slowing me down leaving them in there, then I'll remove them!

Well, if the system has only 1 stick, the memory bandwidth should drop to half (or possibly a quarter) of what it is now, meaning much worse than currently. If it doesn't drop, that at least indicates win32 is only using 1 memory stick...



Please see my rather lengthy post on Sandra testing this morning. I got some significant improvements in CPU performance; although memory bandwidth due to FSB is still a problem. I'm still trying to figure out how to increase FSB in my BIOS. I think it's the Ratio CMOS setting, and that's now set to the max of 8 (it was at 6). Actually, according to Sandra's Computer Overview section, my FSB is running at 1.33Ghz. Why I'm still getting an FSB-limited error under Memory Bandwidth testing is very confusing.

Taking a look at the manual, wow, 16 memory slots, 4 channels, so it should at least in theory have a ton of memory bandwidth...
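
As a rough check on that "in theory" figure, peak bandwidth can be estimated from transfer rate and bus width. This is a back-of-the-envelope sketch, assuming a 64-bit (8-byte) data path for each memory channel and for the front-side bus, DDR2-667 at 667 million transfers per second, and a 1333 MT/s FSB. The function name is made up, and real-world throughput will be lower than these peaks.

```python
def peak_bandwidth_gb_s(mega_transfers_per_s, bus_bytes=8, channels=1):
    """Theoretical peak in GB/s: transfers/s x bytes per transfer x channels."""
    return mega_transfers_per_s * 1e6 * bus_bytes * channels / 1e9

one_channel = peak_bandwidth_gb_s(667)                # one DDR2-667 channel
four_channels = peak_bandwidth_gb_s(667, channels=4)  # all four channels populated
one_fsb = peak_bandwidth_gb_s(1333)                   # a single 1333 MT/s FSB

print(f"1 channel:  {one_channel:.1f} GB/s")
print(f"4 channels: {four_channels:.1f} GB/s")
print(f"1 FSB:      {one_fsb:.1f} GB/s")
```

Under these assumptions a single FSB peaks at about half of what four memory channels could deliver, which might be what Sandra's "FSB limited" note is getting at, though that's a guess.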

Hmm, with 4 sticks, the optimal is to put one in each "channel". Make sure they're in "DIMM_00", "DIMM_10", "DIMM_20", "DIMM_30"; it's likely easy to get it wrong... Hmm, I would guess the BIOS "System Memory Information" will show where each stick is placed?

In BIOS, on "NorthBridge Chipset Configuration", would guess the optimal is:

"MCH Branch Mode" - Interleaving

"*** sparing" - disabled

"Branch 0/1" - enabled

"Rank Interleaving" - good question... Hmm, not sure if 1:1 is best here or not, with 2 sticks in each "branch"... You'll have to test this...




"I make so many mistakes. But then just think of all the mistakes I don't make, although I might."
ID: 52277 · Rating: 0
Paul D. Buck

Joined: 17 Sep 05
Posts: 815
Credit: 1,812,737
RAC: 0
Message 52278 - Posted: 5 Apr 2008, 17:19:22 UTC

You can also get odd results in the case of using "oversized" RAM sticks ... in other words, by using the 4G sticks the system is fumbling around confused by the "excess" ... it has been so long since I have struggled with these issues that I cannot offer more.

You COULD also try over at the SAH boards though you are interested in RAH processing. Some of the Over-Clocking crowd over there may be able to offer more sage advice as to settings ...

Though it is limiting me as to projects in the long run, and I am still struggling with some issues with the new system ... I am sure glad I am moving to Mac dominance in this house ... far fewer odd issues ...

Even better, my brother is getting a half a ton of old PC parts I have been lugging about for like forever ...
ID: 52278 · Rating: 0
Ingleside

Joined: 25 Sep 05
Posts: 107
Credit: 1,514,472
RAC: 0
Message 52280 - Posted: 5 Apr 2008, 17:55:01 UTC - in response to Message 52278.  

You can also get odd results in the case of using "oversized" ram sticks ... in other words, but using the 4G sticks the system is fumbling around confused by the "excess" ... it has been so long since I have struggled with these issues that I cannot offer more.

According to the manual, the board supports up to 128 GB of memory, with 8 GB DIMMs, but it's possible win32 gets confused by so much memory...


Long time since last, Paul. :)


"I make so many mistakes. But then just think of all the mistakes I don't make, although I might."
ID: 52280 · Rating: 0
Dusty

Joined: 1 Mar 08
Posts: 41
Credit: 2,667,354
RAC: 0
Message 52282 - Posted: 5 Apr 2008, 18:04:57 UTC - in response to Message 52277.  
Last modified: 5 Apr 2008, 19:04:22 UTC

I verified they are in DIMM_00, etc. I've been changing BIOS settings one at a time, rebooting, then re-running Sandra. Branch Interleaving gives the best performance regarding bandwidth efficiency, and sparing is disabled. As for Rank Interleaving, I tried all three choices and 4:1 was the best, but even the worst didn't lower my bandwidth efficiency by more than 10%. As of right now, it's showing 52% efficient and that's the best I've been able to reach.

It would be interesting swapping out all 4 4-gig DIMMS and replacing them with 1-gig DIMMS, but that would be an expensive test... I think I'll wait for the video card to arrive and see how that improves bandwidth.

I've been rebooting the machine so often this morning that I question the validity of the last two results posted for that machine. Rosetta says the claimed credit was approx 15, but the granted credit was 32. I wonder if that's because I boosted the Ratio CMOS Setting from 6 to 8, and now the CPU is performing much better even though memory bandwidth is still suffering. The Rosetta benchmarks were based on the CPU before I made the changes...

Now I'm back to running 6 processes. I noticed that whenever I change the number of CPUs available, Rosetta re-runs the benchmark test. We'll see what this next batch shows in a few hours. I've just about changed every BIOS setting I can find, so I think I've run out of options there to improve memory bandwidth efficiency.

Interestingly enough, I just finished running Sandra on my two other 2.4 Ghz Quad machines. They both have 4 gigs of DDR800 RAM installed (so no excess RAM for XP Pro32 to be confused over), and both have PCI Express V1 video cards (Intel D975XBX2KR Mobos). They both show memory bandwidth efficiencies of 56 and 58%. So, it seems that my memory bandwidth deficiencies are not limited to the Dual Xeon machine alone. Granted credits for those two machines are in the low 50's.

My head hurts.... ;o)


ID: 52282 · Rating: 0
Dusty

Joined: 1 Mar 08
Posts: 41
Credit: 2,667,354
RAC: 0
Message 52285 - Posted: 5 Apr 2008, 20:56:28 UTC - in response to Message 52280.  

WOOOOOOOTTT!

Ok, the last series of 8 WU's were in the 10ksec range and had granted credits in the mid-50s. Much better than the 10-15 I was getting before!!!

Thanks to everyone for all the great advice. I learned a whole lot about CPU and memory performance, not to mention all those pesky BIOS settings that can really cripple a system.

I'm still looking forward to seeing if my memory bandwidth efficiency improves with the addition of a video card instead of using the MB video, but otherwise I'm much happier with the results!

ID: 52285 · Rating: 0



©2024 University of Washington
https://www.bakerlab.org