Still ignoring 64 bit users?

Mats Petersson

Joined: 29 Sep 05
Posts: 225
Credit: 951,788
RAC: 0
Message 31052 - Posted: 13 Nov 2006, 14:52:28 UTC - in response to Message 31022.  

One of the reasons is that the Athlon does not deal well with the x87 stack, while Core 2 does as well with x87 as with SSE2 (Athlon x87 instructions are slower than Athlon SSE2).

This is why you see a 64-bit performance improvement on Athlon and don't see it on Core 2: it exposes a weakness of the Athlon, and was marketed as a "64-bit improvement"! Good job from the 3 marketeers ;-)


Under what conditions? x87 may be slower than SSE[2] in some circumstances, but it depends on the exact conditions you test under, and on what dependencies are in the code. I don't agree at all that, as a general rule, all 64-bit "gains" come purely from switching from x87 to SSE.

Now, if you compare the performance improvement of "64 bits" with the performance improvement of multi-core, it is pretty clear that HyperThreading, then multi-core, was the right way to go. 64 bits is irrelevant for workloads the size of Rosetta and SETI unless you have 64 processors to fill the memory with 64 workloads. (Duplicating threads multiplies the memory size.)

I hope the K8L will fix the x87 floating-point glass jaw of the K8, and that we will see less 64-bit marketing stuff on performance-related topics! 64 bits is good for memory addressing; the rest is pure marketing! Your CPU has had a 128-bit instruction set for a long time: it is called SSE!

This posting only engages me; it reflects experiments I did myself at home, and my employer is not responsible for any of these statements. I am just fed up with all of this #@!$!$#!@$ about 64 bits.


I agree that there are LOADS of $$%"£ going on about both SSE _and_ 64-bit optimizations that are actually not attributed correctly... Which is why I've said from the beginning of my discussions (going back several months) on "optimizing rosetta" that one has to look at what can be done and what benefit it has, not just assume that compiling for 64-bit or SSE, or some other "magic pill", will help. Optimization starts with understanding what the bottleneck is and fixing it.



The common mistake will be to run several instances of the same program, duplicating the original set of data.
In the case of SETI and Rosetta, you can open the data set read-only and share it among all the processors, reducing the memory traffic dramatically and avoiding L2 or L3 cache sync issues.
Of course, for marketing reasons, multiple instances of the workload will be pushed by the companies with several memory controllers, but this is not the optimal algorithm; read-only shared memory buffers are the way to go for protein unfolding and pattern matching. Demo coming soon on SETI.

Who?


Great, but there is one flaw: Rosetta, as opposed to for example SETI, works on a base-configuration that is read once, and then modified stepwise to find the "answer". This means that the data is only the same for a very brief period of time, then it starts changing.

Whilst there are techniques to deal with sharing common data and then splitting when it changes (using copy-on-write methods), I very much doubt that it's going to have any beneficial role for rosetta.
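To make the copy-on-write point concrete, here is a minimal sketch in Python. It is only an illustration, not how Rosetta or any operating system implements it: the class `CopyOnWriteView` and the three-element "base configuration" are made up for the example. Two views share one buffer until the first write, after which the writer holds a private copy and nothing is shared any more.

```python
class CopyOnWriteView:
    """Hypothetical view of shared data that copies itself on the first write."""

    def __init__(self, shared):
        self._data = shared      # initially only a reference to the shared list
        self._private = False    # becomes True once this view owns its own copy

    def read(self, i):
        return self._data[i]

    def write(self, i, value):
        if not self._private:
            self._data = list(self._data)  # first write: copy, then diverge
            self._private = True
        self._data[i] = value

    def shares_storage_with(self, other):
        return self._data is other._data


base = [1.0, 2.0, 3.0]                     # made-up "base configuration"
a = CopyOnWriteView(base)
b = CopyOnWriteView(base)

shared_before = a.shares_storage_with(b)   # both views still point at base
a.write(0, 9.5)                            # instance a takes its first step
shared_after = a.shares_storage_with(b)    # a now owns a private copy
```

If every instance modifies its data almost immediately, as described above for Rosetta, each one ends up with its own copy anyway, so the sharing saves very little.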

--
Mats
ID: 31052 · Rating: 0
ebahapo
Joined: 17 Sep 05
Posts: 29
Credit: 413,302
RAC: 0
Message 31055 - Posted: 13 Nov 2006, 15:27:50 UTC - in response to Message 31022.  

I think working on the threading is much more important than the 64bits side.

BOINC does not support multi-threaded applications. In order to guarantee that several projects can run alongside each other on a multi-processor/multi-core system, each one must not run more than one thread at a time.

I suggested before to the BOINC developers to allow some projects to use all barrels on such systems, especially those with a need for results soon, such as Folding, but they said that it would require a major rework in the client that they are not willing to do at the moment.

HTH


ID: 31055 · Rating: 0
Profile skgiven
Joined: 7 Jun 06
Posts: 9
Credit: 642,742
RAC: 0
Message 31291 - Posted: 17 Nov 2006, 10:06:16 UTC

As an IT Consultant, and systems builder, I strongly recommend that BOINC is redesigned around multi-core 64-bit chips. ALL NEW Processors are 64-bit and ALL NEW Processors are Multicore. In another 2 or 3 years, there will be few computers with single-core x86 processors, and the project will start to wane.

ID: 31291 · Rating: -1
Mats Petersson

Joined: 29 Sep 05
Posts: 225
Credit: 951,788
RAC: 0
Message 31293 - Posted: 17 Nov 2006, 10:47:39 UTC - in response to Message 31291.  

As an IT Consultant, and systems builder, I strongly recommend that BOINC is redesigned around multi-core 64-bit chips. ALL NEW Processors are 64-bit and ALL NEW Processors are Multicore. In another 2 or 3 years, there will be few computers with single-core x86 processors, and the project will start to wane.


As an IT consultant, you should also understand that running the same application multiple times with different workloads is comparable, performance-wise, to running a single copy of the application with multiple threads - at least as long as the different instances can't make use of shared data. The latter is the case for Rosetta. As I haven't studied how other BOINC projects work, I don't know if this is also true for other projects.

To explain further: Rosetta uses a base-configuration that is sent out with a different random seed (starting point for random number generation) for each work-unit. It then starts with the base configuration and randomly "rearranges" the molecular structure. So the starting point is the same, but once it's gone one step from the starting point, the molecular structure is different for each instance, so there's no data to be shared between the two instances.
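As a toy illustration of that divergence (not Rosetta's actual code), here is a sketch in Python. The function `perturb`, the list of eight zeros standing in for a molecular structure, and the seed values are all invented for the example. Each "work-unit" starts from the same base but uses its own seed for the random rearrangement.

```python
import random


def perturb(base, seed, steps=1):
    """Return a copy of base after `steps` random moves driven by `seed`."""
    rng = random.Random(seed)       # each work-unit gets its own generator
    conf = list(base)               # start from the common base configuration
    for _ in range(steps):
        i = rng.randrange(len(conf))        # pick a random position
        conf[i] += rng.uniform(-1.0, 1.0)   # and nudge it by a random amount
    return conf


base = [0.0] * 8                    # stand-in for the shared starting structure
unit_a = perturb(base, seed=1)      # first work-unit after one step
unit_b = perturb(base, seed=2)      # second work-unit after one step
```

The same seed always reproduces the same result, but two different seeds give different structures after the very first step, which is why there is nothing left to share.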

As has been discussed many times, compiling something for 64-bit instead of 32-bit will not, in and of itself, always make the application run faster. For some types of applications it does (in particular those that make frequent calls to small functions, as the 64-bit (x86) architecture allows more parameters to be passed from one function to another in registers rather than on the stack). However, Rosetta uses fairly lengthy functions with heavy inlining, which means that each function runs for quite some time, and calls to different functions are a minute component in the overall time used by the application.

--
Mats
ID: 31293 · Rating: 1
Profile skgiven
Joined: 7 Jun 06
Posts: 9
Credit: 642,742
RAC: 0
Message 31297 - Posted: 17 Nov 2006, 13:54:16 UTC - in response to Message 31293.  
Last modified: 17 Nov 2006, 14:01:05 UTC

Why then does your program always use 50% of my processing power on my 64-bit dual processors and not 100% as with my 32-bit systems? Also, what proportion of users would know they could run BOINC two or more times, let alone how to? I tried and it did not work; it just opened up the same Manager window twice, so I left it at that.
ID: 31297 · Rating: 0
Mats Petersson

Joined: 29 Sep 05
Posts: 225
Credit: 951,788
RAC: 0
Message 31300 - Posted: 17 Nov 2006, 15:02:20 UTC - in response to Message 31297.  

Why then does your program always use 50% of my processing power on my 64-bit dual processors and not 100% as with my 32-bit systems? Also, what proportion of users would know they could run BOINC two or more times, let alone how to? I tried and it did not work; it just opened up the same Manager window twice, so I left it at that.


"My program"? I happen to have the source code for Rosetta, but that doesn't make it "mine"... ;-)

BOINC, if it's configured correctly, will use all available processors to run one instance per processor (i.e. if you have set "On multiprocessors, use at most 8 processors"), and should use 100% of all available CPUs (up to 8). This is a setting in general preferences.

If it's showing 50% in Windows, then there's one of two problems:
1. BOINC doesn't know that you've got more than one core, for some reason.
2. You've set BOINC preferences to only use one processor.

Since https://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=351407 says that you have 2 CPUs, we can rule out #1 above.
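The arithmetic behind the 50% figure can be sketched in a few lines of Python. This is a simplified model, not BOINC's real scheduler: the function names are made up, and it assumes each application instance is single-threaded and keeps exactly one CPU busy.

```python
def instances_run(detected_cpus, max_cpus_pref):
    """Instances started: one per CPU, capped by the 'use at most N' preference."""
    return min(detected_cpus, max_cpus_pref)


def cpu_usage_percent(detected_cpus, max_cpus_pref):
    """Share of the whole machine kept busy, as Windows would report it."""
    busy = instances_run(detected_cpus, max_cpus_pref)
    return 100.0 * busy / detected_cpus


limited = cpu_usage_percent(detected_cpus=2, max_cpus_pref=1)   # preference set to 1
unlimited = cpu_usage_percent(detected_cpus=2, max_cpus_pref=8) # preference set to 8
```

With two CPUs detected and the preference set to 1, only one instance runs and half the machine sits idle; raising the preference to anything from 2 upwards fills both CPUs.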

You don't need to run BOINC twice, it's only the application (Rosetta, SETI, etc) that needs to run multiple times - and the user shouldn't need to do anything in particular to make this happen - at least, I think it automatically sets to use multiple processors...

--
Mats

ID: 31300 · Rating: 0
ebahapo
Joined: 17 Sep 05
Posts: 29
Credit: 413,302
RAC: 0
Message 31303 - Posted: 17 Nov 2006, 15:37:51 UTC - in response to Message 31293.  

However, Rosetta uses fairly lengthy functions with heavy inlining, which means that each function runs for quite some time, and calls to different functions are a minute component in the overall time used by the application.

... when the extra registers of AMD64 could come in handy to avoid expensive stack operations. ;-)

But we'll see about that, as I promised.

ID: 31303 · Rating: 0
ebahapo
Joined: 17 Sep 05
Posts: 29
Credit: 413,302
RAC: 0
Message 31304 - Posted: 17 Nov 2006, 15:43:05 UTC - in response to Message 31291.  
Last modified: 17 Nov 2006, 15:44:20 UTC

As an IT Consultant, and systems builder, I strongly recommend that BOINC is redesigned around multi-core 64-bit chips. ALL NEW Processors are 64-bit and ALL NEW Processors are Multicore. In another 2 or 3 years, there will be few computers with single-core x86 processors, and the project will start to wane.

In a way, it already is. Depending on your preference settings, BOINC will run as many applications as there are cores available. And if one signed up for more than one BOINC project, it can run different applications at the same time. Otherwise, it will run several instances of the same application, but crunching different WUs.

But I am not disagreeing with you. Some projects which require the completion of a WU to generate the next WU could benefit from the support for multi-threaded applications, taking over all cores available to decrease the turn-around time.

And BOINC has mechanisms to make sure that other projects would get their fair share of the CPU which could accommodate such applications.

Perhaps you'd like to post to the BOINC developers mailing-list.

HTH
ID: 31304 · Rating: 0
Profile skgiven
Joined: 7 Jun 06
Posts: 9
Credit: 642,742
RAC: 0
Message 31306 - Posted: 17 Nov 2006, 16:11:51 UTC - in response to Message 31300.  

[quote] BOINC, if it's configured correctly, will use all available processors to run one instance per processor (i.e. if you have set "On multiprocessors, use at most 8 processors"), and should use 100% of all available CPUs (up to 8). This is a setting in general preferences. [/quote]

Are you referring to BOINC Manager Ver: 5.4.11? I cannot see a General Preferences configuration applet/panel/screen or other tool. I have also looked in Advanced Options. Perhaps I have a different version or your version is adopted.

Would you tell me how to set BOINC preferences to use all processors?
Thanks,
ID: 31306 · Rating: 0
Mats Petersson

Joined: 29 Sep 05
Posts: 225
Credit: 951,788
RAC: 0
Message 31307 - Posted: 17 Nov 2006, 16:20:15 UTC - in response to Message 31306.  

[quote] BOINC, if it's configured correctly, will use all available processors to run one instance per processor (i.e. if you have set "On multiprocessors, use at most 8 processors"), and should use 100% of all available CPUs (up to 8). This is a setting in general preferences. [/quote]

Are you referring to BOINC Manager Ver: 5.4.11? I cannot see a General Preferences configuration applet/panel/screen or other tool. I have also looked in Advanced Options. Perhaps I have a different version or your version is adopted.

Would you tell me how to set BOINC preferences to use all processors?
Thanks,


No, it's part of the preferences you can set on the web-site, not in BOINC itself.

If you click on the [ Home ] link at the top of this page, you'll get to the Rosetta home-page. There you'll find a link to "Your account". In this, you can select "View or Edit" for "General preferences" - once you click on the "view or edit", you'll need to choose "Edit preferences" if your setting for number of processors on multiprocessors is less than two. Select some higher number (such as 8) and you'll be ready to crunch on all processors.

[As a step here, you may need to log in - I don't because I always select "automatically log me in whenever I connect" whenever I can on whatever system I use - I know, it reduces the security against someone else using my system to do nasty things that I'll be held responsible for, but seeing as the security in the building is pretty good, I don't fear too much - I'd be more worried about the $xxxxx amount of computer hardware sitting around, if that was the case...]

--
Mats
ID: 31307 · Rating: 2
Profile skgiven
Joined: 7 Jun 06
Posts: 9
Credit: 642,742
RAC: 0
Message 31308 - Posted: 17 Nov 2006, 16:48:09 UTC - in response to Message 31307.  
Last modified: 17 Nov 2006, 16:49:47 UTC

Excellent, that's it sorted. I vaguely remember setting this to 1 about 9 months ago, and I had forgotten where I did it.
I was using a Dell 3.2GHz computer with fans that went bonkers when the CPUs maxed out. Even if I stopped doing any processing, the fans kept running as fast as possible for hours - the noise was more like a jet aircraft than a vacuum cleaner!

Thanks, and I now accept that the code does not need to be altered.
ID: 31308 · Rating: 0
Profile River~~
Joined: 15 Dec 05
Posts: 761
Credit: 285,578
RAC: 0
Message 31315 - Posted: 17 Nov 2006, 17:24:33 UTC - in response to Message 31306.  

Would you tell me how to set BOINC preferences to use all processors?
Thanks,


General preferences are set on the project website and are picked up by your computers when they ask for work, or turn in completed work. This makes it possible to change settings on all your computers in one go, though it can take a while for the change to propagate to them.

If you support more than one project, it is suggested that you always make changes to general prefs on the same project, and allow BOINC to propagate the changes. Making changes on more than one project can cause confusion for a few days as different versions of your settings reach each of your machines.

So, go to the main page of the project you are going to use for this.

Click on the link to "Your account".

Click on "View/edit general preferences"

Click on "edit preferences"

Set the max number of CPUs to use (for example, for a dual-core machine with HT on each core you'd enter 4, and so on).

Click (can't remember) save? whatever button it is to commit the changes to the website.

If you are patient, you have finished - wait for the change to reach your machine.

If you are in more of a hurry, go to that machine, project tab, highlight the project that you used to set gen prefs. Click update.

Note that if you do successfully increase the number of CPUs used, the first thing BOINC does is to re-run the benchmarks. If all your projects seem to stop, look in the message tab and hopefully you will see the message "Number of usable CPUs has changed - running benchmarks"
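If you are unsure what number to enter in the max CPUs step above, the count is simply cores times hardware threads per core. A small Python sketch of that arithmetic follows; `logical_cpus` is a made-up helper, while `os.cpu_count()` is the standard-library call that reports what the operating system sees on the machine where you run it (it may return None if the count cannot be determined).

```python
import os


def logical_cpus(cores, threads_per_core):
    """Logical CPUs = physical cores times hardware threads per core."""
    return cores * threads_per_core


# The example from the steps above: dual core, HyperThreading on each core.
value_to_enter = logical_cpus(cores=2, threads_per_core=2)

# What the operating system reports on this particular machine.
seen_by_os = os.cpu_count()
```

Entering a number larger than the machine really has does no harm, since only as many instances run as there are CPUs available.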

Good luck
ID: 31315 · Rating: 0
FluffyChicken
Joined: 1 Nov 05
Posts: 1260
Credit: 369,635
RAC: 0
Message 34209 - Posted: 6 Jan 2007, 15:36:58 UTC

Any update from the people looking at 64bit ports ?
Team mauisun.org
ID: 34209 · Rating: 0
ebahapo
Joined: 17 Sep 05
Posts: 29
Credit: 413,302
RAC: 0
Message 34218 - Posted: 6 Jan 2007, 17:00:55 UTC - in response to Message 34209.  

Any update from the people looking at 64bit ports ?

On my part, it's going slow because Rosetta was written with a lot of 32-bit assumptions even in non-scientific code. It's still too early to say more, but it'll be a while...

ID: 34218 · Rating: -1
Profile Who?

Joined: 2 Apr 06
Posts: 213
Credit: 1,366,981
RAC: 0
Message 34266 - Posted: 7 Jan 2007, 8:57:22 UTC - in response to Message 34218.  

Any update from the people looking at 64bit ports ?

On my part, it's going slow because Rosetta was written with a lot of 32-bit assumptions even in non-scientific code. It's still too early to say more, but it'll be a while...


64 bits on x86 is nice... it increases the memory footprint of your instruction stream. It is nice for AMD, because it fixes their x87 issue on 32 bits by using Intel's SSE2 ;)
For Core 2, it does not make any difference, as it should not, because Core 2 FP calculation is done by 2 x 128-bit execution units...

Why bother with 64 bits on Rosetta? SSE3 or SSE4 makes more sense, even for AMD!

who?
ID: 34266 · Rating: 0



©2024 University of Washington
https://www.bakerlab.org