Would a PPU client be possible, down the road?

Message boards : Rosetta@home Science : Would a PPU client be possible, down the road?

mage492

Joined: 12 Apr 06
Posts: 48
Credit: 17,966
RAC: 0
Message 14165 - Posted: 20 Apr 2006, 8:32:50 UTC

I was looking at the new AGEIA PhysX physics processor (apparently the first of a new kind of specialized card that handles movement, projectiles, etc., like a GPU handles graphics), and I was wondering if this could possibly be used for crunching? If it is designed to model physics, this should be right up its alley, I would think. Then again, I haven't seen anyone else mention it, yet, so maybe I'm missing something.

I've seen this specific card used in a gaming context (there's a demo video floating around), and the speed of the card is unbelievable. What if this kind of card could be harnessed for scientific calculations? From what I've been reading, physics cards seem likely to be installed on a lot of gaming computers in the near future, so maybe it's a usable resource.

So, what do people think? Feasible, or am I completely off my rocker?
"There are obviously many things which we do not understand, and may never be able to."
Leela (From the Mac game "Marathon", released 1995)
Tribaal
Joined: 6 Feb 06
Posts: 80
Credit: 2,754,607
RAC: 0
Message 14166 - Posted: 20 Apr 2006, 9:23:14 UTC

Well, I believe the effort will go to GPUs (graphics cards) first, since they are more widely available.

This is all really distant future, though.

I'm of course not part of the dev team, so don't take my word for it :)

- trib'
FluffyChicken
Joined: 1 Nov 05
Posts: 1260
Credit: 369,635
RAC: 0
Message 14167 - Posted: 20 Apr 2006, 9:27:21 UTC

Probably as feasible as using the GPU itself. As long as they can do the necessary mathematical instructions, it should be possible.

But it would take some serious studying and hard work to port/code everything to run on them (the BOINC client itself will not need to be ported, just the science apps, since the BOINC client will still run on the CPU ;-) ).
AFAIK, a credit/points/work-done scheme will need to be evaluated, but I remember reading that the groundwork is already in the BOINC client for that.

Though the GPU would be the first I would target (large audience already available), then the Xbox 360 (not sure of audience size, but quite a lot of crunching power available, plus the novelty value), then the PS3 (if it ever gets released), and then the PhysX processor (as I also remember the GPU companies may start integrating more of that function into their cards).

Maybe the best way to get a port would be to contact the PhysX company and ask them if they could do it, as that would generate publicity among a lot of people and give them a wider audience (and they know what they're doing with their product, and I'm sure Rosetta wouldn't mind them having the code to do that).
Team mauisun.org
Tribaal
Joined: 6 Feb 06
Posts: 80
Credit: 2,754,607
RAC: 0
Message 14169 - Posted: 20 Apr 2006, 9:49:27 UTC

As far as the Xbox 360 is concerned, I'll try to do what I did with the Xbox: install a Linux OS on it and then use it to crunch :)

Of course, that wastes quite a lot of computing power since it's not using the graphics card :(
But hey, a few cycles is better than none ;)

- trib'
FluffyChicken
Joined: 1 Nov 05
Posts: 1260
Credit: 369,635
RAC: 0
Message 14175 - Posted: 20 Apr 2006, 13:49:03 UTC - in response to Message 14169.  

We don't have a Linux PPC app, though? And three 3.2 GHz cores wouldn't waste too much (though I'm not actually sure how efficient it would be at using all that power; since it's a gaming/media console, I never really got too interested in them).
Team mauisun.org
Housing and Food Services
Joined: 1 Jul 05
Posts: 85
Credit: 155,098,531
RAC: 0
Message 14178 - Posted: 20 Apr 2006, 14:49:20 UTC

I actually sent off an email to the Ageia folks on Monday asking that exact question. I'll post here if/when they get back to me.

-Ethan
mage492

Joined: 12 Apr 06
Posts: 48
Credit: 17,966
RAC: 0
Message 14230 - Posted: 21 Apr 2006, 5:59:31 UTC

I'm sure it's pretty low on the priority list (I hadn't even heard of these cards until a week ago), but it would be interesting to see down the road. Heck, maybe in a year or two, I'll be able to afford one! *looking at the abandoned computer I'm using as a full-time cruncher*

Regarding the company coding it themselves, I'm not sure how one could make a business case for it. Their target market (gamers) will buy it whether or not it'll run Rosetta. How many additional cards would they sell by porting a program to it that a lot of people have never heard of? Still, if they went along with it, I'm sure they could do a good job (since they know the card's exact capabilities). It'll be interesting to hear back.
"There are obviously many things which we do not understand, and may never be able to."
Leela (From the Mac game "Marathon", released 1995)
BennyRop

Joined: 17 Dec 05
Posts: 555
Credit: 140,800
RAC: 0
Message 14243 - Posted: 21 Apr 2006, 7:44:29 UTC

Look at what's happening over at Folding@Home with their attempts to get a GPU client ready (which gets reworked every time ATI and nVidia come out with new chips with even better abilities), and at the fact that they've spent 6? months working with at least one of the physics add-in co-processors, and I'd bet on our having to wait quite a while before seeing a usable card and client here. FaH has the advantage of various cores that are basically set in stone, unlike here, where we're constantly upgrading the client.

If the card is released with clients for various DC projects and can perform at 7x the rate of the system without the card, then it'll be easy to sell to those of us with limited room (and desires for larger pharms). Given a list of the games it improves, some of us would give those a try as well. Perhaps it'd be just as easy to convince some of those special game players to give DC projects a try, especially when they find out that their single machine cranks out WUs 7+ times as fast as a normal system. :)

If the performance ends up being less than 7x a normal client (e.g. 3.5x or 2x), then the price would have to be in line with the smaller performance gain to make it tempting.


FluffyChicken
Joined: 1 Nov 05
Posts: 1260
Credit: 369,635
RAC: 0
Message 14252 - Posted: 21 Apr 2006, 9:54:47 UTC - in response to Message 14230.  

It's called initial advertising. They may get Rosetta to work on their card. Rosetta is used by universities and institutes, let alone the WCG and BOINC platforms. So if one is released, they would have the opportunity to get their name plastered on many web pages associated with these AND get their brand name into the universities/companies. So although these cards are probably going to become commonplace in PCs (I think that's an Intel idea), they would have the bigger brand name to start with.


As an aside, I also remember (following on from something earlier) that it was ATI saying they were going to offload that kind of work onto their graphics card when it was idle or not using its full capabilities for graphics... I think it was with the X1xxx series cards, and then of course Nvidia mentioned something similar. Given that they would make compilers available, maybe it'll get easier. I would have thought both ATI and Nvidia would be looking to diversify into a 'first to do this' product ;-)

Be interesting to see.
Team mauisun.org
Housing and Food Services
Joined: 1 Jul 05
Posts: 85
Credit: 155,098,531
RAC: 0
Message 14275 - Posted: 21 Apr 2006, 16:25:43 UTC - in response to Message 14252.  
Last modified: 21 Apr 2006, 16:26:39 UTC

I asked if the card can do the types of general-purpose calculations R@H uses (single-precision floats, as stated in another thread), and if so, how much faster it can do them than a modern CPU.

If you can give the same commands to the PPU that you can to a CPU, then it shouldn't be a hard port. If it's more specialized, like a GPU, then it may not be worth the trouble.

While it would be great if it worked and a percentage of users bought cards, it would still be a small percentage of work. I pictured these being used by the Rosetta folks in house.

These cards are PCI, so you could potentially put 4 or 5 into a machine. If each card is 10x faster than the CPU, all of a sudden you have the equivalent of 50 CPUs where before you may have only been able to fit 10 in the rack. For a group that's out of rack space, this could help them out.

-E
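
A quick sketch of the rack-density arithmetic in the post above. The function name and the `machines` parameter are made up for illustration, the 10x speedup is the post's own hypothetical rather than a measured figure, and the host CPU itself is left out of the count, as it is in the post.

```python
def cpu_equivalents(cards_per_machine, speedup_per_card, machines=1):
    """Rough CPU-equivalent throughput from add-in cards alone.

    Assumes every card reaches the full speedup and that nothing else
    (bus, memory, host CPU) becomes the bottleneck, which is optimistic.
    """
    return cards_per_machine * speedup_per_card * machines


# Five cards at a hypothetical 10x each, in one machine.
print(cpu_equivalents(5, 10))  # 50
# Four cards instead of five.
print(cpu_equivalents(4, 10))  # 40
```

The estimate is only an upper bound: it says nothing about whether the cards can be kept fed with data.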
dgnuff
Joined: 1 Nov 05
Posts: 350
Credit: 24,773,605
RAC: 0
Message 14806 - Posted: 28 Apr 2006, 0:45:52 UTC - in response to Message 14167.  

While the PS3 (Cell) looks incredibly attractive due to the obscenely high flops it offers, when you drill down into the architecture of the Cell, it may be incredibly hard to get that to work for Rosetta.

There are two things I see that may make Rosetta on Cell not an option.

1. Rosetta has a very large working set size (WSS), meaning it references a large amount of data during the computation. Taking a quick peek at one of my systems, I see 65 megs of WSS, i.e. 65 megs of data actually resident in physical memory.

2. The SPEs on the Cell are the coprocessors that get the high flops count; the main CPU that drives the whole enchilada is a very generic PPC architecture. The problem with the SPEs is that they have very little memory: only 256K each. In essence, they have no direct access to main memory; instead, each has a small local memory (which runs at cache speeds, i.e. effectively instant) that you DMA into and out of.

So unless Rosetta can be coded in such a way that a significant hunk of work (several milliseconds) can be done using only 128 K of data (assuming double buffering), then all bets are off.
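
To put rough numbers on the double-buffering constraint above, here is a minimal sketch. It assumes "megs" and "K" mean binary units, and it ignores the space the code and stack would also need in the 256K local memory, so the real data budget would be smaller. The function names are hypothetical and this models only the scheduling idea, not actual Cell DMA calls.

```python
import math

LOCAL_STORE_BYTES = 256 * 1024          # per-SPE local memory, from the post
BUFFER_BYTES = LOCAL_STORE_BYTES // 2   # double buffering: one half computes, one half loads
WORKING_SET_BYTES = 65 * 1024 * 1024    # the 65 MB working set reported in the post


def chunks_needed(working_set_bytes, buffer_bytes):
    """Number of buffer-sized pieces needed to stream the working set once."""
    return math.ceil(working_set_bytes / buffer_bytes)


def double_buffered_schedule(n_chunks):
    """Yield (chunk_to_compute, chunk_to_prefetch) pairs.

    While chunk i is processed in one buffer, chunk i + 1 is transferred
    into the other; the last step has nothing left to prefetch.
    """
    for i in range(n_chunks):
        prefetch = i + 1 if i + 1 < n_chunks else None
        yield i, prefetch


print(chunks_needed(WORKING_SET_BYTES, BUFFER_BYTES))  # 520
print(list(double_buffered_schedule(3)))  # [(0, 1), (1, 2), (2, None)]
```

A single pass over the whole working set would take 520 transfers per SPE under these assumptions, which is why the amount of useful work done per 128 K chunk matters so much.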

Also, they're SIMD processors. Think of running on an x86 with no normal FP at all, SSE being your only option. How much of that four-wide SIMD could we actually use? While I'm on the subject, this is something you're extremely likely to faceplant into when designing a GPU version. Don't say I didn't warn you. :)
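
One way to make the "how much of that four-wide SIMD could we actually use?" question concrete is to count lane utilization. This is only a sketch with a hypothetical function name: it assumes the work comes in groups of identical, independent single-precision operations, and it says nothing about how Rosetta's code is actually structured.

```python
import math

SIMD_WIDTH = 4  # four single-precision lanes, as with SSE


def lane_utilization(independent_ops, width=SIMD_WIDTH):
    """Fraction of SIMD lanes doing useful work.

    `independent_ops` identical operations are packed into `width`-wide
    instructions; lanes left over in the last instruction are wasted.
    """
    if independent_ops <= 0:
        raise ValueError("independent_ops must be positive")
    instructions = math.ceil(independent_ops / width)
    return independent_ops / (instructions * width)


print(lane_utilization(4))  # 1.0
print(lane_utilization(3))  # 0.75
print(lane_utilization(1))  # 0.25
```

Purely scalar code sits at the 0.25 end, and a three-component (x, y, z) operation handled one vector at a time leaves a quarter of the lanes idle unless the data is rearranged.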

Any thoughts from the Rosetta team about this?
XeNO
Joined: 21 Jan 06
Posts: 9
Credit: 109,466
RAC: 0
Message 15548 - Posted: 5 May 2006, 4:03:36 UTC - in response to Message 14275.  

Although my knowledge of hardware architecture is limited, I do have to ask: if all those cards are running on the same PCI bus, are they not limited by the measly 133MB/s data transfer rate of the PCI bus? Considering the average computer gets at least 6.4 (theoretical) GB/s between memory and processor, that seems like it would fall down pretty fast.
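
A back-of-the-envelope sketch of the bandwidth concern above. It uses the two figures from the post (133 MB/s for the shared PCI bus, 6.4 GB/s theoretical for memory), takes 1 GB as 1000 MB, and assumes the bus is split evenly between cards; the function name is hypothetical, and real throughput would be lower than these peak numbers.

```python
PCI_BUS_MB_PER_S = 133.0       # shared PCI bus figure from the post
MEMORY_BUS_MB_PER_S = 6400.0   # 6.4 GB/s theoretical, taking 1 GB = 1000 MB


def per_card_bandwidth(bus_mb_per_s, cards):
    """Even share of a single shared bus, in MB/s per card."""
    if cards <= 0:
        raise ValueError("cards must be positive")
    return bus_mb_per_s / cards


# Five cards on one bus each get roughly a fifth of 133 MB/s.
print(round(per_card_bandwidth(PCI_BUS_MB_PER_S, 5), 1))  # 26.6
# The memory bus is roughly 48x faster than the whole PCI bus.
print(round(MEMORY_BUS_MB_PER_S / PCI_BUS_MB_PER_S, 1))   # 48.1
```

Whether this matters in practice depends on how much data has to cross the bus per unit of computation, which the thread does not establish.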

FluffyChicken
Joined: 1 Nov 05
Posts: 1260
Credit: 369,635
RAC: 0
Message 15569 - Posted: 5 May 2006, 13:45:32 UTC - in response to Message 15548.  

I couldn't tell (read: didn't look too hard) from their website whether "PCI bus" also covers PCI-E and PCI-X, which are different from each other but both have greater bandwidth.

They are designed for gamers, so that they can simply be dropped in and take over the code that is already used by the games (lots of games use their software physics engine, which is why they can do this). Knowing that, I expect it'll come out in PCI and PCI-E (probably x1 up to x4, depending on what they need), but I doubt PCI-X will be targeted. A shame, as it is already in a lot of servers and Macs as well.

I would assume Dell will have them in its XPS and Alienware ranges shortly ;-)
Team mauisun.org




©2024 University of Washington
https://www.bakerlab.org