FPU/SSE2 performance

dcdc
Joined: 3 Nov 05
Posts: 1832
Credit: 119,677,569
RAC: 10,479
Message 21262 - Posted: 27 Jul 2006, 12:43:39 UTC - in response to Message 21257.  

I try to follow:

- How many models are available for a given WU?

I guess this is down to how many random seeds are available - I don't think we need to worry about hitting that limit even if Rosetta gets 10x the users overnight, though!


- If I run a WU for three hours and make, let's say, 10 models, will other users receive the same WU but for other models?

Yeah - there are many thousands of random seeds available for each protein. Your computer will run some of the available models and other computers will run others.

I made a test. Normally I crunch a WU in 3 hours. I changed my preference from the default to 6 hours. I had a WU running, and when it reached a checkpoint, the 'Completion' dropped from 70% to 48%, which is correct because now 'calculation time' + 'to completion' = 6 hours.

But in the end, doing 10 models per WU every 3 hours or 20 models per WU every 6 hours is the same. I mean the contribution is the same, isn't it?

Correct again - the shorter times are only useful for computers that are either unreliable (i.e. Rosetta crashing and corrupting the work unit, losing the work done) or that don't spend enough time on Rosetta to complete the unit by its deadline.
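A quick sanity check on that arithmetic (my own back-of-envelope, nothing project-specific):

    # models per hour is identical under either runtime preference
    print(10 / 3)   # ~3.33 models/hour with a 3-hour WU
    print(20 / 6)   # ~3.33 models/hour with a 6-hour WU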


Is one a better way to crunch? Or is crunching just crunching, whatever the preferences?

The project prefers you to use larger work units if possible, to reduce the bandwidth used on their servers. It also means you'll use less bandwidth yourself, if that's important to you.

HTH
Danny

ID: 21262
[B^S] thierry@home
Joined: 17 Sep 05
Posts: 182
Credit: 281,902
RAC: 0
Message 21263 - Posted: 27 Jul 2006, 13:43:09 UTC

Thanks for your answers.
ID: 21263
Feet1st
Joined: 30 Dec 05
Posts: 1755
Credit: 4,690,520
RAC: 0
Message 21264 - Posted: 27 Jul 2006, 14:40:01 UTC - in response to Message 21251.  

Why are the longer WU's larger in size to download, then?

I'm on dial-up...

Since you are on dial-up, perhaps you are in the best position to know. But my instinct is to say that longer WUs are NOT larger to download. There will be more results to report back, so rather than 100 KB you might send back 200 KB, but the uploads are minor compared to the download sizes.

Since I can download a WU expecting it to run for the default 3-hour time, then update to the project, change my preference to 24 hours, and watch the same WU crunch for 24 hours without downloading anything more, I assert the download is in fact identical.

Thierry
Dr. Baker has explained that most amino acids will take one of 3 orientations. So, if we've got 100 amino acids, there are 3^100 combinations to explore. I'm not clear on exactly what the random numbers represent, but the space to explore on that protein is 3^100. And, as you've seen, some of the CASP proteins are 300 and 400 amino acids long, so 3^300!
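To put rough numbers on that (a back-of-envelope check, not anything from the project):

    # size of the conformation space if each amino acid can take
    # one of 3 orientations
    print(3 ** 100)   # ~5.2e47 combinations for a 100-residue protein
    print(3 ** 300)   # ~1.4e143 for a 300-residue CASP target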

The project team works hard on these longer ones before they ever send them out to us. They work to eliminate as much of that search space as possible, using what they've learned from studying known proteins. This phase of the exploration is where the science can improve the most, and as more and more known proteins become available to learn from, this is where breakthroughs will come. The variance in the WUs they produce, with the narrowed search space, is why testing the WUs on Ralph is important. Think of these large WUs as individually hand-crafted.

So, literally trillions of combinations, and a single model run explores millions of them. There are so many models available that Dr. Baker once said that all of us put together, picking our random numbers, only duplicate each other (i.e. pick the same number) 5% of the time... and I think that was on the smaller proteins before CASP, so I'd expect the rate to be much lower on a large CASP protein.
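For a feel of where a figure like 5% could come from, a birthday-problem-style estimate works; the seed-pool size below is purely hypothetical, since the project hasn't published it here:

    import math

    # chance that a given seed matches at least one of the other k-1 draws,
    # when k seeds are drawn uniformly from a pool of size n
    def dup_rate(k, n):
        return 1 - math.exp(-(k - 1) / n)

    # a ~5% duplication rate implies k/n is roughly 0.05
    print(dup_rate(50_000, 1_000_000))   # ~0.049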

The more of these trillions of combinations that are studied, the better the odds that we've explored near the low energy area, and found our closest prediction for the native structure. This is why Dr. Baker wants more computing power when he's got more proteins to study in a given time.
Add this signature to your EMail:
Running Microsoft's "System Idle Process" will never help cure cancer, AIDS nor Alzheimer's. But running Rosetta@home just might!
https://boinc.bakerlab.org/rosetta/
ID: 21264
[B^S] thierry@home
Joined: 17 Sep 05
Posts: 182
Credit: 281,902
RAC: 0
Message 21265 - Posted: 27 Jul 2006, 14:45:52 UTC

Thank you also for your answer.
ID: 21265
FluffyChicken
Joined: 1 Nov 05
Posts: 1260
Credit: 369,635
RAC: 0
Message 21270 - Posted: 27 Jul 2006, 16:22:09 UTC - in response to Message 21251.  

Why are the longer WU's larger in size to download, then?

I'm on dial-up that isn't connected 24/7, so bandwidth comes at a premium :(


BTW - thanks for keeping the discussion clean so far guys! :D


They're not (meaning the large xxxxxxx.gz files) - they're the same size no matter what runtime you choose. It's pot luck whether you get small or large files, though. We dial-up users just notice it more when we get the 2 MB+ original-style ones instead of the <1.5 MB 'boinc_' ones.
Even some of these 'boinc_' ones have crept up in size, though those seem to be the ...full... ones, so I just abort those downloads and try to get the smaller ones. (I hope they can implement some sort of distribution for dial-up users based on average download speed, like they tried for 1 GB computers.)
Uploads do increase, though, and can often be >500 KB per 24-hour task uploaded. I hope they'll be able to reduce that (the newer 5.5.x series implements compression on upload).
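For anyone curious what upload compression buys, here's an illustrative sketch using plain gzip on a hypothetical result file (this is not BOINC's actual transfer code):

    import gzip

    # result files are largely text, so they compress well
    data = open("result.out", "rb").read()      # hypothetical result file
    packed = gzip.compress(data)
    print(len(data), "->", len(packed), "bytes")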


And I've just added a Pentium 4 Mobile 1.6 GHz (though it's only a touch faster than my Pentium III-M 1 GHz).


Team mauisun.org
ID: 21270
Feet1st
Joined: 30 Dec 05
Posts: 1755
Credit: 4,690,520
RAC: 0
Message 21271 - Posted: 27 Jul 2006, 16:27:03 UTC

Thierry, you're User of the Day!! Congrats on drawing the lucky card.
Add this signature to your EMail:
Running Microsoft's "System Idle Process" will never help cure cancer, AIDS nor Alzheimer's. But running Rosetta@home just might!
https://boinc.bakerlab.org/rosetta/
ID: 21271
Jose
Joined: 28 Mar 06
Posts: 820
Credit: 48,297
RAC: 0
Message 21273 - Posted: 27 Jul 2006, 17:20:27 UTC

BTW
When credit per model comes into effect, more users participating in RALPH will help the system.
ID: 21273
[B^S] thierry@home
Joined: 17 Sep 05
Posts: 182
Credit: 281,902
RAC: 0
Message 21277 - Posted: 27 Jul 2006, 20:54:49 UTC

Thanks Feet1st ;-)
ID: 21277
Divide Overflow
Joined: 17 Sep 05
Posts: 82
Credit: 921,382
RAC: 0
Message 21330 - Posted: 28 Jul 2006, 17:55:55 UTC - in response to Message 21273.  

BTW
When credit per model comes into effect, more users participating in RALPH will help the system.


I’m eager to hear more about this new process. Would the initial WU testing on RALPH provide a touchstone (one could even say Rosetta stone) for interpreting the credit to be awarded for each model produced specific to that WU?

ID: 21330
Keith Akins
Joined: 22 Oct 05
Posts: 176
Credit: 71,779
RAC: 0
Message 21333 - Posted: 28 Jul 2006, 18:27:18 UTC

Seems I heard Dr. Baker say something about possibly benchmarking with an actual WU that would run for one minute of ab initio and count the model steps.

That might be possible, since it would tax both memory access and CPU speed.

What they actually decide remains to be seen.
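The shape of such a benchmark is simple enough to sketch; this is my own illustration of the idea, with run_one_step standing in for a real ab initio model step:

    import time

    # run model steps for a fixed wall-clock budget and count them
    def benchmark(run_one_step, seconds=60):
        steps = 0
        deadline = time.time() + seconds
        while time.time() < deadline:
            run_one_step()
            steps += 1
        return steps   # more steps completed -> faster host for this work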
ID: 21333
Feet1st
Joined: 30 Dec 05
Posts: 1755
Credit: 4,690,520
RAC: 0
Message 21344 - Posted: 28 Jul 2006, 20:03:55 UTC - in response to Message 21330.  
Last modified: 28 Jul 2006, 20:13:37 UTC

Would the initial WU testing on RALPH provide a touchstone (one could even say Rosetta stone) for interpreting the credit to be awarded for each model produced specific to that WU?


Yes - since the proteins and the methods used vary so much, my understanding is that credit per model will be determined for each WU issued, i.e. no single benchmark.

...but Keith is right too; Dr. Baker did say something about having created a benchmark as well. I'm not sure what the purpose behind that was, or whether they've since changed the approach.
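In other words, crediting would reduce to something like the sketch below; the names and shape are my guess at the idea, not the project's actual code:

    # credit_per_model would be calibrated for each WU (e.g. on RALPH)
    def claimed_credit(models_completed, credit_per_model):
        return models_completed * credit_per_model

    print(claimed_credit(20, 1.5))   # 20 models at 1.5 credits each -> 30.0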

Add this signature to your EMail:
Running Microsoft's "System Idle Process" will never help cure cancer, AIDS nor Alzheimer's. But running Rosetta@home just might!
https://boinc.bakerlab.org/rosetta/
ID: 21344
MikeMarsUK
Joined: 15 Jan 06
Posts: 121
Credit: 2,637,872
RAC: 0
Message 21347 - Posted: 28 Jul 2006, 21:07:20 UTC

I prefer the per-workunit approach to the Rosetta benchmark approach, so I hope they go that way. The reason is that large workunits differ from small workunits and stress different parts of the PC (e.g., more memory).

ID: 21347
XS_STEvil
Joined: 30 Dec 05
Posts: 9
Credit: 189,013
RAC: 0
Message 21364 - Posted: 29 Jul 2006, 3:59:44 UTC - in response to Message 21347.  

I prefer the per-workunit approach to the Rosetta benchmark approach, so I hope they go that way. The reason is that large workunits differ from small workunits and stress different parts of the PC (e.g., more memory).


This is pretty well how it would have to be done; otherwise you'd be awarding too much credit for some WUs and too little for others.

The built-in bench WU is more to create a performance matrix, like SETI@home and D2OL had, to tell which types of CPUs do more work per clock cycle, or how memory/L1/L2/L3 cache/etc. may affect performance.

At least I'm going to assume this ;)
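Such a matrix might be nothing more than a table of bench-WU results per host type; everything below is invented purely for illustration:

    # hypothetical matrix: model steps completed in a 1-minute bench WU,
    # keyed by CPU type (all numbers invented for illustration)
    perf_matrix = {
        "Pentium 4 (SSE2)":  480,
        "Pentium III (SSE)": 260,
        "Athlon 64":         540,
    }
    for cpu, steps in perf_matrix.items():
        print(cpu, steps)   # work-per-clock would divide by clock speed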
This signature was annoying.
ID: 21364