Message boards : Number crunching : 300+ TeraFLOPS sustained!
Author | Message |
---|---|
Chilean Send message Joined: 16 Oct 05 Posts: 711 Credit: 26,694,507 RAC: 0 |
No, host RAC suffers the same fate as the project TFlops (which is essentially the aggregation of all of the host RACs). This is part of why rjs5 has been saying it is difficult to estimate how a given improvement might actually affect overall efficiency. But I thought the RAC was determined largely by the NUMBER of models that the WU did during execution. So if, on average, this number increases because of optimizations and whatnot, and with all else being equal, the RAC should increase (?) |
Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
RAC comes from history of granted credit. Granted credit is based on number of models completed times the credit claimed for the average model of that task series. Picture two hosts, one ("A") has a BOINC benchmark of 100 and the other ("B") at 200. "A" completes 10 models and claims 600 credits (took him 6 hours), "B" reports from the same series of work that it completed 30 models. "B" will be granted 1800 credits, and whatever "B" claims for the benchmark rating and time taken to run the 30 models will drop into the running average with "A"'s report. See, it doesn't matter how long "B" took to run the work. That only affects the "claimed" credit, not the "granted" credit. The claims of all hosts are averaged together as they report and used to calculate the granted credit. So, assuming any optimization would improve all hosts by the same degree (say 15%), the number of models completed per unit time increases 15%. But the credit claims that drive the credit system on R@h are based on time consumed and the host's benchmark rating (i.e. the claimed credit). So, "A" now reports in 11 models and a claim of ~510 credits (in slightly more than five hours), and "B" is granted ~1530 credits, and each took slightly less crunch time to report 10% more models completed. So this is what I meant about how the only variance you will notice is based on how the benchmark varies from the actual crunching work. The two hosts will have the same benchmark after any optimizations. But, perhaps host "B" has a larger L2 cache and the optimizations work better in that environment and bring a 25% improvement to "B" whereas "A" only sees a 15% improvement. But again, the difference is hard to discern, and shows you a skew between the benchmark and the actual work more than anything else. If you assume all of the hosts run the same number of hours per day as prior to the optimizations, the total credit claimed, and granted, per day will be the same as it was before.
But all hosts complete more models per day. The science needs the models. The more, the better. Faster turn-around time, also a great thing. So optimization will help the research, but not reflect as a blip on a chart. Rosetta Moderator: Mod.Sense |
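For anyone who wants to play with the numbers, here is a rough Python sketch of the scheme described in the post above. It is not the project's actual server code. The names `SeriesAverage`, `claimed_credit` and `report` are made up for illustration, the claim is assumed to be simply benchmark times hours (chosen so a benchmark of 100 for 6 hours claims 600), the running average is assumed to be total claimed credit divided by total models, and the first reporter is assumed to be granted its own claim. Host "B" taking 8 hours is also an assumed figure.

```python
from dataclasses import dataclass


@dataclass
class SeriesAverage:
    """Running totals of claims for one task series (hypothetical structure)."""
    total_claimed: float = 0.0
    total_models: int = 0

    def per_model(self) -> float:
        # Average claimed credit per model across all reports so far.
        return self.total_claimed / self.total_models


def claimed_credit(benchmark: float, hours: float) -> float:
    # Assumed scaling: benchmark 100 running for 6 hours claims 600 credits.
    return benchmark * hours


def report(series: SeriesAverage, benchmark: float, hours: float, models: int) -> float:
    """Grant credit from the average so far, then fold this claim into it."""
    claim = claimed_credit(benchmark, hours)
    if series.total_models == 0:
        granted = claim  # assumption: the first reporter gets what it claims
    else:
        granted = models * series.per_model()
    series.total_claimed += claim
    series.total_models += models
    return granted


series = SeriesAverage()
granted_a = report(series, benchmark=100, hours=6, models=10)  # claims 600
granted_b = report(series, benchmark=200, hours=8, models=30)  # claims 1600
print(granted_a, granted_b, series.per_model())
```

In this sketch "B" is granted 30 models times the 60 credits per model that "A" established, no matter how long "B" ran. Its own claim only shifts the average that later reporters are paid from.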
shanen Send message Joined: 16 Apr 14 Posts: 195 Credit: 12,662,308 RAC: 0 |
RAC comes from history of granted credit. Granted credit is based on number of models completed times the credit claimed for the average model of that task series. Recent comment on this topic, but still confusing and it still feels like a penalty. If a machine "claims" a certain amount of credit, it usually receives less, and often much less--and nothing that I can do about it. Especially in cases where credit was 0, it makes me feel like you don't actually care about the contributions. Just an embarrassment of riches, eh? Related annoying cases are computation errors after many hours of work. NOT my fault, but your bug, and 0 credit. Ditto deadline problems. (Fortunately for Rosetta, at this point I don't even care enough to look for a better project. This is about the 3rd or 4th one I've supported, and none of them were significantly less annoying.) #1 Freedom = (Meaningful - Constrained) Choice{5} != (Beer^3 | Speech) |
Timo Send message Joined: 9 Jan 12 Posts: 185 Credit: 45,649,459 RAC: 0 |
Speak for yourself, I couldn't care less about the 'credit' etc. It's a nice-to-have measurement of contribution, but in the end I do this for the science. Period. If I could 'donate' my credit to people who care so much about it, I'd gladly go with zero credit to stop everyone else complaining. lol. |
Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
Some machines will typically be granted less credit than claimed, other machines will swing the other way. Basically being granted less credit than claimed is an indication that your machine does not run actual R@h work as much better as it runs the BOINC benchmark. In other words, comparing your machine to some hypothetical one, your machine reports 2x better on the BOINC benchmark, but is only producing 1.8x (pick number less than 2 there) more R@h results per unit time. Tasks that fail are granted credit within 24hrs, and when this occurs, the credit granted is only visible from the task details. This change to BOINC was done specifically to reflect that the project appreciates the effort, and the fact that the science benefits from learning about and working through the failure. Rosetta Moderator: Mod.Sense |
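The 2x versus 1.8x comparison in the post above can be written out as a small calculation. This is only a sketch under assumptions: claimed credit is taken to scale with the benchmark ratio, granted credit with the ratio of models actually produced, both over the same run time, and the hypothetical reference machine is taken to be granted exactly what it claims. The function name `granted_to_claimed_ratio` is invented for illustration.

```python
def granted_to_claimed_ratio(benchmark_ratio: float, throughput_ratio: float) -> float:
    """Granted credit as a fraction of claimed credit, relative to a reference host.

    benchmark_ratio:  how much better this host scores on the BOINC benchmark.
    throughput_ratio: how many more R@h results it produces per unit time.
    """
    if benchmark_ratio <= 0:
        raise ValueError("benchmark_ratio must be positive")
    return throughput_ratio / benchmark_ratio


# The example from the post: 2x on the benchmark, 1.8x on real work.
ratio = granted_to_claimed_ratio(benchmark_ratio=2.0, throughput_ratio=1.8)
print(f"granted / claimed = {ratio:.2f}")
```

A value below 1 corresponds to a host that is granted less than it claims. A value above 1 is the case that swings the other way.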
Sid Celery Send message Joined: 11 Feb 08 Posts: 2124 Credit: 41,226,850 RAC: 11,023 |
Basically being granted less credit than claimed is an indication that your machine does not run actual R@h work as much better as it runs the BOINC benchmark. Ha! I like this. So true as well. I'm wondering whether to run something intensive while doing a new CPU benchmark so that I can get granted credits higher than claimed. Weirdly, my phone gets great credits while my overclocked desktop is always down. |
Chilean Send message Joined: 16 Oct 05 Posts: 711 Credit: 26,694,507 RAC: 0 |
Same here. I just like how 'credits' measure your performance. |
Chilean Send message Joined: 16 Oct 05 Posts: 711 Credit: 26,694,507 RAC: 0 |
|
rjs5 Send message Joined: 22 Nov 10 Posts: 273 Credit: 23,051,657 RAC: 8,071 |
It's been a while since we last heard of these optimizations. Well ... nothing really worthwhile. I haven't touched base with David in a couple of weeks. Using the ICC compiler and turning off the aggressive inline and unroll options that create a large code footprint seems to give a 40%+ improvement. In the 3.73 thread, I outlined the impact of running Primegrid in parallel with Rosetta. Primegrid uses the GIMPS library, which prefetches a large array of data into the caches and kicks all the data in the data cache out. It caused a 3x slowdown in Rosetta and shows how big the code and data footprint is. You have to use -mtune=generic -march=core2 to maintain a common binary that runs on all machines. You can get some benefit from adding the "-ax<target>" option that generates a FAT binary with both generic and "<target>" code for SSE4.2 or AVX. I have been looking higher up in the program source code and running some experiments to see if I can coerce the program to auto-vectorize (where the compiler is able to use PACKED SSE/AVX instructions instead of scalar), without much luck. The ideas under test: removing call statements from the inner loops; breaking complex loops up into sequential simple loops; possibly using HUGE data pages. I will probably build a couple more test cases to exercise some of the other protocols. Interesting but going slow. |
Dr. Merkwürdigliebe Send message Joined: 5 Dec 10 Posts: 81 Credit: 2,657,273 RAC: 0 |
Nonetheless, your efforts are highly appreciated! |
Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
Very much so. Thanks rjs5. Even if you find no silver bullet, it is a much better position than prior to your research. Rosetta Moderator: Mod.Sense |
Mark Send message Joined: 10 Nov 13 Posts: 40 Credit: 397,847 RAC: 0 |
I would also like to add my thanks to you for your efforts. It appears you could make a real impact far beyond running your own crunching. |
Timo Send message Joined: 9 Jan 12 Posts: 185 Credit: 45,649,459 RAC: 0 |
Same here, awesome to see someone in the community really step up and get involved. In other news, this: TeraFLOPS estimate: 600.836 .. if this keeps up, R@H could become a petaFLOPS cluster.. of x86 power at that.. that is incredibly remarkable! |
sgaboinc Send message Joined: 2 Apr 14 Posts: 282 Credit: 208,966 RAC: 0 |
400+ Tflops sustained 24x7, rosetta@home can possibly rank with top 500 supercomputer networks :o :p lol |
Computing for Humanity (Account) Send message Joined: 8 Jan 16 Posts: 2 Credit: 480,188,370 RAC: 69,882 |
441st Place |
©2024 University of Washington
https://www.bakerlab.org