Message boards : Number crunching : So is it cheating or not?
Author | Message |
---|---|
Insidious Joined: 10 Nov 05 Posts: 49 Credit: 604,937 RAC: 0 |
I understand that some people are using an "optimized" Boinc client that makes the credit be higher for a given work unit. I don't want to be crunching for less credit than other people, but I don't want to do anything unscrupulous either. Please advise -Sid Proudly crunching with TeAm Anandtech |
stephan_t Joined: 20 Oct 05 Posts: 129 Credit: 35,464 RAC: 0 |
Well it depends what you mean by 'optimized' - for most, optimized means compiled for your CPU - that's not considered cheating. However, cheating could be found in clients by artificially inflating the benchmark numbers, etc. I wouldn't call that 'optimized'. So in a nutshell, your credit is benchmark results * time spent on WU. An optimized client wouldn't necessarily affect your score for the better, as it might benchmark higher but also crunch faster. Which client did you have in mind? Team CFVault.com http://www.cfvault.com |
Tern Joined: 25 Oct 05 Posts: 576 Credit: 4,695,362 RAC: 10 |
I don't want to be crunching for less credit than other people, but I don't want to do anything unscrupulous either. You will probably get as many different answers as there are people who respond... my own PERSONAL opinion is that if you are running an optimized SETI app, and are running an optimized BOINC Client to "match" it, and SETI is your main project, with Rosetta being secondary - then I wouldn't feel bad about it. If you are ONLY running Rosetta, then you have no legitimate reason to be running an optimized client. I was running the heavily-optimized Trux 5.3.1 on my AMD when Rosetta was getting 10% of my resource share - now that Rosetta is getting 50% of that CPU and SETI is getting almost nothing (5%), I uninstalled 5.3.1 and put on a "stock" 5.2.7. On the other hand - if you are running Linux, and can find a similar/identical CPU running Windows to compare against, you may note that your benchmarks are extremely low in comparison. In that case, I don't think there would be a problem with running a "mildly optimized" (say SSE) BOINC Client even if Rosetta is your only project. Your intent would be to _equalize_ the benchmarks against Windows users, rather than to unreasonably inflate your scores. If others with similar CPUs get 15 credits for two-hour WUs while you're getting 10 - then fix it. If you're getting 20... or 40... then shame on you. |
Tern Joined: 25 Oct 05 Posts: 576 Credit: 4,695,362 RAC: 10 |
An optimized client wouldn't necessarily affect your score for the better, as it might benchmark higher but also crunch faster. An optimized client will have no effect on crunching times. |
Insidious Joined: 10 Nov 05 Posts: 49 Credit: 604,937 RAC: 0 |
Thanks for the opinions. I think the client I have heard mention of is the Seti Optimized one that Bill mentions. I personally am not using anything optimized; I am using 5.2.7. I sure wish this kind of thing was prevented by something other than my hope that everybody would play nice. Well, enough about this. I'm enjoying Rosetta; others can do as they wish. Best to all -Sid Proudly crunching with TeAm Anandtech |
Tern Joined: 25 Oct 05 Posts: 576 Credit: 4,695,362 RAC: 10 |
I think the client I have heard mention of is the Seti Optimized one that Bill mentions. Let me take a moment to give more background information and history here... SETI is open source. Early in the project, someone realized that simply recompiling the source code with CPU-specific flags set on the compiler would result in an application that ran exactly the same - only faster, on that particular CPU. That was the original "optimized SETI application". Since that point, using different compilers, different math libraries, and even modifying and improving the original code, has resulted in approximately 987098745 different optimized SETI applications. The problem with this is that credits are assigned based on CPU time * a benchmark value. Doing a WU considerably faster, while changing nothing else, meant that instead of claiming 20 credits, they might be claiming 15. Or 10. For the _same_ work someone else was doing and claiming 20 for. The answer of course is that the benchmarks, never particularly accurate to begin with, are WAY off when compared to an optimized application. So someone optimized the BOINC Client itself, or at least the benchmarking portion. If the BOINC Client was optimized the same way as the SETI application, then the benchmarks were once more "in sync". Optimizing the Client doesn't affect any _work_, only the benchmarks, but means that you're getting the "right" credit. (This is complicated by things like Einstein optimizing their Mac application quite well, such that Macs on Einstein were suddenly asking for too little credit, unless they were running an optimized BOINC Client...) Regardless, while there are fewer optimized BOINC Clients than SETI applications, there's still dozens of them to choose from, each of which will give you slightly different benchmark values, and (hopefully) affect nothing else. The problem here of course is that the benchmarks apply to ALL projects you're running, not just SETI. 
So now people were requesting too MUCH credit on other projects. In many cases, this doesn't matter much, because of the redundant issuing of work, and the averaging of credit claims. Of course, for Rosetta, it matters a great deal. (See thread on "code release and redundancy" for much more info.) The "fix" is simple. Do away with the reliance on benchmarks. Paul D. Buck has a proposal for using calibrated hosts, that would bring claimed credit to a closer match with the theoretical "cobblestone computer". Meanwhile, the BOINC folks have come up with a _different_ method of "measuring" a work unit, "flop-counting". This is project-specific, so if you optimize SETI, you'll still get the "right" credit for SETI, and it won't change your claims for Rosetta. That is "coming real soon now" to SETI. Flop-counting is FAR more repeatable than benchmarks - testing shows that if you run a WU on a Mac G4, and I run the same WU on Windows/AMD, and someone else runs it on Linux/Intel - we'll all ask for the same number of credits to within about 1%. It doesn't totally eliminate the possibility of cheating however - THAT will require some "server side" effort. The "optimal" method still seems to me to be to use Paul's calibration approach, or some variation thereof, on top of the flops-counting approach. Other methods are available to projects that use redundancy, as well. Regardless, Rosetta is, I believe, seriously looking at using flops-counting at least as a way to avoid having to go to redundancy just to prevent the most obvious methods of cheating. This should be _fairly_ easy to do, and once done, you won't have to worry about matching the Client to the application to the project to the phase of the moon. If it makes you feel any better - any cheating that is going on right now is of such a low scale and low impact, that it's unlikely to matter any. 
Maybe someone is a few positions higher in the stats than they should be, because they're asking for a few % more credit on each WU - but really, who cares? If someone starts _really_ getting out of hand... well, let's just say, they'll get caught. Suddenly looking at their account and finding they only have 10 credits... they'll get the hint! |
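A minimal sketch of the arithmetic described in the post above, assuming claimed credit is simply proportional to CPU time multiplied by the benchmark score. The function name `claimed_credit`, the `seconds_per_credit` scale, and every number are hypothetical, chosen only to reproduce the 20/10/40 figures mentioned in the thread; they are not BOINC's actual constants.

```python
def claimed_credit(cpu_seconds, benchmark_gflops, seconds_per_credit=100.0):
    """Claimed credit ~ CPU time * benchmark score (scale factor is an assumption)."""
    return cpu_seconds * benchmark_gflops / seconds_per_credit


# Stock application, stock client: the baseline claim.
stock = claimed_credit(cpu_seconds=2000, benchmark_gflops=1.0)

# Optimized application halves the CPU time; the benchmark is unchanged,
# so the claim drops for the same work.
fast_app = claimed_credit(cpu_seconds=1000, benchmark_gflops=1.0)

# An optimized client doubles the benchmark, bringing the claim back "in sync".
matched = claimed_credit(cpu_seconds=1000, benchmark_gflops=2.0)

# The same inflated benchmark on a project whose application is NOT optimized
# (CPU time unchanged) doubles the claim there.
other_project = claimed_credit(cpu_seconds=2000, benchmark_gflops=2.0)

print(stock, fast_app, matched, other_project)
```

Under these assumed numbers the four claims come out as 20, 10, 20 and 40: the optimized client repairs the claim on the optimized project and over-claims everywhere else, which is the cross-project problem the post describes.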
Insidious Joined: 10 Nov 05 Posts: 49 Credit: 604,937 RAC: 0 |
Thanks for taking the time to explain this Bill. If it wouldn't get warm by the time I found you, I'd buy you a beer. -Sid Proudly crunching with TeAm Anandtech |
stephan_t Joined: 20 Oct 05 Posts: 129 Credit: 35,464 RAC: 0 |
An optimized client wouldn't necessarily affect your score for the better, as it might benchmark higher but also crunch faster. You're right, and I stand corrected Team CFVault.com http://www.cfvault.com |
Astro Joined: 2 Oct 05 Posts: 987 Credit: 500,253 RAC: 0 |
This has been my big concern about optimized clients all along. They are a wonderful tool to bring one's benchmark back into line if they are ONLY running an optimized SETI application. It has morphed into a way to cheat, especially for Rosetta, which doesn't do redundant analysis. With other projects that do redundancy it's not as noticeable. I think code should be included with the optimized clients that only lets them attach to ONE project. In SETI and other projects, granted credit is determined by throwing out both the high and low "claimed credit", and averaging the other two or just the one (in the case of 3 valid returns). This tends to alleviate the inflated claims since 90% of users are using a standard application anyway. But with Rosetta, they can easily claim and GET 50% more than others for the same work. It's not fair. It doesn't matter to me if you "mainly run seti" or not. Let's face it, there's only a few of us that even have to be concerned about being in the top 10 (i.e. real competition), so it shouldn't be a big deal, but a level playing ground would be nice. There are four types of individuals using an optimized client: those that use it just and only with an optimized application (the intended purpose), those that just blatantly cheat if they can, those that cheat but not much so it's ok (justification), and those that just aren't informed that they are doing a bad thing. It's not clearly explained at most sites offering optimized clients (oh, it's there, but towards the bottom, and who really wants to read that far, LOL). Bill, Ingleside tells us that FAH (Folding at Home) and CPDN don't use Fpops. On projects with redundancy, optimized clients aren't really needed if you run a bigger cache (like 4 or more days). Your results will be the last in and, in most instances, granted credit will have already been determined by the other users, and you'll get what you should to begin with. 
So, Optimized Clients aren't even necessary for credit equalization. My Average granted credit is 22 credits per wu, which is within a point or two of everyone else, optimized client or not. (seti only) |
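The granted-credit rule described in this post (drop the highest and lowest claim, then average what is left) can be sketched as follows. This is an illustration under stated assumptions, not the project's validator code: the function name `granted_credit` is hypothetical, and the behaviour for fewer than three results is a guess, since the post only covers three or four valid returns.

```python
def granted_credit(claims):
    """Drop the highest and lowest claim, then average the remainder.

    With 4 valid results this averages the middle two; with 3 it returns
    the single middle claim, as described in the post.
    """
    if len(claims) < 3:
        # Assumption: the post does not say what happens below a quorum of 3.
        raise ValueError("need at least 3 claimed credits")
    middle = sorted(claims)[1:-1]
    return sum(middle) / len(middle)


# One inflated claim (40) among four results is simply discarded.
print(granted_credit([20, 22, 21, 40]))
# With three results only the middle claim counts.
print(granted_credit([20, 30, 22]))
```

Because the outliers are thrown away, a single inflated claim in a quorum has no effect on what anyone is granted, which is why the problem shows up mainly on projects without redundancy.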
PCZ Joined: 16 Sep 05 Posts: 26 Credit: 2,024,330 RAC: 0 |
Folks have been knocking the BOINC benchmark system for a long time, and rightly so: it's a steaming *urd. People have come up with ideas to change it but so far nothing has been implemented; will it be any different this time? Optimised clients were originally used exclusively by participants running nix, the nix benchmarks being very low compared to Windows. They discovered that compiling their own BOINC clients using compiler optimisations helped them to bridge the gap with their fellow participants running Windows. I have no problem with this at all; the Windows compiler used for the standard Windows clients obviously was using optimisations, so doing the same with the nix clients was OK. Also, the benchmarks on SMP machines are very, very messed up: the more CPUs you have, the lower the benchmark is. Guys running the big tin are seriously disadvantaged. To a certain extent this is corrected by the quorum system, but on projects such as Rosetta which don't use it they just get a fraction of the credit they should be claiming. Now the above problem can be, and is, fixed by changing the benchmark figures in one of the client files. The figures are altered to bring them more in line with a uniprocessor equivalent. I am not saying that this is the correct course of action, but I can well understand folks running big tin doing it. What is happening now, though, is blatant cheating by editing the client XML files, and this is getting out of hand. Rosetta doesn't have a quorum, so the cheats are getting away with it. The project admins have said that they don't want to waste resources using a quorum system, so what can we do? Right now a quorum is the only option. A quorum of 2 would go a long way towards negating the inflated credit claims. Going forward, an alternative benchmark system such as that proposed by Paul could be implemented. How much effort is involved? And, perhaps more importantly, would the BOINC developers implement it? 
This problem has been around since day 1 and they don't seem to be interested in fixing it. Another thing that could be done is to assign credit to the WUs like FAH does. IMHO this would be the best system for Rosetta. Sure, a few WUs might take longer than the average for that protein, but equally some will finish quicker than expected, so things would even out. I hope this is the last time this monster rears its ugly head, but going on past form I have to bet it won't be. |
Insidious Joined: 10 Nov 05 Posts: 49 Credit: 604,937 RAC: 0 |
Folks have been knocking the boinc benchmark system for a long time and rightly so it's a steaming *urd. I agree with the sentiment entirely, but I do have to disagree with the F@H example of a "good system". The reason I switched to Rosetta from F@H was their credit system. As they began to develop some project types (QMD in particular) which would only be assigned to a given brand of CPU (Intel) because of license restrictions, they also implemented a "bonus point" system to encourage people to crunch the work unit type they were most interested in, which happened to be the ones that only Intel machines could be assigned. The result: the project developers intentionally gave one brand of CPU an EXTREME (up to 4X) crediting advantage. I hope F@H is NEVER used as a standard for any project in terms of crediting. respectfully, -Sid Proudly crunching with TeAm Anandtech |
Scott Brown Joined: 19 Sep 05 Posts: 19 Credit: 8,739 RAC: 0 |
It's not fair. It doesn't matter to me if you "mainly run seti" or not. Let's face it, there's only a few of us that even have to be concerned about being in the top 10 (i.e real competition), so it shouldn't be a big deal, but a level playing ground would be nice. A level playing field would be nice, but since when has this ever been the case with BOINC? I think those individuals with LINUX and sometimes MAC boxes would have liked to have fairness from day one. The simple truth (as Bill noted below) is that the benchmarking for credit idea is the problem, and a level field will never occur until the crediting system is changed! |
Astro Joined: 2 Oct 05 Posts: 987 Credit: 500,253 RAC: 0 |
I've been with BOINC and multiple projects for a year and a half. I've seen MANY "new" crediting ideas come and go. Each one has had its own flaws. Some were better for one project, some better for another, some were just bad, but all had a flaw. The current system, flawed as it is, is still the best system found. I'm a BOINC Alpha member, and I test the new BOINC software. The project managers do care about the benchmark system. Paul Buck has done some good work, and the management is certainly aware of his ideas. They will listen to each individual's ideas. If you're interested to see what's going on behind the scenes you can subscribe to the Alpha and/or Dev mail lists. The alpha mail list is here and the Developmental mail list is here. |
PCZ Joined: 16 Sep 05 Posts: 26 Credit: 2,024,330 RAC: 0 |
OK, I used the FAH example because I thought it would be familiar to everyone and the quickest way to explain what I meant. Agreed, FAH does have problems because of the monster WUs they hand out to P4s with SSE2/3, which have a disproportionate amount of credit associated with them. The FAH system should work; I fail to understand how they get the credit so wrong sometimes. I don't think that would happen here though. :) A sample of a new query to be worked on could be run through a test box and the amount of credit for that particular query worked out based on the time it takes to complete. What should the test box be? Does it matter? No, not really. If it takes twice as long to do a new query on the test box as the previous query did, then it's worth twice the credit. Simple. As for subscribing to the mailing lists, no thanks. I don't believe it would do any good. If the developers did care about the benchmark system they would have fixed it long ago. My first posts on the BOINC forums highlighted the problems with the benchmark system. Never once did a developer respond. |
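The test-box idea in this post can be sketched as a simple proportion. This is only an illustration of the proposal as worded here, not anything Rosetta implements; the function name `fixed_credit` and the sample times and credit values are hypothetical.

```python
def fixed_credit(test_seconds, reference_seconds, reference_credit):
    """Scale a known query's credit by how long the new query takes on the same test box."""
    return reference_credit * test_seconds / reference_seconds


# Assumed figures: the previous query took 1 hour on the test box and is worth 10 credits.
# A new query that takes 2 hours on the same box is then worth twice as much.
print(fixed_credit(test_seconds=7200, reference_seconds=3600, reference_credit=10))
```

Every host would be granted the same fixed amount per completed WU regardless of its own benchmark, so editing the client's benchmark figures would no longer change the credit received.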
Insidious Joined: 10 Nov 05 Posts: 49 Credit: 604,937 RAC: 0 |
OK, F@H blows it when they start meddling with their "benchmark results". They design a WU with SSE2 optimizations, but turn them off on the benchmark machine. They were able to realize that made little sense, so... they add "bonus" points to the benchmark result when it was convenient (this was their biggest mistake, IMO). Bottom line: the scientists at F@H couldn't maintain their objectivity and skewed their scoring system to fit agendas that (again, IMO) weren't based on fairness. With BOINC and Rosetta, I don't see intentional efforts to skew the scoring system by developers. I see people trying to balance the lesser of evils and missing the obvious for the minutiae (forest/trees, if you will). It seems like all acknowledge the CPU benchmark is not a good thing... but continue using it nonetheless... I am thrilled David is willing to look for alternatives. As long as he is willing to keep an open mind... he'll find it. -Sid Proudly crunching with TeAm Anandtech |
Tern Joined: 25 Oct 05 Posts: 576 Credit: 4,695,362 RAC: 10 |
People have come up with ideas to change it but so far nothing has been implemented, will it be any different this time? To move to Paul's calibrated-host approach, or some other approach that will effectively eliminate the possibility of cheating... maybe. To move to flops-counting to effectively eliminate the credit claim differences between different platforms and different optimized applications - YES! The code to report the counts is already IN V5.2.7, and the code to do the counting is already in the enhanced SETI app that is scheduled to have the first 'production' WUs released against it within the next month. You have to realize that BOINC provides the "framework" - the projects decide what portions of that they wish to use. F@H and CPDN did not feel the standard BOINC credit system worked for them, so they "rolled their own". The other projects all use the benchmark-for-credit method. It has worked better for some than for others; it depends on how equal their app is on different platforms, and on how consistent their WUs are. Even with the current "fairly short" and "fairly equal" SETI WUs, that has been less than ideal, and with the much longer and more variable "enhanced" WUs - well, you wouldn't want to see the boards if someone requests 150 credits and only gets 70... so they came up with flops-counting. It's just one more option, that the projects can use, or not. My opinion is that flops-counting would be the simplest, quickest, and best approach for Rosetta, simply to avoid having to go to redundancy because of credits. It's not perfect, cheating is still possible, though a little bit harder - but it fixes the _non_ cheating discrepancies. Let's face it - the credits are a side issue to the projects. They don't CARE if someone gets one credit or a hundred for turning in a result. They want the results. But to "pay" the participants, to promote their project over all the others, the credits are the "public relations" part of the project. 
EVERY project has threads on their boards complaining about credit. So the better a project handles the credit issue, the happier the participants, and the more likely they are to contribute larger portions of their CPU time to the project. BOINC is about to grow - a lot. The credit-by-benchmark system has been "adequate" - barely - but is about to cause a firestorm among the new BOINCers coming from Classic. Flops-counting will at least hopefully reduce the flames a bit. Will something even better than that be implemented? I hope so, eventually, but realistically, it's not going to happen in the next few months. |
Purple Rabbit Joined: 24 Sep 05 Posts: 28 Credit: 4,247,127 RAC: 1,377 |
A level playing field would be nice, but since when has this ever been the case with BOINC? I think those individuals with LINUX and sometimes MAC boxes would have liked to have fairness from day one. The simple truth (as Bill noted below) is that the benchmarking for credit idea is the problem, and a level field will never occur until the crediting system is changed! I'll offer an example for Linux vs Windows vs "Optimized BOINC". My P3 1.3 GHz Celeron was running BOINC under Windows XP. It benchmarked at 1.3 GFlops FPU. I recently converted it to SuSE 10.0 (Linux). The benchmark is 612 MFlops. An optimized BOINC for Linux restores that to 1.3 GFlops. I think that's a fair use of an optimized BOINC client. Yes, the benchmarking process is flawed, but to say there is no reason to run an optimized BOINC client is incorrect. It's a useful workaround until the credit process is overhauled. |
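To put the benchmark figures quoted in this post in terms of credit, here is a rough sketch assuming that claimed credit scales linearly with the benchmark for the same CPU time. The two benchmark values come from the post; the helper name `relative_claim` is hypothetical.

```python
WINDOWS_MFLOPS = 1300.0      # 1.3 GFlops under Windows XP (from the post)
LINUX_STOCK_MFLOPS = 612.0   # stock client under SuSE 10.0 (from the post)


def relative_claim(benchmark, reference_benchmark):
    """Fraction of the reference claim made for identical CPU time (assumes linear scaling)."""
    return benchmark / reference_benchmark


ratio = relative_claim(LINUX_STOCK_MFLOPS, WINDOWS_MFLOPS)
print(f"Stock Linux client claims about {ratio:.0%} of the Windows claim")
```

On these figures the same machine would claim a little under half the credit under the stock Linux client, which is the gap the optimized client closes in this example.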
Paul D. Buck Joined: 17 Sep 05 Posts: 815 Credit: 1,812,737 RAC: 0 |
Just a side note or two ... For my calibration process, the first step is to instrument and count flops ... so ... one of the reasons I am not pressing hard (aside from other problems) is that this needs to be fielded and we need to see if it comes close to "solving" the problems. I don't think it will, but, we have to wait and see ... I know that one of the things is that the idea of the minimum needed change is high on the list ... so, if this gets it all (flops counting I mean) there is no reason to go further ... though I do not think that it will do much beyond reducing the error limits SLIGHTLY ... CPDN basically does a "flat" award of credit per work unit/trickle. If you complete the work, you get a "bonus" (the slab models get you about 5,700+ CS, and a final will be 6,800) ... if you look at CS/second, CPDN is the project to go for ... Using ONLY optimized SETI@Home/Einstein@Home science applications WILL improve your credit scores ... when you get averaged in you will see much higher awards "pulling" you up. For example, I have SETI@Home and Predictor@Home (well I did) at the same resource share ... SETI@Home is much higher in RAC ... ==== Edit Ooops, Einstein@Home and Rosetta@Home also had the same Resource Share until yesterday when I raised RAH to pull up the RAC some ... |
Scott Brown Joined: 19 Sep 05 Posts: 19 Credit: 8,739 RAC: 0 |
I think you didn't get my point. Since the benchmarking system is fundamentally flawed and unfair, it makes no difference whether optimized clients are allowed or not in the sense of system flaws and system fairness. Not only did I not say that there was "no reason to run an optimized BOINC client", I actually use an optimized client (TRUX's 5.3.1) because I give a substantial resource share to an optimized SETI. |
Wizzard~Of~Ozz Joined: 4 Nov 05 Posts: 2 Credit: 251,362 RAC: 0 |
I re-compiled the BOINC client for my Linux machines as I noticed the huge drop in GFlops. I have 2 XPs @ 2.25 GHz, one running Windows, the other Linux; the Linux rating was approx. half. After re-compiling (with different flags specific to the type of CPU) that rating is now approx. 2.5% higher under Linux. Re-compiling the Windows client may equalize it a bit better, but I don't have the ambition to try that out. |