How to fake out the new credit system

Profile Feet1st
Joined: 30 Dec 05
Posts: 1755
Credit: 4,690,520
RAC: 0
Message 24808 - Posted: 25 Aug 2006, 3:53:28 UTC
Last modified: 25 Aug 2006, 3:54:10 UTC

I posted how to fake out the old credit system on Ralph, and I'll copy it here (with minor revisions for clarity). It's very easy to do, and so well known that no one to date has even commented on that post.

Can anyone describe, in the same step-by-step manner, how they would fake out the new credit system? There's been talk about cherry picking, and about overclaiming and racing back with completed WUs carrying a huge overclaim in the hope of being one of the first to report... but keep in mind the odds and the critical timing required to make any of that work. Also keep in mind the ease with which it is headed off (by taking averages from Ralph or the Bakerlab Linux farm, or by requiring 100 returned results before issuing credits for the first reports of a WU).

None of the discussion seems to point out that if I always overclaim, my credits may be banned; that a maximum credit claim can easily be established (as it was for failed WUs previously); or that it would be quite easy to correct overclaims once each WU run is completed.

I don't want this thread to degrade into a credit discussion. My purpose here is simply to demonstrate how easily BOINC's numbers are modified and credits affected. It seems that just how simple this is remains unclear to some folks, so I hope that by reviewing this they can better understand why some new method of establishing credit claims is desirable.

I did the following, all on this system, which is a dual core Windows P4.

Ran two WUs for 20hrs into a 24hr target time.

Ran benchmarks from BOINC Manager.

Exited BOINC Manager.

Edited client_state.xml file found in /program files/BOINC
original contents of the file:
<client_state>
<host_info>
<timezone>-21600</timezone>
<domain_name>xxxxxxx</domain_name>
<ip_addr>9.10.54.138</ip_addr>
<host_cpid>c2b6ed128b9f00a4cd22b5241dc3378e</host_cpid>
<p_ncpus>2</p_ncpus>
<p_vendor>GenuineIntel</p_vendor>
<p_model> Intel(R) Pentium(R) 4 CPU 3.00GHz</p_model>
<p_fpops>1284961240.310078</p_fpops>
<p_iops>1198067125.658179</p_iops>
<p_membw>1000000000.000000</p_membw>
<p_calculated>1156173189.089874</p_calculated>
<os_name>Microsoft Windows XP</os_name>
<os_version>Professional Edition, Service Pack 2, (05.01.2600.00)</os_version>
<m_nbytes>2674896896.000000</m_nbytes>
<m_cache>1000000.000000</m_cache>
<m_swap>5188567040.000000</m_swap>
<d_total>80031514624.000000</d_total>
<d_free>52115275776.000000</d_free>
</host_info>
.... that's the top of the file, which continues
============== changed entries, add a 1 in front
<p_fpops>11284961240.310078</p_fpops>
<p_iops>11198067125.658179</p_iops>


Both WUs ended in computation error
8/21/2006 10:15:19 AM|rosetta@home|Computation for task 1ptq__BOINC_BACKBONE_O_PENALTY_ABRELAX_SAVE_ALL_OUT__1176_130_0 finished

8/21/2006 10:15:39 AM|rosetta@home|Computation for task 4ubpA_BOINC_BACKBONE_O_PENALTY_ABRELAX_SAVE_ALL_OUT__1176_694_0 finished

19 new WUs came down (way more than the 14 it is possible to crunch before the 1 week deadline... but BOINC assumes my PC can crunch faster now).
Estimated runtimes went from 24hrs to 2:44.

=======================
The 2 failed WUs were uploaded:
33549552 29116470 20 Aug 2006 0:00:07 UTC 21 Aug 2006 15:17:42 UTC Over Client error Compute error 67,740.38 881.37 ---

33437061 29011302 19 Aug 2006 5:42:12 UTC 21 Aug 2006 15:17:42 UTC Over Client error Compute error 72,037.97 937.29 ---

I believe each of these will eventually be awarded the maximum credit for an errored WU (500 points, or whatever it is).

======= compare to my prior two 24hr WUs.
33385320 28961891 19 Aug 2006 0:00:06 UTC 21 Aug 2006 0:00:27 UTC Over Success Done 86,441.19 120.10 136.19

33263753 28846659 18 Aug 2006 5:46:02 UTC 21 Aug 2006 0:00:27 UTC Over Success Done 86,130.13 119.66 137.65

The two failed WUs should have been worth about 194 credits for 40 hrs of work, but they claimed 1,818.66 credits (881.37 + 937.29).
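The claimed figures can be reproduced with a little arithmetic. The sketch below assumes the classic benchmark-based claim of 100 credits per CPU-day for a host whose floating-point and integer benchmarks average 1e9 operations per second. That formula is not stated in this post; it is an assumption that happens to match the numbers reported here. The function name is made up for illustration.

```python
SECONDS_PER_DAY = 86_400

def claimed_credit(cpu_seconds, p_fpops, p_iops):
    """Benchmark-based credit claim (assumed formula, not taken from the post)."""
    avg_gops = (p_fpops + p_iops) / 2 / 1e9
    return cpu_seconds / SECONDS_PER_DAY * 100 * avg_gops

# Benchmark values as they read in client_state.xml after the edit.
EDITED_FPOPS = 11284961240.310078
EDITED_IOPS = 11198067125.658179

# CPU seconds of the two failed WUs, taken from the result lines.
cpu_times = (67_740.38, 72_037.97)

failed = [claimed_credit(t, EDITED_FPOPS, EDITED_IOPS) for t in cpu_times]
print([round(c, 2) for c in failed])  # the post reports 881.37 and 937.29
print(round(sum(failed), 2))          # the post reports 1,818.66

# Same CPU time with the original, unedited benchmark values.
honest = sum(claimed_credit(t, 1284961240.310078, 1198067125.658179)
             for t in cpu_times)
print(round(honest, 2))
```

With the unedited benchmarks the same CPU time comes out at roughly 200 credits under this assumed formula, in the same range as the estimate of about 194.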
======================
I let the new WUs crunch for just over 4 hrs, then changed my Rosetta preference to 4hrs so they would end and report normally. They each completed the model they were working on, and were reported.

33675010 29233799 21 Aug 2006 0:00:27 UTC 21 Aug 2006 20:45:30 UTC Over Success Done 16,116.42 209.69 23.19

33582288 29147070 20 Aug 2006 5:25:54 UTC 21 Aug 2006 20:45:30 UTC Over Success Done 16,093.91 209.40 23.89

I then reran BOINC benchmarks and set my WU runtime preference low enough that I can crunch and report all of those WUs before their deadlines.

You can see that the WUs with just over 4hrs of work on them each claimed over 200 credits. That's more than the 120 or so that the same box was granted for 24hrs of crunching.

But when you look at the "granted work credit" (which is from the new system), they each received about 23 credits. That is a little over 5 credits per CPU hour (23.19 credits for 16,116 seconds), which is consistent with the rates this box earned previously.

Under the existing system, my modified credit claims are granted, because Rosetta trusts BOINC to measure the machine's ability and time spent.

Under the new system, I get credit for the work I've actually completed in those 4 hours.


Add this signature to your EMail:
Running Microsoft's "System Idle Process" will never help cure cancer, AIDS nor Alzheimer's. But running Rosetta@home just might!
https://boinc.bakerlab.org/rosetta/
Profile carl.h
Joined: 28 Dec 05
Posts: 555
Credit: 183,449
RAC: 0
Message 25040 - Posted: 26 Aug 2006, 22:02:53 UTC

Which would have created intra-project parity but not prevented manual cheating.


Are you stating categorically that this system cannot be cheated?


Not all Czech`s bounce but I`d like to try with Barbar ;-)

Make no mistake This IS the TEDDIES TEAM.
tralala

Joined: 8 Apr 06
Posts: 376
Credit: 581,806
RAC: 0
Message 25043 - Posted: 26 Aug 2006, 22:14:56 UTC - in response to Message 25040.  

Which would have created intra-project parity but not prevented manual cheating.


Are you stating categorically that this system cannot be cheated?

At least I don't know any way to do so.
STE\/E

Joined: 17 Sep 05
Posts: 125
Credit: 3,540,897
RAC: 19,902
Message 25046 - Posted: 26 Aug 2006, 22:23:40 UTC - in response to Message 25043.  

Which would have created intra-project parity but not prevented manual cheating.


Are you stating categorically that this system cannot be cheated?

At least I don't know any way to do so.


Hopefully there's not a way to cheat the Rosetta Credit System in place now, but there are some very enterprising people out there who will work 24 hrs a day to find one. How long it will take people to figure something out is anybody's guess, and how subtle they are with the cheat will determine whether they get caught.
Profile Trog Dog
Joined: 25 Nov 05
Posts: 129
Credit: 57,345
RAC: 0
Message 25052 - Posted: 27 Aug 2006, 0:40:09 UTC

The current system can still be manipulated. Nobody at a project level has said that the optimised clients can't be used.

Over on the RALPH boards somebody posted a snippet from a post by one of the project scientists, in which he states that the biggest WU run he had was 1.5 million. Now assume there are 150,000 hosts (according to the front page there are 173,441, but let's keep the numbers round). Assume that everybody's computer is the same and connects at the same frequency. That means each of you will have 10 WUs to process. That's 10 chances you get to influence the credit claimed.

Of course not everybody's computer is the same. If you can process twice as many as the next guy, then you will have two thirds of the chances to affect the credits against his one third. Yes, you will increase his credits too, but as you're completing WUs at a greater rate than he is, you will increase the difference between your scores at an exponential rate.

If you increase your benchmarks every time you report, the effect is even greater. Can't be done? QMC used to have a maximum limit of 1000 credits per WU (it has since been increased to 2000). There has already been one reported incident of a host returning WUs with varying benchmarks and times taken, such that each credit claim was just under the 1000 credit limit.


soriak

Joined: 25 Oct 05
Posts: 102
Credit: 137,632
RAC: 0
Message 25060 - Posted: 27 Aug 2006, 2:58:37 UTC
Last modified: 27 Aug 2006, 3:07:29 UTC

Here's an example of why it's difficult to cheat the system:

After 10,000 models the average credit is 10 per model. Enter Mr C, who changed his client to claim 10 times as much as he earned. He submits 10 models and wants 1,000 credits instead of 100.

(1,000 + 100,000) / 10,010 = ~10.09

By overclaiming 10 times he increased the average per model from 10 to 10.09 at a very early stage of the run.

The reason the average credit doesn't go down from the change of the system is that to the project it doesn't matter if one user claims 20 credits and the other 60, or both of them claim 40 credits. The total is still 80.
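Both points can be checked with a few lines of Python. This is only a sketch of the averaging idea as described in this post, not the project's actual code; the helper name is made up, and it assumes every honest model claims exactly 10 credits.

```python
def average_claim(claims):
    """Plain mean of the credit claimed per model."""
    return sum(claims) / len(claims)

# 10,000 honest models at 10 credits each, plus Mr C's 10 models at 100 each.
claims = [10] * 10_000 + [100] * 10
print(round(average_claim(claims), 2))  # 10.09

# Granting everyone the average leaves the total handed out unchanged.
pair = [20, 60]
granted = [average_claim(pair)] * len(pair)
print(granted, sum(granted) == sum(pair))  # [40.0, 40.0] True
```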


And even though it should be obvious: Faster systems still get more credit. They get the same amount of credit per model, but by virtue of being faster they do more models per hour - hence more credits.


edit: Quick addition... to those calling for a quorum: Rosetta is based on a lot of randomization, so even if two people get the same WU, the calculations would be different. There is now, however, a 'quorum' for models - those are comparable across work units of the same protein. So what you have now is a quorum of 100,000 essentially - as far as the credit system is concerned. ;)

You don't hear anyone claiming slow computers drag down the credit granted on those projects, because there it's even more obvious that a slower computer just takes longer to get the job done.


Personally, I think in almost any project a quorum of any kind is a MASSIVE amount of cpu power wasted. If they use a quorum of 3 (common) then 2/3rds of the processing power is wasted on the credit system. If a quorum of 2 has to be used for confirmation for projects like SIMAP, that's still 50% processing power that could instead crunch more stuff.
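The fractions quoted above follow from counting every result beyond the first as redundant. A tiny sketch of that arithmetic (the function name is made up, and it assumes each member of the quorum does the full computation):

```python
from fractions import Fraction

def redundant_fraction(quorum):
    """Share of CPU time spent on copies beyond the first result."""
    return Fraction(quorum - 1, quorum)

for q in (1, 2, 3):
    frac = redundant_fraction(q)
    print(f"quorum {q}: {frac} of the work is redundant ({float(frac):.0%})")
```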
Profile Trog Dog
Joined: 25 Nov 05
Posts: 129
Credit: 57,345
RAC: 0
Message 25061 - Posted: 27 Aug 2006, 3:15:54 UTC - in response to Message 25060.  

Here's an example of why it's difficult to cheat the system:

After 10,000 models the average credit is 10 per model. Enter Mr C, who changed his client to claim 10 times as much as he earned. He submits 10 models and wants 1,000 credits instead of 100.

(1,000 + 100,000) / 10,010 = ~10.09

By overclaiming 10 times he increased the average per model from 10 to 10.09 at a very early stage of the run.

The reason the average credit doesn't go down from the change of the system is that to the project it doesn't matter if one user claims 20 credits and the other 60, or both of them claim 40 credits. The total is still 80.


And even though it should be obvious: Faster systems still get more credit. They get the same amount of credit per model, but by virtue of being faster they do more models per hour - hence more credits.




Agreed, but if Mr C gets in at some point before the 10,000 result mark then he has a bigger impact, particularly if joined by his mates Mr D, E & F. Now if these 4 mates also happen to have top of the range boxes, and a couple each, then it has an even bigger impact.

One user by themselves won't be able to make a difference, but teams of them acting in concert could.
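To put rough numbers on the timing and team-size argument, here is a small sketch. It assumes the credit per model is a plain running average of all claims so far, that honest models claim 10 credits, and that each overclaimer submits 10 models at 10 times the honest claim. All of these figures and the function name are illustrative assumptions, not project data.

```python
def inflated_average(honest_models, colluders, models_each, factor, base=10.0):
    """Average claim per model once the colluders' results are included."""
    cheat_models = colluders * models_each
    total = honest_models * base + cheat_models * base * factor
    return total / (honest_models + cheat_models)

for honest in (100, 10_000):
    for team in (1, 4):
        avg = inflated_average(honest, team, models_each=10, factor=10)
        print(f"{honest:>6} honest models, {team} overclaimer(s): "
              f"{avg:.2f} credits per model")
```

Under these assumptions the effect is large only while few honest results are in; once thousands have been reported, even four overclaimers move the average by a few percent.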

Profile Vester
Joined: 2 Nov 05
Posts: 257
Credit: 3,295,996
RAC: 15,143
Message 25066 - Posted: 27 Aug 2006, 3:39:54 UTC

Don't forget that team None (members not on a team) has more members and points than all teams together.
Profile Feet1st
Joined: 30 Dec 05
Posts: 1755
Credit: 4,690,520
RAC: 0
Message 25073 - Posted: 27 Aug 2006, 4:06:37 UTC - in response to Message 25061.  

One user by themselves won't be able to make a difference, but teams of them acting in concert could.

Please go on. I created a thread just for such a description of how to manipulate the new system. Please post the details there about how a team will be able to manipulate the new system. Keep in mind that the hypothetical 10x claim is pretty easily screened and omitted from the averages as well. ...and before anyone starts complaining about how their machine is optimized or quad cored or dual math processors, or whatever... the new system reflects all of that quite elegantly. And if you don't understand that your fast box will get credit that accurately reflects it as a fast box, then you need to be asking questions about the new system so that you understand it.
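As an illustration of what screening a 10x claim out of the averages could look like, here is a minimal sketch. The rule used (drop any claim more than 3 times above or below the median) is purely hypothetical and is not something the project is known to do; the function name and sample claims are made up.

```python
from statistics import median

def screened_average(claims, max_ratio=3.0):
    """Average of the claims lying within max_ratio of the median (hypothetical rule)."""
    mid = median(claims)
    kept = [c for c in claims if mid / max_ratio <= c <= mid * max_ratio]
    return sum(kept) / len(kept)

claims = [9.5, 10.0, 10.5, 9.8, 10.2, 100.0]  # the last one is a 10x overclaim
print(round(sum(claims) / len(claims), 2))    # 25.0 without screening
print(round(screened_average(claims), 2))     # 10.0 with the overclaim dropped
```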
Profile Trog Dog
Joined: 25 Nov 05
Posts: 129
Credit: 57,345
RAC: 0
Message 25079 - Posted: 27 Aug 2006, 4:50:32 UTC - in response to Message 25073.  


Keep in mind that the hypothetical 10x claim is pretty easily screened and omitted from the averages as well.


Is that kind of screening being carried out? If it is, why not just narrow the acceptable range so that it catches the optimised clients and be done with the rest?

Profile Feet1st
Joined: 30 Dec 05
Posts: 1755
Credit: 4,690,520
RAC: 0
Message 25080 - Posted: 27 Aug 2006, 5:05:27 UTC
Last modified: 27 Aug 2006, 5:20:05 UTC

To my knowledge there is presently no screening. Since we DO have a new credit system, I'll not comment on your other question. Perhaps you could rephrase it if there are questions there. Under both credit systems, the "optimized" client's credit claims are accepted as being within any such limits.
Profile Trog Dog
Joined: 25 Nov 05
Posts: 129
Credit: 57,345
RAC: 0
Message 25082 - Posted: 27 Aug 2006, 5:17:19 UTC - in response to Message 25080.  

To my knowledge there is presently no screening. Since we DO have a new credit system, I'll not comment on your other question. Perhaps you could rephrase it if there are questions there. Under both credit systems, the "optimized" client's credit claims are accepted as within any such limits.


OK then, with no screening, what is to stop someone writing a new optimised client that claims even more than the current batch do, or writing a script that increments a host's benchmarks?
Profile Trog Dog
Joined: 25 Nov 05
Posts: 129
Credit: 57,345
RAC: 0
Message 25092 - Posted: 27 Aug 2006, 8:05:33 UTC - in response to Message 25082.  

To my knowledge there is presently no screening. Since we DO have a new credit system, I'll not comment on your other question. Perhaps you could rephrase it if there are questions there. Under both credit systems, the "optimized" client's credit claims are accepted as within any such limits.


OK then, with no screening, what is to stop someone writing a new optimised client that claims even more than the current batch do, or writing a script that increments a host's benchmarks?


Are there any hosts that run a standard client and unmodified xml benchmarks that are getting less granted than claimed?
mnb

Joined: 15 Dec 05
Posts: 51
Credit: 69,458
RAC: 0
Message 25093 - Posted: 27 Aug 2006, 8:22:01 UTC - in response to Message 25092.  

Are there any hosts that run a standard client and unmodified xml benchmarks that are getting less granted than claimed?

Yes, I have some WU's.

click here

list of my results
Profile carl.h
Joined: 28 Dec 05
Posts: 555
Credit: 183,449
RAC: 0
Message 25094 - Posted: 27 Aug 2006, 8:30:12 UTC

Soriak, as I understand it, what you're saying is that the system works on an average per model. Is that correct?

Are all models equal ?
soriak

Joined: 25 Oct 05
Posts: 102
Credit: 137,632
RAC: 0
Message 25097 - Posted: 27 Aug 2006, 8:43:51 UTC - in response to Message 25094.  

Soriak, as I understand it, what you're saying is that the system works on an average per model. Is that correct?

Are all models equal ?


Yep, that's correct - you get credits based on the average per model claimed so far.

Different models of the same protein are not all exactly the same. Sometimes you get a run that takes a little longer (not much though), other times the application realizes the model isn't going to lead anywhere useful and ends the run early.

In the first scenario you get a little less credit, in the second you get a little more. The longer your workunits run, the smaller the effect on the credits per Workunit.

If you run for 24hrs it will get a lot of models done, so the difference in runtime will likely average out within the WU itself. If you run for only 1-2hrs, you may only get one model done and see a much bigger effect of an early abort or longer runtime. There's no difference to your stats, it'll just jump out at you more on the results page.
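The averaging-out effect can be seen in a toy simulation. Everything in it is an assumption made for illustration: a fixed 10 credits per model, model runtimes of about an hour, and a 10% chance that a model aborts early after 0.3 hours. Only the comparison between short and long target runtimes matters, not the absolute numbers.

```python
import random
from statistics import pstdev

def credits_per_hour(target_hours, rng, credit_per_model=10.0):
    """One simulated WU: run whole models until the target runtime is reached."""
    elapsed, models = 0.0, 0
    while elapsed < target_hours:
        # Assumed runtimes: usually 0.9 to 1.3 hours, occasionally an early abort.
        elapsed += 0.3 if rng.random() < 0.1 else rng.uniform(0.9, 1.3)
        models += 1
    return models * credit_per_model / elapsed

rng = random.Random(2006)  # fixed seed so the run is repeatable
spread = {}
for hours in (2, 24):
    rates = [credits_per_hour(hours, rng) for _ in range(2_000)]
    spread[hours] = pstdev(rates)
    print(f"{hours:>2}h target: spread of credits/hour = {spread[hours]:.2f}")
```

The spread in credits per hour is much wider for the short target than for the 24hr one, which matches the point that short WUs only make the variation more visible on the results page.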
Profile Trog Dog
Joined: 25 Nov 05
Posts: 129
Credit: 57,345
RAC: 0
Message 25098 - Posted: 27 Aug 2006, 8:54:03 UTC - in response to Message 25093.  

Are there any hosts that run a standard client and unmodified xml benchmarks that are getting less granted than claimed?

Yes, I have some WU's.

click here


Cheers

It will be interesting to keep an eye on how the results pan out for this box.

Any others?
tralala

Joined: 8 Apr 06
Posts: 376
Credit: 581,806
RAC: 0
Message 25105 - Posted: 27 Aug 2006, 9:48:43 UTC - in response to Message 25098.  

Are there any hosts that run a standard client and unmodified xml benchmarks that are getting less granted than claimed?

Yes, I have some WU's.

click here


Cheers

It will be interesting to keep an eye on how the results pan out for this box.

Any others?


All PowerMACs with the old IBM processors get less than 50%. For example here:
https://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=167475

Pentium 3 and Pentium M seem to gain above average.
Profile Trog Dog
Joined: 25 Nov 05
Posts: 129
Credit: 57,345
RAC: 0
Message 25108 - Posted: 27 Aug 2006, 10:12:34 UTC - in response to Message 25105.  

All PowerMACs with the old IBM processors get less than 50%. For example here:
https://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=167475

Pentium 3 and Pentium M seem to gain above average.


Wow! So that means in the terms of the new credit system that powermacs either take longer to do the same work as the "average" host or that the standard boinc client overestimates the benchmarks for macs.


tralala

Joined: 8 Apr 06
Posts: 376
Credit: 581,806
RAC: 0
Message 25109 - Posted: 27 Aug 2006, 10:15:38 UTC - in response to Message 25108.  

All PowerMACs with the old IBM processors get less than 50%. For example here:
https://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=167475
Pentium 3 and Pentium M seem to gain above average.

Wow! So that means in the terms of the new credit system that powermacs either take longer to do the same work as the "average" host or that the standard boinc client overestimates the benchmarks for macs.
It could be a compiler problem. Compilers for Macs are in most cases much slower than Windows compilers.

©2024 University of Washington
https://www.bakerlab.org