code release and redundancy

Author	Message
Yeti Send message Joined: 2 Nov 05 Posts: 45 Credit: 10,901,509 RAC: 70,541	Message 2825 - Posted: 10 Nov 2005, 19:24:33 UTC Last modified: 10 Nov 2005, 19:25:31 UTC Redundancy will not really become a problem. I don't like crunchers that cheat; if I know that there is redundancy, I know that they don't really have a big impact, so I can live with it. And I don't think that redundancy is a big waste of computing power; Seti@Home has wasted a great deal of computing power by letting users crunch the same result 10 times and more ... As Janus said: 3) I've learned that in science an independently confirmed result is worth at least twice as much as a result that cannot be fully trusted. I think that's important! Supporting BOINC, a great concept ! ID: 2825 · Rating: 0 · rate: / Reply Quote

Jocelyn Larouche Send message Joined: 9 Nov 05 Posts: 11 Credit: 6,994 RAC: 0	Message 2827 - Posted: 10 Nov 2005, 19:33:22 UTC As far as coding goes, you should probably get help from experts you have very close contact with, to prevent unofficial programs from being run. As far as the algorithm goes, this should be open to the general public for discussion. ID: 2827 · Rating: 0 · rate: / Reply Quote

rbpeake Send message Joined: 25 Sep 05 Posts: 168 Credit: 247,828 RAC: 0	Message 2837 - Posted: 10 Nov 2005, 20:22:19 UTC - in response to Message 2781. Even with the progress of the last two days, it is clear we are going to be CPU time limited for the foreseeable future. So redundancy really is wasteful. How would people feel about this: we do not resort to redundancy, so every calculation is unique, but we wait to release the code until the credit issue is resolved. Maybe I am a bit naive, but as they say "if it ain't broke, don't fix it". The way this project is set up now is fine with me! I do not believe Folding@home uses redundancy either, and for the same reason as Rosetta@home, which is that if a result is garbage it doesn't make any difference. It is not used. And I am not so concerned with cheaters, because the science and tracking my own progress is what counts most for me. And it does not appear cheating is rampant at any rate. So David, bottom line, I agree with your statement! :) Regards, Bob P. ID: 2837 · Rating: 0 · rate: / Reply Quote

David E K Volunteer moderator Project administrator Project developer Project scientist Send message Joined: 1 Jul 05 Posts: 1018 Credit: 4,334,829 RAC: 0	Message 2841 - Posted: 10 Nov 2005, 20:51:00 UTC I agree with Janus. Projects that have been around for a while using BOINC (Seti, in particular), have already dealt with these issues. The possible benefits of releasing the code (users can contribute in yet another way, which can directly impact the project positively, such as code optimization) and using redundancy (to prevent cheating) outweigh the decrease in production, in my opinion. Also, there are a lot of talented developers out there and I wouldn't rule out the possibility of increased production from their contributions. Here is a copy of the license that has been drafted by UW TechTransfer so far: Rosetta++ Software for BOINC The University of Washington (UW) and the developers of the Rosetta software (Developers) give permission for you (You) to optimize and modify the grid version of Rosetta++ protein prediction software (Software) in order to assist in the advancement of computational protein prediction as part of the Rosetta@Home project. Software was developed through support from the National Institutes of Health, Human Frontier Science Program Grant, National Science Foundation, Office of Naval Research, Packard Foundation, the Damon Runyon Cancer Research Foundation, Jane Coffin Childs Foundation, in part by researchers of the Howard Hughes Medical Institute (HHMI) at UW, and a cast of thousands of biochemistry researchers like you. UW and the Developers allow You to copy and modify Software, solely for internal, non-profit research purposes, on the following conditions: The Software remains with You and is not published, distributed, or otherwise transferred or made available to anybody else except those of us here at Rosetta@Home. 
You agree not to use the Software for research or any other purpose than optimization and debugging of Software for use in the Rosetta@Home project. If you wish to obtain the software for research purposes, please go to http://depts.washington.edu/ventures/ and go to the Express Licensing section on that website. You agree to provide the Developers with feedback on the operation of the Software and modifications You make to the Software which optimize performance of the Software (Modified Software), and that the Developers and UW are permitted to use any information You provide to Rosetta@Home in making changes to the Software. You grant UW the right to copy, modify, and distribute Modified Software, including the Modified Software source code, as part of the Rosetta@Home BOINC distributed computing project and to use your feedback and code suggestions in any other distributed computing projects utilizing Modified Software. If You wish to obtain Software for any commercial purposes, including fee-based service projects, You will need to execute a separate licensing agreement with the University of Washington and pay a fee. In that case please contact: license@u.washington.edu You retain in Software and any modifications to Software, the copyright, trademark, or other notices pertaining to Software as provided by UW. You acknowledge that the Developers, HHMI, UW and its licensees may develop modifications to Software that may be substantially similar to your modifications of Software, and that the Developers, HHMI, UW and its licensees shall not be constrained in any way by You in HHMI's, UW's or its licensees' use or management of such modifications. 
You acknowledge the right of the Developers, HHMI, and UW to prepare and publish modifications to Software that may be substantially similar or functionally equivalent to your modifications and improvements, and if You obtain patent protection for any modification or improvement to Software You agree not to allege or enjoin infringement of your patent by the Developers, HHMI, the UW or by any of UW's licensees obtaining modifications or improvements to Software from the University of Washington or the Developers. Any risk associated with using the Software at your institution is with You and your Institution. Software is experimental in nature and is made available as a research courtesy "AS IS," without obligation by UW to provide accompanying services or support. UW AND THE AUTHORS EXPRESSLY DISCLAIM ANY AND ALL WARRANTIES REGARDING THE SOFTWARE, WHETHER EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO WARRANTIES PERTAINING TO NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. ID: 2841 · Rating: 0 · rate: / Reply Quote

Hoelder1in Send message Joined: 30 Sep 05 Posts: 169 Credit: 3,915,947 RAC: 0	Message 2844 - Posted: 10 Nov 2005, 20:53:29 UTC Hi all ! I'd like to make a suggestion for how to solve the cheating issue without resorting to redundancy. As some people seem to have spent quite a lot of thought on the credit/cheating issue, I am sure something similar must already have been discussed. Here is my suggestion anyway: Looking at the required CPU times of individual WU types (same WU name without the running number at the end) it seems that for the same CPU/OS combination the required times don't vary all that much; by less than 20% (stdv) I would guess. So the number of CPU cycles required for a given WU type should vary by about the same amount. My suggestion therefore is to take the median of the claimed credit for the first 100 or first 1000 units of a given WU type and then grant that value to all WUs of that type. Assuming that only a small fraction of WUs will be from cheaters, the median should be quite well defined. We would at any rate average over a much larger number than in the twofold redundancy case. The 20% or so difference in required cycles for a given WU type should quickly average out for individual users after they have received a certain number of WUs of that type. So, would the BOINC infrastructure be able to handle such an approach? -Hermann ID: 2844 · Rating: 0 · rate: / Reply Quote
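
A minimal sketch of the median-credit idea in the post above, in Python. The function name `granted_credit`, the `sample_size` parameter, and the sample numbers are hypothetical and only illustrate the arithmetic; nothing here reflects how BOINC or Rosetta@home actually grants credit.

```python
from statistics import median


def granted_credit(claimed, sample_size=100):
    """Median of the first `sample_size` claimed credits for one WU type.

    `claimed` is a list of claimed credits in the order results arrived.
    The median is used (rather than the mean) so that a small fraction of
    inflated claims has little effect on the value granted to everyone.
    """
    sample = claimed[:sample_size]
    if not sample:
        raise ValueError("no claimed credits yet for this WU type")
    return median(sample)


# Hypothetical claims for one WU type; one entry is an inflated claim.
claims = [10.2, 9.8, 10.0, 10.5, 9.9, 500.0, 10.1]
print(granted_credit(claims))
```

With these made-up numbers the single inflated claim of 500.0 does not move the granted value, which is the property the post relies on when it assumes that only a small fraction of WUs come from cheaters.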

hugothehermit Send message Joined: 26 Sep 05 Posts: 238 Credit: 314,893 RAC: 0	Message 2866 - Posted: 11 Nov 2005, 4:07:50 UTC My vote is for open source. Although, the WU that has the lowest Energy and RMSD (before using it for more computation or placing it in the top predictions) should be checked by one of your local machines running the official app, thereby getting rid of redundancy on a major (full user base) scale, bad compiles and idiots. Cheating is BOINC's problem, not yours, and should be forgotten about at your end. I would hate to think that you would go for redundancy on all WU's, as your computing power will be more than halved by re-sending the conflicting WU's out again. I probably wouldn't be able to help with the optimization as I've been out of it too long, and self taught to boot, and for all you knockers of C / C++, it's a beautiful language let down by its compilers, which end I shan't say :) My bet for 1 million a day is 2006 Jan 05 8:45:25AM GMT , if I'm right I want a postcard signed by the R@H team ;P ID: 2866 · Rating: 0 · rate: / Reply Quote

EclipseHA Send message Joined: 3 Nov 05 Posts: 12 Credit: 284,797 RAC: 0	Message 2868 - Posted: 11 Nov 2005, 4:25:48 UTC What really dumbfounds me with the whole "open source" discussion is that, with the exception of seti, there's not been another project where it's not been clear to the folks running the project that the cruncher code needed to be controlled. Why is this such a mystery here? Are the rosetta admins not talking to the PP, the Einstein, the LHC, or CPDN folks, on a restricted email list to exchange experiences? We all know that Seti isn't really doing "science", but only looking in general for signals that may be worth looking into (the real signal data isn't that important), while other projects, like rosetta, are doing specific calculations, to find real solutions to real world problems. I'd guess, that most that want "open source", won't use the code themselves, but are looking for an optimized cruncher so they can get more credits.... Einstein, for example, had a problem that the Linux cruncher was much slower than the windows cruncher. They didn't open the source code, but instead, fixed the linux cruncher. That seems to be the "model" on how projects should fix their crunchers... ID: 2868 · Rating: 0 · rate: / Reply Quote

hugothehermit Send message Joined: 26 Sep 05 Posts: 238 Credit: 314,893 RAC: 0	Message 2869 - Posted: 11 Nov 2005, 4:58:12 UTC azwoody, let me explain my position on this. I'd guess, that most that want "open source", won't use the code themselves, but are looking for an optimized cruncher so they can get more credits.... The ability to optimise code isn't a bad thing, it helps to send WU's back faster, thereby increasing the computing power of the project. It's only a bad thing when someone inadvertently or deliberately changes the code and starts to produce incorrect results or malware. I'm sure the R@H team would love their code to be fully optimised for every OS and CPU combination. I would like the programme open source and I don't really care about credits, it's just that if we can optimise the code we can speed up the science. ID: 2869 · Rating: 0 · rate: / Reply Quote

EclipseHA Send message Joined: 3 Nov 05 Posts: 12 Credit: 284,797 RAC: 0	Message 2870 - Posted: 11 Nov 2005, 5:09:57 UTC - in response to Message 2869. azwoody, let me explain my position on this. I'd guess, that most that want "open source", won't use the code themselves, but are looking for an optimized cruncher so they can get more credits.... The ability to optimise code isn't a bad thing, it helps to send WU's back faster, thereby increasing the computing power of the project. It's only a bad thing when someone inadvertently or deliberately changes the code and starts to produce incorrect results or malware. I'm sure the R@H team would love their code to be fully optimised for every OS and CPU combination. I would like the programme open source and I don't really care about credits, it's just that if we can optimise the code we can speed up the science. Then why, other than seti, does no other project release their source code? Many smart folks, directly working with the other main-stream projects, seem to have all come to the same conclusion. And that conclusion is "do not release source code"? Remember, faster is not always better! And most projects seem to understand that! ID: 2870 · Rating: 0 · rate: / Reply Quote

UBT - Halifax--lad Send message Joined: 17 Sep 05 Posts: 157 Credit: 2,687 RAC: 0	Message 2873 - Posted: 11 Nov 2005, 6:36:09 UTC Many other projects already have technical experts in their buildings who are able to provide clients for every type of computer. The guys here don't, so some people are excluded from this project. Open source won't be a problem; in my opinion it will just advance and help the project. It's exactly like someone else mentioned further up: whatever decision is made, some will like it and some will hate it. Roll on OPEN SOURCE!! Join us in Chat (see the forum) Click the Sig Join UBT ID: 2873 · Rating: 0 · rate: / Reply Quote

Shaktai Send message Joined: 21 Sep 05 Posts: 56 Credit: 575,419 RAC: 0	Message 2876 - Posted: 11 Nov 2005, 8:26:48 UTC I think releasing the code is a good idea. However, I think it would be best to manage the release of the code to volunteers who agree to the terms of the code. I agree with David Kim's proposal. That way you control the code of the apps that actually goes out to users. There are a lot of people who are very good at optimizing code, and if they are willing to donate their expertise so it can be rolled back into the project, that is a good thing. I don't like the idea of redundancy if it does not benefit the science. For some projects, redundancy is beneficial, but if it doesn't benefit the project, don't do it. I'm here for the science. The credit is just for fun. Roll out the code, but manage it in a manner that restricts the code release to individuals who agree to the science. Question: Is it possible to take the optimized code from the volunteers, and then incorporate a "verification code" of some kind that would be required to actually get credit? The "verification code" would then be added by the Rosetta team to actual public released apps. There is no perfect answer or solution. However, I think that redundancy would be more damaging to the project in the long run, if it is not needed for the science. Avoid redundancy if there is no scientific benefit. Open to optimization, if you can manage and verify the contributions. Team MacNN - The best Macintosh team ever. ID: 2876 · Rating: 0 · rate: / Reply Quote

Hoelder1in Send message Joined: 30 Sep 05 Posts: 169 Credit: 3,915,947 RAC: 0	Message 2878 - Posted: 11 Nov 2005, 9:38:57 UTC - in response to Message 2876. Last modified: 11 Nov 2005, 9:45:49 UTC I think releasing the code is a good idea. However, I think it would be best to manage the release of the code to volunteers who agree to the terms of the code. I agree with David Kim's proposal. That way you control the code of the apps that actually goes out to users. There are a lot of people who are very good at optimizing code, and if they are willing to donate their expertise so it can be rolled back into the project, that is a good thing. I don't like the idea of redundancy if it does not benefit the science. For some projects, redundancy is beneficial, but if it doesn't benefit the project, don't do it. I'm here for the science. The credit is just for fun. Roll out the code, but manage it in a manner that restricts the code release to individuals who agree to the science. Question: Is it possible to take the optimized code from the volunteers, and then incorporate a "verification code" of some kind that would be required to actually get credit? The "verification code" would then be added by the Rosetta team to actual public released apps. There is no perfect answer or solution. However, I think that redundancy would be more damaging to the project in the long run, if it is not needed for the science. Avoid redundancy if there is no scientific benefit. Open to optimization, if you can manage and verify the contributions. I liked the comments by Shaktai: releasing the code in a controlled way but no redundancy plus measures to prevent the use of self-compiled applications ! "verification code" is probably meant to mean the returned results should be signed, with the key being tucked away in the executable of the application. As Shaktai suggested, the key and the part of the code that does the signing would have to be kept secret. 
Whenever it is feared that the key has been compromised one would simply have to distribute a new version of the application containing a new key. I haven't investigated how things are handled in the current software, but I had assumed that this is what is done even today ? In addition, credit fraud using self-compiled BOINC clients could be prevented by the 'granted credit' calculation I proposed further down in this thread. ID: 2878 · Rating: 0 · rate: / Reply Quote

stephan_t Send message Joined: 20 Oct 05 Posts: 129 Credit: 35,464 RAC: 0	Message 2879 - Posted: 11 Nov 2005, 10:10:04 UTC - in response to Message 2878. Last modified: 11 Nov 2005, 10:10:40 UTC I liked the comments by Shaktai: releasing the code in a controlled way but no redundancy plus measures to prevent the use of self-compiled applications I strongly disagree - a) your suggested way of calculating credit involves giving people a credit 'average' - I don't want an average of my last 30 WU, I want the precise credit for each b) if the code is out, it's out. Think Pandora's box. Just go to alt.binaries.warez to see where 'nda/For your eyes only' code ends up. c) preventing the use of self-compiled applications? With that sort of mentality I take it you are not using Linux. I'm much more inclined to agree with Janus' comments above. If 100% bulletproof cheating prevention can only be implemented via redundancy, then redundancy it should be. I can't imagine the project getting many hardcore crunchers if it becomes a big cheat 'free-for-all'. Team CFVault.com http://www.cfvault.com ID: 2879 · Rating: 0 · rate: / Reply Quote

Hoelder1in Send message Joined: 30 Sep 05 Posts: 169 Credit: 3,915,947 RAC: 0	Message 2880 - Posted: 11 Nov 2005, 10:28:45 UTC - in response to Message 2879. Last modified: 11 Nov 2005, 10:30:41 UTC a) your suggested way of calculating credit involves giving people a credit 'average' - I don't want an average of my last 30 WU, I want the precise credit for each Well, f@h uses fixed credits for each WU type. I am sure they don't all require the same number of cycles to complete. b) if the code is out, it's out. Think Pandora's box. Just go to alt.binaries.warez to see where 'nda/For your eyes only' code ends up. c) preventing the use of self-compiled applications? With that sort of mentality I take it you are not using Linux. I don't get it. pgp/gnupg is open source and everyone can play around with the code, yet it is supposed to be safe. ID: 2880 · Rating: 0 · rate: / Reply Quote

stephan_t Send message Joined: 20 Oct 05 Posts: 129 Credit: 35,464 RAC: 0	Message 2881 - Posted: 11 Nov 2005, 11:05:06 UTC - in response to Message 2880. I don't get it. pgp/gnupg is open source and everyone can play around with the code, yet it is supposed to be safe. You misread me. I agree that open source CAN be safe and in fact sometimes safer than closed source. What you call 'self compilation' is just the natural result of open source, that is, I can take the source and compile it myself. Linux users do that every day with all sorts of apps. So yes, you can take the source of PGP and compile it, it's still safe. My comment was just highlighting that forbidding 'self compilation' is impossible and serves no purpose. Team CFVault.com http://www.cfvault.com ID: 2881 · Rating: 0 · rate: / Reply Quote

Hoelder1in Send message Joined: 30 Sep 05 Posts: 169 Credit: 3,915,947 RAC: 0	Message 2888 - Posted: 11 Nov 2005, 12:35:16 UTC - in response to Message 2881. I don't get it. pgp/gnupg is open source and everyone can play around with the code, yet it is supposed to be safe. You misread me. I agree that open source CAN be safe and in fact sometimes safer than closed source. What you call 'self compilation' is just the natural result of open source, that is, I can take the source and compile it myself. Linux users do that every day with all sorts of apps. So yes, you can take the source of PGP and compile it, it's still safe. My comment was just highlighting that forbidding 'self compilation' is impossible and serves no purpose. OK, sorry. What I meant by preventing self-compilation is this: If I had the source of the application minus the part of the code that contains a secret key with which it signs the application output files, then I could compile the application but still wouldn't be able to return correctly signed result files to the server. Agreed ? Technically it would perhaps still be possible to somehow extract the secret key from the executable but I guess one could disguise/encrypt the key to make that quite hard. ID: 2888 · Rating: 0 · rate: / Reply Quote
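
The signing scheme described in the post above could look roughly like the following Python sketch, which uses an HMAC over the result file. This is an assumption about one way to do it, not a description of what the Rosetta@home or BOINC software does; `SECRET_KEY`, `sign_result`, and `verify_result` are hypothetical names, and the key shown is a placeholder.

```python
import hashlib
import hmac

# Hypothetical secret that would be compiled into the official binary only
# and withheld from the released source.
SECRET_KEY = b"hypothetical-key-embedded-in-official-build"


def sign_result(result_bytes, key=SECRET_KEY):
    """Client side: compute an HMAC-SHA256 tag over the result file contents."""
    return hmac.new(key, result_bytes, hashlib.sha256).hexdigest()


def verify_result(result_bytes, signature, key=SECRET_KEY):
    """Server side: recompute the tag and compare in constant time."""
    expected = sign_result(result_bytes, key)
    return hmac.compare_digest(expected, signature)


result = b"energy=-123.4 rmsd=2.1"
tag = sign_result(result)
print(verify_result(result, tag))                 # official build
print(verify_result(result, sign_result(result, b"some-other-key")))  # self-compiled build
```

A build compiled without the real key produces a tag the server rejects. As the post itself concedes, a key shipped inside an executable can in principle be extracted, so this raises the effort needed to cheat rather than ruling it out.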

John McLeod VII Send message Joined: 17 Sep 05 Posts: 108 Credit: 195,137 RAC: 0	Message 2918 - Posted: 11 Nov 2005, 22:09:06 UTC Redundancy can indeed help the science. SZTAKI started out with no redundancy until one of the hosts went haywire and started returning bogus results - it trashed quite a few before it was caught and redundancy was implemented. I would suggest a quorum of 3 with 3 initially sent. BOINC WIKI ID: 2918 · Rating: 0 · rate: / Reply Quote
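
A rough Python sketch of what a quorum of 3 means for validation, as suggested in the post above. It assumes, purely for illustration, that each returned result reduces to a single number and that "agreement" means matching within a small tolerance; the function name `validate`, the `tolerance` parameter, and the sample values are hypothetical and are not BOINC's validator API.

```python
def validate(results, quorum=3, tolerance=1e-6):
    """Return an agreed value once `quorum` results match, else None.

    `results` is a list of numbers returned by different hosts for the
    same work unit. None means the quorum has not been reached yet and
    another copy of the work unit would have to be sent out.
    """
    for candidate in results:
        agreeing = [r for r in results if abs(r - candidate) <= tolerance]
        if len(agreeing) >= quorum:
            return candidate
    return None


print(validate([-123.4, -123.4, -123.4]))         # three hosts agree
print(validate([-123.4, -123.4, 9999.0]))         # one haywire host: no quorum yet
print(validate([-123.4, -123.4, 9999.0, -123.4])) # a fourth copy resolves it
```

With 3 copies sent initially and a quorum of 3, a single bad host does not get a bogus result accepted; it only delays validation until one more copy comes back.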

ecafkid Send message Joined: 5 Oct 05 Posts: 40 Credit: 15,177,319 RAC: 0	Message 2970 - Posted: 12 Nov 2005, 13:54:13 UTC Last modified: 12 Nov 2005, 14:06:12 UTC Credits seem to be awfully important to some people. I thought this was volunteer and we were doing this because we believed in what the project was doing. I didn't know this was a competition. Wow what a competitive group. Also on the source code, for scientific reasons and control: if you release the source code and somebody modifies it, how do you keep control over your results? Send the same project to multiple people and hope they aren't using a poor self-compiled program that may or may not return bogus results, and somehow take an average of what they return? And as far as credits go I take what I get as I am only doing this to help out. I feel like WE ALL MAKE A DIFFERENCE and who cares about the credit. I will be happy to see the graphics of the projects I am crunching. I will be happy if some good comes out of what WU's I process. Oh well off of my high horse for now. Happy crunching ID: 2970 · Rating: 0 · rate: / Reply Quote

FluffyChicken Send message Joined: 1 Nov 05 Posts: 1260 Credit: 369,635 RAC: 0	Message 2992 - Posted: 12 Nov 2005, 16:55:57 UTC I don't mind redundancy; if you need it, do it. If there are people cheating and it's noticed by members, you will lose a lot of your crunching power if you do nothing about it. And no, it's not just BOINC's problem; rosetta uses boinc so it's also their problem. Yes we are here for the science, but we're also here for some fun along the way, something to make you leave the computer on for a few more hours, spend your hard cash on that faster CPU. Many of the top teams are built upon this; it makes them 'recruit' more members to the team and hence the project. Cheating KILLS this off, and your members. As for going open source with no control over validity.. So I take the source, I alter the code by accident, maybe divide by 2 rather than multiply by two, and this then propagates through the results... OK, so my unvalidated result gets sent back in; can you detect it's got that error? What if I give it to my team members to use, or release it on my webpage? Lots of people start using it because it's 20% faster and I also have one that'll run on the XBOX360. Is that useful to rosetta ? Now you change your code in response to these results... As an aside, if you release the code, how does this also affect the WCG/UD users in the HPF project? Could they compile it for the WCD/UD-BOINC platform so they send results in faster but with a client mishmash? Team mauisun.org ID: 2992 · Rating: 0 · rate: / Reply Quote

Angus Send message Joined: 17 Sep 05 Posts: 412 Credit: 321,053 RAC: 0	Message 2993 - Posted: 12 Nov 2005, 16:58:39 UTC - in response to Message 2970. Credits seem to be awfully important to some people. I thought this was volunteer and we were doing this because we believed in what the project was doing. I didn't know this was a competition. Wow what a competitive group. WOW - you must be new here.... :) and to Distributed Computing. Credits and stats are what keeps the masses running these projects. ( The few percent in it for the science need not start flaming - you know it's true.) Proudly Banned from Predictator@Home and now Cosmology@home as well. Added SETI to the list today. Temporary ban only - so need to work harder :) "You can't fix stupid" (Ron White) ID: 2993 · Rating: 0 · rate: / Reply Quote