Posts by Janus

1) Message boards : Number crunching : code release and redundancy (Message 4432)
Posted 27 Nov 2005 by Janus
Post:
I think the license sounds fine. If released along with a couple of test WUs, some result quality boundaries and 2 or 3 times redundancy it would be great.

By releasing the code you also attract the portion of the crunchers that only crunch on opensource projects.
2) Message boards : Number crunching : Disk Space (Message 3502)
Posted 17 Nov 2005 by Janus
Post:
Would have been interesting to know what file in the rosetta slot had gone mad...
3) Message boards : Number crunching : code release and redundancy (Message 2822)
Posted 10 Nov 2005 by Janus
Post:
Ok, several things:

1) Security through obfuscation doesn't work. Period.
Going closed source in an attempt to keep people from manipulating/faking results is exactly what will make people do these kinds of things. If redundancy is >=2 they can't get such results validated, and hence there's no point in attempting to cheat the system.

2) Furthermore, closed source still allows people to set artificially high claimed credit values for their work, repeatedly return the same or similar-looking results, or something like that, unless the redundancy is at least 2.

3) I've learned that in science an independently confirmed result is worth at least twice as much as a result that cannot be fully trusted.

4) Most of the applications that are compiled on my own machine run between 10 and 50% faster than a standard x86 version, simply because the GCC or ICC compilers have nifty optimizations for almost every CPU on earth. Those optimizations don't break the code and will provide the exact same results, but utilize special features of the CPU to gain speed.

Setting redundancy to 2 while at the same time releasing the source code therefore does not equal a 50% drop in overall crunching power.

5) Even with closed source you will encounter people with CPUs so overclocked that the result is utterly wrong. With redundancy this result will be dropped. Also, there's a small difference in how platforms handle floating-point operations. This can cause the results to differ across platforms. Without redundancy you may never know of some strange bug only apparent on a single odd platform. If I remember correctly, a couple of other projects had to rewrite their science code after seeing that the results from different platforms didn't match up at all.
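
As a rough illustration of point 5, a validator running with redundancy 2 could compare the two returned results within a tolerance instead of bit for bit, so small floating-point differences between platforms pass while a badly wrong result does not. This is only a sketch: the function name, the list-of-energies result format and the tolerance value are assumptions made for the example, not the actual BOINC or Rosetta validator.

```python
import math

def results_agree(a, b, rel_tol=1e-6):
    """Return True if two results (lists of floats) match within rel_tol."""
    if len(a) != len(b):
        return False
    return all(math.isclose(x, y, rel_tol=rel_tol) for x, y in zip(a, b))

# Hypothetical energy values returned by three hosts for the same work unit
host_linux = [-123.456789, -98.7654321]
host_windows = [-123.456790, -98.7654320]   # tiny platform rounding difference
host_overclocked = [-123.456789, 4021.5]    # clearly broken value

print(results_agree(host_linux, host_windows))      # True
print(results_agree(host_linux, host_overclocked))  # False
```

How tight the tolerance can be depends on the science app, which is why a test WU with accepted value limits would help.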

---

Given the current influx of users, the current credit rate and climbing RAC, it probably won't be more than a month or two before you hit an RAC of 1000k (it's 560k right now and I haven't even started yet, hehe).

"Live" Rosetta RAC based on last XML export

---

Well, I'd say do what the title of this thread says ("code release and redundancy"):
Go for the reliable results (redundancy=2 or higher)
AND
Release the code to get the results faster (and help optimizers by providing a test-WU+result and telling what limits for the values are accepted)
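
To put rough numbers on that combination: taking the 10 to 50% speedup from point 4 and a redundancy of 2, the useful throughput works out as below. This is back-of-the-envelope arithmetic only, and it assumes every host runs an optimized build and that "faster" means proportionally more results per unit time; the function name is made up for the example.

```python
def effective_throughput(speedup, redundancy):
    """Validated results per unit time, relative to an unoptimized
    app running with redundancy 1 (baseline = 1.0)."""
    return speedup / redundancy

for gain in (0.10, 0.50):
    rel = effective_throughput(1.0 + gain, redundancy=2)
    print(f"{gain:.0%} faster app, redundancy 2: {rel:.0%} of baseline")
# 10% faster app, redundancy 2: 55% of baseline
# 50% faster app, redundancy 2: 75% of baseline
```

So under these assumptions the drop would be somewhere between 25% and 45%, not 50%.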

It's your call though. No matter what you choose, you are probably going to lose a bunch of users who disagree with your decision. Tough one, eh?

ps. If you know exactly how many iterations are spent in each loop of the science app you can use the most precise of the credit measurement systems in BOINC. But it takes a little while to actually figure that stuff out... CPDN can do it this way because they have an app that always spends the same number of cycles in particular loops.

pps. Oh, one thing about redundancy=2: It is conservative credit-wise. You will always get granted at most the amount of credit that you deserve. Some people tend to dislike this... I find it way more fair than the current "get whatever you like" credit strategy.
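
The "conservative" behaviour in the pps can be sketched like this, assuming the granted credit for a quorum of 2 is simply the lower of the two claimed values. That rule and the function name are assumptions for illustration; the real granting policy is whatever the project's validator is set up to do.

```python
def granted_credit(claimed):
    """Grant every host in the quorum the lowest claimed credit (assumed rule)."""
    if len(claimed) < 2:
        raise ValueError("need at least 2 results for a quorum of 2")
    return min(claimed)

# One honest claim and one artificially inflated claim for the same work unit
print(granted_credit([12.4, 95.0]))  # 12.4
```

With a rule like this an inflated claim gains nothing, since both hosts get the lower value.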
4) Message boards : Rosetta@home Science : A question on stats... (Message 2655)
Posted 8 Nov 2005 by Janus
Post:
The timestamp is included in tables.xml: 1131472820
That latest timestamp corresponds to Tue, 08 Nov 2005 18:00:20 GMT, so yup.
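
For anyone who wants to check a timestamp like this themselves, a minimal Python sketch (the variable names are just for the example; the value is the one quoted from tables.xml):

```python
from datetime import datetime, timezone

stamp = 1131472820  # Unix timestamp quoted from tables.xml
dt = datetime.fromtimestamp(stamp, tz=timezone.utc)
print(dt.strftime("%a, %d %b %Y %H:%M:%S GMT"))  # Tue, 08 Nov 2005 18:00:20 GMT
```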
5) Message boards : Number crunching : pending credit (Message 2628)
Posted 8 Nov 2005 by Janus
Post:
Since the quorum size here is 1 unit and not 3, I guess you are just suffering from validator lag. The longest lag I had here was a couple of minutes; try refreshing your list.
6) Message boards : Number crunching : Is it time to verify Results (Message 2622)
Posted 8 Nov 2005 by Janus
Post:
Yup, I'm just worried that I may end up harming the project by trying to contribute more...
7) Message boards : Number crunching : Is it time to verify Results (Message 2617)
Posted 8 Nov 2005 by Janus
Post:
When you look at the results, is it possible for you to tell whether they are unlikely to be correct by comparing them with "adjacent" results?
For instance if I download the source code and optimize it a bit too much so that it always generates energy ratings that are too high compared to the correct result (I assume that in the example you just mentioned the energy rating was too low) - would you be able to tell? Perhaps from some kind of spike in the "landscape" you are searching?

What about redundancy set at 2 results? This seems like a good strategy to me if the answer to the above is "no".

And when looking at the growth rate of this project it's probably not that bad to cut it in 2. Ok, you do have the coolest server equipment, so perhaps that's not an issue =)

©2024 University of Washington
https://www.bakerlab.org