Posts by Aegion

1) Message boards : Rosetta@home Science : Zika virus. (Message 80115)
Posted 21 May 2016 by Aegion
Post:
For the record, World Community Grid (which is also available through the BOINC client software) just went public with the OpenZika distributed computing project, which is specifically dedicated to finding viable antiviral medications to fight the Zika virus.
http://www.worldcommunitygrid.org/about_us/viewNewsArticle.do?articleId=480
2) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 73425)
Posted 11 Jul 2012 by Aegion
Post:
why all these errors?

If you take a look at the recent post from Ray Wang in the CASP 10 thread of the science forum, he says the following, which likely explains what you observed.

"Hello everyone!!

Apologies!

I am Ray Wang, one of the protein structures predictors of Baker Lab CASP10 team. Few days ago, there were massive error submissions which caused the success rates to be zero. That was due to a remiss update of the CASP working pipeline. We were very sorry about causing the inconvenience for you all, and for sure will be doing more meticulous checking before we trigger the pipeline!

Again Sorry about this, and THANKS you all for the contribution to the Rosetta@home!
We couldn't accomplish these scientific feats without your participation!!!!!"
3) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 73364)
Posted 30 Jun 2012 by Aegion
Post:
It should be noted that version 7.0.28 of BOINC is no longer the development version but is now in finalized general release as the currently available version on the Berkeley website, so anyone having issues with 7.0.28 should definitely try upgrading.
4) Message boards : Rosetta@home Science : Rosetta@home Web Site Translations (Message 34475)
Posted 10 Jan 2007 by Aegion
Post:
Hi,

I can give you a hand translating to spanish, I´m from Paraguay so I´m native spanish speaker.

What do I have to do or who do I have to contact ?

KAYROS

Try going here for the words and the format needed in order to create a Spanish version of the website.
http://boinc.bakerlab.org/rosetta/en.po

Earlier posts in this thread give examples of how things should work once the document is translated, and specifically include jmurdoch's attempt at a Spanish translation, although I'm not sure to what extent it is in fact usable.

Most of the rest of the information, including how to contact David Kim directly, can be found at this link. You should send the file to David Kim once you have it fully and properly translated.
http://boinc.bakerlab.org/rosetta/rah_translate.php

If you don't understand some of the scientific terms well enough to translate them effectively into Spanish, either contact David Kim directly and ask for help with the passages in question, or go ahead and post what you don't yet fully grasp in this thread.

Your help is definitely appreciated, since Spanish is the most needed language currently not part of the Rosetta@home webpage translation options.
5) Message boards : Number crunching : code release and redundancy (Message 4917)
Posted 2 Dec 2005 by Aegion
Post:
FWIW, my "votes"
I prefer open or at least visible source, and would be in favor of redundancy by two. But I'll stay here regardless.

I don't get why the idea of rosetta redundancy so outrages some people.

The key is that the nature of the science isn't actually threatened at all by someone simply cheating in order to boost their score, which removes the key reason for redundancy in other projects. At worst that person is just not productively helping the project, but he's not significantly harming the science of the project in a relevant way.

The only real threat would be someone deliberately faking a perfect, or close to perfect, result using knowledge of what protein structure we are trying to predict, and thereby confusing the answer to what strategies actually serve to have software predict the shape of the protein without knowledge of its original shape. Of course the only people who could plausibly do this and trick the scientists involved with the project are those who are experts themselves in this area, so that makes this possibility extremely unlikely.

Basically, the reason many people are against redundancy is that you're cutting the effective CPU power of the project in half, or more if you're talking about triple redundancy, while gaining no measurable scientific benefit, even in assured accuracy, in the process.
6) Questions and Answers : Getting started : Rosetta project URL (Message 4111)
Posted 24 Nov 2005 by Aegion
Post:
The project url is http://boinc.bakerlab.org/rosetta. Welcome to the project!
7) Message boards : Rosetta@home Science : Graphics (Message 3921)
Posted 22 Nov 2005 by Aegion
Post:
Ok, it looks like there still may be a bug to iron out with the Graphics unit beta. I was viewing the graphics for a unit which was about 20% complete. (I had received a graphics unit sometime previously and that one had processed the protein just fine.) Suddenly the unit terminated and another one was downloaded from the server, at which point the program proceeded normally. The error message sequence was as follows.

11/22/2005 5:50:32 AM|rosetta@home|Unrecoverable error for result 1dcj__abrelax_rand_len10_jit02_omega_sim_graphics_00353_0 ( - exit code -1073741819 (0xc0000005))
11/22/2005 5:50:32 AM||request_reschedule_cpus: process exited
11/22/2005 5:50:33 AM|rosetta@home|Deferring communication with project for 1 minutes and 0 seconds
11/22/2005 5:50:33 AM|rosetta@home|Computation for result 1dcj__abrelax_rand_len10_jit02_omega_sim_graphics_00353_0 finished
11/22/2005 5:51:32 AM|rosetta@home|Sending scheduler request to http://boinc.bakerlab.org/rosetta_cgi/cgi
11/22/2005 5:51:32 AM|rosetta@home|Requesting 8640 seconds of work, returning 1 results
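
For anyone reading the log above, the two numbers in the exit code line are the same value written two ways: the signed 32-bit decimal form and the unsigned hexadecimal form. On Windows, 0xC0000005 is the access violation status code. A minimal Python sketch of the conversion follows; the function name is hypothetical and is not part of BOINC or Rosetta.

```python
def exit_code_to_hex(code: int) -> str:
    """Render a signed 32-bit process exit code as unsigned hexadecimal."""
    # Masking with 0xFFFFFFFF reinterprets the negative value as an
    # unsigned 32-bit integer, which is how Windows status codes are listed.
    return f"0x{code & 0xFFFFFFFF:08x}"


# Exit code taken from the log line above.
print(exit_code_to_hex(-1073741819))  # prints 0xc0000005
```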
8) Message boards : Rosetta@home Science : Website suggestions and comments (Message 3726)
Posted 20 Nov 2005 by Aegion
Post:
After thinking things through, I have two particular suggestions to make. A basic issue that is definitely true of the current active distributed computing community in general, and probably true of most people downloading a distributed computing project for the first time, is that they don't just agree to join up with a project because it's posted at some website. They want a tangible and reliable organization backing the project so they can be sure they are not simply downloading spyware or viruses and they can feel more confident that the project is involved with something worthwhile. The University of Washington is that organization in this case, and you want to make that association as clear as possible at first glance.

Right now I observed that there is essentially nothing in the "Welcome from David Baker" page clearly showing the project is associated with The University of Washington. (While the logo being larger would help, the more clear indications that someone can find upon initial examination the better.) Right now the end of the introduction simply has your name. I would suggest adding your title at the University of Washington under your name to establish the project's link to the university more clearly. This has the added bonus of clarifying your academic credentials to help assure potential new participants that you are a qualified scientist so they can feel more confident that the project is scientifically sound.

The other issue I noticed on the same page is that the link from it to the "David Baker Profile" has a similar problem. While it conveys your title, it doesn't very clearly convey WHICH university you are part of. If someone actually clicks on the link at the end of the page, it should become clear that you are part of the University of Washington, but ideally people shouldn't have to go that far just to figure this detail out. The abbreviation "UW" alone may be fine when dealing only with University of Washington students, but people from all over the world who stumble across this website may have no clue what the heck "UW" is an abbreviation for. If changing the text on the page is a concern because the text is taken from the original article, I'd suggest adding "University of Washington" in brackets after the abbreviation.
9) Message boards : Rosetta@home Science : Rosetta Science for idiots (Message 3241)
Posted 15 Nov 2005 by Aegion
Post:
10) Message boards : Number crunching : code release and redundancy (Message 3140)
Posted 14 Nov 2005 by Aegion
Post:
I may be wrong, but my understanding is that all Rosetta work units use a random seed to begin with, at the client's end.

So if the same work unit is sent to two participants, they would not return the same result anyway. They might need to send it to millions of participants to get two matching results.

Based on my past knowledge gained from participation in a distributed computing project with virtually identical goals, I'm pretty certain that what you're saying should be the case, since it certainly is how it worked with the other project.
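
A toy illustration of the random-seed point, assuming nothing about Rosetta's actual code: the function below is a hypothetical stand-in for a work unit, a short random walk driven entirely by the seed. Two runs with the same seed return identical results, while two clients that pick different seeds return different results for the same work unit, which is why simple result matching wouldn't work as a redundancy check.

```python
import random


def toy_work_unit(seed: int, steps: int = 5) -> list:
    """Hypothetical stand-in for a work unit: a random walk driven by the seed."""
    rng = random.Random(seed)  # private generator, so runs don't interfere
    position = 0.0
    trajectory = []
    for _ in range(steps):
        position += rng.uniform(-1.0, 1.0)
        trajectory.append(round(position, 6))
    return trajectory


client_a = toy_work_unit(seed=1)
client_a_again = toy_work_unit(seed=1)
client_b = toy_work_unit(seed=2)

print(client_a == client_a_again)  # same seed, same result
print(client_a == client_b)        # different seeds, different result
```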
11) Message boards : Number crunching : Who exactly is sponsoring this project? (Message 3138)
Posted 14 Nov 2005 by Aegion
Post:
I've read thru the about files on this site, etc, but for the life of me, I can't figure out if this project is sponsored by a university or a company, or just Dave Baker.

Maybe I just missed it, but the only reference I see to any "non profit" is the copyright "University of Washington" at the bottom of the page...


How is this project funded/backed? I think that a fair question from all crunchers...

The project is backed by the University of Washington, where David Baker is a professor, and run out of The Baker Laboratory.
http://depts.washington.edu/bakerpg/

I'm not 100% sure about all their sources of funding specifically for this project. I would expect the University of Washington to be directly responsible for at least part of it, but I'll wait for someone who is part of the Baker Lab to give you the precise details.
12) Message boards : Number crunching : code release and redundancy (Message 3137)
Posted 14 Nov 2005 by Aegion
Post:
I'm here for both the science AND the stats.

With regards to redundancy, I find it difficult to have much faith in unverified results. How much important data do you write off due to unstable machines before you consider it worth having redundancy? How do we know we haven't already missed the most important result because it got corrupted? The other boon to having redundancy is that it partially minimises the effects of credit cheats or "tweakers".

I suspect that you, like a few other individuals in this thread, haven't thought through the full ramifications of how the science of the project relates to these threats.

This project is not like an encryption-breaking effort, where every computer gets assigned a different key, or SETI, where, unless redundant units are given, everyone ends up with a different data point and a screwed-up result could mean missing a sign of intelligent life being out there.

The goal of the project is to predict the outcome of protein folding with regard to an amino acid chain's shape. Ultimately it's only the most accurate result, as long as it can be accurately detected and recognized, that really matters for this project. Theoretically you could have thousands of incredibly inaccurate results and only one that happens to predict the protein folding shape quite accurately, and the project could be considered spectacularly successful. Now in practice, if everyone but one person is getting horribly inaccurate predictions, it suggests that something should be done so that more people are coming closer, increasing the odds that one will get lucky and be extremely accurate. Having said that, it's ultimately how accurate the most accurate result is that determines whether this distributed folding project can be considered successful or not.

What this means is that, from a scientific standpoint, if 5% of the participants in the project are cheating and producing entirely bogus results, the project only loses the 5% of participants who could otherwise have chosen to crunch productively for it. By contrast, redundant units for each person mean a loss of 50% in effective computing power! The only concern from a scientific standpoint is that a participant's cheating will somehow trick the server-side software into thinking that a badly off result is actually extremely accurate. However, since the top predicted results for a particular protein should get manually examined by a scientist involved with the project, it is VERY unlikely that they can trick a trained scientist, so no actual damage to the science of the project would be done. The only risk for the science would be if such a large portion of the participants started cheating that it became tough to determine whether a tweak to the code brings the results in general closer to being accurate. However, that level of cheating is unlikely to occur.
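
To put rough numbers on that comparison, here is a minimal sketch in Python. The function name and the simple model are assumptions made for illustration, not anything taken from the project: every host contributes equally, cheaters contribute nothing useful, and a redundancy of N means each work unit is computed N times.

```python
def effective_fraction(cheater_fraction: float, redundancy: int) -> float:
    """Fraction of total computing power that yields unique, useful results.

    Illustrative model only: honest hosts contribute equally, cheaters
    contribute nothing, and each work unit is computed `redundancy` times.
    """
    if not 0.0 <= cheater_fraction <= 1.0:
        raise ValueError("cheater_fraction must be between 0 and 1")
    if redundancy < 1:
        raise ValueError("redundancy must be at least 1")
    return (1.0 - cheater_fraction) / redundancy


# 5% cheaters, no redundancy: about 95% of the computing power is still useful.
print(f"{effective_fraction(0.05, 1):.0%}")
# No cheaters, redundancy of two: half of the computing power is duplicated work.
print(f"{effective_fraction(0.0, 2):.0%}")
# No cheaters, triple redundancy: only about a third remains.
print(f"{effective_fraction(0.0, 3):.0%}")
```

Under this model, redundancy of two only breaks even if half of all results would otherwise be bogus, which is far above the 5% case discussed above.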

While I'm not at all convinced that cheating is such a problem in the first place, you should recognize that it's not a threat to the actual science of the project.

Edit: Biggles, obviously feel free to quiz me more in this thread or in TVR's thread on the Ars Technica forum if you want more info relating to the science of the project itself.






©2024 University of Washington
https://www.bakerlab.org