Posts by Jack Schonbrun

1) Message boards : Number crunching : Happy Holidays - Admins back in Jan. (Message 7445)
Posted 24 Dec 2005 by Jack Schonbrun
Post:
Just to let you know, I'm taking off for my holiday vacation. I wouldn't expect anyone from University of Washington to be reading these boards from now until the New Year. But I do think in 2006 you will see a renewed vigor in Rosetta@home, as we try to make up for all the problems experienced this past week.

Thanks again to everyone for their patience. And especially to all of you who helped answer everybody's questions, as well as provide encouragement and support. Things will get better.

Happy Holidays!

Jack

2) Message boards : Rosetta@home Science : Feedback, .. bandwidth usage :-( (Message 7429)
Posted 23 Dec 2005 by Jack Schonbrun
Post:


Currently these files are Work Unit specific. Even though they have the same name, they do not always have identical contents.


I'm worried by this.

If someone has more than one WU (and even with the standard 0.1 day cache most of my clients have one crunching and one Ready to Run) then they will have only one copy of the file.


Okay, I wasn't very precise. For the work units that we have been sending out, to the best of my knowledge, these files are identical. But there are situations where these files could have different contents even though they have the same name, and I see that this could cause problems with the BOINC directory structure. I'll make sure that we resolve this before we are ever inclined to send out different files with the same name.

3) Message boards : Rosetta@home Science : Feedback, .. bandwidth usage :-( (Message 7424)
Posted 23 Dec 2005 by Jack Schonbrun
Post:
If you are on dial-up, I would definitely ask for "no new work" from Rosetta until the new year. I can see how redownloading these files every 30 seconds as jobs fail would be maddening, and it would gobble up your monthly download limits. Pretty ridiculous. [edit] I mean, "pretty ridiculous" how many problems we are causing.[/edit]

4) Message boards : Number crunching : No credit! (Message 7423)
Posted 23 Dec 2005 by Jack Schonbrun
Post:

I was gobsmacked to read in another thread that there is as yet no in-house testing of work about to be sent out, and to make changes with no testing and before the holidays seems ludicrous, and dare I say a touch amateurish.

But, I would just like to say thanks to all the volunteers here who have helped me during this brief stay.

Onward to a more stable and Windows 9X friendly project methinks.


Everyone has to do what they are comfortable with, and I hope you will give us another chance once things have stabilized.

We do have in-house testing, but it has obviously been demonstrated to be inadequate. Distributed computing challenges code in ways that can be unanticipated. I think one of the things that makes Rosetta@home interesting is that code and algorithms will be constantly evolving. This probably makes it more likely that we will send out faulty work units. I think this is just going to be that kind of project. From my experience on the Rosetta project before it moved to distributed computing, I know that David Baker will always want to be updating the code with new ideas. Because of this especially, we will definitely need to implement a more rigorous test suite, and I wouldn't blame anyone for waiting to sign back up until that happens.

5) Message boards : Number crunching : Moderated messages moved here (Message 7402)
Posted 23 Dec 2005 by Jack Schonbrun
Post:
This thread has gotten very off-topic.

6) Message boards : Rosetta@home Science : "Unrecoverable Error" message (Message 7387)
Posted 23 Dec 2005 by Jack Schonbrun
Post:
Unfortunately, I think that all current Work Units have the possibility of crashing after 30 seconds. But these do not need to be aborted.

7) Message boards : Rosetta@home Science : Feedback, .. bandwidth usage :-( (Message 7386)
Posted 23 Dec 2005 by Jack Schonbrun
Post:
* I've currently been baby sitting by keeping these large files in a separate folder then copying them back when it deletes and tries to transfer them again, boinc has never complained, neither has rosetta. But if you say they are different then I guess they must be, although you should really 'tie' them to the work unit with a designation or something and notice the file is wrong <whistles>.


They may or may not be different, which is why we had them tied to work units. What you are doing may or may not be okay. We clearly do need some kind of check for this, as it could seriously confuse our results for people to be swapping files in and out.
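One possible shape for such a check, sketched in Python under stated assumptions: each work unit would carry an expected digest for every input file, and the client would compare it against the file on disk before running. The function names (`file_digest`, `matches_work_unit`) and the choice of SHA-256 are illustrative only, not anything Rosetta@home or BOINC is known to use.

```python
import hashlib

def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of the file at `path`, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in fixed-size chunks so large input files need not fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def matches_work_unit(path, expected_digest):
    """Hypothetical check: True if the file on disk has the digest the work unit expects."""
    return file_digest(path) == expected_digest
```

With a check like this, two files that share a name but differ in content would produce different digests, so a swapped-in copy could be detected rather than silently used.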

I will tell DK to modify the requirements webpage.

This is another element of being a new boinc project. Our code was developed for running on our own local clusters, and we are still learning about how to optimally reconfigure it for distributed computing. Your comments are very useful in this regard.

I know that some work went into writing code so that Rosetta can read and write gzipped files. I don't know how hard it would be to swap in bzip2. Is rar open source?

8) Message boards : Rosetta@home Science : 4.81 allows rotating the protein... anybody try it? (Message 7329)
Posted 23 Dec 2005 by Jack Schonbrun
Post:
Thank you Jack for the great work! It's nice to have the interactive graphics. Yes I see what you mean about the black outline looking funny at certain angles. Kind of looks like you're rotating a 2-D object in 3-D space, then you're seeing it on-edge (some of the segments, anyway).


David Kim is actually the one who put in the rotation, must give him credit. I hope to fix the lines problem and add some other features over the holidays, but no promises.

9) Message boards : Number crunching : How to have the best BOINC project. (Message 7327)
Posted 23 Dec 2005 by Jack Schonbrun
Post:
I do think there must be better ways to arrange the message boards.
Their immediacy and non-hierarchical, democratic nature is part of what makes them fun. But it can make it very hard for the admins to keep up. And the same questions have to be answered repeatedly partly because threads are so difficult to navigate.

I'm picturing some kind of dynamic FAQ. I guess a WIKI might be the way to do this.
The thread titles are just too short. And the message boards are too unstructured. We need a way to have clear descriptions of problems that people might have, arranged in a way that is easy for people to find. But perhaps also easy for others to comment on and add to.

10) Message boards : Number crunching : Kudos to Bill Michael and others. (Message 7310)
Posted 23 Dec 2005 by Jack Schonbrun
Post:
Hi Bill, you are doing a fantastic job. I can't tell you how much we appreciate it.

I'd like to think Rosetta has learned its lesson. Unfortunately, it doesn't seem that we will be able to demonstrate that until after the holidays. David Baker doesn't even know all of this is going on, luckily for him. When he gets back you can be sure that he will see to it that the resources are allocated to put safeguards in place. And he probably won't ever again add a bunch of new jobs to the queue just before going out the door, saying "See you next year!" :)


11) Message boards : Number crunching : Kudos to Bill Michael and others. (Message 7295)
Posted 22 Dec 2005 by Jack Schonbrun
Post:
Yes, Jack, you beat me to it, but I thought Bill's (and others) efforts ought to be Front Page for all to see.


Good thinking.

12) Message boards : Number crunching : Kudos to Bill Michael and others. (Message 7290)
Posted 22 Dec 2005 by Jack Schonbrun
Post:
I second this! (In fact, I might've beat you to it: link.)

13) Message boards : Number crunching : How to have the best BOINC project. (Message 7287)
Posted 22 Dec 2005 by Jack Schonbrun
Post:
Thanks Bill and River~~ and Paul D. and dgnuff and other people I've forgotten, for answering people's questions during this trying time. I guess Distributed Assistance is a part of what makes DC possible.

14) Message boards : Number crunching : Why do bad wu's keep getting sent out. (Message 7285)
Posted 22 Dec 2005 by Jack Schonbrun
Post:
...and is anyone home at UofW to do anything about it anyway?!


One of us is home, but I'm unfortunately powerless to remove these jobs from the queue.

About all I can do for the time being is try to answer people's questions and let you know that we have found the source of the problem. Getting the fix out to you is proving more difficult, especially with everybody else gone for the holidays. It's as frustrating for me as it is for you.

15) Message boards : Rosetta@home Science : 4.81 allows rotating the protein... anybody try it? (Message 7277)
Posted 22 Dec 2005 by Jack Schonbrun
Post:
I'm glad to see you guys noticed the rotation of the native. It's sort of an undocumented easter egg until we fix the issues that you saw with the center of rotation. You also might notice that the little black outline around the chain doesn't stay in a consistent place and sometimes looks funny. This will be attended to once we are running stably again.

16) Message boards : Rosetta@home Science : Feedback, .. bandwidth usage :-( (Message 7274)
Posted 22 Dec 2005 by Jack Schonbrun
Post:

Leave the files there, either until they are no longer needed (newer versions) or that query type has finished.
You manage to do it with the client and other files (dunbrak.. bbdep...).


Currently these files are Work Unit specific. Even though they have the same name, they do not always have identical contents. There still may be a way for us to organize our downloads so that if they are identical, they are not redownloaded.


17) Message boards : Number crunching : Time to introduce the Quorum System (Message 7120)
Posted 22 Dec 2005 by Jack Schonbrun
Post:
We did have an extensive discussion of this on the code release and redundancy thread.

Protecting the integrity of credit is important for all DC projects, and I think some form of searching for users claiming invalid credit will be implemented not too long from now. But I can't give you a time frame. I'd like to be the project that figures out a way to validate credit without any redundancy.

18) Message boards : Number crunching : Please abort WUs with (Message 7111)
Posted 21 Dec 2005 by Jack Schonbrun
Post:
I would like to suggest either to your org or have your org suggest to BOINC devs that when a workunit has failed say 5 times to call a spade a spade and have the server flag and quarantine it.


Yes, there was a discussion of this earlier in this thread. BOINC does have this feature, but we set it to a default value of 10, instead of the 5 you suggest. Or 3 might be even better. Unfortunately, it seems that we cannot change these Work Units after the fact. In the future these numbers will be adjusted.
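The idea in this exchange can be sketched as a simple counter: once a work unit has failed a set number of times, the server stops sending it out. The Python below is a toy model with hypothetical names (`WorkUnit`, `max_errors`, `record_error`); it does not reproduce BOINC's actual server code or configuration.

```python
from dataclasses import dataclass

@dataclass
class WorkUnit:
    name: str
    max_errors: int = 3      # hypothetical threshold; the post mentions 10, 5 and 3
    error_count: int = 0
    quarantined: bool = False

    def record_error(self):
        """Count one failed result and quarantine the work unit at the threshold."""
        self.error_count += 1
        if self.error_count >= self.max_errors:
            self.quarantined = True
        return self.quarantined
```

In this toy model, a threshold of 10 rather than 3 means seven more failed results are recorded before a bad work unit is pulled.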
19) Message boards : Rosetta@home Science : "Unrecoverable Error" message (Message 7096)
Posted 21 Dec 2005 by Jack Schonbrun
Post:

Mmm, after looking around a bit on these forums I realized something I could try.
My general settings had "Leave applications in memory while preempted?" set to NO. I'll try setting this to YES and see if that stops the errors. I've got plenty of swap space anyway. I'll post later how this worked out.



Yes, please set it to YES.

There is another problem with a set of Work Units that we sent out. It's causing many of them to crash after about 30 seconds. Details are on this thread.

We are working on this problem, and hope that it will be fixed soon.

20) Message boards : Rosetta@home Science : Bad client (Message 7084)
Posted 21 Dec 2005 by Jack Schonbrun
Post:
It is a problem with some Work Units. This thread has more details.





©2024 University of Washington
https://www.bakerlab.org