Setting up BOINC on master image and replicate to many computers?

Questions and Answers : Windows : Setting up BOINC on master image and replicate to many computers?


UW Health Sciences Libraries

Joined: 12 Dec 08
Posts: 6
Credit: 19,943,362
RAC: 0
Message 57898 - Posted: 15 Dec 2008, 20:17:33 UTC

Hello. I want to run the Rosetta@home project on my library's public XP workstations. I would like to include the setup in my master image and then replicate it (via re-imaging) to my 60 other XP workstations. What is the best way to do this? I will be using one account/password for all the workstations. Thanks in advance for your advice.

-Adam
Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 57938 - Posted: 16 Dec 2008, 14:28:01 UTC

Adam, that sounds exactly like what Ethan Owens did over in housing and food services. Have you talked to him?
Rosetta Moderator: Mod.Sense
UW Health Sciences Libraries

Joined: 12 Dec 08
Posts: 6
Credit: 19,943,362
RAC: 0
Message 58152 - Posted: 24 Dec 2008, 2:53:32 UTC - in response to Message 57938.  

I haven't talked to Ethan, but once I get back to work from my "snow" vacation, I will give him a shout.

I did set up a few test workstations. Two of them seem to be gaining credit, but the third seems to be gaining slowly (if at all). I am guessing that it might not have enough disk space allocated from my Deep Freeze "thawspace" (it was my first attempt). The other two have 5 GB available each.

Is the BOINC_data folder the only one that needs to be writable? Thanks.

Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 58162 - Posted: 24 Dec 2008, 20:08:54 UTC

Yes. The data folder holds all the applications (in the projects directory), the task data (in the slots directory), and the BOINC client application and tracking files, which sit in the root of the folder designated for BOINC data.
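
For anyone who wants to verify this on a locked-down workstation, here is a minimal Python sketch that checks free space and write access for a data folder and its projects and slots subfolders. The function name and the 5 GB default are illustrative assumptions (5 GB is simply what the working test machines in this thread had), not a BOINC requirement.

```python
import os
import shutil
import tempfile


def check_boinc_data_dir(data_dir, min_free_gb=5.0):
    """Report free space and write access for a BOINC data folder.

    min_free_gb is an illustrative threshold, not an official BOINC value.
    """
    free_gb = shutil.disk_usage(data_dir).free / 1024**3
    report = {"free_gb": free_gb, "enough_space": free_gb >= min_free_gb}

    writable = {}
    # "" is the folder root; projects and slots are the subfolders named above.
    for sub in ("", "projects", "slots"):
        label = sub or "."
        path = os.path.join(data_dir, sub)
        if not os.path.isdir(path):
            writable[label] = None  # subfolder has not been created yet
            continue
        try:
            # Creating and removing a temporary file is a direct test of
            # write access; the file is deleted when the block exits.
            with tempfile.NamedTemporaryFile(dir=path):
                pass
            writable[label] = True
        except OSError:
            writable[label] = False
    report["writable"] = writable
    return report
```

Call it with the data folder chosen at install time, for example `check_boinc_data_dir(r"T:\BOINC_data")`, where that path is hypothetical. A `False` entry for any folder suggests that location is still being frozen or restored.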
Rosetta Moderator: Mod.Sense
UW Health Sciences Libraries

Joined: 12 Dec 08
Posts: 6
Credit: 19,943,362
RAC: 0
Message 58229 - Posted: 29 Dec 2008, 17:23:17 UTC - in response to Message 58162.  

Ok, it looks like I am "protein folding" now... :-)

I had built BOINC into my master image, already attached to the Rosetta project. When I deployed it to the individual stations, I got an error -- something about the boinc user not being in the correct group. I had also included the BOINC install executable in my image, so I was able to re-install with the "repair" option and then attach to the Rosetta project. That seemed to fix the problem.

My assistant was fairly fast last Friday and was able to get around 30 machines onto the new image. I have been watching the statistics and noticed that the machines with the lowest credits have some log entries with "client error/compute error" and "validate error". Do you know what is going on with these systems?

Thanks. -Adam

mikey

Joined: 5 Jan 06
Posts: 1894
Credit: 8,782,935
RAC: 1,880
Message 58234 - Posted: 29 Dec 2008, 22:55:27 UTC - in response to Message 58229.  

Are you actually transferring workunits, or just the directory? If the workunits are going along too, that could be your problem. Workunits are given to one PC, and if they are not returned by THAT PC it can cause errors. This is a BOINC thing, not a project thing. What you might be able to do is set up BOINC on the image in a "get no new work" state and then change that after imaging each individual machine. That way no existing workunits will be recrunched or loaded onto several machines.
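
The "no new work" switch can also be flipped from a script with the boinccmd command-line tool, using its `--project URL nomorework` and `--project URL allowmorework` operations. Below is a minimal Python sketch that only builds the command lines. The project URL shown is an assumption (use whatever URL your client lists for Rosetta), and the executable name is a parameter because older clients shipped the tool under a different name.

```python
import subprocess

# Assumed project URL; copy the exact one shown by your own client.
PROJECT_URL = "https://boinc.bakerlab.org/rosetta/"


def no_new_work_cmd(project_url, boinccmd="boinccmd"):
    """Command to run on the master before capturing the image."""
    return [boinccmd, "--project", project_url, "nomorework"]


def allow_new_work_cmd(project_url, boinccmd="boinccmd"):
    """Command to run on each workstation after it has been imaged."""
    return [boinccmd, "--project", project_url, "allowmorework"]


def run(cmd):
    """Run one of the commands above and raise if boinccmd reports failure."""
    subprocess.run(cmd, check=True)
```

On the master you would call `run(no_new_work_cmd(PROJECT_URL))`, wait until the remaining tasks have finished and been reported, and then capture the image. After imaging, `run(allow_new_work_cmd(PROJECT_URL))` on each machine lets it fetch its own work. boinccmd may need to be started from the BOINC data folder, or given the GUI RPC password, before the client accepts the request.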
Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 58236 - Posted: 29 Dec 2008, 22:59:39 UTC

Wow, that's a lot of hosts! Great job.

Yes, an already installed and attached BOINC client is going to have trouble when it is replicated. BOINC has trouble determining whether this is really a new client or the old one. It keeps track of a sequence number for each request, so the copies end up out of sync with the last number stored on the server. That's why attaching after deploying is working better.
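
To make the sequence-number problem concrete, here is a toy Python model. It is not BOINC's server code: the class names are invented, and the rule that a stale number causes the server to hand out a fresh host ID is an assumption made for illustration.

```python
class ToyServer:
    """Remembers the last request number seen from each host ID."""

    def __init__(self):
        self._last_seqno = {}
        self._next_id = 1

    def new_host(self):
        host_id = self._next_id
        self._next_id += 1
        self._last_seqno[host_id] = 0
        return host_id

    def request(self, host_id, seqno):
        """Return the host ID the server will use for this client from now on."""
        if seqno < self._last_seqno[host_id]:
            # Stale number: the client state was probably copied from
            # another machine, so treat the sender as a different host.
            new_id = self.new_host()
            self._last_seqno[new_id] = seqno
            return new_id
        self._last_seqno[host_id] = seqno
        return host_id


class ToyClient:
    def __init__(self, host_id, seqno=0):
        self.host_id = host_id
        self.seqno = seqno

    def clone(self):
        """Imaging copies the host ID and the request counter together."""
        return ToyClient(self.host_id, self.seqno)

    def contact(self, server):
        self.seqno += 1
        self.host_id = server.request(self.host_id, self.seqno)
        return self.host_id


def demo():
    server = ToyServer()
    master = ToyClient(server.new_host())
    for _ in range(5):
        master.contact(server)   # master image ends at request number 5
    a, b = master.clone(), master.clone()
    for _ in range(3):
        a.contact(server)        # clone A sends 6, 7, 8 and is accepted
    b_id = b.contact(server)     # clone B sends 6, which is now stale
    return a.host_id, b_id
```

In this model `demo()` returns `(1, 2)`: clone A keeps the original host ID while clone B is pushed onto a new one, and any tasks that were baked into the image still belong to the original host. Attaching after deployment avoids the situation because every machine starts with its own counter.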

Check which application version the failed tasks were run under, then look on the Number Crunching board for reports about that version.
Rosetta Moderator: Mod.Sense
UW Health Sciences Libraries

Joined: 12 Dec 08
Posts: 6
Credit: 19,943,362
RAC: 0
Message 58251 - Posted: 30 Dec 2008, 15:41:02 UTC - in response to Message 58236.  

Thanks. I still have a few more hosts to add too!

So for the failed requests on particular hosts, should I reset the project so it clears the old/current tasks and downloads new tasks to get everything in order? Thanks.

-Adam
Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 58254 - Posted: 30 Dec 2008, 16:18:36 UTC

Not knowing what you are looking at, I really can't say. But, in general, a couple of computation or validation errors are no cause to take any action. They report back, you get more work, and things move on.
Rosetta Moderator: Mod.Sense
UW Health Sciences Libraries

Joined: 12 Dec 08
Posts: 6
Credit: 19,943,362
RAC: 0
Message 58258 - Posted: 30 Dec 2008, 16:54:40 UTC - in response to Message 58254.  

Ok. I was just making sure I did not have a corrupted setup or anything like that. But if you are curious, the hosts "jughead" and "aloe" have some examples.

I think the Intel Quad Core processors on my workstations really like doing this kind of number crunching. My stats seem to be going up exponentially. :-)

-Adam

Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 58261 - Posted: 30 Dec 2008, 17:35:38 UTC
Last modified: 30 Dec 2008, 17:37:08 UTC

I can only view your hosts by host ID number. Only you can see the names.

If they stopped reporting work, or had a dramatically different RAC than expected over a period of a few days, then it bears more investigation. Otherwise, there is probably nothing to worry about.

RAC is a rolling average of granted credit. I think it's a 14-day period. So, you won't see RAC level off until 14 days after you stop adding hosts.
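
As a rough illustration of how a rolling average like this behaves, the Python sketch below models RAC as an exponentially decaying average. The one-week half-life and the starting value of zero are assumptions made for the sketch, not something confirmed in this thread, and the real client and server code handle more edge cases.

```python
import math

HALF_LIFE_DAYS = 7.0  # assumed half-life; adjust if your project differs


def decay_update(rac, credit, days_elapsed, half_life_days=HALF_LIFE_DAYS):
    """Fade the old average and fold in credit granted since the last update."""
    weight = math.exp(-days_elapsed * math.log(2) / half_life_days)
    # The old average keeps `weight` of its value; the remainder comes
    # from the recent credit rate expressed in credit per day.
    return rac * weight + (1.0 - weight) * (credit / days_elapsed)


def simulate(daily_credit, days):
    """RAC after each day for a host earning a constant daily credit."""
    rac = 0.0
    history = []
    for _ in range(days):
        rac = decay_update(rac, daily_credit, 1.0)
        history.append(rac)
    return history
```

Under these assumptions a host earning a steady 1000 credits per day shows a RAC of about 500 after 7 days and about 750 after 14 days, so the curve keeps creeping toward the true daily rate instead of flattening on a fixed date. `simulate(1000, 28)` gives the full day-by-day series.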
Rosetta Moderator: Mod.Sense
UW Health Sciences Libraries

Joined: 12 Dec 08
Posts: 6
Credit: 19,943,362
RAC: 0
Message 58262 - Posted: 30 Dec 2008, 18:53:18 UTC - in response to Message 58261.  

Ok, I will keep an eye on my hosts. As far as I can tell, all of them are actually doing useful work. Thanks for your help.

-Adam




©2024 University of Washington
https://www.bakerlab.org