Message boards : Number crunching : BOINC service & P4's
Author | Message |
---|---|
Lee Carre Joined: 6 Oct 05 Posts: 96 Credit: 79,331 RAC: 0 |
"It will run for several days, then mysteriously stop." The reason for it stopping or crashing should be available in the Windows event log. Does anything else happen or change around or after the time the service stops? Is there any software that might be affecting BOINC, such as the type that restricts access or activity (some of it isn't that well written)? How do you monitor BOINC on your farm? I assume you use BoincView; what's the refresh interval set to? Does the service stop at a particular time of day? Maybe a virus scan or other regular task is causing problems. I know that certain virus scanning software causes problems with CPDN, so maybe something similar is happening here. Is it just the BOINC service that stops, or do other services/processes stop as well? What are the hosts used for during the day (or whenever they're being used by people and not just crunching)? Is there anything you've already tried to discover the cause or solve the issue? I assume you've done basic things like running a disk check and various stress tests, especially memtest. Want to search the BOINC Wiki, BOINCstats, or various BOINC forums from within Firefox? Try the BOINC related Firefox Search Plugins |
Aaron Finney Joined: 8 Oct 05 Posts: 52 Credit: 109,589 RAC: 0 |
You may want to check that Windows is granting permissions properly for BOINC to communicate on these systems. Also, make sure that the HT settings are properly set and enabled in both hardware and software. If they are running on an identical Windows image, then it would be fairly easy to write this off as a problem with the image. How was it created? What settings did you use? |
BennyRop Joined: 17 Dec 05 Posts: 555 Credit: 140,800 RAC: 0 |
The "windows image" line gives me the impression that it could be a corrupted device driver issue, or a device driver that wasn't properly removed before the new one was installed over it. Try downloading all the device drivers for the P4 systems; install them on the P4 system. Go into safe mode and delete ALL the hardware drivers from System, always telling it not to reboot, and then reboot when you're done. Load Windows generic drivers until it boots up; then reboot and load the latest hardware-specific drivers. (I remember horror stories about having to actually track down all the device driver files and .inf files for some video cards, and possibly a sound card, so you might need to research the proper removal technique for your drivers.) Or set up a P4 system from scratch, reloading all the software you're using on the systems, and then create a new P4-specific image for the rest of the P4 pharm if it doesn't fail as usual within several days. |
Lee Carre Joined: 6 Oct 05 Posts: 96 Credit: 79,331 RAC: 0 |
As a quick fix, you could set the service properties to restart the service when it fails. I assume you know where to look; if not, let me know and I'll write a step-by-step for you. But obviously this is just a temporary fix, and you should solve the actual cause of the problem. |
simpe73 Joined: 20 Feb 06 Posts: 4 Credit: 438,570 RAC: 0 |
"As a quick fix, you could set the service properties to restart the service when it fails. I assume you know where to look; if not, let me know and I'll write a step-by-step for you." I also have problems with running BOINC as a service _WITH_ROSETTA_. I have a farm of 40 P4 machines with WinXP, and about 10 of them do not run BOINC if no one is logged on. These machines should be identical to those that run normally. I started crunching Rosetta about 4 weeks ago. With the previous project there were no crunching problems. All these problems with Rosetta (including this one) cause a significant loss to my credit. It used to be about 5000; now it is hardly 2000. If this happens to all users, it means that fixing all the problems should double crunching capacity. |
Los Alcoholicos~Megaflix Joined: 10 Nov 05 Posts: 24 Credit: 77,199 RAC: 0 |
I also have errors in workunits when running Rosetta in BOINC as a service (see the Miscellaneous Work Unit Errors thread here: https://boinc.bakerlab.org/rosetta/forum_thread.php?id=1106). It's a very specific error, always the same one. The BOINC service stops and the unit it's working on generates an error. |
Lee Carre Joined: 6 Oct 05 Posts: 96 Credit: 79,331 RAC: 0 |
For both of you: try running the service under the system account. Instructions can be found in the guide "Enabling The Graphics Capability of the BOINC Client Software for a Service Type Installation". |
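
The first reply asks whether the service stops at a particular time of day. One rough way to check, once the stop times have been read out of the Windows event log, is to count them per hour. The sketch below assumes the timestamps were copied out by hand; the values in `STOP_TIMES` are made up for illustration and do not come from this thread.

```python
from collections import Counter
from datetime import datetime

# Hypothetical stop times, e.g. copied by hand from the Windows event log.
# These values are invented for illustration only.
STOP_TIMES = [
    "2006-03-01 02:03:11",
    "2006-03-04 02:01:47",
    "2006-03-09 14:30:02",
    "2006-03-12 02:05:30",
]


def stops_by_hour(timestamps):
    """Count how many service stops fall into each hour of the day (0-23)."""
    hours = (datetime.strptime(ts, "%Y-%m-%d %H:%M:%S").hour for ts in timestamps)
    return Counter(hours)


# List the hours with the most stops first.
for hour, n in stops_by_hour(STOP_TIMES).most_common():
    print(f"{hour:02d}:00  {n} stop(s)")
```

If most stops land in the same hour, a scheduled task such as a virus scan or backup running at that time would be a reasonable first suspect.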
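
The quick fix suggested above (restart the service when it fails) can also be set from the command line with `sc failure`, which is handy on a farm of many hosts. This is only a sketch: the service name `BOINC`, the 60-second delay, and the one-day reset period are assumptions, so check the real service name in the Services console first. The function only builds the command; it does not run it.

```python
def restart_on_failure_command(service_name, delay_ms=60000, reset_seconds=86400):
    """Build an `sc failure` command that restarts a service after it fails.

    sc.exe expects each value as a separate argument after `reset=` and
    `actions=`, which is why option and value are separate list items.
    """
    # Same action for the first, second, and subsequent failures.
    actions = "/".join(f"restart/{delay_ms}" for _ in range(3))
    return ["sc", "failure", service_name,
            "reset=", str(reset_seconds),
            "actions=", actions]


# "BOINC" as the service name is an assumption, not confirmed by the thread.
command = restart_on_failure_command("BOINC")
print(" ".join(command))

# To apply it, run the printed command from an administrator prompt on the
# Windows host, or pass `command` to subprocess.run(command, check=True).
```

As noted in the thread, this only hides the symptom; the event log still needs checking for the actual cause.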
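
Before switching the service to the system account, it may help to see which account each host currently uses. `sc qc <service>` prints the service configuration, including a `SERVICE_START_NAME` line. The sketch below parses that line from captured output. The sample text is illustrative only (the account `.\cruncher` is invented), and the field labels are assumed to match an English-language Windows install.

```python
# Illustrative `sc qc` output; the account shown here is made up.
SAMPLE_OUTPUT = """\
[SC] QueryServiceConfig SUCCESS

SERVICE_NAME: BOINC
        TYPE               : 10  WIN32_OWN_PROCESS
        START_TYPE         : 2   AUTO_START
        DISPLAY_NAME       : BOINC
        SERVICE_START_NAME : .\\cruncher
"""


def service_account(sc_qc_output):
    """Return the account from the SERVICE_START_NAME line, or None if absent."""
    for line in sc_qc_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "SERVICE_START_NAME":
            return value.strip()
    return None


def runs_as_system(sc_qc_output):
    """True if the service is configured to run under the LocalSystem account."""
    account = service_account(sc_qc_output)
    return account is not None and account.lower() == "localsystem"


print(service_account(SAMPLE_OUTPUT), runs_as_system(SAMPLE_OUTPUT))
```

Comparing this value between the hosts that crunch normally and the ones that stop when nobody is logged on could show whether the account is what differs.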
©2024 University of Washington
https://www.bakerlab.org