BOINC service & P4's

Message boards : Number crunching : BOINC service & P4's

Lee Carre

Joined: 6 Oct 05
Posts: 96
Credit: 79,331
RAC: 0
Message 11994 - Posted: 14 Mar 2006, 1:22:36 UTC - in response to Message 11993.  

It will run for several days, then mysteriously stop.
the reason for it stopping/crashing should be recorded in the Windows event log

does anything else happen or change around (or just after) the time the service stops?

is there any software that might be affecting BOINC, such as the type that restricts access or activity? (some of it isn't that well written)

how do you monitor BOINC on your farm? i assume you use BoincView; what's the refresh interval set to?

does the service stop at a particular time of day?
maybe a virus scan or other scheduled task is causing problems. i know that certain virus-scanning software causes problems with CPDN, so something similar may be happening here
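
One rough way to check the time-of-day question is to note when each stop happened and count the stops per hour. The sketch below is only an illustration: the `stops_by_hour` helper is hypothetical, and the timestamps are made up, not taken from any real host. You would paste in the times you find in your own event log.

```python
from collections import Counter
from datetime import datetime


def stops_by_hour(timestamps):
    """Count how many service stops fall in each hour of the day (0-23)."""
    hours = (datetime.strptime(ts, "%Y-%m-%d %H:%M:%S").hour for ts in timestamps)
    return Counter(hours)


# Made-up stop times, for illustration only (not from any real host).
example_stops = [
    "2006-03-01 02:03:11",
    "2006-03-04 02:01:47",
    "2006-03-08 14:30:00",
    "2006-03-11 02:05:29",
]

counts = stops_by_hour(example_stops)
busiest_hour, hits = counts.most_common(1)[0]
print(f"most stops in hour {busiest_hour:02d}: {hits} of {len(example_stops)}")
```

If most stops land in the same hour, it is worth comparing that hour against the schedule of virus scans, backups, and other regular tasks.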

is it just the BOINC service that stops, or do other services/processes stop as well?

what are the hosts used for during the day? (or whenever they're being used by people and not just crunching)

is there anything you've tried to discover the cause/solve the issue already?
i assume you've done basic things like running a disk check and various stress tests, especially memtest
Want to search the BOINC Wiki, BOINCstats, or various BOINC forums from within firefox? Try the BOINC related Firefox Search Plugins
Aaron Finney

Joined: 8 Oct 05
Posts: 52
Credit: 109,589
RAC: 0
Message 12015 - Posted: 14 Mar 2006, 19:34:02 UTC - in response to Message 12002.  

You may want to check that Windows is granting BOINC the permissions it needs to communicate on these systems.

Also, make sure that the HT settings are properly set and enabled in both hardware and software.

If they are running on an identical Windows image, then it would be fairly easy to write this off as a problem with the image. How was it created? What settings did you use?
BennyRop

Joined: 17 Dec 05
Posts: 555
Credit: 140,800
RAC: 0
Message 12017 - Posted: 14 Mar 2006, 20:48:58 UTC

The "windows image" line gives me the impression that it could be a corrupted device driver issue, or a device driver that wasn't properly removed before the new one was installed over it.

Try downloading all the device drivers for the P4 systems and install them on the P4 system. Then go into safe mode and delete ALL the hardware drivers from System, always telling it not to reboot, and reboot when you're done. Let it load the generic Windows drivers until it boots up, then reboot and load the latest hardware-specific drivers.

(I remember horror stories about having to actually track down all the device driver files and .inf files for some video cards, and possibly a sound card, so you might need to research the proper removal technique for your drivers.)

Or set up a P4 system from scratch, reloading all the software you're using on the systems. If it doesn't fail as usual within several days, create a new P4-specific image for the rest of the P4 pharm.


Lee Carre

Joined: 6 Oct 05
Posts: 96
Credit: 79,331
RAC: 0
Message 12030 - Posted: 15 Mar 2006, 2:20:55 UTC
Last modified: 15 Mar 2006, 2:21:22 UTC

as a quick fix, you could set the service properties to restart the service when it fails. i assume you know where to look; if not, let me know and i'll write a step-by-step for you

but obviously this is just a temporary fix, and you should solve the actual cause of the problem
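
For anyone who would rather script the restart-on-failure setting than click through the service properties on every host, here is a minimal sketch using the Windows `sc failure` command. Several things are assumptions on my part: that the service is registered under the name `BOINC`, and the delay and reset values, which are arbitrary examples. The helper function names are hypothetical too. The script only prints the command; `apply_restart_on_failure` would actually run it, and that needs Windows and administrator rights.

```python
import subprocess


def build_restart_on_failure_command(service_name="BOINC", delay_ms=60000, reset_seconds=86400):
    """Build an `sc failure` command that restarts the service after each failure.

    The service name "BOINC" and the default timings are assumptions.
    """
    # Three restart actions: first, second and subsequent failures.
    actions = "/".join(f"restart/{delay_ms}" for _ in range(3))
    # sc.exe wants a space between "reset=" / "actions=" and the value,
    # so each value is passed as its own argument.
    return ["sc", "failure", service_name, "reset=", str(reset_seconds), "actions=", actions]


def apply_restart_on_failure(service_name="BOINC"):
    """Run the command (Windows only, requires administrator rights)."""
    subprocess.run(build_restart_on_failure_command(service_name), check=True)


# Print the command rather than running it, so it can be checked first.
print(" ".join(build_restart_on_failure_command()))
```

Check the service name on one machine first (the Services console shows it in the service's properties) before running this across a farm.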
simpe73

Joined: 20 Feb 06
Posts: 4
Credit: 438,570
RAC: 0
Message 12728 - Posted: 27 Mar 2006, 19:39:30 UTC - in response to Message 12030.  

as a quick fix, you could set the service properties to restart the service when it fails. i assume you know where to look; if not, let me know and i'll write a step-by-step for you

but obviously this is just a temporary fix, and you should solve the actual cause of the problem


I also have problems running BOINC as a service _WITH_ROSETTA_. I have a farm of 40 P4 machines with WinXP, and about 10 of them do not run BOINC when no one is logged on. Those machines should be identical to the ones that run normally. I started crunching Rosetta about 4 weeks ago. With my previous project there were no crunching problems. All these problems with Rosetta (including this one) cause a significant loss to my credit. It used to be about 5000; now it is hardly 2000. If this happens to all users, it means that fixing all the problems should double crunching capacity.
Los Alcoholicos~Megaflix

Joined: 10 Nov 05
Posts: 24
Credit: 77,199
RAC: 0
Message 12732 - Posted: 27 Mar 2006, 23:17:06 UTC

I also have errors in workunits when running Rosetta in BOINC as a service (see the Miscellaneous Work Unit Errors thread here: https://boinc.bakerlab.org/rosetta/forum_thread.php?id=1106).

It's a very specific error, always the same one. The BOINC service stops and the unit it's working on generates an error.
Lee Carre

Joined: 6 Oct 05
Posts: 96
Credit: 79,331
RAC: 0
Message 12800 - Posted: 29 Mar 2006, 22:27:18 UTC

for both of you, try running the service under the system account

instructions can be found in the "Enabling The Graphics Capability of the BOINC Client Software for a Service Type Installation" guide
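
If you want to script the account change instead of following the guide by hand, the following is a rough sketch using `sc config`. It is not taken from the guide. It assumes the service is registered as `BOINC`, and the helper name is hypothetical. It only builds and prints the commands; run them from an administrator prompt on Windows once you have checked them.

```python
def build_run_as_system_commands(service_name="BOINC"):
    """Return the sc.exe commands to switch a service to the LocalSystem account.

    The service name "BOINC" is an assumption; check it on your own hosts.
    """
    return [
        # Stop the service before changing its logon account.
        ["sc", "stop", service_name],
        # "obj=" sets the logon account; sc.exe wants the value as a separate token.
        ["sc", "config", service_name, "obj=", "LocalSystem"],
        # Start it again so the new account takes effect.
        ["sc", "start", service_name],
    ]


for command in build_run_as_system_commands():
    print(" ".join(command))
```

The account only changes the next time the service starts, which is why the sketch stops and starts it around the `sc config` call.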



