Client Errors

AlphaLaser

Joined: 19 Aug 06
Posts: 52
Credit: 3,327,939
RAC: 0
Message 72778 - Posted: 16 Apr 2012, 0:24:54 UTC - in response to Message 72775.  
Last modified: 16 Apr 2012, 0:26:17 UTC

Just got a batch of Ralph WUs and I can confirm that my host has been able to successfully complete Ralph while failing here at Rosetta. I set the runtimes to be 1 hr on both projects.


I installed Ralph@Home several days ago, but it won't give me any tasks. I've tried the PROJECT, UPDATE button many times. Any suggestions? ---KMF, Jr.


I also attached about a week ago but only yesterday did I receive any work. Ralph seems to only have work available very sporadically. I did set Ralph to a very high resource share so that BOINC would try to request work more often.
Sky King

Joined: 28 Feb 12
Posts: 11
Credit: 15,912
RAC: 0
Message 72795 - Posted: 16 Apr 2012, 19:54:47 UTC
Last modified: 16 Apr 2012, 19:59:55 UTC

Just an FYI... I bumped my NVIDIA drivers up to the 295.73 (or more completely, 8.17.12.9573) from the previous 285.62. Had BOINC grab a few Rosetta WUs, just in case this fixed something... BOINC updated me with Rosetta 3.26, but no joy, still the same "client error" result under the new drivers/client combination.
wbblakemore

Joined: 18 Dec 07
Posts: 33
Credit: 4,181
RAC: 0
Message 72797 - Posted: 16 Apr 2012, 23:17:50 UTC - in response to Message 72795.  

Just an FYI... I bumped my NVIDIA drivers up to the 295.73 (or more completely, 8.17.12.9573) from the previous 285.62. Had BOINC grab a few Rosetta WUs, just in case this fixed something... BOINC updated me with Rosetta 3.26, but no joy, still the same "client error" result under the new drivers/client combination.


The driver version under which I first experienced the problem is 296.10 (the latest production version for GeForce series 5 cards).
woland

Joined: 17 Dec 05
Posts: 5
Credit: 124,792
RAC: 0
Message 72799 - Posted: 17 Apr 2012, 5:25:28 UTC

I've just uninstalled the latest NVIDIA drivers and gone back to the Windows default one (8.17.12.6830), and the problem disappeared.
mikey

Joined: 5 Jan 06
Posts: 1894
Credit: 8,769,484
RAC: 5,676
Message 72802 - Posted: 17 Apr 2012, 10:46:37 UTC - in response to Message 72799.  

I've just uninstalled the latest NVIDIA drivers and gone back to the Windows default one (8.17.12.6830), and the problem disappeared.


But now you can't crunch with the GPU anymore. The Windows drivers are good for minimal things like changing screens, but they are not nearly capable enough to crunch with. If you look near the top of the Messages tab in BOINC, it should say 'no gpu found'. For Rosetta that makes no difference, but if you game or use the GPU to crunch for another project, it does.
mikey

Joined: 5 Jan 06
Posts: 1894
Credit: 8,769,484
RAC: 5,676
Message 72803 - Posted: 17 Apr 2012, 10:49:01 UTC - in response to Message 72795.  

Just an FYI... I bumped my NVIDIA drivers up to the 295.73 (or more completely, 8.17.12.9573) from the previous 285.62. Had BOINC grab a few Rosetta WUs, just in case this fixed something... BOINC updated me with Rosetta 3.26, but no joy, still the same "client error" result under the new drivers/client combination.


I believe the newest is 3.13 or something like that. And I believe it was the 296 versions that had a serious screen saver flaw that caused Boinc to crash whenever the screen saver came on.
wbblakemore

Joined: 18 Dec 07
Posts: 33
Credit: 4,181
RAC: 0
Message 72806 - Posted: 17 Apr 2012, 16:22:34 UTC - in response to Message 72803.  

Just an FYI... I bumped my NVIDIA drivers up to the 295.73 (or more completely, 8.17.12.9573) from the previous 285.62. Had BOINC grab a few Rosetta WUs, just in case this fixed something... BOINC updated me with Rosetta 3.26, but no joy, still the same "client error" result under the new drivers/client combination.


I believe the newest is 3.13 or something like that. And I believe it was the 296 versions that had a serious screen saver flaw that caused Boinc to crash whenever the screen saver came on.



Yes, there is a new version of the NVidia driver out, in the 300 series, but so far it's only for the new GTX 680 card. They're in the process of retrofitting it for other cards, but it's not released to production yet. Hopefully, it will be in the next couple of weeks.

Actually, the bug you're referring to isn't about the screen saver at all - the specifics are as follows: If you use a DVI connection to your monitor and the monitor enters power saving mode (due to inactivity), the CUDA drivers can't recover (at least in terms of crunching apps, as opposed to pure graphics processing). The simple work-around for this is to turn off the "sleep" function for power saving in Windows. As long as the monitor doesn't enter sleep mode, everything works fine.
mikey

Joined: 5 Jan 06
Posts: 1894
Credit: 8,769,484
RAC: 5,676
Message 72811 - Posted: 18 Apr 2012, 10:35:57 UTC - in response to Message 72806.  

Yes, there is a new version of the NVidia driver out, in the 300 series, but so far it's only for the new GTX 680 card. They're in the process of retrofitting it for other cards, but it's not released to production yet. Hopefully, it will be in the next couple of weeks.

Actually, the bug you're referring to isn't about the screen saver at all - the specifics are as follows: If you use a DVI connection to your monitor and the monitor enters power saving mode (due to inactivity), the CUDA drivers can't recover (at least in terms of crunching apps, as opposed to pure graphics processing). The simple work-around for this is to turn off the "sleep" function for power saving in Windows. As long as the monitor doesn't enter sleep mode, everything works fine.


Thank you for the clarification!
Greg_BE

Joined: 30 May 06
Posts: 5664
Credit: 5,711,666
RAC: 1,996
Message 72814 - Posted: 18 Apr 2012, 11:05:50 UTC - in response to Message 72811.  

Yes, there is a new version of the NVidia driver out, in the 300 series, but so far it's only for the new GTX 680 card. They're in the process of retrofitting it for other cards, but it's not released to production yet. Hopefully, it will be in the next couple of weeks.

Actually, the bug you're referring to isn't about the screen saver at all - the specifics are as follows: If you use a DVI connection to your monitor and the monitor enters power saving mode (due to inactivity), the CUDA drivers can't recover (at least in terms of crunching apps, as opposed to pure graphics processing). The simple work-around for this is to turn off the "sleep" function for power saving in Windows. As long as the monitor doesn't enter sleep mode, everything works fine.


Thank you for the clarification!



And it appears to be limited to CUDA, as my graphics driver is an Intel PCI driver and I have never had problems with tasks crashing due to the monitor going into sleep mode.
In Memory of Kimsey M Fowler Sr

Joined: 10 Mar 12
Posts: 26
Credit: 39,033,222
RAC: 0
Message 72878 - Posted: 25 Apr 2012, 6:04:27 UTC
Last modified: 25 Apr 2012, 6:08:14 UTC

Per request from Rocco, Wireshark was used to collect packets moving between my machine and the Rosetta server under both success and failure modes of operation. ExamDiff is being used to identify differences in the packet contents. I've only made a preliminary review of the data so far, but one obvious difference is that when the NVIDIA driver is installed (failure condition), some additional information related to CUDA is included in the outgoing information packets transferred to the server. Here is a snippet extracted from a packet:

    <os_name>Microsoft Windows 7</os_name>
    <os_version>Home Premium x64 Edition, Service Pack 1, (06.01.7601.00)</os_version>
</host_info>
    <disk_usage>
        <d_boinc_used_total>2148698524.000000</d_boinc_used_total>
        <d_boinc_used_project>2147898783.000000</d_boinc_used_project>
    </disk_usage>
    <coprocs>
<coproc_cuda>
   <count>2</count>
   <name>GeForce GTX 580</name>
   <req_secs>345601.728000</req_secs>
   <req_instances>2.000000</req_instances>
   <estimated_delay>0.000000</estimated_delay>
   <drvVersion>28562</drvVersion>
   <cudaVersion>4010</cudaVersion>
   <totalGlobalMem>3221225472</totalGlobalMem>
   <sharedMemPerBlock>49152</sharedMemPerBlock>
   <regsPerBlock>32768</regsPerBlock>
   <warpSize>32</warpSize>
   <memPitch>2147483647</memPitch>
   <maxThreadsPerBlock>1024</maxThreadsPerBlock>
   <maxThreadsDim>1024 1024 64</maxThreadsDim>
   <maxGridSize>65535 65535 65535</maxGridSize>
   <totalConstMem>65536</totalConstMem>
   <major>2</major>
   <minor>0</minor>
   <clockRate>1800000</clockRate>
   <textureAlignment>512</textureAlignment>
   <deviceOverlap>1</deviceOverlap>
   <multiProcessorCount>16</multiProcessorCount>
</coproc_cuda>
    </coprocs>
<result>
    <name>heterodimer_design_18_pose_C_abinitio_SAVE_ALL_OUT_46199_4020_0</name>
    <final_cpu_time>10643.500000</final_cpu_time>
    <final_elapsed_time>11430.226067</final_elapsed_time>
    <exit_status>0</exit_status>
    <state>5</state>
    <platform>windows_x86_64</platform>
    <version_num>326</version_num>
    <app_version_num>326</app_version_num>
<stderr_out>
<core_client_version>6.12.34</core_client_version>
<![CDATA[
<stderr_txt>
[2012- 4- 8 19:55:12:] :: BOINC:: Initializing ... ok.
[2012- 4- 8 19:55:12:] :: BOINC :: boinc_init()
BOINC:: Setting up shared resources ... ok.
BOINC:: Setting up semaphores ... ok.
BOINC:: Updating status ... ok.
BOINC:: Registering timer callback... ok.
BOINC:: Worker initialized successfully. 


Everything between <coprocs> and </coprocs> is added information. Notice that soon after that data comes the version number of Rosetta: <version_num>326</version_num> which is lost in the final WU's web page.

I wonder if the server software is robust enough to handle unexpected XML tags? For example, perhaps these tags get output only for the Win 7 64-bit version of the driver. ---KMF, Jr.
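
For anyone who wants to repeat the comparison without ExamDiff, here is a rough Python sketch of the same idea: list the XML element names that show up only in the failing capture. The two strings are trimmed, hypothetical stand-ins for the real captures, and the `<request>` wrapper is made up so that each fragment parses as well-formed XML. The helper names are my own and are not part of BOINC.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed stand-ins for the two captures. Each is wrapped in a
# made-up <request> root so that it parses as well-formed XML.
SUCCESS = """<request>
  <disk_usage><d_boinc_used_total>2148698524.000000</d_boinc_used_total></disk_usage>
  <result><exit_status>0</exit_status><version_num>326</version_num></result>
</request>"""

FAILURE = """<request>
  <disk_usage><d_boinc_used_total>2148698524.000000</d_boinc_used_total></disk_usage>
  <coprocs>
    <coproc_cuda>
      <count>2</count>
      <name>GeForce GTX 580</name>
      <drvVersion>28562</drvVersion>
      <cudaVersion>4010</cudaVersion>
    </coproc_cuda>
  </coprocs>
  <result><exit_status>0</exit_status><version_num>326</version_num></result>
</request>"""


def tag_set(xml_text):
    """Return the set of element names that occur anywhere in the document."""
    return {elem.tag for elem in ET.fromstring(xml_text).iter()}


def extra_tags(baseline, other):
    """Return, sorted, the element names present in `other` but not in `baseline`."""
    return sorted(tag_set(other) - tag_set(baseline))


def first_text(xml_text, tag):
    """Return the stripped text of the first <tag> element, or None if absent."""
    elem = ET.fromstring(xml_text).find(f".//{tag}")
    return elem.text.strip() if elem is not None and elem.text else None


print(extra_tags(SUCCESS, FAILURE))        # element names only the failing capture has
print(first_text(FAILURE, "version_num"))  # version number as sent by the client
```

This only shows what the client sends. It says nothing about how the server treats the extra elements, which is still the open question.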
mikey

Joined: 5 Jan 06
Posts: 1894
Credit: 8,769,484
RAC: 5,676
Message 72879 - Posted: 25 Apr 2012, 11:17:01 UTC - in response to Message 72878.  

Everything between <coprocs> and </coprocs> is added information. Notice that soon after that data comes the version number of Rosetta: <version_num>326</version_num> which is lost in the final WU's web page.

I wonder if the server software is robust enough to handle unexpected XML tags? For example, perhaps these tags get output only for the Win 7 64-bit version of the driver. ---KMF, Jr.


If that is the problem, one would think a simple call to Dr. Anderson and his team would solve it in a hurry! After all, it is they who directly support BOINC and the server side of the software; they support the client side as well. Now, each project DOES modify the server software somewhat (well, most do), some MUCH more than others. But if Rosetta is having these kinds of problems and HASN'T contacted the team yet, they need to right away! The worst that can happen is that Dr. A and team say no, they won't help, but if they do agree to help, the problem could be solved by today! ESPECIALLY if it is a simple coding error!

I once worked on a webpage for over an hour trying to make it work; it turned out I had left out a period, ONE PERIOD, and the page REFUSED to do ANYTHING!!! This was long before programs; it was back in the days of entering the code as text. I had a friend help and he found it within 5 minutes: new eyes, and the problem was fixed! Today's BOINC projects need to be able to handle BOTH CPU and GPU processing AND the PCs of those users who do both, not just for their own project but for others too!
Borg_XMZ

Joined: 7 Apr 12
Posts: 2
Credit: 163,771
RAC: 0
Message 72881 - Posted: 25 Apr 2012, 12:47:19 UTC

I have a client error too. May I post here?

Yesterday I got my new notebook. I installed BOINC and let it work overnight.
Just the standard installation, with no special work unit settings.

All work units ended with Client error.

HP notebook: i7-2860QM, 16 GB RAM, Windows 7 x64 Professional
In the task ID log I only see …OK.
Or is there another log file where I could check for errors?


Server state: Over
Outcome: Client error
Client state: New
Exit status: 0 (0x0)

<core_client_version>7.0.25</core_client_version>
<![CDATA[
<stderr_txt>
[2012- 4-25 2:34:41:] :: BOINC:: Initializing ... ok.
[2012- 4-25 2:34:41:] :: BOINC :: boinc_init()
BOINC:: Setting up shared resources ... ok.
BOINC:: Setting up semaphores ... ok.
BOINC:: Updating status ... ok.
BOINC:: Registering timer callback... ok.
BOINC:: Worker initialized successfully.
Registering options..
Registered extra options.
Initializing broker options ...
Registered extra options.
Initializing core...
Initializing options.... ok
Options::initialize()
Options::adding_options()
Options::initialize() Check specs.
Options::initialize() End reached
Loaded options.... ok
Processed options.... ok
Initializing random generators... ok
Initialization complete.
Initializing options.... ok
Options::initialize()
Options::adding_options()
Options::initialize() Check specs.
Options::initialize() End reached
Loaded options.... ok
Processed options.... ok
Initializing random generators... ok
Initialization complete.
Setting WU description ...
Unpacking zip data: ../../projects/boinc.bakerlab.org_rosetta/minirosetta_database_rev48292.zip
Unpacking WU data ...
Unpacking data: ../../projects/boinc.bakerlab.org_rosetta/input_CASP9_ff_benchmark_hybridization_run58_T0528_0_C1_yfsong.zip
Setting database description ...
Setting up checkpointing ...
Setting up graphics native ...
BOINC:: Worker startup.
Starting watchdog...
Watchdog active.
[2012- 4-25 3:54:31:] :: BOINC:: Initializing ... ok.
[2012- 4-25 3:54:31:] :: BOINC :: boinc_init()
BOINC:: Setting up shared resources ... ok.
BOINC:: Setting up semaphores ... ok.
BOINC:: Updating status ... ok.
BOINC:: Registering timer callback... ok.
BOINC:: Worker initialized successfully.
Registering options..
Registered extra options.
Initializing broker options ...
Registered extra options.
Initializing core...
Initializing options.... ok
Options::initialize()
Options::adding_options()
Options::initialize() Check specs.
Options::initialize() End reached
Loaded options.... ok
Processed options.... ok
Initializing random generators... ok
Initialization complete.
Initializing options.... ok
Options::initialize()
Options::adding_options()
Options::initialize() Check specs.
Options::initialize() End reached
Loaded options.... ok
Processed options.... ok
Initializing random generators... ok
Initialization complete.
Setting WU description ...
Unpacking zip data: ../../projects/boinc.bakerlab.org_rosetta/minirosetta_database_rev48292.zip
Unpacking WU data ...
Unpacking data: ../../projects/boinc.bakerlab.org_rosetta/input_CASP9_ff_benchmark_hybridization_run58_T0528_0_C1_yfsong.zip
Setting database description ...
Setting up checkpointing ...
Setting up graphics native ...
BOINC:: Worker startup.
Starting watchdog...
Watchdog active.
======================================================
DONE :: 2 starting structures 7213.6 cpu seconds
This process generated 2 decoys from 2 attempts
======================================================
BOINC :: WS_max 4.99876e+008

BOINC :: Watchdog shutting down...
BOINC :: BOINC support services shutting down cleanly ...
called boinc_finish

</stderr_txt>
]]>

Validate state: Invalid

Thanks in advance
In Memory of Kimsey M Fowler Sr

Joined: 10 Mar 12
Posts: 26
Credit: 39,033,222
RAC: 0
Message 72886 - Posted: 25 Apr 2012, 23:05:56 UTC - in response to Message 72881.  

I have a client error too. May I post here?


Hi: Yes, certainly, if it's the same problem. Without knowing your work unit number, we cannot look at the full results. On the very last line, where it says "application version", was the version number absent? If so, that is a good sign that you are having the same problem. Your computing platform seems to have the same characteristics as well (the i7 processor and Win 7 64-bit, anyway). What video card are you using?

Here's a seemingly silly question for any of you experiencing this problem: Does your machine have a CD/DVD drive installed? One sort of odd thing with my problem machine is that I do not have one built in. I use a USB DVD drive I plug in when necessary.

Thanks. ---KMF, Jr.
Sky King

Joined: 28 Feb 12
Posts: 11
Credit: 15,912
RAC: 0
Message 72890 - Posted: 26 Apr 2012, 1:32:26 UTC - in response to Message 72886.  

I have a client error too. May I post here?
Does your machine have a CD/DVD drive installed? One sort of odd thing with my problem machine is that I do not have one built in. I use a USB DVD drive I plug in when necessary.

I have a stock SATA DVD reader/burner in my rig.
A.M.

Joined: 13 Jun 06
Posts: 12
Credit: 954,586
RAC: 0
Message 72892 - Posted: 26 Apr 2012, 5:05:14 UTC - in response to Message 72886.  

Here's a seemingly silly question for any of you experiencing this problem: Does your machine have a CD/DVD drive installed?


Yes I do. It's a SATA3 "reads and writes everything" combo drive.
Borg_XMZ

Joined: 7 Apr 12
Posts: 2
Credit: 163,771
RAC: 0
Message 72893 - Posted: 26 Apr 2012, 6:34:07 UTC

Yes, there is no application version shown, just three dashes:
(application version: ---)
My video card is an NVIDIA 5010M and I have a built-in optical drive.

And here are some of the work unit IDs:

Task ID, Work unit ID
501092173, 456890145
501090187, 456888335
501089545, 456887802
501088693, 456887038
501084098, 456882913
501069583, 456869975
501063628, 454749580
501063549, 456865034
501060433, 456862352
501055615, 456859394
501054481, 456858393
501052588, 456856650
501051508, 454710911
501047597, 454736825
501027182, 456836619
501026609, 456836107
501025987, 456835540
501022620, 456832669
501021861, 456831989
501020591, 456830849

thanks
Fi and Charlie Shaw

Joined: 7 May 07
Posts: 8
Credit: 346,961
RAC: 0
Message 72894 - Posted: 26 Apr 2012, 11:21:11 UTC

I've had approx. 14 client errors out of my last 100 WUs.

My other projects are working fine.

I suspect these are dud WUs, especially as many others are experiencing similar errors.

I'll carry on crunching, as I'm sure that matters will eventually be sorted out.

Yes I am using BOINC 7.0.25

Swordfish
Rocco Moretti

Joined: 18 May 10
Posts: 66
Credit: 585,745
RAC: 0
Message 72899 - Posted: 26 Apr 2012, 18:26:28 UTC - in response to Message 72894.  

I've had approx. 14 client errors out of my last 100 WUs.
Yes I am using BOINC 7.0.25

Swordfish


You have a slightly different issue ("Incorrect function" with 7.0.25) than the one discussed in this thread (reported as client error, but with exit code 0), so I'll point you to the thread for it (though there aren't any updates yet).
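
To check which of the two cases a task falls into, something like the Python sketch below could be run over the text copied from a task page. It is only an illustration: the field labels ("Outcome", "Exit status") are taken from the task listing posted earlier in this thread, the function names are hypothetical, and the second sample uses a made-up nonzero exit status purely for contrast.

```python
import re


def parse_task_fields(text):
    """Collect 'Label: value' lines from pasted task-page text into a dict."""
    fields = {}
    for line in text.splitlines():
        match = re.match(r"^([A-Za-z ]+):\s*(.*)$", line.strip())
        if match:
            fields[match.group(1).strip().lower()] = match.group(2).strip()
    return fields


def matches_this_thread(text):
    """True when the task is reported as a client error but exited with status 0."""
    fields = parse_task_fields(text)
    outcome = fields.get("outcome", "").lower()
    status = re.match(r"-?\d+", fields.get("exit status", ""))
    return outcome == "client error" and status is not None and int(status.group()) == 0


# Fields as they appear in the task listing posted earlier in this thread.
SAMPLE = """Server state: Over
Outcome: Client error
Client state: New
Exit status: 0 (0x0)
Validate state: Invalid"""

# Same listing with a made-up nonzero exit status, for contrast only.
OTHER = SAMPLE.replace("Exit status: 0 (0x0)", "Exit status: 1 (0x1)")

print(matches_this_thread(SAMPLE))
print(matches_this_thread(OTHER))
```

A task that fails with a nonzero exit status does not match the pattern discussed here and more likely belongs in the other thread.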
AlphaLaser

Joined: 19 Aug 06
Posts: 52
Credit: 3,327,939
RAC: 0
Message 72900 - Posted: 26 Apr 2012, 18:31:16 UTC

Yes, I believe this issue has a 100% invalid rate; at least in my case, I have not managed to successfully validate a single WU.

My host does have a DVD reader/writer (internal to the laptop).
mikey

Joined: 5 Jan 06
Posts: 1894
Credit: 8,769,484
RAC: 5,676
Message 72901 - Posted: 27 Apr 2012, 10:33:39 UTC - in response to Message 72899.  

I've had approx. 14 client errors out of my last 100 WUs.
Yes I am using BOINC 7.0.25

Swordfish


You have a slightly different issue ("Incorrect function" with 7.0.25) than the one discussed in this thread (reported as client error, but with exit code 0), so I'll point you to the thread for it (though there aren't any updates yet).


You guys DO realize that version 7.0.25 is the RELEASE version now and has been for more than a week, right?! EVEN WCG handles it properly!


©2024 University of Washington
https://www.bakerlab.org