Author | Message |
A.M.
Joined: 13 Jun 06 Posts: 12 Credit: 954,586 RAC: 0
|
https://boinc.bakerlab.org/rosetta/results.php?hostid=1518514
Brand new computer, fresh install of Win7, etc. No problems with Einstein or Collatz, but every single one of the Rosetta WUs this computer has been allowed to finish has been flagged as invalid. Why?
|
|
Rocco Moretti
Joined: 18 May 10 Posts: 66 Credit: 585,745 RAC: 0
|
It looks like everything finished fine on your end (from the clean shutdowns in the stderr reports and the "Exit status: 0 (0x0)" line), but something went wrong in getting it back to the server. It's not necessarily a validation error, as you're getting "Outcome: Client error" rather than "Outcome: Validation error" or something like that. The curious thing is that the application version isn't being reported - this is something others have seen, so it doesn't look like you're the only one experiencing this particular issue.
We're looking into things to see if it's some intermittent error on our end, but in the meantime the recommendation would be to try detaching and reattaching the Rosetta@home project from your BOINC client, to see if that might fix things.
|
|
A.M.
Joined: 13 Jun 06 Posts: 12 Credit: 954,586 RAC: 0
|
I have done as you suggested, with the same outcome as before.
https://boinc.bakerlab.org/rosetta/result.php?resultid=485001771
|
|
Digital Savior
Joined: 16 Jul 06 Posts: 1 Credit: 191,042 RAC: 0
|
Yea, I had about 140 WUs that ended in client error. =/ Wish I noticed sooner...
|
|
Sky King
Joined: 28 Feb 12 Posts: 11 Credit: 15,912 RAC: 0
|
Is there any update to this? I too just recently switched my stock Win7 i7 over to Rosetta after years of being an F@H contributor, and without exception, every unit results in "client error."
If I'm just generating heat and not contributing to science, I need to switch to another project, so I am hoping to hear an update on this.
|
|
Rocco Moretti
Joined: 18 May 10 Posts: 66 Credit: 585,745 RAC: 0
|
Unfortunately, I don't have any progress to report. We're still not sure why these runs look like they're successfully completing, but then resulting in errors while reporting back to the server.
If you're up to it, something might reveal itself if people encountering these errors turned on the extra debugging information in the cc_config.xml file.
The relevant portion to focus on is the post-result reporting. Log flags like file_xfer_debug, http_debug, http_xfer_debug, network_status_debug, proxy_debug are ones to try. If anything looks off/wonky in clients who have these issues, then that's a lead we can follow up on in debugging. (That's not to say that the error will necessarily show itself on the client end - but it may be worth a shot.)
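As a rough sketch of what that could look like, the snippet below builds a cc_config.xml that sets the five log flags named above to 1. The `<cc_config>`/`<log_flags>` layout is the standard BOINC client configuration format; the helper name `build_cc_config` is made up for illustration. The output would need to be saved as cc_config.xml in the BOINC data directory, and the client restarted (or told to re-read its config file) before the flags take effect.

```python
# Log flags named in the post above.
LOG_FLAGS = [
    "file_xfer_debug",
    "http_debug",
    "http_xfer_debug",
    "network_status_debug",
    "proxy_debug",
]


def build_cc_config(flags):
    """Return cc_config.xml text with each given log flag set to 1.

    Hypothetical helper: it only formats text and does not touch
    the BOINC installation.
    """
    lines = ["<cc_config>", "  <log_flags>"]
    lines += [f"    <{name}>1</{name}>" for name in flags]
    lines += ["  </log_flags>", "</cc_config>"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the file contents; redirect to cc_config.xml to save them.
    print(build_cc_config(LOG_FLAGS), end="")
```

If a cc_config.xml already exists, merge the flags into its existing `<log_flags>` section rather than overwriting the file, and set them back to 0 afterwards, since network_status_debug in particular logs a line every second.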
Edit: Others on the forum have indicated that the issue may be GPU related. (Rosetta@home doesn't use GPUs, but that's not to say a GPU-related setting might not be involved.) You can also try (temporarily) turning off GPU crunching, to see if that fixes things - even if you turn GPUs back on for non-Rosetta projects, posting that the issue was fixed on your machine with a GPU setting change will help us in tracking down the issue.
|
|
Markus Elfring
Joined: 10 Jun 06 Posts: 17 Credit: 3,610,273 RAC: 0
|
|
|
AlphaLaser
Joined: 19 Aug 06 Posts: 52 Credit: 3,327,939 RAC: 0
|
I have been experiencing the same problem with this host. I was able to complete work in the past but now nothing is validating. The BOINC version is 6.10.58.
I will try out the log flags when I clear out work from other projects, maybe I can get to it by this weekend.
|
|
wbblakemore
Joined: 18 Dec 07 Posts: 33 Credit: 4,181 RAC: 0
|
I've reported this in other threads, so I apologize in advance if anyone is offended by my multiple posts ...
I'm having the same problem as others in this thread have reported. It doesn't look like a validation error, since most of my stderr messages are similar to this one:
BOINC:: Worker startup.
Starting watchdog...
Watchdog active.
======================================================
DONE :: 55 starting structures 10824.1 cpu seconds
This process generated 55 decoys from 55 attempts
======================================================
BOINC :: WS_max 3.04529e+008
BOINC :: Watchdog shutting down...
BOINC :: BOINC support services shutting down cleanly
|
|
finchna
Joined: 21 Oct 10 Posts: 2 Credit: 1,078,913 RAC: 0
|
i'm also experiencing this problem -- going back at least 100 tasks. 2010 PowerMac -- not new but not old and seti and milkyway are running fine.
|
|
Rocco Moretti
Joined: 18 May 10 Posts: 66 Credit: 585,745 RAC: 0
|
If you're referring to this computer, it looks to be a different issue (not the same as above).
From the stderr reports for the task, you're getting a "Process creation failed: errno=13". According to this post, errno=13 is a permissions error. Others have also experienced this with the update to 3.24. Best I can tell, this is an issue with BOINC 7.0, rather than with Rosetta@home itself. (BOINC 7 is currently development code, and not recommended for general use - at least not with Rosetta@home.)
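For anyone who wants to check what errno 13 means on their own machine, a quick lookup with Python's standard library is sketched below. The helper name `describe_errno` is made up for illustration; the message text comes from the operating system and may be worded differently from one platform to another.

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, OS message) for a numeric errno value."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


if __name__ == "__main__":
    name, message = describe_errno(13)
    # 13 maps to EACCES, the "permission denied" error.
    print(name, "-", message)
```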
|
|
finchna
Joined: 21 Oct 10 Posts: 2 Credit: 1,078,913 RAC: 0
|
That is the computer, and I'm sorry to hear that 7 has a problem. The 6.12.35 client flat out fails on that machine, and the BOINC folks suggested trying the 7 client, which now makes some things work but, unfortunately, not Rosetta@home.
|
|
Rocco Moretti
Joined: 18 May 10 Posts: 66 Credit: 585,745 RAC: 0
|
You can try manually changing the permissions on the file projects/boinc.bakerlab.org_rosetta/minirosetta_3.24_i686-apple-darwin (which looks to be the file with the permissions error in your case), to see if that helps. pvh in the other thread said that it worked for him.
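One way to do that from a script is sketched below. It assumes the problem is a missing execute bit on the application file, which the errno=13 message suggests but does not prove (wrong ownership would produce the same error and would need a different fix). The function name `add_execute_permission` is made up for illustration, and the path must be given relative to wherever the BOINC data directory lives on your machine.

```python
import os
import stat


def add_execute_permission(path):
    """Add the execute bit for user, group, and others.

    Existing permission bits are kept. Returns the new mode.
    """
    mode = stat.S_IMODE(os.stat(path).st_mode)
    new_mode = mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH
    os.chmod(path, new_mode)
    return new_mode


if __name__ == "__main__":
    # File name taken from the post above; run this from the BOINC
    # data directory (its location is an assumption left to the reader).
    target = "projects/boinc.bakerlab.org_rosetta/minirosetta_3.24_i686-apple-darwin"
    if os.path.exists(target):
        print(oct(add_execute_permission(target)))
    else:
        print("not found:", target)
```

This has the same effect as running `chmod a+x` on the file; you may need to run it as the user that owns the BOINC files.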
The other suggestion is to talk with the BOINC 7 folks, to see if you can troubleshoot why it gave this permission error in the first place.
Good Luck
|
|
AlphaLaser
Joined: 19 Aug 06 Posts: 52 Credit: 3,327,939 RAC: 0
|
Hi. I enabled the suggested log flags. I'm not sure which part is helpful, since I get a "[network_status_debug] status: online" or "status: don't need connection" message every second, but here is where I start to upload task id=491825324, which just gave "Client error":
16-Mar-2012 16:23:39 [rosetta@home] Computation for task T0575_boinc_rosetta_cm_abrelax_cmiles_SAVE_ALL_OUT_44670_137_0 finished
16-Mar-2012 16:23:39 [rosetta@home] Starting if3dimer_fold_and_dock_if3design12_SAVE_ALL_OUT_44597_8465_0
16-Mar-2012 16:23:51 [rosetta@home] Starting task if3dimer_fold_and_dock_if3design12_SAVE_ALL_OUT_44597_8465_0 using minirosetta version 324
16-Mar-2012 16:23:52 [---] [network_status_debug] woke up after 14.552832 seconds
16-Mar-2012 16:23:52 [---] [network_status_debug] status: don't need connection
16-Mar-2012 16:23:53 [---] [network_status_debug] status: don't need connection
16-Mar-2012 16:23:53 [rosetta@home] [fxd] starting upload, upload_offset -1
16-Mar-2012 16:23:53 [---] [http_debug] HTTP_OP::libcurl_exec(): ca-bundle 'C:\Program Files (x86)\BOINC\ca-bundle.crt'
16-Mar-2012 16:23:53 [---] [http_debug] HTTP_OP::libcurl_exec(): ca-bundle set
16-Mar-2012 16:23:53 [---] [proxy_debug] HTTP_OP::no_proxy_for_url(): http://srv4.bakerlab.org/rosetta_cgi/file_upload_handler
16-Mar-2012 16:23:53 [---] [proxy_debug] returning false
16-Mar-2012 16:23:53 [rosetta@home] Started upload of T0575_boinc_rosetta_cm_abrelax_cmiles_SAVE_ALL_OUT_44670_137_0_0
16-Mar-2012 16:23:53 [rosetta@home] [file_xfer_debug] URL: http://srv4.bakerlab.org/rosetta_cgi/file_upload_handler
16-Mar-2012 16:23:54 [---] [http_debug] [ID#201] Info: timeout on name lookup is not supported
16-Mar-2012 16:23:54 [---] [http_debug] [ID#201] Info: About to connect() to srv4.bakerlab.org port 80 (#0)
16-Mar-2012 16:23:54 [---] [http_debug] [ID#201] Info: Trying 128.95.160.145...
16-Mar-2012 16:23:54 [---] [network_status_debug] status: online
16-Mar-2012 16:23:54 [---] [http_debug] [ID#201] Info: Connected to srv4.bakerlab.org (128.95.160.145) port 80 (#0)
16-Mar-2012 16:23:54 [---] [http_debug] [ID#201] Sent header to server: POST /rosetta_cgi/file_upload_handler HTTP/1.1
16-Mar-2012 16:23:54 [---] [http_debug] [ID#201] Sent header to server: User-Agent: BOINC client (windows_x86_64 6.10.58)
16-Mar-2012 16:23:54 [---] [http_debug] [ID#201] Sent header to server: Host: srv4.bakerlab.org
16-Mar-2012 16:23:54 [---] [http_debug] [ID#201] Sent header to server: Accept: */*
16-Mar-2012 16:23:54 [---] [http_debug] [ID#201] Sent header to server: Accept-Encoding: deflate, gzip
16-Mar-2012 16:23:54 [---] [http_debug] [ID#201] Sent header to server: Content-Type: application/x-www-form-urlencoded
16-Mar-2012 16:23:54 [---] [http_debug] [ID#201] Sent header to server: Content-Length: 318
16-Mar-2012 16:23:54 [---] [http_debug] [ID#201] Sent header to server:
16-Mar-2012 16:23:54 [---] [http_debug] [ID#201] Received header from server: HTTP/1.1 200 OK
16-Mar-2012 16:23:54 [---] [http_debug] [ID#201] Received header from server: Date: Fri, 16 Mar 2012 20:24:08 GMT
16-Mar-2012 16:23:54 [---] [http_debug] [ID#201] Received header from server: Server: Apache/2.2.3 (Red Hat)
16-Mar-2012 16:23:54 [---] [http_debug] [ID#201] Received header from server: Connection: close
16-Mar-2012 16:23:54 [---] [http_debug] [ID#201] Received header from server: Transfer-Encoding: chunked
16-Mar-2012 16:23:54 [---] [http_debug] [ID#201] Received header from server: Content-Type: text/plain; charset=UTF-8
16-Mar-2012 16:23:54 [---] [http_debug] [ID#201] Received header from server:
16-Mar-2012 16:23:54 [---] [http_xfer_debug] [ID#201] HTTP: wrote 93 bytes
16-Mar-2012 16:23:54 [---] [http_debug] [ID#201] Info: Closing connection #0
16-Mar-2012 16:23:55 [---] [network_status_debug] status: online
16-Mar-2012 16:23:55 [rosetta@home] [file_xfer_debug] FILE_XFER_SET::poll(): http op done; retval 0
16-Mar-2012 16:23:55 [rosetta@home] [file_xfer_debug] parsing upload response: <data_server_reply>
<status>0</status>
<file_size>0</file_size>
</data_server_reply>
16-Mar-2012 16:23:55 [rosetta@home] [file_xfer_debug] parsing status: 0
16-Mar-2012 16:23:55 [rosetta@home] [fxd] starting upload, upload_offset 0
16-Mar-2012 16:23:55 [---] [http_debug] HTTP_OP::libcurl_exec(): ca-bundle set
16-Mar-2012 16:23:55 [---] [proxy_debug] HTTP_OP::no_proxy_for_url(): http://srv4.bakerlab.org/rosetta_cgi/file_upload_handler
16-Mar-2012 16:23:55 [---] [proxy_debug] returning false
16-Mar-2012 16:23:56 [---] [http_debug] [ID#201] Info: timeout on name lookup is not supported
16-Mar-2012 16:23:56 [---] [http_debug] [ID#201] Info: About to connect() to srv4.bakerlab.org port 80 (#0)
16-Mar-2012 16:23:56 [---] [http_debug] [ID#201] Info: Trying 128.95.160.145...
16-Mar-2012 16:23:56 [---] [network_status_debug] status: online
16-Mar-2012 16:23:56 [---] [http_debug] [ID#201] Info: Connected to srv4.bakerlab.org (128.95.160.145) port 80 (#0)
16-Mar-2012 16:23:56 [---] [http_debug] [ID#201] Sent header to server: POST /rosetta_cgi/file_upload_handler HTTP/1.1
16-Mar-2012 16:23:56 [---] [http_debug] [ID#201] Sent header to server: User-Agent: BOINC client (windows_x86_64 6.10.58)
16-Mar-2012 16:23:56 [---] [http_debug] [ID#201] Sent header to server: Host: srv4.bakerlab.org
16-Mar-2012 16:23:56 [---] [http_debug] [ID#201] Sent header to server: Accept: */*
16-Mar-2012 16:23:56 [---] [http_debug] [ID#201] Sent header to server: Accept-Encoding: deflate, gzip
16-Mar-2012 16:23:56 [---] [http_debug] [ID#201] Sent header to server: Content-Type: application/x-www-form-urlencoded
16-Mar-2012 16:23:56 [---] [http_debug] [ID#201] Sent header to server: Content-Length: 8955
16-Mar-2012 16:23:56 [---] [http_debug] [ID#201] Sent header to server: Expect: 100-continue
16-Mar-2012 16:23:56 [---] [http_debug] [ID#201] Sent header to server:
16-Mar-2012 16:23:56 [---] [http_debug] [ID#201] Received header from server: HTTP/1.1 100 Continue
16-Mar-2012 16:23:56 [---] [http_debug] [ID#201] Received header from server: HTTP/1.1 200 OK
16-Mar-2012 16:23:56 [---] [http_debug] [ID#201] Received header from server: Date: Fri, 16 Mar 2012 20:24:10 GMT
16-Mar-2012 16:23:56 [---] [http_debug] [ID#201] Received header from server: Server: Apache/2.2.3 (Red Hat)
16-Mar-2012 16:23:56 [---] [http_debug] [ID#201] Received header from server: Connection: close
16-Mar-2012 16:23:56 [---] [http_debug] [ID#201] Received header from server: Transfer-Encoding: chunked
16-Mar-2012 16:23:56 [---] [http_debug] [ID#201] Received header from server: Content-Type: text/plain; charset=UTF-8
16-Mar-2012 16:23:56 [---] [http_debug] [ID#201] Received header from server:
16-Mar-2012 16:23:56 [---] [http_xfer_debug] [ID#201] HTTP: wrote 64 bytes
16-Mar-2012 16:23:56 [---] [http_debug] [ID#201] Info: Closing connection #0
16-Mar-2012 16:23:57 [---] [network_status_debug] status: online
16-Mar-2012 16:23:57 [rosetta@home] [file_xfer_debug] FILE_XFER_SET::poll(): http op done; retval 0
16-Mar-2012 16:23:57 [rosetta@home] [file_xfer_debug] parsing upload response: <data_server_reply>
<status>0</status>
</data_server_reply>
16-Mar-2012 16:23:57 [rosetta@home] [file_xfer_debug] parsing status: 0
16-Mar-2012 16:23:57 [rosetta@home] [file_xfer_debug] file transfer status 0
16-Mar-2012 16:23:57 [rosetta@home] Finished upload of T0575_boinc_rosetta_cm_abrelax_cmiles_SAVE_ALL_OUT_44670_137_0_0
16-Mar-2012 16:23:57 [rosetta@home] [file_xfer_debug] Throughput 6782 bytes/sec
Here is the second example for task id=491825190.
16-Mar-2012 16:25:20 [---] [network_status_debug] status: don't need connection
16-Mar-2012 16:25:21 [---] [network_status_debug] status: don't need connection
16-Mar-2012 16:25:22 [---] [network_status_debug] status: don't need connection
16-Mar-2012 16:25:25 [rosetta@home] Computation for task if3dimer_design9monomer_abinitio_SAVE_ALL_OUT_44611_2681_0 finished
16-Mar-2012 16:25:25 [rosetta@home] Starting T0600_boinc_rosetta_cm_abrelax_cmiles_SAVE_ALL_OUT_44692_155_0
16-Mar-2012 16:25:28 [rosetta@home] Starting task T0600_boinc_rosetta_cm_abrelax_cmiles_SAVE_ALL_OUT_44692_155_0 using minirosetta version 324
16-Mar-2012 16:25:28 [---] [network_status_debug] status: don't need connection
16-Mar-2012 16:25:29 [---] [network_status_debug] status: don't need connection
16-Mar-2012 16:25:29 [rosetta@home] [fxd] starting upload, upload_offset -1
16-Mar-2012 16:25:29 [---] [http_debug] HTTP_OP::libcurl_exec(): ca-bundle 'C:\Program Files (x86)\BOINC\ca-bundle.crt'
16-Mar-2012 16:25:29 [---] [http_debug] HTTP_OP::libcurl_exec(): ca-bundle set
16-Mar-2012 16:25:29 [---] [proxy_debug] HTTP_OP::no_proxy_for_url(): http://srv3.bakerlab.org/rosetta_cgi/file_upload_handler
16-Mar-2012 16:25:29 [---] [proxy_debug] returning false
16-Mar-2012 16:25:29 [rosetta@home] Started upload of if3dimer_design9monomer_abinitio_SAVE_ALL_OUT_44611_2681_0_0
16-Mar-2012 16:25:29 [rosetta@home] [file_xfer_debug] URL: http://srv3.bakerlab.org/rosetta_cgi/file_upload_handler
16-Mar-2012 16:25:30 [---] [http_debug] [ID#202] Info: timeout on name lookup is not supported
16-Mar-2012 16:25:30 [---] [http_debug] [ID#202] Info: About to connect() to srv3.bakerlab.org port 80 (#0)
16-Mar-2012 16:25:30 [---] [http_debug] [ID#202] Info: Trying 128.95.160.144...
16-Mar-2012 16:25:30 [---] [network_status_debug] status: online
16-Mar-2012 16:25:30 [---] [http_debug] [ID#202] Info: Connected to srv3.bakerlab.org (128.95.160.144) port 80 (#0)
16-Mar-2012 16:25:30 [---] [http_debug] [ID#202] Sent header to server: POST /rosetta_cgi/file_upload_handler HTTP/1.1
16-Mar-2012 16:25:30 [---] [http_debug] [ID#202] Sent header to server: User-Agent: BOINC client (windows_x86_64 6.10.58)
16-Mar-2012 16:25:30 [---] [http_debug] [ID#202] Sent header to server: Host: srv3.bakerlab.org
16-Mar-2012 16:25:30 [---] [http_debug] [ID#202] Sent header to server: Accept: */*
16-Mar-2012 16:25:30 [---] [http_debug] [ID#202] Sent header to server: Accept-Encoding: deflate, gzip
16-Mar-2012 16:25:30 [---] [http_debug] [ID#202] Sent header to server: Content-Type: application/x-www-form-urlencoded
16-Mar-2012 16:25:30 [---] [http_debug] [ID#202] Sent header to server: Content-Length: 314
16-Mar-2012 16:25:30 [---] [http_debug] [ID#202] Sent header to server:
16-Mar-2012 16:25:31 [---] [http_debug] [ID#202] Received header from server: HTTP/1.1 200 OK
16-Mar-2012 16:25:31 [---] [http_debug] [ID#202] Received header from server: Date: Fri, 16 Mar 2012 20:25:44 GMT
16-Mar-2012 16:25:31 [---] [http_debug] [ID#202] Received header from server: Server: Apache/2.2.3 (Red Hat)
16-Mar-2012 16:25:31 [---] [http_debug] [ID#202] Received header from server: Connection: close
16-Mar-2012 16:25:31 [---] [http_debug] [ID#202] Received header from server: Transfer-Encoding: chunked
16-Mar-2012 16:25:31 [---] [http_debug] [ID#202] Received header from server: Content-Type: text/plain; charset=UTF-8
16-Mar-2012 16:25:31 [---] [http_debug] [ID#202] Received header from server:
16-Mar-2012 16:25:31 [---] [http_xfer_debug] [ID#202] HTTP: wrote 93 bytes
16-Mar-2012 16:25:31 [---] [http_debug] [ID#202] Info: Expire cleared
16-Mar-2012 16:25:31 [---] [http_debug] [ID#202] Info: Closing connection #0
16-Mar-2012 16:25:31 [---] [network_status_debug] status: online
16-Mar-2012 16:25:31 [rosetta@home] [file_xfer_debug] FILE_XFER_SET::poll(): http op done; retval 0
16-Mar-2012 16:25:31 [rosetta@home] [file_xfer_debug] parsing upload response: <data_server_reply>
<status>0</status>
<file_size>0</file_size>
</data_server_reply>
16-Mar-2012 16:25:31 [rosetta@home] [file_xfer_debug] parsing status: 0
16-Mar-2012 16:25:31 [rosetta@home] [fxd] starting upload, upload_offset 0
16-Mar-2012 16:25:31 [---] [http_debug] HTTP_OP::libcurl_exec(): ca-bundle set
16-Mar-2012 16:25:31 [---] [proxy_debug] HTTP_OP::no_proxy_for_url(): http://srv3.bakerlab.org/rosetta_cgi/file_upload_handler
16-Mar-2012 16:25:31 [---] [proxy_debug] returning false
16-Mar-2012 16:25:32 [---] [http_debug] [ID#202] Info: timeout on name lookup is not supported
16-Mar-2012 16:25:32 [---] [http_debug] [ID#202] Info: About to connect() to srv3.bakerlab.org port 80 (#0)
16-Mar-2012 16:25:32 [---] [http_debug] [ID#202] Info: Trying 128.95.160.144...
16-Mar-2012 16:25:32 [---] [network_status_debug] status: online
16-Mar-2012 16:25:32 [---] [http_debug] [ID#202] Info: Connected to srv3.bakerlab.org (128.95.160.144) port 80 (#0)
16-Mar-2012 16:25:32 [---] [http_debug] [ID#202] Sent header to server: POST /rosetta_cgi/file_upload_handler HTTP/1.1
16-Mar-2012 16:25:32 [---] [http_debug] [ID#202] Sent header to server: User-Agent: BOINC client (windows_x86_64 6.10.58)
16-Mar-2012 16:25:32 [---] [http_debug] [ID#202] Sent header to server: Host: srv3.bakerlab.org
16-Mar-2012 16:25:32 [---] [http_debug] [ID#202] Sent header to server: Accept: */*
16-Mar-2012 16:25:32 [---] [http_debug] [ID#202] Sent header to server: Accept-Encoding: deflate, gzip
16-Mar-2012 16:25:32 [---] [http_debug] [ID#202] Sent header to server: Content-Type: application/x-www-form-urlencoded
16-Mar-2012 16:25:32 [---] [http_debug] [ID#202] Sent header to server: Content-Length: 14156
16-Mar-2012 16:25:32 [---] [http_debug] [ID#202] Sent header to server: Expect: 100-continue
16-Mar-2012 16:25:32 [---] [http_debug] [ID#202] Sent header to server:
16-Mar-2012 16:25:32 [---] [http_debug] [ID#202] Received header from server: HTTP/1.1 100 Continue
16-Mar-2012 16:25:33 [---] [http_debug] [ID#202] Received header from server: HTTP/1.1 200 OK
16-Mar-2012 16:25:33 [---] [http_debug] [ID#202] Received header from server: Date: Fri, 16 Mar 2012 20:25:46 GMT
16-Mar-2012 16:25:33 [---] [http_debug] [ID#202] Received header from server: Server: Apache/2.2.3 (Red Hat)
16-Mar-2012 16:25:33 [---] [http_debug] [ID#202] Received header from server: Connection: close
16-Mar-2012 16:25:33 [---] [http_debug] [ID#202] Received header from server: Transfer-Encoding: chunked
16-Mar-2012 16:25:33 [---] [http_debug] [ID#202] Received header from server: Content-Type: text/plain; charset=UTF-8
16-Mar-2012 16:25:33 [---] [http_debug] [ID#202] Received header from server:
16-Mar-2012 16:25:33 [---] [http_xfer_debug] [ID#202] HTTP: wrote 64 bytes
16-Mar-2012 16:25:33 [---] [http_debug] [ID#202] Info: Expire cleared
16-Mar-2012 16:25:33 [---] [http_debug] [ID#202] Info: Closing connection #0
16-Mar-2012 16:25:33 [---] [network_status_debug] status: online
16-Mar-2012 16:25:33 [rosetta@home] [file_xfer_debug] FILE_XFER_SET::poll(): http op done; retval 0
16-Mar-2012 16:25:33 [rosetta@home] [file_xfer_debug] parsing upload response: <data_server_reply>
<status>0</status>
</data_server_reply>
16-Mar-2012 16:25:33 [rosetta@home] [file_xfer_debug] parsing status: 0
16-Mar-2012 16:25:33 [rosetta@home] [file_xfer_debug] file transfer status 0
16-Mar-2012 16:25:33 [rosetta@home] Finished upload of if3dimer_design9monomer_abinitio_SAVE_ALL_OUT_44611_2681_0_0
16-Mar-2012 16:25:33 [rosetta@home] [file_xfer_debug] Throughput 11894 bytes/sec
And then I reported the tasks.
16-Mar-2012 16:33:27 [rosetta@home] Sending scheduler request: Requested by user.
16-Mar-2012 16:33:27 [rosetta@home] Reporting 2 completed tasks, not requesting new tasks
16-Mar-2012 16:33:27 [---] [http_debug] HTTP_OP::init_post(): http://srv4.bakerlab.org/rosetta_cgi/cgi
16-Mar-2012 16:33:27 [---] [http_debug] HTTP_OP::libcurl_exec(): ca-bundle set
16-Mar-2012 16:33:27 [---] [proxy_debug] HTTP_OP::no_proxy_for_url(): http://srv4.bakerlab.org/rosetta_cgi/cgi
16-Mar-2012 16:33:27 [---] [proxy_debug] returning false
16-Mar-2012 16:33:27 [---] [network_status_debug] woke up after 17.175983 seconds
16-Mar-2012 16:33:27 [---] [http_debug] [ID#1] Info: timeout on name lookup is not supported
16-Mar-2012 16:33:27 [---] [http_debug] [ID#1] Info: About to connect() to srv4.bakerlab.org port 80 (#0)
16-Mar-2012 16:33:27 [---] [http_debug] [ID#1] Info: Trying 128.95.160.145...
16-Mar-2012 16:33:27 [---] [network_status_debug] status: online
16-Mar-2012 16:33:27 [---] [http_debug] [ID#1] Info: Connected to srv4.bakerlab.org (128.95.160.145) port 80 (#0)
16-Mar-2012 16:33:27 [---] [http_debug] [ID#1] Sent header to server: POST /rosetta_cgi/cgi HTTP/1.1
16-Mar-2012 16:33:27 [---] [http_debug] [ID#1] Sent header to server: User-Agent: BOINC client (windows_x86_64 6.10.58)
16-Mar-2012 16:33:27 [---] [http_debug] [ID#1] Sent header to server: Host: srv4.bakerlab.org
16-Mar-2012 16:33:27 [---] [http_debug] [ID#1] Sent header to server: Accept: */*
16-Mar-2012 16:33:27 [---] [http_debug] [ID#1] Sent header to server: Accept-Encoding: deflate, gzip
16-Mar-2012 16:33:27 [---] [http_debug] [ID#1] Sent header to server: Content-Type: application/x-www-form-urlencoded
16-Mar-2012 16:33:27 [---] [http_debug] [ID#1] Sent header to server: Content-Length: 17602
16-Mar-2012 16:33:27 [---] [http_debug] [ID#1] Sent header to server: Expect: 100-continue
16-Mar-2012 16:33:27 [---] [http_debug] [ID#1] Sent header to server:
16-Mar-2012 16:33:27 [---] [http_debug] [ID#1] Received header from server: HTTP/1.1 100 Continue
16-Mar-2012 16:33:27 [---] [http_debug] [ID#1] Received header from server: HTTP/1.1 200 OK
16-Mar-2012 16:33:27 [---] [http_debug] [ID#1] Received header from server: Date: Fri, 16 Mar 2012 20:33:41 GMT
16-Mar-2012 16:33:27 [---] [http_debug] [ID#1] Received header from server: Server: Apache/2.2.3 (Red Hat)
16-Mar-2012 16:33:27 [---] [http_debug] [ID#1] Received header from server: Connection: close
16-Mar-2012 16:33:27 [---] [http_debug] [ID#1] Received header from server: Transfer-Encoding: chunked
16-Mar-2012 16:33:27 [---] [http_debug] [ID#1] Received header from server: Content-Type: text/xml
16-Mar-2012 16:33:27 [---] [http_debug] [ID#1] Received header from server:
16-Mar-2012 16:33:27 [---] [http_xfer_debug] [ID#1] HTTP: wrote 1216 bytes
16-Mar-2012 16:33:27 [---] [http_xfer_debug] [ID#1] HTTP: wrote 1813 bytes
16-Mar-2012 16:33:27 [---] [http_debug] [ID#1] Info: Expire cleared
16-Mar-2012 16:33:27 [---] [http_debug] [ID#1] Info: Closing connection #0
16-Mar-2012 16:33:28 [rosetta@home] Scheduler request completed
16-Mar-2012 16:33:28 [---] [network_status_debug] status: online
16-Mar-2012 16:33:29 [---] [network_status_debug] status: online
16-Mar-2012 16:33:30 [---] [network_status_debug] status: online
I am still saving logs for a few more tasks still in progress.
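Since the debug output is long, a small filter can help pick out the lines that matter. The sketch below pulls the HTTP response codes, the parsed upload status values, and the finished-upload names out of a saved log. The patterns are based only on the line formats shown in this post, so other client versions may need different ones, and the function name `summarize` is made up for illustration.

```python
import re

# e.g. "... [http_debug] [ID#201] Received header from server: HTTP/1.1 200 OK"
HTTP_STATUS = re.compile(r"Received header from server: HTTP/\d\.\d (\d{3})")
# e.g. "... [file_xfer_debug] parsing status: 0"
UPLOAD_STATUS = re.compile(r"\[file_xfer_debug\] parsing status: (-?\d+)")
# e.g. "... [rosetta@home] Finished upload of <file name>"
FINISHED = re.compile(r"Finished upload of (\S+)")


def summarize(log_lines):
    """Collect HTTP codes, upload statuses, and finished uploads."""
    summary = {"http_statuses": [], "upload_statuses": [], "finished": []}
    for line in log_lines:
        match = HTTP_STATUS.search(line)
        if match:
            summary["http_statuses"].append(int(match.group(1)))
        match = UPLOAD_STATUS.search(line)
        if match:
            summary["upload_statuses"].append(int(match.group(1)))
        match = FINISHED.search(line)
        if match:
            summary["finished"].append(match.group(1))
    return summary


if __name__ == "__main__":
    sample = [
        "16-Mar-2012 16:23:56 [---] [http_debug] [ID#201] "
        "Received header from server: HTTP/1.1 200 OK",
        "16-Mar-2012 16:23:57 [rosetta@home] [file_xfer_debug] parsing status: 0",
    ]
    print(summarize(sample))
```

Any HTTP code other than 100 or 200, or any upload status other than 0, would be the first thing to look at. The excerpts above contain neither, which suggests the uploads themselves look fine from the client side.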
|
|
In Memory of Kimsey M Fowler Sr
Joined: 10 Mar 12 Posts: 26 Credit: 39,033,222 RAC: 0
|
I've been experiencing this problem for a week now on a new machine. I'm running two nearly identical machines: the new one is experiencing this problem and the older one is not. I want to document the similarities and differences between these machines so that others might find something in common that may yield a good clue:
Old Machine (no problem):
i7-3930K processor, family 6, model 45, stepping 6
Win7 Home Premium x64, SP1
Solid state hard drive
Motherboard ASUS X-79 Sabertooth TUF with CPU overclocked to 4.7 GHz
CPU running Rosetta@Home full time
2x EVGA 560Ti GPU's running Folding@Home full time
New Machine (problem):
i7-3930K processor, family 6, model 45, stepping 7 <==delta
Win7 Home Premium x64, SP1
Solid state hard drive
Motherboard ASUS X-79 Sabertooth TUF with CPU overclocked to 4.7 GHz
CPU running Rosetta@Home for the failing 8 WU's/day, remaining time to F@H.
2x EVGA 580 GPU's running Folding@Home full time <==delta
Things I have tried incrementally:
1) setting the GPU Activity button to "Suspend GPU"
2) uninstalling the F@H SMP core to get same s/w configuration as old machine
3) uninstalling and reinstalling BOINC and Rosetta
Planned test for Monday night:
1) completely shut down F@H during the download, execution, and upload of next 8 R@H WU's.
Please respond if you see any commonality.
Note: All GPU's are using the same driver version 285.62; BOINC is version 6.12.34(x64).
|
|
P . P . L .
Joined: 20 Aug 06 Posts: 581 Credit: 4,865,274 RAC: 0
|
Hi.
First thing to do is put it back to stock speed and see if you still get errors, if you haven't done that already.
Not all rigs will run the same science stable.
my 2c worth.
|
|
Rocco Moretti
Joined: 18 May 10 Posts: 66 Credit: 585,745 RAC: 0
|
AlphaLaser - Thanks for posting the log. I can't see anything obviously wrong from the log, but hopefully it will help us rule out what it isn't.
In Memory of Kimsey M Fowler Sr - Thanks for posting the system delta, and a double thanks for your efforts in troubleshooting.
I don't think it's been mentioned so far, but from what I've seen, the issue looks to be confined to Windows 7 machines (most commonly SP1 x64), although I can't say if it's due to the operating system, or rather the type of machines which typically run Win7.
|
|
In Memory of Kimsey M Fowler Sr
Joined: 10 Mar 12 Posts: 26 Credit: 39,033,222 RAC: 0
|
This is a report on my troubleshooting activities. Last night on the problem machine I ran 8 Rosetta WU's with the following concurrent changes to the system:
1) Folding@Home was stopped on CPU and GPU's,
2) All applications except BOINC/Rosetta were terminated,
3) All nonessential processes were terminated with Task Manager,
4) CPU was not overclocked,
5) screen saver and desktop image were not used, only blue background used.
The results were the same. All WU's failed due to client error, invalid, and application version not reported (showing only three dashes).
A.M. reported via e-mail using an EVGA 550Ti GPU, previously with driver 285.62 and now with 295.73. It would be useful to hear from William Blakemore, Alpha Laser, Sky King, and Digital Savior if they are running EVGA's, how many, what model, and if they have been running Folding@Home. Here is a summary of the machines from this thread that appear to be dealing with the same problem:
a) my problem machine: i7-3930K, 3.2GHz, Family 6, model 45, stepping 7
b) William Blakemore: i7-2700K, 3.5GHz, Family 6, model 42, stepping 7
c) Alpha Laser: i7-Q740 CPU, 1.7GHz, Family 6, model 30, stepping 5
d) Sky King: i7-920 CPU, 2.7GHz, Family 6, model 26, stepping 4
e) Digital Savior: i7-2600K, 3.4GHz, Family 6, model 42, stepping 7
f) A.M.: i7-2600K, 3.4GHz, Family 6, model 42, stepping 7
I think all of the stepping 7 processors are the latest CPU version. Suspicious, but I found other i7-3930K's stepping 7 without our issue (computer ID's 1520085 & 1524601).
I suspect there are plenty more machines out there with this problem, but either the users haven't noticed it yet, or they haven't bothered to report it.
For tonight's continued troubleshooting I'm considering uninstalling & physically removing the EVGA GPU's and installing a low end GPU of another type. Other suggestions/approaches would be appreciated if anyone has any ideas.
|
|
AlphaLaser
Joined: 19 Aug 06 Posts: 52 Credit: 3,327,939 RAC: 0
|
Hi, my system is a Dell XPS laptop, the specs are:
CPU: Core i7 740QM 1.73GHz
GPU: Nvidia Geforce 435M (2GB DDR3 Video RAM)
Driver: 285.62
RAM: 6GB DDR3-1333
OS: Windows 7 Home Premium SP1 64-bit
HDD: 640 GB 7200 RPM
I don't run folding but I do run GPUGrid on my GPU.
|
|
Rayburner
Joined: 4 Oct 05 Posts: 32 Credit: 16,518,823 RAC: 0
|
Hi!
You can add my system to the list:
CPU: i7-2600K CPU @ 3.40GHz Family 6 Model 42 Stepping 7
RAM: 8GB PC3-12800
GPU: NVIDIA GeForce GTX 570 (2560MB)
Graphics driver: 296.10
OS: MS Windows 7 Professional x64 Edition, SP 1
I run GPUGrid and PrimeGrid on the GPU.
EDIT: Are all your hosts running NVIDIA GPUs?
I have another i7 960 system with an AMD RADEON 6970 which is crunching Rosetta just fine.
Regards,
Rayburner
|
|