Posts by Rayburner

1) Message boards : Number crunching : Client errors (Message 75271)
Posted 21 Mar 2013 by Rayburner
Post:
corrected typo

The problem is fixed for me too. My host returned its first valid results ever.

Thank you, Rosetta and David Anderson, for finally fixing it.

Rayburner


The problem appears to have been fixed.

I don't know the details of the fix, but from what I understand...
The Rosetta Admins and David Anderson were able to work together, using the scheduler requests that I posted, to identify and correct the problem. Rosetta's server software was recompiled, but I'm not sure if code was changed/updated; http://srv4.bakerlab.org/rosetta_cgi/cgi still shows scheduler version 605.

My tasks are currently resulting in:
Server state: Over
Outcome: Success
Client state: Done
Exit status: 0 (0x0)
Validate state: Valid
application version: 3.45
... even though I have nVidia GPUs capable of OpenCL work, using the latest nVidia drivers!

So, as far as I can tell, IT'S FIXED!
Thank you Rosetta for finally resolving this issue.
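
For anyone who wants to check a batch of results the same way, here is a minimal sketch that reads a "Key: value" task summary like the one quoted above and tests the three fields that changed once the fix went in. The function names and the choice of which fields to test are my own assumptions, not anything provided by BOINC or Rosetta.

```python
def parse_task_details(text):
    """Turn 'Key: value' lines into a dict with lower-cased keys."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that have no colon
            fields[key.strip().lower()] = value.strip()
    return fields


def looks_valid(fields):
    """True when outcome, validate state and exit status all indicate success."""
    return (
        fields.get("outcome") == "Success"
        and fields.get("validate state") == "Valid"
        and fields.get("exit status", "").startswith("0 ")
    )


SAMPLE = """Server state: Over
Outcome: Success
Client state: Done
Exit status: 0 (0x0)
Validate state: Valid
application version: 3.45"""

fields = parse_task_details(SAMPLE)
print(looks_valid(fields))
```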
2) Message boards : Number crunching : Client errors (Message 75231)
Posted 12 Mar 2013 by Rayburner
Post:
David

Do you still plan to update the scheduler?

And if so, when do you expect to do this?

Thanks

Rayburner

We haven't changed the scheduler in a long time so it's likely the driver update that broke things.

Can anyone confirm that ralph does not have this issue?

I'll ask people here to submit more test jobs to Ralph.

I don't know when I'll be able to update the server but hopefully next month.

thanks!

3) Message boards : Number crunching : Client errors (Message 75230)
Posted 12 Mar 2013 by Rayburner
Post:
David

Do you still plan to update the scheduler?

And if so, when do you expect to do this?

Thanks

Rayburner

We haven't changed the scheduler in a long time so it's likely the driver update that broke things.

Can anyone confirm that ralph does not have this issue?

I'll ask people here to submit more test jobs to Ralph.

I don't know when I'll be able to update the server but hopefully next month.

thanks!

4) Message boards : Number crunching : Client errors (Message 75007)
Posted 28 Jan 2013 by Rayburner
Post:
Hi mikey

I have had this problem from the very beginning with this box (December 2011), so the problem is much older than driver version 306.97.

Every other project (RALPH also!!) runs just fine on this computer.

BTW I also switched to BOINC 7.0.45

Regards,
Rayburner


Have you guys with Nvidia GPUs tried downgrading your driver to version 306.97 to see if that helps? It has in some cases, not all, but some. Most of us with AMD cards are not seeing the problem anymore; I am up to BOINC version 7.0.45 and am still getting the usual credits for the units crunched.

5) Message boards : Number crunching : Client errors (Message 74990)
Posted 26 Jan 2013 by Rayburner
Post:
Hi

I forgot to add the specs of my box:

Intel(R) Core(TM) i7-2600K CPU @ 3.40GHz [Family 6 Model 42 Stepping 7]

Microsoft Windows 7
Professional x64 Edition, Service Pack 1, (06.01.7601.00)

NVIDIA GeForce GTX 570 (2560MB) driver: 31090

Regards,
Rayburner

David

here is the information you requested

However, the exit code is 0, not 3.

Work unit is http://boinc.bakerlab.org/rosetta/result.php?resultid=558582819

The log from BOINC:

26.01.2013 20:11:05 | rosetta@home | [task] Process for P2_1_s4_f5_abinitio_design_y024_005_72943_231_0 exited, exit code 0, task state 1
26.01.2013 20:11:05 | rosetta@home | [task] task_state=EXITED for P2_1_s4_f5_abinitio_design_y024_005_72943_231_0 from handle_exited_app
26.01.2013 20:11:05 | rosetta@home | Computation for task P2_1_s4_f5_abinitio_design_y024_005_72943_231_0 finished
26.01.2013 20:11:05 | rosetta@home | [task] result state=FILES_UPLOADING for P2_1_s4_f5_abinitio_design_y024_005_72943_231_0 from CS::app_finished
26.01.2013 20:11:07 | rosetta@home | Started upload of P2_1_s4_f5_abinitio_design_y024_005_72943_231_0_0
26.01.2013 20:11:13 | rosetta@home | Finished upload of P2_1_s4_f5_abinitio_design_y024_005_72943_231_0_0
26.01.2013 20:11:13 | rosetta@home | [task] result state=FILES_UPLOADED for P2_1_s4_f5_abinitio_design_y024_005_72943_231_0 from CS::update_results
26.01.2013 20:13:58 | rosetta@home | Sending scheduler request: To report completed tasks.
26.01.2013 20:13:58 | rosetta@home | Reporting 1 completed tasks
26.01.2013 20:13:58 | rosetta@home | Not requesting tasks: "no new tasks" requested via Manager
26.01.2013 20:14:01 | rosetta@home | Scheduler request completed


Regards
Rayburner
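
In case it helps with comparing logs from several hosts, below is a rough sketch of how event log lines like the ones above could be parsed to pull out the exit code. It assumes the "timestamp | project | message" layout and the dd.mm.yyyy date format seen in this particular log (the date format depends on the client's locale), and the helper names are made up for illustration.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional


class LogEntry(NamedTuple):
    timestamp: datetime
    project: str
    message: str


# Matches the "[task] Process for ... exited" message shown in the log above.
EXIT_RE = re.compile(r"exited, exit code (-?\d+), task state (\d+)")


def parse_log_line(line: str) -> Optional[LogEntry]:
    """Split one 'timestamp | project | message' line; None if it does not fit."""
    parts = [p.strip() for p in line.split(" | ", 2)]
    if len(parts) != 3:
        return None
    try:
        # Assumed date format, taken from this log only.
        ts = datetime.strptime(parts[0], "%d.%m.%Y %H:%M:%S")
    except ValueError:
        return None
    return LogEntry(ts, parts[1], parts[2])


def exit_code(entry: LogEntry) -> Optional[int]:
    """Return the reported exit code, or None for other kinds of messages."""
    match = EXIT_RE.search(entry.message)
    return int(match.group(1)) if match else None


SAMPLE = """\
26.01.2013 20:11:05 | rosetta@home | [task] Process for P2_1_s4_f5_abinitio_design_y024_005_72943_231_0 exited, exit code 0, task state 1
26.01.2013 20:14:01 | rosetta@home | Scheduler request completed"""

entries = [e for e in map(parse_log_line, SAMPLE.splitlines()) if e]
for entry in entries:
    print(entry.timestamp, entry.project, exit_code(entry))
```

With the log above, this reports exit code 0 for the task, which is why the "client error" outcome on the server looked so odd.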

6) Message boards : Number crunching : Client errors (Message 74988)
Posted 26 Jan 2013 by Rayburner
Post:
David

here is the information you requested

However, the exit code is 0, not 3.

Work unit is http://boinc.bakerlab.org/rosetta/result.php?resultid=558582819

The log from BOINC:

26.01.2013 20:11:05 | rosetta@home | [task] Process for P2_1_s4_f5_abinitio_design_y024_005_72943_231_0 exited, exit code 0, task state 1
26.01.2013 20:11:05 | rosetta@home | [task] task_state=EXITED for P2_1_s4_f5_abinitio_design_y024_005_72943_231_0 from handle_exited_app
26.01.2013 20:11:05 | rosetta@home | Computation for task P2_1_s4_f5_abinitio_design_y024_005_72943_231_0 finished
26.01.2013 20:11:05 | rosetta@home | [task] result state=FILES_UPLOADING for P2_1_s4_f5_abinitio_design_y024_005_72943_231_0 from CS::app_finished
26.01.2013 20:11:07 | rosetta@home | Started upload of P2_1_s4_f5_abinitio_design_y024_005_72943_231_0_0
26.01.2013 20:11:13 | rosetta@home | Finished upload of P2_1_s4_f5_abinitio_design_y024_005_72943_231_0_0
26.01.2013 20:11:13 | rosetta@home | [task] result state=FILES_UPLOADED for P2_1_s4_f5_abinitio_design_y024_005_72943_231_0 from CS::update_results
26.01.2013 20:13:58 | rosetta@home | Sending scheduler request: To report completed tasks.
26.01.2013 20:13:58 | rosetta@home | Reporting 1 completed tasks
26.01.2013 20:13:58 | rosetta@home | Not requesting tasks: "no new tasks" requested via Manager
26.01.2013 20:14:01 | rosetta@home | Scheduler request completed


Regards
Rayburner
7) Message boards : Number crunching : Client errors (Message 74987)
Posted 26 Jan 2013 by Rayburner
Post:
David

here is the information you requested

However, the exit code is 0, not 3.

Work unit is http://boinc.bakerlab.org/rosetta/result.php?resultid=558582819

The log from BOINC:

26.01.2013 20:11:05 | rosetta@home | [task] Process for P2_1_s4_f5_abinitio_design_y024_005_72943_231_0 exited, exit code 0, task state 1
26.01.2013 20:11:05 | rosetta@home | [task] task_state=EXITED for P2_1_s4_f5_abinitio_design_y024_005_72943_231_0 from handle_exited_app
26.01.2013 20:11:05 | rosetta@home | Computation for task P2_1_s4_f5_abinitio_design_y024_005_72943_231_0 finished
26.01.2013 20:11:05 | rosetta@home | [task] result state=FILES_UPLOADING for P2_1_s4_f5_abinitio_design_y024_005_72943_231_0 from CS::app_finished
26.01.2013 20:11:07 | rosetta@home | Started upload of P2_1_s4_f5_abinitio_design_y024_005_72943_231_0_0
26.01.2013 20:11:13 | rosetta@home | Finished upload of P2_1_s4_f5_abinitio_design_y024_005_72943_231_0_0
26.01.2013 20:11:13 | rosetta@home | [task] result state=FILES_UPLOADED for P2_1_s4_f5_abinitio_design_y024_005_72943_231_0 from CS::update_results
26.01.2013 20:13:58 | rosetta@home | Sending scheduler request: To report completed tasks.
26.01.2013 20:13:58 | rosetta@home | Reporting 1 completed tasks
26.01.2013 20:13:58 | rosetta@home | Not requesting tasks: "no new tasks" requested via Manager
26.01.2013 20:14:01 | rosetta@home | Scheduler request completed


Regards
Rayburner
8) Message boards : Number crunching : Client Errors (Message 72717)
Posted 9 Apr 2012 by Rayburner
Post:
I have received one WU at ralph and crunched it successfully!!
http://ralph.bakerlab.org/result.php?resultid=2647221
Regards,
Rayburner


Hi Rayburner: Thank you for some positive sounding news! I wonder if you have fiddled with the drivers or anything else since you last had a WU ending in client error? Perhaps you could give Rosetta another try and see what happens. Thus far I have not tried Ralph, but will give it a shot. Kimsey, Jr.


Nothing was changed on my side. After the successful run on ralph I tried rosetta again. Unfortunately, it still ends with the known client error outcome.

That leads me to assume there must be a difference between ralph and rosetta. Maybe the project admins can have a look at possible differences on the server side.

Regards,
Rayburner
9) Message boards : Number crunching : Client Errors (Message 72712)
Posted 9 Apr 2012 by Rayburner
Post:
I have received one WU at ralph and crunched it successfully!!

http://ralph.bakerlab.org/result.php?resultid=2647221

Regards,
Rayburner
10) Message boards : Number crunching : Client Errors (Message 72605)
Posted 26 Mar 2012 by Rayburner
Post:



I want to extend a heartfelt thanks to everyone on the forums who is helping with troubleshooting the issue, especially those (like In Memory of Kimsey M Fowler Sr) who have gone above and beyond in diagnosing things. Thanks to all of your efforts, I think we can be relatively confident that the issue is directly related to NVidia GPU drivers on Windows 7.



Just out of curiosity, do we know for an affirmative fact that the same problem DOESN'T affect GPU users with ATI cards?


I can only say that I have an i7 with an AMD RADEON 6970 with driver 12.1 running just fine on rosetta (on Win7 x64).
11) Message boards : Number crunching : Client Errors (Message 72595)
Posted 25 Mar 2012 by Rayburner
Post:

In the mean time, does anyone know if there is a way to set up a test running a pre-Rosetta 3.22 task? If the task will complete normally with an NVIDIA driver installed in the GPU, some change that went into Rosetta 3.22 would likely be the source of the problem.


The problem already existed with Rosetta version 3.19. However, I seemed to be the only one to report it.

I started this thread (http://boinc.bakerlab.org/rosetta/forum_thread.php?id=5875) back then.



Yes, that definitely sounds like what the rest of us are experiencing - are you also running NVidia-based GPU apps for other projects?



Yes, I do. GPUGrid and PrimeGrid
12) Message boards : Number crunching : Client Errors (Message 72588)
Posted 24 Mar 2012 by Rayburner
Post:

In the mean time, does anyone know if there is a way to set up a test running a pre-Rosetta 3.22 task? If the task will complete normally with an NVIDIA driver installed in the GPU, some change that went into Rosetta 3.22 would likely be the source of the problem.


The problem already existed with Rosetta version 3.19. However, I seemed to be the only one to report it.

I started this thread (http://boinc.bakerlab.org/rosetta/forum_thread.php?id=5875) back then.

Regards,

Rayburner
13) Message boards : Number crunching : Client Errors (Message 72562)
Posted 20 Mar 2012 by Rayburner
Post:
Hi!

You can add my system to the list:

CPU: i7-2600K CPU @ 3.40GHz Family 6 Model 42 Stepping 7
RAM: 8GB PC3-12800
GPU: NVIDIA GeForce GTX 570 (2560MB)
Graphics driver: 296.10
OS: MS Windows 7 Professional x64 Edition, SP 1

I run GPUGrid and PrimeGrid on the GPU.

EDIT: Are all your hosts running NVIDIA GPUs?
I have another i7 960 system with an AMD RADEON 6970 which is crunching Rosetta just fine.

Regards,
Rayburner
14) Message boards : Number crunching : New machine, nothing but client error (Message 72493)
Posted 11 Mar 2012 by Rayburner
Post:
Hi!

I have the same problem with one of my hosts. Taking into account what you guys say plus my own experience, this issue is not related to the BOINC version or to GPU crunching as such.

I have one i7 which is crunching successfully on Rosetta while crunching for different projects on its GPU. However, this GPU is an AMD RADEON 6970.

My host which has the described problems has an NVIDIA GPU (GTX 570). The same seems to be true for mitrichr and Wiliam Blakemore.

Regards,
Rayburner
15) Message boards : Number crunching : Client error with Rosetta Mini 3.19 (Message 71913)
Posted 28 Dec 2011 by Rayburner
Post:
Now this host has reached a daily quota of 8 because it always returns "bad" results.

It looks like I have to take this host out of rosetta. It doesn't make sense to keep it attached as long as this problem persists.

Regards,
Rayburner
16) Message boards : Number crunching : Client error with Rosetta Mini 3.19 (Message 71902)
Posted 27 Dec 2011 by Rayburner
Post:
It looks like you are now running the Beta Boinc client
<core_client_version>7.0.3</core_client_version>

In looking at some previous results on this bad host, none of them show an application version at the end of the Task Details online.





Right. I installed the beta BOINC client hoping that it would help...

However, the problem was the same with 6.12.34.

As Mod.Sense has written, it looks like the validator has a problem analyzing my returned result. I guess that is why no application version is displayed.

Regards,
Rayburner
17) Message boards : Number crunching : Client error with Rosetta Mini 3.19 (Message 71899)
Posted 27 Dec 2011 by Rayburner
Post:
It looks like the missing credits were granted by the project (automatically??), but still all new returned results by this host are marked as outcome client error. As credits were granted I assume my returned results are valuable to the project.

So how are we going to proceed? I think I have checked everything on my side. Is it a problem on the server side? Does it make sense to delete this host on the server and let it create a new id by contacting the server again?

Regards,
Rayburner
18) Message boards : Number crunching : Client error with Rosetta Mini 3.19 (Message 71893)
Posted 26 Dec 2011 by Rayburner
Post:
I have detached and reattached to the project, run time is set to 1 hour, and the screensaver is not active.

The result is still the same: the outcome is client error.

My second host in the meantime is generating credits.

Regards,
Rayburner
19) Message boards : Number crunching : Client error with Rosetta Mini 3.19 (Message 71888)
Posted 25 Dec 2011 by Rayburner
Post:
Hello,

I am having problems with a new host I just attached to rosetta.

All WUs I report show the outcome "client error".

For example task ID 472808156 (http://boinc.bakerlab.org/rosetta/result.php?resultid=472808156)

However, judging from the stderr out (see below), I don't have a clue what the problem is.

The specific host is new, not overclocked, and crunching successfully for several different projects (Einstein, SETI, WCG, LHC, Primegrid).

Does anybody have a clue why this happens?

I stopped rosetta on this host for now.

Best Regards,
Rayburner

<core_client_version>6.12.34</core_client_version>
<![CDATA[
<stderr_txt>
[2011-12-25 20:14:32:] :: BOINC:: Initializing ... ok.
[2011-12-25 20:14:32:] :: BOINC :: boinc_init()
BOINC:: Setting up shared resources ... ok.
BOINC:: Setting up semaphores ... ok.
BOINC:: Updating status ... ok.
BOINC:: Registering timer callback... ok.
BOINC:: Worker initialized successfully.
Registering options..
Registered extra options.
Initializing broker options ...
Registered extra options.
Initializing core...
Initializing options.... ok
Options::initialize()
Options::adding_options()
Options::initialize() Check specs.
Options::initialize() End reached
Loaded options.... ok
Processed options.... ok
Initializing random generators... ok
Initialization complete.
Initializing options.... ok
Options::initialize()
Options::adding_options()
Options::initialize() Check specs.
Options::initialize() End reached
Loaded options.... ok
Processed options.... ok
Initializing random generators... ok
Initialization complete.
Setting WU description ...
Unpacking zip data: ../../projects/boinc.bakerlab.org_rosetta/minirosetta_database_rev46494.zip
Setting database description ...
Setting up checkpointing ...
Setting up graphics native ...
Setting up folding (abrelax) ...
Beginning folding (abrelax) ...
BOINC:: Worker startup.
Starting watchdog...
Watchdog active.
Starting work on structure: _00001
# cpu_run_time_pref: 14400
Starting work on structure: _00002
Starting work on structure: _00003
Starting work on structure: _00004
Starting work on structure: _00005
Starting work on structure: _00006
Starting work on structure: _00007
Starting work on structure: _00008
Starting work on structure: _00009
Starting work on structure: _00010
Starting work on structure: _00011
Starting work on structure: _00012
Starting work on structure: _00013
Starting work on structure: _00014
Starting work on structure: _00015
Starting work on structure: _00016
======================================================
DONE :: 1 starting structures 13863.3 cpu seconds
This process generated 16 decoys from 16 attempts
======================================================
BOINC :: WS_max 4.09907e+008

BOINC :: Watchdog shutting down...
BOINC :: BOINC support services shutting down cleanly ...
called boinc_finish

</stderr_txt>
]]>
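
As a quick sanity check on output like the above, here is a small sketch that pulls the summary numbers out of the stderr text and works out the CPU time per decoy. The regular expressions are written against the lines in this one result only, and reading cpu_run_time_pref as the target run time in seconds is my assumption.

```python
import re


def summarize_stderr(text):
    """Extract run-time target, CPU seconds and decoy counts; None if any is missing."""
    pref = re.search(r"#\s*cpu_run_time_pref:\s*(\d+)", text)
    done = re.search(
        r"DONE\s*::\s*(\d+) starting structures\s+([\d.]+) cpu seconds", text
    )
    decoys = re.search(r"generated (\d+) decoys from (\d+) attempts", text)
    if not (pref and done and decoys):
        return None
    cpu_seconds = float(done.group(2))
    n_decoys = int(decoys.group(1))
    return {
        "run_time_pref": int(pref.group(1)),
        "cpu_seconds": cpu_seconds,
        "decoys": n_decoys,
        "attempts": int(decoys.group(2)),
        "seconds_per_decoy": cpu_seconds / n_decoys if n_decoys else None,
        # The application logs this line when it shuts down normally.
        "finished_cleanly": "called boinc_finish" in text,
    }


SAMPLE = """\
# cpu_run_time_pref: 14400
DONE :: 1 starting structures 13863.3 cpu seconds
This process generated 16 decoys from 16 attempts
called boinc_finish
"""

summary = summarize_stderr(SAMPLE)
print(summary)
```

For this result the run used 13863.3 of the 14400 target seconds and every attempt produced a decoy, so nothing in the stderr itself points to a failure on the client.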

20) Message boards : Number crunching : Problems with Minirosetta 1.80 (Message 61976)
Posted 27 Jun 2009 by Rayburner
Post:
compute error after 4 hours

http://boinc.bakerlab.org/rosetta/result.php?resultid=261844121

real_core_1.5_low200_beta_low200_start_hb_t374__IGNORE_THE_REST_13119_137_0




