Mini Rosetta 3.45

robertmiles
Joined: 16 Jun 08
Posts: 1225
Credit: 13,855,859
RAC: 2,047
Message 74994 - Posted: 27 Jan 2013, 0:43:46 UTC - in response to Message 74989.  

Hi All

Has everyone seen the post by David Kim requesting debug information regarding the apparent nVidia bug here?:

https://boinc.bakerlab.org/rosetta/forum_thread.php?id=6177&nowrap=true#74982

Danny


Note that that nVidia bug appears to occur under Linux only.
ID: 74994 · Rating: 0
Mad_Max
Joined: 31 Dec 09
Posts: 207
Credit: 23,363,648
RAC: 10,793
Message 74995 - Posted: 27 Jan 2013, 1:32:39 UTC

No, the "nVidia bug" is seen on all major OSes: Windows XP, Windows 7, a few Linux variants, and Mac OS.

Fresh example on Mac OS: https://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=1529537
(last 10 tasks; all the other failures are due to an app download error)
ID: 74995 · Rating: 0
GALAXY-VOYAGER
Joined: 25 Oct 12
Posts: 15
Credit: 47,437
RAC: 0
Message 75146 - Posted: 20 Feb 2013, 11:46:54 UTC - in response to Message 74994.  

Hi All

Has everyone seen the post by David Kim requesting debug information regarding the apparent nVidia bug here?:

https://boinc.bakerlab.org/rosetta/forum_thread.php?id=6177&nowrap=true#74982

Danny


Note that that nVidia bug appears to occur under Linux only.


I'm running an Acer Aspire M1800 with Windows Vista, and I'm constantly getting errors with Nvidia. I've tried to search for updates, but it says that my Nvidia is up to date. It keeps having to close Nvidia.
ID: 75146 · Rating: 0
Polian
Joined: 21 Sep 05
Posts: 152
Credit: 10,141,266
RAC: 0
Message 75150 - Posted: 20 Feb 2013, 18:15:04 UTC - in response to Message 75146.  


I'm running an Acer Aspire M1800 with Windows Vista, and I'm constantly getting errors with Nvidia. I've tried to search for updates, but it says that my Nvidia is up to date. It keeps having to close Nvidia.


It sounds like you are describing a problem unrelated to BOINC or Rosetta, as if you have an issue with your operating system and/or video driver installation.
ID: 75150 · Rating: 0
microchip
Joined: 10 Nov 10
Posts: 10
Credit: 2,160,208
RAC: 123
Message 75151 - Posted: 20 Feb 2013, 20:08:50 UTC - in response to Message 74994.  

Hi All

Has everyone seen the post by David Kim requesting debug information regarding the apparent nVidia bug here?:

https://boinc.bakerlab.org/rosetta/forum_thread.php?id=6177&nowrap=true#74982

Danny


Note that that nVidia bug appears to occur under Linux only.


Ugh, I have yet to see it on my Linux machine...

Team Belgium
ID: 75151 · Rating: 0
[VENETO] boboviz
Joined: 1 Dec 05
Posts: 1864
Credit: 8,184,675
RAC: 7,690
Message 75155 - Posted: 23 Feb 2013, 10:28:24 UTC

As on Ralph, all H3i-A2E2 WUs have problems with the screensaver...
ID: 75155 · Rating: 0
Rabinovitch
Joined: 28 Apr 07
Posts: 28
Credit: 5,439,728
RAC: 0
Message 75347 - Posted: 11 Apr 2013, 5:24:03 UTC

Hi all!

I found that tasks with "Validate error" appear on both my Linux and Windows PCs. I have described this problem already, but I thought it was present only on my Linux machine.

Example WU from Win7 box: https://boinc.bakerlab.org/rosetta/result.php?resultid=573733678
From Linux: https://boinc.bakerlab.org/rosetta/result.php?resultid=573992857

Is there any way to fix it? I'm just worried that the science done by my PCs (as they say here), along with the electricity consumed, is going to waste...
From Siberia with love!
ID: 75347 · Rating: 0
Polian
Joined: 21 Sep 05
Posts: 152
Credit: 10,141,266
RAC: 0
Message 75348 - Posted: 11 Apr 2013, 15:01:48 UTC
Last modified: 11 Apr 2013, 15:17:00 UTC

Try using a shorter target run time. I use 12 hours personally. It looks like most of your tasks are finishing successfully.

edit: I just looked over my recent results and there appears to be a batch of bad tasks in currently issued work. All of my failures but one were short running and terminated within a couple of minutes.
ID: 75348 · Rating: 0
P . P . L .
Joined: 20 Aug 06
Posts: 581
Credit: 4,865,274
RAC: 0
Message 75350 - Posted: 12 Apr 2013, 1:57:38 UTC

Lots of errors; these 4 all failed with the same error.

https://boinc.bakerlab.org/rosetta/workunit.php?wuid=522032571

cryo_be__chain_N_subrun_001_SAVE_ALL_OUT_IGNORE_THE_REST_78278_50_0

https://boinc.bakerlab.org/rosetta/workunit.php?wuid=522023452

cryo_be__chain_f_l_subrun_001_SAVE_ALL_OUT_IGNORE_THE_REST_78273_63_1

https://boinc.bakerlab.org/rosetta/workunit.php?wuid=522023894

cryo_be__chain_K_subrun_001_SAVE_ALL_OUT_IGNORE_THE_REST_78274_127_0

https://boinc.bakerlab.org/rosetta/workunit.php?wuid=522023820

cryo_be__chain_K_subrun_001_SAVE_ALL_OUT_IGNORE_THE_REST_78274_60_0


ERROR: weights_.size()
ERROR:: Exit from: src/numeric/random/WeightedSampler.cc line: 134
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish
==========================================================

This one errored out too.

https://boinc.bakerlab.org/rosetta/workunit.php?wuid=522161319

e6-52-1-29_mod_abinitio_SAVE_ALL_OUT_78300_706_0

ERROR: ERROR: FragmentIO: could not open file start.200.9mers
ERROR:: Exit from: src/core/fragment/FragmentIO.cc line: 233
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish

( As they say in programming GARBAGE IN = GARBAGE OUT! )
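
When several failed tasks are pasted side by side like this, it can help to group them by the source location named in the "Exit from" line. The following is only a minimal sketch that works on the stderr text as quoted in this thread; the function names are hypothetical and the regular expression assumes the "ERROR:: Exit from: <path> line: <n>" layout seen above.

```python
import re
from collections import Counter

# Matches lines shaped like the ones quoted in this thread, e.g.
# "ERROR:: Exit from: src/core/fragment/FragmentIO.cc line: 233".
# ASSUMPTION: the path contains no spaces.
EXIT_RE = re.compile(r"ERROR:: Exit from: (?P<path>\S+) line: (?P<line>\d+)")


def exit_location(stderr_text):
    """Return (source_file, line_number) for the first 'Exit from' line, or None."""
    match = EXIT_RE.search(stderr_text)
    if match is None:
        return None
    return match.group("path"), int(match.group("line"))


def tally_exit_locations(stderr_texts):
    """Count how many stderr dumps exited from each source location."""
    locations = (exit_location(text) for text in stderr_texts)
    return Counter(loc for loc in locations if loc is not None)


# Sample dumps copied from the posts above (trimmed to the relevant lines).
samples = [
    "ERROR: weights_.size()\n"
    "ERROR:: Exit from: src/numeric/random/WeightedSampler.cc line: 134\n",
    "ERROR: weights_.size()\n"
    "ERROR:: Exit from: src/numeric/random/WeightedSampler.cc line: 134\n",
    "ERROR: ERROR: FragmentIO: could not open file start.200.9mers\n"
    "ERROR:: Exit from: src/core/fragment/FragmentIO.cc line: 233\n",
]

for (path, line), count in tally_exit_locations(samples).most_common():
    print(f"{count} x {path}:{line}")
```

Grouping this way makes it easier to tell one bad batch (many tasks, one exit location) from unrelated failures.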


ID: 75350 · Rating: 0
P . P . L .
Joined: 20 Aug 06
Posts: 581
Credit: 4,865,274
RAC: 0
Message 75351 - Posted: 12 Apr 2013, 3:37:47 UTC

Another 3 on another rig, same error.


cryo_be__chain_K_subrun_001_SAVE_ALL_OUT_IGNORE_THE_REST_78274_99_0.


cryo_be__chain_K_subrun_001_SAVE_ALL_OUT_IGNORE_THE_REST_78274_124_0.


cryo_be__chain_M_subrun_001_SAVE_ALL_OUT_IGNORE_THE_REST_78281_418_0.


ERROR: weights_.size()
ERROR:: Exit from: src/numeric/random/WeightedSampler.cc line: 134
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish
# cpu_run_time_pref: 21600

</stderr_txt>

ID: 75351 · Rating: 0
[VENETO] boboviz
Joined: 1 Dec 05
Posts: 1864
Credit: 8,184,675
RAC: 7,690
Message 75352 - Posted: 12 Apr 2013, 8:08:20 UTC
Last modified: 12 Apr 2013, 8:08:37 UTC

574867308

ERROR: ERROR: FragmentIO: could not open file start.200.9mers
ERROR:: Exit from: ..\..\..\src\core\fragment\FragmentIO.cc line: 233
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish
ID: 75352 · Rating: 0
P . P . L .
Joined: 20 Aug 06
Posts: 581
Credit: 4,865,274
RAC: 0
Message 75364 - Posted: 13 Apr 2013, 7:56:42 UTC
Last modified: 13 Apr 2013, 8:04:41 UTC

Bad batch of tasks still out there!

e6-52-1-46_abinitio_SAVE_ALL_OUT_78312_1036_0


https://boinc.bakerlab.org/rosetta/workunit.php?wuid=522194572


ERROR: ERROR: FragmentIO: could not open file start.200.9mers
ERROR:: Exit from: src/core/fragment/FragmentIO.cc line: 233
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish
==================================

Some like this one are running O.K.

e6-9-4-48_abinitio_SAVE_ALL_OUT_78317_1037_0
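
To compare failing and working tasks like the two above, one option is to split each task name into its descriptive label and its trailing numbers. This is a rough sketch only: the helper names are hypothetical, and it ASSUMES every task name ends in three underscore-separated numeric fields, as the names quoted in this thread do. What each number means is not stated here, so the code does not interpret them.

```python
from collections import defaultdict


def split_task_name(name):
    """Split a task name into (label, (n1, n2, n3)).

    ASSUMPTION: the name ends in three underscore-separated numeric fields,
    as in the task names quoted in this thread.
    """
    parts = name.strip().rstrip(".").rsplit("_", 3)
    if len(parts) != 4 or not all(p.isdigit() for p in parts[1:]):
        raise ValueError(f"unexpected task name layout: {name!r}")
    return parts[0], tuple(int(p) for p in parts[1:])


def group_outcomes(reports):
    """Group reported outcomes by (label, first trailing number)."""
    groups = defaultdict(list)
    for name, outcome in reports:
        label, numbers = split_task_name(name)
        groups[(label, numbers[0])].append(outcome)
    return dict(groups)


# Outcomes as reported in this post.
reports = [
    ("e6-52-1-46_abinitio_SAVE_ALL_OUT_78312_1036_0", "error"),
    ("e6-9-4-48_abinitio_SAVE_ALL_OUT_78317_1037_0", "ok"),
]

for key, outcomes in group_outcomes(reports).items():
    print(key, outcomes)
```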
ID: 75364 · Rating: 0
[VENETO] boboviz
Joined: 1 Dec 05
Posts: 1864
Credit: 8,184,675
RAC: 7,690
Message 75373 - Posted: 15 Apr 2013, 20:58:21 UTC

57540781

Server state: Over
Outcome: Validate error
Client state: Done
Exit status: 0 (0x0)
BOINC:: Worker startup.
Starting watchdog...
Watchdog active.
# cpu_run_time_pref: 7200
======================================================
DONE :: 1 starting structures 1201 cpu seconds
This process generated 1 decoys from 1 attempts
======================================================
BOINC :: WS_max 0

BOINC :: Watchdog shutting down...
BOINC :: BOINC support services shutting down cleanly ...
called boinc_finish

</stderr_txt>
]]>

Validate state: Invalid
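
For reports like this one, the key numbers can be pulled out of the stderr text to check whether the run itself looked normal. The sketch below is a minimal, unofficial parser: the function and key names are hypothetical, and it ASSUMES that cpu_run_time_pref is the target run time in seconds and that the "DONE" and "generated" lines keep the wording shown above.

```python
import re


def summarize_stderr(text):
    """Extract a few figures from a Rosetta stderr dump as quoted in this thread."""
    summary = {}

    # ASSUMPTION: cpu_run_time_pref is the target run time in seconds.
    m = re.search(r"#\s*cpu_run_time_pref:\s*(\d+)", text)
    if m:
        summary["target_seconds"] = int(m.group(1))

    m = re.search(r"DONE\s*::\s*(\d+)\s+starting structures\s+(\d+)\s+cpu seconds", text)
    if m:
        summary["starting_structures"] = int(m.group(1))
        summary["cpu_seconds"] = int(m.group(2))

    m = re.search(r"generated\s+(\d+)\s+decoys from\s+(\d+)\s+attempts", text)
    if m:
        summary["decoys"] = int(m.group(1))
        summary["attempts"] = int(m.group(2))

    # "Clean" here only means: boinc_finish was reached and no ERROR line was printed.
    summary["clean_shutdown"] = "called boinc_finish" in text and "ERROR" not in text
    return summary


SAMPLE = """\
BOINC:: Worker startup.
Starting watchdog...
Watchdog active.
# cpu_run_time_pref: 7200
DONE :: 1 starting structures 1201 cpu seconds
This process generated 1 decoys from 1 attempts
BOINC :: WS_max 0
BOINC :: Watchdog shutting down...
BOINC :: BOINC support services shutting down cleanly ...
called boinc_finish
"""

print(summarize_stderr(SAMPLE))
```

For the log above this reports one decoy in 1201 CPU seconds with no ERROR line, so the stderr text alone does not explain the "Validate error" outcome.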
ID: 75373 · Rating: 0
robertmiles
Joined: 16 Jun 08
Posts: 1225
Credit: 13,855,859
RAC: 2,047
Message 75402 - Posted: 20 Apr 2013, 21:37:27 UTC

Rosetta Mini 3.45
cryo_be__chain_H_subrun_000_SAVE_ALL_OUT_IGNORE_THE_REST_78252_282_1

Has already run almost 5 hours without writing any checkpoints at all.

Claims to be 30.748% done. Claims to have almost 10 hours to completion, out of the 12 hours I've selected for these runs.

Unclear if this is a problem or not.
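
As a quick sanity check on those numbers, a naive linear extrapolation from the reported progress can be compared with the estimate shown by the client. This is only back-of-the-envelope arithmetic, not how BOINC actually computes its estimate; the function name is hypothetical and "almost 5 hours" is taken as exactly 5.0 hours.

```python
def linear_remaining_hours(elapsed_hours, fraction_done):
    """Remaining time if progress kept accumulating at the average rate so far."""
    if not 0.0 < fraction_done <= 1.0:
        raise ValueError("fraction_done must be in (0, 1]")
    projected_total = elapsed_hours / fraction_done
    return projected_total - elapsed_hours


# ASSUMPTION: "almost 5 hours" is rounded to 5.0; 30.748% is the reported progress.
elapsed = 5.0
fraction = 0.30748
remaining = linear_remaining_hours(elapsed, fraction)

print(f"projected total: {elapsed + remaining:.1f} h")
print(f"projected remaining: {remaining:.1f} h")
```

Under these assumptions the projection comes to roughly 11 more hours, the same ballpark as the "almost 10 hours" reported, and either figure puts the total past the 12-hour target.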
ID: 75402 · Rating: 0
P . P . L .
Joined: 20 Aug 06
Posts: 581
Credit: 4,865,274
RAC: 0
Message 75425 - Posted: 22 Apr 2013, 23:22:23 UTC

Hi.

I'm seeing a few of these error out with the same problem.

CASP9_fc_benchmark_hybridization_run55_T0542_0_D4_SAVE_ALL_OUT_IGNORE_THE_REST_48152_899_0


CASP9_fc_benchmark_hybridization_run55_T0547_0_C2_SAVE_ALL_OUT_IGNORE_THE_REST_48158_899_0


ERROR: error in process_residue_request: 'com'
ERROR:: Exit from: src/core/conformation/symmetry/util.cc line: 93
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish

ID: 75425 · Rating: 0
TJ
Joined: 29 Mar 09
Posts: 127
Credit: 4,799,890
RAC: 0
Message 75438 - Posted: 23 Apr 2013, 22:19:05 UTC - in response to Message 75402.  

Rosetta Mini 3.45
cryo_be__chain_H_subrun_000_SAVE_ALL_OUT_IGNORE_THE_REST_78252_282_1

Has already run almost 5 hours without writing any checkpoints at all.

Claims to be 30.748% done. Claims to have almost 10 hours to completion, out of the 12 hours I've selected for these runs.

Unclear if this is a problem or not.


It seems that unless you crunch the cryo tasks on a Mac, they will error out.
I crunch on Windows, so when I see a cryo task I abort it immediately.

Greetings,
TJ.
ID: 75438 · Rating: 0
robertmiles
Joined: 16 Jun 08
Posts: 1225
Credit: 13,855,859
RAC: 2,047
Message 75440 - Posted: 24 Apr 2013, 4:16:37 UTC - in response to Message 75425.  

Hi.

I'm seeing a few of these error out with the same problem.

CASP9_fc_benchmark_hybridization_run55_T0542_0_D4_SAVE_ALL_OUT_IGNORE_THE_REST_48152_899_0


CASP9_fc_benchmark_hybridization_run55_T0547_0_C2_SAVE_ALL_OUT_IGNORE_THE_REST_48158_899_0


ERROR: error in process_residue_request: 'com'
ERROR:: Exit from: src/core/conformation/symmetry/util.cc line: 93
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish


I've seen a number of BOINC projects use a zip utility that works properly in one direction, but usually gives an error if the other direction is tried.

Could it be that problem again?
ID: 75440 · Rating: 0
Brian Priebe
Joined: 27 Nov 09
Posts: 16
Credit: 33,020,247
RAC: 0
Message 75478 - Posted: 25 Apr 2013, 22:52:33 UTC - in response to Message 75425.  

Hi.

I'm seeing a few of these error out with the same problem.

CASP9_fc_benchmark_hybridization_run55_T0542_0_D4_SAVE_ALL_OUT_IGNORE_THE_REST_48152_899_0


CASP9_fc_benchmark_hybridization_run55_T0547_0_C2_SAVE_ALL_OUT_IGNORE_THE_REST_48158_899_0


ERROR: error in process_residue_request: 'com'
ERROR:: Exit from: src/core/conformation/symmetry/util.cc line: 93
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish

For me too, 90% of the CASP9 series bomb on this same error.
ID: 75478 · Rating: 0



©2024 University of Washington
https://www.bakerlab.org