Message boards : Number crunching : Report long-running models here
Author | Message |
---|---|
robertmiles Send message Joined: 16 Jun 08 Posts: 1234 Credit: 14,338,560 RAC: 1,227 |
9/23/2009 4:07:32 PM rosetta@home task 1fv5A_ZNMP_ABRELAX_tetraR_IGNORE_THE_REST_ZINC_METALLOPROTEIN-1fv5A-_14711_576_0 resumed by user Looks to me like 1.97 is subject to a new variant of the lockfile problem, but at least a little more information appears to be reported for the new variant. You may need to reboot or restart BOINC in order to clear away the remnants of the lockfile problem, if it's affecting any workunits that try to use the same slot later. Also, I'd consider the possibility that 1.97 and faah 6.07 have some kind of conflict in how they handle zinc. |
P . P . L . Send message Joined: 20 Aug 06 Posts: 581 Credit: 4,865,274 RAC: 0 |
This one took 7hrs, 33min to do (1 model) on a 3 GHz rig, damn! frb_0_8__rnd2_aln_list_mike_chosen_bestaln.alns.homolog_csts_oct09_hb_t303__IGNORE_THE_REST_1FEZA_8_15003_15 https://boinc.bakerlab.org/rosetta/workunit.php?wuid=259645046 |
CraniuMod Send message Joined: 11 Jan 08 Posts: 3 Credit: 565,798 RAC: 0 |
9/23/2009 4:07:32 PM rosetta@home task 1fv5A_ZNMP_ABRELAX_tetraR_IGNORE_THE_REST_ZINC_METALLOPROTEIN-1fv5A-_14711_576_0 resumed by user Did as suggested and all appears to be well although I haven't seen a zinc go through. Thanks. |
P . P . L . Send message Joined: 20 Aug 06 Posts: 581 Credit: 4,865,274 RAC: 0 |
Another pig of a task, this one on my quad took 8hrs, 2min for 1 model. frb_0_8__rnd2_aln_list_mike_chosen_bestaln.alns.homolog_csts_oct09_hb_t305__IGNORE_THE_REST_1LARA_6_15004_15_0 https://boinc.bakerlab.org/rosetta/workunit.php?wuid=259645182 |
clayton1966 Send message Joined: 5 Sep 09 Posts: 6 Credit: 166,791 RAC: 0 |
This is the first task I have ever aborted, but after over 12 hours and only 3% finished I figured something was stuck. Here are the log messages I could find regarding this particular task. 10/12/2009 11:03:57 AM rosetta@home Starting task histone_loopbuild_run1_14925_57751_1 using minirosetta version 197 10/12/2009 11:13:34 PM rosetta@home task histone_loopbuild_run1_14925_57751_1 aborted by user |
P . P . L . Send message Joined: 20 Aug 06 Posts: 581 Credit: 4,865,274 RAC: 0 |
This was done on my quad; on a 4hr run time pref it showed 6hr, 50min BOINC time. Biggest I've had in a while. broker_idealclose_kic10_hb_t328__IGNORE_THE_REST_16455_612_0 https://boinc.bakerlab.org/rosetta/workunit.php?wuid=277137434 ====================================================== DONE :: 2 starting structures 24608.2 cpu seconds This process generated 2 decoys from 2 attempts ====================================================== |
P . P . L . Send message Joined: 20 Aug 06 Posts: 581 Credit: 4,865,274 RAC: 0 |
My runtime is 4hrs; this ran just over 8hrs on a 3 GHz Intel. t365__boinc_filtered_loopbuild_threading_cst_all_tex_IGNORE_THE_REST_16902_9991_0 https://boinc.bakerlab.org/rosetta/workunit.php?wuid=289111193 ========================================================================================= Watchdog active. # cpu_run_time_pref: 14400 Continuing computation from checkpoint: chk_S_1WB9A_12_0001_FastRelax__chk1_fa ... success! Continuing computation from checkpoint: chk_S_1WB9A_12_0001_FastRelax__chk2_fa ... success! Continuing computation from checkpoint: chk_S_1WB9A_12_0001_FastRelax__chk3_fa ... success! Continuing computation from checkpoint: chk_S_1WB9A_12_0001_FastRelax__chk4_fa ... success! BOINC:: CPU time: 29194s, 14400s + 14400s[2010- 2-12 13:38:58:] :: BOINC WARNING! cannot get file size for default.out.gz: could not open file. Output exists: default.out.gz Size: -1 InternalDecoyCount: 0 (GZ) ----- 0 ----- Stream information inconsistent. Writing W_0000001 ====================================================== DONE :: 1 starting structures 29194 cpu seconds This process generated 1 decoys from 1 attempts ====================================================== called boinc_finish SIGSEGV: segmentation violation </stderr_txt> |
P . P . L . Send message Joined: 20 Aug 06 Posts: 581 Credit: 4,865,274 RAC: 0 |
As you can see, this ran over my 4hr run time setting, on a 2.9 GHz Intel. t323__boinc_filtered_loopbuild_threading_cst_lb_tex_IGNORE_THE_REST_16900_6289_0 https://boinc.bakerlab.org/rosetta/workunit.php?wuid=292851592 InternalDecoyCount: 0 ====================================================== DONE :: 1 starting structures 28890.9 cpu seconds This process generated 1 decoys from 1 attempts ====================================================== called boinc_finish Just over 8hrs. |
apohawk Send message Joined: 13 Sep 08 Posts: 5 Credit: 30,438,070 RAC: 0 |
This one took a long time. placestub_1zvy_1zma_ppk_ProteinInterfaceDesign_28Feb2010_18489_296_0 https://boinc.bakerlab.org/rosetta/result.php?resultid=322582319 CPU time: 15685.67 preferred time: 2h application version: 2.05 OS: WinXP 64 BOINC Manager: 6.10.36 CPU: Phenom II 945 (3 GHz) DONE :: 2 starting structures 15685.5 cpu seconds This process generated 2 decoys from 2 attempts Now, what surprised me the most: claimed credit: 92.63 granted credit: 0.38 Did something go wrong during validation or during crunching? |
P . P . L . Send message Joined: 20 Aug 06 Posts: 581 Credit: 4,865,274 RAC: 0 |
This took double my run time, on my 2.9 intel. 373AA_boinc_slac373_loopbuild_threading_firas_IGNORE_THE_REST_18610_7623_0 https://boinc.bakerlab.org/rosetta/workunit.php?wuid=294620614 BOINC:: CPU time: 29305.1s, 14400s + 14400s[2010- 3-10 13:26:16:] :: BOINC InternalDecoyCount: 0 ====================================================== DONE :: 1 starting structures 29305.1 cpu seconds This process generated 1 decoys from 1 attempts ====================================================== called boinc_finish |
P . P . L . Send message Joined: 20 Aug 06 Posts: 581 Credit: 4,865,274 RAC: 0 |
Here's another long one, this was on my intel quad. aqp9__boinc_aqp9_run02_blast_yfsong_loopbuild_threading_cst_relax_yfsong_IGNORE_THE_REST_18418_2895_1 https://boinc.bakerlab.org/rosetta/workunit.php?wuid=294212467 BOINC:: CPU time: 28906.1s, 14400s + 14400s[2010- 3-12 14: 7:17:] :: BOINC InternalDecoyCount: 0 ====================================================== DONE :: 1 starting structures 28906.1 cpu seconds This process generated 1 decoys from 1 attempts ====================================================== called boinc_finish |
P . P . L . Send message Joined: 20 Aug 06 Posts: 581 Credit: 4,865,274 RAC: 0 |
This took 8hrs, 2min on my 3ghz intel. aqp9__boinc_aqp9_fast_run01_yfsong_loopbuild_threading_cst_relax_superfast_yfsong_IGNORE_THE_REST_18658_1421_0 https://boinc.bakerlab.org/rosetta/workunit.php?wuid=296064742 # cpu_run_time_pref: 14400 Continuing computation from checkpoint: chk_S_2B6OA_15_0001_Remodel__loop_1_0_0_S ... success! BOINC:: CPU time: 28914.7s, 14400s + 14400s[2010- 3-17 13:39:17:] :: BOINC InternalDecoyCount: 0 ====================================================== DONE :: 1 starting structures 28914.7 cpu seconds This process generated 1 decoys from 1 attempts ====================================================== called boinc_finish |
Bikermatt Send message Joined: 12 Feb 10 Posts: 20 Credit: 10,552,445 RAC: 0 |
Does anyone look at long running models anymore? I have been seeing two to three per week. -Matt Win 7 64 bit v2FcInnerW_1dAl_3fk8_ProteinInterfaceDesign_15Mar2010_18672_235_0 https://boinc.bakerlab.org/rosetta/result.php?resultid=326997251 <core_client_version>6.10.18</core_client_version> ====================================================== DONE :: 2 starting structures 20222.2 cpu seconds This process generated 2 decoys from 2 attempts ====================================================== Validate state Valid Claimed credit 115.549211591099 Granted credit 0.467169393634457 application version 2.05 |
Trotador Send message Joined: 30 May 09 Posts: 108 Credit: 291,214,977 RAC: 0 |
Not sure if they all fall within this category regards Task ID 327672671 Name v2FcInnerW_1dAl_1UCH_ProteinInterfaceDesign_15Mar2010_18672_216_0 ====================================================== DONE :: 2 starting structures 25768.5 cpu seconds This process generated 17 decoys from 17 attempts ====================================================== called boinc_finish </stderr_txt> ]]> Validate state Valid Claimed credit 179.886682155965 Granted credit 13.8316680387615 application version 2.05 Task ID 327464846 Name v2FcInnerW_1dAl_2r39_ProteinInterfaceDesign_15Mar2010_18672_188_0 ====================================================== DONE :: 49 starting structures 10799 cpu seconds This process generated 49 decoys from 49 attempts ====================================================== BOINC :: Watchdog shutting down... BOINC :: BOINC support services shutting down cleanly ... called boinc_finish </stderr_txt> ]]> Validate state Valid Claimed credit 75.3815505458785 Granted credit 9.5733070767376 application version 2.05 Task ID 326922282 Name placestub_alt_denovo_1zvy_3d6j_ProteinInterfaceDesign_21Mar2010_18705_75_0 Workunit 298345588 ====================================================== DONE :: 2 starting structures 13226.9 cpu seconds This process generated 2 decoys from 2 attempts ====================================================== BOINC :: Watchdog shutting down... BOINC :: BOINC support services shutting down cleanly ... called boinc_finish </stderr_txt> ]]> Validate state Valid Claimed credit 78.282098375707 Granted credit 0.513724378027017 application version 2.05 Task ID 326825643 Name placestub_alt_denovo_1zvy_2vg9_ProteinInterfaceDesign_21Mar2010_18705_50_0 Workunit 298255310 ====================================================== DONE :: 7 starting structures 11606 cpu seconds This process generated 7 decoys from 7 attempts ====================================================== BOINC :: Watchdog shutting down... BOINC :: BOINC support services shutting down cleanly ... called boinc_finish </stderr_txt> ]]> Validate state Valid Claimed credit 68.6891373975153 Granted credit 1.04329606606261 application version 2.05 Task ID 326822780 Name v2FcInnerW_1dAl_2iwx_ProteinInterfaceDesign_15Mar2010_18672_122_0 Workunit 298252600 ====================================================== DONE :: 1 starting structures 25789.3 cpu seconds This process generated 1 decoys from 1 attempts ====================================================== called boinc_finish </stderr_txt> ]]> Validate state Valid Claimed credit 152.642238209485 Granted credit 0.732511919011592 application version 2.05 Task ID 326401609 Name v2FcInnerW_1dAl_1YRV_ProteinInterfaceDesign_15Mar2010_18672_115_0 Workunit 297855809 ====================================================== DONE :: 36 starting structures 11274.3 cpu seconds This process generated 36 decoys from 36 attempts ====================================================== BOINC :: Watchdog shutting down... BOINC :: BOINC support services shutting down cleanly ... called boinc_finish </stderr_txt> ]]> Validate state Valid Claimed credit 66.961941133408 Granted credit 9.51201988583631 application version 2.05 |
Sid Celery Send message Joined: 11 Feb 08 Posts: 2145 Credit: 41,560,787 RAC: 9,320 |
Another long-running model on this task type running on W7 64bit v2FcInnerW_1dAl_3HMH_ProteinInterfaceDesign_15Mar2010_18672_208_0 <core_client_version>6.10.36</core_client_version> |
Mad_Max Send message Joined: 31 Dec 09 Posts: 209 Credit: 26,537,115 RAC: 17,468 |
453AA_boinc_slac453_loopbuild_threading_firas_IGNORE_THE_REST_19484_6668_0 One model took about 4 hours of CPU time (3 GHz Athlon II X2 250), and the client then started a second model, ignoring Target CPU Time = 2 hours. Result: the task was killed by the watchdog at 6 hours from the start (2+4), so ~2 hours of CPU time were lost (the time spent on the second model). I have met this error many times before with tasks of this type (* boinc * loopbuild_threading *), so now I abort them if I see them in the queue. This one was missed and hit the same error. |
LizzieBarry Send message Joined: 25 Feb 08 Posts: 76 Credit: 201,862 RAC: 0 |
(Report format copied from above - seems to make sense) A long-running model on this task, running on a 32-bit Vista laptop: rhoA15May2010_1lb1_2j49_ProteinInterfaceDesign_15May2010_20686_35_0 <core_client_version>6.10.43</core_client_version> What gets me about this is that 1205 decoys seemed to run within my 6 hour runtime, then the last decoy had to get shut down by the watchdog after exceeding 4 hours. Was I just unlucky? The credit award was still reasonable. |
Tackleway Send message Joined: 3 May 10 Posts: 3 Credit: 11,886 RAC: 0 |
Thoroughly unimpressed. The task implied 3...hrs to complete, ran for 6.5hrs, claimed 84.511 credits and was then granted 6.83 credits. This is not good VFM. I will not be processing further tasks that look like rhoA15May2010_1lb1_1rw1_ProteinInterfaceDesign_15May2010_20686_107_0 339844150 310311035 19 May 2010 5:56:27 UTC 19 May 2010 17:02:41 UTC Over Success Done 21,938.98 84.51 6.83 |
Sid Celery Send message Joined: 11 Feb 08 Posts: 2145 Credit: 41,560,787 RAC: 9,320 |
What gets me about this is that 1205 decoys seemed to run within my 6 hour runtime, then the last decoy had to get shut down by the watchdog after exceeding 4 hours. Was I just unlucky? The credit award was still reasonable. Is that all? The job below ran 3224 decoys in 8 hours before the last one got shut down by the watchdog. rhoA15May2010_1lb1_3e9v_ProteinInterfaceDesign_15May2010_20686_23_0 From Tackleway's post it seems like this is a characteristic of this job type, and not all of us get good credits (mine was over-awarded). Luck of the draw... |
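
For anyone who wants to compare reports like the ones in this thread, here is a minimal Python sketch (not anything from the Rosetta@home project itself) that pulls the "DONE :: ... cpu seconds" summary out of pasted stderr text and compares it with the run time preference. It also works out what fraction of claimed credit was granted. The function names `summarize` and `granted_fraction` are hypothetical, and the regular expressions simply assume the wording seen in the posts above.

```python
import re

# Patterns assume the summary wording quoted in the posts above;
# other application versions may print something different.
DONE_RE = re.compile(
    r"DONE\s*::\s*(\d+)\s+starting structures\s+([\d.]+)\s+cpu seconds"
)
DECOYS_RE = re.compile(r"generated\s+(\d+)\s+decoys from\s+(\d+)\s+attempts")


def summarize(stderr_text, runtime_pref_s):
    """Return CPU time, decoy count and overrun relative to the run time preference."""
    done = DONE_RE.search(stderr_text)
    decoys = DECOYS_RE.search(stderr_text)
    if done is None or decoys is None:
        return None  # summary block not found in the pasted text
    cpu_s = float(done.group(2))
    n_decoys = int(decoys.group(1))
    return {
        "cpu_seconds": cpu_s,
        "decoys": n_decoys,
        "seconds_per_decoy": cpu_s / n_decoys if n_decoys else None,
        # Values well above 1.0 mean the task ran past the preference.
        "overrun_ratio": cpu_s / runtime_pref_s,
    }


def granted_fraction(claimed, granted):
    """Fraction of the claimed credit that was actually granted."""
    return granted / claimed


# Figures taken from two of the reports above.
sample = (
    "DONE :: 1 starting structures 29194 cpu seconds "
    "This process generated 1 decoys from 1 attempts"
)
report = summarize(sample, runtime_pref_s=14400)
print(report)
print(round(granted_fraction(92.63, 0.38), 4))
```

With the 14400 s preference from the t365 report, the overrun ratio comes out at roughly 2, which matches the "just over 8hrs on a 4hr preference" description. Treat the parsing as a rough aid for reading your own task pages, not as a description of how the watchdog or the credit system decides anything.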
©2024 University of Washington
https://www.bakerlab.org