1)
Message boards :
Number crunching :
Problems with Minirosetta 1.80
(Message 62011)
Posted 29 Jun 2009 by lusvladimir Post: Errors for tasks: real_core_1.5_low200_beta_low200_start_hb
http://boinc.bakerlab.org/result.php?resultid=261781005
http://boinc.bakerlab.org/result.php?resultid=261750967
http://boinc.bakerlab.org/result.php?resultid=261750701
http://boinc.bakerlab.org/result.php?resultid=261750699
Ended by the watchdog. Marked invalid.
2)
Message boards :
Number crunching :
Problems with Minirosetta 1.76
(Message 61870)
Posted 21 Jun 2009 by lusvladimir Post:
http://boinc.bakerlab.org/rosetta/result.php?resultid=259823440
Task ID: 259823440
Name: wRMSF_1_5_core_jumps_mixcst2_hb_t290__IGNORE_THE_REST_12911_2080_0
Workunit: 237144213
InternalDecoyCount: protocols::boinc::Boinc::decoy_count() (GZ)
======================================================
DONE :: 1 starting structures 18047.8 cpu seconds
This process generated 1 decoys from 1 attempts
======================================================
called boinc_finish
</stderr_txt>
<message>
<file_xfer_error>
<file_name>wRMSF_1_5_core_jumps_mixcst2_hb_t290__IGNORE_THE_REST_12911_2080_0_0</file_name>
<error_code>-161</error_code>
</file_xfer_error>
Validate state: Invalid
---
http://boinc.bakerlab.org/rosetta/result.php?resultid=259799581
Task ID: 259799581
Name: wRMSF_1_5_core_jumps_mixcst2_hb_t362__IGNORE_THE_REST_12924_1373_0
Workunit: 237123479
InternalDecoyCount: protocols::boinc::Boinc::decoy_count() (GZ)
======================================================
DONE :: 1 starting structures 18501.5 cpu seconds
This process generated 1 decoys from 1 attempts
======================================================
called boinc_finish
</stderr_txt>
<message>
<file_xfer_error>
<file_name>wRMSF_1_5_core_jumps_mixcst2_hb_t362__IGNORE_THE_REST_12924_1373_0_0</file_name>
<error_code>-161</error_code>
</file_xfer_error>
Validate state: Invalid
3)
Message boards :
Number crunching :
Problems with Minirosetta 1.76
(Message 61859)
Posted 20 Jun 2009 by lusvladimir Post: Validate error:
Task ID: 259620267
Name: looprebuild_t374_decoy_5_12863_2150_0
Workunit: 236957726
======================================================
DONE :: 1 starting structures 1749.9 cpu seconds
This process generated 99 decoys from 99 attempts
======================================================
Validate state: Invalid
4)
Message boards :
Number crunching :
Report long-running models here
(Message 59206)
Posted 31 Jan 2009 by lusvladimir Post: These WUs stopped after the preferred runtime (1 hr) + 4 hrs.
Debian Linux, Boinc 6.2.14, Rosetta Mini 1.54
1nkuA_BOINC_MPZN_with_zinc_abrelax_cs_frags_6231_156556_0 http://boinc.bakerlab.org/result.php?resultid=224365694 CPU Time: 18255.47
1nkuA_BOINC_MPZN_with_zinc_abrelax_cs_frags_6231_113531_0 http://boinc.bakerlab.org/result.php?resultid=224061637 CPU Time: 18444.31
1nkuA_BOINC_MPZN_with_zinc_abrelax_cs_frags_6231_113176_0 http://boinc.bakerlab.org/result.php?resultid=224058752 CPU Time: 18159.49
stderr out:
...
End of unzipping.
Setting database description ...
Setting up checkpointing ...
Setting up folding (abrelax) ...
Beginning folding (abrelax) ...
BOINC:: Worker startup.
Starting watchdog...
Watchdog active.
Starting work on structure: _00001
# cpu_run_time_pref: 3600
====> called boinc_finish
</stderr_txt> ]]>
Validate state: Invalid
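The arithmetic behind "preferred runtime + 4 hrs" can be checked against the three CPU times in the post. This is only a sketch: the function and constant names are hypothetical, and the 4-hour margin is taken from the post's own description, not from BOINC or Rosetta source code.

```python
# Values from the post: cpu_run_time_pref is 3600 s; the "+ 4hrs" margin
# is the poster's description, assumed here rather than verified.
PREFERRED_RUNTIME_S = 3600
MARGIN_S = 4 * 3600
LIMIT_S = PREFERRED_RUNTIME_S + MARGIN_S  # 18000 s

def past_limit(cpu_time_s, limit_s=LIMIT_S):
    """Return True if the reported CPU time exceeds the assumed limit."""
    return cpu_time_s > limit_s

# CPU times reported in the post, keyed by result ID.
cpu_times = {
    "224365694": 18255.47,
    "224061637": 18444.31,
    "224058752": 18159.49,
}

for result_id, cpu in cpu_times.items():
    print(f"result {result_id}: {cpu:.2f} s, past limit: {past_limit(cpu)}")
```

All three tasks ran a few minutes beyond 18000 s, which fits the description of tasks stopping at roughly the preferred runtime plus four hours.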
5)
Message boards :
Number crunching :
Minirosetta v1.47 bug thread.
(Message 58087)
Posted 21 Dec 2008 by lusvladimir Post: Running Debian Linux, Boinc 6.2.14.
http://boinc.bakerlab.org/result.php?resultid=215464278
Task ID 215464278
Name cc_nonideal_1_3_nocst4_hb_t286__IGNORE_THE_REST_1VYHA_6_5693_20_0
Workunit 196380006
<core_client_version>6.2.14</core_client_version>
<![CDATA[
<message>
process exited with code 193 (0xc1, -63)
</message>
<stderr_txt>
# cpu_run_time_pref: 3600
*** glibc detected *** double free or corruption (!prev): 0x0e13a4f0 ***
SIGABRT: abort called
Stack trace (23 frames):
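The three numbers in "code 193 (0xc1, -63)" are consistent with each other: 0xc1 is 193 in hexadecimal, and -63 equals 193 minus 256. The sketch below only checks that arithmetic. The helper name is hypothetical, and how the BOINC client actually derives the third number is not stated in the post.

```python
def exit_code_views(code):
    """Return the hex form of an exit code and the value of code - 256.

    Assumption: the third number in the BOINC message corresponds to
    code - 256 for this particular code; this is not taken from BOINC source.
    """
    return hex(code), code - 256

hex_form, shifted = exit_code_views(193)
print(hex_form, shifted)
```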
6)
Message boards :
Number crunching :
Servers running, but no work available??
(Message 55858)
Posted 18 Sep 2008 by lusvladimir Post: Server Status as of 18 Sep 2008 11:57:02 UTC [ Scheduler running ] Queued: 8
18-Sep-2008 13:13:20 [rosetta@home] Sending scheduler request: To fetch work. Requesting 101005 seconds of work, reporting 1 completed tasks
18-Sep-2008 13:13:30 [rosetta@home] Scheduler request succeeded: got 0 new tasks
18-Sep-2008 13:19:52 [rosetta@home] Sending scheduler request: To fetch work. Requesting 102317 seconds of work, reporting 0 completed tasks
18-Sep-2008 13:20:02 [rosetta@home] Scheduler request succeeded: got 0 new tasks
18-Sep-2008 13:37:58 [rosetta@home] Sending scheduler request: To fetch work. Requesting 106150 seconds of work, reporting 0 completed tasks
18-Sep-2008 13:38:03 [rosetta@home] Scheduler request succeeded: got 0 new tasks
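To make the pattern in this log explicit (work requested each time, nothing received), the requested seconds and the task counts can be pulled out with two regular expressions. This is a rough sketch over the lines quoted in the post; the function and variable names are hypothetical, and the patterns are only assumed to match this message wording.

```python
import re

# Log text copied from the post above.
LOG = """\
18-Sep-2008 13:13:20 [rosetta@home] Sending scheduler request: To fetch work. Requesting 101005 seconds of work, reporting 1 completed tasks
18-Sep-2008 13:13:30 [rosetta@home] Scheduler request succeeded: got 0 new tasks
18-Sep-2008 13:19:52 [rosetta@home] Sending scheduler request: To fetch work. Requesting 102317 seconds of work, reporting 0 completed tasks
18-Sep-2008 13:20:02 [rosetta@home] Scheduler request succeeded: got 0 new tasks
18-Sep-2008 13:37:58 [rosetta@home] Sending scheduler request: To fetch work. Requesting 106150 seconds of work, reporting 0 completed tasks
18-Sep-2008 13:38:03 [rosetta@home] Scheduler request succeeded: got 0 new tasks
"""

REQUEST_RE = re.compile(r"Requesting (\d+) seconds of work")
REPLY_RE = re.compile(r"got (\d+) new tasks")

def summarize(log_text):
    """Return (seconds requested per request, tasks received per reply)."""
    requested = [int(n) for n in REQUEST_RE.findall(log_text)]
    received = [int(n) for n in REPLY_RE.findall(log_text)]
    return requested, received

requested, received = summarize(LOG)
print(f"{len(requested)} requests, {sum(received)} tasks received")
```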
7)
Message boards :
Number crunching :
Minirosetta v1.32 bug thread
(Message 55258)
Posted 24 Aug 2008 by lusvladimir Post: lusvladimir, thank you for all the details. One more question, have you run any other projects when the time change is negative? I mean, do tasks from other projects have a similar problem?
Mod.Sense, thank you for the advice about negative time! I read more of the manual on time synchronization and was able to tune my system so that the time shift is very small (milliseconds over several hours) and still positive. The NTP daemon no longer needs to synchronize the time often, and Rosetta workunits run without errors. I did not reproduce the error on another project (Einstein@Home), but too little time has passed. I will continue to monitor the system and, if errors occur, will report how to reproduce them.
8)
Message boards :
Number crunching :
Minirosetta v1.32 bug thread
(Message 55243)
Posted 23 Aug 2008 by lusvladimir Post: lusvladimir, thank you for all the details. One more question, have you run any other projects when the time change is negative? I mean, do tasks from other projects have a similar problem?
I crunch rosetta@home only, but as an experiment, to help resolve this problem, I will attach to another project for Linux platforms and report the results in 1-2 days. Thank you.
9)
Message boards :
Number crunching :
Minirosetta v1.32 bug thread
(Message 55237)
Posted 23 Aug 2008 by lusvladimir Post: I temporarily stopped the ntp daemon and no longer see this error.
Thanks, and sorry for my English; it is not my native language.
Debian Linux, kernel 2.6.26-1-686 SMP
Boinc Manager 6.2.14
Machine configured to run at 100% CPU
In the Linux system log, the ntp time change:
Aug 23 03:09:12 alpha ntpd[13389]: time reset -0.175490 s
Aug 23 03:09:33 alpha ntpd[13389]: synchronized to 77.234.200.98, stratum 4
Aug 23 03:10:29 alpha ntpd[13389]: synchronized to 87.236.24.179, stratum 2
...and in BOINC stderr.txt at the same time (I have task_debug set on):
23-Aug-2008 03:07:27 [rosetta@home] Started download of boinc_homfrags_aa1pxuA03_05.200_v1_3.gz
23-Aug-2008 03:08:05 [rosetta@home] [task_debug] result abinitio_only62_A_1bq9A_4438_2605_0 checkpointed
23-Aug-2008 03:08:44 [rosetta@home] [task_debug] result abinitio_only62_A_1vcc__4438_3676_0 checkpointed
23-Aug-2008 03:08:45 [rosetta@home] [task_debug] result abinitio_only62_A_1vcc__4438_3676_0 checkpointed
23-Aug-2008 03:09:23 [rosetta@home] [task_debug] result abinitio_only62_A_2chf__4434_6914_0 checkpointed
23-Aug-2008 03:09:38 [rosetta@home] [task_debug] result abinitio_homfrag_71_A_2hboA_4443_1214_0 checkpointed
23-Aug-2008 03:09:52 [rosetta@home] Finished download of boinc_homfrags_aa1pxuA03_05.200_v1_3.gz
23-Aug-2008 03:09:52 [rosetta@home] Started download of boinc_homfrags_aa1pxuA09_05.200_v1_3.gz
23-Aug-2008 03:10:32 [rosetta@home] [task_debug] result abinitio_only62_A_1bq9A_4438_2605_0 checkpointed
23-Aug-2008 03:10:44 [rosetta@home] [task_debug] result abinitio_only62_A_1bq9A_4438_2605_0 checkpointed
23-Aug-2008 03:10:56 [rosetta@home] [task_debug] result abinitio_only62_A_1vcc__4438_3676_0 checkpointed
23-Aug-2008 03:11:28 [rosetta@home] Sending scheduler request: To fetch work. Requesting 3081 seconds of work, reporting 0 completed tasks
23-Aug-2008 03:11:31 [rosetta@home] [task_debug] result abinitio_only62_A_2chf__4434_6914_0 checkpointed
23-Aug-2008 03:11:33 [rosetta@home] Scheduler request succeeded: got 1 new tasks
23-Aug-2008 03:11:33 [rosetta@home] [task_debug] result state=NEW for abinitio_only62_A_1ptq__4438_5437_0 from handle_scheduler_reply
23-Aug-2008 03:11:34 [rosetta@home] [task_debug] result state=FILES_DOWNLOADING for abinitio_only62_A_1ptq__4438_5437_0 from CS::update_results
23-Aug-2008 03:12:00 [rosetta@home] [task_debug] result abinitio_homfrag_71_A_2hboA_4443_1214_0 checkpointed
23-Aug-2008 03:12:11 [rosetta@home] [task_debug] Process for abinitio_only62_A_2chf__4434_6914_0 exited
23-Aug-2008 03:12:11 [rosetta@home] [task_debug] task_state=EXITED for abinitio_only62_A_2chf__4434_6914_0 from handle_exited_app
23-Aug-2008 03:12:11 [rosetta@home] [task_debug] result state=COMPUTE_ERROR for abinitio_only62_A_2chf__4434_6914_0 from CS::report_result_error
23-Aug-2008 03:12:11 [rosetta@home] [task_debug] exit status 193
23-Aug-2008 03:12:11 [rosetta@home] Computation for task abinitio_only62_A_2chf__4434_6914_0 finished
23-Aug-2008 03:12:11 [rosetta@home] Output file abinitio_only62_A_2chf__4434_6914_0_0 for task abinitio_only62_A_2chf__4434_6914_0 absent
23-Aug-2008 03:12:11 [rosetta@home] [task_debug] result state=COMPUTE_ERROR for abinitio_only62_A_2chf__4434_6914_0 from CS::app_finished
23-Aug-2008 03:12:11 [rosetta@home] Starting abinitio_only62_A_1pgx__4438_2667_0
23-Aug-2008 03:12:12 [---] [task_debug] ACTIVE_TASK::start(): forked process: pid 4030
23-Aug-2008 03:12:12 [rosetta@home] [task_debug] task_state=EXECUTING for abinitio_only62_A_1pgx__4438_2667_0 from start
23-Aug-2008 03:12:12 [rosetta@home] Starting task abinitio_only62_A_1pgx__4438_2667_0 using minirosetta version 132
23-Aug-2008 03:12:13 [rosetta@home] [task_debug] Process for abinitio_homfrag_71_A_2hboA_4443_1214_0 exited
23-Aug-2008 03:12:13 [rosetta@home] [task_debug] task_state=EXITED for abinitio_homfrag_71_A_2hboA_4443_1214_0 from handle_exited_app
23-Aug-2008 03:12:13 [rosetta@home] [task_debug] result state=COMPUTE_ERROR for abinitio_homfrag_71_A_2hboA_4443_1214_0 from CS::report_result_error
23-Aug-2008 03:12:13 [rosetta@home] [task_debug] exit status 193
23-Aug-2008 03:12:13 [rosetta@home] Computation for task abinitio_homfrag_71_A_2hboA_4443_1214_0 finished
23-Aug-2008 03:12:13 [rosetta@home] Output file abinitio_homfrag_71_A_2hboA_4443_1214_0_0 for task abinitio_homfrag_71_A_2hboA_4443_1214_0 absent
23-Aug-2008 03:12:13 [rosetta@home] [task_debug] result state=COMPUTE_ERROR for abinitio_homfrag_71_A_2hboA_4443_1214_0 from CS::app_finished
23-Aug-2008 03:12:13 [rosetta@home] Starting abinitio_homfrag_71_A_2hl7A_4443_1633_0
23-Aug-2008 03:12:13 [---] [task_debug] ACTIVE_TASK::start(): forked process: pid 4042
23-Aug-2008 03:12:13 [rosetta@home] [task_debug] task_state=EXECUTING for abinitio_homfrag_71_A_2hl7A_4443_1633_0 from start
23-Aug-2008 03:12:13 [rosetta@home] Starting task abinitio_homfrag_71_A_2hl7A_4443_1633_0 using minirosetta version 132
23-Aug-2008 03:12:17 [rosetta@home] [task_debug] Process for abinitio_only62_A_1pgx__4438_2667_0 exited
23-Aug-2008 03:12:17 [rosetta@home] [task_debug] task_state=EXITED for abinitio_only62_A_1pgx__4438_2667_0 from handle_exited_app
23-Aug-2008 03:12:17 [rosetta@home] [task_debug] result state=COMPUTE_ERROR for abinitio_only62_A_1pgx__4438_2667_0 from CS::report_result_error
23-Aug-2008 03:12:17 [rosetta@home] [task_debug] exit status 193
23-Aug-2008 03:12:17 [rosetta@home] Computation for task abinitio_only62_A_1pgx__4438_2667_0 finished
23-Aug-2008 03:12:17 [rosetta@home] Output file abinitio_only62_A_1pgx__4438_2667_0_0 for task abinitio_only62_A_1pgx__4438_2667_0 absent
23-Aug-2008 03:12:17 [rosetta@home] [task_debug] result state=COMPUTE_ERROR for abinitio_only62_A_1pgx__4438_2667_0 from CS::app_finished
23-Aug-2008 03:12:17 [rosetta@home] Starting abinitio_only62_A_1cc8A_4438_3695_0
23-Aug-2008 03:12:18 [---] [task_debug] ACTIVE_TASK::start(): forked process: pid 4061
23-Aug-2008 03:12:18 [rosetta@home] [task_debug] task_state=EXECUTING for abinitio_only62_A_1cc8A_4438_3695_0 from start
23-Aug-2008 03:12:18 [rosetta@home] Starting task abinitio_only62_A_1cc8A_4438_3695_0 using minirosetta version 132
23-Aug-2008 03:12:21 [rosetta@home] [task_debug] Process for abinitio_homfrag_71_A_2hl7A_4443_1633_0 exited
23-Aug-2008 03:12:21 [rosetta@home] [task_debug] task_state=EXITED for abinitio_homfrag_71_A_2hl7A_4443_1633_0 from handle_exited_app
23-Aug-2008 03:12:21 [rosetta@home] [task_debug] result state=COMPUTE_ERROR for abinitio_homfrag_71_A_2hl7A_4443_1633_0 from CS::report_result_error
23-Aug-2008 03:12:21 [rosetta@home] [task_debug] exit status 193
Looking at my old Rosetta results (my own stats): with BOINC 5.10.45 and the 5.96 Rosetta client I had 4 errors in a month. After upgrading BOINC to version 6.2.X and the new minirosetta app, I see many more errors if ntp is on. With the ntp daemon stopped, everything works without errors. I also tried running ntpdate manually (not the daemon, just a single sync with a time server): after the sync, two workunits failed, and then work continued without errors. I have been running Rosetta for 3 years, and I do not know which part of my system causes this: the kernel, BOINC Manager, the science app, or ntp.
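The timing relationship between the ntpd step and the task failures can be read off the two logs with a short script. The following is a sketch, not a general log parser: the regular expressions are written only for the line shapes quoted in the post, the helper names are hypothetical, the year 2008 for the syslog line is taken from the BOINC timestamps, and both logs are assumed to use the same local time.

```python
import re
from datetime import datetime

# Lines copied from the post above.
NTP_LINE = "Aug 23 03:09:12 alpha ntpd[13389]: time reset -0.175490 s"
BOINC_LINES = [
    "23-Aug-2008 03:12:11 [rosetta@home] [task_debug] exit status 193",
    "23-Aug-2008 03:12:13 [rosetta@home] [task_debug] exit status 193",
    "23-Aug-2008 03:12:17 [rosetta@home] [task_debug] exit status 193",
    "23-Aug-2008 03:12:21 [rosetta@home] [task_debug] exit status 193",
]

NTP_RE = re.compile(r"^(\w{3}\s+\d+ \d\d:\d\d:\d\d) .*time reset ([+-]?\d+\.\d+) s")
BOINC_RE = re.compile(r"^(\d\d-\w{3}-\d{4} \d\d:\d\d:\d\d) .*exit status (\d+)")

def parse_ntp_reset(line, year):
    """Return (timestamp, offset in seconds) for an ntpd 'time reset' line."""
    m = NTP_RE.search(line)
    if m is None:
        return None
    when = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
    return when, float(m.group(2))

def parse_boinc_exit(line):
    """Return (timestamp, exit status) for a BOINC 'exit status' line."""
    m = BOINC_RE.search(line)
    if m is None:
        return None
    when = datetime.strptime(m.group(1), "%d-%b-%Y %H:%M:%S")
    return when, int(m.group(2))

reset_time, offset = parse_ntp_reset(NTP_LINE, 2008)

# Seconds between the clock step and each task that exited with status 193.
delays = []
for line in BOINC_LINES:
    parsed = parse_boinc_exit(line)
    if parsed is not None and parsed[1] == 193:
        delays.append((parsed[0] - reset_time).total_seconds())

print(f"clock stepped by {offset} s; failures followed after {delays} s")
```

On these lines, the step is negative and the four failures occur about three minutes later. That is consistent with the poster's suspicion but does not by itself prove a causal link.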
10)
Message boards :
Number crunching :
Minirosetta v1.32 bug thread
(Message 55233)
Posted 23 Aug 2008 by lusvladimir Post: <core_client_version>6.2.14</core_client_version> <![CDATA[ <message> process exited with code 193 (0xc1, -63)
I observed that the Rosetta model I was processing failed with this error after an ntp daemon resync on my Linux machine. The system clock, when adjusted on a routine resync, caused the running model to fail because its understanding of time steps was changed from outside the model. I temporarily stopped the ntp daemon and no longer see this error.
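The general hazard described here, a backward clock step making elapsed time appear negative, can be illustrated in a few lines. This is only an illustration of the hypothesis: the post does not say how minirosetta or BOINC measures time, the variable names are hypothetical, and the step size is borrowed from the ntpd log line quoted in this thread.

```python
import time

def wall_elapsed(start_wall, now_wall):
    """Elapsed time computed naively from two wall-clock readings."""
    return now_wall - start_wall

# Simulated wall-clock readings in seconds (hypothetical values).
STEP_S = -0.175490        # backward step, as in "time reset -0.175490 s"
start = 1000.0            # reading taken before the step
real_interval = 0.1       # time that really passed
after_step = start + real_interval + STEP_S

naive = wall_elapsed(start, after_step)
print(f"wall-clock elapsed: {naive:.3f} s (negative after the step)")

# A monotonic clock is not affected by adjustments to the system time.
t0 = time.monotonic()
time.sleep(0.01)
mono = time.monotonic() - t0
print(f"monotonic elapsed: {mono:.3f} s")
```

Code that assumes elapsed time is never negative can misbehave in the first case. Measuring intervals with a monotonic clock avoids that class of problem, whatever the actual cause was here.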
11)
Message boards :
Number crunching :
Minirosetta v1.32 bug thread
(Message 55173)
Posted 19 Aug 2008 by lusvladimir Post: Please post bugs/issues with minirosetta v1.32 here.
Debian Linux; Boinc Manager 6.2.14. Errors from 1.32 tasks:
http://boinc.bakerlab.org/rosetta/result.php?resultid=185499939
http://boinc.bakerlab.org/rosetta/result.php?resultid=185493038
http://boinc.bakerlab.org/rosetta/result.php?resultid=185493027
http://boinc.bakerlab.org/rosetta/result.php?resultid=185493026
http://boinc.bakerlab.org/rosetta/result.php?resultid=185489375
<core_client_version>6.2.14</core_client_version>
<![CDATA[
<message>
process exited with code 193 (0xc1, -63)
</message>
<stderr_txt>
# cpu_run_time_pref: 3600
needs psipred_ss2 to run filters
needs psipred_ss2 to run filters
SIGSEGV: segmentation violation
Stack trace (19 frames):
[0x8926f8f] [0x89514e0] [0xb7f19400] [0x880c924] [0x834349c] [0x88bcc81] [0x880c5d6] [0x829591c] [0x85f20f6] [0x8072b5d] [0x807e2e7] [0x8165bee] [0x80abecc] [0x80a9ea4] [0x80d7044] [0x80d8651] [0x804b9f8] [0x89acfdc] [0x8048111]
Exiting...
</stderr_txt>
]]>
12)
Message boards :
Number crunching :
Problems with version 5.90/5.91
(Message 49877)
Posted 21 Dec 2007 by lusvladimir Post: Ubuntu 7.10 on a Core2Duo. The progress indicators do not advance; my two WUs show 0% and 0.014%. I have waited 5 hours: progress is frozen, and CPU usage is at 100% for both WUs.