21)
Message boards :
Number crunching :
Report long-running models here
(Message 59760)
Posted 23 Feb 2009 by AdeB Post: Preferred runtime + 4 hrs for workunit loopbuild_mamaln_ideal_hb_t312__IGNORE_THE_REST_1zjc_1_7634_42_0. stderr out: ... BOINC:: Worker startup. Starting watchdog... Watchdog active. # cpu_run_time_pref: 43200 Hbond tripped. ====> called boinc_finish ... AdeB |
22)
Message boards :
Number crunching :
Report long-running models here
(Message 59195)
Posted 31 Jan 2009 by AdeB Post: Another one that was stopped after preferred runtime + 4hrs. CPU time 57846.37 stderr out: ... Starting watchdog... Watchdog active. Starting work on structure: _00001 # cpu_run_time_pref: 43200 Starting work on structure: _00002 ====> called boinc_finish ... AdeB |
23)
Message boards :
Number crunching :
Report long-running models here
(Message 59155)
Posted 29 Jan 2009 by AdeB Post: Here is one where the watchdog stepped in: it was stopped after 16 hrs with a 12 hr preference. task: 224203414 CPU time 57633.47 stderr out: ... AdeB |
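The "preferred runtime + 4 hours" cutoff mentioned in these reports can be checked with a little arithmetic. The sketch below assumes the watchdog limit is simply the `cpu_run_time_pref` value from stderr plus a four-hour grace period, as the posts describe; the function names and the constant are hypothetical and not taken from the Rosetta or BOINC source.

```python
# Assumption (from the posts, not from the application source): the watchdog
# ends a task once its CPU time passes the runtime preference plus 4 hours.
WATCHDOG_GRACE_SECONDS = 4 * 3600  # hypothetical constant name


def watchdog_limit(cpu_run_time_pref: int) -> int:
    """Return the assumed CPU-time cutoff in seconds."""
    return cpu_run_time_pref + WATCHDOG_GRACE_SECONDS


def overrun_seconds(cpu_time: float, cpu_run_time_pref: int) -> float:
    """Return how far a reported CPU time lies past the assumed cutoff."""
    return cpu_time - watchdog_limit(cpu_run_time_pref)


if __name__ == "__main__":
    pref = 43200  # "# cpu_run_time_pref: 43200" in the stderr output (12 hours)
    print("assumed cutoff (s):", watchdog_limit(pref))
    # CPU times reported in the posts of 31 Jan 2009 and 29 Jan 2009.
    for cpu_time in (57846.37, 57633.47):
        print(cpu_time, "is past the cutoff by",
              round(overrun_seconds(cpu_time, pref), 2), "s")
```

With a 12-hour preference (43200 s) the assumed cutoff is 57600 s, i.e. 16 hours, which is consistent with both reported CPU times landing just past it.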
24)
Message boards :
Number crunching :
Problems with Minirosetta v1.54
(Message 59153)
Posted 29 Jan 2009 by AdeB Post: This task was aborted after my preferred runtime + 4 hours. It was working on the third model. stderr out: ... AdeB |
25)
Message boards :
Number crunching :
Report long-running models here
(Message 58661)
Posted 7 Jan 2009 by AdeB Post: I aborted lr5_score12_rlbd_2fls_IGNORE_THE_REST_DECOY_5559_1293_1 after running for more than 30 hours. AdeB |
26)
Message boards :
Number crunching :
Report long-running models here
(Message 58543)
Posted 5 Jan 2009 by AdeB Post: "AdeB, since that single model has been running longer than 6 hours, I would suggest you abort it..." After another crash the task has been aborted. |
27)
Message boards :
Number crunching :
Report long-running models here
(Message 58508)
Posted 4 Jan 2009 by AdeB Post: Long-running models: 1nkuA_BOINC_MPZN_with_zinc_abrelax_6130_134659_0 took more than 3x my preferred time (which is 12 hours). 1nkuA_BOINC_MPZN_with_zinc_abrelax_6130_188618_1 is still running. I am the second one to try this workunit; the first time there was an error because there were too many restarts. Yesterday I saw that the CPU time was over 13 hours, and when I tried to look at the graphics it crashed. Today (after crunching for some other projects) it restarted at 6 hours. This time the graphics worked fine, but it took 20 minutes to go from 'model 1 step 203980' to 'model 1 step 203991'. So, what to do? How many steps are there in a model? Should I let it run because it is almost finished, or abort it because there is no way I can finish this model? AdeB |
28)
Message boards :
Number crunching :
Rosetta adds 100,000th host!
(Message 58207)
Posted 28 Dec 2008 by AdeB Post: "Hosts and users are going up and teraflops are going down as of Dec 22. I'd like to see a way on the home page to show only hosts active within the last 30 or 60 days; it's hard to see where we stand as far as active hosts and users." Check this: users and hosts. Both numbers are dropping. |
29)
Message boards :
Number crunching :
Expired deadline
(Message 57925)
Posted 16 Dec 2008 by AdeB Post: This issue of reissuing a task to someone that may not receive credit for it has been on the BOINC "to do" list for over a year. Here is a link to the trac item: Server reissues task more then "total" specified Looks like someone stepped in and granted credit for the task. I hope it was also possible to save the results, because that's what it's all about. |
30)
Message boards :
Number crunching :
Expired deadline
(Message 57899)
Posted 15 Dec 2008 by AdeB Post: This issue of reissuing a task to someone that may not receive credit for it has been on the BOINC "to do" list for over a year. Here is a link to the trac item: Server reissues task more then "total" specified I think that you are right that 'no reply' doesn't count as an error. But it should not be sent to a third computer, because then there will be a validate error as the number of tasks exceeds the maximum number of tasks: max # of error/total/success tasks 1, 2, 1 |
31)
Message boards :
Number crunching :
Expired deadline
(Message 57887)
Posted 15 Dec 2008 by AdeB Post: This issue of reissuing a task to someone that may not receive credit for it has been on the BOINC "to do" list for over a year. Here is a link to the trac item: Server reissues task more then "total" specified I guess this hasn't been resolved yet. In this workunit it is clear that at the moment it was sent to one of my computers there were already too many total results. And though my computer crunched it for more than 11 hours, and it finished without any errors, it could never validate. Bad luck for me, and a complete waste of CPU time. AdeB |
32)
Message boards :
Number crunching :
Minirosetta v1.45 bug thread
(Message 57702)
Posted 8 Dec 2008 by AdeB Post: ERROR: Illegal value for integer option -run:jran specified: in workunit 1g73A_ZNMP_ABRELAX_tetraR_IGNORE_THE_REST_ZINC_METALLOPROTEIN-1g73A-_5476_258_1 AdeB |
33)
Message boards :
Number crunching :
Minirosetta v1.40 bug thread
(Message 57282)
Posted 27 Nov 2008 by AdeB Post: "For the team to know what is going on, please post links to your affected work units in your next message." FalconFly, I noticed that you are crunching for LHC@home as well. It might be that LHC@home is causing your crashes. I've had some crashes too this week. Next time it happens, check your boinc.log file; the last message there, before SIGSEGV and the stack trace, is probably: [lhcathome] Scheduler request A few weeks ago this was also mentioned by several people on the LHC@home message boards. AdeB |
34)
Message boards :
Number crunching :
Rosetta Mini with new score terms bug thread
(Message 56657)
Posted 3 Nov 2008 by AdeB Post: No problems here: linux, AMD Athlon XP |
35)
Message boards :
Number crunching :
Problems with Rosetta version 5.98
(Message 56039)
Posted 26 Sep 2008 by AdeB Post: This workunit is valid but stderr out is enormous:
<core_client_version>5.10.45</core_client_version>
<![CDATA[
<stderr_txt>
Graphics are disabled due to configuration...
# cpu_run_time_pref: 43200
# random seed: 2792818
sin_cos_range ERROR: nan is outside of [-1,+1] sin and cos value legal range
sin_cos_range ERROR: nan is outside of [-1,+1] sin and cos value legal range
sin_cos_range ERROR: nan is outside of [-1,+1] sin and cos value legal range
.
/// This line is repeated 516 times ///
.
sin_cos_range ERROR: nan is outside of [-1,+1] sin and cos value legal range
sin_cos_range ERROR: nan is outside of [-1,+1] sin and cos value legal range
sin_cos_range ERROR: nan is outside of [-1,+1] sin and cos value legal range
======================================================
DONE :: 1 starting structures 43239.7 cpu seconds
This process generated 45 decoys from 45 attempts
======================================================
BOINC :: Watchdog shutting down...
BOINC :: BOINC support services shutting down...
called boinc_finish
</stderr_txt>
]]> |
36)
Message boards :
Number crunching :
Minirosetta v1.32 bug thread
(Message 55207)
Posted 21 Aug 2008 by AdeB Post: Compute error in this workunit. stderr out:
<core_client_version>5.10.45</core_client_version>
<![CDATA[
<message>
process exited with code 1 (0x1, -255)
</message>
<stderr_txt>
# cpu_run_time_pref: 43200
ERROR: NANs occured in hbonding!
ERROR:: Exit from: src/core/scoring/hbonds/hbonds_geom.cc line: 763
called boinc_finish
</stderr_txt>
]]> |
37)
Message boards :
Number crunching :
Problems with Rosetta version 5.93
(Message 51008)
Posted 26 Jan 2008 by AdeB Post: resultid=135831728 Oh no, you did get 20. You should have got at least an extra 100 for all the effort you put into it. |
38)
Message boards :
Number crunching :
Problems with Rosetta version 5.93
(Message 50995)
Posted 26 Jan 2008 by AdeB Post: sorry for the triple-post. I had some problems with my connection. |
39)
Message boards :
Number crunching :
Problems with Rosetta version 5.93
(Message 50994)
Posted 26 Jan 2008 by AdeB Post: Strange. hedera received 88 of his 98 claimed for his watchdog ended task resultid=135513724. I wonder what the difference was? And i received 92 of 94 claimed for resultid 135481414. I hope Astro gets more than 20 credits for his job, but it probably won't be 400+. |
40)
Message boards :
Number crunching :
Problems with Rosetta version 5.93
(Message 50993)
Posted 26 Jan 2008 by AdeB Post: Strange. hedera received 88 of his 98 claimed for his watchdog ended task resultid=135513724. I wonder what the difference was? And i received 92 of 94 claimed for resultid 135481414. I hope Astro gets more than 20 credits for his job, but it probably won't be 400+. |