1)
Message boards :
Number crunching :
Problems and Technical Issues with Rosetta@home
(Message 107280)
Posted 12 Oct 2022 by Luigi R. Post: Validation is running.
Workunits waiting for validation: 2084120
Workunits waiting for assimilation: 10578
Workunits waiting for file deletion: 2
After about 20 minutes:
Workunits waiting for validation: 1970571
Workunits waiting for assimilation: 415
Workunits waiting for file deletion: 176
Will the validator win against the crunchers? D: |
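A rough way to answer that question is to extrapolate from the two snapshots in the post. The sketch below assumes the net drain rate stays constant, which is only a guess, since crunchers keep reporting new results; the function name `minutes_to_clear` is made up for illustration.

```python
def minutes_to_clear(before: int, after: int, elapsed_min: float) -> float:
    """Estimate minutes until the backlog is empty at the observed net drain rate.

    Assumption: the net rate (validated minus newly reported) stays constant.
    """
    drained = before - after
    if drained <= 0:
        raise ValueError("backlog is not shrinking; no finite estimate")
    rate_per_min = drained / elapsed_min
    return after / rate_per_min


# Figures quoted in the post: validation backlog before and after ~20 minutes.
before, after = 2084120, 1970571
eta_min = minutes_to_clear(before, after, 20)
print(f"net drain: {(before - after) / 20:.0f} workunits/min")
print(f"estimated time to clear: {eta_min / 60:.1f} hours")
```

If the backlog grows between snapshots, the function raises instead of returning a misleading number.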
2)
Message boards :
Number crunching :
Tells us your thoughts on granting credit for large protein, long-running tasks
(Message 94982)
Posted 20 Apr 2020 by Luigi R. Post: R@h adapts to changing requirements. With these new large protein models coming soon, tagged with a 4 GB memory bound, and with models that may take several hours to run, enough that the watchdog has been extended from its normal 4 hours to 10 hours, it seems credit may need some changes as well.
if (maxMemoryUsed > 1) {
    grantedCredits = normalCredits * maxMemoryUsed;
} else {
    grantedCredits = normalCredits;
}
E.g. a host gets 40 cr/h. If it uses up to 3.5 GB of memory, you pay it 40 * 3.5 = 140 cr/h. |
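The proposal above can be written as a small runnable function. This is only a sketch of the rule as posted, not anything the project implements; the names `granted_credits`, `normal_credits` and `max_memory_gb` are illustrative, and memory is assumed to be measured in GB as in the example.

```python
def granted_credits(normal_credits: float, max_memory_gb: float) -> float:
    """Scale credit by peak memory in GB when a task used more than 1 GB."""
    if max_memory_gb > 1:
        return normal_credits * max_memory_gb
    return normal_credits


# The example from the post: 40 cr/h on a task that peaks at 3.5 GB.
print(granted_credits(40, 3.5))
# A task that stays under 1 GB keeps the normal rate.
print(granted_credits(40, 0.4))
```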
3)
Message boards :
Number crunching :
Reported a bit late and invalid
(Message 94217)
Posted 12 Apr 2020 by Luigi R. Post: Is server marking tasks as invalid because results are reported some minutes/hours after deadline? How come? rb_04_08_20667_20541_ab_t000__h002_robetta_IGNORE_THE_REST_04_08_905771_13_0 https://boinc.bakerlab.org/rosetta/result.php?resultid=1143484478 <core_client_version>7.9.3</core_client_version> <![CDATA[ <stderr_txt> command: ../../projects/boinc.bakerlab.org_rosetta/rosetta_4.12_x86_64-pc-linux-gnu @rb_04_08_20667_20541_ab_t000__h002_robetta_FLAGS -in::file::fasta t000__h002.fasta -in:file:boinc_wu_zip rb_04_08_20667_20541_ab_t000__h002_robetta.zip -frag3 rb_04_08_20667_20541_ab_t000__h002_robetta.200.3mers.index.gz -fragA rb_04_08_20667_20541_ab_t000__h002_robetta.200.8mers.index.gz -fragB rb_04_08_20667_20541_ab_t000__h002_robetta.200.4mers.index.gz -nstruct 10000 -cpu_run_time 28800 -watchdog -boinc:max_nstruct 600 -checkpoint_interval 120 -mute all -database minirosetta_database -in::file::zip minirosetta_database.zip -boinc::watchdog -run::rng mt19937 -constant_seed -jran 1868176 Starting watchdog... Watchdog active. ====================================================== DONE :: 1 starting structures 9816.2 cpu seconds This process generated 1 decoys from 1 attempts ====================================================== BOINC :: WS_max 3.79814e+08 BOINC :: Watchdog shutting down... 
21:32:46 (18543): called boinc_finish(0) </stderr_txt> ]]> rb_04_08_20790_20542_abCOVID-19_t000__h002_robetta_IGNORE_THE_REST_05_11_905767_20_0 https://boinc.bakerlab.org/rosetta/result.php?resultid=1143482427 <core_client_version>7.9.3</core_client_version> <![CDATA[ <stderr_txt> command: ../../projects/boinc.bakerlab.org_rosetta/rosetta_4.12_x86_64-pc-linux-gnu @rb_04_08_20790_20542_abCOVID-19_t000__h002_robetta_FLAGS -in::file::fasta t000__h002.fasta -in:file:boinc_wu_zip rb_04_08_20790_20542_abCOVID-19_t000__h002_robetta.zip -frag3 rb_04_08_20790_20542_abCOVID-19_t000__h002_robetta.200.3mers.index.gz -fragA rb_04_08_20790_20542_abCOVID-19_t000__h002_robetta.200.11mers.index.gz -fragB rb_04_08_20790_20542_abCOVID-19_t000__h002_robetta.200.5mers.index.gz -nstruct 10000 -cpu_run_time 28800 -watchdog -boinc:max_nstruct 600 -checkpoint_interval 120 -mute all -database minirosetta_database -in::file::zip minirosetta_database.zip -boinc::watchdog -run::rng mt19937 -constant_seed -jran 1872149 Starting watchdog... Watchdog active. ====================================================== DONE :: 1 starting structures 25995.4 cpu seconds This process generated 5 decoys from 5 attempts ====================================================== BOINC :: WS_max 4.65002e+08 BOINC :: Watchdog shutting down... 
20:17:00 (16382): called boinc_finish(0) </stderr_txt> ]]> rb_04_08_20790_20542_abCOVID-19_t000__h002_robetta_IGNORE_THE_REST_04_08_905767_20_0 https://boinc.bakerlab.org/rosetta/result.php?resultid=1143482437 <core_client_version>7.9.3</core_client_version> <![CDATA[ <stderr_txt> command: ../../projects/boinc.bakerlab.org_rosetta/rosetta_4.12_x86_64-pc-linux-gnu @rb_04_08_20790_20542_abCOVID-19_t000__h002_robetta_FLAGS -in::file::fasta t000__h002.fasta -in:file:boinc_wu_zip rb_04_08_20790_20542_abCOVID-19_t000__h002_robetta.zip -frag3 rb_04_08_20790_20542_abCOVID-19_t000__h002_robetta.200.3mers.index.gz -fragA rb_04_08_20790_20542_abCOVID-19_t000__h002_robetta.200.8mers.index.gz -fragB rb_04_08_20790_20542_abCOVID-19_t000__h002_robetta.200.4mers.index.gz -nstruct 10000 -cpu_run_time 28800 -watchdog -boinc:max_nstruct 600 -checkpoint_interval 120 -mute all -database minirosetta_database -in::file::zip minirosetta_database.zip -boinc::watchdog -run::rng mt19937 -constant_seed -jran 1872449 Starting watchdog... Watchdog active. ====================================================== DONE :: 1 starting structures 24984.6 cpu seconds This process generated 6 decoys from 6 attempts ====================================================== BOINC :: WS_max 5.80346e+08 BOINC :: Watchdog shutting down... 
21:40:25 (16842): called boinc_finish(0) </stderr_txt> ]]> rb_04_08_20790_20542_abCOVID-19_t000__h002_robetta_IGNORE_THE_REST_06_05_905767_20_0 https://boinc.bakerlab.org/rosetta/result.php?resultid=1143482499 <core_client_version>7.9.3</core_client_version> <![CDATA[ <stderr_txt> command: ../../projects/boinc.bakerlab.org_rosetta/rosetta_4.12_x86_64-pc-linux-gnu @rb_04_08_20790_20542_abCOVID-19_t000__h002_robetta_FLAGS -in::file::fasta t000__h002.fasta -in:file:boinc_wu_zip rb_04_08_20790_20542_abCOVID-19_t000__h002_robetta.zip -frag3 rb_04_08_20790_20542_abCOVID-19_t000__h002_robetta.200.3mers.index.gz -fragA rb_04_08_20790_20542_abCOVID-19_t000__h002_robetta.200.5mers.index.gz -fragB rb_04_08_20790_20542_abCOVID-19_t000__h002_robetta.200.6mers.index.gz -nstruct 10000 -cpu_run_time 28800 -watchdog -boinc:max_nstruct 600 -checkpoint_interval 120 -mute all -database minirosetta_database -in::file::zip minirosetta_database.zip -boinc::watchdog -run::rng mt19937 -constant_seed -jran 1872769 Starting watchdog... Watchdog active. ====================================================== DONE :: 1 starting structures 28015.9 cpu seconds This process generated 6 decoys from 6 attempts ====================================================== BOINC :: WS_max 4.64642e+08 BOINC :: Watchdog shutting down... 
20:47:18 (16224): called boinc_finish(0) </stderr_txt> ]]> rb_04_08_20790_20542_abCOVID-19_t000__h002_robetta_IGNORE_THE_REST_08_05_905767_20_0 https://boinc.bakerlab.org/rosetta/result.php?resultid=1143482500 <core_client_version>7.9.3</core_client_version> <![CDATA[ <stderr_txt> command: ../../projects/boinc.bakerlab.org_rosetta/rosetta_4.12_x86_64-pc-linux-gnu @rb_04_08_20790_20542_abCOVID-19_t000__h002_robetta_FLAGS -in::file::fasta t000__h002.fasta -in:file:boinc_wu_zip rb_04_08_20790_20542_abCOVID-19_t000__h002_robetta.zip -frag3 rb_04_08_20790_20542_abCOVID-19_t000__h002_robetta.200.3mers.index.gz -fragA rb_04_08_20790_20542_abCOVID-19_t000__h002_robetta.200.5mers.index.gz -fragB rb_04_08_20790_20542_abCOVID-19_t000__h002_robetta.200.8mers.index.gz -nstruct 10000 -cpu_run_time 28800 -watchdog -boinc:max_nstruct 600 -checkpoint_interval 120 -mute all -database minirosetta_database -in::file::zip minirosetta_database.zip -boinc::watchdog -run::rng mt19937 -constant_seed -jran 1872729 Starting watchdog... Watchdog active. ====================================================== DONE :: 1 starting structures 24192.6 cpu seconds This process generated 5 decoys from 5 attempts ====================================================== BOINC :: WS_max 5.64347e+08 BOINC :: Watchdog shutting down... 21:27:31 (16886): called boinc_finish(0) </stderr_txt> ]]> |
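To compare results like the five above side by side, the summary lines can be pulled out of each stderr text. The following is a minimal sketch that assumes the "DONE ::" and "This process generated" lines always have the shape seen in these logs; `summarize` is a hypothetical helper, not part of BOINC or Rosetta.

```python
import re

# Matches the two summary lines that appear near the end of each stderr above.
SUMMARY = re.compile(
    r"DONE :: (?P<starts>\d+) starting structures (?P<cpu>[\d.]+) cpu seconds\s+"
    r"This process generated (?P<decoys>\d+) decoys from (?P<attempts>\d+) attempts"
)


def summarize(stderr_text: str):
    """Return CPU seconds, decoy count and CPU seconds per decoy, or None."""
    m = SUMMARY.search(stderr_text)
    if m is None:
        return None
    cpu = float(m["cpu"])
    decoys = int(m["decoys"])
    return {
        "cpu_seconds": cpu,
        "decoys": decoys,
        "attempts": int(m["attempts"]),
        "cpu_seconds_per_decoy": cpu / decoys if decoys else None,
    }


sample = (
    "DONE :: 1 starting structures 25995.4 cpu seconds "
    "This process generated 5 decoys from 5 attempts"
)
print(summarize(sample))
```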
4)
Message boards :
Number crunching :
Problems and Technical Issues with Rosetta@home
(Message 93396)
Posted 4 Apr 2020 by Luigi R. Post: Not sure what retroactive aspect you are talking about. This is not mentioned in the post you referenced. Well, there was a thread about this issue and you referred to admin's post. https://boinc.bakerlab.org/rosetta/forum_thread.php?id=13672&postid=92969#92969 Maybe I should have posted there. |
5)
Message boards :
Number crunching :
Problems and Technical Issues with Rosetta@home
(Message 93387)
Posted 4 Apr 2020 by Luigi R. Post: Not sure what retroactive aspect you are talking about. This is not mentioned in the post you referenced. Covid-19 wu https://boinc.bakerlab.org/rosetta/result.php?resultid=1136817690 Only 2cr today, although I downloaded on 29 Mar 2020. I expected no problem with credits, but I was wrong. |
6)
Message boards :
Number crunching :
Problems and Technical Issues with Rosetta@home
(Message 93382)
Posted 4 Apr 2020 by Luigi R. Post: @bcov: Why is low-credit granting retroactive? I downloaded a bunch of COVID-19 tasks on the 29th of March. I'm reporting them now and getting 2 cr instead of ~200 cr. I thought the issue would affect only new tasks downloaded after the 31st of March. My app version is 4.08. |
7)
Message boards :
Number crunching :
Stalled downloads
(Message 92013)
Posted 16 Mar 2020 by Luigi R. Post: I've only encountered stuck download queue behavior on my Linux boxes, where BOINC still thinks there are stalled downloads, despite me having cleared them. A restart of the boincmgr fixes that issue. Same here. It fixes itself after some minutes or hours too. |
8)
Message boards :
Number crunching :
Computation errors: rb_02_25_16883_16706_ab_t000__robetta_cstwt_5.0_xxxx
(Message 91893)
Posted 7 Mar 2020 by Luigi R. Post: Errors on Ubuntu 18.04 too. BOINC didn't crash though. rb_02_23_16774_16587_ab_t000__robetta_cstwt_5.0_FT_IGNORE_THE_REST_03_07_899045_59_1 rb_02_24_16778_16590_ab_t000__robetta_cstwt_5.0_FT_IGNORE_THE_REST_07_07_899052_43_1 |
9)
Message boards :
Number crunching :
Computation errors: rb_02_25_16883_16706_ab_t000__robetta_cstwt_5.0_xxxx
(Message 91889)
Posted 7 Mar 2020 by Luigi R. Post: On Ubuntu 18.04 there are no problems to run *cstwt_5.0* tasks.
Ubuntu 18.04.4 LTS, kernel 4.15.0-88-generic, BOINC v7.9.3: OK
Ubuntu 14.04.6 LTS, kernel 4.4.0-142-generic, BOINC v7.2.42: Dangerous |
10)
Message boards :
Number crunching :
Computation errors: rb_02_25_16883_16706_ab_t000__robetta_cstwt_5.0_xxxx
(Message 91882)
Posted 6 Mar 2020 by Luigi R. Post: I'm going to abort all *cstwt_5.0* tasks with a bash script on Linux to safeguard my contribution to R@H. Here is my script: https://pastebin.com/RKdZKhGx |
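For readers who cannot open the pastebin link, the general idea can be sketched as follows. This is not the author's script: it only selects task names matching the pattern and prints the commands it would run. The `boinccmd --task <url> <name> abort` form and the project URL are assumptions to check against your own client before running anything.

```python
from fnmatch import fnmatchcase


def tasks_to_abort(task_names, pattern="*cstwt_5.0*"):
    """Return the task names that match the shell-style pattern."""
    return [name for name in task_names if fnmatchcase(name, pattern)]


def abort_commands(task_names, project_url, pattern="*cstwt_5.0*"):
    """Build (but do not run) one boinccmd argument list per matching task."""
    return [
        ["boinccmd", "--task", project_url, name, "abort"]
        for name in tasks_to_abort(task_names, pattern)
    ]


# Hypothetical queue; in practice the names would come from the client's task list.
queue = [
    "rb_02_23_16774_16587_ab_t000__robetta_cstwt_5.0_FT_IGNORE_THE_REST_03_07_899045_59_1",
    "rb_04_08_20667_20541_ab_t000__h002_robetta_IGNORE_THE_REST_04_08_905771_13_0",
]
# Assumed project URL; use the one your client reports for Rosetta@home.
for cmd in abort_commands(queue, "https://boinc.bakerlab.org/rosetta/"):
    print(" ".join(cmd))
```

Printing the commands first makes it easy to review the selection before anything is aborted.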
11)
Message boards :
Number crunching :
Computation errors: rb_02_25_16883_16706_ab_t000__robetta_cstwt_5.0_xxxx
(Message 91863)
Posted 4 Mar 2020 by Luigi R. Post: Do you mean Mod.Sense? He is a great guy, but he is NOT an admin.
Here he is. https://boinc.bakerlab.org/rosetta/forum_thread.php?id=13510&postid=91696#91696 https://boinc.bakerlab.org/rosetta/forum_thread.php?id=13510&postid=91703#91703
If you read forums, the "cstwt_5.0" wus has problems since February 2019.
I see. I think I have already encountered this issue, but I didn't remember it at all. The BOINC client stops responding and you can't even kill it. Although my client is standalone and runs as a user process (not a service), you have to kill it as superuser. |
12)
Message boards :
Number crunching :
Computation errors: rb_02_25_16883_16706_ab_t000__robetta_cstwt_5.0_xxxx
(Message 91860)
Posted 4 Mar 2020 by Luigi R. Post: Well, I see that Admin answers on Number Crunching threads. As a volunteer, I can spend my time putting together a solution to abort all tasks named like "*cstwt*". |
13)
Message boards :
Number crunching :
Computation errors: rb_02_25_16883_16706_ab_t000__robetta_cstwt_5.0_xxxx
(Message 91857)
Posted 4 Mar 2020 by Luigi R. Post: I'm on Linux. BOINC froze because of task rb_02_21_16595_16419_ab_t000__robetta_cstwt_5.0_FT_IGNORE_THE_REST_05_05_896595_65_1. After 8 hours, I found that my host was idle! This is very bad. Please check it; it's not acceptable that one task blocks crunching on all DC projects. Sadly, I have set R@H to no new work. |
14)
Message boards :
Number crunching :
Rosetta 4.0+
(Message 87939)
Posted 20 Dec 2017 by Luigi R. Post: My target CPU run time is 1 hour. The Rosetta application runs for >5 hours. Then it fails. It happened 1 week ago too. It looks like no credit will be granted for that. I mean the credit granted after ~24 hours for errors. How can I disable receiving Rosetta workunits? I want Rosetta Mini only.
<core_client_version>7.4.22</core_client_version>
<![CDATA[
<message> process got signal 11 </message>
<stderr_txt>
command: ../../projects/boinc.bakerlab.org_rosetta/rosetta_4.06_x86_64-pc-linux-gnu @cp11v3n3_c.293.9_0001_0001_0001_0001.flags -nstruct 10000 -cpu_run_time 28800 -watchdog -boinc:max_nstruct 600 -checkpoint_interval 120 -mute all -database minirosetta_database -in::file::zip minirosetta_database.zip -boinc::watchdog -run::rng mt19937 -constant_seed -jran 1894171
Starting watchdog...
Watchdog active.
BOINC:: CPU time: 18537.5s, 14400s + 3600s[2017-12-20 19:18:14:] :: BOINC WARNING! cannot get file size for default.out.gz: could not open file.
Output exists: default.out.gz Size: -1
InternalDecoyCount: 0 (GZ)
----- 0 -----
Stream information inconsistent.
Writing W_0000001
======================================================
DONE :: 1 starting structures 18537.5 cpu seconds
This process generated 1 decoys from 1 attempts
======================================================
19:18:14 (10793): called boinc_finish(0)
</stderr_txt>
]]>
Link to result: https://boinc.bakerlab.org/result.php?resultid=961026207 |
15)
Message boards :
Number crunching :
Problems and Technical Issues with Rosetta@home
(Message 87881)
Posted 10 Dec 2017 by Luigi R. Post: Same here. https://boinc.bakerlab.org/result.php?resultid=958964042 https://boinc.bakerlab.org/result.php?resultid=958964041 |
16)
Message boards :
Number crunching :
Stuck on uploading is a new problem?
(Message 81488)
Posted 18 Apr 2017 by Luigi R. Post: Another (stuck) task got sent this night.
18-Apr-2017 04:05:28 [rosetta@home] Started upload of rb_03_23_72525_116778__t000__ab_robetta_IGNORE_THE_REST_474917_815_0_0
18-Apr-2017 04:05:46 [rosetta@home] Finished upload of rb_03_23_72525_116778__t000__ab_robetta_IGNORE_THE_REST_474917_815_0_0
18-Apr-2017 04:05:48 [rosetta@home] Sending scheduler request: To report completed tasks.
18-Apr-2017 04:05:48 [rosetta@home] Reporting 1 completed tasks
18-Apr-2017 04:05:48 [rosetta@home] Not requesting tasks: don't need
18-Apr-2017 04:05:51 [rosetta@home] Scheduler request completed
https://boinc.bakerlab.org/rosetta/result.php?resultid=910050184
Why are some tasks uploaded successfully after days? Now I have 3 remaining (stuck) tasks. |
17)
Message boards :
Number crunching :
Stuck on uploading is a new problem?
(Message 81479)
Posted 17 Apr 2017 by Luigi R. Post: By the way, I'm still not convinced it isn't Windows-10-specific. Yes, there was at least one report of a similar problem on Linux, but maybe it was different. I ran one of my Linux boxes all day yesterday without getting a sticker, and my Mac remains sticker free. Firing up a different Linux box today and will let it run all day (but that's including a major upgrade, which confuses all issues).
My host (4 stuck tasks now) has Xubuntu 14.04.5, Linux kernel 3.13.0-116-generic. |
18)
Message boards :
Number crunching :
Stuck on uploading is a new problem?
(Message 81473)
Posted 17 Apr 2017 by Luigi R. Post: A little update. One stuck task was miraculously uploaded yesterday.
16-Apr-2017 17:15:48 [rosetta@home] Started upload of 14dslfv5_14re4np_gb_0037_0001_30_0002_SAVE_ALL_OUT_480050_322_0_0
16-Apr-2017 17:16:02 [rosetta@home] Finished upload of 14dslfv5_14re4np_gb_0037_0001_30_0002_SAVE_ALL_OUT_480050_322_0_0
16-Apr-2017 17:16:06 [rosetta@home] Sending scheduler request: To report completed tasks.
16-Apr-2017 17:16:06 [rosetta@home] Reporting 1 completed tasks
16-Apr-2017 17:16:06 [rosetta@home] Not requesting tasks: don't need
16-Apr-2017 17:16:09 [rosetta@home] Scheduler request completed
https://boinc.bakerlab.org/rosetta/result.php?resultid=910051019
Now I have 4 remaining tasks stuck in the uploading state. You can see that they break the "8 KB rule". https://s28.postimg.org/43l918fy5/rosetta_stuck_tasks.png
Maybe 3 tasks stalled 4 times, which would explain why the uploaded size is 32 KB (= 4 * 8 KB) and not 8 KB. |
19)
Message boards :
Number crunching :
Stuck on uploading is a new problem?
(Message 81455)
Posted 16 Apr 2017 by Luigi R. Post: I'm running LHC@Home, WCG and NumberFields too. No problem for these projects. |
20)
Message boards :
Number crunching :
Stuck on uploading is a new problem?
(Message 81441)
Posted 15 Apr 2017 by Luigi R. Post: 5 tasks stuck on uploading here. client_state.xml <file> <name>rb_03_23_72525_116778__t000__ab_robetta_IGNORE_THE_REST_474917_815_0_0</name> <nbytes>530178.000000</nbytes> <max_nbytes>25000000.000000</max_nbytes> <md5_cksum>221c7cf702ff15910a96060fed236335</md5_cksum> <status>1</status> <upload_url>http://srv1.bakerlab.org/rosetta_cgi/file_upload_handler</upload_url> <persistent_file_xfer> <num_retries>11</num_retries> <first_request_time>1492170993.617707</first_request_time> <next_request_time>1492253144.919740</next_request_time> <time_so_far>2552.406322</time_so_far> <last_bytes_xferred>32768.000000</last_bytes_xferred> <is_upload>1</is_upload> </persistent_file_xfer> </file> <file> <name>rb_03_23_72525_116778__t000__4_C1_SAVE_ALL_OUT_IGNORE_THE_REST_474917_259_0_0</name> <nbytes>887888.000000</nbytes> <max_nbytes>25000000.000000</max_nbytes> <md5_cksum>33e5d718ef03b6c814fecaa4d08c9b81</md5_cksum> <status>1</status> <upload_url>http://srv4.bakerlab.org/rosetta_cgi/file_upload_handler</upload_url> <persistent_file_xfer> <num_retries>11</num_retries> <first_request_time>1492169234.382923</first_request_time> <next_request_time>1492257568.171602</next_request_time> <time_so_far>2357.694327</time_so_far> <last_bytes_xferred>207.000000</last_bytes_xferred> <is_upload>1</is_upload> </persistent_file_xfer> </file> <file> <name>UN-NM_C4Yang_000006_2L8HC4-12_DHR32_0019.pdb_C4Yang_17_04_20_47_25_localDocking_9_SAVE_ALL_OUT_479492_23_0_0</name> <nbytes>22502.000000</nbytes> <max_nbytes>50000000.000000</max_nbytes> <md5_cksum>db8309e5c372f565885d1075ac2b9683</md5_cksum> <status>1</status> <upload_url>http://srv4.bakerlab.org/rosetta_cgi/file_upload_handler</upload_url> <persistent_file_xfer> <num_retries>8</num_retries> <first_request_time>1492198192.963454</first_request_time> <next_request_time>1492258150.352014</next_request_time> <time_so_far>2485.533146</time_so_far> <last_bytes_xferred>22502.000000</last_bytes_xferred> 
<is_upload>1</is_upload> </persistent_file_xfer> </file> <file> <name>3566f810a5e0096440dc8f17796115d2_eehee_pd1-docking_CancerImmunotherapy_17_04_13_32_36_globalDocking_4_SAVE_ALL_OUT_478149_7_0_0</name> <nbytes>99159.000000</nbytes> <max_nbytes>50000000.000000</max_nbytes> <md5_cksum>a72161808b6852c6bb6f86c8fc85619f</md5_cksum> <status>1</status> <upload_url>http://srv3.bakerlab.org/rosetta_cgi/file_upload_handler</upload_url> <persistent_file_xfer> <num_retries>5</num_retries> <first_request_time>1492214322.948390</first_request_time> <next_request_time>1492248107.442623</next_request_time> <time_so_far>2275.530723</time_so_far> <last_bytes_xferred>32768.000000</last_bytes_xferred> <is_upload>1</is_upload> </persistent_file_xfer> </file> <file> <name>14dslfv5_14re4np_gb_0037_0001_30_0002_SAVE_ALL_OUT_480050_322_0_0</name> <nbytes>337741.000000</nbytes> <max_nbytes>50000000.000000</max_nbytes> <md5_cksum>8b05674e7048a0d3632f82d93a4d9571</md5_cksum> <status>1</status> <upload_url>http://srv1.bakerlab.org/rosetta_cgi/file_upload_handler</upload_url> <persistent_file_xfer> <num_retries>5</num_retries> <first_request_time>1492243267.332962</first_request_time> <next_request_time>1492251204.143131</next_request_time> <time_so_far>1553.738969</time_so_far> <last_bytes_xferred>32768.000000</last_bytes_xferred> <is_upload>1</is_upload> </persistent_file_xfer> </file> |
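Entries like the ones above can be listed automatically instead of read by eye. The sketch below assumes the `<file>` elements sit somewhere under a single root element and have the fields shown in this post; `stuck_uploads` is a hypothetical helper, and the sample is trimmed from the first entry above.

```python
import xml.etree.ElementTree as ET


def stuck_uploads(xml_text: str):
    """List pending uploads with their size, retry count and last bytes sent."""
    root = ET.fromstring(xml_text)
    rows = []
    for f in root.iter("file"):
        xfer = f.find("persistent_file_xfer")
        # Only files with a pending transfer marked as an upload are of interest.
        if xfer is None or xfer.findtext("is_upload") != "1":
            continue
        rows.append({
            "name": f.findtext("name"),
            "nbytes": int(float(f.findtext("nbytes"))),
            "num_retries": int(xfer.findtext("num_retries")),
            "last_bytes_xferred": int(float(xfer.findtext("last_bytes_xferred"))),
        })
    return rows


# Trimmed sample; the <client_state> wrapper is assumed for this example.
sample_xml = """
<client_state>
  <file>
    <name>rb_03_23_72525_116778__t000__ab_robetta_IGNORE_THE_REST_474917_815_0_0</name>
    <nbytes>530178.000000</nbytes>
    <persistent_file_xfer>
      <num_retries>11</num_retries>
      <last_bytes_xferred>32768.000000</last_bytes_xferred>
      <is_upload>1</is_upload>
    </persistent_file_xfer>
  </file>
</client_state>
"""
for row in stuck_uploads(sample_xml):
    print(row)
```

Comparing `last_bytes_xferred` with `nbytes` shows at a glance how far each upload got before stalling.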
©2023 University of Washington
https://www.bakerlab.org