Posts by Luigi R.

1) Message boards : Number crunching : Rosetta 4.0+ (Message 87939)
Posted 20 Dec 2017 by Luigi R.
Post:
My target CPU run time is 1 hour.
The Rosetta application ran for more than 5 hours and then failed. The same thing happened a week ago.
It looks like no credit will be granted for it, not even the credit normally granted about 24 hours after an error.
How can I stop receiving Rosetta work units? I want Rosetta Mini only.

<core_client_version>7.4.22</core_client_version>
<![CDATA[
<message>
process got signal 11
</message>
<stderr_txt>
command: ../../projects/boinc.bakerlab.org_rosetta/rosetta_4.06_x86_64-pc-linux-gnu @cp11v3n3_c.293.9_0001_0001_0001_0001.flags -nstruct 10000 -cpu_run_time 28800 -watchdog -boinc:max_nstruct 600 -checkpoint_interval 120 -mute all -database minirosetta_database -in::file::zip minirosetta_database.zip -boinc::watchdog -run::rng mt19937 -constant_seed -jran 1894171
Starting watchdog...
Watchdog active.
BOINC:: CPU time: 18537.5s, 14400s + 3600s[2017-12-20 19:18:14:] :: BOINC 
WARNING! cannot get file size for default.out.gz: could not open file.
Output exists: default.out.gz Size: -1
InternalDecoyCount: 0 (GZ)
-----
0
-----
Stream information inconsistent.
Writing W_0000001
======================================================
DONE ::     1 starting structures  18537.5 cpu seconds
This process generated      1 decoys from       1 attempts
======================================================
19:18:14 (10793): called boinc_finish(0)

</stderr_txt>
]]>

Link to result: https://boinc.bakerlab.org/result.php?resultid=961026207
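
For anyone comparing the numbers in that stderr: a rough Python sketch that pulls the requested run time out of the command line and the CPU time out of the BOINC line. Both strings are excerpts copied from the output above; the helper name flag_value is made up for this example, and the sketch does not try to interpret the "14400s + 3600s" part, which the log does not explain.

```python
import re

# Excerpts copied from the stderr output in the post.
COMMAND = ("-nstruct 10000 -cpu_run_time 28800 -watchdog "
           "-boinc:max_nstruct 600 -checkpoint_interval 120")
BOINC_LINE = "BOINC:: CPU time: 18537.5s, 14400s + 3600s"


def flag_value(command, flag):
    """Return the token following `flag` in a command line, or None."""
    match = re.search(r"(?:^|\s)" + re.escape(flag) + r"\s+(\S+)", command)
    return match.group(1) if match else None


requested_s = float(flag_value(COMMAND, "-cpu_run_time"))
used_s = float(re.search(r"CPU time:\s*([\d.]+)s", BOINC_LINE).group(1))

# Convert both values to hours so they can be compared with the 1-hour target.
print(f"requested: {requested_s / 3600:.1f} h, used: {used_s / 3600:.2f} h")
```

So the task was started with an 8-hour -cpu_run_time, not the 1-hour target, and had used a bit over 5 hours of CPU time when it stopped.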
2) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 87881)
Posted 10 Dec 2017 by Luigi R.
Post:
Same here.

https://boinc.bakerlab.org/result.php?resultid=958964042
https://boinc.bakerlab.org/result.php?resultid=958964041
3) Message boards : Number crunching : Stuck on uploading is a new problem? (Message 81488)
Posted 18 Apr 2017 by Luigi R.
Post:
Another stuck task was uploaded last night.

18-Apr-2017 04:05:28 [rosetta@home] Started upload of rb_03_23_72525_116778__t000__ab_robetta_IGNORE_THE_REST_474917_815_0_0
18-Apr-2017 04:05:46 [rosetta@home] Finished upload of rb_03_23_72525_116778__t000__ab_robetta_IGNORE_THE_REST_474917_815_0_0
18-Apr-2017 04:05:48 [rosetta@home] Sending scheduler request: To report completed tasks.
18-Apr-2017 04:05:48 [rosetta@home] Reporting 1 completed tasks
18-Apr-2017 04:05:48 [rosetta@home] Not requesting tasks: don't need
18-Apr-2017 04:05:51 [rosetta@home] Scheduler request completed
https://boinc.bakerlab.org/rosetta/result.php?resultid=910050184


Why are some tasks uploaded successfully after days?


Now I have 3 remaining (stuck) tasks.
4) Message boards : Number crunching : Stuck on uploading is a new problem? (Message 81479)
Posted 17 Apr 2017 by Luigi R.
Post:
By the way, I'm still not convinced it isn't Windows-10-specific. Yes, there was at least one report of a similar problem on Linux, but maybe it was different. I ran one of my Linux boxes all day yesterday without getting a sticker, and my Mac remains sticker free. Firing up a different Linux box today and will let it run all day (but that's including a major upgrade, which confuses all issues).

My host (4 stuck tasks now) runs Xubuntu 14.04.5, Linux kernel 3.13.0-116-generic.
5) Message boards : Number crunching : Stuck on uploading is a new problem? (Message 81473)
Posted 17 Apr 2017 by Luigi R.
Post:
A little update.

One stuck task was miraculously uploaded yesterday.

16-Apr-2017 17:15:48 [rosetta@home] Started upload of 14dslfv5_14re4np_gb_0037_0001_30_0002_SAVE_ALL_OUT_480050_322_0_0
16-Apr-2017 17:16:02 [rosetta@home] Finished upload of 14dslfv5_14re4np_gb_0037_0001_30_0002_SAVE_ALL_OUT_480050_322_0_0
16-Apr-2017 17:16:06 [rosetta@home] Sending scheduler request: To report completed tasks.
16-Apr-2017 17:16:06 [rosetta@home] Reporting 1 completed tasks
16-Apr-2017 17:16:06 [rosetta@home] Not requesting tasks: don't need
16-Apr-2017 17:16:09 [rosetta@home] Scheduler request completed
https://boinc.bakerlab.org/rosetta/result.php?resultid=910051019


Now I have 4 tasks still stuck in the uploading state.
You can see that they break the "8 KB rule".
https://s28.postimg.org/43l918fy5/rosetta_stuck_tasks.png
Maybe 3 of the tasks stalled 4 times, which would explain why the uploaded size is 32 KB (= 4 * 8 KB) and not 8 KB.
6) Message boards : Number crunching : Stuck on uploading is a new problem? (Message 81455)
Posted 16 Apr 2017 by Luigi R.
Post:
I'm running LHC@Home, WCG and NumberFields too. No problems with these projects.
7) Message boards : Number crunching : Stuck on uploading is a new problem? (Message 81441)
Posted 15 Apr 2017 by Luigi R.
Post:
5 tasks stuck on uploading here.



client_state.xml

<file>
    <name>rb_03_23_72525_116778__t000__ab_robetta_IGNORE_THE_REST_474917_815_0_0</name>
    <nbytes>530178.000000</nbytes>
    <max_nbytes>25000000.000000</max_nbytes>
    <md5_cksum>221c7cf702ff15910a96060fed236335</md5_cksum>
    <status>1</status>
    <upload_url>http://srv1.bakerlab.org/rosetta_cgi/file_upload_handler</upload_url>
    <persistent_file_xfer>
        <num_retries>11</num_retries>
        <first_request_time>1492170993.617707</first_request_time>
        <next_request_time>1492253144.919740</next_request_time>
        <time_so_far>2552.406322</time_so_far>
        <last_bytes_xferred>32768.000000</last_bytes_xferred>
        <is_upload>1</is_upload>
    </persistent_file_xfer>
</file>

<file>
    <name>rb_03_23_72525_116778__t000__4_C1_SAVE_ALL_OUT_IGNORE_THE_REST_474917_259_0_0</name>
    <nbytes>887888.000000</nbytes>
    <max_nbytes>25000000.000000</max_nbytes>
    <md5_cksum>33e5d718ef03b6c814fecaa4d08c9b81</md5_cksum>
    <status>1</status>
    <upload_url>http://srv4.bakerlab.org/rosetta_cgi/file_upload_handler</upload_url>
    <persistent_file_xfer>
        <num_retries>11</num_retries>
        <first_request_time>1492169234.382923</first_request_time>
        <next_request_time>1492257568.171602</next_request_time>
        <time_so_far>2357.694327</time_so_far>
        <last_bytes_xferred>207.000000</last_bytes_xferred>
        <is_upload>1</is_upload>
    </persistent_file_xfer>
</file>

<file>
    <name>UN-NM_C4Yang_000006_2L8HC4-12_DHR32_0019.pdb_C4Yang_17_04_20_47_25_localDocking_9_SAVE_ALL_OUT_479492_23_0_0</name>
    <nbytes>22502.000000</nbytes>
    <max_nbytes>50000000.000000</max_nbytes>
    <md5_cksum>db8309e5c372f565885d1075ac2b9683</md5_cksum>
    <status>1</status>
    <upload_url>http://srv4.bakerlab.org/rosetta_cgi/file_upload_handler</upload_url>
    <persistent_file_xfer>
        <num_retries>8</num_retries>
        <first_request_time>1492198192.963454</first_request_time>
        <next_request_time>1492258150.352014</next_request_time>
        <time_so_far>2485.533146</time_so_far>
        <last_bytes_xferred>22502.000000</last_bytes_xferred>
        <is_upload>1</is_upload>
    </persistent_file_xfer>
</file>

<file>
    <name>3566f810a5e0096440dc8f17796115d2_eehee_pd1-docking_CancerImmunotherapy_17_04_13_32_36_globalDocking_4_SAVE_ALL_OUT_478149_7_0_0</name>
    <nbytes>99159.000000</nbytes>
    <max_nbytes>50000000.000000</max_nbytes>
    <md5_cksum>a72161808b6852c6bb6f86c8fc85619f</md5_cksum>
    <status>1</status>
    <upload_url>http://srv3.bakerlab.org/rosetta_cgi/file_upload_handler</upload_url>
    <persistent_file_xfer>
        <num_retries>5</num_retries>
        <first_request_time>1492214322.948390</first_request_time>
        <next_request_time>1492248107.442623</next_request_time>
        <time_so_far>2275.530723</time_so_far>
        <last_bytes_xferred>32768.000000</last_bytes_xferred>
        <is_upload>1</is_upload>
    </persistent_file_xfer>
</file>

<file>
    <name>14dslfv5_14re4np_gb_0037_0001_30_0002_SAVE_ALL_OUT_480050_322_0_0</name>
    <nbytes>337741.000000</nbytes>
    <max_nbytes>50000000.000000</max_nbytes>
    <md5_cksum>8b05674e7048a0d3632f82d93a4d9571</md5_cksum>
    <status>1</status>
    <upload_url>http://srv1.bakerlab.org/rosetta_cgi/file_upload_handler</upload_url>
    <persistent_file_xfer>
        <num_retries>5</num_retries>
        <first_request_time>1492243267.332962</first_request_time>
        <next_request_time>1492251204.143131</next_request_time>
        <time_so_far>1553.738969</time_so_far>
        <last_bytes_xferred>32768.000000</last_bytes_xferred>
        <is_upload>1</is_upload>
    </persistent_file_xfer>
</file>
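
If you want to pull the same fields out of your own client_state.xml, here is a minimal Python sketch. Assumptions: it only reads the tags visible in the entries above, the <root> wrapper and the function name pending_uploads are made up for the example, and the sample is a shortened copy of the first entry. A real client_state.xml contains much more than <file> elements.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Shortened copy of the first <file> entry from the post, wrapped in a
# hypothetical <root> element so that the fragment parses on its own.
SAMPLE = """
<root>
<file>
    <name>rb_03_23_72525_116778__t000__ab_robetta_IGNORE_THE_REST_474917_815_0_0</name>
    <nbytes>530178.000000</nbytes>
    <upload_url>http://srv1.bakerlab.org/rosetta_cgi/file_upload_handler</upload_url>
    <persistent_file_xfer>
        <num_retries>11</num_retries>
        <next_request_time>1492253144.919740</next_request_time>
        <last_bytes_xferred>32768.000000</last_bytes_xferred>
        <is_upload>1</is_upload>
    </persistent_file_xfer>
</file>
</root>
"""


def pending_uploads(xml_text):
    """Return one dict per <file> element that has a pending upload transfer."""
    rows = []
    for file_el in ET.fromstring(xml_text).iter("file"):
        xfer = file_el.find("persistent_file_xfer")
        if xfer is None or xfer.findtext("is_upload") != "1":
            continue  # not an upload in progress
        rows.append({
            "name": file_el.findtext("name"),
            # Host part of the upload URL, e.g. srv1.bakerlab.org
            "server": file_el.findtext("upload_url").split("/")[2],
            "nbytes": int(float(file_el.findtext("nbytes"))),
            "last_bytes": int(float(xfer.findtext("last_bytes_xferred"))),
            "num_retries": int(xfer.findtext("num_retries")),
            # The timestamps in the file are Unix epoch seconds.
            "next_try_utc": datetime.fromtimestamp(
                float(xfer.findtext("next_request_time")), tz=timezone.utc),
        })
    return rows


for row in pending_uploads(SAMPLE):
    print(row["server"], "retries:", row["num_retries"],
          "bytes:", row["last_bytes"], "/", row["nbytes"],
          "next try:", row["next_try_utc"].isoformat(timespec="seconds"))
```

Run over the five entries above, this would make it easier to see which upload server each stuck file points to and how many retries it has had.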
8) Message boards : Number crunching : Computation Error (Message 80299)
Posted 27 Jun 2016 by Luigi R.
Post:
Sorry, I don't know. It's weird that some tasks succeeded too.
9) Message boards : Number crunching : Rosetta Mini for Android is not available for your type of computer. (Message 80297)
Posted 27 Jun 2016 by Luigi R.
Post:
I can meet the deadlines.

That's good, but a bit surprising if you hold a 7-day buffer but half the tasks have a deadline of 2 or 5 days rather than the more usual 14 days


A 2-day deadline is rare.
5 days is enough to complete 240 4-hour tasks.
If I set a 7-day work queue, it's not my fault if the BOINC client downloads work that expires in 5 days. In fact, very few tasks were due to expire in 5 days.
As a last resort, if I wanted to meet the deadlines and couldn't, I could switch to the 1-hour configuration and update preferences. Every task started after that would get cpu_run_time_pref=3600, so in my case the time required to complete the tasks would be divided by 4.
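
To put numbers on that, a small Python sketch of the arithmetic. The post does not say how many threads the host runs, so the result is only what the figures imply; the function name is made up for the example.

```python
def concurrent_tasks_needed(n_tasks, hours_per_task, deadline_days):
    """Average number of tasks that must run in parallel, 24/7, to finish in time."""
    cpu_hours = n_tasks * hours_per_task   # total CPU time to spend
    wall_hours = deadline_days * 24        # wall-clock time available
    return cpu_hours / wall_hours


four_hour = concurrent_tasks_needed(240, 4, 5)  # 960 CPU-hours in 120 hours
one_hour = concurrent_tasks_needed(240, 1, 5)   # 240 CPU-hours in 120 hours
print(four_hour, one_hour, four_hour / one_hour)
```

With 4-hour tasks the 240 tasks need 8 running at once for the full 5 days; with the 1-hour setting the same queue needs a quarter of that.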

Finally, I don't think my '24/7 effort' is comparable to the server work needed to generate tasks, and in my opinion a 5-day deadline doesn't mean "hurry up!". Some administrators of other projects also say "feel free to abort tasks if you need to".
I don't know whether participating this way could harm the Rosetta@home project, but I think not.
10) Message boards : Number crunching : Stuck on Uploading (Message 80290)
Posted 25 Jun 2016 by Luigi R.
Post:
I am running BOINC on Linux machines headless. After updating /etc/hosts with 128.95.160.145 srv1.bakerlab.org they still weren't able to upload recent work. I control BOINC through boinctui but couldn't find a way to force retrying of upload. The solution I found was to restart BOINC after the hosts file edit, then the uploads retried, and successfully uploaded. On a recent Debian system or derivative (Ubuntu and more), BOINC is restarted from the terminal by running sudo systemctl restart boinc.

Yes, on Xubuntu you don't need to restart BOINC for the hosts changes to take effect, and I'm not running BOINC as a service.

You could have tried
$boinccmd --get_file_transfers

Then
$boinccmd --file_transfer http://boinc.bakerlab.org/rosetta $filename retry

where $boinccmd is your boinccmd command, including its path, and $filename is the result you want to retry uploading.

Or simply
$boinccmd --set_network_mode never
$boinccmd --set_network_mode auto

and the retry should happen automatically.

See boinccmd.
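
If there are many stuck files, the same commands can be generated with a short script. This is only a sketch: it builds and prints the command lines from the steps above rather than running them, the function names are made up, and the filename is just an example taken from another post. To actually execute a command, pass each list to subprocess.run(argv, check=True).

```python
import shlex

PROJECT_URL = "http://boinc.bakerlab.org/rosetta"


def retry_commands(project_url, filenames, boinccmd="boinccmd"):
    """Build one `--file_transfer URL FILE retry` argument list per file."""
    return [[boinccmd, "--file_transfer", project_url, name, "retry"]
            for name in filenames]


def network_toggle_commands(boinccmd="boinccmd"):
    """Build the never/auto pair that should make the client retry by itself."""
    return [[boinccmd, "--set_network_mode", "never"],
            [boinccmd, "--set_network_mode", "auto"]]


# Example filename only; take the real ones from --get_file_transfers.
stuck = ["rb_03_23_72525_116778__t000__ab_robetta_IGNORE_THE_REST_474917_815_0_0"]

for argv in retry_commands(PROJECT_URL, stuck) + network_toggle_commands():
    # shlex.join (Python 3.8+) quotes arguments for safe copy and paste.
    print(shlex.join(argv))
```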


Without making any changes to any hosts files and simply waiting (While still crunching at least), all my completed tasks finally uploaded and I got credit for them. Looks like they fixed it.

That solution was recommended for people with tasks close to their deadline. Some robetta (rb_*) tasks were due to expire on 24/06.
11) Message boards : Number crunching : Computation Error (Message 80289)
Posted 25 Jun 2016 by Luigi R.
Post:
It looks like there is no db_set3_7res_android_d_c.20.10_0001_SAVE_ALL_OUT_344080_2094_1_0 file. Maybe no read permissions for some reason?
12) Message boards : Number crunching : Stuck on Uploading (Message 80275)
Posted 24 Jun 2016 by Luigi R.
Post:
99% of the users will forget that they have made this change an will later have another problems only caused by this solution!

Maybe not. Both servers will be online, so it may make no difference in the future. Forgetful users could edit the hosts file only for the upload and comment that line out right after the upload finishes.

I agree we should not do any special configuration. We should run BOINC only.

Not our problem, but we want our work to get validated too.
13) Message boards : Number crunching : Stuck on Uploading (Message 80273)
Posted 24 Jun 2016 by Luigi R.
Post:
..., we're all on the same team here man :)

No! That is wrong, they need us, we don't.
They get money for the job. I spend money, for hardware and for power. I spend my spare time for maintenance of my systems to keep them always crunching.
Later, in an hopefully no so far future, if the first results come to the market, pay i again to get the medicine or whatever.

I see your point, but, you know, BOINC has plenty of medical projects. If Rosetta@home is not always reliable or doesn't meet your demands, you can choose another project to run alongside it or to replace it.

For example, I don't run Rosetta@home tasks very much because of their inefficiency, but that is off-topic. Anyway, the 3.73 app seems more CPU-intensive to me.
14) Message boards : Number crunching : Stuck on Uploading (Message 80272)
Posted 24 Jun 2016 by Luigi R.
Post:
Change of hosts.txt is a stupid idea! There should be another way.

No, it's not stupid. It worked fine for me. I uploaded 14 4-hour tasks and got them validated; that's 56 hours of computing.

On Linux the file is /etc/hosts.

Replace of "srv1.bakerlab.org" in client_state.xml is useless. Whenever i restart the boinc manager is it again there. I can't find from where boinc restore this.
i have it also replace in client_state_prev.xml.

Yep, that is what I was worried about.

Thanks your lazy guys have i now two WU's lost! 24 hours of work for nothing.

See above.
15) Message boards : Number crunching : Stuck on Uploading (Message 80267)
Posted 24 Jun 2016 by Luigi R.
Post:
Thanks Timo! I just tried it and it works for uploads.

Here is the line to add to Windows hosts file to work around the server that is not responding. If you don't know what to do with this information, it would be best to just wait for the issue to be resolved on the server side.

128.95.160.145 srv1.bakerlab.org

Thanks, it works on Linux *buntu too.

Put that line in /etc/hosts.
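
For those who worry about forgetting the change, here is a rough Python sketch of adding the line and commenting it out again once the uploads are through. It works on the text of a hosts file rather than on /etc/hosts itself, so writing the result back (as root) is left to you; all three function names are made up for the example.

```python
def has_entry(hosts_text, hostname):
    """True if an active (uncommented) line maps `hostname` to an address."""
    for line in hosts_text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments, split on spaces
        if len(fields) >= 2 and hostname in fields[1:]:
            return True
    return False


def add_entry(hosts_text, ip, hostname):
    """Append 'ip hostname' unless an active entry for the host already exists."""
    if has_entry(hosts_text, hostname):
        return hosts_text
    if hosts_text and not hosts_text.endswith("\n"):
        hosts_text += "\n"
    return hosts_text + f"{ip} {hostname}\n"


def comment_out(hosts_text, hostname):
    """Prefix every active line that maps `hostname` with '# '."""
    lines = []
    for line in hosts_text.splitlines():
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and hostname in fields[1:]:
            line = "# " + line
        lines.append(line)
    return "\n".join(lines) + "\n"


original = "127.0.0.1 localhost\n"
patched = add_entry(original, "128.95.160.145", "srv1.bakerlab.org")
reverted = comment_out(patched, "srv1.bakerlab.org")
print(patched)
print(reverted)
```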
16) Message boards : Number crunching : Stuck on Uploading (Message 80265)
Posted 24 Jun 2016 by Luigi R.
Post:
What about manually modifying client_state.xml? I'm wondering whether it would be safe, or have any effect, to do something like...

***PLEASE DON'T DO THIS***
Replace
<upload_url>http://srv1.bakerlab.org/rosetta_cgi/file_upload_handler</upload_url>

with
<upload_url>http://srv4.bakerlab.org/rosetta_cgi/file_upload_handler</upload_url>
17) Message boards : Number crunching : Stuck on Uploading (Message 80232)
Posted 23 Jun 2016 by Luigi R.
Post:
Yes, I answered in your previous thread.

https://boinc.bakerlab.org/rosetta/forum_thread.php?id=6849&nowrap=true#80231
18) Message boards : Number crunching : Stuck on "Ready to Report" (Message 80231)
Posted 23 Jun 2016 by Luigi R.
Post:
My host is currently not uploading any results.

23-Jun-2016 11:45:49 [rosetta@home] Started upload of FFD__462b8a0655b364e9452ad1a759651ef7_abinitioDocking_16_06_16_35_36_globalDocking_4_SAVE_ALL_OUT_383173_1_0_0
23-Jun-2016 11:47:49 [---] Project communication failed: attempting access to reference site
23-Jun-2016 11:47:49 [rosetta@home] Temporarily failed upload of FFD__462b8a0655b364e9452ad1a759651ef7_abinitioDocking_16_06_16_35_36_globalDocking_4_SAVE_ALL_OUT_383173_1_0_0: transient HTTP error
23-Jun-2016 11:47:49 [rosetta@home] Backing off 04:19:42 on upload of FFD__462b8a0655b364e9452ad1a759651ef7_abinitioDocking_16_06_16_35_36_globalDocking_4_SAVE_ALL_OUT_383173_1_0_0
23-Jun-2016 11:47:51 [---] Internet access OK - project servers may be temporarily down.
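
To see when the client will try again, the back-off can be added to the log timestamp. A minimal Python sketch, assuming the line format shown above and that the next attempt comes no earlier than timestamp plus back-off; the function name is made up, and %b month parsing assumes an English/C locale.

```python
import re
from datetime import datetime, timedelta

# "Backing off" line copied from the log above.
LINE = ("23-Jun-2016 11:47:49 [rosetta@home] Backing off 04:19:42 on upload of "
        "FFD__462b8a0655b364e9452ad1a759651ef7_abinitioDocking_16_06_16_35_36_"
        "globalDocking_4_SAVE_ALL_OUT_383173_1_0_0")

PATTERN = re.compile(
    r"^(\d{2}-\w{3}-\d{4} \d{2}:\d{2}:\d{2}) \[[^\]]*\] "
    r"Backing off (\d+):(\d{2}):(\d{2}) on upload of (\S+)$")


def next_attempt(line):
    """Return (filename, earliest retry time) for a back-off line, else None."""
    match = PATTERN.match(line)
    if match is None:
        return None
    stamp, hours, minutes, seconds, filename = match.groups()
    logged = datetime.strptime(stamp, "%d-%b-%Y %H:%M:%S")
    backoff = timedelta(hours=int(hours), minutes=int(minutes),
                        seconds=int(seconds))
    return filename, logged + backoff


name, when = next_attempt(LINE)
print(when.strftime("%d-%b-%Y %H:%M:%S"), name)
```

For the line above this gives 16:07:31 on the same day, more than four hours after the failed attempt.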
19) Message boards : Number crunching : Rosetta Mini for Android is not available for your type of computer. (Message 80228)
Posted 23 Jun 2016 by Luigi R.
Post:
I can meet the deadlines.
20) Message boards : Number crunching : Rosetta Mini for Android is not available for your type of computer. (Message 80221)
Posted 22 Jun 2016 by Luigi R.
Post:
Solution: Drop the support for smartphones and tablets and concentrate on enabling AVX2. This should have a much higher priority.

Totally agree.


Anyway, a temporary "solution":
-block work requests for all projects except Rosetta
-increase the work queue (e.g. to 7 days)
-flood the Rosetta server for 2 hours (one request every 4 minutes)


Now I have more than 400 tasks and my host will not run dry.





©2019 University of Washington
http://www.bakerlab.org