Posts by Aurum

1) Message boards : Number crunching : Rosetta Beta 6.00 (Message 108781)
Posted 26 Dec 2023 by Aurum
Post:
I use an app_config to run all my projects. I can set it to run just 1 beta WU; I've never tried setting max tasks to 0.
Right now I'm running Milky Way until Beta is more stable.

<max_concurrent>Zero</max_concurrent> = <max_concurrent>Infinity</max_concurrent>


I don't know what your settings are, but I'm getting nearly 100% valid tasks here on both my Windows and Linux PCs.
Me too. Send more betas.
My comment was meant to say that the syntax <max_concurrent>0</max_concurrent> is meaningless to BOINC; it imposes no limit at all.
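For reference, here is a minimal app_config.xml sketch for limiting the client to one beta task at a time. The app short name "rosetta_beta" is an assumption; check client_state.xml or the project's Applications page for the actual name.

<app_config>
    <app>
        <!-- "rosetta_beta" is an assumed short name; confirm it in client_state.xml. -->
        <name>rosetta_beta</name>
        <!-- 1 means at most one task at a time; 0 is read by BOINC as "no limit". -->
        <max_concurrent>1</max_concurrent>
    </app>
</app_config>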
2) Message boards : Number crunching : Rosetta Beta 6.00 (Message 108774)
Posted 20 Dec 2023 by Aurum
Post:
I use an app_config to run all my projects. I can set it to run just 1 beta WU; I've never tried setting max tasks to 0.
Right now I'm running Milky Way until Beta is more stable.

<max_concurrent>Zero</max_concurrent> = <max_concurrent>Infinity</max_concurrent>
3) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 107083)
Posted 4 Oct 2022 by Aurum
Post:
I was shocked at how many processes were running with GPU Python.
Anyway... I'll dump GPU Grid when it finishes and see how the system balances out after that.
PythonGPU works well if run right. Do not try to run it on a CPU with fewer than 32 threads; I've tried 24 threads and it's very slow.
Running 2 PythonGPU WUs and nothing else is best. The 2 PythonGPU WUs play well together and can share those CPU threads. If you try to run a different project alongside a PythonGPU WU you'll hit annoying quirks. Since most of the work is actually done on the CPU, it can get by with a less powerful GPU (e.g. a 1080) than the acemd4 WUs need. Here's the app_config I use on an i9-10980XE with a 3060 Ti:
<app_config>
<!-- i9-10980XE   18c36t   2x16=32 GB   L3 Cache 24.75 MB   3060 Ti -->
    <app>
        <name>PythonGPU</name>
        <plan_class>cuda1131</plan_class>
        <gpu_versions>
            <!-- Per-task budget: 32 CPU threads and half of the GPU. -->
            <cpu_usage>32</cpu_usage>
            <gpu_usage>0.5</gpu_usage>
        </gpu_versions>
        <!-- Never run more than 2 PythonGPU tasks at once. -->
        <max_concurrent>2</max_concurrent>
        <!-- Trust the app's own reported progress for remaining-time estimates. -->
        <fraction_done_exact/>
    </app>
</app_config>
4) Message boards : Number crunching : How do I change the data directory in Ubuntu? (Message 99903)
Posted 4 Dec 2020 by Aurum
Post:
tried changing the BOINC_DIR path
I don't know how to change the data directory but wouldn't mind learning how.
Have you tried this way?
https://boinc.berkeley.edu/forum_thread.php?id=9633&postid=56258#56258
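For what it's worth, here is a rough sketch of how I'd attempt it on the Ubuntu packaged client, assuming the service still reads BOINC_DIR from /etc/default/boinc-client; the new path /data/boinc-client is just an example:

sudo /etc/init.d/boinc-client stop
sudo cp -a /var/lib/boinc-client /data/boinc-client
sudo sed -i 's|^BOINC_DIR=.*|BOINC_DIR="/data/boinc-client"|' /etc/default/boinc-client
sudo /etc/init.d/boinc-client start

On a systemd-only install the service's WorkingDirectory may need to be overridden instead (sudo systemctl edit boinc-client).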
5) Message boards : Number crunching : Another instance of BOINC is running (Message 99902)
Posted 4 Dec 2020 by Aurum
Post:
Have you tried sudo /etc/init.d/boinc-client restart?
No shortage of WUs as I'm getting a regular flow.
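If that init script isn't present, newer Ubuntu installs manage the packaged client through systemd, so the equivalent (assuming the standard boinc-client unit) would be:

sudo systemctl restart boinc-client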
6) Message boards : Number crunching : 3832 new hosts per day? (Message 99893)
Posted 3 Dec 2020 by Aurum
Post:
The CPU benchmark is broken and produces very different values than it used to. Is that a BOINC issue or an Ubuntu issue?
When I upgraded from Linux Mint 19 to 20, BOINC went from 7.9.3 to 7.16.6.
Is it right now and was it wrong before, or was it right before and is it wrong now?
Anyone know anything about getting it fixed?
7) Message boards : Number crunching : 23.3 GB RAM per Rosetta WU (Message 99870)
Posted 2 Dec 2020 by Aurum
Post:
true_relax_SAVE_ALL_OUT_1044287_24
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=1164498828
true_relax_SAVE_ALL_OUT_1044287_54
true_relax_SAVE_ALL_OUT_1044287_62
true_relax_SAVE_ALL_OUT_1044288_60
8) Message boards : Number crunching : 23.3 GB RAM per Rosetta WU (Message 99856)
Posted 1 Dec 2020 by Aurum
Post:
true_relax_SAVE_ALL_OUT_1044287_54
true_relax_SAVE_ALL_OUT_1044287_62
true_relax_SAVE_ALL_OUT_1044288_60
9) Message boards : Number crunching : 23.3 GB RAM per Rosetta WU (Message 99852)
Posted 1 Dec 2020 by Aurum
Post:
Got another one. This is an E5-2697 v4 18c36t and it's using 21 of 32 GB RAM. It's the only WU left running and has also used 4.8 of 16 GB of swap. Aborted.
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=1164500044
true_relax_SAVE_ALL_OUT_1044287_62
true_relax_SAVE_ALL_OUT_1044288_60
10) Message boards : Number crunching : 23.3 GB RAM per Rosetta WU (Message 99840)
Posted 30 Nov 2020 by Aurum
Post:
Just happened to notice this big boy was at 98.6%, so I watched it finish. It hit 100% and I lost contact with Rig-33. When it came back, this WU was tagged as Aborted, but not by me.
I don't recall what they call that last step between 100% and Uploading; rollup, maybe. But it may have needed additional RAM, of which there was none, since it had already consumed everything that computer had except the 16 GB swap file, which I did not see it use.
Hopefully the responsible party will fix the bug.

I saw a memory leak mentioned. If it were a memory leak, wouldn't all WUs run by rosetta_4.20_x86_64-pc-linux-gnu suffer from it??? Is it possible to have a WU-specific memory leak???
11) Message boards : Number crunching : 23.3 GB RAM per Rosetta WU (Message 99826)
Posted 30 Nov 2020 by Aurum
Post:
It's at 62.6% and 7.5 hours. I'll let it run and see what happens.
Fine with me if they do a server abort.
12) Message boards : Number crunching : No predictor of the day? (Message 99823)
Posted 30 Nov 2020 by Aurum
Post:
It would be better if Baker had badges.
13) Message boards : Number crunching : 23.3 GB RAM per Rosetta WU (Message 99822)
Posted 30 Nov 2020 by Aurum
Post:
It gets better: my wingman UH UIT HPC errored out with 258 GB RAM. Looking for a new wingman.
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=1164499996
https://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=4011154
14) Message boards : Number crunching : 23.3 GB RAM per Rosetta WU (Message 99821)
Posted 30 Nov 2020 by Aurum
Post:
true_relax_SAVE_ALL_OUT_1044288_60_1 is using 100% of the RAM on its host, which has 24 GB.
When it arrived last night, 23 WUs were running on that computer. Then they started going into Suspended: Waiting for Memory mode one by one. This morning all 22 of the crowded-out WUs managed to complete, but no new ones can run and 45 WUs sit at Ready to Run.
Is this where Rosetta is headed, one WU per computer???
15) Message boards : Number crunching : Rosetta WU delivery out of control (Message 95806)
Posted 2 May 2020 by Aurum
Post:
It has nothing to do with WUs failing or the runtime estimate being wrong. I can crunch any Rosetta WU they send. Rosetta simply does not respect BOINC settings and downloads far too many WUs. The client-side fix is to set all computers to No New Work and abort a few thousand a day.
16) Message boards : Number crunching : L3 Cache is not a problem, CPU Performance Counter proves it (Message 94253)
Posted 12 Apr 2020 by Aurum
Post:
People said that the Rosetta application is not optimized and needs 5 MB of L3 cache for each running instance (though I don't really know where they got this number).
We observed it while running MIP at WCG. When we dedicated all threads to MIP, fewer results were completed in the same time. We verified this repeatedly for months. Then the project made a statement explaining:
"The short version is that Rosetta, the program being used by the MIP to fold the proteins on all of your computers, is pretty hungry when it comes to cache. A single instance of the program fits well in to a small cache. However, when you begin to run multiple instances there is more contention for that cache. This results in L3 cache misses and the CPU sits idle while we have to make a long trip to main memory to get the data we need. This behavior is common for programs that have larger memory requirements. It's also not something that we as developers often notice; we typically run on large clusters and use hundreds to thousands of cores in parallel on machines. Nothing seemed slower for us because we are always running in that regime."
https://www.worldcommunitygrid.org/forums/wcg/viewthread_thread,40374_offset,50#569786

I doubt it's been fixed unless Rosetta coders consciously went in and rewrote the offending code. Why won't Baker respond to this question???
17) Message boards : Number crunching : Rosetta WU delivery out of control (Message 94252)
Posted 12 Apr 2020 by Aurum
Post:
Last night Rosetta delivered 999 WUs to a quad-core computer, with a total estimated time to complete of 265 days. Its queue was set to 0.5/0.1 days. This problem has gotten so bad that I've had to set all my computers to accept no new work from Rosetta. Only when a computer runs dry do I let it download again, at which point it grabs far too much work and most of it has to be aborted.
I'm tired of babysitting this project. From now on I'll just let it send me hundreds of times more than I can crunch and they can sit there until they expire.
18) Message boards : News : Rosetta's role in fighting coronavirus (Message 93252)
Posted 3 Apr 2020 by Aurum
Post:
...long list of target proteins we're interested in for COVID-19 (some of which are COVID-19 itself).

I'm so confused. No idea what proteins you're talking about. Would love to hear more details.
Covid-19 is the disease.
SARS-CoV-2 is the virus.
NSP1, NSP2, NSP3, NSP4, NSP5, NSP6, NSP7, NSP8, NSP9, NSP10, NSP12, NSP13, NSP14, NSP15, NSP16, S, ORF3a, E, M, ORF6, ORF7a, ORF8, N & ORF10 are the proteins.
19) Message boards : News : Help in the fight against COVID-19! (Message 93248)
Posted 3 Apr 2020 by Aurum
Post:
Will you make any COVID-19 badges? ;-)


That would be nice.
20) Message boards : News : Help in the fight against COVID-19! (Message 93202)
Posted 3 Apr 2020 by Aurum
Post:
Dear Brian Coventry et alia,
1. Your custom protein binding the Spike protein looks like it will put neutralizing IgGs to shame!!! Very impressive!!! {As an aside, each dose of this protein therapeutic will have to be injected, as it could never survive transit through the stomach and upper GI tract, where the low pH and proteolytic enzymes would reduce it to amino acids.}

2. I'm laboring under the assumption that the Rosetta code mismanages the L3 cache and will bottleneck if too many Rosetta WUs are running simultaneously. I limited my use to one WU per 5 MB of L3 cache (a sample app_config.xml for this is sketched at the end of this post). Exceeding that limit slows the entire CPU by over 60%. This means that my fleet of Xeon E5s is running Rosetta at less than 20% capacity. Has this been fixed, and is it safe to run full force now???

3. I believe that your statement implies that all WUs are contributing to Covid-19 research regardless of whether you include "Covid-19" in the WU name and that minirosetta is doing Covid-19 work as well. Is that what you mean???

4. I don't know if I'm running ARM64 Rosetta apps or not. I see from your Applications page that there are numerous versions, but when I look at the Properties of a given WU I'm running it says version 3.76, 4.07, 4.08 or 4.12. What are the Linux-ARM platforms???

5. I suggest you tether these blocking proteins together like an IgM. This would more readily mark them for destruction. Also, it appears that your blocking protein is much larger than an IgG, so when it binds a Spike it also blocks neighboring Spikes (like an umbrella) from binding ACE2 and being invaginated.

{Please use black text on white background for your forum to maximize contrast. This faded gray text really strains my old eyes so I never read your forums. I would not have seen this announcement except that you were nice enough to send out a BOINC Notice. Thank you in advance :-}
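Regarding point 2, here is a sketch of one way to enforce the one-WU-per-5-MB-of-L3 rule with app_config.xml. The value 9 is just an example for a CPU with a 45 MB L3 cache, and <project_max_concurrent> needs a reasonably recent BOINC client:

<app_config>
    <!-- One Rosetta task per ~5 MB of L3 cache: 45 MB / 5 MB = 9 concurrent tasks. -->
    <project_max_concurrent>9</project_max_concurrent>
</app_config>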

