Posts by Tom M

1) Questions and Answers : Unix/Linux : No new tasks for weeks, no Virtualbox (Message 105802)
Posted 3 Apr 2022 by Tom M
Post:
I've looked around off and on for this a bit. I want to make sure that requiring VirtualBox for tasks is the expected behavior. There are a few threads on this matter, e.g. https://boinc.bakerlab.org/rosetta/forum_thread.php?id=14814#103235. If there is no way around this, could someone please provide directions for the 'thinnest' way to add VirtualBox on Linux? Is there a way to provide VirtualBox without maintaining another OS instance?



https://www.virtualbox.org/wiki/Linux_Downloads

The Ubuntu 19 version has installed on Ubuntu 20 and I am currently downloading Python tasks. Will follow up once it starts crunching.

Tom M
2) Message boards : Cafe Rosetta : Local BOINC Meetup (Message 102137)
Posted 28 Jun 2021 by Tom M
Post:
Hi,

I like to talk with other people about BOINC projects.
I like to talk face-to-face.
I live in North East Kansas, USA

Is there anyone nearby?

Tom M
3) Message boards : Number crunching : Should you run with some threads "idling" or not? (Message 96219)
Posted 7 May 2020 by Tom M
Post:
My previous experience with BOINC projects is that you need to run fewer than 100% of the CPU threads to maximize production.
This was always on machines that mixed CPU and GPU crunching.
You needed at least 1 thread clear, and up to 4 threads clear on CPUs with larger core counts.
That can translate into 75% (4 cores), 90%, or in one of my cases 87.5%.
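
For anyone wanting to redo that arithmetic for their own box, here is a minimal sketch. The function name is made up for illustration, and the 20-thread case is a hypothetical machine chosen only because it lands on 90%; the result is the number you would type into the "use at most N% of the CPUs" computing preference.

```python
def cpu_percent_for(total_threads: int, free_threads: int) -> float:
    """Percentage of CPU threads to allow so that `free_threads` stay idle.

    Hypothetical helper, not part of BOINC.
    """
    if not 0 <= free_threads < total_threads:
        raise ValueError("free_threads must be >= 0 and < total_threads")
    return 100.0 * (total_threads - free_threads) / total_threads


# (total threads, threads kept clear); the 20-thread box is hypothetical.
for total, free in [(4, 1), (20, 2), (32, 4)]:
    print(f"{total} threads, {free} clear -> {cpu_percent_for(total, free):.1f}%")
```

The 32-thread line corresponds to running 28 of 32 threads.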

Does this "rule of thumb" change on systems that do only CPU-based crunching?

I don't think it does. But I could add up to 4 more threads crunching on one Rig if CPU-based crunching doesn't have the same experience as CPU/GPU crunching.

Thank you.
Tom M
4) Message boards : Number crunching : Discussion on increasing the default run time (Message 96215)
Posted 7 May 2020 by Tom M
Post:
If I am not confused, the main reason to "increase the default run time" is to decrease the download/upload pressure on the servers?

I am a newbie on how R@H goes about things, so I may not be understanding it right.

If the servers are "still" getting hammered too much, how about changing the "minimum" to, say, 8 hours?
And enforcing it so nobody is "grandfathered" in.

All this presumes reliable checkpointing at the client end so part-time crunchers are not impacted.
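
To put rough numbers on the server-pressure idea, here is a back-of-the-envelope sketch. It assumes every task runs for exactly the target run time and causes one download and one upload; both are simplifications of mine, not something the project has confirmed, and the function name is hypothetical.

```python
def tasks_per_day(threads: int, run_time_hours: float) -> float:
    """Tasks a host completes per day if each thread runs back-to-back tasks.

    Assumes every task lasts exactly `run_time_hours` (a simplification).
    """
    if threads < 0 or run_time_hours <= 0:
        raise ValueError("threads must be >= 0 and run_time_hours > 0")
    return threads * 24.0 / run_time_hours


# Compare a short target run time against longer ones for an 8-thread host.
for hours in (2, 8, 12):
    print(f"{hours:>2} h run time -> {tasks_per_day(8, hours):.0f} tasks/day")
```

Under those assumptions, going from 2 hours to 8 hours cuts the number of transfers per host to a quarter.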

Tom M
5) Message boards : Number crunching : If You Don't Know Where to Put it, Post it here. (Message 96184)
Posted 6 May 2020 by Tom M
Post:
On the Leader board...
It is amazing the number of server class/HEDT systems with huge core counts that are running here.

Tom M
I’ve only got a measly 88 threads running between 8 machines :-)

I really need to get myself one of those 128 thread machines.


I moved my 32-thread box (AMD 3950X) [running 28] to a new X570 MB in an "older" Cube case. https://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=4308859

I am running maybe 18 threads on Rosetta plus some odds and ends of 1 thread per project, the highly variable MindModeling (up to 4 threads), and maybe 4-6 threads on WCG.

I did an estimate and came to the conclusion there was no hope even if I ran just R@H of getting my RAC up there very high. Especially since I don't run Turbo to save electricity.

So I do what I can do. I have maybe 2-6 threads of R@H running on a couple of other GPU systems (5 and 6 GPUs right now, 8c/16t).

Tom M
6) Message boards : Number crunching : If You Don't Know Where to Put it, Post it here. (Message 96146)
Posted 6 May 2020 by Tom M
Post:
On the Leader board...
Until you get to #76 (8 threads) and #131 (32 threads) you don't see ANY systems that are running fewer than 48-64 threads.

It is amazing the number of server class/HEDT systems with huge core counts that are running here.

Tom M
7) Message boards : Number crunching : Quite a few signal 11 errors on a Linux host - what does it mean? (Message 96106)
Posted 5 May 2020 by Tom M
Post:
Hello.
Browsing through my machines on the site today and came across this. Not sure what the logs mean.
[url]https://boinc.bakerlab.org/rosetta/results.php?hostid=4295358&offset=0&show_names=0&state=6&appid=

Log for a task:
<message>
process got signal 11
</message>
<stderr_txt>
command: ../../projects/boinc.bakerlab.org_rosetta/rosetta_4.20_x86_64-pc-linux-gnu -run:protocol jd2_scripting -parser:protocol jhr_boinc_v4.xml @flags -in:file:silent Junior_HalfRoid_design5_COVID-19_SAVE_ALL_OUT_IGNORE_THE_REST_1dy7ly8k.silent -in:file:silent_struct_type binary -silent_gz -mute all -out:file:silent_struct_type binary -out:file:silent default.out -in:file:boinc_wu_zip Junior_HalfRoid_design5_COVID-19_SAVE_ALL_OUT_IGNORE_THE_REST_1dy7ly8k.zip @Junior_HalfRoid_design5_COVID-19_SAVE_ALL_OUT_IGNORE_THE_REST_1dy7ly8k.flags -nstruct 10000 -cpu_run_time 28800 -boinc:max_nstruct 20000 -checkpoint_interval 120 -database minirosetta_database -in::file::zip minirosetta_database.zip -boinc::watchdog -boinc::cpu_run_timeout 36000 -run::rng mt19937 -constant_seed -jran 1075497
Using database: database_357d5d93529_n_methyl/minirosetta_database

</stderr_txt>

Is this a hardware error? Temperatures are fine and the machine doesn't seem to be struggling for RAM. Unless it's using a version of BOINC which is too old?
I tried to update using ppa:costamagnagianfranco/boinc, but received an error. Might try again if this persists.

Any help appreciated. Hate wasting tasks and resources like this.
[/url]


Which computer?
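
For reference, signal 11 on Linux is SIGSEGV (a segmentation fault), which can come from a software bug as well as from bad hardware, so the number alone does not settle the question. A quick way to check the number-to-name mapping, assuming Python 3 is available on the host (the helper name is just for illustration):

```python
import signal


def describe_signal(number: int) -> str:
    """Return the symbolic name the local platform gives to a signal number."""
    return signal.Signals(number).name


# The task log above reports "process got signal 11".
print(describe_signal(11))
```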
8) Message boards : Number crunching : Minirosetta 3.73-3.78 (Message 95955)
Posted 3 May 2020 by Tom M
Post:
I thought MiniRosetta had expired with all tasks returned, but I just got one and I see there are 7000 new tasks out there <sigh>


MiniRosetta has a significantly smaller memory size and seems to schedule for a shorter run when I get those kinds of tasks. Is that what it is aimed at? Smaller/Slower machines?

Tom M
9) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 95572)
Posted 29 Apr 2020 by Tom M
Post:
OK, I may have an optimistic explanation of this: The completion percentages shown in the BOINC manager might be wildly inaccurate.

I had "very slow" WU's and "less slow" WU's. I killed a lot of very slow ones and let the less slow run their course. Surprise: the less slow tasks, announced to take 6 hours or more, have stopped at around 4 and a half hours as promised. They were at 85-90% completion and they suddenly jumped to 100% with the correct run time.

Let's hope it will be the same with the "very slow" ones. I will only know tomorrow.


Sounds reasonable. Just wait for it :)

Tom M
10) Message boards : Number crunching : Threadripper and Ryzen (and EYPC) (Message 95346)
Posted 25 Apr 2020 by Tom M
Post:
I am hoping to get up well past 50,000 but I am not running all my CPU threads on Rosetta.


Make that up to 20,000 RAC or so. I recently did a calculation based on the credits I was getting and came out with an estimated 19,xxxx
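
A rough sketch of that kind of estimate, for anyone who wants to try it. It assumes RAC behaves like an exponentially weighted average with a one-week half-life, updated once per day with a constant daily credit. The once-per-day update and the 20,000 figure are my simplifications for illustration, not measurements.

```python
HALF_LIFE_DAYS = 7.0  # assumption: RAC half-life of one week


def simulate_rac(daily_credit: float, days: int, start_rac: float = 0.0) -> float:
    """Exponentially weighted average of a constant daily credit.

    Simplified model: one update per day, same credit every day.
    """
    weight = 0.5 ** (1.0 / HALF_LIFE_DAYS)  # fraction of old RAC kept each day
    rac = start_rac
    for _ in range(days):
        rac = rac * weight + (1.0 - weight) * daily_credit
    return rac


# Hypothetical host earning a steady 20,000 credits per day.
for days in (7, 14, 28, 56):
    print(f"after {days:>2} days: RAC ~ {simulate_rac(20000.0, days):,.0f}")
```

In this model the RAC climbs halfway to the daily credit in the first week and then levels off at the daily credit, so the steady-state RAC is just the average credit earned per day.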

Tom M
11) Message boards : Number crunching : Tells us your thoughts on granting credit for large protein, long-running tasks (Message 95345)
Posted 25 Apr 2020 by Tom M
Post:
I expect you do not want to end up with a bias toward 1GB or 4GB jobs, while both are needed. For the clients that can handle the 4GB jobs the bias should be neutral. Unless you expect a tendency towards more 4GB jobs with respect to 1 GB jobs, or the other way around, then you want a bias.
That's the thinking.
Overall, the effect should be neutral. People shouldn't lose out for processing these larger RAM requirement Tasks, and they shouldn't get a boost either. All the work is important, so if a Task stops 2 or more others from being processed at that time, it needs to offset that loss in production.

Credits can't buy you a toaster, but they can let you see how you are doing, and how much you have done to help Rosetta.


+1
12) Message boards : Number crunching : The most efficient cruncher rig possible (Message 95211)
Posted 23 Apr 2020 by Tom M
Post:
I know that some/many bios offer a "pause on bios error" feature like "keyboard missing". So it seems reasonable that if you were to disable the "boot up errors pause" then a computer would boot headless (without a monitor or keyboard) but it would need by default to accept some kind of remote desktop connection.

Tom M
13) Message boards : Number crunching : The most efficient cruncher rig possible (Message 95132)
Posted 22 Apr 2020 by Tom M
Post:
Do you think this one could do the trick for now, costing $525 upfront?



I am told the 3700x is the "sweet spot" between price and performance for the Ryzen 3000 series.

You may want to buy an MB with the largest number of PCIe slots you can rather than just 3 slots. 5, sometimes 6, are very common for that chipset. You also want to confirm that the BIOS supports "Above 4G" in case you start wanting to run more than 2-3 video cards.

And you will want at least one video card for setup as the other responder said.

Respectfully,
Tom M
14) Message boards : Number crunching : The most efficient cruncher rig possible (Message 94945)
Posted 19 Apr 2020 by Tom M
Post:
I did turn PBO off after I thought about it. Is that what you mean by turbo boost?


"Turbo-boost" is what I call any of the technologies by Intel and AMD that cause a cpu to run beyond its "normal" top rated speed. In the ASUS ROG Crosshair VII Hero motherboard I am using there is a place to disable PBO. And there are two other places to disable "cpu boost". I have them all disabled.

When in doubt run all the way through your bios to make sure you have it disabled "every place."

I also have the TDP on my AMD 3950X limited to 65 watts.

So for the most part my cpu is running about 3.3GHz or less.
It doesn't maximize production but it maximizes production at the minimum cost possible. (I think).

Tom M
15) Message boards : Number crunching : The most efficient cruncher rig possible (Message 94869)
Posted 19 Apr 2020 by Tom M
Post:

Yes, it's like a 3700x. The added benefit is it's quiet, at the lower threshold of my hearing and not irritating. A fan control program from the manufacturer of the motherboard, which I use to auto control fans, sets the fans at about 600-700 rpm when BOINC is running at 90%. At 100% CPU utilization the fan control program ramps up the fans and they get loud. Lots of fans in my case.


Running without "turbo-boost" also has the added benefit of using less electricity :)

Tom M
16) Message boards : Number crunching : Rosetta 4.1+ and 4.2+ (Message 94868)
Posted 19 Apr 2020 by Tom M
Post:
So are these extremely long run times only showing up on the "686" tasks?

Tom
17) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 94867)
Posted 19 Apr 2020 by Tom M
Post:
Hello, I'm a newbie to Rosetta and got things set up and running ok. In the last two days I've noticed my laptop running this app in an odd manner. Instead of running at 100% CPU, it fluctuates between 33% and 100%,


If you have Seti@Home mostly idling I would go to the S@H website and disable the "intel igpu" check box.

Generally running any crunching task on that part of the Intel cpu chip slows the entire system down significantly.

This is usually true for now. If/when Intel delivers on the planned upgrades to the iGPU, it will start behaving more like AMD's iGPU, but not yet.

Tom M
18) Message boards : Number crunching : If You Don't Know Where to Put it, Post it here. (Message 94673)
Posted 17 Apr 2020 by Tom M
Post:
Tom, if it doesn't work, I would empty the cache and try resetting the project, might work.


Just did that and it appears to be only downloading non-"686" tasks now for both mini and regular Rosetta@home.

Thank you for the idea.
I am still running "only" 8 threads while I run down the tasks on another cpu & gpu project that I have switched to gpu only.

OBTW, what was the "advantage" of the "686" version over the version I am running now?

Tom M
19) Message boards : Number crunching : If You Don't Know Where to Put it, Post it here. (Message 94631)
Posted 16 Apr 2020 by Tom M
Post:
Yes, I stand corrected. I was looking at the "Application version" section in the header, not the command string in the output section. The command string shows i686.

I am not certain why the cc_config doesn't seem to do what folks told me it would do.


After confirming I had that parameter in there I cycled the Boinc Manager/Clients to make sure the cc_config.xml file was read.
I am going to throttle my # of threads to 8 and my time to 2 hours and then allow new tasks again. I will take a look at the stuff after it downloads and starts running and see if it avoids the i686 tasks.
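
For anyone following along, this is roughly what I mean by "that parameter". The sketch below only builds the XML text; it assumes, as I understand the BOINC client configuration docs, that `<no_alt_platform>` lives inside the `<options>` section of cc_config.xml. The function name is made up, and where you save the file depends on your BOINC data directory, which I am not guessing at here.

```python
import xml.etree.ElementTree as ET


def build_cc_config(no_alt_platform: bool = True) -> str:
    """Build minimal cc_config.xml text with only the no_alt_platform option.

    Assumption: the option belongs under <cc_config><options>.
    """
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    flag = ET.SubElement(options, "no_alt_platform")
    flag.text = "1" if no_alt_platform else "0"
    ET.indent(root)  # pretty-print; needs Python 3.9 or newer
    return ET.tostring(root, encoding="unicode")


print(build_cc_config())
```

A real cc_config.xml usually has other options in it too, so merge this into the existing file rather than overwriting it, and have the client re-read the config files (or restart it) afterwards.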

My impression was the "mini" shows up more often when the short tasks are selected so I may be able to screen it out that way. Or at least reduce the time before the watchdog timer catches up with it.

Tom M
20) Message boards : Number crunching : If You Don't Know Where to Put it, Post it here. (Message 94629)
Posted 16 Apr 2020 by Tom M
Post:
The few WUs on this host that I see ran for 12 hours, using the Rosetta v4.15 x86_64-pc-linux-gnu application, and were all ended by the watchdog. I see the host has 32 CPUs and 32GB of memory. How many active R@h threads are running at the same time on this host?


18 threads with no pauses.

I have examined the tasks and they all seem to be "i686" tasks. I thought I had disabled those with the <no_alt_platform> option but apparently not.
---edit----------
Based on suggestion(s) from other threads I have aborted the "i686" tasks which were all 4.15 versions.
Now what? I have gone to NNT while I see what else I can try.
---edit---

Tom M





©2022 University of Washington
https://www.bakerlab.org