Posts by teacup_DPC

1) Message boards : Number crunching : The most efficient cruncher rig possible (Message 95426)
Posted 27 Apr 2020 by teacup_DPC
Post:
Interesting read, this topic. The opinions shared represent quite some experience, much of which I recognize.

When building a new rig dedicated to crunching for Rosetta I get caught, as have some others in this topic, by the following thoughts:

    Total cost of ownership is the combination of purchase price, resale value and cost of operation.
    Too much focus on efficiency can lead to buying expensive hardware that never earns back its investment.
    Too much focus on performance can lead to expensive operation because of cooling and power consumption.


I know the above statements are a bit clichéd and easy to say without any risk. I understand that.

Without wanting to call into question the sensible middle-of-the-road solutions mentioned here (Ryzen 3700X, Bronze PSU, no Christmas-tree motherboard but with decent VRMs), we still do not have any bearings regarding:

    the initial cost of the rig
    its power consumption
    its computational output (RAC)


It is easy to state an opinion like this, I know. Writing a message in a topic is not difficult or time-consuming; trying to get answers to the three questions above takes a lot of time. And we would all need to measure in exactly the same way to produce results that can be compared with someone else's, because only then can we get some unbiased answers. Don't get me wrong, I do not doubt the judgment and experience of the respondents in this topic at all, and I do not want to dispute their opinions. We should only try to give all those mostly valid opinions some foundation in numbers.

So if we would like some answers we should benchmark, benchmark and benchmark again. But before we can start benchmarking we should agree on what to measure and what information should be involved. That is a challenge in itself :) (and I think it deserves a separate topic as well). Benchmarking in a reproducible way is difficult, and do not be surprised if advancing insight later forces us to alter the benchmark itself.

A first step could be defining how to benchmark and how we can sensibly measure what we want to know. To get thinking started:

    RAC: how to measure it? How long must we run to get an average with enough resolution, and under what side conditions (what do we allow to run on the OS besides the calculations)?
    Power consumption: what meters to use (define a standard?), and what exactly to measure? Again, how long to measure to get an average (we could let the RAC period be leading here).
    The most difficult and chaotic variable: at what cost the hardware can be bought and sold economically. There are many dependencies: in which part of the world it is bought and sold, special deals, etc.


So do not start testing yet; we first need to agree on how to test. A rough sketch of the kind of comparison I have in mind follows below.
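To make this concrete, here is a minimal sketch, in Python, of how the three numbers above could be combined into a single comparison. All figures (purchase prices, resale values, power draws, RAC values, electricity price) are invented placeholders, not measurements:

```python
# Rough sketch of an efficiency comparison between two hypothetical rigs.
# Every number below is a placeholder, not a measurement.

def cost_of_ownership(purchase_eur, resale_eur, watts, hours, eur_per_kwh=0.22):
    """Purchase price minus expected resale value plus electricity cost."""
    energy_kwh = watts * hours / 1000.0
    return purchase_eur - resale_eur + energy_kwh * eur_per_kwh

def credits_per_euro(rac, hours, total_cost_eur):
    """Total credit earned over the period divided by total cost of ownership."""
    return rac * (hours / 24.0) / total_cost_eur

hours = 2 * 365 * 24  # two years of continuous crunching

for name, purchase, resale, watts, rac in [
    ("Ryzen 3700X rig", 900.0, 300.0, 140.0, 9000.0),  # assumed values
    ("Old i5 desktop",  150.0,  50.0, 110.0, 2500.0),  # assumed values
]:
    tco = cost_of_ownership(purchase, resale, watts, hours)
    print(f"{name}: TCO {tco:.0f} EUR, "
          f"{credits_per_euro(rac, hours, tco):.0f} credits/EUR")
```

Whatever metric we end up agreeing on, it only becomes meaningful once everyone feeds it numbers measured in the same way.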

2) Message boards : Number crunching : The most efficient cruncher rig possible (Message 95424)
Posted 27 Apr 2020 by teacup_DPC
Post:
https://www.cnx-software.com/2020/04/23/50-odroid-c4-raspberry-pi-4-competitor-combines-amlogic-s905x3-soc-with-4gb-ram/amp

Could this be a candidate for beating the efficiency of Ryzen?


In response: a bit earlier in the topic some things have already been said about this, but nothing conclusive yet. Please have a read, it's interesting enough.
3) Message boards : Number crunching : Tells us your thoughts on granting credit for large protein, long-running tasks (Message 95422)
Posted 27 Apr 2020 by teacup_DPC
Post:
Hi sangaku

I found your https://boinc.bakerlab.org/rosetta/forum_thread.php?id=13791#94266 thread and read some of its first posts. I liked its questioning approach, and will direct responses concerning what hardware to use to that topic.

Your Raspberry Pi 4 remark set me thinking. Without doing the math I got a vision of a stack of these things, each taking 2 or 3 threads. Being Dutch my financial domain, like yours, is euros, and a Pi 4 can be fetched in Holland for around 50-60 euros. Storage and a PSU for all those Pis should be approached in some clever combined way. I will read that topic completely now; probably the math will not add up, making a Pi 4 a no-go. But just fantasizing about that pile of Pis made my morning a good one, though that probably was not the aim of your post :|.

Thanks!
4) Message boards : Number crunching : Tells us your thoughts on granting credit for large protein, long-running tasks (Message 95343)
Posted 25 Apr 2020 by teacup_DPC
Post:
but if we contributed every phone, tablet and low-mid end office machine, typically with 2-4 cores, our computing capacity could increase by orders of magnitude. (I.e., we have way less than a million hosts and there exist billions of personal computing devices in the world)
For as many of those devices as there are, many are of such low capability that they are of no use to many projects.
And for those that are of use, the fact that their owners frequently use them for what they were designed for means they often can't contribute much during those periods, compared to more capable systems.
Just to nuance this: I know people who dig their old phones out from under a layer of dust in a drawer and set them to work. As I've understood it they can only crunch with the display turned off, so I doubt whether such a phone is available for normal use at all.

And you need to keep in mind efficiency isn't actually about low peak or maximum power use; it is about energy used over time to complete a task.
It's no good having a device use 1W if it takes 1 month to produce a result when something that uses 1kW can produce the same result in a matter of seconds. Yeah, its instantaneous power consumption is a lot higher. But it uses less energy to do the same work. And the fact it can do so much more work over the same period of time as the slower device makes it even more useful to a project.
I read your point, and it sounds logical, but that coin has two sides. Phone hardware is also tailored to be super efficient, since it continuously needs to run on battery. Desktop hardware does not necessarily have this efficiency pedigree, though large steps have been made in shrinking processor circuits. This phone sideline is perhaps a bit off topic, I admit. But your remark made me a bit curious; I need to look up a GFLOP/W figure or so somewhere. Maybe you're completely right after all, I only caught myself thinking that I could not quantify your argument. An interesting topic in itself, I think.
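Out of that curiosity, a small back-of-the-envelope sketch of what I would want to compare: energy per finished task rather than instantaneous power. The wattages and runtimes below are pure assumptions for illustration:

```python
# Energy-to-solution: what matters is Wh per finished task, not peak power.
# The wattages and runtimes are invented for illustration only.

def wh_per_task(power_watts, hours_per_task):
    """Energy consumed to complete one task, in watt-hours."""
    return power_watts * hours_per_task

devices = {
    "old phone core": (2.0, 30.0),   # ~2 W, 30 h per task (assumed)
    "desktop core":   (15.0, 3.0),   # ~15 W, 3 h per task (assumed)
}

for name, (watts, hours) in devices.items():
    print(f"{name}: {wh_per_task(watts, hours):.0f} Wh per task, "
          f"{24.0 / hours:.1f} tasks per day")
```

With these made-up numbers the desktop core wins on both energy per task and throughput, which would support your point; with real GFLOP/W figures the outcome could of course differ.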

But there is no need to marginalize our beloved behemoth machines. I am always impressed by their work throughput in my team (Dutch Power Cows), saliva dripping from the corners of my mouth when looking at those numbers. My older i5 and i7 processors stand their ground, but they are of another order. Independent of this 4GB discussion, my next processor will be a big Ryzen, that's for sure. Behemoths and more potent desktops will always remain a pillar of distributed computing capacity.

Rosetta is stretching itself by trying to serve both the phone clients and the potent desktop clients with those 4GB jobs. Whether support for phones proves to be a long-term investment, time will tell, but there are a lot of (old) phones out there, and they represent a huge capacity. That an attempt is made to harvest this I can fully understand.

(sorry, a bit off topic I fear)
5) Message boards : Number crunching : Tells us your thoughts on granting credit for large protein, long-running tasks (Message 95057)
Posted 21 Apr 2020 by teacup_DPC
Post:
Message 94951 - Posted: 19 Apr 2020, 23:16:02 UTC - in response to Message 94937.

I think the new longer tasks should get more than the credit awarded for running four 1GB tasks, the idea being that they can't be run by everyone, which by definition means older and slower PCs, which should be encouraged to be replaced or updated as over time they simply won't be able to keep up. How much more depends on the priority the project places on these new workunits: if they are just new workunits, then only a minimal amount of credit above what four 1GB tasks would get; on the other hand, if the new tasks are a higher than normal priority, then higher credit should be given to encourage people to crunch them instead.


I think your remark is valid if we start from the assumption that systems able to do 4GB tasks without any issue, besides doing other things, are rare. To be honest, when writing my reply I was not sure about that. That is why I wrote this sentence:
Where exactly to position the credits between 1x1GB and 4x1GB depends on the availability of processor cores and memory in the clients capable of the 4GB jobs; you can judge that better than I.
Another thing I am not sure about is whether 4GB WUs will take more time to solve. It seems logical, since more data needs to be moved around. This would imply there is not only a lower limit on memory size, but on core performance as well. If so, where does the threshold lie?


On the one hand I expect you have a point, however. Let's look at one extreme of the spectrum: a Ryzen 9 3950X, 16 cores/32 threads. If you want to run only 4GB jobs, one per thread, more than 4x32 = 128GB of memory is needed. Systems like that will be a minority, I expect. But apart from this extreme, what about an average desktop, 8 years old or younger? Quite a few systems will have 16GB, a majority will have 8GB of RAM, and a minority will have 32GB or more. All those systems are able to run at least one 4GB WU without much effort, besides doing other things, assuming their processor cores can cope with the load.
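As a small sizing sketch (the 2GB reserved for the OS and other use is an assumption on my part):

```python
# How many concurrent 4 GB work units could a host run?
# The 2 GB reserved for the OS and other use is an assumption.

def max_4gb_tasks(ram_gb, threads, reserved_gb=2):
    """Limited by whichever runs out first: free memory in 4 GB chunks or threads."""
    by_memory = (ram_gb - reserved_gb) // 4
    return max(0, min(by_memory, threads))

for ram, threads in [(8, 4), (16, 8), (32, 16), (128, 32)]:
    print(f"{ram:>3} GB RAM, {threads:>2} threads -> "
          f"{max_4gb_tasks(ram, threads)} concurrent 4 GB tasks")
```

So even an 8GB machine should manage one 4GB WU, while filling all 32 threads of a 3950X with them indeed needs more than 128GB.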

So back to the question: how common are those systems? Perhaps Mod.Sense can shed some light on this?
6) Message boards : Number crunching : Tells us your thoughts on granting credit for large protein, long-running tasks (Message 94937)
Posted 19 Apr 2020 by teacup_DPC
Post:
So, it seems reasonable that these 4GB work units should come with a premium on credits granted. But how much of a premium is reasonable?

There are many ways to look at it, so we thought we'd open it up for discussion. Please keep things respectful. Probably best to just state your own perspective on the topic and not address other posts directly, and certainly no need for rebuttals here. We're trying to brainstorm.


I am still quite new to actively using distributed computing, so I hope my thoughts are relevant. The footprint of a WU can be seen as the product of the resources it uses: processor capacity and memory space. I am not fully aware of what disk space is needed, so I leave that aside.

By needing (worst case) 4GB of memory, a 4GB task prevents the client from running four 1GB tasks at the same time.
The single 4GB task will probably use less processor capacity (1 core?) than the parallel processor capacity needed for the four separate 1GB tasks (4 cores?).
So per time unit the credits should be positioned somewhere between one 1GB task and four 1GB tasks: more than one 1GB task because the memory use is four times as much, and less than four 1GB tasks because the single task can be completed with one core.
Where exactly to position the credits between 1x1GB and 4x1GB depends on the availability of processor cores and memory in the clients capable of the 4GB jobs; you can judge that better than I. My long shot is that the credits will end up somewhere between 2.5 and 3 times a 1GB job, per time unit.

When running a 4GB WU with one core, more data needs to be dealt with, so the task will probably need more time. This can be covered by the time dependency in the credits. Maybe an extra bonus is warranted for the somewhat higher risk of failure because of the longer throughput time, as others have suggested as well.
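Purely to illustrate the kind of interpolation I mean, a toy calculation (the memory weight, runtime and risk bonus are my own assumptions, not a proposal for the actual credit system):

```python
# Toy interpolation between 1x and 4x the per-hour credit of a 1 GB task.
# The memory weight, runtime and risk bonus are assumptions for illustration.

def rate_factor(memory_blocked=4, cores_used=1, memory_weight=0.6):
    """Weighted mix of memory footprint (blocks 4 tasks) and CPU use (1 core)."""
    return memory_weight * memory_blocked + (1 - memory_weight) * cores_used

def task_credit(base_credit_per_hour, runtime_hours, risk_bonus=1.05):
    """Credit for one 4 GB task: per-hour premium times runtime, plus a small bonus."""
    return base_credit_per_hour * rate_factor() * runtime_hours * risk_bonus

print(rate_factor())           # 2.8 -> between 2.5 and 3 per time unit
print(task_credit(10.0, 8.0))  # e.g. 10 credits/h for 1 GB, 8 h runtime -> 235.2
```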

I expect you do not want to end up with a bias toward either 1GB or 4GB jobs, since both are needed. For the clients that can handle the 4GB jobs, the incentive should be neutral. Unless you expect a tendency toward more 4GB jobs relative to 1GB jobs, or the other way around, in which case you do want a bias.





