Posts by Paul

1) Message boards : Number crunching : Raspberry Pi4 (Message 90871)
Posted 25 Jun 2019 by Paul
Post:
I look forward to some benchmarks on this platform with Rosy. I can buy lots of cores with obsolete AMD Opterons at about 550 RAC per core. A machine with 64 cores & 64GB RAM costs about $300 & yields a RAC of about 35,000. It would require 16 Raspberry Pi 4B machines for the same number of CPU cores and RAM. The Pi solution requires much less power but would cost almost 3x as much.

I look forward to some test runs on this platform.
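
The arithmetic above can be checked with a rough sketch. The 550 RAC per core and the 64 cores / 64GB target are the figures quoted in this post; the 4 cores and 4GB per Pi 4B are assumptions based on the largest Pi 4B model at the time, and the function names are made up for illustration.

```python
import math

def estimated_rac(cores, rac_per_core):
    """Estimate total RAC as core count times per-core RAC."""
    return cores * rac_per_core

def boards_needed(target_cores, target_ram_gb, cores_per_board, ram_gb_per_board):
    """Boards needed to match both the core count and the RAM of the target."""
    by_cores = math.ceil(target_cores / cores_per_board)
    by_ram = math.ceil(target_ram_gb / ram_gb_per_board)
    return max(by_cores, by_ram)

# Figures from the post: 64 Opteron cores at about 550 RAC each.
opteron_rac = estimated_rac(64, 550)

# Assumed Pi 4B configuration: 4 cores and 4GB of RAM per board.
pi_count = boards_needed(64, 64, cores_per_board=4, ram_gb_per_board=4)

print(opteron_rac, pi_count)
```

This gives 35,200 RAC for the Opteron box and 16 Pi boards, which lines up with the rounded numbers in the post.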
2) Message boards : Number crunching : Raspberry Pi4 (Message 90869)
Posted 25 Jun 2019 by Paul
Post:
Any word on a version of Rosy for the Pi 4? I could see a whole stack of these at my house. 4 cores at 15W sounds good. Could we get the Android workloads to run on this thing?
3) Message boards : Number crunching : AMD 1950x Threadripper performance (Message 90293)
Posted 2 Feb 2019 by Paul
Post:
I have 2 machines running 64 cores with 64GB of RAM and I never see Rosy take more than 30GB. With 48GB of RAM, you should be good for a long time.

When you look at task manager, do you see individual graphs for each of the 32 processors?
4) Message boards : Number crunching : More errors while computing. (Message 90030)
Posted 19 Dec 2018 by Paul
Post:
I found that I need to go into BOINC Manager, open Options > Computing Preferences, then go to the Disk and Memory tab. You will see an option labeled "Leave non-GPU tasks in memory while suspended". Check that box.

That reduced the compute errors for me.
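
On a headless cruncher without BOINC Manager, the same preference can probably be set through a global_prefs_override.xml file in the BOINC data directory. This is only a sketch: it builds the XML text with the leave_apps_in_memory preference and leaves it to you to save it. The function name is made up, the location of the data directory depends on your install, and an override file replaces any other local overrides you already have.

```python
import xml.etree.ElementTree as ET

def build_prefs_override(leave_in_memory=True):
    """Build the text of a minimal global_prefs_override.xml."""
    root = ET.Element("global_preferences")
    pref = ET.SubElement(root, "leave_apps_in_memory")
    # BOINC preferences use 1/0 for on/off.
    pref.text = "1" if leave_in_memory else "0"
    return ET.tostring(root, encoding="unicode")

# Print the XML so it can be reviewed before saving it as
# global_prefs_override.xml in the BOINC data directory (path not assumed here).
print(build_prefs_override())
```

After saving the file, running boinccmd --read_global_prefs_override should make the client pick it up without a restart.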
5) Message boards : Number crunching : Larger Memory Models (Message 89899)
Posted 16 Nov 2018 by Paul
Post:
I think you will find many of us built machines with 1GB of RAM for each concurrent work unit. I know my machines almost never exceed 20% of total memory usage. I would love to see the work units use more RAM if it accelerates the calculations.
6) Message boards : Number crunching : Keeping new volunteers (Message 89869)
Posted 10 Nov 2018 by Paul
Post:
I run Rosetta on several Lenovo & Dell laptops. They run at 100% 24 hours a day, 7 days a week. I have done this for years because I can usually find cheap laptops that just need a little attention. I had to replace a fan in one of the Lenovo machines but all of these machines appear to run fine. I put them on a wire rack so they get good airflow and I blow out the air vents & fans with some compressed air. Once they are set up, they usually run fine for years. Our MacBook Pro also does a great job but the fans are much louder.
7) Message boards : Number crunching : Ryzen vs Intel Performance (Message 89868)
Posted 10 Nov 2018 by Paul
Post:
What can we do to encourage the project team to take a closer look at these suggestions? I have 2 of the top 10 machines on the project right now & both are powered by AMD Opterons. I assume these changes will work with the Opteron processors as well, since they support the extended instruction set. It would be great to get more work out of these systems and get to treatments faster.

It is shocking how much compute power we can buy for $500. Now we need to figure out how to efficiently use all of it.
8) Message boards : Number crunching : Ryzen 2700 performance on full cores (Ubuntu 18.04.1) (Message 89807)
Posted 31 Oct 2018 by Paul
Post:
Did you set BOINC to keep work units in RAM when inactive? I found that if I don’t keep the work units in RAM they often fail when restarted. I am not sure why.
9) Message boards : Number crunching : Work Units less than 100% CPU Utilization (Message 89796)
Posted 28 Oct 2018 by Paul
Post:
Looks like it was a hardware issue. I re-seated two of the CPUs and moved some RAM DIMMs around. Everything appears to be back to normal.

Thanks for all the suggestions. All 64 cores are back to 100%.
10) Message boards : Number crunching : Which processor (Message 89792)
Posted 28 Oct 2018 by Paul
Post:
It has always been difficult to benchmark on Rosy. I have always leveraged Hyper-Threading on my Intel CPUs. You need to have enough RAM for all those threads, typically 1GB for each thread.

I would love to get a Threadripper or an i9 running but I have no budget for that.

Good luck & please report back when you get these systems running.
11) Message boards : Number crunching : Work Units less than 100% CPU Utilization (Message 89791)
Posted 28 Oct 2018 by Paul
Post:
Thx for the response. I recently did some CPU upgrades, and my Opteron 6176 processors were jammed at 100% almost all the time. I am starting to wonder if there is some thermal throttling. I will check the heat sink. It also appears to happen mostly on CPU2, processors 17 - 32.

It sounds like others are not observing these issues. Guess it is time to dive back into the hardware.

Does anyone know if I need to do anything to Ubuntu 18.04 after a CPU upgrade? I assume it will load the correct CPU drivers for the new processors.

Hope to have this machine running at 100% again soon.
12) Message boards : Number crunching : Work Units less than 100% CPU Utilization (Message 89753)
Posted 23 Oct 2018 by Paul
Post:
I have run Rosy for a long time. These are dedicated crunchers set for 100% CPU Utilization. I have never seen this many tasks bounce around on utilization. I wonder if this is a larger issue with some of the new work units.
13) Message boards : Number crunching : Work Units less than 100% CPU Utilization (Message 89747)
Posted 23 Oct 2018 by Paul
Post:
I have a number of work units that are bouncing around on CPU utilization. In some cases, I see processors drop to less than 50% utilization. They don't drop for long but I am accustomed to seeing all processors at 100% almost all the time.

I need to know if other people see the same thing. If I have an issue with my platform, I want to understand that and get it resolved.
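
For anyone who wants to compare numbers on Linux, here is a rough sketch that works out per-core utilization from two snapshots of /proc/stat and lists the cores below a threshold. The function names and the 50% threshold are only illustrative, and iowait is counted as idle time here, which is a choice and not the only way to do it.

```python
def parse_proc_stat(text):
    """Return {cpu_name: (busy_ticks, total_ticks)} for each per-core line."""
    cores = {}
    for line in text.splitlines():
        parts = line.split()
        # Per-core lines look like "cpu0 ..."; skip the aggregate "cpu" line.
        if not parts or not parts[0].startswith("cpu") or parts[0] == "cpu":
            continue
        # First 8 counters: user nice system idle iowait irq softirq steal.
        ticks = [int(v) for v in parts[1:9]]
        total = sum(ticks)
        idle = ticks[3] + (ticks[4] if len(ticks) > 4 else 0)
        cores[parts[0]] = (total - idle, total)
    return cores

def utilization(before, after):
    """Percent busy per core between two parse_proc_stat() snapshots."""
    usage = {}
    for name, (busy1, total1) in after.items():
        busy0, total0 = before.get(name, (0, 0))
        elapsed = total1 - total0
        usage[name] = 100.0 * (busy1 - busy0) / elapsed if elapsed > 0 else 0.0
    return usage

def cores_below(threshold, usage):
    """Names of cores whose utilization is under the threshold."""
    return sorted(name for name, pct in usage.items() if pct < threshold)

# Example use on a Linux host (read /proc/stat twice, a few seconds apart):
#   first = parse_proc_stat(open("/proc/stat").read())
#   ...wait a few seconds...
#   second = parse_proc_stat(open("/proc/stat").read())
#   print(cores_below(50, utilization(first, second)))
```

Running it a few times while the work units are active would show whether the dips land on the same cores each time.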
14) Message boards : Number crunching : Error while computing (Ubuntu 18.04 LTS, Boinc version 7.9.3) (Message 89633)
Posted 25 Sep 2018 by Paul
Post:
That fixed it! I have several successful 4.07 jobs completed & validated.

Thx
15) Message boards : Number crunching : Legacy CPU Performance (Message 89618)
Posted 23 Sep 2018 by Paul
Post:
After months of run time, it appears that the Opteron 6176 gets a RAC of about 450 per core. The 6380 gets a RAC of about 500 per core, so not a big improvement, but the core count increases from 12 to 16. I am only running mini jobs and wonder how things will change when I get some standard tasks.
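
Taking the per-core figures and core counts above at face value, a quick sketch of what they mean per processor (the function name is made up, and real RAC will vary with the mix of work units):

```python
def rac_per_socket(cores, rac_per_core):
    """Estimate the RAC of one processor as cores times per-core RAC."""
    return cores * rac_per_core

old = rac_per_socket(12, 450)   # Opteron 6176: 12 cores at about 450 RAC
new = rac_per_socket(16, 500)   # Opteron 6380: 16 cores at about 500 RAC
gain_pct = 100 * (new - old) / old

print(old, new, round(gain_pct))
```

That works out to roughly 5,400 versus 8,000 RAC per processor, so the per-socket gain is close to 48% even though the per-core gain is small.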
16) Message boards : Number crunching : Error while computing (Ubuntu 18.04 LTS, Boinc version 7.9.3) (Message 89581)
Posted 19 Sep 2018 by Paul
Post:
I hope someone can fix this soon. I tried to implement the fix listed above but my systems don’t have that directory structure. I have /etc/system/systemd but I don’t have the boinc-client.system.d folder.

If the developers can fix this with 1 line of code I hope they will do it soon. I have several big systems that refuse to process 4.07 jobs because they have the 2.27 version of glibc.
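
To see which hosts are in this situation, a small sketch like the one below can report the glibc version that Python detects. It only checks for an exact match with the 2.27 version mentioned above; whether other versions are affected is not something I know, and the function names are made up for illustration.

```python
import platform

def parse_version(text):
    """Turn a version string like '2.27' into (2, 27) for comparison."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())

def matches_reported_glibc(version_text, reported="2.27"):
    """True if the version equals the glibc version reported in this thread."""
    return parse_version(version_text) == parse_version(reported)

# platform.libc_ver() returns a (library, version) pair; both are empty
# strings when the C library cannot be detected.
lib, version = platform.libc_ver()
if lib == "glibc":
    status = "matches 2.27" if matches_reported_glibc(version) else "is not 2.27"
    print("glibc", version, status)
else:
    print("could not detect glibc on this system")
```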
17) Message boards : Number crunching : Fewer Hosts (Message 89443)
Posted 23 Aug 2018 by Paul
Post:
I looked at the stats on the number of hosts & noticed an incredible shift away from Rosy over the last few months. It looks like May - June was dramatic. What happened?

https://boincstats.com/en/stats/14/project/detail/host

Maybe I am reading the chart incorrectly.
18) Message boards : Number crunching : Compute Error - Ubuntu 18.04 (Message 89428)
Posted 21 Aug 2018 by Paul
Post:
All:

What is the fix for computation errors on Ubuntu 18.04? I have 2 servers that have yet to complete a 4.07 task after I upgraded to Ubuntu 18.04. I understand there is a problem with glibc 2.27 but there must be a fix other than creating virtual machines. My Ubuntu machines are all dedicated crunchers and I am getting hundreds of failed work units every day. I am glad they typically fail in the first 6 seconds but it feels very inefficient to download and upload these work units.
19) Message boards : Number crunching : Error while computing - AMD Opteron (Message 88870)
Posted 12 May 2018 by Paul
Post:
I am running Ubuntu 16.04 LTS. I can try a project reset. How do I look at dmesg?

ldd version 2.23

64GB RAM
4 AMD Opteron 6176 Processors with 12 Cores each
250GB SSD

100% dedicated to Rosetta. Everything else runs fine including other Rosetta WUs

Problem started all at once. I did not reset the project as she is running 48 active WUs. Hate to waste all that progress.
20) Message boards : Number crunching : Error while computing - AMD Opteron (Message 88855)
Posted 11 May 2018 by Paul
Post:
All:

I have many work units that fail with "Error while computing" after about 1 min of run time. All of my AMD Opteron cores are chewing through these WUs. All of them are rb_05_10_167_247__t000__0_C1_SAVE_ALL_OUT_IGNORE_THE_REST_600454_

I think I have 500 failed work units and growing.

examples:
https://boinc.bakerlab.org/workunit.php?wuid=898124245
https://boinc.bakerlab.org/workunit.php?wuid=898124145
https://boinc.bakerlab.org/workunit.php?wuid=898124156



©2020 University of Washington
https://www.bakerlab.org