Output versus work unit size

rjs5

Joined: 22 Nov 10
Posts: 272
Credit: 21,057,031
RAC: 17,867
Message 88822 - Posted: 5 May 2018, 19:28:51 UTC - in response to Message 88817.  

Whether it is a good idea or not depends on how frequently tasks are getting preempted. I recommend people set it so it DOES leave them "in memory", and they get swapped out if the system gets busy with other work. By NOT keeping tasks in memory, you are expressing a willingness to throw away partially completed work, i.e. willingness to lose credit in favor of more quickly getting out of the way when other demands arrive on the machine.

True, and I normally have it enabled. But at the moment, I am running ONLY Rosetta, so there is nothing to preempt. (In fact, it should not matter now whether LAIM is enabled or not, since no tasks are suspended.)

Maybe I can someday find another project (e.g., WCG) that plays well with Rosetta. In that case, I will have to do a little experimenting to see if LAIM is worthwhile or not.


The i7 4771 Windows machine is the one giving you the problem. Looking at the recent results, there are some things that stick out.

1. You have hyperthreading turned OFF. IMO, turning OFF hyperthreading has never been a "throughput" win. An individual job may take more time, but the AVERAGE time of running twice as many jobs is lower. For example, if 4 cores finish 4 jobs in an hour, and hyperthreading lets 8 jobs finish in 1.5 hours, each job is slower but total throughput rises from 4.0 to about 5.3 jobs per hour. Enable hyperthreading unless you are SURE that it is a problem. I might leave the 3 Rosetta WU limit in place for a while to see what changes.

2. Looking at the current results that I can see, it looks like Rosetta 4.07 Windows is the source of the problem. I would open the Windows TASK MANAGER and monitor the DETAILED information to see if other programs are taking unexpected resources. Windows is USUALLY faster than Linux UNLESS the Linux version is using HUGE PAGES (unlikely). The newer Haswell CPU (4771) brings instruction-set improvements, but that is UNLIKELY the source of a roughly 4x performance difference (180 -> 774 credits).


Results summary sorted by CREDITS.
i7 3770
Credit / Peak working set size (MB) / Peak swap size (MB) / Peak disk usage (MB) / Application version
949.7 513.58 660.98 536.76 Rosetta v4.07 x86_64-pc-linux-gnu
908.77 501.71 648.84 536.96 Rosetta v4.07 x86_64-pc-linux-gnu
890.51 494.95 642.43 537.02 Rosetta v4.07 x86_64-pc-linux-gnu
856.81 513.75 661.08 536.75 Rosetta v4.07 x86_64-pc-linux-gnu
806.6 610.37 755.43 549.96 Rosetta v4.07 x86_64-pc-linux-gnu
714.55 485.88 552.25 536.42 Rosetta v4.07 i686-pc-linux-gnu
426.06 427.27 484.19 436.63 Rosetta Mini v3.78 x86_64-pc-linux-gnu



i7 4771
Credit / Peak working set size (MB) / Peak swap size (MB) / Peak disk usage (MB) / Application version
854.55 422.21 407.02 432.13 Rosetta Mini v3.78 windows_x86_64
774.94 410.25 393.73 425.93 Rosetta Mini v3.78 windows_x86_64
464.71 285.15 269.19 415.76 Rosetta Mini v3.78 windows_x86_64
183.17 647.66 627.78 528.75 Rosetta v4.07 windows_x86_64
180.90 443.88 426.99 512.69 Rosetta v4.07 windows_intelx86
179.00 791.43 776.73 547.89 Rosetta v4.07 windows_intelx86
178.36 429.46 414.06 514.25 Rosetta v4.07 windows_intelx86
177.59 525.55 505.02 514.56 Rosetta v4.07 windows_x86_64
176.21 764.33 744.05 527.87 Rosetta v4.07 windows_x86_64
172.69 668.47 652.33 546.11 Rosetta v4.07 windows_intelx86
172.02 511.33 496.50 525.72 Rosetta v4.07 windows_intelx86
167.39 851.91 833.11 545.81 Rosetta v4.07 windows_x86_64
166.32 787.81 768.35 544.73 Rosetta v4.07 windows_x86_64
164.31 787.73 768.40 545.32 Rosetta v4.07 windows_x86_64
161.73 489.06 468.77 514.08 Rosetta v4.07 windows_x86_64
mmonnin

Joined: 2 Jun 16
Posts: 54
Credit: 20,058,207
RAC: 31,720
Message 88831 - Posted: 8 May 2018, 2:04:32 UTC - in response to Message 88799.  

Yes, R@h is memory intensive. Any memory-intensive application is potentially going to be labelled as not playing well with others; that is just how memory contention works in a system. So I don't see a specific problem with your scenario. But I wanted to assure you that the developers do look at memory usage and attempt to improve the algorithms to dial back the use of memory where possible. I also wanted to point out that in prior posts you said R@h doesn't play well with others, which always sounds like a skirmish for resources, and people often invent logic that says the application is being aggressive, when in fact such things are controlled by the operating system. But your last post essentially boils down to saying that R@h doesn't play well with itself either; so at least there is no bias in what is being impacted. As you say, L2 cache contention is going to crop up with any memory-intensive application. The larger the L2 cache, the faster any memory-intensive application will run.

One approach to optimizing the work on a machine is to get a mixture of work with lower memory requirements. I often suggest people attach to World Community Grid. Their projects have humanitarian and medical implications, and typically much lower memory requirements. You can define your preferred mixture of work using the "resource share" for each project. So, for example, for a 70% R@h and 30% WCG mix, you could set up R@h with a resource share of 700 and WCG with a resource share of 300. On an 8-core system, that would typically result in at least two WCG tasks running alongside 6 R@h tasks. This mix is often enough to make full use of the cores that you just suggested leaving idle.
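
As a sketch of the arithmetic behind that suggestion (my illustration of how BOINC apportions resource shares, not project code): each project's long-term fraction of CPU time is its share divided by the sum of shares across attached projects.

/* share.c -- illustrative only: resource-share arithmetic. */
#include <stdio.h>

int main(void) {
    const double rosetta = 700.0, wcg = 300.0;  /* shares from the example above */
    const double total = rosetta + wcg;
    const int cores = 8;
    printf("R@h: %.0f%% of CPU time, ~%.1f of %d cores\n",
           100.0 * rosetta / total, cores * rosetta / total, cores);
    printf("WCG: %.0f%% of CPU time, ~%.1f of %d cores\n",
           100.0 * wcg / total, cores * wcg / total, cores);
    return 0;
}

That works out to about 2.4 of 8 cores for WCG, which matches the "at least two WCG tasks" above.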




I guess you are addressing me. I really don't know what the developers pay attention to; I just draw my conclusions from empirical observations. IMO, PrimeGrid is probably the project with the biggest optimization problems; they have over-tuned the code. R@H has done some simple things, but they have overlooked issues that are typically not understood by developers. What they have done is fine with me. Their design decisions determine the power cost, network traffic, disk space needed, and so on, for the machines. IMO, they could make some changes to use those resources more efficiently.

Including all the models in one binary is a design decision, and it puts extra pressure on the TLB and the network.
Basing a design on a library of small functions (BOOST) causes a whole page of code to be read into memory so the program can execute one function. Loading the rest of that page is overhead: it takes memory and puts pressure on the TLB.
Compiling the code with options like "-O3 -funroll-loops -finline-functions" unrolls the loops (making the code footprint larger), and inlining puts a copy of a function's body at every call site, each copy taking its own space in memory, cache, and so on (see the sketch below).
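
A minimal sketch of that effect (my illustrative example, assuming gcc; the exact growth depends on the compiler and flags): build the same file both ways and compare the code size reported by the size utility.

/* footprint.c -- illustrative only, not Rosetta code.
 *   gcc -O2 footprint.c -o base  && size base
 *   gcc -O3 -funroll-loops -finline-functions footprint.c -o tuned && size tuned
 * Unrolling emits several copies of each loop body; inlining pastes
 * scale() into both call sites. Every extra byte of code competes for
 * i-cache lines and iTLB entries at run time. */
#include <stdio.h>

static double scale(double x) { return x * 1.0000001; }

int main(void) {
    double a = 0.0, b = 0.0;
    for (long i = 0; i < 50000000L; i++)   /* call site 1 */
        a = scale(a + (double)i);
    for (long i = 0; i < 50000000L; i++)   /* call site 2 */
        b = scale(b - (double)i);
    printf("%f %f\n", a, b);
    return 0;
}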

If a cruncher gets WUs that all use the same model, the machine will use memory most efficiently and ... run faster and get more credits.
If a cruncher gets WUs needing 8 different models for an 8-CPU machine, the machine will run slower because the WUs do not share CODE or DATA as effectively. The cruncher is penalized for R@H's less efficient use of the caches.
As WUs complete and drain, the kind of WU the machine picks up next will affect the WUs already running.
A WU in the first case would give them more credits than in the second case, just because of the R@H interaction.


If most of the WUs use just one model, then the problem is small. If there is a lot of variation, the impact will be larger.

Again, I think what R@H is doing is fine, and I have zero problems with their decisions.


It'd be great if we could at least select one or both apps in preferences. I'd assume that would limit the models to an extent.

Is it that hard to keep an existing, functioning app as a baseline and add any new models/code as separate apps, instead of piling it all into one app? Some projects have many apps that do different things. PrimeGrid has different algorithms (right term?) to find primes, set up as different kinds of apps. Maybe then we wouldn't have a gig of downloads for two apps plus the task files.
[VENETO] boboviz

Joined: 1 Dec 05
Posts: 1847
Credit: 7,994,764
RAC: 8,835
Message 88834 - Posted: 8 May 2018, 15:05:26 UTC - in response to Message 88831.  

It'd be great if we could at least select one or both apps in preferences. I'd assume that would limit the models to an extent.

Something like GPUGrid, with "Long runs" and "Short runs" WUs?

Is it that hard to keep an existing, functioning app as a baseline and add any new models/code as separate apps, instead of piling it all into one app? Some projects have many apps that do different things. PrimeGrid has different algorithms (right term?) to find primes, set up as different kinds of apps. Maybe then we wouldn't have a gig of downloads for two apps plus the task files.

Forking the code to produce different, specialized apps (one for "folding", one for "ab initio", etc.) may be a solution.
But I don't know how complex that could be.
mmonnin

Joined: 2 Jun 16
Posts: 54
Credit: 20,058,207
RAC: 31,720
Message 88840 - Posted: 9 May 2018, 10:01:18 UTC - in response to Message 88834.  

It'd be great if we could at least select one or both apps in preferences. I'd assume that would limit the models to an extent.

Something like GPUGrid, with "Long runs" and "Short runs" WUs?



There are currently two apps, Rosetta and Rosetta Mini, but there is no way in the preferences to select one or the other. Off the top of my head I don't recall another project that doesn't allow selecting between its apps. In the recent past, the Rosetta app has had a much higher chance of running for much longer than the runtime set in preferences before returning results. Could I deselect it? No. Rosetta also had computation errors at about 5 seconds on one computer, with error 193. Mini was fine.

This project already allows for multiple length options.
[VENETO] boboviz

Joined: 1 Dec 05
Posts: 1847
Credit: 7,994,764
RAC: 8,835
Message 88843 - Posted: 9 May 2018, 19:26:31 UTC - in response to Message 88840.  
Last modified: 9 May 2018, 19:26:53 UTC

There are currently two apps, Rosetta and Rosetta Mini, but there is no way in the preferences to select one or the other. Off the top of my head I don't recall another project that doesn't allow selecting between its apps. In the recent past, the Rosetta app has had a much higher chance of running for much longer than the runtime set in preferences before returning results. Could I deselect it? No. Rosetta also had computation errors at about 5 seconds on one computer, with error 193. Mini was fine.


Ok, I understand.
I know that the 4.xx branch is in development, but I don't know if 3.xx is still being developed or debugged.
I thought, months ago when 4.x started, that R@h would abandon the 3.x version.
But 3.x is still here, so I don't know what they plan to do with it.
Jim1348

Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 88888 - Posted: 13 May 2018, 12:24:20 UTC
Last modified: 13 May 2018, 12:43:52 UTC

FWIW, I am in the early testing phase of running a Coffee Lake (i7-8700) CPU on Rosetta (Ubuntu 18.04).
At the moment, the 12 cores are divided equally between Rosetta and Universe, with one core being reserved to support a GPU on GPUGrid.

The results are encouraging thus far, though I may have to change the mix of projects to optimize it a little more:
https://boinc.bakerlab.org/rosetta/results.php?hostid=3399951&offset=0&show_names=0&state=4&appid=

In general, my Haswell gives more consistent output than my Ivy Bridge, and the Coffee Lake may do even better.
Intel probably improved the cache performance in the later chips, which may account for it.

Universe has very small work units, about 3 MB; earlier I was also running Einstein alongside Rosetta, which is about the same size. But it seems not to be as simple as work unit size, as I had thought. Even leaving cores free does not necessarily fix it. Maybe how the cache is shared among the cores has something to do with it, which is far beyond my ability to investigate. But chip architecture seems to affect it somehow.
Sid Celery

Joined: 11 Feb 08
Posts: 1966
Credit: 38,188,338
RAC: 11,005
Message 88905 - Posted: 15 May 2018, 2:05:46 UTC - in response to Message 88816.  

Whether it is a good idea or not depends on how frequently tasks are getting preempted. I recommend people set it so it DOES leave them "in memory", and they get swapped out if the system gets busy with other work. By NOT keeping tasks in memory, you are expressing a willingness to throw away partially completed work, i.e. willingness to lose credit in favor of more quickly getting out of the way when other demands arrive on the machine.

Again relating to issues long ago that I can't properly recall: leaving "non-GPU tasks in memory while suspended" solved problems for me, and it is the preferred option on all my machines.
Jim1348

Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 88949 - Posted: 19 May 2018, 12:50:15 UTC - in response to Message 88888.  
Last modified: 19 May 2018, 13:03:54 UTC

My Coffee Lake (i7-8700) credits running under Ubuntu 18.04 range from excellent (1,727.88 points) to miserable (178.94 points), with everything in between.
https://boinc.bakerlab.org/rosetta/results.php?hostid=3399951&offset=20&show_names=0&state=4&appid=

On the other hand, my more modest Ivy Bridge (i7-3770) machine running under Win7 64-bit has been much more consistent, averaging around 700 points or so.
https://boinc.bakerlab.org/rosetta/results.php?hostid=3381276&offset=0&show_names=0&state=4&appid=

These machines have sometimes run only Rosetta, but sometimes other projects as well. Universe (BHspin v2) is probably the best, since it uses very little memory, only about 3 MB, and is very stable (does not crash). But they have had all the cores busy with something, and none left free (they are dedicated machines which I do not use for desktop purposes).

The Ubuntu machine seems to react most negatively to something. And that something may be crashes of work units, whether of Rosetta itself (as with the 4.07 x64 work units), or of GPUGrid Quantum Chemistry, which has had its own problems recently. Even running just Rosetta (or Rosetta and Universe) on the Ubuntu machine does not entirely stabilize it. Otherwise, I see no real rhyme or reason to it.

But for whatever reason, Windows is more stable. There is no point wasting a very capable Coffee Lake chip when a lowly Ivy Bridge can average about as well. Until the crashes end, I think I will take the i7-8700 off of Rosetta and use it elsewhere, and maybe try again later.
Jim1348

Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 88973 - Posted: 22 May 2018, 15:19:35 UTC - in response to Message 88949.  

My Ivy Bridge is running on only seven cores now, with the other one free (Win7 64-bit). That is why the credits are a little high, since the cores are not fully loaded.
I could probably get a little more total output by running on eight cores, but I like it this way, for a while at least. And it may help to ensure a little extra cache is available, for whatever that is worth (not clear at present).
https://boinc.bakerlab.org/rosetta/results.php?hostid=3381276&offset=0&show_names=0&state=4&appid=

I will just let it run until something comes along to break it.
shanen

Joined: 16 Apr 14
Posts: 195
Credit: 12,662,308
RAC: 0
Message 88985 - Posted: 24 May 2018, 8:26:45 UTC

Frankly, it is amusing to see people worrying about such details as regards THIS particular BOINC project. I actually stopped by to wonder how many other volunteers are nuking tasks with 3-day deadlines on sight.
#1 Freedom = (Meaningful - Constrained) Choice{5} != (Beer^3 | Speech)
Jim1348

Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 88989 - Posted: 24 May 2018, 12:47:37 UTC - in response to Message 88985.  

It is one of the most important projects for its science and potential benefits. They just rush the work into production without testing it thoroughly, I believe.
I was surprised that the OS makes a difference, as well as the CPU type (Intel vs. AMD). They really should look into that for their own benefit. But it is a great project otherwise.
rjs5

Joined: 22 Nov 10
Posts: 272
Credit: 21,057,031
RAC: 17,867
Message 88991 - Posted: 24 May 2018, 14:55:25 UTC - in response to Message 88989.  

It is one of the most important projects for its science and potential benefits. They just rush the work into production without testing it thoroughly, I believe.
I was surprised that the OS makes a difference, as well as the CPU type (Intel vs. AMD). They really should look into that for their own benefit. But it is a great project otherwise.


Rosetta IS important, as is BOINC.

There are many small things that can impact performance substantially. Each CPU clock cycle can be broken down into fetching/decoding the instructions, fetching the data, and executing/storing the results.

One example: Sandy Bridge had a major change in the CPU cache. Before, if one CPU core had a modified data value and another core wanted it, the owning core would write it to memory and then the new core would READ it from memory. Sandy Bridge changed that: the owning core just hands the modified value to the new core and invalidates that line of its own cache. Big performance difference. Intel CPUs before Sandy Bridge suffer for it.
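
A minimal sketch of the access pattern that change matters for (my illustration, not Rosetta code): two threads repeatedly modify the same cache line, so the modified line must move between cores on every handoff. Build with gcc -O2 -pthread.

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

/* One shared counter: the cache line holding it ping-pongs between the
 * cores running the two threads. Each atomic add must first obtain the
 * modified line from the other core's cache. */
static atomic_long counter;

static void *bump(void *arg) {
    (void)arg;
    for (long i = 0; i < 10000000L; i++)
        atomic_fetch_add(&counter, 1);
    return NULL;
}

int main(void) {
    pthread_t a, b;
    pthread_create(&a, NULL, bump, NULL);
    pthread_create(&b, NULL, bump, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    printf("%ld\n", atomic_load(&counter));  /* 20000000 */
    return 0;
}

Timing this against a variant where each thread increments its own (padded) counter shows the cost of moving modified lines between cores; how large that cost is depends on the microarchitecture, which is the point above.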

There is a professor, Agner Fog, at the Technical University of Denmark who has done a lot of work comparing CPU performance at the INSTRUCTION level and has published some interesting data: cycle counts for the Intel/AMD CPUs, so you can see some of the differences.

http://www.agner.org/optimize/instruction_tables.pdf
shanen

Joined: 16 Apr 14
Posts: 195
Credit: 12,662,308
RAC: 0
Message 88996 - Posted: 26 May 2018, 0:57:32 UTC - in response to Message 88989.  

It is one of the most important projects for its science and potential benefits. They just rush the work into production without testing it thoroughly, I believe.

You are repeating one of my oft-repeated concerns: Any programs that are supposed to be producing scientific results need to be "tested thoroughly" or the research itself becomes questionable. The project staff seems to have a rather cavalier attitude towards testing, but maybe that's only on the side of the software that the volunteers see. Looks buggy to us, but maybe it's perfect on the results side. (But I doubt it and I strongly hope that they are running all crucial results several times in several ways.)

From what I've seen, if I were still a senior referee for the IEEE Computer Society and were reviewing a paper that relied on data from Rosetta@home calculations, I would start out with a highly skeptical attitude. At a minimum I would want to know that the code had been audited, but more likely I would ask for replication of the key calculations by some other researchers.

Right now I'm just a volunteer, and my main annoyance is the 3-day deadlines. I'm mostly nuking those pending tasks on sight and NOT feeling sorry about wasting the project's bandwidth. Not at all.
#1 Freedom = (Meaningful - Constrained) Choice{5} != (Beer^3 | Speech)
Jim1348

Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 88997 - Posted: 26 May 2018, 6:44:23 UTC - in response to Message 88996.  
Last modified: 26 May 2018, 7:31:20 UTC

I was going on the opposite assumption: they have focused too much on getting the science right, at the expense of basic computer operation. At least I am in no position to suggest otherwise, given their eminent position in the field. I am sure they have plenty of peer review when they publish to validate their results; not that they could not also benefit from a basic computer-science review, as you suggest.

But while we are on the subject, I thought I would fire up my i7-8700 on Ubuntu 18.04 again, having made the fix for the x64 crashes as taught by Juha (great work) and implemented by rjs5 (https://boinc.bakerlab.org/rosetta/forum_thread.php?id=12242&postid=88954#8895).

At the moment, things are going fine, and the output is much more constant. Whether that is a long term fix we will see. If so, it would suggest that the crashes somehow affected the running work units. So basic computer operation is not to be neglected either of course. (I started out running with Universe BHspin v2 too, but now all 12 cores are on Rosetta; I don't even have a GPU installed.)
https://boinc.bakerlab.org/results.php?hostid=3399951&offset=0&show_names=0&state=4&appid=

EDIT: The initial group are all x64, so an alternate explanation is just that the x64 have more consistent output than the 32-bit ones. It will take a while to see.
Jim1348

Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 89073 - Posted: 7 Jun 2018, 16:06:57 UTC

FWIW, I just cannot get Rosetta to run consistently on my i7-8700 (Ubuntu 18.04). It will start out great when I first attach (1200 points for the 24-hour work units), and then go downhill. It is now at around 170 PPD.
https://boinc.bakerlab.org/results.php?hostid=3399951&offset=0&show_names=0&state=4&appid=
Leaving cores free, or running with or without other projects does not seem to help.

On the other hand, my Windows 7 64-bit machine (i7-3770) does consistently well, at around 800 PPD (6 cores, leaving 2 free). So that is how I will go.
https://boinc.bakerlab.org/results.php?hostid=3381276&offset=0&show_names=0&state=4&appid=
Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 89075 - Posted: 7 Jun 2018, 17:52:21 UTC

@Jim1348, that sounds odd. Does the machine start thrashing memory over time? The credit system is structured in such a way that if the particular work unit you were crunching were harder, then the credit per model would be higher. In other words, the machines of other users did not find those WUs any harder than normal, but for some reason your machine isn't getting as much work done.

You used the phrase "...when I first attach...", so I wasn't certain if you meant the client attaching to BOINC Manager, or BOINC Manager attaching to the project. If you remain attached to the project, and power off the machine for 30 seconds and then restart things, does your credit per hour improve again?

If a reboot restores more normal credit, this would tend to indicate something is up with your machine. Perhaps there is a memory leak somewhere. Given that the work units begin anew each day, it would seem more likely any such memory leak is in the operating system somewhere.
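
One way to test the memory-leak theory (my suggestion; a hedged sketch for Linux, not something BOINC provides): log available memory over a few days with a constant workload and look for a steady downward drift.

/* memwatch.c -- logs MemAvailable from /proc/meminfo once a minute.
 * A steady decline over days, with a constant workload, points to a leak. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void) {
    for (;;) {
        FILE *f = fopen("/proc/meminfo", "r");
        if (!f) return 1;
        char line[256];
        while (fgets(line, sizeof line, f)) {
            if (strncmp(line, "MemAvailable:", 13) == 0) {
                fputs(line, stdout);
                break;
            }
        }
        fclose(f);
        fflush(stdout);
        sleep(60);
    }
}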
Rosetta Moderator: Mod.Sense
Jim1348

Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 89076 - Posted: 7 Jun 2018, 17:59:15 UTC - in response to Message 89075.  
Last modified: 7 Jun 2018, 18:06:05 UTC

Those are all good questions. The credit has usually been good when I first attach to the project, but then becomes erratic before finally settling down to a low value. Rebooting the machine does not help. I would normally think that it is something on my machine also, but I can't find it. It is a dedicated machine with only BOINC running; not even a GPU now.

I thought perhaps the work units were different between Windows and Linux, and that somehow the BOINC credit system is to blame. But it would have to be a large discrepancy for that to be the case.

I have plenty of memory - 32 GB on both machines by the way, and devote several GB to a write cache on the Ubuntu machine. There is always plenty of free memory when I check it.

PS - One of these days I am going to turn off hyper-threading and see what that does. But it is a bit of a nuisance to attach a monitor, etc. and I won't get around to it for some time. But if something dramatic happens, I will post about it. Until then, it will have to be a known unknown.
rjs5

Joined: 22 Nov 10
Posts: 272
Credit: 21,057,031
RAC: 17,867
Message 89084 - Posted: 9 Jun 2018, 19:09:42 UTC - in response to Message 89076.  

Those are all good questions. The credit has usually been good when I first attach to the project, but then becomes erratic before finally settling down to a low value. Rebooting the machine does not help. I would normally think that it is something on my machine also, but I can't find it. It is a dedicated machine with only BOINC running; not even a GPU now.

I thought perhaps the work units were different between Windows and Linux, and that somehow the BOINC credit system is to blame. But it would have to be a large discrepancy for that to be the case.

I have plenty of memory - 32 GB on both machines by the way, and devote several GB to a write cache on the Ubuntu machine. There is always plenty of free memory when I check it.

PS - One of these days I am going to turn off hyper-threading and see what that does. But it is a bit of a nuisance to attach a monitor, etc. and I won't get around to it for some time. But if something dramatic happens, I will post about it. Until then, it will have to be a known unknown.



Is the disk write caching in the default Ubuntu state or did you change the settings?
Most drives already have a GB or so of write caching on the drive and for SSD ... I don't think it is really needed.
Did you measure the impact of changing the setting (if you did)?

I have not seen Rosetta behavior that would benefit from write caching.
Jim1348

Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 89085 - Posted: 9 Jun 2018, 22:16:50 UTC - in response to Message 89084.  
Last modified: 9 Jun 2018, 22:24:41 UTC

Is the disk write caching in the default Ubuntu state or did you change the settings?
Most drives already have a GB or so of write caching on the drive and for SSD ... I don't think it is really needed.
Did you measure the impact of changing the setting (if you did)?

I have not seen Rosetta behavior that would benefit from write caching.

The large cache was not set for Rosetta but for other projects that have high write rates, in order to protect the SSD.
However, I have never found that a large cache hurts anything, though you are right that Rosetta does not need it.

Here are the settings:

Swappiness: to never use swap: sudo sysctl vm.swappiness=0

Set write cache to 12 GB/12.5 GB: for 32 GB main memory
sudo sysctl vm.dirty_background_bytes=12000000000
sudo sysctl vm.dirty_bytes=12500000000
sudo sysctl vm.dirty_writeback_centisecs=500 (checks the cache every 5 seconds)
sudo sysctl vm.dirty_expire_centisecs=720000 (flush pages older than 2 hours)

Insofar as I know, it just means that Rosetta will be operating mainly out of the DRAM cache rather than accessing the SSD most of the time.
I will set it back to the default and try again in a few days.
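
As a rough illustration of what those dirty-page settings do (my sketch, not tied to Rosetta's actual I/O): buffered writes land in the page cache and return quickly, and only a flush such as fsync() forces the data to the SSD; a large vm.dirty_bytes defers that flush.

/* cachedemo.c -- buffered writes return once the data is in the page
 * cache; fsync() is what actually hits the drive. Compare the timings. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

static double now(void) {
    struct timespec t;
    clock_gettime(CLOCK_MONOTONIC, &t);
    return t.tv_sec + t.tv_nsec / 1e9;
}

int main(void) {
    char buf[4096];
    memset(buf, 'x', sizeof buf);
    int fd = open("cache_demo.tmp", O_CREAT | O_WRONLY | O_TRUNC, 0644);
    if (fd < 0) return 1;

    double t0 = now();
    for (int i = 0; i < 25600; i++)      /* 100 MB of buffered writes */
        if (write(fd, buf, sizeof buf) != (ssize_t)sizeof buf) return 1;
    double t1 = now();                   /* usually fast: dirty pages in RAM */
    fsync(fd);                           /* forces the data out to the drive */
    double t2 = now();

    printf("write(): %.3f s, fsync(): %.3f s\n", t1 - t0, t2 - t1);
    close(fd);
    unlink("cache_demo.tmp");
    return 0;
}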
rjs5

Joined: 22 Nov 10
Posts: 272
Credit: 21,057,031
RAC: 17,867
Message 89086 - Posted: 10 Jun 2018, 3:01:58 UTC - in response to Message 89085.  

Is the disk write caching in the default Ubuntu state or did you change the settings?
Most drives already have a GB or so of write caching on the drive and for SSD ... I don't think it is really needed.
Did you measure the impact of changing the setting (if you did)?

I have not seen Rosetta behavior that would benefit from write caching.

The large cache was not set for Rosetta but for other projects that have high write rates, in order to protect the SSD.
However, I have never found that a large cache hurts anything, though you are right that Rosetta does not need it.

Here are the settings:

Swappiness: to never use swap: sudo sysctl vm.swappiness=0

Set write cache to 12 GB/12.5 GB: for 32 GB main memory
sudo sysctl vm.dirty_background_bytes=12000000000
sudo sysctl vm.dirty_bytes=12500000000
sudo sysctl vm.dirty_writeback_centisecs=500 (checks the cache every 5 seconds)
sudo sysctl vm.dirty_expire_centisecs=720000 (flush pages older than 2 hours)

Insofar as I know, it just means that Rosetta will be operating mainly out of the DRAM cache rather than accessing the SSD most of the time.
I will set it back to the default and try again in a few days.


I don't think the SSD needs protecting. Only a fraction of the SSD is active at any one time. The SSD controller firmware has "wear-leveling algorithms" built into the drive that map the LOGICAL block number to a PHYSICAL block. The wear-leveling moves data around so wear is uniform AND any reliability problem is handled automatically. Any block that exhibits write or retention problems is detected by the drive's multi-bit error-detection algorithms and removed from the active pool.
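
A toy model of that logical-to-physical remapping (purely illustrative; real flash translation layers are far more sophisticated): even if the host rewrites the same logical block forever, the controller can spread the erase cycles across the whole physical pool.

/* ftl_toy.c -- toy wear-leveling model, not a real FTL. */
#include <stdio.h>

#define BLOCKS 8

int main(void) {
    int map[BLOCKS];            /* logical -> physical block map */
    int erases[BLOCKS] = {0};
    int next = 0;
    for (int i = 0; i < BLOCKS; i++) map[i] = i;

    /* The host rewrites logical block 0 over and over; the rotating
     * map spreads the erase cycles evenly over 8 physical blocks. */
    for (int w = 0; w < 24; w++) {
        map[0] = next;          /* remap logical 0 to a fresh block */
        erases[next]++;         /* that physical block absorbs the wear */
        next = (next + 1) % BLOCKS;
    }
    printf("logical 0 now maps to physical %d\n", map[0]);
    for (int p = 0; p < BLOCKS; p++)
        printf("physical block %d: %d erases\n", p, erases[p]);  /* 3 each */
    return 0;
}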

The Linux kernel guys have probably implemented the write caches properly, but I always worry about someone using memory copy code that purges the CPU caches.

I like that you will be looking at the performance.
IMO, Rosetta spends entirely too much time trying to make things "fair"; it takes too much time to explain, AND the results are unstable.