Message boards : Number crunching : Problems and Technical Issues with Rosetta@home
Author | Message |
---|---|
gbayler Send message Joined: 10 Apr 20 Posts: 14 Credit: 3,069,484 RAC: 0 |
For the Linux-users out there: I have written a Perl-script boinc_watchdog.pl that checks for "0 CPU"-tasks (tasks with a very low CPU utilization, that likely won't terminate) and whether there is at least one task executing. If it finds "0 CPU"-tasks, it aborts them, and if there is not a single task executing, it restarts the boinc-client. I run it every 30 minutes as a cron job; for me, it works quite well. I am perfectly aware that this doesn't solve the root cause of the current problems, this is merely a workaround. Still, I think it is an improvement in comparison to having to manually abort tasks or restart the PC every other day. Here you can find it: https://github.com/gbayler/boinc_watchdog Hope that it is useful for someone else too! :) Günther |
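The decision rule described in the post above can be sketched roughly as follows. This is not the code from boinc_watchdog.pl (which is written in Perl); the `Task` fields, the state labels, and the thresholds `min_cpu_ratio` and `min_elapsed` are all assumptions made for illustration, and the sketch only decides what to do, it does not talk to the BOINC client.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    state: str              # hypothetical labels: "running", "waiting", ...
    cpu_seconds: float      # CPU time consumed so far
    elapsed_seconds: float  # wall-clock time since the task started


def plan_actions(tasks, min_cpu_ratio=0.05, min_elapsed=600.0):
    """Return (names_to_abort, restart_client) for one watchdog pass.

    A running task counts as a "0 CPU" task when it has been running for at
    least min_elapsed seconds but has used less than min_cpu_ratio of that
    time on the CPU. The client is flagged for restart when no task at all
    is in the running state.
    """
    to_abort = []
    for t in tasks:
        if t.state != "running" or t.elapsed_seconds < min_elapsed:
            continue
        if t.cpu_seconds / t.elapsed_seconds < min_cpu_ratio:
            to_abort.append(t.name)
    restart_client = not any(t.state == "running" for t in tasks)
    return to_abort, restart_client


if __name__ == "__main__":
    tasks = [
        Task("stuck_python", "running", 5.0, 3600.0),     # ~0.1% CPU
        Task("healthy_python", "running", 3500.0, 3600.0),
        Task("queued", "waiting", 0.0, 0.0),
    ]
    print(plan_actions(tasks))
```

A real watchdog would still have to obtain the task list from the client and carry out the aborts and the restart; run from cron every 30 minutes, as in the post, each pass would be one call to a function like this.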
Mr P Hucker Send message Joined: 12 Aug 06 Posts: 1600 Credit: 11,707,494 RAC: 22,489 |
Swap files are for poor people without enough RAM :-) If you don't have matched pairs of RAM, things can slow down. Dual channel is a great benefit for some things but not others. Depends if they're accessing the memory a lot. I changed my Ryzen to dual channel to make my game faster. It didn't help, but half the Boinc projects sped up a lot. |
Mr P Hucker Send message Joined: 12 Aug 06 Posts: 1600 Credit: 11,707,494 RAC: 22,489 |
AFAIK Windows has a write cache unless it's a removable drive. In fact I know it does, because I've copied a huge amount of files from an SSD to a rotary drive, and the rotary drive kept being accessed long after things looked like they'd copied. Here's a cite: https://www.tenforums.com/tutorials/21904-enable-disable-disk-write-caching-windows-10-a.html Everything works better with more memory, if you're not using it you get a massive disk cache. Using all four slots does not cause stability problems. Always test your new memory with memtest before use, even quality stuff has duds. Memtest really doesn't have much to do with stability. It is mainly for errors, which might cause crashes, but more likely failures in work units. Memtest is everything to do with stability. Every single time someone has come to me with a crashing computer, I've found dodgy memory using Memtest. With large amounts of memory, especially the two-sided memory modules, you will see many more crashes using four slots. Check the forums. Not in my experience. Must be dodgy memory. I can find nothing on google suggesting 4 sticks causes problems. |
Mr P Hucker Send message Joined: 12 Aug 06 Posts: 1600 Credit: 11,707,494 RAC: 22,489 |
The write rates on the pythons are horrendous. I am getting well over 1 TB/day (almost 2 TB) when running 20 pythons, even with a huge 26 GB write cache. That is too much. I will do something else with this machine. SSDs have a longer life than rotary drives nowadays, look up the expected writes allowed to your SSD model and see how long the Pythons would take to wear it out. And caching the writes won't help anyway, since they have to be done at some point. |
Jim1348 Send message Joined: 19 Jan 06 Posts: 881 Credit: 52,257,545 RAC: 0 |
SSDs have a longer life than rotary drives nowadays, look up the expected writes allowed to your SSD model and see how long the Pythons would take to wear it out. And caching the writes won't help anyway, since they have to be done at some point. You can find out the hard way about SSD lifetimes. They usually don't publish the figures now, probably because they have been going down as the chip geometries shrink. The caching for science projects works differently than if you are copying a video file, which would all have to be transferred. But in a scientific algorithm, you usually read from a location, do a calculation, and then store the value back, either into the original location or a related one. Therefore, by storing the information in DRAM memory, most of the writes are done to the memory. You transfer to the SSD only the residual writes remaining at the end of the cache latency period. In fact, if you made the cache latency (write-delay) long enough, you would never have to transfer any of the writes to the SSD. That is effectively what a ramdisk does, but it requires a lot more memory. You would have to store the entire BOINC data folder. |
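The write-coalescing argument in the post above can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the "write delay" is counted in number of writes instead of seconds, every write is a whole 4 KiB block, and the workload (1000 writes cycling over 4 locations) is invented. It is not a model of any real cache driver.

```python
def bytes_flushed(writes, flush_every):
    """Total bytes that reach the disk behind a simple write-back cache.

    writes      -- sequence of (location, nbytes) pairs in time order
    flush_every -- number of writes between flushes (stand-in for write delay)
    """
    dirty = {}      # location -> size of the most recent write to it
    flushed = 0
    for count, (location, nbytes) in enumerate(writes, start=1):
        dirty[location] = nbytes          # a rewrite replaces the older copy
        if count % flush_every == 0:
            flushed += sum(dirty.values())
            dirty.clear()
    flushed += sum(dirty.values())        # whatever is still dirty at the end
    return flushed


if __name__ == "__main__":
    # Hypothetical workload: 1000 writes of 4096 bytes over only 4 locations.
    workload = [(i % 4, 4096) for i in range(1000)]
    for delay in (1, 100, 1000):
        print(delay, bytes_flushed(workload, delay))
```

With no delay every write reaches the disk; with longer delays only the last value per location in each window is written. The final flush is kept here because the model assumes the data must eventually be persisted; a ramdisk, as the post notes, skips even that.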
.clair. Send message Joined: 2 Jan 07 Posts: 274 Credit: 26,399,595 RAC: 0 |
By the way, I used to just put projects with high write rates on a ramdisk, and have all the writes go to main memory. Yes , some of the pythons need a kick in the compilers Amazingly I have 31 python and 7 R4,2 tasks running ATM and I have been through them to clear out two 0 cpu dud work units , it is a pain having to do that at least once a day Rosetta is using 235GB of disk space though the most I have seen was 266GB Ram use right now is 59GB total system use 71GB on `standby` and only 40MB `free` of 128GB fitted in 8 slots [crashes ?? wot crashes !! . . . . tic tic tic . . . BOOM :), SSD write bombardment by pythons , following an idea by [Greg I think] I have put in a 500GB SATA SSD Samsung 870 evo [£58 on ebay new still sealed] I will see how long it lasts , though I haven't installed the additional "Samsung Magician" apps yet to keep an eye on the write rate , trim, garbage clean up etc installed only boinc on it , to speed up python work unit loading times . it looked like the fastest kid on the block in benchmarks at low price , there is faster stuff out there at a high cost I did look at M2 NVME drives but getting them to work in win7 looks like a pain of magical incantations on the command line to load the drivers , win8.1 onwards has them in already [I checked MS forum] OK time to post this drivel on the forum and see what happens :) |
Jim1348 Send message Joined: 19 Jan 06 Posts: 881 Credit: 52,257,545 RAC: 0 |
Ram use right now is 59GB total system use 71GB on `standby` and only 40MB `free` of 128GB fitted in 8 slots [crashes ?? wot crashes !! . . . . tic tic tic . . . BOOM :), Good. I was hoping that someone would do some real-world tests. I don't want to do them myself. |
.clair. Send message Joined: 2 Jan 07 Posts: 274 Credit: 26,399,595 RAC: 0 |
Swap files are for poor people without enough RAM :-) I waz tiepin while you waz postin :-), |
Mr P Hucker Send message Joined: 12 Aug 06 Posts: 1600 Credit: 11,707,494 RAC: 22,489 |
The caching for science projects works differently than if you are copying a video file, which would all have to be transferred. But in a scientific algorithm, you usually read from a location, do a calculation, and then store the value back, either into the original location or a related one. Therefore, by storing the information in DRAM memory, most of the writes are done to the memory. You transfer to the SSD only the residual writes remaining at the end of the cache latency period. Modern SSDs take 3000 write cycles, pythons write about 2MB/s per task on a fast CPU, so if you have a 1TB SSD, that would last for 5 years even running 10 at once, by which time you'd want to buy a bigger one anyway. However hard disks hate moving heads back and forth and fall apart with that much random access. |
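The lifetime estimate in the post above works out as follows. This is a back-of-the-envelope sketch using only the figures quoted there (3000 write cycles, 2 MB/s per task, a 1 TB drive, 10 tasks); it assumes decimal units and ignores write amplification, so a real drive could wear out sooner. The function name is made up for the example.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def ssd_lifetime_years(capacity_tb, write_cycles, mb_per_sec_per_task, tasks):
    """Years until the rated write endurance is used up at a constant rate."""
    endurance_mb = capacity_tb * 1_000_000 * write_cycles  # total MB writable
    write_rate = mb_per_sec_per_task * tasks               # MB/s for all tasks
    return endurance_mb / write_rate / SECONDS_PER_YEAR


if __name__ == "__main__":
    # Figures from the post: 1 TB drive, 3000 cycles, 2 MB/s per task.
    print(round(ssd_lifetime_years(1, 3000, 2, 10), 2))  # about 4.76 years
    print(round(ssd_lifetime_years(1, 3000, 2, 20), 2))  # about 2.38 years
```

Ten tasks give a little under five years, which matches the figure in the post; doubling the task count halves it.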
Mr P Hucker Send message Joined: 12 Aug 06 Posts: 1600 Credit: 11,707,494 RAC: 22,489 |
Yes , some of the pythons need a kick in the compilers Standby memory? Is that disk cache? In windows, all my RAM is always in use, but the disk cache just takes whatever is left. You can ignore that. HWInfo (or many other free utilities) will show you the disk SMART data so you can see how much life is left. The drive reports % life left. Why are you still on Windows 7? 10 was free. NVME is about 8 times faster. If your MB doesn't have a slot for it, you can get cards to take them. |
.clair. Send message Joined: 2 Jan 07 Posts: 274 Credit: 26,399,595 RAC: 0 |
Standby memory? Is that disk cache? In windows, all my RAM is always in use, but the disk cache just takes whatever is left. You can ignore that. It probably is `Disk Cache` In winders `Resource Monitor` in the `Memory` tab , an MS app built in to W7 etc if you can find it, I keep it `pinned` to the taskbar , it also does Disk , Network , CPU It is using Win 7 Ultimate, I don't know if that version [ that the licence limit will work with twin CPUs sockets] was free , I know from win 8 onwards the max memory that is ok is now something like 192GB unlike win 7 home that is limited to 16GB. No M2 slot onboard , though I have seen PCIe-M2 adapters on Ebay |
Jim1348 Send message Joined: 19 Jan 06 Posts: 881 Credit: 52,257,545 RAC: 0 |
Modern SSDs take 3000 write cycles, pythons write about 2MB/s per task on a fast CPU, so if you have a 1TB SSD, that would last for 5 years even running 10 at once, by which time you'd want to buy a bigger one anyway. However hard disks hate moving heads back and forth and fall apart with that much random access. On my Ryzen 3900X, I was seeing the OS write 4 TB/day (for 20 work units), or 46 MB/sec if I have my numbers right. So you can do it if you limit the number of tasks to only a few at a time. And I think that Linux writes a bit less than Windows from what I can see, though it has other problems. You can make it work if you are careful. |
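The unit conversion in the post above can be checked with a couple of one-line helpers. This is a small sketch assuming decimal units (1 TB = 1,000,000 MB); the function names are invented for the example.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def tb_per_day_to_mb_per_sec(tb_per_day):
    """Convert a daily write volume in TB to an average rate in MB/s."""
    return tb_per_day * 1_000_000 / SECONDS_PER_DAY


def mb_per_sec_to_tb_per_day(mb_per_sec):
    """Convert an average write rate in MB/s to a daily volume in TB."""
    return mb_per_sec * SECONDS_PER_DAY / 1_000_000


if __name__ == "__main__":
    total = tb_per_day_to_mb_per_sec(4)          # 4 TB/day for the whole machine
    print(round(total, 1))                       # about 46.3 MB/s
    print(round(total / 20, 2))                  # about 2.31 MB/s per work unit
    print(round(mb_per_sec_to_tb_per_day(2 * 20), 2))  # 2 MB/s x 20 tasks in TB/day
```

So 4 TB/day over 20 work units is roughly 2.3 MB/s each, close to the 2 MB/s per task estimated earlier in the thread.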
Mr P Hucker Send message Joined: 12 Aug 06 Posts: 1600 Credit: 11,707,494 RAC: 22,489 |
It is using Win 7 Ultimate, I don't know if that version [ that the licence limit will work with twin CPUs sockets] was free , I know from win 8 onwards the max memory that is ok is now something like 192GB I had my machines some on Win 7 home and some win 7 ultimate. They all got an upgrade to Win 10 home or win 10 pro for free. But I don't think they still do it, unless you fiddle with the settings and say you're disabled! |
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1670 Credit: 17,461,273 RAC: 24,705 |
EDIT: The only thing I see is this. It's still system RAM, just a very small amount. Hence the need for 3rd party software if you want to make more use of your RAM for system caching. Grant Darwin NT |
Mr P Hucker Send message Joined: 12 Aug 06 Posts: 1600 Credit: 11,707,494 RAC: 22,489 |
That sounds right, if you multiply my numbers up I come to 3.5TB a day for 20 work units. I was just guessing an average from watching the task manager. It'll differ depending on CPU speed, I was watching an i5 8600K. Modern SSDs take 3000 write cycles, pythons write about 2MB/s per task on a fast CPU, so if you have a 1TB SSD, that would last for 5 years even running 10 at once, by which time you'd want to buy a bigger one anyway. However hard disks hate moving heads back and forth and fall apart with that much random access. So you can do it if you limit the number of tasks to only a few at a time. And I think that Linux writes a bit less than Windows from what I can see, though it has other problems. I see an SSD as a consumable (like GPUs that wear out running 24/7). I get them dirt cheap second hand and expect to change them when they're too small or wear out. Most of my Boinc machines are on hard disks because I had loads kicking about not big enough for other uses. As they break I get SSDs. My main computer that gets thrashed all the time has 50% life left on the SSD, but it's an ancient model with old technology that didn't last so long, and it's getting too small anyway. |
Mr P Hucker Send message Joined: 12 Aug 06 Posts: 1600 Credit: 11,707,494 RAC: 22,489 |
EDIT: The only thing I see is this. It's still system RAM, just a very small amount. Hence the need for 3rd party software if you want to make more use of your RAM for system caching. My main computer currently reads: Memory in use: 22.1GB Cache memory: 38.4GB That's a big cache. Ok, reading up on it, you get 10% RAM disk write cache. So 6.4GB for me. Surely that's enough. Windows server will use 50% RAM. |
Jim1348 Send message Joined: 19 Jan 06 Posts: 881 Credit: 52,257,545 RAC: 0 |
EDIT: The only thing I see is this. It's still system RAM, just a very small amount. Hence the need for 3rd party software if you want to make more use of your RAM for system caching. Right. That is the point. You need a lot more. But I think .clair. mentioned Samsung Magician. I have used it when I was on Windows, and it includes around a GB, or maybe less, but could be enough to save an SSD if you did not run too many work units. |
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1670 Credit: 17,461,273 RAC: 24,705 |
But I think they really need to develop the pythons a bit and call back when they are ready. To be honest, I would classify the present Python work as being at Alpha test level of development- they are still not even good enough for Beta testing. They are nowhere near being ready for actual deployment IMHO. Excessive system requirements. Errors that result in systems being black listed from getting work- but not even advising those systems of what has happened, let alone informing them of what they need to do to get work again. And worst of all- so many Tasks that just don't process & sit there taking up disk & RAM, blocking possibly OK Tasks from being downloaded & worked on requiring manual intervention to remove them. Not to mention the manual intervention often needed to clean up the VirtualBox VM environments. Yep, Alpha software- not yet remotely ready for live deployment. Grant Darwin NT |
Jim1348 Send message Joined: 19 Jan 06 Posts: 881 Credit: 52,257,545 RAC: 0 |
I think you have accurately portrayed it. |
Mr P Hucker Send message Joined: 12 Aug 06 Posts: 1600 Credit: 11,707,494 RAC: 22,489 |
EDIT: The only thing I see is this. It's still system RAM, just a very small amount. Hence the need for 3rd party software if you want to make more use of your RAM for system caching. My Windows 10 is using 10% of my RAM = 6.5GB for a write cache. And if you tick the box to turn off write cache buffer flushing, it helps even more. Right click the drive, properties, hardware, properties, change settings, policies, tick "turn off buffer flushing". |
©2024 University of Washington
https://www.bakerlab.org