Message boards : Number crunching : Problems on old AMD processors (pre-Bulldozer)
Author | Message |
---|---|
spRocket Joined: 23 Mar 20 Posts: 22 Credit: 3,008,018 RAC: 0 |
I'm finding that I get signal 11 issues with a couple of older AMD processors, an Athlon II X4 630 and a Phenom II X2 550 Black Edition (the latter running with two unlocked cores). Both of these systems are running on ASUS M4A785-M motherboards with 4 GB of ECC RAM. It seems that Rosetta Mini works OK, but the full Rosetta consistently gets errors on tasks. An example from Task 1133622372: <core_client_version>7.9.3</core_client_version> <![CDATA[ <message> process got signal 11</message> <stderr_txt> command: ../../projects/boinc.bakerlab.org_rosetta/rosetta_4.08_x86_64-pc-linux-gnu -run:protocol jd2_scripting -parser:protocol jhr_boinc.xml @flags -in:file:silent 7hp5zr7e_jhr_design1_COVID-19.silent -in:file:silent_struct_type binary -silent_gz -mute all -out:file:silent_struct_type binary -out:file:silent default.out -in:file:boinc_wu_zip 7hp5zr7e_jhr_design1_COVID-19.zip -nstruct 10000 -cpu_run_time 28800 -watchdog -boinc:max_nstruct 600 -checkpoint_interval 120 -database minirosetta_database -in::file::zip minirosetta_database.zip -boinc::watchdog -run::rng mt19937 -constant_seed -jran 3696211 Starting watchdog... Watchdog active. </stderr_txt> ]]> Both of these CPUs are shown as "Family 16" in the CPU type listing. In the meantime, I've shifted both of these systems over to World Community Grid, which is working as it should. On the other hand, my Ryzen 7/1700 is happily devouring Rosetta tasks, as is an old ThinkPad with an i7 L 640. |
Mod.Sense Volunteer moderator Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
It looks like that task ran on your system with 4 CPU cores and 4 GB of memory. It seems the COVID tasks are consuming more memory than has been typical for other work. I believe you will find that running both WCG and R@h with the same resource share will leave you with enough memory to still run some R@h work if you wish. This tends to run half the cores on WCG and half on R@h, and so runs a small-memory WCG task alongside a large-memory R@h task. Rosetta Moderator: Mod.Sense |
spRocket Joined: 23 Mar 20 Posts: 22 Credit: 3,008,018 RAC: 0 |
I think I'll just give one of my other older machines a cleaning. I tried it earlier and I started hearing a thermal warning tone from its speaker - but on the other hand, it has 8 GB of RAM installed. |
William Albert Joined: 22 Mar 20 Posts: 23 Credit: 1,069,070 RAC: 99 |
I'm also having some odd issues with a pre-Bulldozer AMD machine. The problem machine is running an AMD Turion II Neo N40L (essentially a low-power K10 chip). At the time of this writing, out of the 120 WUs that have been issued to it, 93 have failed with "process got signal 11". On a positive note, the WUs that fail do so quickly, so the machine isn't wasting too much time processing WUs that end up getting thrown away. However, it also points away from overheating (or some other type of environmental factor), and toward some type of compatibility issue. I also have another machine running an AMD Phenom II X6 1090T, which is also a K10 chip, and it hasn't had failed WUs yet. The one thing both of our failing machines have in common is that they're running Ubuntu 18.04, whereas my working AMD machine is running Windows 10. Perhaps there's some type of bug or compatibility issue with Rosetta's Linux workers? |
Mod.Sense Volunteer moderator Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
I think the AMD running Linux issue is now understood, see this post for details on AMD getting signal 11 failures. A fix will be tested on Ralph soon. Rosetta Moderator: Mod.Sense |
Tom M Joined: 20 Jun 17 Posts: 87 Credit: 15,166,437 RAC: 39,077 |
I have run Rosetta under Ubuntu 18.04 with no problems. I haven't fired up an A6 laptop that I have, so I can't tell whether it is a problem on hardware that old either. Tom Help, my tagline is missing..... Help, my tagline is......... Help, m........ Hel..... |
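
For readers unsure what the "process got signal 11" line in the quoted task output means: on Linux, signal 11 is SIGSEGV (a segmentation fault, i.e. the process touched memory it was not allowed to access). The following is only a minimal sketch of how one might pull the signal number out of a task's stderr text and name it. The function name `describe_signal` is hypothetical and not part of BOINC or Rosetta, and signal numbering is platform-dependent, so the name is looked up on whatever machine runs the script.

```python
import re
import signal


def describe_signal(stderr_text):
    """Return (number, name) for the first 'process got signal N' match, or None.

    Hypothetical helper for illustration; not a BOINC or Rosetta API.
    """
    match = re.search(r"process got signal (\d+)", stderr_text)
    if match is None:
        return None
    number = int(match.group(1))
    try:
        # Look up the symbolic name as defined on the local platform.
        name = signal.Signals(number).name
    except ValueError:
        name = "unknown"
    return number, name


# Sample line copied from the task output quoted in the first post.
sample = "<message> process got signal 11</message>"
print(describe_signal(sample))
```

Naming the signal only says how the process died, not why; a segmentation fault is consistent with a software incompatibility as well as with faulty hardware.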
©2024 University of Washington
https://www.bakerlab.org