Message boards : Number crunching : Quite a few signal 11 errors on a Linux host - what does it mean?
Author | Message |
---|---|
wolfman1360 Joined: 18 Feb 17 Posts: 72 Credit: 18,450,036 RAC: 0 |
Hello. I was browsing through my machines on the site today and came across this. I'm not sure what the logs mean. https://boinc.bakerlab.org/rosetta/results.php?hostid=4295358&offset=0&show_names=0&state=6&appid= Log for a task: <message> process got signal 11 </message> <stderr_txt> command: ../../projects/boinc.bakerlab.org_rosetta/rosetta_4.20_x86_64-pc-linux-gnu -run:protocol jd2_scripting -parser:protocol jhr_boinc_v4.xml @flags -in:file:silent Junior_HalfRoid_design5_COVID-19_SAVE_ALL_OUT_IGNORE_THE_REST_1dy7ly8k.silent -in:file:silent_struct_type binary -silent_gz -mute all -out:file:silent_struct_type binary -out:file:silent default.out -in:file:boinc_wu_zip Junior_HalfRoid_design5_COVID-19_SAVE_ALL_OUT_IGNORE_THE_REST_1dy7ly8k.zip @Junior_HalfRoid_design5_COVID-19_SAVE_ALL_OUT_IGNORE_THE_REST_1dy7ly8k.flags -nstruct 10000 -cpu_run_time 28800 -boinc:max_nstruct 20000 -checkpoint_interval 120 -database minirosetta_database -in::file::zip minirosetta_database.zip -boinc::watchdog -boinc::cpu_run_timeout 36000 -run::rng mt19937 -constant_seed -jran 1075497 Using database: database_357d5d93529_n_methyl/minirosetta_database </stderr_txt> Is this a hardware error? Temperatures are fine and the machine doesn't seem to be struggling for RAM. Or is it running a version of BOINC that is too old? I tried to update using ppa:costamagnagianfranco/boinc, but received an error. I might try again if this persists. Any help is appreciated. I hate wasting tasks and resources like this. |
Tom M Joined: 20 Jun 17 Posts: 87 Credit: 15,166,437 RAC: 39,077 |
Hello. Which computer? Help, my tagline is missing..... Help, my tagline is......... Help, m........ Hel..... |
William Albert Joined: 22 Mar 20 Posts: 23 Credit: 1,069,070 RAC: 99 |
Looks like this is your host in question: https://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=4295358 Your computer is running an AMD Opteron 6128 HE, which is based on the AMD K10 architecture. AMD K10-based processors have a known issue with Rosetta on Linux: the application assumes that the SSSE3 instruction set is available, and K10 does not support it. See this bug report for more details. Hopefully this will be fixed soon. Until then, I would recommend either running Windows (which doesn't have this problem) or using this machine for other projects that you might be interested in. |
wolfman1360 Joined: 18 Feb 17 Posts: 72 Credit: 18,450,036 RAC: 0 |
Looks like this is your host in question: https://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=4295358 Thank you so much. This helps a great deal. I'll yank it off here and perhaps give WCG or TN grid a go. Or maybe I should just retire it. |
spRocket Joined: 23 Mar 20 Posts: 22 Credit: 3,008,018 RAC: 0 |
Both my homebrew NAS and an old machine I unretired turned out to be bitten by this bug as well. I've put those two onto WCG, and they've been happily crunching away. |
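
For anyone who wants to check their own Linux host against the explanation in this thread, here is a minimal sketch in Python. It does two things: it looks up the symbolic name of signal 11 (SIGSEGV on Linux), and it reads the CPU feature flags from `/proc/cpuinfo` to see whether `ssse3` is listed. The function names `signal_name`, `parse_cpu_flags`, and `has_ssse3` are hypothetical helpers made up for this example; they are not part of BOINC or Rosetta. The sketch assumes an x86 Linux system, where the kernel reports features on a line starting with `flags`.

```python
import signal
from pathlib import Path


def signal_name(number: int) -> str:
    """Return the symbolic name for a signal number on this platform."""
    try:
        return signal.Signals(number).name
    except ValueError:
        return f"unknown signal {number}"


def parse_cpu_flags(cpuinfo_text: str) -> set[str]:
    """Collect the feature flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    # No 'flags' line found (for example, a non-x86 or non-Linux system).
    return set()


def has_ssse3(cpuinfo_text: str) -> bool:
    """True if the kernel lists 'ssse3' (note: SSE3 is listed as 'pni', a different flag)."""
    return "ssse3" in parse_cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    print("signal 11 is", signal_name(11))
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        print("SSSE3 supported:", has_ssse3(cpuinfo.read_text()))
    else:
        print("/proc/cpuinfo not found; this check only works on Linux")
```

If the script reports that SSSE3 is not supported, the host probably falls into the same category as the K10 machines discussed above. This only inspects what the kernel reports; it does not confirm that a particular failed task crashed for this reason.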