Error while computing (Ubuntu 18.04 LTS, Boinc version 7.9.3)

Message boards : Number crunching : Error while computing (Ubuntu 18.04 LTS, Boinc version 7.9.3)

sergioclr

Joined: 16 Jan 13
Posts: 3
Credit: 12,757
RAC: 0
Message 89539 - Posted: 12 Sep 2018, 12:51:32 UTC

"Error while computing" on 4 tasks. Please help. Thanks in advance.

Computer
Intel(R) Core(TM)2 Duo CPU E8500 @ 3.16GHz [Family 6 Model 23 Stepping 10]
AMD AMD TURKS (DRM 2.50.0 / 4.15.0-33-generic, LLVM 6.0.0) (2048MB) OpenCL: 1.1
Ubuntu 18.04.1 LTS [4.15.0-33-generic|libc 2.27 (Ubuntu GLIBC 2.27-3ubuntu1)]
BOINC version 7.9.3

Tasks that failed (all of them)
PF16401.4_nojmps_aivan_SAVE_ALL_OUT_03_09_686649_6570_1
Stderr output

<core_client_version>7.9.3</core_client_version>
<![CDATA[
<message>
process exited with code 193 (0xc1, -63)</message>
<stderr_txt>
command: ../../projects/boinc.bakerlab.org_rosetta/rosetta_4.07_x86_64-pc-linux-gnu @PF16401.4.nojmps.flags -in:file:boinc_wu_zip PF16401.4.nojmps.zip -nstruct 10000 -cpu_run_time 28800 -watchdog -boinc:max_nstruct 600 -checkpoint_interval 120 -mute all -database minirosetta_database -in::file::zip minirosetta_database.zip -boinc::watchdog -run::rng mt19937 -constant_seed -jran 1171180
rosetta_4.07_x86_64-pc-linux-gnu: loadlocale.c:129: _nl_intern_locale_data: Assertion `cnt < (sizeof (_nl_value_type_LC_TIME) / sizeof (_nl_value_type_LC_TIME[0]))' failed.
SIGABRT: abort called
Stack trace (17 frames):
[0x5efead0]
[0x5ffe380]
[0x607e517]
[0x60083a8]
[0x6002794]
[0x60027ee]
[0x6000f73]
[0x6001996]
[0x60007df]
[0x600020e]
[0x5f1d10e]
[0x5f1d73e]
[0x5f1707a]
[0x5f17202]
[0x412631]
[0x5fff8cc]
[0x610b97]

Exiting...

</stderr_txt>
]]>

PF14092.5_jmps_aivan_SAVE_ALL_OUT_03_09_686650_38888_0

Stderr output
<core_client_version>7.9.3</core_client_version>
<![CDATA[
<message>
process exited with code 193 (0xc1, -63)</message>
<stderr_txt>
command: ../../projects/boinc.bakerlab.org_rosetta/rosetta_4.07_x86_64-pc-linux-gnu @PF14092.5.jmps.flags -in:file:boinc_wu_zip PF14092.5.jmps.zip -nstruct 10000 -cpu_run_time 28800 -watchdog -boinc:max_nstruct 600 -checkpoint_interval 120 -mute all -database minirosetta_database -in::file::zip minirosetta_database.zip -boinc::watchdog -run::rng mt19937 -constant_seed -jran 3961113
rosetta_4.07_x86_64-pc-linux-gnu: loadlocale.c:129: _nl_intern_locale_data: Assertion `cnt < (sizeof (_nl_value_type_LC_TIME) / sizeof (_nl_value_type_LC_TIME[0]))' failed.
SIGABRT: abort called
Stack trace (17 frames):
[0x5efead0]
[0x5ffe380]
[0x607e517]
[0x60083a8]
[0x6002794]
[0x60027ee]
[0x6000f73]
[0x6001996]
[0x60007df]
[0x600020e]
[0x5f1d10e]
[0x5f1d73e]
[0x5f1707a]
[0x5f17202]
[0x412631]
[0x5fff8cc]
[0x610b97]

Exiting...

</stderr_txt>
]]>

PF11937.7_jmps_aivan_SAVE_ALL_OUT_03_09_686692_2863_0
Stderr output

<core_client_version>7.9.3</core_client_version>
<![CDATA[
<message>
couldn't start app: Input file minirosetta_database_1a38360_n_methyl.zip missing or invalid: RSA key check failed for file</message>
]]>

PF06834.10_jmps_aivan_SAVE_ALL_OUT_03_09_686470_11068_0

Stderr output
<core_client_version>7.9.3</core_client_version>
<![CDATA[
<message>
app_version download error: couldn't get input files:
<file_xfer_error>
<file_name>minirosetta_database_1a38360_n_methyl.zip</file_name>
<error_code>-120 (RSA key check failed for file)</error_code>
<error_message>signature verification failed</error_message>
</file_xfer_error>
</message>
]]>
rjs5
Joined: 22 Nov 10
Posts: 273
Credit: 21,465,703
RAC: 16,826
Message 89540 - Posted: 12 Sep 2018, 13:17:59 UTC - in response to Message 89539.  
Last modified: 12 Sep 2018, 13:21:27 UTC

The key information is your Ubuntu version AND the error message.

"rosetta_4.07_x86_64-pc-linux-gnu: loadlocale.c:129: _nl_intern_locale_data: Assertion `cnt < (sizeof (_nl_value_type_LC_TIME) / sizeof (_nl_value_type_LC_TIME[0]))' failed."

Search for "glibc" on the message boards and change the locale settings for the BOINC user.

from: Trotador

Rosetta 4.07 was linked statically. Ubuntu 18.04 moved to GLIBC 2.27, which made some changes around the "locale" settings calls that introduce an error. The solution is to set the "locale" setting properly for the BOINC user. You can search the forum for "glibc" and find a number of discussions. I would REALLY like the Rosetta developers to publish the preferred instructions.


Instead of changing language options globally, I would suggest limiting changes to only what is needed, in this case the BOINC client.

For those using the repository BOINC package on a systemd distro, you can edit the boinc-client.service file or add an override to the service. The override would look something like this:

[Service]
Environment=LC_ALL=C



LC_ALL overrides all the other language settings. Put the override file in /etc/systemd/system/boinc-client.service.d/locale.override.conf, run sudo systemctl daemon-reload so systemd picks up the new file, and restart the client with sudo systemctl restart boinc-client. If you change the distro-supplied service file instead, find boinc-client.service and add the Environment line to the [Service] section; note that changes to that file will be overwritten whenever the package is updated.
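The override steps can be sketched as a small shell function. This is only a minimal sketch, assuming a systemd distro where the unit is named boinc-client.service; the function name write_locale_override and its optional directory argument are hypothetical, added so the function can be tried against a scratch directory before touching /etc.

```shell
#!/bin/sh
# Write a drop-in that sets LC_ALL=C for the BOINC client only.
# $1: systemd configuration directory (assumed default: /etc/systemd/system).
write_locale_override() {
    base="${1:-/etc/systemd/system}"
    dropin_dir="$base/boinc-client.service.d"
    # The .d directory normally does not exist until it is created here.
    mkdir -p "$dropin_dir" || return 1
    # systemd only reads drop-in files whose names end in .conf
    printf '[Service]\nEnvironment=LC_ALL=C\n' > "$dropin_dir/locale.override.conf"
}

# Typical use, as root:
#   write_locale_override
#   systemctl daemon-reload
#   systemctl restart boinc-client
```

Running it first with a temporary directory as the argument shows exactly what would be written without changing the system.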

For those not using the distro package or not using systemd, make a similar change to whatever startup script you use for the client.
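For a client started from a script rather than by systemd, the similar change might look like the sketch below. It assumes the client binary is called boinc and sits in the BOINC folder; the function name run_with_c_locale and the path are hypothetical.

```shell
#!/bin/sh
# Run a command with LC_ALL=C set for that command only.
# The calling shell's own locale settings are left untouched.
run_with_c_locale() {
    env LC_ALL=C "$@"
}

# Hypothetical use in a startup script, assuming the client binary is ./boinc:
#   cd /path/to/BOINC && run_with_c_locale ./boinc
```

Using env keeps the change scoped to the launched process, which is the same idea as limiting the setting to the BOINC service instead of the whole system.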



I installed BOINC with a .sh file so it is completely within the BOINC folder, what would I have to modify to circumvent the locale error?



===============================

This was posted by henfredemars in this thread:
https://boinc.bakerlab.org/rosetta/forum_thread.php?id=12242

2) Message boards : Number crunching : Rosetta 4.0+ (Message 89466)
Posted 11 days ago by henfredemars
Post: It took me hours to find a fix for this! I am so glad that others have found this problem and found a solution. Setting the locale to C using systemd's service file worked perfectly.

Please don't statically link to glibc. That's just a bad idea.

Hint: Ubuntu users, you can use

systemctl show boinc-client.service | grep Path

...to find the service file.
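
The grep above can match several properties whose names contain "Path". A minimal sketch that keeps only the unit file location, assuming the FragmentPath property that systemctl show prints for a loaded unit (the function name fragment_path is hypothetical):

```shell
#!/bin/sh
# Read "Key=value" lines (the format systemctl show prints) on stdin
# and print only the value of FragmentPath, the unit file in use.
fragment_path() {
    sed -n 's/^FragmentPath=//p'
}

# Typical use:
#   systemctl show boinc-client.service | fragment_path
```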
sergioclr
Joined: 16 Jan 13
Posts: 3
Credit: 12,757
RAC: 0
Message 89541 - Posted: 12 Sep 2018, 16:56:56 UTC - in response to Message 89540.  

Thank you for your prompt answer.

I edited /lib/systemd/system/boinc-client.service to include the Environment parameter.

[Unit]
Description=Berkeley Open Infrastructure Network Computing Client
Documentation=man:boinc(1)
After=network-online.target

[Service]
Environment=LC_ALL=C
ProtectHome=true
Type=simple
Nice=10
User=boinc
WorkingDirectory=/var/lib/boinc
ExecStart=/usr/bin/boinc
ExecStop=/usr/bin/boinccmd --quit
ExecReload=/usr/bin/boinccmd --read_cc_config
ExecStopPost=/bin/rm -f lockfile
IOSchedulingClass=idle
.
.

I stopped the BOINC client, rebooted the computer, and requested 2 new tasks to check whether the problem (Error while computing) still occurs once they finish.

Status: waiting for the tasks to finish execution.
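
One way to confirm that the setting reached the running client without waiting for tasks to finish is to read the process environment from /proc. This is a sketch for Linux only; the function name proc_lc_all is hypothetical, and it assumes it is run as root or as the user that owns the process, since other users cannot read that file.

```shell
#!/bin/sh
# Print the LC_ALL entry from a process's startup environment (Linux /proc).
# $1: process ID. Prints nothing if LC_ALL was not set for that process.
proc_lc_all() {
    tr '\0' '\n' < "/proc/$1/environ" | grep '^LC_ALL='
}

# Hypothetical use, assuming a single client process named boinc:
#   proc_lc_all "$(pidof boinc)"
```

If the Environment line took effect, the output should be the single line LC_ALL=C.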
sergioclr
Joined: 16 Jan 13
Posts: 3
Credit: 12,757
RAC: 0
Message 89548 - Posted: 13 Sep 2018, 18:41:26 UTC

So far, 2 tasks have been successfully processed and validated since I added the Environment parameter as described in my previous post.

Tasks:
fadh_9_11_db_gre_1xlattparam_1_rpce1_atpacstincr-run_0_0_X_h17_l3_h16_l3-start_0006__9.9_10_fixedLoop_0003_design_m8_fragments_abinitio_SAVE_ALL_OUT_687026_180_0
foldit_2005364_0001_fold_and_dock_SAVE_ALL_OUT_686932_487_1


Question: did someone, with the same problem (Error while computing), apply the same solution?

I appreciate any answers or comments.

Tks, Sergio.
Jim1348
Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 89549 - Posted: 13 Sep 2018, 19:56:10 UTC - in response to Message 89548.  

Question: did someone, with the same problem (Error while computing), apply the same solution?

Sure. I (and everybody else) have done it ever since rjs5/Juha came up with that solution.
https://boinc.bakerlab.org/rosetta/forum_thread.php?id=12242&postid=88951#88951

It is strange that we have to do it, but apparently curing it would introduce other problems now. Rosetta is in a bit of a hole.
rjs5
Joined: 22 Nov 10
Posts: 273
Credit: 21,465,703
RAC: 16,826
Message 89552 - Posted: 14 Sep 2018, 0:29:15 UTC - in response to Message 89549.  

Question: did someone, with the same problem (Error while computing), apply the same solution?

Sure. I (and everybody else) have done it ever since rjs5/Juha came up with that solution.
https://boinc.bakerlab.org/rosetta/forum_thread.php?id=12242&postid=88951#88951

It is strange that we have to do it, but apparently curing it would introduce other problems now. Rosetta is in a bit of a hole.



What will make your head explode is that .... ALL the Rosetta developers have to do to make their code work is add ONE call at the beginning of execution that sets the LC_ALL parameter. If they did that, every OLD and NEW GLIBC distribution would work automatically. They are paying for NOT fixing it through the overhead of sending out these jobs and managing the failed results. Crunchers are paying in network bandwidth, disk space ....

The one-line code change to eliminate the problem? The putenv call will likely work and make Rosetta's execution more robust in both environments. Of course, there are other ways, but this one call (a C library function) should change behavior only for that one WU, so nothing else on the system is affected. Others will correct me if I am wrong, BUT they will have a better solution.
8-)
putenv("LC_ALL=C");
Paul
Joined: 29 Oct 05
Posts: 193
Credit: 65,754,624
RAC: 1,396
Message 89581 - Posted: 19 Sep 2018, 3:54:52 UTC - in response to Message 89552.  

I hope someone can fix this soon. I tried to implement the fix listed above but my systems don’t have that directory structure. I have /etc/system/systemd but I don’t have the boinc-client.system.d folder.

If the developers can fix this with 1 line of code I hope they will do it soon. I have several big systems that refuse to process 4.07 jobs because they have the 2.27 version of glibc.
Thx!

Paul

adrianxw
Joined: 18 Sep 05
Posts: 653
Credit: 11,663,494
RAC: 723
Message 89582 - Posted: 19 Sep 2018, 8:05:23 UTC

I've had three work units fail today, exit code 1, Windows 8.1 BOINC 7.12.1, all after just a few seconds.
rjs5
Joined: 22 Nov 10
Posts: 273
Credit: 21,465,703
RAC: 16,826
Message 89587 - Posted: 19 Sep 2018, 11:01:32 UTC - in response to Message 89581.  

I hope someone can fix this soon. I tried to implement the fix listed above but my systems don’t have that directory structure. I have /etc/system/systemd but I don’t have the boinc-client.system.d folder.

If the developers can fix this with 1 line of code I hope they will do it soon. I have several big systems that refuse to process 4.07 jobs because they have the 2.27 version of glibc.



The one line fix I gave them not only appears to fix the problem that glibc 2.27 users see today, BUT should also prevent the reverse when Rosetta starts building on 18.04. The Rosetta systems are all on Ubuntu 16.04 today.

The file on my Virtualbox image of Ubuntu 18.04 and my Fedora 27 hardware is located one level deeper in the "multi-user.target.wants" directory. Seems like the systemctl command automatically leaves out that one "multi-user" directory in the path.


find /etc | grep boinc-client.service
/etc/systemd/system/multi-user.target.wants/boinc-client.service
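
The entry under multi-user.target.wants is a symbolic link that systemd creates when a service is enabled; it points at the actual unit file. A minimal sketch for resolving it, assuming GNU readlink is available (the function name real_unit_file is hypothetical):

```shell
#!/bin/sh
# Print the real file behind a path, following any symbolic links.
# Entries under multi-user.target.wants are links created when a
# service is enabled; they point at the actual unit file.
real_unit_file() {
    readlink -f "$1"
}

# Typical use:
#   real_unit_file /etc/systemd/system/multi-user.target.wants/boinc-client.service
```

The boinc-client.service.d override directory is a separate thing and is normally absent until someone creates it, which would explain not finding it on a fresh install.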


If you get a clean simple set of instructions that work for you, it would be nice to get them documented in a clean, new thread. You can message me if you think I can help.

1. I isolated the problem to a static link with glibc 2.26 when running on a system with glibc 2.27.
2. My first sledge-hammer solution was to set LC_ALL=C system wide which causes some problems with some applications like "terminal".
3. The solution was refined to remove my system-wide sledge-hammer patch and locate the specific boinc user file to modify it.
Paul
Joined: 29 Oct 05
Posts: 193
Credit: 65,754,624
RAC: 1,396
Message 89633 - Posted: 25 Sep 2018, 21:13:23 UTC - in response to Message 89587.  

That fixed it! I have several successful 4.07 jobs completed & validated.

Thx
Thx!

Paul

[VENETO] boboviz
Joined: 1 Dec 05
Posts: 1866
Credit: 8,186,159
RAC: 7,029
Message 89650 - Posted: 27 Sep 2018, 5:36:20 UTC

Something new (from Ralph):
Rosetta beta 64 bit linux version 4.08 released for testing.
This update includes a fix suggested by rjs5 for the latest linux versions that use glibc 2.27+. Thank you rjs5!


I hope they also use rjs5's other suggestions to improve the code...
Well done Rjs5!!