Message boards : Number crunching : High error rate
Author | Message |
---|---|
USTL-FIL (Lille Fr) Joined: 20 Jan 10 Posts: 5 Credit: 244,379,048 RAC: 0 |
Hello, I noticed a high error rate today on my Linux hosts: https://boinc.bakerlab.org/rosetta/results.php?userid=367342&offset=0&show_names=0&state=6&appid= Some errors look like this: "<core_client_version>7.9.3</core_client_version> <![CDATA[ <message> finish file present too long</message> <stderr_txt> command: ../../projects/boinc.bakerlab.org_rosetta/rosetta_4.20_x86_64-pc-linux-gnu -run:protocol jd2_scripting -parser:protocol predictor_v11_boinc--fuse--covid_at3_design_boinc_v1.xml @flags_covid_site3 -in:file:silent Mini_Protein_binds_COVID-19_boinc_site3_8_SAVE_ALL_OUT_IGNORE_THE_REST_4wp2gq2j.silent -in:file:silent_struct_type binary -silent_gz -mute all -silent_read_through_errors true -out:file:silent_struct_type binary -out:file:silent default.out -in:file:boinc_wu_zip Mini_Protein_binds_COVID-19_boinc_site3_8_SAVE_ALL_OUT_IGNORE_THE_REST_4wp2gq2j.zip @Mini_Protein_binds_COVID-19_boinc_site3_8_SAVE_ALL_OUT_IGNORE_THE_REST_4wp2gq2j.flags -nstruct 10000 -cpu_run_time 28800 -boinc:max_nstruct 20000 -checkpoint_interval 120 -database minirosetta_database -in::file::zip minirosetta_database.zip -boinc::watchdog -boinc::cpu_run_timeout 36000 -run::rng mt19937 -constant_seed -jran 1497416 Using database: database_357d5d93529_n_methyl/minirosetta_database ====================================================== DONE :: 35 starting structures 7153.19 cpu seconds This process generated 35 decoys from 35 attempts ====================================================== BOINC :: WS_max 3.47357e+08 18:23:28 (27269): called boinc_finish(0) </stderr_txt> ]]>" and others look like this: "<core_client_version>7.9.3</core_client_version> <![CDATA[ <message> process got signal 11</message> <stderr_txt> 000067bd783 *** *** Error in `../../projects/boinc.bakerlab.org_rosetta/rosetta_4.20_x86_64-pc-linux-gnu': free(): invalid pointer: 0x00000000067bd783 *** *** Error in `../../projects/boinc.bakerlab.org_rosetta/rosetta_4.20_x86_64-pc-linux-gnu': free(): invalid pointer: 0x00000000067bd783 *** *** Error in `../../projects/boinc.bakerlab.org_rosetta/rosetta_4.20_x86_64-pc-linux-gnu': free(): invalid pointer: 0x00000000067bd783 *** *** Error in `../../projects/boinc.bakerlab.org_rosetta/rosetta_4.20_x86_64-pc-linux-gnu': free(): invalid pointer: 0x00000000067bd783 *** . . . " Thanks |
mikey Joined: 5 Jan 06 Posts: 1895 Credit: 9,150,132 RAC: 4,252 |
Hello, A Signal 11 error means: "signal 11, or translated a segmentation error pointing to a problem with your memory, virtual memory (page file) or that it's a bad batch of tasks. However, if you're the only one returning these as an error and consistently over two or more projects, you best go look into a problem with the RAM or page file on that computer." |
Eugene Joined: 11 Jan 16 Posts: 1 Credit: 1,901,430 RAC: 0 |
Regarding "Finish file present too long": that is a known client issue, see https://github.com/BOINC/boinc/issues/3017 It is fixed in recent client versions. |
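
For anyone sorting through a long list of failed results, the two error types discussed in this thread can be told apart from the `<message>` line alone. The snippet below is only a rough sketch: `classify` is a hypothetical helper (not part of BOINC or Rosetta), and it assumes the report text looks like the excerpts quoted in the first post. It maps signal 11 to its name using Python's standard `signal` module.

```python
import re
import signal

# Hypothetical helper, not part of BOINC: classify a failed-task report
# by the text inside its <message> element, as quoted in this thread.
MESSAGE_RE = re.compile(r"<message>\s*(.*?)\s*</message>", re.DOTALL)
SIGNAL_RE = re.compile(r"process got signal (\d+)")


def classify(report: str) -> str:
    match = MESSAGE_RE.search(report)
    if match is None:
        return "unknown"
    message = match.group(1)

    # Client-side issue mentioned above (BOINC/boinc issue 3017).
    if "finish file present too long" in message:
        return "client issue (BOINC/boinc#3017)"

    # The science application was killed by a signal, e.g. 11 = SIGSEGV.
    sig = SIGNAL_RE.search(message)
    if sig:
        number = int(sig.group(1))
        try:
            name = signal.Signals(number).name
        except ValueError:
            # Signal number not known on this platform.
            name = f"signal {number}"
        return f"crash ({name})"

    return "other"


if __name__ == "__main__":
    print(classify("<message> finish file present too long</message>"))
    print(classify("<message> process got signal 11</message>"))
```

Results in the first group point at the client version, while results in the second group are the ones worth checking against RAM or page file problems as described above.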
©2024 University of Washington
https://www.bakerlab.org