Rosetta computation error

Questions and Answers : Unix/Linux : Rosetta computation error

stratos412

Joined: 18 Mar 20
Posts: 5
Credit: 25,471
RAC: 64
Message 93367 - Posted: 4 Apr 2020, 11:43:18 UTC
Last modified: 4 Apr 2020, 11:43:48 UTC

Hello to the community.

I am a BOINC member participating in the Rosetta@home project. For the last two days I have been running into a problem with Rosetta tasks.
The tasks run to the end but then finish with a "computation error". I have completed three tasks in the last two days,
and two of them ended with a "computation error".
Please check this link: https://boinc.bakerlab.org/rosetta/results.php?hostid=3813606
Is it an issue with my PC's specs or with the tasks themselves?
Grant (SSSF)

Joined: 28 Mar 20
Posts: 754
Credit: 5,213,843
RAC: 21,680
Message 93426 - Posted: 4 Apr 2020, 20:31:31 UTC

Check the BOINC Manager Event Log for messages relating to disk space. Do you have an AV or Internet security programme on that system? Many of them need to be told explicitly that BOINC and its subfolders are not malicious.
You aborted one task; the others had file problems:

WARNING! cannot get file size for default.out.gz: could not open file.
Output exists: default.out.gz Size: -1

<message>
upload failure: <file_xfer_error>
  <file_name>0iy6db6u_Junior_HalfRoid_vs_COVID-19_design1_dev_SAVE_ALL_OUT_NOJRAN_904989_5_0_r488063185_0</file_name>
  <error_code>-161 (not found)</error_code>
</file_xfer_error>
</message>



WARNING! cannot get file size for default.out.gz: could not open file.
Output exists: default.out.gz Size: -1

<message>
upload failure: <file_xfer_error>
  <file_name>2ek0xs0v_Junior_HalfRoid_vs_COVID-19_design1_SAVE_ALL_OUT_NOJRAN_904988_1_0_r1117466164_0</file_name>
  <error_code>-161 (not found)</error_code>
</file_xfer_error>
</message>

Grant
Darwin NT
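
The two failures quoted above have the same shape: a <file_xfer_error> block naming the output file, with error code -161 (not found). As a rough sketch, assuming the task's message text has been copied into a string, a few lines of Python can pull those fields out so repeated failures are easier to compare. The function name parse_upload_errors is made up for this illustration and is not part of BOINC.

```python
import re

# Matches one <file_xfer_error> block and captures the file name plus the
# numeric error code and its description, e.g. "-161 (not found)".
_XFER_ERROR = re.compile(
    r"<file_xfer_error>\s*"
    r"<file_name>(?P<name>.*?)</file_name>\s*"
    r"<error_code>(?P<code>-?\d+)\s*\((?P<reason>.*?)\)</error_code>\s*"
    r"</file_xfer_error>",
    re.DOTALL,
)


def parse_upload_errors(text):
    """Return a list of (file_name, error_code, reason) tuples found in text."""
    return [
        (m.group("name").strip(), int(m.group("code")), m.group("reason"))
        for m in _XFER_ERROR.finditer(text)
    ]


# Sample copied from the first failed task quoted in this thread.
SAMPLE = """
<message>
upload failure: <file_xfer_error>
  <file_name>0iy6db6u_Junior_HalfRoid_vs_COVID-19_design1_dev_SAVE_ALL_OUT_NOJRAN_904989_5_0_r488063185_0</file_name>
  <error_code>-161 (not found)</error_code>
</file_xfer_error>
</message>
"""

if __name__ == "__main__":
    for name, code, reason in parse_upload_errors(SAMPLE):
        print(f"{code} ({reason}): {name}")
```

If every failed task reports the same code for its output file, that points at the output never being written rather than at the upload itself.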
stratos412

Joined: 18 Mar 20
Posts: 5
Credit: 25,471
RAC: 64
Message 93433 - Posted: 4 Apr 2020, 21:22:37 UTC
Last modified: 4 Apr 2020, 21:31:42 UTC

No AV or Internet security. I use Linux Mint and haven't installed anything beyond what comes with the distro.
I have enough space on the hard drive (items 30.0 GB / Free space 44.4 GB). I have already completed some Rosetta tasks and other BOINC tasks without any problem.
I checked the Event Log but didn't find anything, maybe because I restarted my PC...

P.S. I also saw that some other Linux PCs had similar "computation errors".
Is there a chance that the tasks themselves are faulty or corrupted?


Computing Preferences

Use at most 100 % of the CPUs
Use at most 85 % of CPU time

When to suspend
Suspend when computer is on battery (ticked)
Suspend when computer is in use
Suspend GPU computing when computer is in use
'In use' means mouse/keyboard input in last 3 minutes
Suspend when no mouse/keyboard input in last --- minutes
Suspend when non-BOINC CPU usage is above --- %
Compute only between ---

Other
Store at least 0.2 days of work
Store up to an additional 0.3 days of work
Switch between tasks every 120 minutes
Request tasks to checkpoint at most every 60 seconds

Disk
Use no more than 10 GB
Leave at least 0.5 GB free
Use no more than 50 % of total

Memory
When computer is in use, use at most 75 %
When computer is not in use, use at most 85 %
Leave non-GPU tasks in memory while suspended (ticked)
Page/swap file: use at most 60 %
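
Since disk space came up as a possible cause, it may help to see how the three Disk preferences above interact. The sketch below assumes the client applies the most restrictive of the three limits; that matches the usual description of BOINC's behaviour but is not confirmed anywhere in this thread. The function name, the 74.4 GB total and the 2 GB of current BOINC usage are illustrative assumptions.

```python
def allowed_boinc_disk_gb(total_gb, free_gb, boinc_used_gb,
                          max_used_gb=10.0, min_free_gb=0.5, max_used_pct=50.0):
    """Estimate how much disk BOINC may use (GB) under the three Disk preferences.

    Assumption: the effective limit is the smallest of
      - the absolute cap ("Use no more than 10 GB"),
      - the percentage cap ("Use no more than 50 % of total"),
      - whatever still leaves min_free_gb free ("Leave at least 0.5 GB free").
    """
    limit_abs = max_used_gb
    limit_pct = total_gb * max_used_pct / 100.0
    # Space BOINC already occupies counts as available to BOINC.
    limit_min_free = boinc_used_gb + free_gb - min_free_gb
    return max(0.0, min(limit_abs, limit_pct, limit_min_free))


if __name__ == "__main__":
    # 44.4 GB free is from the post. The 74.4 GB total (reading "items 30.0 GB"
    # as used space) and the 2 GB of BOINC usage are assumptions.
    limit = allowed_boinc_disk_gb(total_gb=74.4, free_gb=44.4, boinc_used_gb=2.0)
    print(f"BOINC may use up to about {limit:.1f} GB")
```

With these numbers the 10 GB cap is the binding limit, so free space on the drive itself does not look like the constraint.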
Grant (SSSF)

Joined: 28 Mar 20
Posts: 754
Credit: 5,213,843
RAC: 21,680
Message 93438 - Posted: 4 Apr 2020, 22:34:05 UTC - in response to Message 93433.  
Last modified: 4 Apr 2020, 22:37:00 UTC

> No AV or Internet security. I use Linux Mint and haven't installed anything beyond what comes with the distro.
> I have enough space on the hard drive (items 30.0 GB / Free space 44.4 GB). I have already completed some Rosetta tasks and other BOINC tasks without any problem.
Do you have an HDD or an SSD? If it's an HDD, can you give the drive more RAM for write caching?
Rosetta writes a lot of data at various points while processing a task. Current HDDs shouldn't have any issues, but the errors you've got point to either AV software (which you don't have) or storage I/O problems, and yours is an older, slower system with few cores.


> I checked the Event Log but didn't find anything, maybe because I restarted my PC...
Yep, restarting BOINC clears its event log (maybe just exiting does it; I've never actually checked that).
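
Because the Event Log shown in the Manager is lost on restart, one option is to look at the client's log file on disk instead. The sketch below assumes a Debian-style install (such as Linux Mint) where the client writes its messages to /var/lib/boinc-client/stdoutdae.txt. Both that path and the keyword list are assumptions and may need adjusting for a given system.

```python
from pathlib import Path

# Assumed location of the client's message log on Debian-based systems.
DEFAULT_LOG = Path("/var/lib/boinc-client/stdoutdae.txt")

# Keywords chosen for illustration; they are not an official list.
KEYWORDS = ("disk", "space", "upload", "error")


def find_suspect_lines(lines, keywords=KEYWORDS):
    """Return the lines that mention any keyword, case-insensitively."""
    hits = []
    for line in lines:
        lowered = line.lower()
        if any(word in lowered for word in keywords):
            hits.append(line.rstrip("\n"))
    return hits


if __name__ == "__main__":
    if DEFAULT_LOG.exists():
        with DEFAULT_LOG.open(errors="replace") as handle:
            for hit in find_suspect_lines(handle):
                print(hit)
    else:
        print(f"{DEFAULT_LOG} not found; adjust the path for your install")
```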


> P.S. I also saw that some other Linux PCs had similar "computation errors".
> Is there a chance that the tasks themselves are faulty or corrupted?
Always possible, but unlikely.


> Computing Preferences
>
> Use at most 100 % of the CPUs
> Use at most 85 % of CPU time
I can't see any issues with your computing preferences. If you are trying to reduce system temperatures because of cooling issues, though, I'd suggest swapping those two values around (maybe changing the 85 to 50 to free up the other core). Generally it's better to have fewer tasks running all the time than more tasks that keep starting and stopping. That may be involved in your issue: a low-powered system dealing with periods of very heavy disk I/O.
Even so, the stock heatsink and fan on the Core2 Duo shouldn't have any heat issues running both cores at 100%, as long as both are clean and the case vents and fan(s) are clean too.
Grant
Darwin NT
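
The trade-off between the two CPU settings can be put in rough numbers. The sketch below assumes the client rounds the allowed CPU count down (but always allows at least one) and that throttling "CPU time" scales each running task's throughput linearly. Both are simplifications, and the function names are illustrative only.

```python
def usable_cpus(n_cpus, max_cpus_pct):
    """Cores BOINC may use: rounded down, but never fewer than one (assumed)."""
    return max(1, int(n_cpus * max_cpus_pct / 100))


def relative_throughput(n_cpus, max_cpus_pct, cpu_time_pct):
    """Approximate work rate in 'full cores', treating throttling as linear."""
    return usable_cpus(n_cpus, max_cpus_pct) * cpu_time_pct / 100


if __name__ == "__main__":
    # A Core 2 Duo has two cores.
    current = relative_throughput(2, 100, 85)  # both cores, throttled stop/start
    swapped = relative_throughput(2, 50, 100)  # one core, running continuously
    print(f"current settings: ~{current:.2f} cores of work")
    print(f"swapped settings: ~{swapped:.2f} cores of work")
```

Under these assumptions the swapped settings do less work overall, which is the point when the goal is lower temperatures, and the one running task is not repeatedly paused and resumed.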
stratos412

Joined: 18 Mar 20
Posts: 5
Credit: 25,471
RAC: 64
Message 93465 - Posted: 5 Apr 2020, 7:54:51 UTC

PC Specs (it's pretty old):
Intel(R) Core(TM)2 CPU 6400 @ 2.13GHz [Family 6 Model 15 Stepping 6]
RAM: 4GB (BOINC uses 75%-85%)
OS: Linux Mint Tricia 19.3 (32-bit).
DISK: HDD (western digital)
TEMP: Mostly fine (around 77 °C with more than one task running, or around 72 °C with only one task)
(Nothing else runs besides the BOINC projects. I have this PC only for that job)

I don't really run many tasks at once, especially if they take more than 10 hours.
I usually start 1-2 tasks (or 3 if some of the BOINC tasks take up to 3 hours).
Setting CPU time to 85% works pretty well with other BOINC projects.

Anyway, I already have two more tasks in the queue, so I will test again and also copy the events from the BOINC event log.
stratos412

Joined: 18 Mar 20
Posts: 5
Credit: 25,471
RAC: 64
Message 93509 - Posted: 5 Apr 2020, 15:58:27 UTC

UPDATE:

ROSETTA TASK: rb_04_04_20257_20148_ab_t000__robetta_IGNORE_THE_REST_09_05_905151_6
(https://boinc.bakerlab.org/rosetta/result.php?resultid=1140150871)

<core_client_version>7.9.3</core_client_version>
<![CDATA[
<stderr_txt>
command: ../../projects/boinc.bakerlab.org_rosetta/rosetta_4.12_i686-pc-linux-gnu @rb_04_04_20257_20148_ab_t000__robetta_FLAGS -in::file::fasta t000_.fasta -psipred_ss2 t000_.spider3_ss2 -kill_hairpins t000_.nobuformat.spider3_ss2 -abinitio::use_filters true -in:file:boinc_wu_zip rb_04_04_20257_20148_ab_t000__robetta.zip -frag3 rb_04_04_20257_20148_ab_t000__robetta.200.3mers.index.gz -fragA rb_04_04_20257_20148_ab_t000__robetta.200.5mers.index.gz -fragB rb_04_04_20257_20148_ab_t000__robetta.200.9mers.index.gz -nstruct 10000 -cpu_run_time 28800 -watchdog -boinc:max_nstruct 600 -checkpoint_interval 120 -mute all -database minirosetta_database -in::file::zip minirosetta_database.zip -boinc::watchdog -run::rng mt19937 -constant_seed -jran 1991901
Starting watchdog...
Watchdog active.
======================================================
DONE :: 1 starting structures 28818 cpu seconds
This process generated 32 decoys from 32 attempts
======================================================
BOINC :: WS_max 6.2435e+144

BOINC :: Watchdog shutting down...
18:49:34 (11864): called boinc_finish(0)

</stderr_txt>
]]>
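
The summary lines in the stderr output above can be checked against the command line: the task ran with -cpu_run_time 28800 and reported 28818 CPU seconds for 32 decoys. As a small sketch (the function name is made up for this example), the following extracts those figures and works out the average time per decoy.

```python
import re


def summarize_run(stderr_text):
    """Extract CPU seconds, decoys and attempts from Rosetta's summary lines."""
    done = re.search(
        r"DONE ::\s+\d+ starting structures\s+(\d+) cpu seconds", stderr_text
    )
    made = re.search(r"generated (\d+) decoys from (\d+) attempts", stderr_text)
    if not (done and made):
        return None  # summary missing, e.g. the task did not finish normally
    cpu_seconds = int(done.group(1))
    decoys, attempts = int(made.group(1)), int(made.group(2))
    return {
        "cpu_seconds": cpu_seconds,
        "decoys": decoys,
        "attempts": attempts,
        "seconds_per_decoy": cpu_seconds / decoys if decoys else None,
    }


# The two summary lines from the stderr output posted above.
SAMPLE = """\
DONE :: 1 starting structures 28818 cpu seconds
This process generated 32 decoys from 32 attempts
"""

if __name__ == "__main__":
    print(summarize_run(SAMPLE))
```

For this task that works out to roughly 900 seconds (about 15 minutes) per decoy, and the 28818 CPU seconds sit just past the 28800-second target, which is consistent with a normal finish.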


Seems this one worked well. I really can't understand why the other two tasks ended with "computation errors", since I didn't change anything.
Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4016
Credit: 0
RAC: 0
Message 93528 - Posted: 5 Apr 2020, 17:46:29 UTC - in response to Message 93509.  

> I really can't understand why the other two tasks ended with "computation errors", since I didn't change anything.


I wouldn't worry too much about it on your end. It's a new application version and the bugs are still shaking out. A new version, 4.13, is on Ralph (the Rosetta test project) now for testing.
Rosetta Moderator: Mod.Sense
stratos412

Joined: 18 Mar 20
Posts: 5
Credit: 25,471
RAC: 64
Message 93646 - Posted: 6 Apr 2020, 15:58:15 UTC

I can understand that a new project or new software will probably have some bugs.
The sad and frustrating thing about Rosetta tasks is that they are time-consuming, and the last thing you want to see after 8-12 hours of work is this kind of error.
At least I hope they can extract some information despite these computation errors.



©2020 University of Washington
https://www.bakerlab.org