Minirosetta 3.73-3.78

Mad_Max
Joined: 31 Dec 09
Posts: 153
Credit: 8,024,281
RAC: 8,364
Message 88042 - Posted: 9 Jan 2018, 18:21:07 UTC
Last modified: 9 Jan 2018, 18:25:28 UTC

Looks like something is wrong with the rb_01_08_.... series of WUs on minirosetta 3.78 (latest example: rb_01_08_77806_122534__t000__2_C1_SAVE_ALL_OUT_IGNORE_THE_REST_541301_331_0).

I have seen some of these tasks consuming a huge amount of RAM: they start in the standard 200-400 MB range but at some point can hoard up to 1400-1800 MB per task. Maybe even more, since it crashed after running out of RAM (8 GB RAM + 4 GB page/swap file on a 6-core CPU).
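
A rough worst-case estimate for the host described above, as a minimal sketch. It assumes all six cores run one of these tasks at the same time and that each reaches the upper figure reported (1800 MB); the function name and the idea of treating RAM plus swap as a single budget are illustrative assumptions, not BOINC behavior.

```python
def worst_case_footprint_mb(tasks: int, peak_per_task_mb: int) -> int:
    """Total memory needed if every task hits its peak at the same time."""
    return tasks * peak_per_task_mb


# Figures taken from the post: 8 GB RAM, 4 GB page/swap file, 6-core CPU.
ram_mb = 8 * 1024
swap_mb = 4 * 1024
budget_mb = ram_mb + swap_mb

# Assumption: one task per core, each at the reported 1800 MB peak.
needed_mb = worst_case_footprint_mb(tasks=6, peak_per_task_mb=1800)
headroom_mb = budget_mb - needed_mb

print(f"needed {needed_mb} MB of {budget_mb} MB, headroom {headroom_mb} MB")
```

Under these assumptions less than 1.5 GB would be left for the operating system and everything else, so running out of memory at or slightly above the reported peak is plausible.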
ID: 88042 · Rating: 0
Jim1348
Joined: 19 Jan 06
Posts: 168
Credit: 4,654,958
RAC: 7,004
Message 88043 - Posted: 9 Jan 2018, 19:38:06 UTC - in response to Message 88042.  
Last modified: 9 Jan 2018, 20:03:21 UTC

I have seen some of these tasks consuming a huge amount of RAM: they start in the standard 200-400 MB range but at some point can hoard up to 1400-1800 MB per task.

I have five on Windows 7 64-bit (i7-4771), and six on Ubuntu 16.04 (i7-3770) ranging from 1 to 19 hours with no problems yet, but I will keep an eye on them. If they blow up, it must be late in the run.

EDIT: By the way, I see you are using AMD CPUs. I got poor performance on my Ryzen 1700 on Rosetta, as I reported earlier. I wonder if they need to recompile it to fix this problem too?
ID: 88043 · Rating: 0
Sid Celery
Joined: 11 Feb 08
Posts: 895
Credit: 14,273,664
RAC: 13,978
Message 88084 - Posted: 17 Jan 2018, 3:57:06 UTC

Boinc 7.83 recent Mini-rosetta 3.78 error
nRoCM_01_P05055_group0_congq_SAVE_ALL_OUT_IGNORE_THE_REST_541727_1334_0
ERROR: ERROR: reading of AtomPair failed.

ERROR:: Exit from: ..\..\..\src\core\scoring\constraints\ConstraintIO.cc line: 559
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish

ID: 88084 · Rating: 0
Mad_Max
Joined: 31 Dec 09
Posts: 153
Credit: 8,024,281
RAC: 8,364
Message 88088 - Posted: 17 Jan 2018, 7:44:06 UTC - in response to Message 88043.  
Last modified: 17 Jan 2018, 7:45:14 UTC

I have not seen such memory leaks lately either.

About AMD CPU performance: I do not know. I do not have any of the latest AMD CPUs (from the Ryzen family) yet.
I am still using older CPUs: one Phenom II X6 and two FX-8320 (Vishera/Piledriver). I have not seen any performance issues with these older AMD CPUs in Rosetta: they are almost on par with the corresponding Intel CPUs (same generation/age and same core count).
ID: 88088 · Rating: 0
James W
Joined: 25 Nov 12
Posts: 27
Credit: 375,613
RAC: 430
Message 88130 - Posted: 20 Jan 2018, 22:48:04 UTC

I've recently begun having this issue with my Windows XP host running a Pentium 4 CPU. Previously there were no problems, though of course it was slow and earned relatively low credit, as expected. Using app 3.78 windows_intelx86. Workunit 872559942 - Task 967645181

01/20/2018 12:57:30 PM | Rosetta@home | Computation for task rb_01_17_79431_122764__t000__1_C1_SAVE_ALL_OUT_IGNORE_THE_REST_542014_553_0 finished
01/20/2018 12:57:30 PM | Rosetta@home | Output file rb_01_17_79431_122764__t000__1_C1_SAVE_ALL_OUT_IGNORE_THE_REST_542014_553_0_r1092951988_0 for task rb_01_17_79431_122764__t000__1_C1_SAVE_ALL_OUT_IGNORE_THE_REST_542014_553_0 absent
01/20/2018 12:57:51 PM | | Suspending computation - CPU is busy
01/20/2018 12:58:01 PM | | Resuming computation
01/20/2018 12:58:17 PM | Rosetta@home | Sending scheduler request: To report completed tasks.
01/20/2018 12:58:17 PM | Rosetta@home | Reporting 1 completed tasks
01/20/2018 12:58:17 PM | Rosetta@home | Not requesting tasks: don't need (job cache full)
01/20/2018 12:58:26 PM | Rosetta@home | Scheduler request completed

Exit status -1073741819 (0xC0000005) STATUS_ACCESS_VIOLATION
Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x0121B939 read attempt to address 0x39D5626C

Same errors for Workunit 872559856 Task 967645069.
There is no point in my continuing to run Rosetta on this host if this situation continues, as it is able to run SETI@home without issue.
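
The two forms of the exit status above are the same 32-bit value: BOINC shows it as a signed integer, while Windows status codes such as STATUS_ACCESS_VIOLATION are normally written in unsigned hex. A minimal sketch of the conversion in both directions (the helper names are hypothetical):

```python
def to_unsigned32(code: int) -> int:
    """Reinterpret a signed 32-bit exit status as unsigned."""
    return code & 0xFFFFFFFF


def to_signed32(code: int) -> int:
    """Reinterpret an unsigned 32-bit status as a signed integer."""
    code &= 0xFFFFFFFF
    return code - 0x1_0000_0000 if code & 0x8000_0000 else code


status = -1073741819  # exit status reported for this task
print(f"{status} -> 0x{to_unsigned32(status):08X}")
print(f"0xC0000005 -> {to_signed32(0xC0000005)}")
```

The same masking applies to any negative exit status shown alongside a hex value in a task report.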
ID: 88130 · Rating: 0
James W
Joined: 25 Nov 12
Posts: 27
Credit: 375,613
RAC: 430
Message 88214 - Posted: 2 Feb 2018, 5:55:58 UTC

I'm having this error on my Windows XP host with a Pentium 4 CPU. Using app 3.78 windows_intelx86.
Name RhbaA_18619_a_trimmed_27_127len_cstwt_3.0_centerjumps_9mers_542830_9305_0
Workunit 875039666
Created 30 Jan 2018, 4:33:41 UTC
Sent 30 Jan 2018, 5:07:48 UTC
Report deadline 7 Feb 2018, 5:07:48 UTC
Received 2 Feb 2018, 3:33:42 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status -226 (0xFFFFFF1E) ERR_TOO_MANY_EXITS
Computer ID 1580783


The following is repeated several times:

Initializing random generators... ok
Initialization complete.
Initializing options.... ok
Options::initialize()
Options::adding_options()
Options::initialize() Check specs.
Options::initialize() End reached
Loaded options.... ok
Processed options.... ok
Initializing random generators... ok
Initialization complete.
Setting WU description ...
Using previously extracted minirosetta_database.
Setting database description ...
Setting up checkpointing ...
Setting up graphics native ...
Setting up folding (abrelax) ...
Beginning folding (abrelax) ...
BOINC:: Worker startup.
Starting watchdog...
Watchdog active.
Starting work on structure: _00001
Continuing computation from checkpoint: chk_S_00000001_Abrelax__rg_state ... success!
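
One way to see how often a task restarted is to count the startup marker in its stderr text. A minimal sketch, assuming the stderr output has been saved as a string; the helper name `count_restarts` and the sample text are hypothetical, and the marker is the "BOINC:: Worker startup." line shown above.

```python
def count_restarts(stderr_text: str, marker: str = "BOINC:: Worker startup.") -> int:
    """Count how many times the worker start marker appears on its own line."""
    return sum(1 for line in stderr_text.splitlines() if line.strip() == marker)


# Hypothetical excerpt: the same startup sequence appearing twice.
sample = "\n".join([
    "Setting up folding (abrelax) ...",
    "BOINC:: Worker startup.",
    "Starting work on structure: _00001",
    "BOINC:: Worker startup.",
    "Starting work on structure: _00001",
])

print(count_restarts(sample))
```

A count that keeps growing while the structure number stays at _00001 would match the ERR_TOO_MANY_EXITS outcome reported for this task.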
ID: 88214 · Rating: 0
James W
Joined: 25 Nov 12
Posts: 27
Credit: 375,613
RAC: 430
Message 88256 - Posted: 10 Feb 2018, 7:28:48 UTC

Re: My host Windows XP with Pentium 4 CPU (1 core HT). Issue with Rosetta Mini v3.78 windows_intelx86.

Name rb_02_04_80757_123466__t000__1_C1_SAVE_ALL_OUT_IGNORE_THE_REST_544266_607_0
Workunit 876143103
Created 4 Feb 2018, 14:13:59 UTC
Sent 4 Feb 2018, 14:43:33 UTC
Report deadline 12 Feb 2018, 14:43:33 UTC
Received 9 Feb 2018, 10:24:36 UTC
Task 971639876

Initialization complete.
Setting WU description ...
Using previously extracted minirosetta_database.
Unpacking WU data ...
Unpacking data: ../../projects/boinc.bakerlab.org_rosetta/input_rb_02_04_80757_123466__t000__1_C1_robetta.zip
Setting database description ...
Setting up checkpointing ...
Setting up graphics native ...
BOINC:: Worker startup.
Starting watchdog...
Watchdog active.

ERROR: semi-rotameric input invalid -- chi means differ. bb1: 120bb2: 110 original chi_1: -176.6 later chi_1: -177.8
ERROR: Exit from: C:\Users\boinc\src\Rosetta\main\source\src\core/pack/dunbrack/SemiRotamericSingleResidueDunbrackLibrary.tmpl.hh line: 1685
called boinc_finish
ID: 88256 · Rating: 0
Aladar42
Joined: 14 Nov 17
Posts: 2
Credit: 67,864
RAC: 0
Message 88327 - Posted: 20 Feb 2018, 15:13:39 UTC

Couple of errors overnight:

https://boinc.bakerlab.org/workunit.php?wuid=878661631
https://boinc.bakerlab.org/workunit.php?wuid=878661900
https://boinc.bakerlab.org/workunit.php?wuid=878661716
ID: 88327 · Rating: 0
James W
Joined: 25 Nov 12
Posts: 27
Credit: 375,613
RAC: 430
Message 88789 - Posted: 2 May 2018, 6:18:54 UTC

Application version Rosetta Mini v3.78 windows_x86_64
Device: 1759960, Task: 993065168, and WU 894585192 .
Status: Error while computing.
Errors: Too many errors (may have bug). Too many total results.

Exit status -1 (0xFFFFFFFF) Unknown error code
Options::initialize()
Options::adding_options()
Options::initialize() Check specs.
Options::initialize() End reached
ERROR: No values of the appropriate type specified for multi-valued option -jumps:random_sheets
ID: 88789 · Rating: 0
[VENETO] boboviz
Joined: 1 Dec 05
Posts: 710
Credit: 2,253,850
RAC: 2,108
Message 88793 - Posted: 2 May 2018, 9:54:20 UTC

All "nas_final" WUs end after a few seconds with this error:

ERROR: unrecognized residue NAS
ERROR:: Exit from: ..\..\..\src\core\io\pose_from_sfr\PoseFromSFRBuilder.cc line: 1030
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish

ID: 88793 · Rating: 0
James W
Joined: 25 Nov 12
Posts: 27
Credit: 375,613
RAC: 430
Message 88803 - Posted: 3 May 2018, 5:18:17 UTC

Application version Rosetta Mini v3.78 windows_x86_64
Device: 1759960, Task: 993062525, and WU 894544928.
Status: Error while computing.
Errors: Too many errors (may have bug). Too many total results.

Exit status: 1 (0x00000001) Unknown error code
<message> Incorrect function. (0x1) - exit code 1 (0x1)</message>

Starting work on structure: _00009
std::cerr: Exception was thrown:
chi angle must be between -180 and 180: -1.#IND
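
The "-1.#IND" in that message is how older Microsoft C runtimes print an indeterminate NaN, so the chi angle was likely not a number at all rather than merely out of range. A minimal sketch of why a NaN fails a range check of this kind; `check_chi` is a hypothetical stand-in, not Rosetta's code.

```python
def check_chi(chi: float) -> float:
    """Return chi unchanged if it lies in [-180, 180], else raise."""
    # Every comparison involving NaN is False, so a NaN can never
    # satisfy the range test and always lands in the error branch.
    if not (-180.0 <= chi <= 180.0):
        raise ValueError(f"chi angle must be between -180 and 180: {chi}")
    return chi


try:
    check_chi(float("nan"))
except ValueError as exc:
    print(exc)
```

If that is what happened here, the place to look would be the calculation that produced the angle, not the range check itself.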
ID: 88803 · Rating: 0
James W
Joined: 25 Nov 12
Posts: 27
Credit: 375,613
RAC: 430
Message 88804 - Posted: 3 May 2018, 5:42:43 UTC
Last modified: 3 May 2018, 5:44:12 UTC

Application version Rosetta Mini v3.78 windows_x86_64
Device: 1759960, Task: 993062751, and WU 894756725.
Status: Aborted.

Exit status: 203 (0x000000CB) EXIT_ABORTED_VIA_GUI
BOINC:: Worker startup.
Starting watchdog...
Starting work on structure: _00001
Watchdog active.
Continuing computation from checkpoint: chk_S_00000001_Abrelax__rg_state ... success!

Unhandled Exception Detected...
- Unhandled Exception Record -
Reason: Breakpoint Encountered (0x80000003) at address 0x75EC338D

Engaging BOINC Windows Runtime Debugger...

As the WU kept starting over and resetting its elapsed time to zero, at least 6 times, I aborted it. I figured that if it was still on structure 1 after 6 to 8 hours of number crunching, it wasn't going to finish well, and it was wasting CPU resources that could be used for other work.
ID: 88804 · Rating: 0
[VENETO] boboviz
Joined: 1 Dec 05
Posts: 710
Credit: 2,253,850
RAC: 2,108
Message 88819 - Posted: 5 May 2018, 8:59:34 UTC - in response to Message 88793.  

All "nas_final" WUs end after a few seconds with this error:

ERROR: unrecognized residue NAS
ERROR:: Exit from: ..\..\..\src\core\io\pose_from_sfr\PoseFromSFRBuilder.cc line: 1030
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish


Again, all "nas_final" WUs fail with the same error.
ID: 88819 · Rating: 0
[VENETO] boboviz
Joined: 1 Dec 05
Posts: 710
Credit: 2,253,850
RAC: 2,108
Message 89059 - Posted: 5 Jun 2018, 10:06:51 UTC

1003924662

ERROR: Unable to open atomset parameter file: minirosetta_database\chemical/atom_type_sets/fa_standard//

ID: 89059 · Rating: 0
rjs5
Joined: 22 Nov 10
Posts: 189
Credit: 4,428,353
RAC: 3,577
Message 89065 - Posted: 5 Jun 2018, 19:39:25 UTC - in response to Message 89059.  

1003924662

ERROR: Unable to open atomset parameter file: minirosetta_database\chemical/atom_type_sets/fa_standard//


Seems very strange that the path name has both forward and back slashes.
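
Mixed separators by themselves are usually harmless on Windows, whose file APIs generally accept both. A minimal sketch using Python's `pathlib` (not anything from Rosetta) shows that the path in the error normalizes cleanly, and that what stands out is the trailing "//" with no file name after it:

```python
from pathlib import PureWindowsPath

# Path copied from the error message above.
raw = r"minirosetta_database\chemical/atom_type_sets/fa_standard//"

# PureWindowsPath treats both "\" and "/" as separators and drops
# empty trailing components, on any operating system.
path = PureWindowsPath(raw)

print(path.parts)
print(str(path))
print(path.name)  # last component is a directory name, not a file
```

The last component is `fa_standard`, so the message appears to name a directory where a parameter file was expected, which may point to an empty file name being appended.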
ID: 89065 · Rating: 0

©2018 University of Washington
http://www.bakerlab.org