Problems and Technical Issues with Rosetta@home

Sid Celery
Joined: 11 Feb 08
Posts: 1987
Credit: 38,503,003
RAC: 14,287
Message 102560 - Posted: 9 Sep 2021, 0:42:29 UTC - in response to Message 102559.  

I have 4 running at between 11%-16% progress. Seems better than before but perhaps some are still going to error now.

I've got 14 running between 0 & 23%
A second has crashed out after 1h 58m - same error as above. No idea what it means

Edit: And a 3rd crashes at 1h 35m - same error again. Other tasks reaching 25%

Yeah, cancel all that...
Oh...
ID: 102560 · Rating: 0
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1484
Credit: 14,639,415
RAC: 12,959
Message 102562 - Posted: 9 Sep 2021, 6:44:55 UTC

I've had 12 run for 8 hours, end normally & Validate with no issues. Another 8 running for an hour so far (much better than their previous 2min or less).
Grant
Darwin NT
ID: 102562 · Rating: 0
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1484
Credit: 14,639,415
RAC: 12,959
Message 102563 - Posted: 9 Sep 2021, 6:53:56 UTC - in response to Message 102560.  

I have 4 running at between 11%-16% progress. Seems better than before but perhaps some are still going to error now.

I've got 14 running between 0 & 23%
A second has crashed out after 1h 58m - same error as above. No idea what it means

Edit: And a 3rd crashes at 1h 35m - same error again. Other tasks reaching 25%

Yeah, cancel all that...
Oh...

Could have been an issue with your system (an AV programme comes to mind), or there was a change on the server: work was allocated to you, then deleted from the server before you downloaded it.


You've got crap loads of downloads failing.

<core_client_version>7.16.11</core_client_version>
<![CDATA[
<message>
app_version download error: couldn't get input files:
<file_xfer_error>
  <file_name>rosetta_4.20_windows_x86_64.exe</file_name>
  <error_code>-120 (RSA key check failed for file)</error_code>
</file_xfer_error>
</message>
]]>





Then there are the new Tasks giving errors after 15min or so, _3mup_ being the culprit.
Stderr output
<core_client_version>7.16.11</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code 3221226356 (0xc0000374)</message>
<stderr_txt>
command: projects/boinc.bakerlab.org_rosetta/rosetta_4.20_windows_x86_64.exe -run:protocol jd2_scripting -parser:protocol pdblite_boinc_120_10_tfirst--fuse--predictor_v13_degrader_boinc--fuse--tslp_design_v2_degrader_boinc.xml @degrader_site_3mup_jhr_bcov_flags2 -in:file:silent degrader_site_3mup_jhr_bcov4_SAVE_ALL_OUT_IGNORE_THE_REST_0cx6qy1x.silent -in:file:silent_struct_type binary -silent_gz -mute all -silent_read_through_errors true -out:file:silent_struct_type binary -out:file:silent default.out -in:file:boinc_wu_zip degrader_site_3mup_jhr_bcov4_SAVE_ALL_OUT_IGNORE_THE_REST_0cx6qy1x.zip @degrader_site_3mup_jhr_bcov4_SAVE_ALL_OUT_IGNORE_THE_REST_0cx6qy1x.flags -nstruct 10000 -cpu_run_time 28800 -boinc:max_nstruct 20000 -checkpoint_interval 120 -database minirosetta_database -in::file::zip minirosetta_database.zip -boinc::watchdog -boinc::cpu_run_timeout 36000 -run::rng mt19937 -constant_seed -jran 3689000
Using database: database_357d5d93529_n_methylminirosetta_database

</stderr_txt>
]]>
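
For anyone puzzling over the exit code above: BOINC prints the Windows status in decimal first, and converting it to hex makes it much easier to look up. A minimal Python sketch, assuming only the two NTSTATUS values in the lookup table (0xC0000374 is heap corruption, 0xC0000005 is an access violation). The function name `describe_exit_code` is made up for illustration.

```python
# Windows reports NTSTATUS values as unsigned 32-bit numbers. BOINC shows
# them in decimal first, so converting to hex makes them searchable.

# Small lookup of well-known NTSTATUS codes (deliberately not exhaustive).
KNOWN_NTSTATUS = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC0000374: "STATUS_HEAP_CORRUPTION",
}


def describe_exit_code(code: int) -> str:
    """Return the hex form of an exit code plus its NTSTATUS name, if known."""
    value = code & 0xFFFFFFFF  # normalise signed/negative values to 32 bits
    name = KNOWN_NTSTATUS.get(value, "unknown")
    return f"0x{value:08x} {name}"


if __name__ == "__main__":
    for code in (3221226356, 3221225477):
        print(code, "->", describe_exit_code(code))
```

Heap corruption only says that memory inside the process got trashed. That could be an application bug or unstable hardware, so the code alone doesn't settle which.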



Then the same type crashing after 12min or so with a different error message.

<core_client_version>7.16.11</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code 3221225477 (0xc0000005)</message>
<stderr_txt>
command: projects/boinc.bakerlab.org_rosetta/rosetta_4.20_windows_x86_64.exe -run:protocol jd2_scripting -parser:protocol pdblite_boinc_120_10_tfirst--fuse--predictor_v13_degrader_boinc--fuse--tslp_design_v2_degrader_boinc.xml @degrader_site_3mup_jhr_bcov_flags2 -in:file:silent degrader_site_3mup_jhr_bcov4_SAVE_ALL_OUT_IGNORE_THE_REST_0dg5rm0i.silent -in:file:silent_struct_type binary -silent_gz -mute all -silent_read_through_errors true -out:file:silent_struct_type binary -out:file:silent default.out -in:file:boinc_wu_zip degrader_site_3mup_jhr_bcov4_SAVE_ALL_OUT_IGNORE_THE_REST_0dg5rm0i.zip @degrader_site_3mup_jhr_bcov4_SAVE_ALL_OUT_IGNORE_THE_REST_0dg5rm0i.flags -nstruct 10000 -cpu_run_time 28800 -boinc:max_nstruct 20000 -checkpoint_interval 120 -database minirosetta_database -in::file::zip minirosetta_database.zip -boinc::watchdog -boinc::cpu_run_timeout 36000 -run::rng mt19937 -constant_seed -jran 3942480
Using database: database_357d5d93529_n_methylminirosetta_database


Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x0000000000000000 

Engaging BOINC Windows Runtime Debugger...



********************


BOINC Windows Runtime Debugger Version 7.9.0


Dump Timestamp    : 09/09/21 03:12:33
Install Directory : C:\Program Files\BOINC
Data Directory    : C:\ProgramData\BOINC
Project Symstore  : https://boinc.bakerlab.org/rosetta/symstore
LoadLibraryA( C:\ProgramData\BOINC\dbghelp.dll ): GetLastError = 126
Loaded Library    : dbghelp.dll
LoadLibraryA( C:\ProgramData\BOINC\symsrv.dll ): GetLastError = 126
LoadLibraryA( symsrv.dll ): GetLastError = 126
LoadLibraryA( C:\ProgramData\BOINC\srcsrv.dll ): GetLastError = 126
LoadLibraryA( srcsrv.dll ): GetLastError = 126
LoadLibraryA( C:\ProgramData\BOINC\version.dll ): GetLastError = 126
Loaded Library    : version.dll
Debugger Engine   : 4.0.5.0
Symbol Search Path: C:\ProgramData\BOINC\slots\7;C:\ProgramData\BOINC\projects\boinc.bakerlab.org_rosetta;srv*C:\ProgramData\BOINC\projects\boinc.bakerlab.org_rosetta\symbols*http://msdl.microsoft.com/download/symbols;srv*C:\ProgramData\BOINC\projects\boinc.bakerlab.org_rosetta\symbols*https://boinc.bakerlab.org/rosetta/symstore


ModLoad: 00000000b4580000 00000000057ef000 C:\ProgramData\BOINC\projects\boinc.bakerlab.org_rosetta\rosetta_4.20_windows_x86_64.exe (-exported- Symbols Loaded)
    Linked PDB Filename   : C:cygwin64homeboinc4.17RosettamainsourceideVisualStudiox64BoincReleaserosetta_4.20_windows_x86_64.pdb

ModLoad: 00000000c9870000 00000000001f5000 C:\Windows\SYSTEM32\ntdll.dll (6.2.19041.844) (-exported- Symbols Loaded)
    Linked PDB Filename   : ntdll.pdb
    File Version          : 10.0.19041.804 (WinBuild.160101.0800)
    Company Name          : Microsoft Corporation
    Product Name          : Microsoft® Windows® Operating System
    Product Version       : 10.0.19041.804

ModLoad: 00000000c7da0000 00000000000bd000 C:\Windows\System32\KERNEL32.DLL (6.2.19041.804) (-exported- Symbols Loaded)
    Linked PDB Filename   : kernel32.pdb
    File Version          : 10.0.19041.804 (WinBuild.160101.0800)
    Company Name          : Microsoft Corporation
    Product Name          : Microsoft® Windows® Operating System
    Product Version       : 10.0.19041.804

ModLoad: 00000000c7060000 00000000002c9000 C:\Windows\System32\KERNELBASE.dll (6.2.19041.804) (-exported- Symbols Loaded)
    Linked PDB Filename   : kernelbase.pdb
    File Version          : 10.0.19041.804 (WinBuild.160101.0800)
    Company Name          : Microsoft Corporation
    Product Name          : Microsoft® Windows® Operating System
    Product Version       : 10.0.19041.804

ModLoad: 00000000c88f0000 000000000006b000 C:\Windows\System32\WS2_32.dll (6.2.19041.546) (-exported- Symbols Loaded)
    Linked PDB Filename   : ws2_32.pdb
    File Version          : 10.0.19041.1 (WinBuild.160101.0800)
    Company Name          : Microsoft Corporation
    Product Name          : Microsoft® Windows® Operating System
    Product Version       : 10.0.19041.1

ModLoad: 00000000c8020000 000000000012b000 C:\Windows\System32\RPCRT4.dll (6.2.19041.746) (-exported- Symbols Loaded)
    Linked PDB Filename   : rpcrt4.pdb
    File Version          : 10.0.19041.1 (WinBuild.160101.0800)
    Company Name          : Microsoft Corporation
    Product Name          : Microsoft® Windows® Operating System
    Product Version       : 10.0.19041.1

ModLoad: 00000000c95a0000 00000000001a0000 C:\Windows\System32\USER32.dll (6.2.19041.746) (-exported- Symbols Loaded)
    Linked PDB Filename   : user32.pdb
    File Version          : 10.0.19041.1165 (WinBuild.160101.0800)
    Company Name          : Microsoft Corporation
    Product Name          : Microsoft® Windows® Operating System
    Product Version       : 10.0.19041.1165

ModLoad: 00000000c6fb0000 0000000000022000 C:\Windows\System32\win32u.dll (6.2.19041.844) (-exported- Symbols Loaded)
    Linked PDB Filename   : win32u.pdb
    File Version          : 10.0.19041.844 (WinBuild.160101.0800)
    Company Name          : Microsoft Corporation
    Product Name          : Microsoft® Windows® Operating System
    Product Version       : 10.0.19041.844

ModLoad: 00000000c8a20000 000000000002a000 C:\Windows\System32\GDI32.dll (6.2.19041.746) (-exported- Symbols Loaded)
    Linked PDB Filename   : gdi32.pdb
    File Version          : 10.0.19041.746 (WinBuild.160101.0800)
    Company Name          : Microsoft Corporation
    Product Name          : Microsoft® Windows® Operating System
    Product Version       : 10.0.19041.746

ModLoad: 00000000c74e0000 000000000010b000 C:\Windows\System32\gdi32full.dll (6.2.19041.746) (-exported- Symbols Loaded)
    Linked PDB Filename   : gdi32full.pdb
    File Version          : 10.0.19041.746 (WinBuild.160101.0800)
    Company Name          : Microsoft Corporation
    Product Name          : Microsoft® Windows® Operating System
    Product Version       : 10.0.19041.746

ModLoad: 00000000c75f0000 000000000009d000 C:\Windows\System32\msvcp_win.dll (6.2.19041.789) (-exported- Symbols Loaded)
    Linked PDB Filename   : msvcp_win.pdb
    File Version          : 10.0.19041.789 (WinBuild.160101.0800)
    Company Name          : Microsoft Corporation
    Product Name          : Microsoft® Windows® Operating System
    Product Version       : 10.0.19041.789

ModLoad: 00000000c7690000 0000000000100000 C:\Windows\System32\ucrtbase.dll (6.2.19041.789) (-exported- Symbols Loaded)
    Linked PDB Filename   : ucrtbase.pdb
    File Version          : 10.0.19041.789 (WinBuild.160101.0800)
    Company Name          : Microsoft Corporation
    Product Name          : Microsoft® Windows® Operating System
    Product Version       : 10.0.19041.789

ModLoad: 00000000c8150000 00000000000ac000 C:\Windows\System32\ADVAPI32.dll (6.2.19041.610) (-exported- Symbols Loaded)
    Linked PDB Filename   : advapi32.pdb
    File Version          : 10.0.19041.1 (WinBuild.160101.0800)
    Company Name          : Microsoft Corporation
    Product Name          : Microsoft® Windows® Operating System
    Product Version       : 10.0.19041.1

ModLoad: 00000000c9480000 000000000009e000 C:\Windows\System32\msvcrt.dll (7.0.19041.546) (-exported- Symbols Loaded)
    Linked PDB Filename   : msvcrt.pdb
    File Version          : 7.0.19041.546 (WinBuild.160101.0800)
    Company Name          : Microsoft Corporation
    Product Name          : Microsoft® Windows® Operating System
    Product Version       : 7.0.19041.546

ModLoad: 00000000c8310000 000000000009c000 C:\Windows\System32\sechost.dll (6.2.19041.789) (-exported- Symbols Loaded)
    Linked PDB Filename   : sechost.pdb
    File Version          : 10.0.19041.1 (WinBuild.160101.0800)
    Company Name          : Microsoft Corporation
    Product Name          : Microsoft® Windows® Operating System
    Product Version       : 10.0.19041.1

ModLoad: 00000000c82e0000 0000000000030000 C:\Windows\System32\IMM32.DLL (6.2.19041.546) (-exported- Symbols Loaded)
    Linked PDB Filename   : imm32.pdb
    File Version          : 10.0.19041.546 (WinBuild.160101.0800)
    Company Name          : Microsoft Corporation
    Product Name          : Microsoft® Windows® Operating System
    Product Version       : 10.0.19041.546

ModLoad: 00000000c4140000 0000000000012000 C:\Windows\SYSTEM32\kernel.appcore.dll (6.2.19041.546) (-exported- Symbols Loaded)
    Linked PDB Filename   : Kernel.Appcore.pdb
    File Version          : 10.0.19041.546 (WinBuild.160101.0800)
    Company Name          : Microsoft Corporation
    Product Name          : Microsoft® Windows® Operating System
    Product Version       : 10.0.19041.546

ModLoad: 00000000c6d00000 0000000000033000 C:\Windows\SYSTEM32\ntmarta.dll (6.2.19041.546) (-exported- Symbols Loaded)
    Linked PDB Filename   : ntmarta.pdb
    File Version          : 10.0.19041.1 (WinBuild.160101.0800)
    Company Name          : Microsoft Corporation
    Product Name          : Microsoft® Windows® Operating System
    Product Version       : 10.0.19041.1

ModLoad: 00000000c6170000 000000000000c000 C:\Windows\SYSTEM32\CRYPTBASE.DLL (6.2.19041.546) (-exported- Symbols Loaded)
    Linked PDB Filename   : cryptbase.pdb
    File Version          : 10.0.19041.546 (WinBuild.160101.0800)
    Company Name          : Microsoft Corporation
    Product Name          : Microsoft® Windows® Operating System
    Product Version       : 10.0.19041.546

ModLoad: 00000000c6fe0000 0000000000080000 C:\Windows\System32\bcryptPrimitives.dll (6.2.19041.662) (-exported- Symbols Loaded)
    Linked PDB Filename   : bcryptprimitives.pdb
    File Version          : 10.0.19041.662 (WinBuild.160101.0800)
    Company Name          : Microsoft Corporation
    Product Name          : Microsoft® Windows® Operating System
    Product Version       : 10.0.19041.662

ModLoad: 00000000c69e0000 00000000001e4000 C:\Windows\SYSTEM32\dbghelp.dll (6.2.19041.804) (-exported- Symbols Loaded)
    Linked PDB Filename   : dbghelp.pdb
    File Version          : 10.0.19041.804 (WinBuild.160101.0800)
    Company Name          : Microsoft Corporation
    Product Name          : Microsoft® Windows® Operating System
    Product Version       : 10.0.19041.804

ModLoad: 00000000c41f0000 000000000000a000 C:\Windows\SYSTEM32\version.dll (6.2.19041.546) (-exported- Symbols Loaded)
    Linked PDB Filename   : version.pdb
    File Version          : 10.0.19041.546 (WinBuild.160101.0800)
    Company Name          : Microsoft Corporation
    Product Name          : Microsoft® Windows® Operating System
    Product Version       : 10.0.19041.546



*** Dump of the Process Statistics: ***

- I/O Operations Counters -
Read: 29778, Write: 5295, Other 73117

- I/O Transfers Counters -
Read: 107012609, Write: 18039520, Other 19508

- Paged Pool Usage -
QuotaPagedPoolUsage: 318232, QuotaPeakPagedPoolUsage: 318408
QuotaNonPagedPoolUsage: 22296, QuotaPeakNonPagedPoolUsage: 24880

- Virtual Memory Usage -
VirtualSize: 1200349184, PeakVirtualSize: 1794699264

- Pagefile Usage -
PagefileUsage: 1200349184, PeakPagefileUsage: 1262059520

- Working Set Size -
WorkingSetSize: 1205559296, PeakWorkingSetSize: 1266581504, PageFaultCount: 1236762

*** Dump of thread ID 7372 (state: Initialized): ***

- Information -
Status: Base Priority: Normal, Priority: Normal, , Kernel Time: 0.000000, User Time: 0.000000, Wait Time: 0.000000

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x0000000000000000 

- Registers -
rax=000000000000003a rbx=00000000408f68a0 rcx=0000000041330d50 rdx=0000000041410e88 rsi=000000000000000b rdi=0000000041330d50
r8=000000000000003a r9=0000000000000421 r10=00000000b8126e80 r11=0000000059147600 r12=00000000b4580000 r13=000000005915fc90
r14=0000000059147d40 r15=000000000048b215 rip=0000000000000000 rsp=0000000059147678 rbp=0000000000000000
cs=0033  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00010206

- Callstack -
ChildEBP RetAddr  Args to Child
(-nosymbols- PC == 0)
59147670 b4a5831c 00000000 b8126d60 b8126e80 59147658 !&#128;+0x0 
591476a0 b4a1935d 408f68a0 59147740 b4a0b215 00000000 rosetta_4.20_windows_x86_64!xmlParserInputRead+0x0 
591476d0 b7b87f10 b8a70150 5915fc90 00000000 00000001 rosetta_4.20_windows_x86_64!xmlParserInputRead+0x0 
59147700 b4a039e8 b9b1a32c b4580000 591477f0 c98a0e7b rosetta_4.20_windows_x86_64!xmlValidateNotationDecl+0x0 
59147770 c9911f6f 00000000 59147cf0 591483b0 00000000 rosetta_4.20_windows_x86_64!xmlParserInputRead+0x0 
591477a0 c98c1454 00000000 59147cf0 591483b0 00000000 ntdll!__chkstk+0x0 
59147eb0 c9910a9e 7bb7ff10 7bb7ff20 00000020 fffffffe ntdll!RtlRaiseException+0x0 
591485c0 b4c5a77f 083e6b80 c98947b1 7e647208 408e0000 ntdll!KiUserExceptionDispatcher+0x0 
59148600 b50e7c98 7e647360 b82943b8 7e647300 b82943b8 rosetta_4.20_windows_x86_64!cppdb::session::is_open+0x0 
59148640 b50beadf 7e647300 7e647300 fffffffe b823fa88 rosetta_4.20_windows_x86_64!cppdb::backend::statement::cache+0x0 
59148680 b50bf664 ffffffff 00000000 591488e0 6d257bb0 rosetta_4.20_windows_x86_64!cppdb::backend::statement::cache+0x0 
591486b0 b50c0f1e 76e03ab0 74693680 ffffffff b49f7807 rosetta_4.20_windows_x86_64!cppdb::backend::statement::cache+0x0 
591486e0 b6447b0e 750a9380 74693680 fffffffe b49f7807 rosetta_4.20_windows_x86_64!cppdb::backend::statement::cache+0x0 
59148720 b50beb26 00000001 6d257bb0 412110a8 412110a8 rosetta_4.20_windows_x86_64!cppdb::backend::statements_cache::active+0x0 
59148760 b50bf6a4 00000001 412110a8 fffffffe 00000000 rosetta_4.20_windows_x86_64!cppdb::backend::statement::cache+0x0 
59148790 b4e578bf 7aa24d30 591488e0 6d257bb0 59148868 rosetta_4.20_windows_x86_64!cppdb::backend::statement::cache+0x0 
591487d0 b52c03f6 00000000 00000001 b99f2bc0 6d257bb0 rosetta_4.20_windows_x86_64!cppdb::backend::statement::cache+0x0 
59149060 b5482c41 00000000 74693680 75b52c60 ffffffff rosetta_4.20_windows_x86_64!cppdb::backend::statement::cache+0x0 
59149110 b5482ee9 74f60ca0 59149310 75b52c60 ffffffff rosetta_4.20_windows_x86_64!xmlCheckHTTPInput+0x0 
59149190 b54810d3 74f60ca0 59149310 74693680 74693680 rosetta_4.20_windows_x86_64!xmlCheckHTTPInput+0x0 
59149200 b70c9896 000000d2 74d63dd0 7a346230 74d63dd0 rosetta_4.20_windows_x86_64!xmlCheckHTTPInput+0x0 
591493a0 b545b2a5 7a346230 59149410 7a346230 b542701d rosetta_4.20_windows_x86_64!cppdb::mutex::~mutex+0x0 
59149540 b545a402 4137dd20 ffffffff ffffffff 09edc118 rosetta_4.20_windows_x86_64!xmlCheckHTTPInput+0x0 
59149800 b5402dc0 b99f6ec0 40e8b3a0 09377d10 40e8b3a0 rosetta_4.20_windows_x86_64!xmlCheckHTTPInput+0x0 
59149da0 b54000f8 40e8b3a0 40e8b3a0 59149eb0 59149e98 rosetta_4.20_windows_x86_64!xmlCheckHTTPInput+0x0 
59149f70 b546a4d1 59149ff8 40e8b3a0 59149ff8 5914a180 rosetta_4.20_windows_x86_64!xmlCheckHTTPInput+0x0 
59149fc0 b545006d 40cf0590 59149ff8 40cf0590 408f7c01 rosetta_4.20_windows_x86_64!xmlCheckHTTPInput+0x0 
5914a050 b4a00dfd 412c95a0 5914a180 00000000 408f7c01 rosetta_4.20_windows_x86_64!xmlCheckHTTPInput+0x0 
5915fc80 b4a0b215 00000000 00000000 b992ccf8 00000000 rosetta_4.20_windows_x86_64!xmlParserInputRead+0x0 
5915fcc0 c7db7034 00000000 00000000 00000000 00000000 rosetta_4.20_windows_x86_64!xmlParserInputRead+0x0 
5915fcf0 c98c2651 00000000 00000000 00000000 00000000 KERNEL32!BaseThreadInitThunk+0x0 
5915fd70 00000000 00000000 00000000 00000000 00000000 ntdll!RtlUserThreadStart+0x0 

*** Dump of thread ID 32760 (state: Initialized): ***

- Information -
Status: Base Priority: Normal, Priority: Unknown, , Kernel Time: 32.000000, User Time: 0.000000, Wait Time: 1795168000.000000

- Registers -
rax=0000000000000000 rbx=0000000000000000 rcx=0000000000000000 rdx=0000000000000000 rsi=0000000000000000 rdi=0000000000000000
r8=0000000000000000 r9=0000000000000000 r10=0000000000000000 r11=0000000000000000 r12=0000000000000000 r13=0000000000000000
r14=0000000000000000 r15=0000000000000000 rip=0000000000000000 rsp=0000000000000000 rbp=0000000000000000
cs=0000  ss=0000  ds=0000  es=0000  fs=0000  gs=0000             efl=00000000

- Callstack -
ChildEBP RetAddr  Args to Child
(-nosymbols- PC == 0)
00000000 00000000 00000000 00000000 00000000 00000000 !+0x0 

*** Dump of thread ID 30909726 (state: Unknown): ***

- Information -
Status: Base Priority: Normal, Priority: Unknown, , Kernel Time: 17179869184.000000, User Time: 21474842624.000000, Wait Time: 0.000000

- Registers -
rax=0000000000000000 rbx=0000000000000000 rcx=0000000000000000 rdx=0000000000000000 rsi=0000000000000000 rdi=0000000000000000
r8=0000000000000000 r9=0000000000000000 r10=0000000000000000 r11=0000000000000000 r12=0000000000000000 r13=0000000000000000
r14=0000000000000000 r15=0000000000000000 rip=0000000000000000 rsp=0000000000000000 rbp=0000000000000000
cs=0000  ss=0000  ds=0000  es=0000  fs=0000  gs=0000             efl=00000000

- Callstack -
ChildEBP RetAddr  Args to Child
(-nosymbols- PC == 0)
00000000 00000000 00000000 00000000 00000000 00000000 !+0x0 


*** Debug Message Dump ****


*** Foreground Window Data ***
    Window Name      : 
    Window Class     : 
    Window Process ID: 0
    Window Thread ID : 0

Exiting...

</stderr_txt>
]]>


You haven't been playing with your overclocking again?

Will have to wait a while to see if other people are getting the same errors with the same Tasks.
Grant
Darwin NT
ID: 102563 · Rating: 0
Sid Celery
Joined: 11 Feb 08
Posts: 1987
Credit: 38,503,003
RAC: 14,287
Message 102564 - Posted: 9 Sep 2021, 10:58:07 UTC - in response to Message 102563.  


Yeah, cancel all that...
Oh...

Could have been an issue with your system (an AV programme comes to mind), or there was a change on the server: work was allocated to you, then deleted from the server before you downloaded it.

[...]

You haven't been playing with your overclocking again?

Will have to wait a while to see if other people are getting the same errors with the same Tasks.

Pretty sure it's not the AV. More likely the O/clock corrupting the file somewhere down the line.
I have been playing with the o/c but turning everything down by quite a lot, including voltage.
The weird thing is all my WCG tasks are very stable, but Rosetta tasks aren't playing at all.
And tasks do generally look ok as my laptop has completed several without a murmur.

So, it's me, not the tasks. I need to have a long think when I get more time (which may never happen)
ID: 102564 · Rating: 0
[VENETO] boboviz
Joined: 1 Dec 05
Posts: 1863
Credit: 8,184,675
RAC: 7,690
Message 102565 - Posted: 9 Sep 2021, 14:18:48 UTC - in response to Message 102564.  

I'm crunching some _5nvx_ without the usual error after a few minutes.
I'm waiting to finish these and to see results.
ID: 102565 · Rating: 0
[VENETO] boboviz
Joined: 1 Dec 05
Posts: 1863
Credit: 8,184,675
RAC: 7,690
Message 102566 - Posted: 9 Sep 2021, 16:03:58 UTC - in response to Message 102565.  

I'm crunching some _5nvx_ without the usual error after a few minutes.
I'm waiting to finish these and to see results.


First _5nvx_ correctly done!
1423165551
ID: 102566 · Rating: 0
Tomcat雄猫
Joined: 20 Dec 14
Posts: 180
Credit: 5,370,930
RAC: 601
Message 102568 - Posted: 10 Sep 2021, 4:49:37 UTC - in response to Message 102566.  

Can't wait to try out the fixed _5nvx_ tasks.

BOINC insists on getting WCG tasks despite Rosetta having twice the resource share.
ID: 102568 · Rating: 0
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1484
Credit: 14,639,415
RAC: 12,959
Message 102569 - Posted: 10 Sep 2021, 5:37:49 UTC - in response to Message 102568.  

Can't wait to try out the fixed _5nvx_ tasks.

BOINC insists on getting WCG tasks despite Rosetta having twice the resource share.
WCG Resource share is different to all other BOINC projects in the way it is set (and probably as a result how it is honoured).
Better to reduce the share for WCG (from memory, you have to do it at their site) than to increase the share for Rosetta.

From a quick search:
"At WCG, "Resource Share" is called "Project Weight", and is set under Device Manager > Device Profiles."

If Rosetta is set to 200, WCG needs to be 100 for Rosetta to have double.
And given the number of projects you run, the smaller your cache, the sooner your Resource share settings will be honoured. 0 Cache would be best- 0 days + 0.01 additional days.
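
To put numbers on that: resource share is relative, so each project's long-run target is its share divided by the total of all shares. A quick Python sketch; the helper name `share_fractions` is hypothetical, and this only shows the target split, not how the BOINC scheduler actually picks Tasks.

```python
def share_fractions(shares):
    """Turn raw resource-share values into each project's fraction of the total."""
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("total resource share must be positive")
    return {project: value / total for project, value in shares.items()}


if __name__ == "__main__":
    # Rosetta 200 and WCG 100, as in the post above.
    for project, fraction in share_fractions({"Rosetta": 200, "WCG": 100}).items():
        print(f"{project}: {fraction:.1%} of processing time")
```

With only those two projects attached, Rosetta's target is two thirds of the processing time. Every extra project with a non-zero share shrinks both fractions.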
Grant
Darwin NT
ID: 102569 · Rating: 0
Tomcat雄猫
Joined: 20 Dec 14
Posts: 180
Credit: 5,370,930
RAC: 601
Message 102570 - Posted: 10 Sep 2021, 8:42:44 UTC - in response to Message 102569.  
Last modified: 10 Sep 2021, 8:48:29 UTC

That's exactly my setting, 100 for WCG, 200 for Rosetta, and a ridiculously small cache. It does not seem to work well, since my RAC for WCG is always over twice that of Rosetta.

I think the problem is that WCG has GPU tasks and those massively boost my RAC.
ID: 102570 · Rating: 0
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1484
Credit: 14,639,415
RAC: 12,959
Message 102571 - Posted: 10 Sep 2021, 9:04:33 UTC - in response to Message 102570.  
Last modified: 10 Sep 2021, 9:26:57 UTC

That's exactly my setting, 100 for WCG, 200 for Rosetta, and a ridiculously small cache. It does not seem to work well, since my RAC for WCG is always over twice that of Rosetta.

I think the problem is that WCG has GPU tasks and those massively boost my RAC.
Your cache isn't ridiculously small. For the number of projects you've signed up to, it's huge (even with only two projects, it's still a large cache).
The idea of the cache is to give you enough work till you're able to connect to the internet again. As most people are able to connect as required, it makes a cache unnecessary. If you run only one project, then you may use a cache to get through project outages. But if you have more than 1 project, then there's no need for a cache.

If you had no cache, then as soon as Rosetta started producing work again, you'd have gotten some. As it is, your cache is over a day, WCG is actually a whole bunch of projects, and there has been no work from Rosetta for several days (after an earlier outage of several days, with only a week or so of work in between). That means your system will have loaded up on WCG Tasks, and on Tasks from any of your other Projects that had work available.
Until most of them have been cleared you won't get any more Rosetta Tasks. Then BOINC should do Rosetta to meet your Resource share settings, and do the odd WCG Task once the debt to Rosetta has been repaid. But if Rosetta runs out of work again (although that shouldn't happen as quickly this time), your large cache setting means it would load up on Tasks for other projects, delaying Rosetta work when it becomes available again.




Edit- oh, and the Scheduler uses REC (Recent Estimated Credit) for determining scheduling, not the actual Credit granted by a project. So a project that pays well above the definition of a Cobblestone will show more Credit than one that pays by the book (or underpays), even if they have actually done work in accordance with their Resource Share settings.
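
A rough illustration of why granted Credit can mislead here. All numbers are invented: assume Rosetta gets twice the processing of WCG, as a 200/100 Resource share asks for, but WCG happens to pay four times the Credit per unit of work. The function name `granted_credit` and both pay rates are hypothetical.

```python
def granted_credit(work_done, credit_per_unit):
    """Multiply the work done for each project by that project's pay rate."""
    return {
        project: work_done[project] * credit_per_unit[project]
        for project in work_done
    }


if __name__ == "__main__":
    # Hypothetical work split that honours a 200/100 resource share exactly.
    work_done = {"Rosetta": 200.0, "WCG": 100.0}
    # Hypothetical pay rates: WCG pays four times as much per unit of work.
    credit_per_unit = {"Rosetta": 1.0, "WCG": 4.0}
    for project, credit in granted_credit(work_done, credit_per_unit).items():
        print(f"{project}: {credit:.0f} credits")
```

In this made-up case WCG ends up with twice Rosetta's Credit while doing half the work, so comparing RAC between projects says little about whether the Resource share was honoured.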
Grant
Darwin NT
ID: 102571 · Rating: 0
Tomcat雄猫
Joined: 20 Dec 14
Posts: 180
Credit: 5,370,930
RAC: 601
Message 102572 - Posted: 10 Sep 2021, 10:58:11 UTC - in response to Message 102571.  
Last modified: 10 Sep 2021, 11:26:28 UTC

Wait, is 0.1 + 0 days a huge cache? That's the setting I have locally (BTW, what makes you think my cache settings are over a day? Are my actual settings listed somewhere? If my cache setting is actually 1 day, that could explain some things.). I believe local cache settings take precedence over web settings (I have left web cache settings at default), is that correct?

The only WCG tasks I have are the ones that are currently running; there aren't any in the queue. There are Rosetta tasks fresh in the queue though, since the WCG tasks (ARP, 12 hours per task on my PC) are all nearly done.

My problem is that BOINC seems to really like downloading WCG tasks over Rosetta tasks, regardless of RAC. Your explanation of REC makes a lot of sense. Thanks.
ID: 102572 · Rating: 0
fkmaster
Joined: 19 Jan 06
Posts: 2
Credit: 20,551,989
RAC: 6,034
Message 102573 - Posted: 10 Sep 2021, 15:52:02 UTC

Hi,

One of my computers has not got tasks for a week. I reset the project, then reinstalled the whole BOINC client to the newest version, but nothing happened. Other computers get jobs fine. This computer has only Windows 10 32-bit - does this matter?
Thanks in advance,

fkmaster
ID: 102573 · Rating: 0
dcdc
Joined: 3 Nov 05
Posts: 1829
Credit: 115,751,398
RAC: 58,222
Message 102574 - Posted: 10 Sep 2021, 20:18:20 UTC - in response to Message 102573.  
Last modified: 10 Sep 2021, 20:18:29 UTC

I'm not sure if there are tasks that will run on the 32-bit version, but if you can't upgrade to a 64-bit OS you could run Rosetta in a 64-bit Linux VM, e.g. WSL.
ID: 102574 · Rating: 0
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1484
Credit: 14,639,415
RAC: 12,959
Message 102575 - Posted: 10 Sep 2021, 20:33:51 UTC - in response to Message 102572.  

Wait, is 0.1 + 0 days a huge cache?
If you've got more than 1 project, yes. And given that the deadlines for Rosetta are 3 days, any cache is a very significant portion of the deadline limit. Whereas if the deadline on a Task is 2 months and you have a 7 day cache, it's an insignificant portion of the deadline period.
So you would be much better off with it as 0 & 0.01



That's the setting I have locally (BTW, what makes you think my cache settings are over a day? Are my actual settings listed somewhere? If my cache setting is actually 1 day, that could explain some things.).
Your cache is over a day.
If you look at your Computer list, and click on Details for the Ryzen 5 3600, it shows your Average turnaround time is 1.32 days. Click on "Application details- Show" and the turnaround time for Rosetta 4.20 (the current application) is 1.71 days.
That's a massive portion of the deadline period.
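
Putting rough numbers on "massive portion", using the figures quoted above (1.71 days turnaround against a 3 day deadline, versus a 7 day cache against a 2 month deadline). A small Python sketch; the function name `deadline_fraction` is hypothetical and "2 months" is taken as 60 days for the sake of the arithmetic.

```python
def deadline_fraction(days_held, deadline_days):
    """Fraction of a Task's deadline used up by the time it is held or cached."""
    if deadline_days <= 0:
        raise ValueError("deadline must be positive")
    return days_held / deadline_days


if __name__ == "__main__":
    # Rosetta: 1.71 days average turnaround, 3 day deadline.
    print(f"Rosetta: {deadline_fraction(1.71, 3.0):.0%} of the deadline")
    # Long-deadline project: 7 day cache, roughly 60 day deadline (assumed).
    print(f"Long deadline: {deadline_fraction(7.0, 60.0):.0%} of the deadline")
```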



I believe local cache settings take precedence over web settings.
Yep.
Local settings override any web based settings.
It allows those with heaps of systems to set up different settings for multiple computers using the web based options & locations (School, Work & Home). If there are one or two computers that are used regularly for other things and are impacted in some way by BOINC crunching, they can use the Local settings on those & not affect any of the other systems.



(I have left web cache settings at default), is that correct?
I've no way of knowing- only you can see your account pages & the values there.




The only WCG tasks I have are the ones that are currently running; there aren't any in the queue. There are Rosetta tasks fresh in the queue though, since the WCG tasks (ARP, 12 hours per task on my PC) are all nearly done.
So once those WCG Tasks are done, and the already downloaded Rosetta ones start, then the next Tasks to be downloaded should be more Rosetta.
Grant
Darwin NT
ID: 102575 · Rating: 0
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1484
Credit: 14,639,415
RAC: 12,959
Message 102576 - Posted: 10 Sep 2021, 20:44:05 UTC - in response to Message 102573.  
Last modified: 10 Sep 2021, 20:47:49 UTC

One of my computers has not got tasks for a week. I reset the project, then reinstalled the whole BOINC client to the newest version, but nothing happened. Other computers get jobs fine. This computer has only Windows 10 32-bit - does this matter?
Only one of your systems has Rosetta work- the Ryzen 5 3600.
None of your other systems have any Rosetta work at present. This is most likely due to their low core/thread counts, the fact that Rosetta is your least important project behind the other 2, your large cache settings, and Rosetta having been out of work for a while.

If you reduce your cache to zero, then once those systems have completed their present work for the other projects, they should start getting some Rosetta work. And with the smaller cache your Resource share settings will be met much sooner.

Go to Your account, Preferences, "When and how BOINC uses your computer", click on Computing preferences, and under Other set:
Store at least 0.0 days of work
Store up to an additional 0.01 days of work
Then save the changes.
The next time a system contacts the Rosetta Scheduler it will get the new settings & they will take effect (or click on Update in the BOINC Manager on each system).
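
For anyone who would rather set the same two values locally than on the web page, here is a sketch that builds the text of a local override file. It assumes the client reads local overrides from a `global_prefs_override.xml` file in the BOINC data directory, and that `work_buf_min_days` and `work_buf_additional_days` are the matching tag names. Check both against your own client before relying on it. `build_override` is just an illustrative helper, and it only returns a string, it doesn't touch any files.

```python
import xml.etree.ElementTree as ET


def build_override(min_days: float, additional_days: float) -> str:
    """Return override XML text for the two work-buffer (cache) settings."""
    if min_days < 0 or additional_days < 0:
        raise ValueError("cache values cannot be negative")
    root = ET.Element("global_preferences")
    # Tag names are assumed to match the BOINC client's preference names.
    ET.SubElement(root, "work_buf_min_days").text = f"{min_days:.2f}"
    ET.SubElement(root, "work_buf_additional_days").text = f"{additional_days:.2f}"
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # The 0 + 0.01 values suggested above.
    print(build_override(0.0, 0.01))
```

Setting the values in the BOINC Manager's own options dialogue gets you to the same place without editing anything by hand.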




Edit- and it would be worth running the BOINC Benchmarks on your i3 system as it is using the default values & Rosetta uses the benchmarks for determining Credit for work done.
Grant
Darwin NT
ID: 102576
Tomcat雄猫

Joined: 20 Dec 14
Posts: 180
Credit: 5,370,930
RAC: 601
Message 102577 - Posted: 10 Sep 2021, 20:56:43 UTC - in response to Message 102576.  
Last modified: 10 Sep 2021, 21:01:20 UTC

One of my computers has not received tasks for a week. I reset the project, then reinstalled the whole BOINC client to the newest version, but nothing happened. My other computers get jobs fine. This computer only has 32-bit Windows 10- does this matter?
Only one of your systems has Rosetta work- the Ryzen 5 3600.
None of your other systems have any Rosetta work at present. This is most likely due to their low core/thread counts, the fact that Rosetta is your least important project behind the other 2, your large cache settings, and Rosetta having been out of work for a while.


Rosetta has the highest resource share (200 vs 100) out of my projects, and I'm currently only running WCG alongside it on my devices. The reason why only one of my systems has Rosetta work is that I only have 2 working devices right now, the other device that is active is an Android phone (the Asus ROG 5 which has a 0 cache setting) which refuses to download Rosetta tasks. It completes a WCG task then insists on downloading another WCG task instead of a Rosetta task.

BTW, all of my devices have either a 0 cache setting or a 0.1 + 0 cache setting set locally, depending on how fast the GPU tasks (if available) run.
ID: 102577
Jim1348

Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 102578 - Posted: 10 Sep 2021, 21:21:03 UTC - in response to Message 102577.  

BTW, all of my devices have either a 0 cache setting or a 0.1 + 0 cache setting set locally, depending on how fast the GPU tasks (if available) run.

That is too short for Rosetta. The work units run longer than that, and they won't download at all. (That will happen on CPDN too, with their long work units.)
Try 0.1 + 0.5 days.
ID: 102578
Profile Grant (SSSF)

Joined: 28 Mar 20
Posts: 1484
Credit: 14,639,415
RAC: 12,959
Message 102579 - Posted: 10 Sep 2021, 21:22:06 UTC - in response to Message 102577.  

Rosetta has the highest resource share (200 vs 100) out of my projects, and I'm currently only running WCG alongside it on my devices. The reason why only one of my systems has Rosetta work is that I only have 2 working devices right now, the other device that is active is an Android phone (the Asus ROG 5 which has a 0 cache setting) which refuses to download Rosetta tasks. It completes a WCG task then insists on downloading another WCG task instead of a Rosetta task.

BTW, all of my devices have either a 0 cache setting or a 0.1 + 0 cache setting set locally, depending on how fast the GPU tasks (if available) run.
If you look at what i posted there, it's all in response to dcdc.

I suggest you check out the post before it, where i quoted what you had posted previously.




BTW, all of my devices have either a 0 cache setting or a 0.1 + 0 cache setting set locally, depending on how fast the GPU tasks (if available) run.
But i'll summarise again- 0 & 0.01 is best, not the other way around.
And the numbers shown against your computers & the applications show your cache is over 1.5 days- that's how long it's taking you to return Rosetta work (i can't see any details for WCG).
Try 0 + 0.01 to see if that reduces the turnaround times- it'll take a couple of days to see if it does.
Grant
Darwin NT
ID: 102579
Profile Grant (SSSF)

Joined: 28 Mar 20
Posts: 1484
Credit: 14,639,415
RAC: 12,959
Message 102580 - Posted: 10 Sep 2021, 21:32:51 UTC - in response to Message 102578.  
Last modified: 10 Sep 2021, 21:33:53 UTC

BTW, all of my devices have either a 0 cache setting or a 0.1 + 0 cache setting set locally, depending on how fast the GPU tasks (if available) run.

That is too short for Rosetta. The work units run longer than that, and they won't download at all. (That will happen on CPDN too, with their long work units.)
Try 0.1 + 0.5 days.
If it doesn't accept 0, then give it 0.01 & 0.01.
CPDN might not give you work, but Rosetta does with those values.

The only thing that will stop (or should stop) you from getting work is if the Task won't be returned before the deadline. Having zero cache means just that- no cache. You will still get work, but you will only have the work you are presently processing. When it's done, another Task will be downloaded.



And you have the values around the wrong way to get the result people are usually after-

If you want a .5 day cache, you need to set it as 0.5 & 0.01 additional days.
4 day cache, 4 & 0.01
Having a large value for Additional days means you will actually end up with less work in your cache than the Days value, because BOINC lets the cache run down until it can fetch the full Additional days amount without going over the Days + Additional days total.
Set the Days value to the size of cache you want, then 0.01 Additional days, so the cache stays around that level rather than dropping significantly below it before re-filling.
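
A toy model of why the Additional value matters. This is a deliberately simplified low-water/high-water sketch and an assumption made for illustration, not BOINC's real work-fetch logic: work drains at a steady rate, and a refill only happens once the level drops below the Days value, topping up to Days + Additional. It is only meant to show how the size of the Additional value sets how far the cache level swings between refills.

```python
def simulate_cache(min_days: float, additional_days: float,
                   hours: int = 240, step_hours: float = 1.0):
    """Return (lowest, highest) buffered work in days seen during the run."""
    low_water = min_days                     # refill is triggered below this
    high_water = min_days + additional_days  # refill tops up to this
    buffered = high_water
    step_days = step_hours / 24.0
    levels = []
    for _ in range(int(hours / step_hours)):
        # Crunching uses up one step's worth of buffered work.
        buffered = max(0.0, buffered - step_days)
        if buffered < low_water:
            buffered = high_water            # fetch work up to the total
        levels.append(buffered)
    return min(levels), max(levels)


if __name__ == "__main__":
    for days, extra in [(0.1, 0.5), (0.5, 0.01)]:
        low, high = simulate_cache(days, extra)
        print(f"{days} + {extra}: swing of {high - low:.2f} days")
```

In this model a 0.1 + 0.5 setting swings by close to half a day between refills, while 0.5 + 0.01 barely moves from its target.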
Grant
Darwin NT
ID: 102580
Jim1348

Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 102581 - Posted: 10 Sep 2021, 23:11:10 UTC - in response to Message 102580.  


The only thing that will stop (or should stop) you from getting work is if the Task won't be returned before the deadline. Having zero cache means just that- no cache. You will still get work- but only have the work you are presently processing, When it's done another Task will be downloaded.

That is so only if he is not running any other projects. But it appeared to me that he is.
ID: 102581

©2024 University of Washington
https://www.bakerlab.org