Problems and Technical Issues with Rosetta@home

[VENETO] boboviz
Joined: 1 Dec 05
Posts: 2002
Credit: 9,790,281
RAC: 2,986
Message 102294 - Posted: 28 Jul 2021, 9:16:11 UTC - in response to Message 102289.  

Funny to note that my Snapdragon 888 is beating my Ryzen 5 3600 and gets much more consistent credits per task. My phone is getting around 394 credits for every 8-hour task, while my Ryzen 5 3600 gets a measly 346 credits per 8-hour task. Either something is up with the credit calculation, or there is something about these pre_helical_bundles_round1_attempt1_SAVE_ALL_OUT_IGNORE_THE_REST tasks that makes them particularly good on ARM. I'm leaning towards the former explanation because: 1) the benchmarks for my phone are 5887.99 million ops/sec floating-point speed and 29296.86 million ops/sec integer speed, whilst my 3600 gets 5198.63 million ops/sec floating-point speed and 19515.05 million ops/sec integer speed; 2) my 3600 seems to be consistently generating more "decoys" than my phone despite the credit deficit. Probably a BOINC issue that's been beaten to death already.


The long long story of "code optimization"...

Tomcat雄猫
Joined: 20 Dec 14
Posts: 180
Credit: 5,386,173
RAC: 0
Message 102295 - Posted: 28 Jul 2021, 9:29:02 UTC - in response to Message 102294.  
Last modified: 28 Jul 2021, 10:06:43 UTC

On WCG, my 3600 is many times faster than my Snapdragon 888, as it should be.

Given that my old family iMac gets a measured floating-point speed of 6471.51 million ops/sec and a measured integer speed of 21013.4 million ops/sec, which would make it about as fast as an Intel 10700K, I think BOINC's benchmark is extremely unreliable between platforms. My iMac is nowhere near as fast as my 3600 in BOINC. The problem is that BOINC's benchmark appears to play a large role in the calculation of credits for Rosetta.

My phone gets around 394 credits for each 8-hour task; 394/5.89 (credits divided by the peak measured floating-point speed in GFLOPS) is 66.89. My 3600 gets around 346 credits for similar tasks; 346/5.20 is 66.54. In other words, the credit granted tracks the benchmark score almost exactly. My old MacBook Pro used to have the same issue: it was getting too many credits compared to my 3600.
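As a quick check of that arithmetic, the short Python sketch below divides the credits per task by the benchmarked floating-point speed for both hosts. The host labels and the `credits_per_gflops` helper are made up for illustration; the numbers are the rounded ones quoted in this post.

```python
# Credits per 8-hour task and peak measured floating-point speed in GFLOPS,
# rounded the same way as in the post.
hosts = {
    "Snapdragon 888": {"credits": 394, "gflops": 5.89},
    "Ryzen 5 3600": {"credits": 346, "gflops": 5.20},
}


def credits_per_gflops(credits: float, gflops: float) -> float:
    """Credits granted per task divided by the benchmarked GFLOPS."""
    return credits / gflops


ratios = {name: credits_per_gflops(**host) for name, host in hosts.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f} credits per benchmark GFLOPS")
```

The two ratios differ by well under 1%, which is what you would expect if the credit were driven mainly by the benchmark rather than by the work actually completed.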

It's important to note that in the helical tasks, my 3600 actually appears to be doing more work (more decoys generated), it's just being granted less credits.

If there is actually a bug, do I want it to be fixed? Yes. Do I think it is an issue Rosetta should focus on? No.

It's difficult to solve such issues with Rosetta@home because of how unique it is. Other projects have tasks of a known computational size, so the speed of your device determines how long each task takes. On Rosetta, the run time of each task is the same, and the speed of your device determines how much work gets done. Determining how much work was actually done is difficult, and BOINC's benchmark is extremely unreliable across platforms.

According to the admin, this is how things are supposed to work:
"What is the average processing rate? And what is it used to calculate?

Related to your initial question, R@h credits are based on a rolling average of claimed credit/model that our server keeps track of for each job set. So you are awarded the average credit per model that previous results have reported back multiplied by the number of models you produced. The first job reported back for a specific job set is granted the claimed credit, but subsequent results are granted the average credit per model * the number of models produced. This works quite well for large job sets (like for protein structure prediction) because a job set may consist of thousands of jobs which eventually gives a good average credit per model value. But it can be a bit variable for small job sets, and if a job set consists of only one job, then you will be granted your claimed credit which is based on the cpu benchmark and run time."
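To make the admin's description concrete, here is a minimal Python sketch of that rule. It is not the project's server code. It assumes the "rolling average" is simply total claimed credit divided by total models over all earlier results in the job set, which the quote does not spell out; the class and method names are hypothetical.

```python
class JobSetCredit:
    """Running credit-per-model average for one job set (illustrative only)."""

    def __init__(self) -> None:
        self.total_claimed = 0.0  # sum of claimed credit from earlier results
        self.total_models = 0     # sum of models from earlier results

    def grant(self, claimed_credit: float, models: int) -> float:
        """Return the credit granted to a newly reported result."""
        if models <= 0:
            raise ValueError("a result must report at least one model")
        if self.total_models == 0:
            # First result for the job set: granted exactly what it claimed.
            granted = claimed_credit
        else:
            # Later results: average credit per model so far, times models produced.
            granted = (self.total_claimed / self.total_models) * models
        # Assumption: the new claim joins the average only after being granted.
        self.total_claimed += claimed_credit
        self.total_models += models
        return granted


job_set = JobSetCredit()
first = job_set.grant(claimed_credit=300.0, models=150)   # granted as claimed
second = job_set.grant(claimed_credit=400.0, models=100)  # 2.0 per model * 100
third = job_set.grant(claimed_credit=100.0, models=50)    # 2.8 per model * 50
print(first, second, third)
```

Under this reading, a host that claims a lot (high benchmark) but produces few models is still paid per model at the job-set average, and its inflated claim only nudges the average for later results. With a small job set the average has few samples behind it, which matches the admin's remark about variability.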

Given the fact my 3600 is consistently generating more models than my phone (170-180 vs 140-150) in the pre_helical_bundles_round1_attempt1_SAVE_ALL_OUT_IGNORE_THE_REST tasks, I expect it to get more credits. There is also the possibility that the tasks my phone is getting have more complex models and thus get more credits per model.
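For a rough sense of scale, the sketch below works out credit per model from those figures. It assumes the midpoint of each quoted range (175 and 145 models) and the per-task credits quoted earlier in this post (346 and 394); the variable names are illustrative only.

```python
# Midpoints of the model counts quoted above (an assumption; the post gives ranges).
ryzen_models = (170 + 180) / 2
phone_models = (140 + 150) / 2

# Credits per 8-hour task quoted earlier in the post.
ryzen_credit_per_model = 346 / ryzen_models
phone_credit_per_model = 394 / phone_models

print(f"Ryzen 5 3600:   {ryzen_credit_per_model:.2f} credits per model")
print(f"Snapdragon 888: {phone_credit_per_model:.2f} credits per model")
print(f"Phone / Ryzen:  {phone_credit_per_model / ryzen_credit_per_model:.2f}x")
```

On these assumptions the phone earns roughly 2.72 credits per model against roughly 1.98 for the 3600, which is only a fair result if the phone's models really are that much more expensive to compute.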

Here's another example of Rosetta's credit calculation being a little wonky:
Possible bug in the "Average Processing Rate" calculation

robertmiles
Joined: 16 Jun 08
Posts: 1234
Credit: 14,338,560
RAC: 826
Message 102297 - Posted: 28 Jul 2021, 18:23:26 UTC - in response to Message 102295.  

[snip]

It's important to note that in the helical tasks, my 3600 actually appears to be doing more work (more decoys generated), it's just being granted less credits.

If there is actually a bug, do I want it to be fixed? Yes. Do I think it is an issue Rosetta should focus on? No.

Note that not all decoys do the same amount of work, so less credit per decoy is not necessarily meaningful.

Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 102299 - Posted: 28 Jul 2021, 19:11:38 UTC

pre_helical_bundles_round1_attempt1_SAVE_ALL_OUT_IGNORE_THE_REST_6fa0fn4c_1391037_5_0
https://boinc.bakerlab.org/rosetta/result.php?resultid=1411691662

<core_client_version>7.16.11</core_client_version>
<![CDATA[
<message>
Incorrect function.
(0x1) - exit code 1 (0x1)</message>
<stderr_txt>
command: projects/boinc.bakerlab.org_rosetta/rosetta_4.20_windows_x86_64.exe -run:protocol jd2_scripting -parser:protocol pre_helix_boinc_v1.xml @helix_design.flags -in:file:silent pre_helical_bundles_round1_attempt1_SAVE_ALL_OUT_IGNORE_THE_REST_6fa0fn4c.silent -in:file:silent_struct_type binary -silent_gz -mute all -silent_read_through_errors true -out:file:silent_struct_type binary -out:file:silent default.out -in:file:boinc_wu_zip pre_helical_bundles_round1_attempt1_SAVE_ALL_OUT_IGNORE_THE_REST_6fa0fn4c.zip @pre_helical_bundles_round1_attempt1_SAVE_ALL_OUT_IGNORE_THE_REST_6fa0fn4c.flags -nstruct 10000 -cpu_run_time 28800 -boinc:max_nstruct 20000 -checkpoint_interval 120 -database minirosetta_database -in::file::zip minirosetta_database.zip -boinc::watchdog -boinc::cpu_run_timeout 36000 -run::rng mt19937 -constant_seed -jran 3747228
Using database: database_357d5d93529_n_methylminirosetta_database

ERROR: [ERROR] Unable to open constraints file: c0d44a74319eb5077ecc3d522c246773_0001.MSAcst
ERROR:: Exit from: ..\..\..\src\core\scoring\constraints\ConstraintIO.cc line: 457
BOINC:: Error reading and gzipping output datafile: default.out
20:45:39 (11432): called boinc_finish(1)

</stderr_txt>
]]>

Grant (SSSF)
Joined: 28 Mar 20
Posts: 1734
Credit: 18,532,940
RAC: 17,945
Message 102302 - Posted: 29 Jul 2021, 5:55:39 UTC - in response to Message 102297.  

[snip]

It's important to note that in the helical tasks, my 3600 actually appears to be doing more work (more decoys generated), it's just being granted less credits.

If there is actually a bug, do I want it to be fixed? Yes. Do I think it is an issue Rosetta should focus on? No.

Note that not all decoys do the same amount of work, so less credit per decoy is not necessarily meaningful.
And the benchmarks are used to determine the amount of work done (there is no actual measurement of work actually done as such...).
So systems with higher benchmark values get more Credit per hour than those with lower benchmark values (unless they are too much higher in which case it is determined they are cheating and Credit awarded becomes some sort of average based on all the work returned to date for such Tasks).
And that is why systems that haven't run the benchmarks and are using the default values get bugger all Credit for the work they return until such time as they eventually do run the benchmarks.
Grant
Darwin NT

Tomcat雄猫
Joined: 20 Dec 14
Posts: 180
Credit: 5,386,173
RAC: 0
Message 102303 - Posted: 29 Jul 2021, 6:07:24 UTC - in response to Message 102302.  

[snip]

It's important to note that in the helical tasks, my 3600 actually appears to be doing more work (more decoys generated), it's just being granted less credits.

If there is actually a bug, do I want it to be fixed? Yes. Do I think it is an issue Rosetta should focus on? No.

Note that not all decoys do the same amount of work, so less credit per decoy is not necessarily meaningful.
And the benchmarks are used to determine the amount of work done (there is no actual measurement of work actually done as such...).
So systems with higher benchmark values get more Credit per hour than those with lower benchmark values (unless they are too much higher in which case it is determined they are cheating and Credit awarded becomes some sort of average based on all the work returned to date for such Tasks).
And that is why systems that haven't run the benchmarks and are using the default values get bugger all Credit for the work they return until such time as they eventually do run the benchmarks.


I've figured that out the hard way. I just find it annoying that the values are so inconsistent across platforms.

I assume these validate errors are also a known issue with the pre_helical_bundles_round1_attempt1_SAVE_ALL_OUT_IGNORE_THE_REST tasks and not some stupid problem with my machine?

pre_helical_bundles_round1_attempt1_SAVE_ALL_OUT_IGNORE_THE_REST_3db8ur8d_1391015_5_0
pre_helical_bundles_round1_attempt1_SAVE_ALL_OUT_IGNORE_THE_REST_5be2mt7m_1391005_5_0

Grant (SSSF)
Joined: 28 Mar 20
Posts: 1734
Credit: 18,532,940
RAC: 17,945
Message 102304 - Posted: 29 Jul 2021, 7:09:29 UTC - in response to Message 102303.  

I assume these validate errors are also a known issue with the pre_helical_bundles_round1_attempt1_SAVE_ALL_OUT_IGNORE_THE_REST tasks
Yep. Been occurring ever since they were first released. You get periods where you see hardly any of them, and then you get batches where as many as 20% could give a Validate or Computation error, either after several hours, or just after a few seconds.
We're back to more than the usual number of Tasks giving issues at present.

Although given the number of Tasks left queued (2 million & falling), we'll probably be out of all work in the next few days unless another big batch of work is released before then.
Grant
Darwin NT

Sid Celery
Joined: 11 Feb 08
Posts: 2146
Credit: 41,570,180
RAC: 8,210
Message 102322 - Posted: 31 Jul 2021, 3:38:22 UTC - in response to Message 102291.  
Last modified: 31 Jul 2021, 3:56:00 UTC

I've been getting quite a lot of these errors on all my devices. Sometimes they get validated, sometimes they just result in a computational error.
Those errors have been occurring with some pre_helical_bundles_ Tasks ever since they were released months ago.

Yes. They all run very short too.
The ones that come up validated award appropriate credits for the time.
Only the very shortest-running come up computation error with no credit, but usually run less than 20 seconds.
They "cost" a little in download time, but it's easier to let them error out than to fix them, so it'll continue occasionally until they're exhausted - won't be much longer now.

Edit: I looked at this in late-May and reported it to admin, who gave the answer above
I've finally got round to examining this issue involving "ERROR:: Exit from: ..\..\..\src\core\scoring\constraints\ConstraintIO.cc line: 457"
I didn't realise I'd been getting this error as much as everyone else and I've now reported it, probably for the first time, which is likely why it's been going on for so long.
I thought there were two types of error resulting from this, but there are actually three.

The one you quote above gives a gzip error and reports in the task list as a Computation Error - boinc_finish(1) - and usually has a CPU runtime of 15 seconds or less, so it awards no credit.
There's another with no gzip error that has a CPU runtime of 3 or 4 minutes and reports a Validate Error - boinc_finish(0) - and awards a few credits when the daily cleanup job runs.
The third one also has no gzip error, has a runtime of only 6-9 minutes and reports as Completed and Validated correctly, but obviously doesn't run fully either.

My main PC is reporting 3 Computation errors, 3 Validate Errors and 50 Validating properly, but 2 of them are running short, so 8 of 56 have a problem - 1 in 7.
All bar 3 are awarding credit for runtime - and those 3 for less than a minute of cpu runtime in total across all 16 cores - so it's a far bigger issue for the project than it is for any user, and I've reported it as a project issue on that basis.
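The three failure types and the 1-in-7 tally can be sketched in a few lines of Python. This is only an illustration: the outcome strings mirror the wording in the task list, the 10-minute cut-off for "running short" is an assumption chosen to sit above the 6-9 minute runtimes mentioned, and the individual CPU times are made-up examples within the quoted ranges.

```python
from collections import Counter
from fractions import Fraction

SHORT_RUN_SECONDS = 600  # hypothetical cut-off for a validated task that ran short


def classify(outcome: str, cpu_seconds: float) -> str:
    """Sort a task into one of the three problem types, or 'ok'."""
    if outcome == "Computation error":
        return "computation_error"      # gzip error, boinc_finish(1), no credit
    if outcome == "Validate error":
        return "validate_error"         # boinc_finish(0), a few credits later
    if outcome == "Completed and validated" and cpu_seconds < SHORT_RUN_SECONDS:
        return "validated_but_short"    # validates, but stopped after minutes
    return "ok"


# Example task list matching the counts reported for the main PC.
tasks = (
    [("Computation error", 15)] * 3
    + [("Validate error", 210)] * 3
    + [("Completed and validated", 450)] * 2
    + [("Completed and validated", 28800)] * 48
)

counts = Counter(classify(outcome, cpu) for outcome, cpu in tasks)
problems = sum(n for kind, n in counts.items() if kind != "ok")
share = Fraction(problems, len(tasks))
print(counts)
print(f"{problems} of {len(tasks)} tasks have a problem ({share})")
```

With those counts, 8 of the 56 tasks fall into one of the three problem types, and the fraction reduces to 1/7.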


mrchips
Joined: 11 Nov 09
Posts: 10
Credit: 15,046,470
RAC: 4,628
Message 102336 - Posted: 3 Aug 2021, 23:21:33 UTC

Scheduler bwsrv1 Not Running

What's up?

mrhastyrib
Joined: 18 Feb 21
Posts: 90
Credit: 2,541,890
RAC: 0
Message 102338 - Posted: 5 Aug 2021, 1:59:53 UTC

This is probably an issue for the BOINC board, but since I know you jamokes I will ask here first.

On one of my hosts, the BOINC manager will not launch. I try to run it as I have in the past, but nothing happens at all.

The host seems to be processing work, based on the CPU usage. But no joy with the manager program.

Any suggestions? I suppose that I can uninstall/reinstall; will that disturb anything?

Bryn Mawr
Joined: 26 Dec 18
Posts: 404
Credit: 12,294,748
RAC: 2,551
Message 102339 - Posted: 5 Aug 2021, 7:07:44 UTC - in response to Message 102338.  

This is probably an issue for the BOINC board, but since I know you jamokes I will ask here first.

On one of my hosts, the BOINC manager will not launch. I try to run it as I have in the past, but nothing happens at all.

The host seems to be processing work, based on the CPU usage. But no joy with the manager program.

Any suggestions? I suppose that I can uninstall/reinstall; will that disturb anything?


Look for a file in your home directory, 5 bytes long and called Boinc-Manager-xxx or similar. If you find it, delete it - it’s a lock file.
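If you'd rather not hunt for it by hand, a small Python sketch like the one below can list likely candidates. It only lists files and deletes nothing. The name test (contains both "boinc" and "manager", case-insensitive) and the 16-byte size limit are assumptions based on the "Boinc-Manager-xxx or similar" and "5 bytes long" description; the function name is made up for this example.

```python
from pathlib import Path


def find_lock_candidates(home: Path, max_bytes: int = 16) -> list[Path]:
    """Return small files directly under `home` whose names look like a BOINC Manager lock file."""
    candidates = []
    for entry in home.iterdir():
        name = entry.name.lower()
        # Assumed naming pattern: both words appear somewhere in the file name.
        if entry.is_file() and "boinc" in name and "manager" in name:
            # A lock file of this kind is tiny; skip anything larger.
            if entry.stat().st_size <= max_bytes:
                candidates.append(entry)
    return sorted(candidates)
```

Calling `find_lock_candidates(Path.home())` prints nothing by itself; loop over the result and check each path before removing anything, and make sure the Manager really isn't running first.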

mrhastyrib
Joined: 18 Feb 21
Posts: 90
Credit: 2,541,890
RAC: 0
Message 102341 - Posted: 5 Aug 2021, 22:32:28 UTC - in response to Message 102339.  

This is probably an issue for the BOINC board, but since I know you jamokes I will ask here first.

On one of my hosts, the BOINC manager will not launch. I try to run it as I have in the past, but nothing happens at all.

The host seems to be processing work, based on the CPU usage. But no joy with the manager program.

Any suggestions? I suppose that I can uninstall/reinstall; will that disturb anything?


Look for a file in your home directory, 5 bytes long and called Boinc-Manager-xxx or similar. If you find it, delete it - it’s a lock file.

mrhastyrib
Joined: 18 Feb 21
Posts: 90
Credit: 2,541,890
RAC: 0
Message 102342 - Posted: 5 Aug 2021, 22:33:19 UTC - in response to Message 102341.  

Look for a file in your home directory, 5 bytes long and called Boinc-Manager-xxx or similar. If you find it, delete it - it’s a lock file.

Five star review, my man. Much obliged.

Ross Parlette
Joined: 10 Nov 05
Posts: 32
Credit: 2,165,044
RAC: 0
Message 102348 - Posted: 7 Aug 2021, 18:52:55 UTC

Are we out of work units? I haven't gotten any in a while.

robertmiles
Joined: 16 Jun 08
Posts: 1234
Credit: 14,338,560
RAC: 826
Message 102349 - Posted: 7 Aug 2021, 19:00:05 UTC - in response to Message 102348.  

Are we out of work units? I haven't gotten any in a while.

I have noticed that completing a GPU task tends to restrict your computer from downloading any CPU tasks for a few days.

Falconet
Joined: 9 Mar 09
Posts: 354
Credit: 1,276,393
RAC: 828
Message 102350 - Posted: 7 Aug 2021, 19:19:59 UTC - in response to Message 102348.  

Yes, work ran out a couple days ago.

Grant (SSSF)
Joined: 28 Mar 20
Posts: 1734
Credit: 18,532,940
RAC: 17,945
Message 102353 - Posted: 7 Aug 2021, 21:36:14 UTC
Last modified: 7 Aug 2021, 21:44:17 UTC

Latest batch of work units, gb10_3CL_3CL_AVLstub_reversed_
Looking at around a 60% or higher error rate. Crash & burn within seconds of starting.


e.g.
gb10_3CL_3CL_AVLstub_reversed_renumbered_293_002400_extract_A_SAVE_ALL_OUT_1728505_321_0


              Outcome Computation error
         Client state Compute error
          Exit status -1073741819 (0xC0000005) STATUS_ACCESS_VIOLATION
          Computer ID 3933928
             Run time 7 sec
             CPU time 1 sec
       Validate state Invalid
               Credit 0.00
    Device peak FLOPS 4.88 GFLOPS
  Application version Rosetta v4.20 windows_x86_64
Peak working set size 42.54 MB
       Peak swap size 16.57 MB
      Peak disk usage 0.01 MB
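
The task summary above shows the exit status as -1073741819, while the stderr output reports exit code 3221225477. These are the same 32-bit value, 0xC0000005 (STATUS_ACCESS_VIOLATION), read once as a signed and once as an unsigned integer. A minimal Python sketch of the conversion, with helper names chosen here for illustration:

```python
def to_unsigned32(value: int) -> int:
    """Interpret an integer as an unsigned 32-bit value."""
    return value & 0xFFFFFFFF


def to_signed32(value: int) -> int:
    """Interpret an integer as a signed 32-bit (two's complement) value."""
    value &= 0xFFFFFFFF
    return value - 0x1_0000_0000 if value & 0x8000_0000 else value


signed_status = -1073741819    # "Exit status" in the task summary
unsigned_status = 3221225477   # "exit code" in the stderr message

print(hex(to_unsigned32(signed_status)))   # same bit pattern as unsigned_status
print(to_signed32(unsigned_status))        # same number as signed_status
```

So the two numbers are not two different errors; both lines describe one access violation.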




[pre]Stderr output
<core_client_version>7.16.11</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code 3221225477 (0xc0000005)</message>
<stderr_txt>
command: projects/boinc.bakerlab.org_rosetta/rosetta_4.20_windows_x86_64.exe @gb10_3CL_3CL_AVLstub_reversed_renumbered_293_002400_extract_A.flags -nstruct 10000 -cpu_run_time 28800 -boinc:max_nstruct 20000 -checkpoint_interval 120 -mute all -database minirosetta_database -in::file::zip minirosetta_database.zip -boinc::watchdog -boinc::cpu_run_timeout 36000 -run::rng mt19937 -constant_seed -jran 3657876
Using database: database_357d5d93529_n_methylminirosetta_database


Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x00007FF64E658698

Engaging BOINC Windows Runtime Debugger...



********************


BOINC Windows Runtime Debugger Version 7.9.0


Dump Timestamp : 08/08/21 06:55:47
Install Directory : C:\Program Files\BOINC
Data Directory : C:\ProgramData\BOINC
Project Symstore : https://boinc.bakerlab.org/rosetta/symstore
LoadLibraryA( C:\ProgramData\BOINC\dbghelp.dll ): GetLastError = 126
Loaded Library : dbghelp.dll
LoadLibraryA( C:\ProgramData\BOINC\symsrv.dll ): GetLastError = 126
LoadLibraryA( symsrv.dll ): GetLastError = 126
LoadLibraryA( C:\ProgramData\BOINC\srcsrv.dll ): GetLastError = 126
LoadLibraryA( srcsrv.dll ): GetLastError = 126
LoadLibraryA( C:\ProgramData\BOINC\version.dll ): GetLastError = 126
Loaded Library : version.dll
Debugger Engine : 4.0.5.0
Symbol Search Path: C:\ProgramData\BOINC\slots\9;C:\ProgramData\BOINC\projects\boinc.bakerlab.org_rosetta;srv*C:\ProgramData\BOINC\projects\boinc.bakerlab.org_rosetta\symbols*http://msdl.microsoft.com/download/symbols;srv*C:\ProgramData\BOINC\projects\boinc.bakerlab.org_rosetta\symbols*https://boinc.bakerlab.org/rosetta/symstore


ModLoad: 000000004aaa0000 00000000057ef000 C:\ProgramData\BOINC\projects\boinc.bakerlab.org_rosetta\rosetta_4.20_windows_x86_64.exe (-exported- Symbols Loaded)
Linked PDB Filename : C:\cygwin64\home\boinc\4.17\Rosetta\main\source\ide\VisualStudio\x64\BoincRelease\rosetta_4.20_windows_x86_64.pdb

ModLoad: 00000000bff30000 00000000001f5000 C:\WINDOWS\SYSTEM32\ntdll.dll (6.2.19041.1081) (-exported- Symbols Loaded)
Linked PDB Filename : ntdll.pdb
File Version : 10.0.19041.1023 (WinBuild.160101.0800)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 10.0.19041.1023

ModLoad: 00000000bf370000 00000000000bd000 C:\WINDOWS\System32\KERNEL32.DLL (6.2.19041.1023) (-exported- Symbols Loaded)
Linked PDB Filename : kernel32.pdb
File Version : 10.0.19041.1023 (WinBuild.160101.0800)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 10.0.19041.1023

ModLoad: 00000000bd670000 00000000002c9000 C:\WINDOWS\System32\KERNELBASE.dll (6.2.19041.1081) (-exported- Symbols Loaded)
Linked PDB Filename : kernelbase.pdb
File Version : 10.0.19041.1023 (WinBuild.160101.0800)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 10.0.19041.1023

ModLoad: 00000000bf220000 000000000006b000 C:\WINDOWS\System32\WS2_32.dll (6.2.19041.546) (-exported- Symbols Loaded)
Linked PDB Filename : ws2_32.pdb
File Version : 10.0.19041.1081 (WinBuild.160101.0800)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 10.0.19041.1081

ModLoad: 00000000bede0000 000000000012a000 C:\WINDOWS\System32\RPCRT4.dll (6.2.19041.1081) (-exported- Symbols Loaded)
Linked PDB Filename : rpcrt4.pdb
File Version : 10.0.19041.1 (WinBuild.160101.0800)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 10.0.19041.1

ModLoad: 00000000be840000 00000000001a0000 C:\WINDOWS\System32\USER32.dll (6.2.19041.906) (-exported- Symbols Loaded)
Linked PDB Filename : user32.pdb
File Version : 10.0.19038.1 (WinBuild.160101.0800)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 10.0.19038.1

ModLoad: 00000000bdbb0000 0000000000022000 C:\WINDOWS\System32\win32u.dll (6.2.19041.1081) (-exported- Symbols Loaded)
Linked PDB Filename : win32u.pdb
File Version : 10.0.19041.1081 (WinBuild.160101.0800)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 10.0.19041.1081

ModLoad: 00000000bf980000 000000000002a000 C:\WINDOWS\System32\GDI32.dll (6.2.19041.746) (-exported- Symbols Loaded)
Linked PDB Filename : gdi32.pdb
File Version : 10.0.19041.746 (WinBuild.160101.0800)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 10.0.19041.746

ModLoad: 00000000bde90000 000000000010b000 C:\WINDOWS\System32\gdi32full.dll (6.2.19041.928) (-exported- Symbols Loaded)
Linked PDB Filename : gdi32full.pdb
File Version : 10.0.19041.928 (WinBuild.160101.0800)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 10.0.19041.928

ModLoad: 00000000bddf0000 000000000009d000 C:\WINDOWS\System32\msvcp_win.dll (6.2.19041.789) (-exported- Symbols Loaded)
Linked PDB Filename : msvcp_win.pdb
File Version : 10.0.19041.789 (WinBuild.160101.0800)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 10.0.19041.789

ModLoad: 00000000bdcf0000 0000000000100000 C:\WINDOWS\System32\ucrtbase.dll (6.2.19041.789) (-exported- Symbols Loaded)
Linked PDB Filename : ucrtbase.pdb
File Version : 10.0.19041.789 (WinBuild.160101.0800)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 10.0.19041.789

ModLoad: 00000000bf040000 00000000000ac000 C:\WINDOWS\System32\ADVAPI32.dll (6.2.19041.1052) (-exported- Symbols Loaded)
Linked PDB Filename : advapi32.pdb
File Version : 10.0.19041.1 (WinBuild.160101.0800)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 10.0.19041.1

ModLoad: 00000000bed40000 000000000009e000 C:\WINDOWS\System32\msvcrt.dll (7.0.19041.546) (-exported- Symbols Loaded)
Linked PDB Filename : msvcrt.pdb
File Version : 7.0.19041.546 (WinBuild.160101.0800)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 7.0.19041.546

ModLoad: 00000000bf7b0000 000000000009b000 C:\WINDOWS\System32\sechost.dll (6.2.19041.906) (-exported- Symbols Loaded)
Linked PDB Filename : sechost.pdb
File Version : 10.0.19041.1 (WinBuild.160101.0800)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 10.0.19041.1

ModLoad: 00000000bf160000 0000000000030000 C:\WINDOWS\System32\IMM32.DLL (6.2.19041.546) (-exported- Symbols Loaded)
Linked PDB Filename : imm32.pdb
File Version : 10.0.19041.546 (WinBuild.160101.0800)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 10.0.19041.546

ModLoad: 00000000bb5c0000 0000000000012000 C:\WINDOWS\SYSTEM32\kernel.appcore.dll (6.2.19041.546) (-exported- Symbols Loaded)
Linked PDB Filename : Kernel.Appcore.pdb
File Version : 10.0.19041.546 (WinBuild.160101.0800)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 10.0.19041.546

ModLoad: 00000000bc3a0000 0000000000033000 C:\WINDOWS\SYSTEM32\ntmarta.dll (6.2.19041.546) (-exported- Symbols Loaded)
Linked PDB Filename : ntmarta.pdb
File Version : 10.0.19041.1 (WinBuild.160101.0800)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 10.0.19041.1

ModLoad: 00000000b8100000 00000000001e4000 C:\WINDOWS\SYSTEM32\dbghelp.dll (6.2.19041.867) (-exported- Symbols Loaded)
Linked PDB Filename : dbghelp.pdb
File Version : 10.0.19041.867 (WinBuild.160101.0800)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 10.0.19041.867

ModLoad: 00000000b8c20000 000000000000a000 C:\WINDOWS\SYSTEM32\version.dll (6.2.19041.546) (-exported- Symbols Loaded)
Linked PDB Filename : version.pdb
File Version : 10.0.19041.546 (WinBuild.160101.0800)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 10.0.19041.546

ModLoad: 00000000bdaa0000 0000000000083000 C:\WINDOWS\System32\bcryptPrimitives.dll (6.2.19041.1023) (-exported- Symbols Loaded)
Linked PDB Filename : bcryptprimitives.pdb
File Version : 10.0.19041.1023 (WinBuild.160101.0800)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 10.0.19041.1023



*** Dump of the Process Statistics: ***

- I/O Operations Counters -
Read: 5000, Write: 656, Other 13726

- I/O Transfers Counters -
Read: 14493717, Write: 11103, Other 6808

- Paged Pool Usage -
QuotaPagedPoolUsage: 317096, QuotaPeakPagedPoolUsage: 317376
QuotaNonPagedPoolUsage: 7200, QuotaPeakNonPagedPoolUsage: 7352

- Virtual Memory Usage -
VirtualSize: 83091456, PeakVirtualSize: 895533056

- Pagefile Usage -
PagefileUsage: 83091456, PeakPagefileUsage: 83091456

- Working Set Size -
WorkingSetSize: 109625344, PeakWorkingSetSize: 109629440, PageFaultCount: 27283

*** Dump of thread ID 8252 (state: Initialized): ***

- Information -
Status: Base Priority: Normal, Priority: Normal, , Kernel Time: 0.000000, User Time: 0.000000, Wait Time: 0.000000

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x00007FF64E658698

- Registers -
rax=000000000000003a rbx=00000000c78e41b0 rcx=00000000c82eeac0 rdx=00000000c83cebf8 rsi=000000000000000b rdi=00000000c82eeac0
r8=000000000000003a r9=0000000000000421 r10=000000004e646e80 r11=0000000042145580 r12=000000004aaa0000 r13=000000004215fcc0
r14=0000000042145cc0 r15=000000000048b215 rip=000000004e658698 rsp=00000000421455f8 rbp=0000000000000000
cs=0033 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00010202

- Callstack -
ChildEBP RetAddr Args to Child
421455f0 4af7831c 00000000 4e646d60 4e646e80 421455d8 rosetta_4.20_windows_x86_64!xmlValidateNotationDecl+0x0
42145620 4af3935d c78e41b0 421456c0 4af2b215 00000000 rosetta_4.20_windows_x86_64!xmlParserInputRead+0x0
42145650 4e0a7f10 4ef90150 4215fcc0 00000000 00000001 rosetta_4.20_windows_x86_64!xmlParserInputRead+0x0
42145680 4af239e8 5003a32c 4aaa0000 42145770 bff60e7b rosetta_4.20_windows_x86_64!xmlValidateNotationDecl+0x0
421456f0 bffd217f 00000000 42145c70 42146330 00000000 rosetta_4.20_windows_x86_64!xmlParserInputRead+0x0
42145720 bff81454 00000000 42145c70 42146330 00000000 ntdll!__chkstk+0x0
42145e30 bffd0cae 00000000 00000030 4e71a450 00000008 ntdll!RtlRaiseException+0x0
421465e0 4b1e3e2b fffffffe cbe393c8 ffffffff 4b1f18c5 ntdll!KiUserExceptionDispatcher+0x0
42146630 4b1f3690 4e71a3a0 cbe39120 4e71a3a0 42146729 rosetta_4.20_windows_x86_64!cppdb::session::is_open+0x0
42146760 4b309ee8 cb568798 cbcc5df0 cbe39120 cbcc5df0 rosetta_4.20_windows_x86_64!cppdb::session::is_open+0x0
42147310 4b2a4b6c cc019fe0 bff5b3c7 cd830000 00000000 rosetta_4.20_windows_x86_64!cppdb::backend::statement::cache+0x0
42147510 4b2a488e 421475f8 00000000 421477e0 00000000 rosetta_4.20_windows_x86_64!cppdb::backend::statement::cache+0x0
42147670 4b203da1 421477e8 00000000 c78e3b70 421478b0 rosetta_4.20_windows_x86_64!cppdb::backend::statement::cache+0x0
42147a30 4b209f08 42147d80 42147d80 42147d80 00000000 rosetta_4.20_windows_x86_64!cppdb::backend::statement::cache+0x0
42148080 4b2084db c7dc3d00 421480e0 c7e73880 c7e73880 rosetta_4.20_windows_x86_64!cppdb::backend::statement::cache+0x0
421481e0 4b171fb7 00000000 421482f0 c7e73880 421484f0 rosetta_4.20_windows_x86_64!cppdb::backend::statement::cache+0x0
42148350 4b1757a6 00000005 4af15190 c7ea87e0 c7ea87e0 rosetta_4.20_windows_x86_64!cppdb::session::is_open+0x0
421483c0 4b1756cc 421486c8 42148539 421486c8 c7e73880 rosetta_4.20_windows_x86_64!cppdb::session::is_open+0x0
42148470 4b23b6f5 421486c8 42148a41 00000000 4af375e8 rosetta_4.20_windows_x86_64!cppdb::session::is_open+0x0
42148590 4b23a592 00000005 421486c8 421488a0 00000000 rosetta_4.20_windows_x86_64!cppdb::backend::statement::cache+0x0
42148660 4b23ad06 00000000 00000000 42148f80 cd830000 rosetta_4.20_windows_x86_64!cppdb::backend::statement::cache+0x0
42148800 4b6971a3 421488a0 42148f80 ffffff01 4af23e73 rosetta_4.20_windows_x86_64!cppdb::backend::statement::cache+0x0
42148af0 4b699d09 00000000 00000001 42148c00 42148f80 rosetta_4.20_windows_x86_64!cppdb::backend::statement::cache+0x0
42148e80 4b692f8a 42148ec0 42148f80 cb3b3de0 c7ee9630 rosetta_4.20_windows_x86_64!cppdb::backend::statement::cache+0x0
42148ee0 4b8acc70 42148f80 421496a8 c7ea87e0 00000000 rosetta_4.20_windows_x86_64!cppdb::backend::statement::cache+0x0
42149670 4b8ac6e4 cbf93460 cbee9050 4ff15cc0 4af175a6 rosetta_4.20_windows_x86_64!cppdb::backend::statement::cache+0x0
421496d0 4b8b603e 421497c0 cbf93190 421497e0 42149f30 rosetta_4.20_windows_x86_64!cppdb::backend::statement::cache+0x0
42149e50 4b8b56d4 5ef76948 5ef76a58 4fe87f70 4b8d6cb4 rosetta_4.20_windows_x86_64!cppdb::backend::statement::cache+0x0
42149ee0 4b8b578e 00000005 4214a488 c7ee9630 00000001 rosetta_4.20_windows_x86_64!cppdb::backend::statement::cache+0x0
4214a080 4af2081d c81ab820 c81ab820 c7ee9630 c78e5701 rosetta_4.20_windows_x86_64!cppdb::backend::statement::cache+0x0
4215fcb0 4af2b215 00000000 00000000 4fe4ccf8 00000000 rosetta_4.20_windows_x86_64!xmlParserInputRead+0x0
4215fcf0 bf387034 00000000 00000000 00000000 00000000 rosetta_4.20_windows_x86_64!xmlParserInputRead+0x0
4215fd20 bff82651 00000000 00000000 00000000 00000000 KERNEL32!BaseThreadInitThunk+0x0
4215fda0 00000000 00000000 00000000 00000000 00000000 ntdll!RtlUserThreadStart+0x0

*** Dump of thread ID 32766 (state: Initialized): ***

- Information -
Status: Base Priority: Normal, Priority: Unknown, , Kernel Time: 6.000000, User Time: 0.000000, Wait Time: 3265596160.000000

- Registers -
rax=0000000000000000 rbx=0000000000000000 rcx=0000000000000000 rdx=0000000000000000 rsi=0000000000000000 rdi=0000000000000000
r8=0000000000000000 r9=0000000000000000 r10=0000000000000000 r11=0000000000000000 r12=0000000000000000 r13=0000000000000000
r14=0000000000000000 r15=0000000000000000 rip=0000000000000000 rsp=0000000000000000 rbp=0000000000000000
cs=0000 ss=0000 ds=0000 es=0000 fs=0000 gs=0000 efl=00000000

- Callstack -
ChildEBP RetAddr Args to Child
(-nosymbols- PC == 0)
00000000 00000000 00000000 00000000 00000000 00000000 !+0x0

*** Dump of thread ID 30903250 (state: Unknown): ***

- Information -
Status: Base Priority: Normal, Priority: Unknown, , Kernel Time: 17179869184.000000, User Time: 21474836480.000000, Wait Time: 0.000000

- Registers -
rax=0000000000000000 rbx=0000000000000000 rcx=0000000000000000 rdx=0000000000000000 rsi=0000000000000000 rdi=0000000000000000
r8=0000000000000000 r9=0000000000000000 r10=0000000000000000 r11=0000000000000000 r12=0000000000000000 r13=0000000000000000
r14=0000000000000000 r15=0000000000000000 rip=0000000000000000 rsp=0000000000000000 rbp=0000000000000000
cs=0000 ss=0000 ds=0000 es=0000 fs=0000 gs=0000 efl=00000000

- Callstack -
ChildEBP RetAddr Args to Child
(-nosymbols- PC == 0)
00000000 00000000 00000000 00000000 00000000 00000000 !+0x0


*** Debug Message Dump ****


*** Foreground Window Data ***
Window Name :
Window Class :
Window Process ID: 0
Window Thread ID : 0

Exiting...

</stderr_txt>
]]>
Grant
Darwin NT
ID: 102353
TribbleRED

Joined: 24 Jun 10
Posts: 2
Credit: 28,020,519
RAC: 2,604
Message 102355 - Posted: 7 Aug 2021, 21:39:22 UTC

Seems like a lot of subjects going on in this thread but here we go:

Node Config:
Win10 Pro (10.0.19043)
Gigabyte x570 Aorus Xtreme
5950x
128GB (4x32GB) Trident Royal Z 3600 @ 16-22-22-42
1x Gigabyte RTX 3090 Gaming OC
3x Western Digital Black sn850 in RAID0 (AMD-RAID)
ALL drivers up-to-date
No Gigabyte settings software installed

Problem: All WUs fail within seconds, producing the following log:

<core_client_version>7.16.11</core_client_version>
<![CDATA[
<message>
Incorrect function.
(0x1) - exit code 1 (0x1)</message>
<stderr_txt>
command: projects/boinc.bakerlab.org_rosetta/rosetta_4.20_windows_x86_64.exe @gb10_3CL_3CL_AVLstub_reversed_renumbered_12_000155_extract_A.flags -nstruct 10000 -cpu_run_time 28800 -boinc:max_nstruct 20000 -checkpoint_interval 120 -mute all -database minirosetta_database -in::file::zip minirosetta_database.zip -boinc::watchdog -boinc::cpu_run_timeout 36000 -run::rng mt19937 -constant_seed -jran 3682047
Using database: database_357d5d93529_n_methylminirosetta_database

ERROR: Residue topology file 'D:B_DATAprojectsboinc.bakerlab.org_rosettadatabase_357d5d93529_n_methylminirosetta_databasechemical/residue_type_sets/fa_standard/residue_types/metal_ions/FE.params' does not contain valid ATOM records.
ERROR:: Exit from: ......srccorechemicalresidue_io.cc line: 696
BOINC:: Error reading and gzipping output datafile: default.out
15:08:11 (8808): called boinc_finish(1)

</stderr_txt>
]]>


This is a new node.

Any help would be appreciated.
ID: 102355
Grant (SSSF)

Joined: 28 Mar 20
Posts: 1734
Credit: 18,532,940
RAC: 17,945
Message 102356 - Posted: 7 Aug 2021, 21:56:13 UTC - in response to Message 102355.  
Last modified: 7 Aug 2021, 21:57:06 UTC

This is a new node.

Any help would be appreciated.
That's a completely different error from the one I'm getting. Most of my Tasks are failing, but many of them are still processing OK. All of yours failed, and with a different error from mine.
I would suggest resetting Rosetta. It may have missed a file it needed when you attached and first got work.
In BOINC Manager, select Rosetta, then Reset Project.

That will make it dump all the application & support files & re-download them.
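
If you'd rather confirm the "missing or damaged file" theory before resetting, here is a minimal sketch in Python. It assumes the database archive is the minirosetta_database.zip named in the command line of the log, and that the FE.params file sits somewhere inside it under a path ending in metal_ions/FE.params (the exact prefix inside the zip is a guess, which is why the code only matches on the suffix). The function name count_atom_records is made up for this example.

```python
import zipfile


def count_atom_records(zip_source, member_suffix):
    """Map each archive member ending in member_suffix to its number of ATOM lines.

    zip_source may be a path to the zip file or an open binary file object.
    Raises zipfile.BadZipFile if the archive fails its own CRC check.
    """
    counts = {}
    with zipfile.ZipFile(zip_source) as archive:
        # testzip() returns the name of the first corrupt member, or None.
        bad_member = archive.testzip()
        if bad_member is not None:
            raise zipfile.BadZipFile(f"corrupt member: {bad_member}")
        for name in archive.namelist():
            if name.endswith(member_suffix):
                text = archive.read(name).decode("utf-8", errors="replace")
                # Count lines that begin with the ATOM keyword.
                counts[name] = sum(
                    1 for line in text.splitlines() if line.startswith("ATOM")
                )
    return counts
```

Calling count_atom_records("minirosetta_database.zip", "metal_ions/FE.params") from the project folder should give a count above zero for a healthy file. An empty result, a zero count, or a BadZipFile error would all point towards a bad download, which a reset should clear.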
It could be a few days before there is work readily available here, and hopefully by then most of it won't error out as the present batch is doing.
If you reset the project and are able to pick up some more work, and it does error out, check whether the errors are the same type you are getting now or the type I posted above. If it's the type I posted above, then some Tasks should run OK. If it's still the same as your current errors, then all Tasks will error out again regardless, and I'd suggest waiting until there is plenty of work available that doesn't produce a high percentage of errors before resetting the project (yet again).
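
For telling the two error types apart without reading every log by eye, something like the following rough sketch might help. It only looks for two strings that appear in the logs posted in this thread ("does not contain valid ATOM records" and "*** Dump of thread ID"). The label names and function names are invented for this example, and it assumes you have already copied each failed Task's stderr text into a Python string.

```python
from collections import Counter


def classify_stderr(stderr_text):
    """Return a coarse, made-up label for one Task's stderr output."""
    # The error in the new node's log: a database file without ATOM records.
    if "does not contain valid ATOM records" in stderr_text:
        return "invalid_database_file"
    # The other error type: the application crashed and dumped its threads.
    if "*** Dump of thread ID" in stderr_text:
        return "crash_dump"
    return "other"


def tally_errors(stderr_texts):
    """Count how many stderr outputs fall under each label."""
    return Counter(classify_stderr(text) for text in stderr_texts)
```

If every failed Task comes back as invalid_database_file, it is the same problem as before the reset. A mix of crash_dump failures alongside Tasks that finish normally would match what I'm seeing.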
Grant
Darwin NT
ID: 102356
Gandolph1

Joined: 3 Aug 21
Posts: 7
Credit: 589,132
RAC: 77
Message 102357 - Posted: 7 Aug 2021, 23:06:36 UTC - in response to Message 102356.  

Finally got my machine to start downloading tasks again (Problem at Boincstats) and now everything it's downloading is failing after a short computation time. Is there a way around this issue?
ID: 102357