Posts by koetjesreep

1) Message boards : Number crunching : Rosetta 4.1+ and 4.2+ (Message 93611)
Posted 6 Apr 2020 by koetjesreep
Post:
Joined on my failing 2013 MacBook Air; waiting for work.

GenuineIntel Intel(R) Core(TM) i5-4250U CPU @ 1.30GHz [x86 Family 6 Model 69 Stepping 1] (4 processors)
High Sierra (10.13.6)
2) Message boards : Number crunching : Rosetta 4.1+ and 4.2+ (Message 93158)
Posted 3 Apr 2020 by koetjesreep
Post:
I see CIA's tasks have messages like this

https://boinc.bakerlab.org/rosetta/result.php?resultid=1138828991
<core_client_version>7.14.2</core_client_version>
<![CDATA[
<message>
process exited with code 1 (0x1, -255)</message>
<stderr_txt>
command: rosetta_4.12_x86_64-apple-darwin -run:protocol jd2_scripting -parser:protocol predictor_v11_boinc--fuse--covid_spike_design_boinc_v1.xml @flags_jhr_cv -in:file:silent 1od4vw2h_Junior_HalfRoid_vs_COVID-19_design1_dev.silent -in:file:silent_struct_type binary -silent_gz -mute all -out:file:silent_struct_type binary -out:file:silent default.out -in:file:boinc_wu_zip 1od4vw2h_Junior_HalfRoid_vs_COVID-19_design1_dev.zip @1od4vw2h_Junior_HalfRoid_vs_COVID-19_design1_dev.flags -nstruct 10000 -cpu_run_time 28800 -watchdog -boinc:max_nstruct 600 -checkpoint_interval 120 -database minirosetta_database -in::file::zip minirosetta_database.zip -boinc::watchdog -run::rng mt19937
Starting watchdog...
Watchdog active.
error: zipfile probably corrupt (illegal instruction)

</stderr_txt>
]]>


Just to add to this: after poking around and looking up some other current macOS Rosetta users, I can't find anyone else with a completed 4.12 task, just errors while computing, so I guess it's not just me.
Example: https://boinc.bakerlab.org/rosetta/hosts_user.php?userid=2108642


I'm in the same boat (see a couple of msgs up)[1]. It does seem to run on my other Mac (as in, it hasn't failed almost immediately but has been churning away for 13 hours so far). The sad thing is that on the problematic Mac I seem to be given only 4.12 tasks (overnight I had about 50 of these failures), while on my other Mac I also get Mini 3.78 tasks. Is there a way to refuse 4.12 work? (Rosetta Mini 3.78 tasks run fine on the problematic Mac.)

[1]https://boinc.bakerlab.org/rosetta/forum_thread.php?id=12554&postid=92962
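
As an aside on the stderr quoted earlier in this post: the `command:` line is easier to compare between tasks once it is split into options. Below is a minimal Python sketch of that, not anything Rosetta or BOINC ships. The function name `parse_rosetta_command` and the rule "a token after an option is its value unless it starts with `-` or `@`" are my own assumptions, and the sample line is a shortened copy of the one above.

```python
import shlex

def parse_rosetta_command(line):
    """Split a 'command: ...' stderr line into executable, options, flag files.

    Hypothetical helper: assumes option values never start with '-' or '@'.
    """
    if line.startswith("command:"):
        line = line[len("command:"):]
    tokens = shlex.split(line.strip())
    executable, rest = tokens[0], tokens[1:]
    options = {}      # option name -> value string, or True for bare switches
    flag_files = []   # tokens given as @file
    i = 0
    while i < len(rest):
        tok = rest[i]
        if tok.startswith("@"):
            flag_files.append(tok[1:])
            i += 1
        elif tok.startswith("-"):
            nxt = rest[i + 1] if i + 1 < len(rest) else None
            if nxt is not None and not nxt.startswith(("-", "@")):
                options[tok] = nxt
                i += 2
            else:
                options[tok] = True
                i += 1
        else:
            i += 1  # stray positional token, ignored in this sketch
    return executable, options, flag_files

# Shortened copy of the command line from the stderr above.
sample = ("command: rosetta_4.12_x86_64-apple-darwin -run:protocol jd2_scripting "
          "@flags_jhr_cv -mute all -nstruct 10000 -cpu_run_time 28800 -watchdog "
          "-boinc:max_nstruct 600 -checkpoint_interval 120")
exe, opts, files = parse_rosetta_command(sample)
hours = int(opts["-cpu_run_time"]) / 3600  # target CPU run time in hours
print(exe, hours, files)
```

For the sample line, the `-cpu_run_time 28800` value works out to 8 hours, which makes the 10-20 second failures easy to spot as abnormal.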
3) Message boards : Number crunching : Rosetta 4.1+ and 4.2+ (Message 92962)
Posted 1 Apr 2020 by koetjesreep
Post:
Rosetta v4.12 x86_64-apple-darwin task crashed due to

<core_client_version>7.14.2</core_client_version>
<![CDATA[
<message>
process got signal 11</message>
<stderr_txt>
etta_4.12_x86_64-apple-darwin(74386,0x7fffac05d380) malloc: *** error for object 0x10dbc2000: pointer being freed was not allocated
*** set a breakpoint in malloc_error_break to debug
rosetta_4.12_x86_64-apple-darwin(74386,0x7fffac05d380) malloc: *** error for object 0x10dbc2000: pointer being freed was not allocated
*** set a breakpoint in malloc_error_break to debug
rosetta_4.12_x86_64-apple-darwin(74386,0x7fffac05d380) malloc: *** error for object 0x10dbc2000: pointer being freed was not allocated
*** set a breakpoint in malloc_error_break to debug
rosetta_4.12_x86_64-apple-darwin(74386,0x7fffac05d380) malloc: *** error for object 0x10dbc2000: pointer being freed was not allocated
*** set a breakpoint in malloc_error_break to debug
[snip]

https://boinc.bakerlab.org/rosetta/result.php?resultid=1138137769


Update: all my Rosetta v4.12 tasks fail after 10-20 seconds with similar stderr output. One of the non-corrupted messages points to a corrupt zip file:

<core_client_version>7.14.2</core_client_version>
<![CDATA[
<message>
process exited with code 1 (0x1, -255)</message>
<stderr_txt>
command: rosetta_4.12_x86_64-apple-darwin -run:protocol jd2_scripting -parser:protocol jhr_boinc.xml @flags -in:file:silent 8yo4kt0l_jhr_design1_COVID-19.silent -in:file:silent_struct_type binary -silent_gz -mute all -out:file:silent_struct_type binary -out:file:silent default.out -in:file:boinc_wu_zip 8yo4kt0l_jhr_design1_COVID-19.zip -nstruct 10000 -cpu_run_time 28800 -watchdog -boinc:max_nstruct 600 -checkpoint_interval 120 -database minirosetta_database -in::file::zip minirosetta_database.zip -boinc::watchdog -run::rng mt19937 -constant_seed -jran 2253607
Starting watchdog...
Watchdog active.
error: zipfile probably corrupt (illegal instruction)
rosetta_4.12_x86_64-apple-darwin(97934,0x7fffac05d380) malloc: *** error for object 0x111b64000: pointer being freed was not allocated
*** set a breakpoint in malloc_error_break to debug

</stderr_txt>
]]>

(The Rosetta v4.09 and Rosetta Mini v3.78 tasks continue to run fine.)

Crash reports available on request if any of the developers wants to look
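
In case it helps anyone triaging a pile of failed results, here is a rough Python sketch that tallies the error signatures seen in the stderr above. The pattern strings are copied from my logs; the labels and the function name `tally_signatures` are made up for illustration, and the text is assumed to be pasted in by hand rather than fetched from the project site.

```python
import re
from collections import Counter

# Signatures copied from the stderr excerpts in this post; labels are hypothetical.
SIGNATURES = {
    "invalid_free": re.compile(r"pointer being freed was not allocated"),
    "zip_error": re.compile(r"zipfile probably corrupt"),
    "illegal_instruction": re.compile(r"illegal instruction"),
}

def tally_signatures(stderr_text):
    """Count how many lines of stderr_text match each known signature."""
    counts = Counter()
    for line in stderr_text.splitlines():
        for label, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[label] += 1
    return counts

sample = """\
Starting watchdog...
Watchdog active.
error: zipfile probably corrupt (illegal instruction)
rosetta_4.12_x86_64-apple-darwin(97934,0x7fffac05d380) malloc: *** error for object 0x111b64000: pointer being freed was not allocated
*** set a breakpoint in malloc_error_break to debug
"""
counts = tally_signatures(sample)
print(dict(counts))
```

A line can match more than one signature, which is intentional here: the zip error and the "illegal instruction" note arrive on the same line in my logs.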
4) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 92942)
Posted 1 Apr 2020 by koetjesreep
Post:
Signal 11 is SIGSEGV (segmentation fault). This is typically due to a programming bug. Per stderr, there also appear to be a few invalid or double frees, perhaps related. Anyone know how to report this to the boffins who write the software? Moderators?

https://boinc.bakerlab.org/rosetta/forum_thread.php?id=6893&postid=92847#92847 says this thread is for problem reports, hence I posted it here :-)
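
For anyone who wants to check the signal number themselves, a minimal sketch using Python's standard `signal` module (the helper name `describe_signal` is just for illustration):

```python
import signal

def describe_signal(number):
    """Return the symbolic name for a signal number on this platform."""
    try:
        return signal.Signals(number).name
    except ValueError:
        # Number is not a signal known to this platform.
        return f"unknown signal {number}"

print(describe_signal(11))
```

On macOS and Linux this prints SIGSEGV. Signal numbering is platform-specific, so other numbers may map differently between systems.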
5) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 92931)
Posted 1 Apr 2020 by koetjesreep
Post:
Rosetta v4.12 x86_64-apple-darwin task crashed due to

<core_client_version>7.14.2</core_client_version>
<![CDATA[
<message>
process got signal 11</message>
<stderr_txt>
etta_4.12_x86_64-apple-darwin(74386,0x7fffac05d380) malloc: *** error for object 0x10dbc2000: pointer being freed was not allocated
*** set a breakpoint in malloc_error_break to debug
rosetta_4.12_x86_64-apple-darwin(74386,0x7fffac05d380) malloc: *** error for object 0x10dbc2000: pointer being freed was not allocated
*** set a breakpoint in malloc_error_break to debug
rosetta_4.12_x86_64-apple-darwin(74386,0x7fffac05d380) malloc: *** error for object 0x10dbc2000: pointer being freed was not allocated
*** set a breakpoint in malloc_error_break to debug
rosetta_4.12_x86_64-apple-darwin(74386,0x7fffac05d380) malloc: *** error for object 0x10dbc2000: pointer being freed was not allocated
*** set a breakpoint in malloc_error_break to debug
[snip]

https://boinc.bakerlab.org/rosetta/result.php?resultid=1138137769






©2024 University of Washington
https://www.bakerlab.org