Posts by Dayle

1) Message boards : Number crunching : Beyond newbie Q&A (Message 108946)
Posted 8 Mar 2024 by Dayle
Post:
Your confusion appeared to be in thinking that a WU had a time allocated to it


No, I didn't mean that either.
2) Message boards : Number crunching : Beyond newbie Q&A (Message 108933)
Posted 8 Mar 2024 by Dayle
Post:
I'm sorry, but neither answer directly responds to my question.

Maybe my question wasn't clear enough, but I'm asking about the meaning of TQJ (total queued jobs) given the variable work unit sizes, not what a work unit is or how long the server will take to process work units.
3) Message boards : Number crunching : Beyond newbie Q&A (Message 108925)
Posted 7 Mar 2024 by Dayle
Post:
How do I interpret the server status page?
Today it says "Total queued jobs: 431,868"

For most projects one job is one work unit, but here we have variable work unit sizes creating multiple models apiece.
My 36 hour work units have hundreds of models, and the beta ones have thousands.

The lowest common denominator is a two-hour work unit.
So are the total queued jobs 863,736 hours of work?
Are they 431,868 models of various complexity, completed in minutes to hours apiece?
Are they part of a pool with the in-progress models, totaling 717,885 iterative calculations bouncing between users until they reach an undisclosed limit and retire out of the pool?
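
For anyone following the arithmetic, the readings above can be laid out as a quick sketch. The two-hour figure is only the assumed lowest-common-denominator run time from this post, and the in-progress count is inferred by subtracting the queued jobs from the 717,885 total quoted above. None of this is confirmed by the project.

```python
# Figures quoted from the server status page in this post.
total_queued_jobs = 431_868
pool_total = 717_885  # queued plus in-progress, as quoted above

# Assumption: every queued job is a minimum-size, two-hour work unit.
ASSUMED_HOURS_PER_JOB = 2

# Reading 1: queued jobs expressed as hours of work.
queued_hours = total_queued_jobs * ASSUMED_HOURS_PER_JOB

# Reading 3: if the pool total is queued + in-progress, the in-progress
# share is whatever is left after removing the queued jobs.
in_progress_inferred = pool_total - total_queued_jobs

print(f"Queued work under the two-hour assumption: {queued_hours:,} hours")
print(f"In-progress jobs (inferred by subtraction): {in_progress_inferred:,}")
```

The second reading (431,868 models of varying complexity) has no arithmetic to check; it depends entirely on what the project counts as one job.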
4) Message boards : Cafe Rosetta : Newbie solar questions (Message 108163)
Posted 9 Mar 2023 by Dayle
Post:
For a lot of people, the only land they own is their roof.
Elevating solar panels also protects them from the shade of other objects.

That said, you can decouple the location of your house and the panels.
That's called "community solar," and it often takes the form of co-op ownership or power purchase agreements.
Unfortunately, it's a legal grey area in most places due to the political power of utility monopolies.
Look it up where you live and see if anything's happening in your area.
5) Message boards : Cafe Rosetta : Get Call Girls in Chandannagar 5* Hotels for Most Satisfying Experiences (Message 108157)
Posted 8 Mar 2023 by Dayle
Post:
Go away, spammers.
6) Questions and Answers : Android : What are the minimum, maximum and median RAM requirements per WU in Rosetta@home? (Message 108153)
Posted 6 Mar 2023 by Dayle
Post:
Most BOINC projects are designed to maximize CPU usage, but they run at a lower priority than your own programs, so they effectively use only your spare computational power.
This is working as intended. If you are having any specific problems, there are options within the BOINC client to restrict your contribution.
7) Message boards : Number crunching : Valid WUs not meeting deadline are flagged incorrectly (Message 93978)
Posted 9 Apr 2020 by Dayle
Post:
I've been starting to get validation errors too.

I had zero a few days ago; now I've had four units fail. All were instructed to run for 24 hours, although most were invalidated in half the time.

https://boinc.bakerlab.org/rosetta/result.php?resultid=1143200166
https://boinc.bakerlab.org/rosetta/result.php?resultid=1143200182
https://boinc.bakerlab.org/rosetta/result.php?resultid=1143200238
https://boinc.bakerlab.org/rosetta/result.php?resultid=1142301309
8) Message boards : Number crunching : Running on a 4GB Raspberry Pi 4 - How to? (Message 93653)
Posted 6 Apr 2020 by Dayle
Post:
Then please sign up for World Community Grid on the smaller device.
They've got COVID-19 tasks coming soon for Windows, Mac and Linux and need their queues drained dry of everything else ASAP.
Also their system requirements are usually much smaller.
9) Message boards : Number crunching : 0 new tasks, Rosetta? (Message 93005)
Posted 2 Apr 2020 by Dayle
Post:
World Community Grid has announced they are on-boarding a COVID-19 project that complements Rosetta's work.
This project will be live ASAP.
If you sign up for WCG with a resource share of "0", BOINC will treat it as a backup project and will only fetch tasks when Rosetta has no work available.

I encourage all of you who are not already donating to both projects to begin donating now in anticipation.
Cancer or COVID, an idle CPU isn't saving anybody's life.
And the fewer miscellaneous work units they've got left in their queue when this begins, the more of the system will be used to fight this pandemic.
10) Message boards : Number crunching : COVID 19 WU Errors (Message 92584)
Posted 30 Mar 2020 by Dayle
Post:
Give BOINC Manager some time to get used to the new project resource share. It will balance out when the work cache is refreshed.


I meant the other way round!
When RAM issues started appearing, I put in a “no new tasks” order in hopes of avoiding an overload.
Only after moving over RAM and adjusting resource share to a third did I let Rosetta sync for new tasks. It got three or so.
I went to bed, and when I awoke every single slot was full of Rosetta tasks, with WCG in cache waiting to run.

Oh, and now we have a task that was validated after running for 24 hours, but the output says the task ran into errors comparing the reference and working pose.

Task 1136464489
Name 5kn1wr9d_jhr_design1_COVID-19_SAVE_ALL_OUT_903675_1_0
Workunit 1023518897
Created 29 Mar 2020, 11:56:42 UTC
Sent 29 Mar 2020, 12:11:53 UTC
Report deadline 6 Apr 2020, 12:11:53 UTC
Received 29 Mar 2020, 23:55:06 UTC
Server state Over
Outcome Success
Client state Done
Exit status 0 (0x00000000)
Computer ID 3925665
Run time 11 hours 24 min 31 sec
CPU time 11 hours 13 min 43 sec
Validate state Valid
Credit 120.37
Device peak FLOPS 4.49 GFLOPS
Application version Rosetta v4.07
windows_intelx86
Peak working set size 1,212.31 MB
Peak swap size 1,215.32 MB
Peak disk usage 486.06 MB
Stderr output
<core_client_version>7.14.2</core_client_version>
<![CDATA[
<stderr_txt>
command: projects/boinc.bakerlab.org_rosetta/rosetta_4.07_windows_intelx86.exe -run:protocol jd2_scripting -parser:protocol jhr_boinc.xml @flags -in:file:silent 5kn1wr9d_jhr_design1_COVID-19.silent -in:file:silent_struct_type binary -silent_gz -mute all -out:file:silent_struct_type binary -out:file:silent default.out -in:file:boinc_wu_zip 5kn1wr9d_jhr_design1_COVID-19.zip -nstruct 10000 -cpu_run_time 28800 -watchdog -boinc:max_nstruct 600 -checkpoint_interval 120 -database minirosetta_database -in::file::zip minirosetta_database.zip -boinc::watchdog -run::rng mt19937 -constant_seed -jran 2854141
Starting watchdog...
Watchdog active.
Starting watchdog...
Watchdog active.

ERROR: Assertion `copy_pose.size() == native.size()` failed. MSG:the reference pose must be the same size as the working pose
ERROR:: Exit from: ..\..\..\src\protocols\protein_interface_design\filters\RmsdFilter.cc line: 323
16:41:46 (5404): called boinc_finish(0)

</stderr_txt>
]]>
11) Message boards : Number crunching : COVID 19 WU Errors (Message 92569)
Posted 29 Mar 2020 by Dayle
Post:
Hello Mod.Sence,

Thanks for taking the time to investigate this.

When I posted, I had 16 GB of memory in my system, and BOINC was allowed to use 90% of memory.
I have since cannibalized memory from another system, which is why it's now showing 32 GB.
The memory is mismatched, and even though it's DDR4 it's showing speeds lower than what I thought was possible for that standard (1067 MHz).

I also updated Rosetta to a one third share, with WCG at two thirds.
BOINC doesn't seem to care, and is running only Rosetta on all 32 threads, as if to make up for lost time.
Since adding 16 more gigabytes of memory, I've still lost a task to OOM errors.

I've always left applications in memory while suspended.
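
A rough sketch of the memory budget, in case it helps anyone compare notes. It assumes the 90% memory limit stayed the same after the upgrade, that all 32 threads run a task at once, and that each task peaks at about 1,200 MB (an assumed round figure, in line with the peak working set sizes reported in the task logs on this page). It is a back-of-the-envelope check, not how BOINC actually schedules memory.

```python
def memory_per_task_mb(total_gb: float, boinc_fraction: float, tasks: int) -> float:
    """Average RAM in MB available to each task if all tasks run at once."""
    return total_gb * 1024 * boinc_fraction / tasks


THREADS = 32             # from the post: Rosetta running on all 32 threads
BOINC_FRACTION = 0.90    # from the post: BOINC allowed to use 90% of memory
ASSUMED_PEAK_MB = 1200   # assumption: rough peak working set per task

for total_gb in (16, 32):
    share = memory_per_task_mb(total_gb, BOINC_FRACTION, THREADS)
    verdict = "enough" if share >= ASSUMED_PEAK_MB else "not enough"
    print(f"{total_gb} GB total: {share:.1f} MB per task, "
          f"{verdict} for a {ASSUMED_PEAK_MB} MB peak")
```

Under these assumptions neither configuration covers every task peaking at the same time, which would be consistent with still seeing the occasional OOM error after the upgrade.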
12) Message boards : Number crunching : COVID 19 WU Errors (Message 92493)
Posted 28 Mar 2020 by Dayle
Post:
Just resubscribed to this project on a reliable PC that's been running Rosetta software on World Community Grid (their Microbiome Immunity Project) without issue.
Set WU size to 24 hours and ran a mix of the two feeds, weighted 50-50.
PC has 32 threads and 16 GB of RAM.
When I went to bed last night there were two gigs of free system memory and a 16 GB page file just in case.

Looks like at some point there was a spike in RAM usage (while otherwise idle), and 5 work units errored without credit.
Total loss: two days, four hours of work on a modern system (plus five more hours of WCG tasks).
Maybe nothing over time but quite painful all at once, and not a great trend if it continues.

One of the failures didn't mention RAM, just "finish file present too long".
I'm hypothesizing that this task encountered a problem and got bigger and bigger, crashing the rest?
Output text is below.

It's also possible the crash took place when minirosetta tasks finished and were replaced by full size COVID tasks.

If anybody has any thoughts, they'd be appreciated.

Thanks,

Dayle

Task 1134921561
Name rb_03_27_19542_19448_ab_t000__h002_robetta_IGNORE_THE_REST_11_09_903961_5_0
Workunit 1022160017
Created 27 Mar 2020, 20:53:25 UTC
Sent 27 Mar 2020, 21:14:52 UTC
Report deadline 4 Apr 2020, 21:14:52 UTC
Received 28 Mar 2020, 22:40:43 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 194 (0x000000C2) EXIT_ABORTED_BY_CLIENT
Computer ID 3925665
Run time 21 hours 14 min 43 sec
CPU time 21 hours 14 min 43 sec
Validate state Invalid
Credit 0.00
Device peak FLOPS 4.49 GFLOPS
Application version Rosetta v4.07
windows_x86_64
Peak working set size 1,283.20 MB
Peak swap size 1,491.86 MB
Peak disk usage 492.17 MB
Stderr output
<core_client_version>7.14.2</core_client_version>
<![CDATA[
<stderr_txt>
command: projects/boinc.bakerlab.org_rosetta/rosetta_4.07_windows_x86_64.exe @rb_03_27_19542_19448_ab_t000__h002_robetta_FLAGS -in::file::fasta t000__h002.fasta -psipred_ss2 t000__h002.spider3_ss2 -kill_hairpins t000__h002.nobuformat.spider3_ss2 -abinitio::use_filters true -in:file:boinc_wu_zip rb_03_27_19542_19448_ab_t000__h002_robetta.zip -frag3 rb_03_27_19542_19448_ab_t000__h002_robetta.200.3mers.index.gz -fragA rb_03_27_19542_19448_ab_t000__h002_robetta.200.9mers.index.gz -fragB rb_03_27_19542_19448_ab_t000__h002_robetta.200.11mers.index.gz -nstruct 10000 -cpu_run_time 28800 -watchdog -boinc:max_nstruct 600 -checkpoint_interval 120 -mute all -database minirosetta_database -in::file::zip minirosetta_database.zip -boinc::watchdog -run::rng mt19937 -constant_seed -jran 2413747
Starting watchdog...
Watchdog active.
======================================================
DONE :: 1 starting structures 76648.4 cpu seconds
This process generated 5 decoys from 5 attempts
======================================================
BOINC :: WS_max 1.34554e+09

BOINC :: Watchdog shutting down...
13:39:55 (14096): called boinc_finish(0)

</stderr_txt>
<message>
finish file present too long</message>
]]>
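
For reading long command lines like the one in the stderr output above, here is a small sketch that pulls out the numeric flags and converts the run-time value to hours. The helper name and the regular expression are illustrative and not part of BOINC or Rosetta, and treating `-cpu_run_time` as a value in seconds is an assumption. The sample string is an excerpt of the flags from the log above.

```python
import re
from typing import Dict


def numeric_flags(command: str) -> Dict[str, int]:
    """Return every `-flag <integer>` pair found in a command line.

    Hypothetical helper for illustration; flags without a purely
    numeric argument (e.g. `-watchdog`, `-mute all`) are skipped.
    """
    pairs = re.findall(r"(?<!\S)-([\w:]+)\s+(\d+)(?!\S)", command)
    return {name: int(value) for name, value in pairs}


# Excerpt of the flags from the stderr output above.
command_excerpt = (
    "-nstruct 10000 -cpu_run_time 28800 -watchdog -boinc:max_nstruct 600 "
    "-checkpoint_interval 120 -mute all -jran 2413747"
)

flags = numeric_flags(command_excerpt)

# Assumption: -cpu_run_time is expressed in seconds.
run_time_hours = flags["cpu_run_time"] / 3600
print(f"cpu_run_time: {flags['cpu_run_time']} s = {run_time_hours:g} h")
```

The resulting figure can be compared against the run-time preference set on the account; the sketch itself draws no conclusion about why the two might differ.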
13) Questions and Answers : Getting started : Failed to Add Project - Please try again later (Message 86731)
Posted 25 Jun 2017 by Dayle
Post:
Servers are up, but I can't seem to add Rosetta back to my list of projects.

6/25/2017 7:10:25 AM | | Fetching configuration file from http://boinc.bakerlab.org/rosetta/get_project_config.php
6/25/2017 7:10:37 AM | | Project communication failed: attempting access to reference site
6/25/2017 7:10:41 AM | | Internet access OK - project servers may be temporarily down.





