Posts by marmot

1) Message boards : Number crunching : New kind of app on Ralph (Message 101981)
Posted 1 Jun 2021 by marmot
Post:
And we can't select.
....
Their habit of ignoring their crunchers will be reciprocated very shortly.


I see this complaint often and I share it.

It's not hard to give us the ability to choose app/WU types.
2) Message boards : Number crunching : No android tasks? (Message 90742)
Posted 2 May 2019 by marmot
Post:

There are a number of researchers working on the next android application update related to RNA folding and cyclic-peptide design. Hopefully the app will be updated soon and a new batch of workunits will be available.

If there are no tasks available, there just isn't any work to be done at that moment in time.


Any updates on the new Android app?

I just got a used Moto Z2 Force with 8 cores (only 4 useful out of the box, needs rooting and the powersaver governor installed to use 8) with only some minor screen burn-in. There are gobs of ugly, abused, yet functional Z2 Forces with 8 cores selling for $50 a pop on eBay.
While shopping, I noticed that the newest top sellers score 100% higher in Geekbench 4 than the top sellers from 2 years ago.
Major computational advance.

I have 10 (14 after root and governor change; 2.46 GHz overheats the battery) newer Android cores ready to run Rosetta tomorrow.
3) Message boards : Number crunching : BOINC Android trying to run 4 Rosetta WU when device doesn't have enough RAM. (Message 90741)
Posted 2 May 2019 by marmot
Post:
Can you configure R@h to use only 1 cpu for your device and see if that helps?


Set project_max_concurrent to 2.

Just wanted to point out that it requires a rooted tablet/phone to access the (root)/data/data/edu.berkeley.boinc/client/projects/boinc.bakerlab.org_rosetta folder.

A decent file manager, one that allows creating a new file from scratch and has a built-in text editor that doesn't add any hidden characters, is alc's "File Manager".
Think you probably need to set permissions to rw-r--r-- (644) before use.

Then reboot or force stop the app with Settings > Apps (unless I missed the menu command to reread config files; I looked for it).
Event Log should say Rosetta@home found app_config.
If errors are found in the app_config then you'll get red text after that statement.
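
A minimal sketch of what that file could contain, with a quick well-formedness check to run on a PC before copying it to the device. The Python wrapper and its variable names are only an illustration; `project_max_concurrent` and the value 2 come from the advice above.

```python
import xml.etree.ElementTree as ET

# Candidate contents for app_config.xml in the project folder named above.
# <project_max_concurrent> caps how many Rosetta tasks run at once.
APP_CONFIG = """\
<app_config>
    <project_max_concurrent>2</project_max_concurrent>
</app_config>
"""

# fromstring() raises ParseError on malformed XML (e.g. an unclosed tag),
# which catches typos before BOINC ever reads the file.
root = ET.fromstring(APP_CONFIG)
limit = int(root.findtext("project_max_concurrent"))
print(f"well-formed, project_max_concurrent = {limit}")
```

This only checks that the XML parses; whether BOINC accepts the settings still shows up in the Event Log.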


All that work... no Android WU's :(
4) Message boards : Number crunching : BOINC Android trying to run 4 Rosetta WU when device doesn't have enough RAM. (Message 90499)
Posted 9 Mar 2019 by marmot
Post:

I could have easily bypassed this bug if Rosetta ARM WU came with a manual suspension control.


The suspension (pause icon) control is showing on this final WU.
The only control showing when Rosetta for Android was attempting 4 at once was to abort (stop icon).


Update 2: So when BOINC is running with extremely little RAM, its GUI misreports many things.

It showed the WU's were running (they were not), then it showed they were paused (they were running sporadically according to Kernel Auditor) and then that it was suspended till the battery reached 30% (current battery level 96%).

The lack of pause icon/control menu was just a misreporting aberration.

------
Can you configure R@h to use only 1 cpu for your device and see if that helps?


I'm sure that if I create an app_config.xml with <project_max_concurrent>1</..> it will help; it's just that BOINC, with reporting from the Rosetta WU, should have paused WUs based on available RAM, and that didn't happen.
5) Message boards : Number crunching : BOINC Android trying to run 4 Rosetta WU when device doesn't have enough RAM. (Message 90484)
Posted 6 Mar 2019 by marmot
Post:

I could have easily bypassed this bug if Rosetta ARM WU came with a manual suspension control.


The suspension (pause icon) control is showing on this final WU.
The only control showing when Rosetta for Android was attempting 4 at once was to abort (stop icon).
6) Message boards : Number crunching : BOINC Android trying to run 4 Rosetta WU when device doesn't have enough RAM. (Message 90481)
Posted 5 Mar 2019 by marmot
Post:
Stats on device:

ARMv7 Processor rev 3 (v7l) 4 core
Android 3.10.54+ (Android 5.1.1)
BOINC version 7.4.53
Memory 867.21 MB
Swap space 64 MB
Free Disk Space 1.96 GB

BOINC set to use 80% available RAM (642 MB).

Work units report:
rb_02_26_1337_1484_ab_t000__robetta_cstwt_5.0_IGNORE_THE_REST_07_16_818652_18_0
Work Unit: 955067857
Peak working set size 422.19 MB
Peak swap size 645.55 MB
Peak disk usage 579.42 MB

Stderr:
WARNING: linker: ../../projects/boinc.bakerlab.org_rosetta/rosetta_android_4.10_arm-android-linux-gnu: unused DT entry: type 0x6ffffffe arg 0x26cc
WARNING: linker: ../../projects/boinc.bakerlab.org_rosetta/rosetta_android_4.10_arm-android-linux-gnu: unused DT entry: type 0x6fffffff arg 0x2
(these two lines repeat 256 times)
Too many restarts with no progress. Keep application in memory while preempted.
======================================================
DONE :: 1 starting structures 25076.7 cpu seconds
This process generated 1 decoys from 1 attempts
======================================================
BOINC :: WS_max 0
called boinc_finish(0)

-------------------------
There is another invalid WU like this that hit the 512 error limit and another that completed (because I manually forced one Rosetta into RAM by suspending Rosetta, d/ling WCG Zika, starting them, unsuspending Rosetta, then suspending individual Zika till a single Rosetta was running). Why doesn't Rosetta Android WU have a suspension control?!?!

BOINC attempted to run 4x of these work units, with 360-412MB working sets, at once in a device with 642MB RAM available to BOINC.

Something is wrong here....
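
A rough check of the arithmetic, using only the figures quoted in this post (642 MB for BOINC, working sets of 360-412 MB, 4 tasks started). The variable names are illustrative.

```python
# Figures from the post: RAM available to BOINC and the reported
# working-set range of a single Rosetta Android work unit, in MB.
RAM_FOR_BOINC_MB = 642
WORKING_SET_MB = (360, 412)
TASKS_STARTED = 4

# Total RAM the 4 simultaneous tasks would need (low and high estimate).
demand_mb = tuple(TASKS_STARTED * ws for ws in WORKING_SET_MB)

# How many tasks actually fit in the RAM BOINC may use.
tasks_that_fit = tuple(RAM_FOR_BOINC_MB // ws for ws in WORKING_SET_MB)

print(demand_mb)       # (1440, 1648)
print(tasks_that_fit)  # (1, 1)
```

Even at the low end, 4 tasks want more than twice the RAM BOINC was allowed, and only 1 task fits at a time.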

I could have easily bypassed this bug if Rosetta ARM WU came with a manual suspension control.
I would have suspended 3 of the 4 WUs and let them complete 1 at a time while another project got the other 3 cores.

Is this the fault of BOINC ARM or is Rosetta WU not properly reporting its needed RAM to BOINC so that BOINC can properly manage the WU count? Or is it because BOINC isn't able to properly suspend Rosetta WU's (tied to the lack of manual suspend control)?

This tablet put in 70,000 hours on Universe BH SPin2 over the last 2 years.
I'd love to have given some of that time to Rosetta.
7) Message boards : Number crunching : AMD 1950x Threadripper performance (Message 90480)
Posted 5 Mar 2019 by marmot
Post:
I may have to throttle the thing way back this summer. I hate to air condition my house during most of the day. I think the temp in my office will be a bit too warm for this thing to run flat out.


We're heading to 8 billion humans on earth next year and after eliminating deserts and mountains, that's about 2 arable acres per person and 1 acre should be forest to absorb carbon and aid the rain cycle. (seaweed farming, anyone?) Imagine if all of those humans were to run air conditioning to cool their crypto currency rigs. Humans are already doing a superb job of driving things extinct as it is.

My machines only do science and I don't use a/c. They provide almost all the heat for the house in winter, and only a couple of downclocked laptops run in the summer, plus a non-BOINC phone.

I'm really glad to see someone else thinking about the choice of using a/c on their BOINC rigs.
8) Message boards : Number crunching : For the betterment of BOINC (Message 90474)
Posted 4 Mar 2019 by marmot
Post:
Yeah, the 2 most important changes that could ease management are:
Project Assignments per Core.
and/or
How many tasks at once? With options 0 to max core count. (Yes, zero needs to be an option.)


Example problem I ran into today:
Android Rosetta finishes fine when it's 1 task running alongside 3 WCG Zika tasks on my Lollipop 5.1 tablet.

Even though I've reduced the z-cache to 32MB (elimination eludes me), 4x Rosetta fill the 800MB RAM, so it's constantly swapping and then just gives up doing any work once the screen goes into energy-saving mode.

All I would want is a user dialog asking me "How many tasks running at once?" So I could choose 1 or 2.

I have to pause Rosetta, get WCG to d/l tasks, pause one of those 4 running, unpause Rosetta, where I finally get the situation of 1 Rosetta + 3 Zika running.

No common user should need to EVER touch an app_config.xml (and no config file edits on Android at all) just to get only 1 task running at a time...
9) Message boards : Number crunching : For the betterment of BOINC (Message 90468)
Posted 4 Mar 2019 by marmot
Post:
End users need better control over work loads.

1) WU Black List: Black list of WU that the end user refuses to accept, controlled at the client, not on the server.
For example, I do not want to run any mini-Rosetta on my machines and there is no way to stop them except manually deleting them on 34 clients. <max_concurrent>0</max_concurrent> is not properly interpreted by BOINC.exe. I had a similar situation at LHC@home where, even when you choose no ATLAS WUs on the server side, the project still sent down the ~6GB virtual machine data set. They actually sent down every WU data set and filled my SSD drive. So if we Black List a WU, none of its parts will be accepted by the machine.

2) Project core affinity control: The BOINC workload cache was designed in years when there were few cores per machine. Now we have 32 core desktops available. There needs to be a core management tab in advanced BOINC client control where we assign project affinities to each core or set of X cores (4 preferred). This solves the issue where BOINC doesn't adjust work loads dependent on the <max_concurrent> choices made for work units. The current work cache system assigned to each group of 4 cores, thus having 8 working caches on a 32 core machine, would be a possible easy first incarnation. Setting up 8 BOINC client directories can work, but it's not friendly and increases long term management time, whereas the affinity tab would have an initial time-consuming cost but would require less time in the long haul. Setting up a series of 4 core virtual machines on a 32 core host also works, but it's time-consuming, increases management time by a great deal, and introduces RAM and core cycle inefficiencies as 8 OS's are running.

3) Priority WU List: We need methods for handling limited WU releases during a week or very short daily punctuated WU releases. Accept the fact that some projects have limited work available and users might not want to wait days for the working cache to properly prioritize the projects. Give us a method to prioritize certain work units so that the client receives them when available and these WUs can even preempt other work units to the point that those work units miss their deadlines. On the BOINC main forums, one of the devs admitted that BOINC works best when it assumes an infinite stream of work units and doesn't deal well with outages or short punctuated work releases. High priority WUs need to be checked for every 1 to 5 minutes regardless of whether the same project already has a full work cache or the cache is full of other projects' WUs.

4) Project Priority Ranking: A tool for dealing with aggressive project deadline or work cache abuse: My management time could be reduced by hours a week if I was given the option of ranking projects in order of preemption. Assign a rank to projects and the higher rank projects work units can preempt other projects work units in cases where the lower priority WU has a shorter deadline. This gives users some method of control over the projects (every project thinks theirs is the most important) that have aggressive deadlines in order to manipulate users' work caches to favor their project. Projects are in competition for computation time and aggressive deadlines are one method. Currently it takes setting "switch between tasks every 9999 minutes" and manual WU management.

5) Deal with credit inflation: Credit has been talked about the most in this forum thread. I'll just say that some projects are using credit inflationary techniques as another method of attracting computation base away from other projects. There needs to be a secondary credit normalization formula (or better governance from the BOINC central committee) to prevent an increasing inflationary cycle as each new project has to up the ante. I can tell this is happening because I worry more about magnitude of my crypto coin and my machines almost always end up on projects with low credit returns because the inflated credit projects have lured away my competition.
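
A small sketch of the ordering rule proposed in item 4, where the user's rank outweighs the deadline. Everything in it is hypothetical (the `Task` class, the `RANK` table and the project names); it is not how the BOINC scheduler works today.

```python
from dataclasses import dataclass

@dataclass
class Task:
    project: str
    hours_to_deadline: float

# Hypothetical user ranking: lower number = higher priority.
RANK = {"ProjectA": 1, "ProjectB": 2}

def run_order(tasks):
    """Sort by user rank first; the deadline only breaks ties within a rank."""
    return sorted(tasks, key=lambda t: (RANK[t.project], t.hours_to_deadline))

queue = [Task("ProjectB", 6.0), Task("ProjectA", 72.0), Task("ProjectA", 48.0)]
order = run_order(queue)
print([(t.project, t.hours_to_deadline) for t in order])
```

With this rule the ProjectB task waits behind both ProjectA tasks even though its deadline is only 6 hours away, which is exactly the trade-off the ranking asks for.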
10) Message boards : Number crunching : Link from tasks page does not show private details when signed on to website. (Message 90431)
Posted 26 Feb 2019 by marmot
Post:
If that's all it takes then the HTML fix could be simple.

It was already fixed for me a few hours ago; at least I can't find any pages anymore where I'm not logged in. You may have to reload pages that you opened before they fixed it.


It worked after I posted my message.
Just didn't modify the message to indicate this.

I think I fixed this. If there are other web site related issues please email me directly at dekim at uw dot edu


Thanks for the quick fix.
11) Message boards : Number crunching : Link from tasks page does not show private details when signed on to website. (Message 90424)
Posted 25 Feb 2019 by marmot
Post:

Example, the tasks page brings you here when you click the host ID:
https://boinc.bakerlab.org/show_host_detail.php?hostid=xxxxxxx
change the URL in your browser to:
https://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=xxxxxxx
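
The same URL change as a small helper, in case it needs doing for many links. The function name is made up for illustration; the `/rosetta` prefix and the placeholder host ID are taken from the two URLs above.

```python
from urllib.parse import urlsplit, urlunsplit

def add_rosetta_prefix(url: str) -> str:
    """Insert the /rosetta path prefix unless the URL already has it."""
    parts = urlsplit(url)
    if parts.path.startswith("/rosetta/"):
        return url
    return urlunsplit(parts._replace(path="/rosetta" + parts.path))

broken = "https://boinc.bakerlab.org/show_host_detail.php?hostid=xxxxxxx"
fixed = add_rosetta_prefix(broken)
print(fixed)
```

Running it on an already corrected URL returns that URL unchanged.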


Thanks,

each machine has several days of work now and, if there are any invalids or errors, this'll help me identify the machine.

If that's all it takes then the HTML fix could be simple.
12) Message boards : Number crunching : Problems with web site (Message 90419)
Posted 24 Feb 2019 by marmot
Post:

For example with a greater participation of admins to forums.


Is that why my post about the broken computer summary page is not being responded to?
13) Message boards : Number crunching : Link from tasks page does not show private details when signed on to website. (Message 90405)
Posted 21 Feb 2019 by marmot
Post:
Did an advanced search on 'anonymous' and, surprisingly, didn't find a post on this issue.

My preferences are set not to show my computers to other users.

When I go to the account section and show my computers I see the list with their ID's and can look at tasks from there; so that functions as expected.

The issue is that I want to look at the 'tasks' display screen, see the 'in process' list, then click on the machine ID to figure out which of the 30+ machines have active tasks.

When I click on the ID it takes me to a web page showing an anonymous computer's statistics, with the controls to Sign Up or Login at the upper right hand corner. Clicking 'Login' gives the error that I am already logged in.

So effectively, my own computer is made anonymous to me and since 24 of the machines have identical CPU, OS and RAM it's impossible to tell which one from characteristics.

Not had this issue on 25+ other project sites with similar (or identical) preference settings.
14) Message boards : Number crunching : Want absolutely no mini WU's; how? (Message 90378)
Posted 17 Feb 2019 by marmot
Post:
Have you tried <max_concurrent>0</max_concurrent> for the mini application?


On LHC@home in 2016. BOINC main forums help desk said that the client ignores a 0 setting.


Or app_info.xml (anonymous platform) with only the "big" rosetta application?


I'll look into this.



BTW, you need more RAM and not more cores.


Well normally, but I can't compete for mag as well on Rosetta as I can on other projects so Rosetta is a backup project which will never get all my cores.

I wanted to dedicate my Android tablet's core to the 100k hour goal but it's never gotten a single WU.
Need to look into why.
15) Message boards : Number crunching : Want absolutely no mini WU's; how? (Message 90368)
Posted 16 Feb 2019 by marmot
Post:
there's nothing mini about them, actually they are even a bit larger than the "normal" Rosetta WUs, when it comes to memory use

Then why are they referred to as 'mini'?


Or is there any other reason, why you don't want them?


Only real reason is WUProp, mini's are currently well over 10,000 hours but regular just broke 1000.

I want 100,000 hours on all WU's my machines take on for my epic purple badge.

I noticed these WUs do love to eat RAM and had to limit them to <max_concurrent>1</>, so breaking 100,000 hours is some 3 to 5 years off for Rosetta unless I find money for more cores. Not being able to control mini vs. regular means wasted hours on one.
16) Message boards : Number crunching : Want absolutely no mini WU's; how? (Message 90357)
Posted 15 Feb 2019 by marmot
Post:
I set the run time to 24 hours in an attempt to prevent any Rosetta-mini d/ls, yet 8 more showed up on various machines.

Any server side setting that can accomplish this?

Local app_config.xml can only limit to minimum 1 WU with <max_concurrent>; 0 is not parsed and used. Sometimes a trick to set max_mem intolerably low can cause unwanted WU to die in seconds and get a backoff from the server; but I want the large WU so my bag of tricks is empty... or is it... hmmm.

Maybe I could set the core count for mini's to greater than the client's core count so they never run although they will clog up the work cache.

Something like this for an 8 core client:

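<app_config>
  <app>
    <name>minirosetta</name>
    <max_concurrent>1</max_concurrent>
    <fraction_done_exact/>
  </app>
  <app_version>
    <app_name>minirosetta</app_name>
    <!-- ask for 9 CPUs on an 8 core client; the idea is that the task can never start -->
    <avg_ncpus>9</avg_ncpus>
    <cmdline>-t9</cmdline>
  </app_version>
</app_config>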


Is the -t command switch implemented for Rosetta WU's?
17) Message boards : Number crunching : Rosetta is not playing nice with my other projects. (Message 90356)
Posted 15 Feb 2019 by marmot
Post:
Setting Rosetta's resource share to 0 has made it play nicer with my higher priority projects, which will go days without any work and then hit in bursts that force the backup Rosetta WU's into 'waiting'.
I still have to delete some Rosetta WU's when high priority project WU's come in because of the short Rosetta deadline.






©2023 University of Washington
https://www.bakerlab.org