Message boards : Number crunching : Rosetta running on ARM platforms
Author | Message |
---|---|
lakotamm Send message Joined: 28 Jun 19 Posts: 22 Credit: 171,192 RAC: 0 |
Reading this thread I can see that my RPI3 will not be useful... Fair enough, I can have one device doing Einstein. If you find a way to use it, I will be happy to set it up. |
p3d-cluster Send message Joined: 24 Oct 07 Posts: 3 Credit: 30,401,989 RAC: 0 |
Let's start a conversation about ARM. Yes! I have several Odroid XU4, C2 and 4 x N2 crunching for BOINC. I attached the N2s with 4GB RAM to Rosetta. I had only purchased 8GB eMMC for these, which had never been an issue. Now, with Rosetta consuming >500MB for the project directory as well as 862MB per slot, this wouldn't work for even one WU with the little space that was free. I put the BOINC data directory on an additional SD card, a USB thumb drive as well as an NFS share served from a NAS. Each works well with 3-4 WUs in parallel so far. Sometimes they hit the max. memory allowance (95% of 4GB), then only 3 WUs run. The rest of the cores are crunching the much smaller Tn-Grid WUs, so there are no lost CPU cycles ;-) I'll also try and use an Odroid C2 with 2GB, maybe it can run 1-2 WUs, let's see... edit: That's not going to work on the Odroid C2: Sat 04 Apr 2020 23:59:02 CEST | Rosetta@home | Message from server: Rosetta for Portable Devices needs 1907.35 MB RAM but only 1554.63 MB is available for use. |
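As a rough sketch of the storage arithmetic in this post, the shell function below estimates the BOINC data footprint for a given number of concurrent Rosetta WUs. It uses the two figures quoted above (about 500MB for the project directory, 862MB per slot) as fixed constants, which is an assumption; real sizes vary by task. The function name `rosetta_disk_mb` is made up for illustration.

```shell
# Estimate BOINC disk usage in MB for N concurrent Rosetta tasks.
# Both constants come from the post above and are approximate.
PROJECT_DIR_MB=500   # ">500MB for the project directory"
SLOT_MB=862          # "862MB per slot"

rosetta_disk_mb() {
    # $1 = number of work units running in parallel
    echo $(( PROJECT_DIR_MB + SLOT_MB * $1 ))
}

rosetta_disk_mb 1   # one WU
rosetta_disk_mb 4   # four WUs in parallel
```

With four WUs the estimate is close to 4GB, which would be consistent with an 8GB eMMC that also holds the OS running short.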
bkil Send message Joined: 11 Jan 20 Posts: 97 Credit: 4,433,288 RAC: 0 |
You may consider using a file system with transparent compression like btrfs (not sure if ZFS works in such a tight case). |
p3d-cluster Send message Joined: 24 Oct 07 Posts: 3 Credit: 30,401,989 RAC: 0 |
These Odroids have only a single / file system created during install, that automatically enlarges itself to maximum size of the underlying device during first boot. Hard to squeeze in extra file systems :-) I'm OK with the current setup, maybe I'll move all N2 BOINC data directories to the NAS NFS. Being lazy, I'm most likely only doing that once the SD or thumb drives fail :-D |
MarkJ Send message Joined: 28 Mar 20 Posts: 72 Credit: 25,238,680 RAC: 0 |
Does this include Raspberry Pis? If so, do you know which models/OS? Pi4B, preferably the 4GB version. You’ll have to put it into 64 bit mode (aarch64). You can use Raspbian or Ubuntu. Rosetta tasks can use up to 1.5GB of memory when running so you need to limit how many run at a time. See MarksRpiCluster for how to get a Pi4 with Raspbian going with Rosetta. BOINC blog |
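One way to act on the "limit how many run at a time" advice is sketched below. The first helper divides usable RAM by an assumed per-task peak (1.5GB, as stated above); the second prints an `app_config.xml` using BOINC's `project_max_concurrent` element. Both function names are hypothetical, the 95% usable share is borrowed from an earlier post in this thread, and the project directory shown in the comment is an assumption about where the file would go.

```shell
# How many Rosetta tasks fit in RAM, assuming a fixed per-task peak.
# These helpers are illustrative and not part of BOINC.
max_rosetta_tasks() {
    ram_mb=$1        # installed RAM in MB, e.g. 4096 for a 4GB Pi 4
    usable_pct=$2    # share of RAM BOINC may use, in percent (assumed)
    per_task_mb=$3   # assumed peak memory per task in MB
    echo $(( ram_mb * usable_pct / 100 / per_task_mb ))
}

# Print an app_config.xml that caps concurrent tasks for one project.
# It would go in the project directory, e.g. projects/boinc.bakerlab.org_rosetta/
print_app_config() {
    cat <<EOF
<app_config>
  <project_max_concurrent>$1</project_max_concurrent>
</app_config>
EOF
}

n=$(max_rosetta_tasks 4096 95 1536)   # 4GB Pi 4, 1.5GB peak per task
print_app_config "$n"
```

Integer division rounds down on purpose here, so the estimate errs on the side of running fewer tasks.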
bkil Send message Joined: 11 Jan 20 Posts: 97 Credit: 4,433,288 RAC: 0 |
Not sure what initramfs it has, but if it supports that file system you can convert the whole root from your PC. |
CallMeFoxie Send message Joined: 22 Mar 20 Posts: 8 Credit: 152,280 RAC: 0 |
Too bad it requires 1900MB -ish, as I have a cluster of 8x Pine64+ (quadcore Cortex, aarch64, 2GB RAM) but due to how the memory is laid out I have about 1850MB after booting up available :( cannot crunch even 1 - 2 tasks. And adding a small swap gets ignored unsurprisingly. |
PorkyPies Send message Joined: 6 Apr 20 Posts: 45 Credit: 1,650,779 RAC: 0 |
Too bad it requires 1900MB -ish, as I have a cluster of 8x Pine64+ (quadcore Cortex, aarch64, 2GB RAM) but due to how the memory is laid out I have about 1850MB after booting up available :( cannot crunch even 1 - 2 tasks. And adding a small swap gets ignored unsurprisingly. It doesn't need 1.9GB. The highest I've seen my Pi4's using has been 918MB. We have some people using the Pi4 2GB model so it shouldn't be an issue. You'll only be able to run one at a time but given you have 8 Pine64's that is 8 tasks you could be running. MarksRpiCluster |
bkil Send message Joined: 11 Jan 20 Posts: 97 Credit: 4,433,288 RAC: 0 |
Not sure whether you could increase your available memory a bit by reducing video card allocation (not familiar with that platform). I myself face the very same problem on a phone and a netbook. I've patched out the check from the boinc source code and now it works correctly... zram with deflate compression does wonders along with a little swap, but they don't use more than 0.5-1GB anyway. They check whether you have 2000000000 bytes available, not sure how much thought went into determining this limit. |
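For readers who have not set up zram before, here is a minimal sketch of the usual sysfs sequence for a compressed swap device. The function only prints the commands so they can be reviewed before being run as root. The device name `zram0`, the 1G default size and the swap priority are assumptions, and whether `deflate` is available depends on how the kernel was built.

```shell
# Print (not run) the commands for a compressed zram swap device.
# Review the output, then run it as root if it suits your system.
zram_swap_commands() {
    size=${1:-1G}          # uncompressed capacity of the zram device
    algo=${2:-deflate}     # compressor named in the post above
    cat <<EOF
modprobe zram
echo $algo > /sys/block/zram0/comp_algorithm
echo $size > /sys/block/zram0/disksize
mkswap /dev/zram0
swapon -p 100 /dev/zram0
EOF
}

zram_swap_commands 1G deflate
```

The compression algorithm is written before the disk size because the kernel does not allow changing it once the device has been initialised.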
CallMeFoxie Send message Joined: 22 Mar 20 Posts: 8 Credit: 152,280 RAC: 0 |
Too bad it requires 1900MB -ish, as I have a cluster of 8x Pine64+ (quadcore Cortex, aarch64, 2GB RAM) but due to how the memory is laid out I have about 1850MB after booting up available :( cannot crunch even 1 - 2 tasks. And adding a small swap gets ignored unsurprisingly. yup the problem is that when it tries to download a task it checks for available memory. I might patch it out as others did. No idea how to change GPU memory on this platform either, tried googling with no luck tbh. And on top of GPU memory there's another 165MB reserved by kernel for some peripherals and I have no idea which ones, I removed most of the drivers I didn't even need and nothing in the DTBs. :( time to go into the source of BOINC! |
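The size of the gap can be worked out from numbers already in this thread: the 2000000000-byte check mentioned earlier and the roughly 1850MB available on the Pine64+. The sketch below assumes 1 MB means 1048576 bytes, which is the reading under which the byte figure lines up with the "1907.35 MB" in the server message quoted earlier. The function names are made up for illustration.

```shell
# Convert a byte count to MB, taking 1 MB as 1048576 bytes (assumption).
bytes_to_mb() {
    awk -v b="$1" 'BEGIN { printf "%.2f\n", b / 1048576 }'
}

# Shortfall in MB between the required bytes and what a host has available.
shortfall_mb() {
    # $1 = required bytes, $2 = available MB
    awk -v b="$1" -v have="$2" 'BEGIN { printf "%.2f\n", b / 1048576 - have }'
}

bytes_to_mb 2000000000         # the threshold expressed in MB
shortfall_mb 2000000000 1850   # how far a host with 1850MB falls short
```

On these figures the board misses the threshold by under 60MB, which is why freeing a little reserved memory would already be enough to pass the check.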
bkil Send message Joined: 11 Jan 20 Posts: 97 Credit: 4,433,288 RAC: 0 |
Watch out: although I mostly see an RSS between 300-600MB, sometimes over a GB, I've now finished one that had 1.85GB RSS! Probably that wasn't all active set, but it is still something to keep in mind when sizing your RAM + zram + swap. To patch it, basically this is what you need:

    cat boinc-7.9.3+dfsg/debian/patches/allow_memory_overcommit
    Description: <short summary of the patch>
     TODO: Put a short summary on the line above and replace this paragraph
     with a longer explanation of this change. Complete the meta-information
     with other relevant fields (see below for details). To make it easier, the
     information below has been extracted from the changelog. Adjust it or drop
     it.
     .
     boinc (7.9.3+dfsg-5) unstable; urgency=medium
     .
       * Upload to unstable
       * Drop update-boinc-applinks, useless now?
         - lets re-evaluate the situation in case somebody complains (LP: #1765576)
    Author: Gianfranco Costamagna <locutusofborg@debian.org>
    Bug-Ubuntu: https://bugs.launchpad.net/bugs/1765576

    ---
    The information above should follow the Patch Tagging Guidelines, please
    checkout http://dep.debian.net/deps/dep3/ to learn about the format. Here
    are templates for supplementary fields that you might want to add:

    Origin: <vendor|upstream|other>, <url of original patch>
    Bug: <url in upstream bugtracker>
    Bug-Debian: https://bugs.debian.org/<bugnumber>
    Bug-Ubuntu: https://launchpad.net/bugs/<bugnumber>
    Forwarded: <no|not-needed|url proving that it has been forwarded>
    Reviewed-By: <name and email of someone who approved the patch>
    Last-Update: 2020-04-10

    --- boinc-7.9.3+dfsg.orig/client/hostinfo_unix.cpp
    +++ boinc-7.9.3+dfsg/client/hostinfo_unix.cpp
    @@ -445,6 +445,7 @@ static void parse_meminfo_linux(HOST_INF
             }
         }
         fclose(f);
    +    host.m_nbytes = 3000000000; // hack for Rosetta@home
     }
     
     // Unfortunately the format of /proc/cpuinfo is not standardized.

Here are some notes you may find helpful:

    sudo vi /etc/apt/sources.list # deb-src ... universe
    sudo apt update
    sudo apt install -fy fakeroot
    sudo apt build-dep -fy boinc-client
    apt-get -b source boinc # maybe we can drop the -b here
    cd boinc-7.9.3+dfsg
    vi client/hostinfo_unix.cpp
    dpkg-buildpackage -us -uc
    sudo apt remove boinc-client boinc-manager libboinc7 boinc
    sudo dpkg -i ../*.deb # or whichever you need

A less intrusive solution would be to enable KSM and zram on the machine and create a qemu VM with an appropriate amount of memory, like 3GB. Of course this is only meaningful if the CPU has virtualization, but otherwise it would be very advantageous because this may potentially also provide deduplication between slots! A third workaround, with no need for recompilation, would be to do a chroot and mock out /proc/meminfo there. |
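The third workaround (mocking /proc/meminfo inside a chroot) could start from something like the sketch below. It assumes the client derives its RAM figure from the `MemTotal` line, as the name `parse_meminfo_linux` in the patch suggests; that has not been checked against the full source here. The function name `fake_meminfo` and the sample input values are made up, and how the resulting file is exposed inside the chroot (for example with a bind mount) is left out.

```shell
# Rewrite the MemTotal line of a meminfo-style stream (values are in kB).
# Reads stdin, writes stdout; it does not touch the real /proc/meminfo.
fake_meminfo() {
    # $1 = MemTotal to report, in kB
    sed "s/^MemTotal:.*/MemTotal:       $1 kB/"
}

# 3000000000 bytes, the value used in the patch, is about 2929687 kB.
# The two input lines are sample data, not taken from a real board.
printf 'MemTotal:        1894000 kB\nMemFree:          512000 kB\n' | fake_meminfo 2929687
```

On a real system the input would be `/proc/meminfo` itself, redirected into the function and saved to a file.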
CallMeFoxie Send message Joined: 22 Mar 20 Posts: 8 Credit: 152,280 RAC: 0 |
Thanks! Easier than me going through the sources :) meanwhile the current tasks got crunched and new ones (rb_04_11_21299_20821) appeared, which require 400 - 600MB RAM. No need to patch out boinc client for those so for now I am fine :) |
PorkyPies Send message Joined: 6 Apr 20 Posts: 45 Credit: 1,650,779 RAC: 0 |
New 4.15 Rosetta app (for aarch64). After the download 3 started at once, the active LED on the Pi stayed on for a good 5 minutes indicating a huge amount of SD card activity. Memory is currently up to 736MB (each) but will probably go higher as they progress. MarksRpiCluster |
PorkyPies Send message Joined: 6 Apr 20 Posts: 45 Credit: 1,650,779 RAC: 0 |
New 4.15 Rosetta app (for aarch64). After the download 3 started at once, the active LED on the Pi stayed on for a good 5 minutes indicating a huge amount of SD card activity. Memory is currently up to 736MB (each) but will probably go higher as they progress. It’s the data and other files that it has to unzip into the slot directory. The database I looked at contained approx 1GB of (uncompressed) files and the other was a bit smaller. That means when each task starts it will unzip approx 1.5GB into the slot directory. That isn’t going to do much good for an SD-card-based system like the Pi. I think Rosetta 4.15 shouldn’t be sent to Portable Devices. MarksRpiCluster |
CallMeFoxie Send message Joined: 22 Mar 20 Posts: 8 Credit: 152,280 RAC: 0 |
Today after the 4.15 update I got a sudden error on all the tasks :(

    <stderr_out>
    <![CDATA[
    <message>
    process exited with code 1 (0x1, -255)</message>
    <stderr_txt>
    command: ../../projects/boinc.bakerlab.org_rosetta/rosetta_4.15_aarch64-unknown-linux-gnu -silent_gz -mute all -s chainA_chainB_20_04_15_28_12.pdb -run:protocol jd2_scripting -jd2:dd_parser -parser:protocol local_docking_20_04_15_28_12.xml -out:nstruct 10000 -jd2:ntrials 100 -ex1 -ex2aro -beta -use_input_sc -in:file:native chainA_chainB_20_04_15_28_12.pdb -out:file:silent default.out -out:file:silent_struct_type protein -run:write_failures false -nstruct 10000 -cpu_run_time 28800 -watchdog -boinc:max_nstruct 600 -checkpoint_interval 120 -database minirosetta_database -in::file::zip minirosetta_database.zip -boinc::watchdog -run::rng mt19937 -constant_seed -jran 3988117
    ERROR: Cannot determine file type. Current supported types are: PDB, CIF, SRLZ, MMTF
    ERROR:: Exit from: src/core/import_pose/import_pose.cc line: 380
    BOINC:: Error reading and gzipping output datafile: default.out
    14:04:47 (453): called boinc_finish(1)
    </stderr_txt>
    ]]>
    </stderr_out>

I didn't have zip installed, I wonder if that was missing? Would be nice if there was some requirements page (or did I overlook it?) |
CallMeFoxie Send message Joined: 22 Mar 20 Posts: 8 Credit: 152,280 RAC: 0 |
Today after the 4.15 update I got a sudden error on all the tasks :( It seems to have happened only with those tasks; robetta is still calculating fine. |
dominik282 Send message Joined: 25 Jun 17 Posts: 1 Credit: 382,882 RAC: 579 |
Hello everyone, Does anyone know the following phenomenon and maybe a possible solution? I have an Odroid XU4 running Android with WCG as its project. When I add Rosetta@home as a second project, WUs are downloaded just fine. But the moment the XU4 starts calculating a Rosetta WU, the calculation is stopped with a request to connect the XU4 to a power source. (Of course, this could not be displayed without power. ;-) ). Rosetta cannot be removed afterwards. The phenomenon can also be reproduced with a newly formatted and freshly set up SD card. If only WCG is running, the XU4 will calculate for weeks or months just fine. Does anyone have an idea? Thank you very much ... |
lakotamm Send message Joined: 28 Jun 19 Posts: 22 Credit: 171,192 RAC: 0 |
It's interesting. I have one RPi3B+ with 1GB of RAM and it crunched a lot of WUs, but meanwhile R@H increased the minimum RAM requirement from 850MB to ~1700MB and now it hasn't got any WUs. My RPI 3B+ is still happily crunching. I often get the message that 1.7GB is required, but in the end I always get WUs. It has already happened that new WUs did not come for 24h, though, so I keep a 1-day buffer. I also have an OdroidC2 with 2GB of RAM and it is crunching fine with 2 WUs in parallel at around 50°C. Is this due to RAM limitations? BTW my RPI4B+ (4GB of RAM) is crunching only one WU at a time because of the stock heatsink, and it runs at around 55-60°C. Could it make more sense to downclock it and run more WUs in parallel? CPU cores usually become more efficient at lower frequencies. |
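Since the decision here hinges on temperatures, a small sketch for checking them from the shell may help. On many Linux single-board computers the SoC temperature is exposed in millidegrees Celsius under `/sys/class/thermal`, but the exact path is an assumption and differs between boards. The 70°C limit and both function names are placeholders for illustration, not recommended values.

```shell
# Convert a raw thermal-zone reading (millidegrees C) to whole degrees.
millideg_to_c() {
    echo $(( $1 / 1000 ))
}

# Succeed if a reading is above a chosen limit.
too_hot() {
    # $1 = raw reading in millidegrees, $2 = limit in degrees C
    [ "$(millideg_to_c "$1")" -gt "$2" ]
}

# On a real board the raw value might be read like this (path is an assumption):
#   raw=$(cat /sys/class/thermal/thermal_zone0/temp)
raw=58000   # sample reading, roughly the 55-60°C range mentioned above
if too_hot "$raw" 70; then
    echo "above limit"
else
    echo "ok at $(millideg_to_c "$raw") C"
fi
```

Logging this value while changing the number of parallel WUs or the clock speed would give a direct comparison instead of a guess.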
Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
There are preferences about whether you allow computations while on battery power. Do you have the same preference settings on each BOINC project? Local settings override those from the projects too. Sometimes that is the simplest way to ensure you know which settings are taking precedence. Rosetta Moderator: Mod.Sense |
sgaboinc Send message Joined: 2 Apr 14 Posts: 282 Credit: 208,966 RAC: 0 |
there is something about the android phones running on arm cores: the high end ones often have pretty decent ram and some have as many as 8 superscalar cores. specs for an 'old' Samsung Galaxy S7: https://www.gsmarena.com/samsung_galaxy_s7-7821.php the thing though is that we'd need to run a 'flavor' of android os that doesn't interfere and throttle the tasks. it makes a lot of sense to throttle that on a regular mobile phone and in regular use, but running boinc is anything but 'regular', i.e. the phone would need to run off a power supply and it may even need a heat sink + fan to dissipate all that heat. for now the 4GB Pi4 is probably an ideal platform to realize that kind of performance. the a72 superscalar arm cores really made a difference, but they run hot, so a sink + fan is needed. more of that over in the Pi4 thread: https://boinc.bakerlab.org/rosetta/forum_thread.php?id=13732 down clock would sacrifice performance and may result in longer time per model and fewer models returned, and in turn fewer credits per task. overclock is the most interesting thing happening now, but you'd need a good sink fan combo to do all that. i think overclock provides the returns in credits per watt-hour, at the expense of more sink-fan hardware. |
©2024 University of Washington
https://www.bakerlab.org