Message boards : Number crunching : Problems and Technical Issues with Rosetta@home
Author | Message |
---|---|
mrhastyrib Joined: 18 Feb 21 Posts: 90 Credit: 2,541,890 RAC: 0 |
"Edit- i'd run the BOINC Manager benchmarks on the i7- it's showing the default values which are way less than what that system is capable of." Tools/Run CPU Benchmarks completed. Do I need to do anything after that? I'm not a smart man, something that the following anecdote will no doubt confirm. The system in question is operating off of an external SSD in an enclosure (don't ask). That SSD used to be in a different homegrown desktop box, and was the only install of Linux I ever did where I put / and /home in different partitions. Little did I know that, by taking that fateful step four years ago, I lit a fuse that reached the powder only yesterday. I found that BOINC had shut down because of a lack of available disk space. This happened because, first, BOINC apparently writes to things in the / partition and not the /home partition (which seems a bit strange to me, but whatever); and second, something about BOINC activity on that box is causing a firestorm of comments in the /var/log/journal folder, eating up yet more disk space. So that host was down for a time before I discovered this. I don't know if any of the foregoing is affecting whatever you are seeing about that box, but thought that I would mention it. |
mrhastyrib Joined: 18 Feb 21 Posts: 90 Credit: 2,541,890 RAC: 0 |
"wall time" I just wanted you to know that this has now become a treasured part of my vocabulary. ME: blah blah blah in wall time. COLLEAGUE: What's wall time? ME: Oh (chuckles genially) just a term from this protein folding project that I'm involved with online. COLLEAGUE: Do you get paid for that? |
Brian Nixon Joined: 12 Apr 20 Posts: 293 Credit: 8,432,366 RAC: 0 |
"Tools/Run CPU Benchmarks completed. Do I need to do anything after that?" Nothing else to do; the values have been captured. We see them on the details page for that computer. You should see your credits per task on that machine increase significantly now that BOINC has measured the performance. |
mrhastyrib Joined: 18 Feb 21 Posts: 90 Credit: 2,541,890 RAC: 0 |
"You should see your credits per task on that machine increase significantly now that BOINC has measured the performance." Thank thee (that's Quaker). Can you provide me with an Idiot's Guide to All Things BOINC explanation about why the size of the lift in my pickup truck affects the credit I get for completing a task? |
Brian Nixon Joined: 12 Apr 20 Posts: 293 Credit: 8,432,366 RAC: 0 |
The bigger the lift, the more you can lift in one go… Rosetta@home tasks are fixed duration, not fixed work. The faster the machine, the more work each task accomplishes in that time, and thus the more credit awarded. |
mrhastyrib Joined: 18 Feb 21 Posts: 90 Credit: 2,541,890 RAC: 0 |
"fixed duration" Doesn't have quite the same pizzazz as "wall time," but it gets the point across. However, I question the point. A scan down my completed tasks list shows a variety of different times (CPU and wall), so it doesn't look like it is fixed. Also, the progress indicators seem to show % completed, which is independent of the time. If it's fixed duration, wouldn't every task run for 8 hours or whatever and then just end? What am I missing? |
Brian Nixon Joined: 12 Apr 20 Posts: 293 Credit: 8,432,366 RAC: 0 |
Well, all right – tasks aren’t strictly fixed duration: they work to a target run time. They process work in chunks (a.k.a. ‘models’ or ‘decoys’), and only consider whether to stop or continue at the end of each chunk. Those chunks can take different lengths of time to process, which leads to some variation in the total run time of each task. Basically: at the end of each chunk the task decides whether it thinks it has time to complete another one without going over the target run time. If not, it will finish early. If so, it will start another chunk – and if that chunk happens to take longer than average, the task will overrun. With or without CPU benchmarks, there’s a substantial amount of variation in number of models completed (and thus credit granted), even for tasks of the same type within the same batch. The task percentage-complete calculations can be pretty inaccurate, due to the unpredictability of model duration. Your tasks have almost all finished within a few minutes of the 8-hour target. The one outlier is the Robetta task which finished 20 minutes early. That’s common; they seem to have quite different characteristics from the ‘normal’ tasks. |
mrhastyrib Joined: 18 Feb 21 Posts: 90 Credit: 2,541,890 RAC: 0 |
"Basically: at the end of each chunk the task decides whether it thinks it has time to complete another one without going over the target run time." This is really interesting. So what happens in the event that a task ends early and has a chunk left over? Does it get added to a different task? It seems like, at some time or another, you would have a task which is mainly composed of "orphan" chunks. |
Bryn Mawr Joined: 26 Dec 18 Posts: 386 Credit: 11,725,193 RAC: 2,213 |
"Basically: at the end of each chunk the task decides whether it thinks it has time to complete another one without going over the target run time." My understanding is that work units consist of a protein chain in an initial state. Each task within the work unit then takes a random seed value, which determines where and how to start folding that protein in the search for the lowest-energy configuration, so there is effectively a near-infinite number of tasks that can be performed. When the work unit is returned, any promising configurations found can be used as the starting point for another work unit, or a particularly good configuration can be accepted as a working model for the protein. I would be interested in any corrections to this understanding from those that know. |
Brian Nixon Joined: 12 Apr 20 Posts: 293 Credit: 8,432,366 RAC: 0 |
It doesn’t matter. The aim is not to examine every possibility: given the unfathomably large search space, it cannot be. With hundreds of thousands of work units in each batch, it is statistically insignificant whether any individual task completes N or N+1 models. The probability that the ‘orphan’ is the one that will cure all the world’s ills is negligible. What I imagine does happen is that if any regions of the search space look ‘interesting’ they will be studied more closely in a subsequent batch of work. That is one of the reasons the task deadlines are so short: the results of one batch are analysed rapidly to guide the choice of parameters for the next. |
robertmiles Joined: 16 Jun 08 Posts: 1229 Credit: 14,153,394 RAC: 1,152 |
[snip] "My understanding is that Work units consist of a protein chain in an initial state. Each task within the work unit then takes a random seed value which determines where and how to start folding that protein in the search for the lowest energy configuration so there are effectively a near infinite number of tasks that can be performed." Some of them are like that. Some are, instead, one step each from a list of starting points. Some are two proteins, to see if these two will bind together. I'm sure that there are more varieties I don't yet know about. |
Bryn Mawr Joined: 26 Dec 18 Posts: 386 Credit: 11,725,193 RAC: 2,213 |
[snip] Thanks, I’m learning slowly :-) |
mrhastyrib Joined: 18 Feb 21 Posts: 90 Credit: 2,541,890 RAC: 0 |
"given the unfathomably large search space" Okay, scratch the infinite monkey approach then. Where does the excess potential energy go? |
FrankMeade Joined: 8 Mar 20 Posts: 1 Credit: 11,549,660 RAC: 0 |
I am getting a little sick of the "waiting for memory" practice of suspending computation of a task when there is no shortage of available memory. |
Bryn Mawr Joined: 26 Dec 18 Posts: 386 Credit: 11,725,193 RAC: 2,213 |
"I am getting a little sick of the "waiting for memory" practice of suspending computation of a task when there is no shortage of available memory" Is it restricted to certain computer(s) and/or certain project(s)? |
Brian Nixon Joined: 12 Apr 20 Posts: 293 Credit: 8,432,366 RAC: 0 |
Is this happening when BOINC is trying to switch between projects to satisfy resource share settings? Rosetta tasks allocate hundreds of megabytes each; if they’re being kept in memory when something else is trying to run, there may well be a shortage. Try deselecting Computing preferences » Leave non-GPU tasks in memory while suspended. Aside: Running the benchmarks on this 3900X will get it earning the right amount of credit. (There may be other machines that need it too; I didn’t go through them all…) |
mrhastyrib Joined: 18 Feb 21 Posts: 90 Credit: 2,541,890 RAC: 0 |
"Is this happening when BOINC is trying to switch between projects to satisfy resource share settings?" I see it, on multiple machines, and I'm only running Rosetta. It's peculiar. There's clearly enough free main memory to run a project, not to mention the swap. Diddling with the values in preferences doesn't seem to wake it up, either. It just goes away on its own. I've written it off to a cost of doing business, but if there is a fix for it I wouldn't mind at all. |
Jim1348 Joined: 19 Jan 06 Posts: 881 Credit: 52,257,545 RAC: 0 |
"I see it, on multiple machines, and I'm only running Rosetta." Several of your machines have only 8 GB memory. This isn't enough for 12 cores. You need at least 1 GB/core to run Rosetta (I usually have 32 GB on my 12-core machines, and more on the larger ones). Just looking at the "free" memory doesn't do it. Rosetta (like all other BOINC projects) needs to reserve a certain amount to run. They won't leave home without it. |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1619 Credit: 16,561,575 RAC: 4,537 |
mrhastyrib: "I see it, on multiple machines, and I'm only running Rosetta." Jim1348: "Several of your machines have only 8 GB memory. This isn't enough for 12 cores." And 32GB for 24 cores isn't enough either if you want to use all of them to do Rosetta work. I generally allow 1.3GB of RAM per Task: you need to leave enough for the Operating System and any other programmes that might be running as well. (With huge core/thread count systems (32+), you'd probably be OK with 800MB or even less per core, as you don't often get many high RAM requirement Tasks (2GB+) at any given time.) mrhastyrib: "I've written it off to a cost of doing business, but if there is a fix for it I wouldn't mind at all." Just adding more RAM won't necessarily fix it; you need to allow BOINC to actually use what is there. In your Account settings, under Computing preferences, Memory: "When computer is in use, use at most 95 %"; "When computer is not in use, use at most 95 %"; "Leave non-GPU tasks in memory while suspended". Make sure "Leave non-GPU.." is not selected. Works for me. Grant Darwin NT |
Sid Celery Joined: 11 Feb 08 Posts: 2065 Credit: 40,502,596 RAC: 4,041 |
FWIW: Scheduler down, "Project down for maintenance". Just got a 1hr delay after an update attempt. |
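
Brian Nixon's description earlier in the thread (a task checks at the end of each model whether it has time for another one before the target run time) can be sketched as a small simulation. This is only an illustration, not Rosetta's actual code: the function name `run_task`, the use of the average duration of completed models as the estimate for the next one, and the example durations are all assumptions made for the sketch.

```python
from dataclasses import dataclass
from itertools import repeat
from typing import Iterable


@dataclass
class TaskResult:
    models_completed: int
    run_time_s: float


def run_task(model_durations_s: Iterable[float], target_s: float) -> TaskResult:
    """Simulate one task that works toward a target run time.

    model_durations_s yields the (unknown in advance) duration of each model.
    Assumption: the task predicts the next model's duration as the average
    of the models it has already finished; the real rule may differ.
    """
    elapsed = 0.0
    completed = 0
    for duration in model_durations_s:
        if completed > 0:
            expected_next = elapsed / completed
            # Decision point: only made at the end of a model, never mid-model.
            if elapsed + expected_next > target_s:
                break  # not enough time left, so the task finishes early
        elapsed += duration  # the model runs to completion, however long it takes
        completed += 1
    return TaskResult(completed, elapsed)


TARGET_S = 8 * 3600  # the 8-hour target mentioned in the thread

# A faster machine finishes each model sooner and so completes more models
# (and earns more credit) within the same target run time.
fast = run_task(repeat(3000.0), TARGET_S)
slow = run_task(repeat(6000.0), TARGET_S)

# A final model that takes longer than average makes the task overrun.
overrun = run_task([7000.0, 7000.0, 7000.0, 12000.0], TARGET_S)
```

Under these assumptions, both the fast and the slow machine stop a little short of the target, while the last example runs past it because the final model was started on an optimistic estimate. That is the same variation in run time and model count the thread describes.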
©2024 University of Washington
https://www.bakerlab.org