Message boards : Number crunching : Problems and Technical Issues with Rosetta@home
Author | Message |
---|---|
Mr P Hucker Send message Joined: 12 Aug 06 Posts: 1600 Credit: 12,116,986 RAC: 9,863 |
If so, limiting BOINC to using only two of the cores should make each of those cores have 2 GB available. Shouldn't be necessary. My 24 core machines used to have 12GB (now they have 36GB). It would just run fewer tasks at once, with some queued waiting for memory. |
Mr P Hucker Send message Joined: 12 Aug 06 Posts: 1600 Credit: 12,116,986 RAC: 9,863 |
You might hope so, but that's not what actually happens. Many programs load memory in sections, keeping pointers to the start of those sections. If the sections are not reloaded to the same memory address where they were before, the program is likely to crash as soon as it tries to use something in an out of place section. I don't follow that - what you seem to be saying is Windows paging programs should often break them. I basically allow 1.3GB RAM per Task. Some use a lot more, most use a lot less, so overall as long as you allow 1.3GB per Task, set the number of cores/threads you use to match the available RAM in the system and you won't run in to issues. Surely you need to allow some room for the OS? Yes if your mean 6 hours + 12 hours of run time. No if you mean 6 hours + 12 hours of clock time. I thought it was run time. It does say something like "6 hours of work". I still have the switch-between interval set at 1,000 minutes, so I hope that does not cause my OpenPandemic tasks to expire. We'll see. It should automatically override your 1000 minute setting if something has a short due date. I've seen mine do it with LHC tasks (because they can unexpectedly suddenly have 10 days work to complete instead of 3 hours). Looks like you're counting only the power used by the CPU chip, but not the power used by the rest of the computer. The computer's power supply must be able to handle the surges of power used to start up various sections of the computer, plus the power it uses itself, with enough of a margin that it does not run into the inefficiency of being too close to its limits. That wouldn't account for 95W becoming 400W. No, it's my graphics cards that eat all the juice. Like you, I did not know any of this when I started. Where did I learn it? Right here in these forums. Yes, an amount of cruft has built up over the years – old, long threads with information that may or may not still be relevant. A lot of what’s written is unclear at best. 
And input from insiders who can tell us how it actually works is infrequent. But amongst it all are some useful nuggets – and passing those on is the least I can do. Unfortunately the search engine in these forums doesn't seem to be very good. I can never find information on something and have to make a new post. I assumed Peter was talking about measured consumption, not a number printed on the label of his power supply. Correct, one of those energy meters for the entire house. Plus the 13A plug gets warm! I think it does. He is only allowing 50% of the memory to be used by BOINC, so only 4 GB, or 1 GB/core. That should then run two or the units instead. I've seen it happen on mine, some tasks clearly state "waiting for memory" and change colour. It seems that I can have either the BOINC Manager or the World Community Grid, but not both. Does WCG insist on you using their manager? I only have it running on my phones, which let me add it within Boinc. Best just to add it within Boinc as another project, it won't be able to remove Boinc that way. In fact it shouldn't have been able to do it anyway. Uninstalling somebody else's program is extremely nasty and I'm sure Windows would have informed you some suspicious virus like activity was going on. I missed this earlier, sorry. I was assuming that if you had the virus, even if it had got bad enough to make you bed-ridden, that taking this protein would destroy all the viruses in your body, hence you would recover pretty quickly once your body had repaired the damage. Or is it the case that once the virus has got into your cells, it's hidden from this protein and can no longer be neutralised? Perhaps with your adjusted cells still creating more and more virus? Kinda like killing all the cat fleas in your house but not the eggs - you need to keep on killing the adult fleas until the eggs have run out. yeah. 
It's identical to BOINC, but the WCG team does thorough testing before updating their version of the BOINC software. So in theory, less likely to run into bugs. Ah so perhaps if you install WCG Boinc, it "updates" your normal Boinc to its version, and in this case failed to copy over one of the projects - Rosetta. PS- Anyone have an idea why my signature stats don't show WCG? Is there maybe a lag between adding the project and the stat-box-generator capturing a new user? I added WCG yesterday or the day before from within BOINC mgr, and started using the signature yesterday. Log into Boincstats, then go to BAM, then signature on the left column. Check setting in there like "Do not show projects with a BS-RAC lower than" Also, can you see WCG in your stats on the main Boincstats page similar to my https://www.boincstats.com/stats/-1/user/detail/6470/projectList but change to your user number? If not, you may need to authorise exporting data in the WCG settings (daft EU legislation). Nope, you can choose the running time. I set mine to 4 hours or else i would have more tasks failing or being cancelled. I assume that's hidden in a config file somewhere? I've never come across it. Why would you need it as low as 4 hours? Is your computer rarely switched on? |
Brian Nixon Send message Joined: 12 Apr 20 Posts: 293 Credit: 8,432,366 RAC: 0 |
Peter Hucker wrote: Daedalus wrote: Nope, you can choose the running time. I assume that's hidden in a config file somewhere? AIUI this one can only be set online: Rosetta@home preferences > Target CPU run time |
Mr P Hucker Send message Joined: 12 Aug 06 Posts: 1600 Credit: 12,116,986 RAC: 9,863 |
Peter Hucker wrote: Daedalus wrote: Nope, you can choose the running time. I assume that's hidden in a config file somewhere? AIUI this one can only be set online: Rosetta@home preferences > Target CPU run time. It doesn't matter to me how long each task runs, so would I be correct in thinking they prefer 8 hours at their end? If so I'll leave it on that. |
Brian Nixon Send message Joined: 12 Apr 20 Posts: 293 Credit: 8,432,366 RAC: 0 |
would I be correct in thinking they prefer 8 hours at their end? Yes: Mod.Sense wrote: Admin has requested that people leave the preference unset. 14-year-old FAQ entry which may or may not still be relevant: How do I set the adjustable Work Unit time parameter in my Preferences? What should I set it to? |
robertmiles Send message Joined: 16 Jun 08 Posts: 1233 Credit: 14,338,560 RAC: 2,014 |
[snip] Most, but not all, BOINC programs try to keep the last two checkpoint files written, so that if there was a problem writing the most recent one, the next older one can be used instead. A few BOINC projects using very big checkpoint files erase the previous one just before they write a new one. I suspect that those are projects that do not understand the structures of their programs well enough that they can separate the changeable sections from the unchangeable sections. Windows allows programs to declare that they should never be paged. That's one way to avoid such problems. Another way is for BOINC to have its own pager, which keep track of which pointers point to which memory blocks, pages them only an entire block at a time, and adjusts the pointers if those blocks are retrieved to different addresses. I basically allow 1.3GB RAM per Task. Some use a lot more, most use a lot less, so overall as long as you allow 1.3GB per Task, set the number of cores/threads you use to match the available RAM in the system and you won't run in to issues. Yes. About 1 GB for Windows 10, less for earlier versions of Windows. [snip] Looks like you're counting only the power used by the CPU chip, but not the power used by the rest of the computer. The computer's power supply must be able to handle the surges of power used to start up various sections of the computer, plus the power it uses itself, with enough of a margin that it does not run into the inefficiency of being too close to its limits. MOST of the rest of the juice, but not all of the rest. Like you, I did not know any of this when I started. Where did I learn it? Right here in these forums. Yes, an amount of cruft has built up over the years – old, long threads with information that may or may not still be relevant. A lot of what’s written is unclear at best. And input from insiders who can tell us how it actually works is infrequent. 
But amongst it all are some useful nuggets – and passing those on is the least I can do. I agree. They seem to have some hidden assumptions, possibly that you only want to see messages from the last year and you don't want to find anything if you don't know how to spell its name exactly. I assumed Peter was talking about measured consumption, not a number printed on the label of his power supply. He's assuming that surges in power use won't trip the breaker or blow the fuse, then. For me, it's hard enough to reach the breaker that I want to be sure I never trip it. I think it does. He is only allowing 50% of the memory to be used by BOINC, so only 4 GB, or 1 GB/core. It looks like he doesn't mind losing run time while two or more tasks are waiting for memory, and nothing is running that's getting close to releasing any memory. It seems that I can have either the BOINC Manager or the World Community Grid, but not both. No. I've had WCG running under the other manager for years. It would be good if one of the Science bods at Rosetta could make a more complete statement of what's happened in the last few weeks. The science bods rarely make ANY statement through Rosetta@home other than an occasional summary of something they have finished doing. They expect one of the forum moderators to read through all of the posts and find the very few that actually require their attention. yeah. It's identical to BOINC, but the WCG team does thorough testing before updating their version of the BOINC software. So in theory, less likely to run into bugs. My experience indicates that BOINC uses a standard location in the file structure to list its projects, and updates are written so that they don't disturb that standard location and therefore don't need to copy the list to a new version. However, if you updated from a version that does not use the standard location to a version that does, this could have caused such a problem. 
BOINC normally uses a 64-bit version on the 64-bit versions of Windows that can use more than 4 GB of memory. I think I recently saw that the WCG version is available only as a 32-bit version. Switching between a 64-bit version and a 32-bit version might also cause such problems, due to the separation between where 64-bit and 32-bit programs are stored under 64-bit Windows. [snip] Nope, you can choose the running time. I set mine to 4 hours or else i would have more tasks failing or being cancelled. Yes, one specific to Rosetta@home and therefore normally changed only through their server. Under the Advanced view of BOINC Manager, click on Projects. Scroll down (if necessary) to Rosetta@home and click on it. Click on Your account in the left column. Click on Preferences for this project. There can be up to 4 sets of these preferences. Find the one relevant to your computer, and click on its Edit preferences. Click on the V to the right of the box for Target CPU run time. Click on the new value you want to use. Rosetta@home currently sets these values to 8 hours when creating a new account, so that's your target time until you change it. Click on Edit preferences. If you can't tell which set of preferences is for which computer, you can safely make the same change in any set that is not for another of your computers. In the top line, click on one of the two X symbols to close the window. Since Rosetta@home is probably the only BOINC project measuring tasks by target run time, it's unclear if BOINC has a way of properly sending the target run time to your BOINC Manager. Therefore, don't be surprised if making this change makes it do strange things to the number of tasks it will send you for a few days. |
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1725 Credit: 18,378,164 RAC: 20,578 |
I basically allow 1.3GB RAM per Task. Some use a lot more, most use a lot less, so overall as long as you allow 1.3GB per Task, set the number of cores/threads you use to match the available RAM in the system and you won't run into issues. Surely you need to allow some room for the OS? Yep. A 4 core/thread system with 6GB RAM would be in trouble if it got 2 Tasks needing 3GB RAM, but you'd be very unlucky for that to occur. Given that the number of very large RAM requirement Tasks is only a small percentage of the total number of Tasks available, allowing 1.3GB per Task means you will usually have more than enough room for the OS. The only time that rule of thumb breaks down is with single or dual core/thread systems: a single 3GB RAM task is more than a system with 2GB of RAM can handle, and a 3GB RAM task with a 400MB RAM task would leave a 4GB RAM 2 core/thread system in trouble as well. Grant Darwin NT |
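The rule of thumb above can be sketched in a few lines of Python. This is only an illustration of the arithmetic in the post, not anything BOINC itself runs: the function names are made up for the example, the 1.3 GB allowance is the figure quoted above, and the 1 GB operating-system reserve is an assumption taken from the "about 1 GB for Windows 10" remark earlier in the thread.

```python
import math

GB_PER_TASK = 1.3    # rule-of-thumb allowance quoted in the post above
OS_RESERVE_GB = 1.0  # assumption: rough Windows 10 footprint mentioned in the thread


def max_tasks(ram_gb, threads, gb_per_task=GB_PER_TASK):
    """Tasks to run at once: capped by thread count and by RAM / allowance."""
    return min(threads, math.floor(ram_gb / gb_per_task))


def fits(task_sizes_gb, ram_gb, os_reserve_gb=OS_RESERVE_GB):
    """True if this particular mix of tasks fits beside the OS reserve."""
    return sum(task_sizes_gb) <= ram_gb - os_reserve_gb


# 4 threads, 6 GB: the rule of thumb allows all 4 threads...
print(max_tasks(6.0, 4))
# ...but two unlucky 3 GB tasks at once would not fit.
print(fits([3.0, 3.0], 6.0))
# A 3 GB task plus a 400 MB task on a 4 GB, 2-thread machine.
print(fits([3.0, 0.4], 4.0))
```

The two functions answer different questions: `max_tasks` is the average-case setting, while `fits` checks one specific worst-case mix, which is where the post says the rule breaks down.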
EHM-1 Send message Joined: 21 Mar 20 Posts: 23 Credit: 183,782 RAC: 0 |
@Peter: Thanks for answering. Per your suggestion to log in to BAM, I just created a Boincstats account, and no projects are listed. I assume that's because the CPID field in my profile there is empty. I tried renewing it, but that did not populate it. But I remembered having seen a project listing for myself there yesterday (first time ever on that site)(maybe in conjunction with looking for signature stats code to use here?), and by looking in my history I found the project listing I'd seen yesterday, which is under my SETI username (created in '99) of EHM-1. If I click on the "detailed stats" link there, it takes me to a stats page that looks like yours, lists SETI and Rosetta, but not WCG. It of course shows a different user id from the one in the account I just created. I'm wondering if a sort of dummy account was generated in Boincstats by my having added their signature box url to my profile here. I can no longer say how I came up with that url with the EHM-1 user id (https://boincstats.com/signature/-1/user/40364120/sig.png). So now of course I'm completely lost. I can't say how Boincstats knows about EHM-1, nor why it would have a user id without me having created an account, but I assume I should dump the new account and try to create a log-in based on the EHM-1 user id. I will try that now. Sorry to be so clueless, and feel free to abandon any attempt to rescue me! Eric PS, follow-up: I deleted my BAM account, and can't see any way to create one based on an existing user id. system: up-to-date Windows 10, Intel quad-core 3.6 GHz processor, 8 GB RAM |
Brian Nixon Send message Joined: 12 Apr 20 Posts: 293 Credit: 8,432,366 RAC: 0 |
it's unclear if BOINC has a way of properly sending the target run time to your BOINC Manager. From what I can tell it doesn’t (directly), and so Rosetta has to fudge it. Looking at client_state.xml, it seems each work unit is delivered with an estimate of how many operations it will take to complete: <rsc_fpops_est>80000000000000.000000</rsc_fpops_est> and the application that runs the task that processes the work unit is declared as performing a certain number of operations per second: <flops>2777777777.777778</flops> Divide one by the other and you get 8 hours. |
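For anyone who wants to check the division, here is a minimal Python sketch. It uses only the two values quoted above; the `<sample>` wrapper element and the function name are hypothetical and do not reflect the real layout of client_state.xml, where these two elements live in different parts of the file.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: only the two element names and their values come
# from the post; the <sample> wrapper is invented so the snippet parses.
SAMPLE = """
<sample>
  <rsc_fpops_est>80000000000000.000000</rsc_fpops_est>
  <flops>2777777777.777778</flops>
</sample>
"""


def estimated_runtime_hours(xml_text):
    """Estimated operations divided by operations per second, in hours."""
    root = ET.fromstring(xml_text)
    fpops_est = float(root.findtext("rsc_fpops_est"))
    flops = float(root.findtext("flops"))
    return fpops_est / flops / 3600.0


hours = estimated_runtime_hours(SAMPLE)
print(f"{hours:.2f} hours")
```

The result is in seconds before the final division by 3600, which is why the quoted figures work out to the 8 hour default.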
Brian Nixon Send message Joined: 12 Apr 20 Posts: 293 Credit: 8,432,366 RAC: 0 |
Daedalus wrote: I set mine to 4 hours or else i would have more tasks failing or being cancelled. If BOINC is downloading more work than your computer can complete before the deadline, your Store at least setting is almost definitely too high. |
Mr P Hucker Send message Joined: 12 Aug 06 Posts: 1600 Credit: 12,116,986 RAC: 9,863 |
Another way is for BOINC to have its own pager, which keep track of which pointers point to which memory blocks, pages them only an entire block at a time, and adjusts the pointers if those blocks are retrieved to different addresses. Ah, is that why there's a setting in Boinc on how much pagefile to use? I wondered what that could mean. Although you enter a percentage, so I'm still confused on that. MOST of the rest of the juice, but not all of the rest. My GPUs are 250W TDP, although they usually use 150W to 200W. More when doing double precision (Milkyway) than single precision (Einstein). In total I think - 200W x 4 for GPUs, 75W for a smaller one, 8 processors (varying from one at 6W to mostly 75-95W TDP), 3 SSDs, 4 hard disks, and loads of fans. Some of the PSUs are not efficient (as in cheap ones). I agree. They seem to have some hidden assumptions, possibly that you only want to see messages from the last year and you don't want to find anything if you don't know how to spell its name exactly. You can change assumptions in the advanced search, but it still seems to find a lot of irrelevant stuff, rather like internet search engines used to be before Google. He's assuming that surges in power use won't trip the breaker or blow the fuse, then. I don't use breakers, I don't like false trips. Everything is fuses. And they accept surges. I used to work in a school where a room of computers would constantly trip the breakers due to surges. I (illegally) changed the breakers to slower tripping ones. I installed ones that you would have in a house, ignoring the stupid law that says somehow kids in schools suddenly need more protection than when they're at home. It looks like he doesn't mind losing run time while two or more tasks are waiting for memory, and nothing is running that's getting close to releasing any memory. 
Boinc seems to be quite clever with this - I watched it run out of memory for another Rosetta task, then try smaller tasks from the queue of another project. Anyway, restricting the number of cores can't help with memory, Boinc will always use as many cores as possible until it hits your set memory limit. If you can't tell which set of preferences is for which computer, you can safely make the same change in any set that is not for another of your computers. If you right click a Rosetta task and choose "your account" then click "Computers on this account", it tells you which is in which group. Einstein is better in this regard, it's the only project I've seen with a huge menu when you right click a task in Boinc. You can go straight to all sorts of things like project status, your tasks, your computers, etc. |
Mr P Hucker Send message Joined: 12 Aug 06 Posts: 1600 Credit: 12,116,986 RAC: 9,863 |
Given that the number of very large RAM requirement Tasks is only a small percentage of the total number of Tasks available, allowing 1.3GB per Task means you will usually have more than enough room for the OS. If Rosetta is anything like Einstein, large tasks tend to come together in bunches, like London buses. With Einstein it's because of higher frequency telescope data. Not sure if that would happen with Rosetta. |
Mr P Hucker Send message Joined: 12 Aug 06 Posts: 1600 Credit: 12,116,986 RAC: 9,863 |
So now of course I'm completely lost. I can't say how Boincstats knows about EHM-1, nor why it would have a user id without me having created an account, but I assume I should dump the new account and try to create a log-in based on the EHM-1 user id. I will try that now. Sorry to be so clueless, and feel free to abandon any attempt to rescue me! It's been years since I created my Boincstats account, but I think all you have to do is set it up with the same username, email address, and password as your Boinc projects use, then it should find them all (all the ones you've allowed exporting of data from - and some projects allow it by default). |
robertmiles Send message Joined: 16 Jun 08 Posts: 1233 Credit: 14,338,560 RAC: 2,014 |
Given that the number of very large RAM requirement Tasks is only a small percentage of the total number of Tasks available, allowing 1.3GB per Task means you will usually have more than enough room for the OS. For Rosetta@home, the large tasks are usually due to large proteins. Although, I've seen a message saying that doing anything for COVID-19 under Rosetta@home also causes large tasks. Rosetta@home hasn't said much if anything about whether the large protein work comes in batches. |
Sid Celery Send message Joined: 11 Feb 08 Posts: 2140 Credit: 41,518,559 RAC: 10,612 |
yeah. It's identical to BOINC, but the WCG team does thorough testing before updating their version of the BOINC software. So in theory, less likely to run into bugs. Thinking about it again, I might have seen a notice mentioning a WCG version recently, but I thought it was just the usual Boinc Manager but with a WCG badge stuck on it. The idea that IBM might also have done a security check on it makes me think I should use that version as I have very little faith in the standard Boinc Manager. But maybe updates will be delayed too. I'll stay as I am and just try to pay more attention to WCG notices to see how it works after version changes. Thanks for pointing it out anyway. |
robertmiles Send message Joined: 16 Jun 08 Posts: 1233 Credit: 14,338,560 RAC: 2,014 |
yeah. It's identical to BOINC, but the WCG team does thorough testing before updating their version of the BOINC software. So in theory, less likely to run into bugs. Watch for signs that the WCG BOINC is only available as a 32-bit version, and is therefore unable to run any tasks using a program compiled to run under 64-bit Windows. I've seen nothing definite about whether it is, so I'll only call it something to watch for. |
Sid Celery Send message Joined: 11 Feb 08 Posts: 2140 Credit: 41,518,559 RAC: 10,612 |
WCG is a separate BOINC project. Unrelated to Rosetta@home. Thanks. If it's 32-bit I think I'll stick with the standard version |
Sid Celery Send message Joined: 11 Feb 08 Posts: 2140 Credit: 41,518,559 RAC: 10,612 |
yeah. It's identical to BOINC, but the WCG team does thorough testing before updating their version of the BOINC software. So in theory, less likely to run into bugs. Seen that now. Thanks for confirming. I'll stick with what I'm used to. |
Sid Celery Send message Joined: 11 Feb 08 Posts: 2140 Credit: 41,518,559 RAC: 10,612 |
In short, no. Sounds very odd. When I first saw your msg for some reason none of the other very good replies and suggestions were showing, nor were your useful images, so I've only just caught up. If I can make some further suggestions: Suspend when computer is on battery: Untick. Your computer doesn't run on battery so the selection is redundant. It's intended for laptops/portables that aren't plugged into the mains. Suspend GPU computing when computer is in use: Untick. None of the projects you run offer GPU computing. And clear the field that mentions mouse/keyboard input - also redundant. Suspend when Non-Boinc CPU usage is above: Untick and clear the %age. This requires some further explanation since you mention other programs you run. You may not have picked up on it, but Grant mentioned that Rosetta (and I think tasks from all projects under Boinc) run at 'Low' or 'Idle' priority. This was a concept I didn't know existed until I started using Boinc. If you open the task manager under Windows 10, go to the "Details" column, find a Rosetta task, right-click on it and hover the mouse over the "Set Priority" option, a little sub-window will open showing the priority Rosetta runs at. On mine it shows 'Low'. Do the same on any other program you can identify and it'll show 'Normal' - a higher priority than 'Low'. That means, when you type, move the mouse, play music, watch video, or run any other program on your PC, they will use the CPU ahead of Rosetta - or Rosetta defers priority to anything else you ask your PC to do. Sometimes circumstances can arise when there's a conflict between Rosetta and other programs you run, but it's generally something pretty specialist. Recently, when work was hard to come by here, the only way I could tell I'd run out of Rosetta tasks was that the fans went quieter and the room became less hot. Certainly not for responsiveness of my PC. 
But if an issue arises with that program you mentioned, come back again because I think Grant had a way around it. In the usage limits section: Use at most 75% of CPU time: Unless you have issues with temperatures, using anything other than 100% can lead to task errors, particularly if tasks are starting & stopping a lot. Over-high temperatures are the only reason I'd think of reducing from 100%. But see below too. Use at most 75% of CPUs: I understand why you've cut down to 3-cores rather than 4 if it's to do with the limited RAM you have - I misunderstood this before. While I have more RAM on my i3-8350, I've also got my memory allocation set to 65% in use and 85% not in use. So I'm wondering if you should try bumping your RAM allocation up to 60% and 80% from 50% & 75% and then seeing if that allows you to run the 4th core as well. If you return to your original problem of tasks refusing to run then by all means return to 75% of CPUs, but it's worth a fresh try after changing your RAM limits imo. And note that running all 4-cores at 100%, 100% of the time, with nothing set in the "when to suspend" section, that will push up your temperatures as well, so there's two reasons you may want to go back to 75% cores if it's problematic. But even if you do drop back on cores, the other changes are worth doing. |
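To see roughly how the memory percentages discussed above interact with the core count, here is a small Python sketch. It assumes the 8 GB quad-core machine described elsewhere in the thread and the 1.3 GB per task rule of thumb from an earlier post; the function name is made up, and real tasks vary in size, so treat the output as an estimate only.

```python
import math


def tasks_allowed(total_ram_gb, mem_fraction, threads, gb_per_task=1.3):
    """Rough count of tasks that fit in BOINC's share of RAM, capped by threads."""
    boinc_ram_gb = total_ram_gb * mem_fraction
    return min(threads, math.floor(boinc_ram_gb / gb_per_task))


# Assumed machine: 8 GB RAM, 4 threads. Percentages are the ones mentioned
# in the post (50/75 at present, 60/80 suggested).
for pct in (50, 60, 75, 80):
    print(pct, tasks_allowed(8.0, pct / 100, threads=4))
```

Under these assumptions the estimate only reaches 4 tasks at the higher percentages, so whether a 4th core actually runs will depend on how large the tasks of the day happen to be.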
Stevie G Send message Joined: 15 Dec 18 Posts: 107 Credit: 865,910 RAC: 814 |
Mikey: Thanks for the response. I uninstalled WCG, downloaded BOINC again, reinstalled the BOINC Manager and all the problems were solved. Got my projects back, but still no Rosetta tasks. I think you may be right, that WCG may have been installed with BAM. It was making my computer run r-e-a-l-l-y slowly. It changed my Asteroids completion time from around 3 hours to over 4:30 hours. So I added WCG in the BOINC Manager and it downloaded one task. We'll see how that works. My machine is back up to speed now and waiting for more Rosetta. Maybe you guys were correct in saying this box is too low-spec for that kind of work. Steven Gaber Oldsmar, FL |
©2024 University of Washington
https://www.bakerlab.org