Message boards : Number crunching : Problems and Technical Issues with Rosetta@home
Author | Message |
---|---|
Brian Nixon Joined: 12 Apr 20 Posts: 293 Credit: 8,432,366 RAC: 0 |
Christopher Graesser wrote: 28.06.2020 14:17:58 | Rosetta@home | Server can't open database May just be a transient issue as the client scheduler tries to contact the project server to report completed tasks or request new ones. I have the same message in one of my logs from a couple of weeks ago. All was well again the next time it tried, an hour later. Are you seeing other problems? If so, the log entries immediately before and after the error would give some context. |
Sid Celery Joined: 11 Feb 08 Posts: 2113 Credit: 41,060,649 RAC: 21,432 |
WCG is a separate BOINC project. Unrelated to Rosetta@home. What?! I've been signed up at WCG since 2010 (2yrs after joining Rosetta) and I never knew this. Nor have I ever heard anyone mention it before! You live and learn... |
yoerik Joined: 24 Mar 20 Posts: 128 Credit: 169,525 RAC: 0 |
yeah. It's identical to BOINC, but the WCG team does thorough testing before updating their version of the BOINC software. So in theory, less likely to run into bugs. |
Bryn Mawr Joined: 26 Dec 18 Posts: 389 Credit: 12,062,600 RAC: 15,391 |
How do you want that plan adjusted to handle times when one of the two projects had no work to download for the last few days, or very little work, as has happened on Rosetta@home recently? Many people do not agree that the adjustment should take effect immediately, without even waiting for several tasks to be reported and receive credits. The one thing I would do is take the project start date into account so that newly added projects don’t hog the machine for the first few days. |
EHM-1 Joined: 21 Mar 20 Posts: 23 Credit: 183,782 RAC: 0 |
@Sid: Yes, I do have some suspension criteria selected, for reasons I mentioned below (running ArcGIS, for one). Yesterday's comp prefs are below in message 97710. The prefs I've been trying since last night are below in this message. @Brian re screensaver: Yes, since the setting changes I made yesterday, BOINC is now processing work units when I'm using the computer. Until then, I'd always had it set to process only after I'd been idle for a certain time. For many years I've had Windows set to invoke the screensaver after x minutes idle, and BOINC to start processing after x+1 minutes idle. The screensaver kicking in is a fun gimmick that always catches kids' (and the rare curious adults') attention, and helps perpetuate my legend among my friends' kids that I'm some kind of eccentric mad genius communicating with aliens via my computer. Thanks to you guys' help, I predict my Rosetta ranking will now soar into the low 300,000s. Now off to pick out a limo... Eric PS- Anyone have an idea why my signature stats don't show WCG? Is there maybe a lag between adding the project and the stat-box-generator capturing a new user? I added WCG yesterday or the day before from within BOINC mgr, and started using the signature yesterday. system: up-to-date Windows 10, Intel quad-core 3.6 GHz processor, 8 GB RAM |
Daedalus Joined: 1 Aug 08 Posts: 39 Credit: 10,102,272 RAC: 1,270 |
Nope, you can choose the running time. I set mine to 4 hours or else I would have more tasks failing or being cancelled. |
Falconet Joined: 9 Mar 09 Posts: 353 Credit: 1,214,732 RAC: 5,322 |
WCG is a separate BOINC project. Unrelated to Rosetta@home. Here you go: https://www.worldcommunitygrid.org/ms/viewDownloadAgain.action Their version is 32-bit only so BOINC benchmarks end up much lower than under the 64-bit BOINC version. They also contribute to BOINC Development/bug fixing AFAIK. |
Mr P Hucker Joined: 12 Aug 06 Posts: 1600 Credit: 11,707,623 RAC: 22,481 |
If so, limiting BOINC to using only two of the cores should make each of those cores have 2 GB available. Shouldn't be necessary. My 24 core machines used to have 12GB (now they have 36GB). It would just run fewer tasks at once, with some queued waiting for memory. |
Mr P Hucker Joined: 12 Aug 06 Posts: 1600 Credit: 11,707,623 RAC: 22,481 |
You might hope so, but that's not what actually happens. Many programs load memory in sections, keeping pointers to the start of those sections. If the sections are not reloaded to the same memory address where they were before, the program is likely to crash as soon as it tries to use something in an out-of-place section. I don't follow that - what you seem to be saying is Windows paging programs should often break them. I basically allow 1.3GB RAM per Task. Some use a lot more, most use a lot less, so overall as long as you allow 1.3GB per Task, set the number of cores/threads you use to match the available RAM in the system and you won't run into issues. Surely you need to allow some room for the OS? Yes if you mean 6 hours + 12 hours of run time. No if you mean 6 hours + 12 hours of clock time. I thought it was run time. It does say something like "6 hours of work". I still have the switch-between interval set at 1,000 minutes, so I hope that does not cause my OpenPandemic tasks to expire. We'll see. It should automatically override your 1000 minute setting if something has a short due date. I've seen mine do it with LHC tasks (because they can unexpectedly suddenly have 10 days' work to complete instead of 3 hours). Looks like you're counting only the power used by the CPU chip, but not the power used by the rest of the computer. The computer's power supply must be able to handle the surges of power used to start up various sections of the computer, plus the power it uses itself, with enough of a margin that it does not run into the inefficiency of being too close to its limits. That wouldn't account for 95W becoming 400W. No, it's my graphics cards that eat all the juice. Like you, I did not know any of this when I started. Where did I learn it? Right here in these forums. Yes, an amount of cruft has built up over the years – old, long threads with information that may or may not still be relevant. A lot of what’s written is unclear at best. 
And input from insiders who can tell us how it actually works is infrequent. But amongst it all are some useful nuggets – and passing those on is the least I can do. Unfortunately the search engine in these forums doesn't seem to be very good. I can never find information on something and have to make a new post. I assumed Peter was talking about measured consumption, not a number printed on the label of his power supply. Correct, one of those energy meters for the entire house. Plus the 13A plug gets warm! I think it does. He is only allowing 50% of the memory to be used by BOINC, so only 4 GB, or 1 GB/core. That should then run two of the units instead. I've seen it happen on mine, some tasks clearly state "waiting for memory" and change colour. It seems that I can have either the BOINC Manager or the World Community Grid, but not both. Does WCG insist on you using their manager? I only have it running on my phones, which let me add it within Boinc. Best just to add it within Boinc as another project, it won't be able to remove Boinc that way. In fact it shouldn't have been able to do it anyway. Uninstalling somebody else's program is extremely nasty and I'm sure Windows would have informed you some suspicious virus-like activity was going on. I missed this earlier, sorry. I was assuming that if you had the virus, even if it had got bad enough to make you bed-ridden, that taking this protein would destroy all the viruses in your body, hence you would recover pretty quickly once your body had repaired the damage. Or is it the case that once the virus has got into your cells, it's hidden from this protein and can no longer be neutralised? Perhaps with your adjusted cells still creating more and more virus? Kinda like killing all the cat fleas in your house but not the eggs - you need to keep on killing the adult fleas until the eggs have run out. yeah. 
It's identical to BOINC, but the WCG team does thorough testing before updating their version of the BOINC software. So in theory, less likely to run into bugs. Ah so perhaps if you install WCG Boinc, it "updates" your normal Boinc to its version, and in this case failed to copy over one of the projects - Rosetta. PS- Anyone have an idea why my signature stats don't show WCG? Is there maybe a lag between adding the project and the stat-box-generator capturing a new user? I added WCG yesterday or the day before from within BOINC mgr, and started using the signature yesterday. Log into Boincstats, then go to BAM, then signature on the left column. Check settings in there like "Do not show projects with a BS-RAC lower than". Also, can you see WCG in your stats on the main Boincstats page similar to my https://www.boincstats.com/stats/-1/user/detail/6470/projectList but change to your user number? If not, you may need to authorise exporting data in the WCG settings (daft EU legislation). Nope, you can choose the running time. I set mine to 4 hours or else I would have more tasks failing or being cancelled. I assume that's hidden in a config file somewhere? I've never come across it. Why would you need it as low as 4 hours? Is your computer rarely switched on? |
Brian Nixon Joined: 12 Apr 20 Posts: 293 Credit: 8,432,366 RAC: 0 |
Peter Hucker wrote: Daedalus wrote: Nope, you can choose the running time. I assume that's hidden in a config file somewhere? AIUI this one can only be set online: Rosetta@home preferences > Target CPU run time |
Mr P Hucker Joined: 12 Aug 06 Posts: 1600 Credit: 11,707,623 RAC: 22,481 |
Peter Hucker wrote: Daedalus wrote: Nope, you can choose the running time. I assume that's hidden in a config file somewhere? AIUI this one can only be set online: Rosetta@home preferences > Target CPU run time It doesn't matter to me how long each task runs, so would I be correct in thinking they prefer 8 hours at their end? If so I'll leave it on that. |
Brian Nixon Joined: 12 Apr 20 Posts: 293 Credit: 8,432,366 RAC: 0 |
would I be correct in thinking they prefer 8 hours at their end? Yes: Mod.Sense wrote: Admin has requested that people leave the preference unset. 14-year-old FAQ entry which may or may not still be relevant: How do I set the adjustable Work Unit time parameter in my Preferences? What should I set it to? |
robertmiles Joined: 16 Jun 08 Posts: 1232 Credit: 14,264,668 RAC: 4,443 |
[snip] Most, but not all, BOINC programs try to keep the last two checkpoint files written, so that if there was a problem writing the most recent one, the next older one can be used instead. A few BOINC projects using very big checkpoint files erase the previous one just before they write a new one. I suspect that those are projects that do not understand the structures of their programs well enough that they can separate the changeable sections from the unchangeable sections. Windows allows programs to declare that they should never be paged. That's one way to avoid such problems. Another way is for BOINC to have its own pager, which keeps track of which pointers point to which memory blocks, pages them only an entire block at a time, and adjusts the pointers if those blocks are retrieved to different addresses. I basically allow 1.3GB RAM per Task. Some use a lot more, most use a lot less, so overall as long as you allow 1.3GB per Task, set the number of cores/threads you use to match the available RAM in the system and you won't run into issues. Yes. About 1 GB for Windows 10, less for earlier versions of Windows. [snip] Looks like you're counting only the power used by the CPU chip, but not the power used by the rest of the computer. The computer's power supply must be able to handle the surges of power used to start up various sections of the computer, plus the power it uses itself, with enough of a margin that it does not run into the inefficiency of being too close to its limits. MOST of the rest of the juice, but not all of the rest. Like you, I did not know any of this when I started. Where did I learn it? Right here in these forums. Yes, an amount of cruft has built up over the years – old, long threads with information that may or may not still be relevant. A lot of what’s written is unclear at best. And input from insiders who can tell us how it actually works is infrequent. 
But amongst it all are some useful nuggets – and passing those on is the least I can do. I agree. They seem to have some hidden assumptions, possibly that you only want to see messages from the last year and you don't want to find anything if you don't know how to spell its name exactly. I assumed Peter was talking about measured consumption, not a number printed on the label of his power supply. He's assuming that surges in power use won't trip the breaker or blow the fuse, then. For me, it's hard enough to reach the breaker that I want to be sure I never trip it. I think it does. He is only allowing 50% of the memory to be used by BOINC, so only 4 GB, or 1 GB/core. It looks like he doesn't mind losing run time while two or more tasks are waiting for memory, and nothing is running that's getting close to releasing any memory. It seems that I can have either the BOINC Manager or the World Community Grid, but not both. No. I've had WCG running under the other manager for years. It would be good if one of the Science bods at Rosetta could make a more complete statement of what's happened in the last few weeks. The science bods rarely make ANY statement through Rosetta@home other than an occasional summary of something they have finished doing. They expect one of the forum moderators to read through all of the posts and find the very few that actually require their attention. yeah. It's identical to BOINC, but the WCG team does thorough testing before updating their version of the BOINC software. So in theory, less likely to run into bugs. My experience indicates that BOINC uses a standard location in the file structure to list its projects, and updates are written so that they don't disturb that standard location and therefore don't need to copy the list to a new version. However, if you updated from a version that does not use the standard location to a version that does, this could have caused such a problem. 
BOINC normally uses a 64-bit version on the 64-bit versions of Windows that can use more than 4 GB of memory. I think I recently saw that the WCG version is available only as a 32-bit version. Switching between a 64-bit version and a 32-bit version might also cause such problems, due to the separation between where 64-bit and 32-bit programs are stored under 64-bit Windows. [snip] Nope, you can choose the running time. I set mine to 4 hours or else I would have more tasks failing or being cancelled. Yes, one specific to Rosetta@home and therefore normally changed only through their server. Under the Advanced view of BOINC Manager, click on Projects. Scroll down (if necessary) to Rosetta@home and click on it. Click on Your account in the left column. Click on Preference for this project. There can be up to 4 sets of these preferences. Find the one relevant to your computer, and click on its Edit preferences. Click on the V to the right of the box for Target CPU run time. Click on the new value you want to use. Rosetta@home currently sets these values to 8 hours when creating a new account, so that's your target time until you change it. Click on Edit preferences. If you can't tell which set of preferences is for which computer, you can safely make the same change in any set that is not for another of your computers. In the top line, click on one of the two X symbols to close the window. Since Rosetta@home is probably the only BOINC project measuring tasks by target run time, it's unclear if BOINC has a way of properly sending the target run time to your BOINC Manager. Therefore, don't be surprised if making this change makes it do strange things to the number of tasks it will send you for a few days. |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1670 Credit: 17,461,513 RAC: 24,655 |
I basically allow 1.3GB RAM per Task. Some use a lot more, most use a lot less, so overall as long as you allow 1.3GB per Task, set the number of cores/threads you use to match the available RAM in the system and you won't run into issues. Surely you need to allow some room for the OS? Yep. A 4 core/thread system with 6GB RAM would be in trouble if it got 2 Tasks needing 3GB RAM, but you'd be very unlucky for that to occur. Given that the number of very large RAM requirement Tasks is only a small percentage of the total number of Tasks available, allowing 1.3GB per Task means you will usually have more than enough room for the OS. The only time where that rule of thumb breaks down is with single- or dual-core/thread systems: a single 3GB RAM task is more than a system with 2GB of RAM can handle, and a 3GB RAM task with a 400MB RAM task would leave a 4GB RAM 2 core/thread system in trouble as well. Grant Darwin NT |
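The rule of thumb above can be turned into a quick calculation. This is only a rough sketch: the 1.3 GB figure comes from the post, while the function name and its parameters are made up for illustration and are not BOINC settings. It also ignores the occasional 3 GB task, which is exactly the case the post says can break the rule.

```python
def max_concurrent_tasks(total_ram_gb, cores, ram_per_task_gb=1.3):
    """Estimate how many tasks to run at once under the per-task RAM rule of thumb.

    Hypothetical helper: ram_per_task_gb defaults to the 1.3 GB average quoted
    in the post; it is not a value read from BOINC or Rosetta@home.
    """
    # Number of tasks that fit in RAM at the assumed average size.
    by_memory = int(total_ram_gb // ram_per_task_gb)
    # Never run more tasks than there are cores/threads.
    return max(0, min(cores, by_memory))


# The systems mentioned in this thread.
for ram_gb, cores in [(6, 4), (8, 4), (2, 2), (12, 24)]:
    print(f"{cores} cores, {ram_gb} GB RAM -> {max_concurrent_tasks(ram_gb, cores)} tasks")
```

Whatever is left over after the running tasks is the headroom for the OS, which is why the estimate only holds while most tasks stay well under the average.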
EHM-1 Joined: 21 Mar 20 Posts: 23 Credit: 183,782 RAC: 0 |
@Peter: Thanks for answering. Per your suggestion to log in to BAM, I just created a Boincstats account, and no projects are listed. I assume that's because the CPID field in my profile there is empty. I tried renewing it, but that did not populate it. But I remembered having seen a project listing for myself there yesterday (first time ever on that site)(maybe in conjunction with looking for signature stats code to use here?), and by looking in my history I found the project listing I'd seen yesterday, which is under my SETI username (created in '99) of EHM-1. If I click on the "detailed stats" link there, it takes me to a stats page that looks like yours, lists SETI and Rosetta, but not WCG. It of course shows a different user id from the one in the account I just created. I'm wondering if a sort of dummy account was generated in Boincstats by my having added their signature box url to my profile here. I can no longer say how I came up with that url with the EHM-1 user id (https://boincstats.com/signature/-1/user/40364120/sig.png). So now of course I'm completely lost. I can't say how Boincstats knows about EHM-1, nor why it would have a user id without me having created an account, but I assume I should dump the new account and try to create a log-in based on the EHM-1 user id. I will try that now. Sorry to be so clueless, and feel free to abandon any attempt to rescue me! Eric PS, follow-up: I deleted my BAM account, and can't see any way to create one based on an existing user id. system: up-to-date Windows 10, Intel quad-core 3.6 GHz processor, 8 GB RAM |
Brian Nixon Joined: 12 Apr 20 Posts: 293 Credit: 8,432,366 RAC: 0 |
it's unclear if BOINC has a way of properly sending the target run time to your BOINC Manager. From what I can tell it doesn’t (directly), and so Rosetta has to fudge it. Looking at client_state.xml, it seems each work unit is delivered with an estimate of how many operations it will take to complete: <rsc_fpops_est>80000000000000.000000</rsc_fpops_est> and the application that runs the task that processes the work unit is declared as performing a certain number of operations per second: <flops>2777777777.777778</flops> Divide one by the other and you get 8 hours. |
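The division can be checked in a few lines. This is a minimal sketch that uses only the two values quoted from client_state.xml above; the variable names simply mirror the XML tags and nothing here calls BOINC itself.

```python
# Values copied from the client_state.xml excerpt quoted above.
rsc_fpops_est = 80000000000000.0  # estimated floating-point operations for the work unit
flops = 2777777777.777778         # operations per second declared for the application

# Estimated run time = estimated operations / operations per second.
estimated_seconds = rsc_fpops_est / flops
estimated_hours = estimated_seconds / 3600

print(f"{estimated_seconds:.0f} s = {estimated_hours:.2f} h")
```

Presumably a different Target CPU run time would show up as a different ratio between the two values, but that is an inference from this one example rather than something the file states.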
Brian Nixon Joined: 12 Apr 20 Posts: 293 Credit: 8,432,366 RAC: 0 |
Daedalus wrote: I set mine to 4 hours or else i would have more tasks failing or being cancelled. If BOINC is downloading more work than your computer can complete before the deadline, your Store at least setting is almost certainly too high. |
Mr P Hucker Joined: 12 Aug 06 Posts: 1600 Credit: 11,707,623 RAC: 22,481 |
Another way is for BOINC to have its own pager, which keeps track of which pointers point to which memory blocks, pages them only an entire block at a time, and adjusts the pointers if those blocks are retrieved to different addresses. Ah, is that why there's a setting in Boinc on how much pagefile to use? I wondered what that could mean. Although you enter a percentage, so I'm still confused on that. MOST of the rest of the juice, but not all of the rest. My GPUs are 250W TDP, although they usually use 150W to 200W. More when doing double precision (Milkyway) than single precision (Einstein). In total I think - 200W x 4 for GPUs, 75W for a smaller one, 8 processors (varying from one at 6W to mostly 75-95W TDP), 3 SSDs, 4 hard disks, and loads of fans. Some of the PSUs are not efficient (as in cheap ones). I agree. They seem to have some hidden assumptions, possibly that you only want to see messages from the last year and you don't want to find anything if you don't know how to spell its name exactly. You can change assumptions in the advanced search, but it still seems to find a lot of irrelevant stuff, rather like internet search engines used to be before Google. He's assuming that surges in power use won't trip the breaker or blow the fuse, then. I don't use breakers, I don't like false trips. Everything is fuses. And they accept surges. I used to work in a school where a room of computers would constantly trip the breakers due to surges. I (illegally) changed the breakers to slower tripping ones. I installed ones that you would have in a house, ignoring the stupid law that says somehow kids in schools suddenly need more protection than when they're at home. It looks like he doesn't mind losing run time while two or more tasks are waiting for memory, and nothing is running that's getting close to releasing any memory. 
Boinc seems to be quite clever with this - I watched it run out of memory for another Rosetta task, then try smaller tasks from the queue of another project. Anyway, restricting the number of cores can't help with memory, Boinc will always use as many cores as possible until it hits your set memory limit. If you can't tell which set of preferences is for which computer, you can safely make the same change in any set that is not for another of your computers. If you right click a Rosetta task and choose "your account" then click "Computers on this account", it tells you which is in which group. Einstein is better in this regard, it's the only project I've seen with a huge menu when you right click a task in Boinc. You can go straight to all sorts of things like project status, your tasks, your computers, etc. |
Mr P Hucker Joined: 12 Aug 06 Posts: 1600 Credit: 11,707,623 RAC: 22,481 |
Given that the number of very large RAM requirement Tasks is only a small percentage of the total number of Tasks available, allowing 1.3GB per Task means you will usually have more than enough room for the OS. If Rosetta is anything like Einstein, large tasks tend to come together in bunches, like London buses. With Einstein it's because of higher frequency telescope data. Not sure if that would happen with Rosetta. |
Mr P Hucker Joined: 12 Aug 06 Posts: 1600 Credit: 11,707,623 RAC: 22,481 |
So now of course I'm completely lost. I can't say how Boincstats knows about EHM-1, nor why it would have a user id without me having created an account, but I assume I should dump the new account and try to create a log-in based on the EHM-1 user id. I will try that now. Sorry to be so clueless, and feel free to abandon any attempt to rescue me! It's been years since I created my Boincstats account, but I think all you have to do is set it up with the same username, email address, and password as your Boinc projects use, then it should find them all (all the ones you've allowed exporting of data from - and some projects allow it by default). |
©2024 University of Washington
https://www.bakerlab.org