Message boards : Number crunching : Rosetta needs 6675.72 MB RAM: is the restriction really needed?
Author | Message |
---|---|
Kissagogo27 Joined: 31 Mar 20 Posts: 86 Credit: 2,981,693 RAC: 1,241 |
not yet, see it in 12 hours ... |
Sid Celery Joined: 11 Feb 08 Posts: 2146 Credit: 41,570,180 RAC: 8,210 |
[quote]The original figure was 7*10^9, i.e. 7 followed by 9 zeros.[/quote] [quote]hi, i've got seven of "pre_helical_bundles_round1_attempt1_SAVE_ALL_OUT_IGNORE_THE_REST_" WU per 2GB computer[/quote] That's just what we need to hear. You're back in business. Sorry it took so long. |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1734 Credit: 18,532,940 RAC: 17,945 |
Just had a look at one of my systems and all Tasks bar one are presently pre_helical_bundles_round1_ type, and out of all of them, only two have the reduced memory/disk values. [quote]The original figure was 7*10^9, i.e. 7 followed by 9 zeros.[/quote] [quote]hi, i've got seven of "pre_helical_bundles_round1_attempt1_SAVE_ALL_OUT_IGNORE_THE_REST_" WU per 2GB computer[/quote] So it looks like the large value Tasks are still well & truly in the majority at this stage. Grant Darwin NT |
Sid Celery Joined: 11 Feb 08 Posts: 2146 Credit: 41,570,180 RAC: 8,210 |
[quote]Just had a look at one of my systems and all Tasks bar one are presently pre_helical_bundles_round1_ type, and out of all of them, only two have the reduced memory/disk values.[/quote] [quote]The original figure was 7*10^9, i.e. 7 followed by 9 zeros.[/quote] [quote]hi, i've got seven of "pre_helical_bundles_round1_attempt1_SAVE_ALL_OUT_IGNORE_THE_REST_" WU per 2GB computer[/quote] Urghh, yes. Only 7 out of 50 here atm with reduced RAM settings, though some other task-types are beginning to come down now. Which is likely to explain why In Progress has dropped again to 425k. Very frustrating because we were looking close to a long-term solution for a little while. |
Sid Celery Joined: 11 Feb 08 Posts: 2146 Credit: 41,570,180 RAC: 8,210 |
[quote]Just had a look at one of my systems and all Tasks bar one are presently pre_helical_bundles_round1_ type, and out of all of them, only two have the reduced memory/disk values.[/quote] [quote]That's just what we need to hear. You're back in business.[/quote] [quote]The original figure was 7*10^9, i.e. 7 followed by 9 zeros.[/quote] [quote]hi, i've got seven of "pre_helical_bundles_round1_attempt1_SAVE_ALL_OUT_IGNORE_THE_REST_" WU per 2GB computer[/quote] I've fed back that the reduction in the RAM setting seems to have been very successful: I'm not aware of any crashes occurring as a result, and hosts with only 2GB RAM available have been able to download and run them successfully. I've asked for more tasks to be modified to call for less RAM, on the assumption that only a small sample had their setting changed in order to confirm they run ok, and for it to be set up as a daily-running job if that's what needs to be done. I've also added that the same change can be tried on other task types as well, if possible. |
Sid Celery Joined: 11 Feb 08 Posts: 2146 Credit: 41,570,180 RAC: 8,210 |
[quote]Just had a look at one of my systems and all Tasks bar one are presently pre_helical_bundles_round1_ type, and out of all of them, only two have the reduced memory/disk values.[/quote] [quote]That's just what we need to hear. You're back in business.[/quote] [quote]The original figure was 7*10^9, i.e. 7 followed by 9 zeros.[/quote] [quote]hi, i've got seven of "pre_helical_bundles_round1_attempt1_SAVE_ALL_OUT_IGNORE_THE_REST_" WU per 2GB computer[/quote] Once a technical issue is resolved, the balance of pre_helical_bundles tasks will be amended. Hopefully won't be too long. In the meantime, my main PC seems to have had an 'episode' today and crashed out my entire Rosetta cache, to be replaced by 240+ WCG tasks, so I won't have much idea what's going on for a little while. |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1734 Credit: 18,532,940 RAC: 17,945 |
[quote]In the meantime, my main PC seems to have had an 'episode' today and crashed out my entire Rosetta cache, to be replaced by 240+ WCG tasks, so I won't have much idea what's going on for a little while[/quote] Something rather odd happened there - it went to start a Task, but a file was missing from the folder. <core_client_version>7.16.11</core_client_version> <![CDATA[ <message> couldn't start app: Input file database_357d5d93529_n_methyl.zip missing or invalid: file missing</message> ]]> Then a whole bunch of failed downloads when it tried to download a copy of the missing file, but couldn't find it. <core_client_version>7.16.11</core_client_version> <![CDATA[ <message> app_version download error: couldn't get input files: <file_xfer_error> <file_name>database_357d5d93529_n_methyl.zip</file_name> <error_code>-120 (RSA key check failed for file)</error_code> <error_message>signature verification failed</error_message> </file_xfer_error> </message> ]]> Grant Darwin NT |
Sid Celery Joined: 11 Feb 08 Posts: 2146 Credit: 41,570,180 RAC: 8,210 |
[quote]In the meantime, my main PC seems to have had an 'episode' today and crashed out my entire Rosetta cache, to be replaced by 240+ WCG tasks, so I won't have much idea what's going on for a little while[/quote] [quote]Something rather odd happened there - it went to start a Task, but a file was missing from the folder.[/quote] Is that one of my error messages? Yeah, I saw that in my event log too. I think it's the main Rosetta database file that gets downloaded - used with everything. It happened a 2nd time too. It crashes all the tasks but not the PC itself. Main problem is it results in a 24hr backoff that I have to interrupt. I strongly suspect it's caused by my overclock running very high temps. It isn't happening on any of my other devices, so I'm pretty sure it's specific to this one host. Ugh... |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1734 Credit: 18,532,940 RAC: 17,945 |
The In progress and Successes last 24 hours numbers are the lowest they've been in over a week. Over the last day or so it's been all pre_helical_bundles_ Tasks. I just had a look at what's on my system, and as near as i can tell all of them have the improved memory requirement values, but they all still have the extreme storage requirement values. So i suspect the cause for the drop in work being done (at least for now) is due to those storage values alone. A quick check shows 2.8MB as the largest amount of disk space used by a Task in my Task list at this time, but the required value is still set at <rsc_disk_bound>9000000000.000000</rsc_disk_bound> (9GB in decimal units, or roughly 8.4GB in binary units). The most storage space used by Rosetta on my 6c/12t system, which still has multiple versions of Rosetta & Mini Rosetta on it, has been 2.5GB, so i think 2.75GB would be more than enough (certainly no more than 3GB). That will let most people get work without needing to change their default Computing preferences, and not run into actual lack of space issues. (Edit- if the value only needs to reflect the storage space required by the Task itself and not Rosetta as a whole, then 100MB would still be an excessive requirement given the actual amounts used, but still way better than 2.75GB, which is still way better than 8+GB). For those with extreme core count systems (24+) hopefully the owners will be smart enough to realise they'll need more resources than people with considerably fewer cores/threads if they want to use all of them, and adjust their Computing preference settings accordingly. Grant Darwin NT |
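The unit arithmetic behind these bound values can be checked with a short script. This is a minimal sketch, not anything taken from the Rosetta servers: the `<rsc_disk_bound>` element is copied from the post above, while the `<rsc_memory_bound>` element is an assumed reconstruction of the "7*10^9" figure mentioned earlier in the thread, and `bound_in_units` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

MIB = 1024 ** 2  # bytes per binary megabyte
GIB = 1024 ** 3  # bytes per binary gigabyte


def bound_in_units(xml_fragment: str) -> dict:
    """Parse one <rsc_*_bound> element (value in bytes) and convert it."""
    element = ET.fromstring(xml_fragment)
    size_bytes = float(element.text)
    return {
        "tag": element.tag,
        "bytes": size_bytes,
        "MiB": round(size_bytes / MIB, 2),
        "GiB": round(size_bytes / GIB, 2),
        "GB_decimal": round(size_bytes / 1e9, 2),
    }


# Disk bound exactly as quoted in the post above.
disk = bound_in_units("<rsc_disk_bound>9000000000.000000</rsc_disk_bound>")
# Assumed element: rebuilt from the thread's "7*10^9" RAM figure.
ram = bound_in_units("<rsc_memory_bound>7000000000.000000</rsc_memory_bound>")

print(disk)
print(ram)
```

Under these assumptions the 7*10^9 byte figure works out to 6675.72 MB in binary units, which matches the number in the thread title, and the 9*10^9 byte disk bound works out to about 8.38GB.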
Grant (SSSF) Joined: 28 Mar 20 Posts: 1734 Credit: 18,532,940 RAC: 17,945 |
[quote]In progress and Successes last 24hours numbers are the lowest they've been in over a week. Over the last day or so it's been all pre_helical_bundles_ Tasks.[/quote] Looks like that is the case. Several new batches of work have come through with less than half the required disk space value of the pre_helical_bundles_ Tasks, and work In progress is on the rise again. Grant Darwin NT |
Sid Celery Joined: 11 Feb 08 Posts: 2146 Credit: 41,570,180 RAC: 8,210 |
[quote]In the meantime, my main PC seems to have had an 'episode' today and crashed out my entire Rosetta cache, to be replaced by 240+ WCG tasks, so I won't have much idea what's going on for a little while[/quote] [quote]Something rather odd happened there - it went to start a Task, but a file was missing from the folder.[/quote] Hi. I'm back. So, I stopped to dismantle and clean out my whole computer case, fans etc, put it back together and... I lost all video output. Monitor appeared to be working - reporting no signal being received. I tried another monitor from my other PC, which has been out of commission for a few weeks but I never got round to looking at it, and got the same error on that monitor. I feared something terminal had happened to my very old graphics card - a GTX750 from 2013. Into the repair shop while I was due to be working away for a few days. My home one and my other one. First report back - nothing wrong with either of them. So I took over both of my monitors before going away - could be something to do with them. Both returned now. 5800X Ryzen - nothing wrong with it, nor the graphics card, nor the monitor, nor the DVI-D cable. No idea why anything went wrong. i3-8350K - nothing wrong with it, nor the monitor (no graphics card - on-board UHD630 graphics). But, HDMI cable failed - no wonder the second monitor wouldn't work either. Cheapest repair I ever had - the guy refused to take any money from me (but I gave him something anyway - I don't agree with freebies). I'd asked if he had a second-hand graphics card I could swap in - something between a GTX1050 & GTX1650. He's given me a card and told me to try it to see if it's any good. He didn't know what it was (and neither did I) until I checked out the serial number on it. Turns out it's a Radeon 260X. Better than the GTX750 I've got, but not quite good enough to swap out, so I'll be trying it on the i3 as it's miles better than that one's onboard graphics.
All being well with installing it at the end of the week, I'll pay for it, then ask him to look out for a decent 1050-1650 and give me a call if one turns up. I don't have any great need for a graphics card but neither do I like the look of the pricing nor availability on a new one if this GTX750 does fail on me. Also, if I'm forced to go back to using a crappy laptop for any length of time, like I have recently, I may have to shoot myself. I despise them. |
Sid Celery Joined: 11 Feb 08 Posts: 2146 Credit: 41,570,180 RAC: 8,210 |
[quote]i3-8350K - nothing wrong with it, nor the monitor (no graphics card - on-board UHD630 graphics). But, HDMI cable failed[/quote] I've just remembered why this came up in this thread. Video output was lost but that PC was running, so I brought it home to check it out - just never got round to it. With a new monitor cable, I booted it up tonight and, as I originally suspected, I hadn't allocated sufficient RAM nor disk space to BOINC to run Rosetta tasks. And WCG tasks came down to start work straight away, which it's doing. A quick change of settings in the way we've all now learned and it's grabbed its first Rosetta tasks since the end of March, having run its backup project in most of that time. This confirms what I've suspected all along: non-techie users, or people who weren't bothered what they ran as long as their machines were working on something, may not have found a solution, or not cared enough to change what had always worked before, so all those hosts may be permanently lost unless resource demands reduce on the server side by only asking tasks to demand what they require. Anyway, one more host has returned. |
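Why a host can sit idle on Rosetta while still accepting WCG work can be illustrated with a toy eligibility check. This is a rough sketch under stated assumptions, not BOINC's actual scheduler logic: it simply compares a task's declared memory bound against the share of RAM the host's preferences allow. The function names, the 8 GiB host and the 50%/90% preference values are all hypothetical; only the 7*10^9 byte bound and the 2GB hosts come from the thread.

```python
GIB = 1024 ** 3  # bytes per binary gigabyte


def ram_budget_bytes(total_ram_bytes: int, max_used_pct: float) -> float:
    """RAM the client may use, given a 'use at most X% of memory' setting."""
    return total_ram_bytes * max_used_pct / 100.0


def task_fits(bound_bytes: float, budget_bytes: float) -> bool:
    """Simplified rule: a task is only sent if its declared bound fits."""
    return bound_bytes <= budget_bytes


MEMORY_BOUND = 7e9  # the original 7*10^9 byte figure quoted in the thread

# Hypothetical 8 GiB host at two different preference settings.
fits_at_50 = task_fits(MEMORY_BOUND, ram_budget_bytes(8 * GIB, 50))
fits_at_90 = task_fits(MEMORY_BOUND, ram_budget_bytes(8 * GIB, 90))

# A 2 GiB host cannot satisfy the bound even when allowed 100% of its RAM.
fits_2gib = task_fits(MEMORY_BOUND, ram_budget_bytes(2 * GIB, 100))

print(fits_at_50, fits_at_90, fits_2gib)
```

In this simplified model, raising the allowed percentage is enough to bring the hypothetical 8 GiB host back, but no preference change helps a 2 GiB host until the declared bound itself is lowered on the server side.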
mikey Joined: 5 Jan 06 Posts: 1895 Credit: 9,217,610 RAC: 822 |
[quote]i3-8350K - nothing wrong with it, nor the monitor (no graphics card - on-board UHD630 graphics). But, HDMI cable failed[/quote] Next time, install VNC on each PC so you can remote in from that crappy laptop and at least see if things are working. There are other ways too, but that works for me. |
Sid Celery Joined: 11 Feb 08 Posts: 2146 Credit: 41,570,180 RAC: 8,210 |
[quote]i3-8350K - nothing wrong with it, nor the monitor (no graphics card - on-board UHD630 graphics). But, HDMI cable failed[/quote] Sounds like a good idea tbf, now I've looked up what it is. Chances of me actually doing it, zero... |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1734 Credit: 18,532,940 RAC: 17,945 |
[quote]..., as I originally suspected, I hadn't allocated sufficient RAM nor disk space to BOINC to run Rosetta tasks. And WCG tasks came down to start work straight away, which it's doing.[/quote] It's looking like that is the case. I've got just the odd pre_helical_bundles_ Task now, and the ones i've seen all have the reduced RAM requirements (from their previous extreme values). But they're still way more than the Tasks ever need, and that is the case for the other Task types as well. And with the disk requirements still way more than has ever been used, it looks like the project has pretty much lost all of the more RAM or disk space limited systems. These are systems that are capable of doing the work, but due to the excessive Task requirement values the project has gone from 550k systems down to around 440k, which means it has lost roughly 20% of its compute resources. I've never had more than 2.5GB of disk space being used for Rosetta. Apart from the RB Tasks i've never noticed any other Task type use more than 1GB of RAM (some using 800MB, many using 600MB, 400MB and even only 200MB). But because of the high configuration values for Tasks that don't actually use anywhere near those amounts, the project is down 20% of its compute capacity. Edit- I suspect it's the disk values having the biggest impact now the RAM requirements have been reduced from their previous high, but not knowing what the default RAM & disk space values are for BOINC makes it pretty much impossible for this to be anything but a wild arse guess. Grant Darwin NT |
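The "roughly 20%" figure follows directly from the two counts quoted in the post. A minimal check, using only those two numbers (the helper name `fractional_drop` is made up for illustration):

```python
def fractional_drop(before: float, after: float) -> float:
    """Fraction of the starting value that has been lost."""
    return (before - after) / before


# Counts quoted in the post: about 550k before, about 440k now.
drop = fractional_drop(550_000, 440_000)
print(f"{drop:.0%}")  # prints 20%
```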
mikey Joined: 5 Jan 06 Posts: 1895 Credit: 9,217,610 RAC: 822 |
[quote]i3-8350K - nothing wrong with it, nor the monitor (no graphics card - on-board UHD630 graphics). But, HDMI cable failed[/quote] I do it to cut down on the monitors, keyboards and mice in my computer room. It also means fewer times getting up to go check that PC over there. |
Sid Celery Joined: 11 Feb 08 Posts: 2146 Credit: 41,570,180 RAC: 8,210 |
[quote]Edit- I suspect it's the disk values having the biggest impact now the RAM requirements have been reduced from their previous high, but not knowing what the default RAM & disk space values are for BOINC makes it pretty much impossible for this to be anything but a wild arse guess.[/quote] Pretty sure you're right though. Don't remember whether I mentioned this before, but some time last year the Rosetta admin reorganised where the downloaded files were placed and called from on our local machines, reducing the storage space required by Rosetta (and bandwidth on downloads, no doubt). That caused some of us to reduce our allocation for disk space, which subsequently made it worse when the recent files started calling for lots more. That's certainly where I fell down anyway. The changes that've been made up to now are just to RAM, and that proxy I was using of WiP doesn't seem to be reflecting that any more. Maybe it's the Disk demands that are stopping it now. I'm wondering whether to dredge the subject up at this late stage or let it ride now people are used to it. |
Administrator Joined: 23 Oct 14 Posts: 1 Credit: 31,591 RAC: 0 |
good job!! |
©2024 University of Washington
https://www.bakerlab.org