Message boards : Number crunching : Problems and Technical Issues with Rosetta@home
Author | Message |
---|---|
Sid Celery Joined: 11 Feb 08 Posts: 2141 Credit: 41,525,460 RAC: 10,413 |
"The first of the new tasks has just finished; it took 4 hours to run the 1 decoy for me, and these were definitely running under an hour previously. If you have your runtime set to 4 hours you won't really notice the difference in time, but I'm more concerned with the actual work being done by the program. If points are an accurate indication, then with 4.07 I was running at an average of 300 pts per hour per core; this just-finished task has returned 300 points in 4 hours, which ties in with my thinking that they are not running efficiently." Are you sure? Looks more like 75/core/hr in the past to me. Sometimes 50. Also, new versions take a little while to get their scoring sorted out, iirc. Looks like it started at 150/4hrs and has risen to nearer 300 now. But this isn't my strong suit. Anyway, I only chimed in because I'd be happy with 8 or 16 WUs atm. 11 now here on my 8-core, but still nothing for my two 4-core machines. 60 would be a dream. |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1725 Credit: 18,382,444 RAC: 19,446 |
Edit: Well, it was nice while it lasted. Gone from 4 times as much down to 2 times as much, so back on par with tasks that run for normal target times. Grant Darwin NT |
nastasache Joined: 24 Feb 07 Posts: 16 Credit: 171,383 RAC: 0 |
Thanks a lot, Robert. I changed all settings to use 99% of RAM (previously 90% by default and 50% for the other setting), and 1% of swap. There are no out-of-memory errors for now, but memory usage stays as before. For 12 tasks, the total memory usage is about 6 GB. It looks like R@H is using less memory per task than the maximum available to a 32-bit app. Here is a task with max memory usage: Application: Rosetta 4.12; Name: 4dy3ga3h_jhr_design1_COVID-19_SAVE_ALL_OUT_903392_1; State: Running; Received: 2020-04-01 21:33:01; Report deadline: 2020-04-09 21:33:00; Estimated computation size: 80,000 GFLOPs; CPU time: 08:11:40; CPU time since checkpoint: 00:04:37; Elapsed time: 15:34:17; Estimated time remaining: 2d 05:56:33; Fraction done: 22.400%; Virtual memory size: 1.12 GB; Working set size: 1.14 GB; Directory: slots/2; Process ID: 14460; Progress rate: 2.520% per hour; Executable: rosetta_4.12_windows_intelx86.exe. Btw, a task takes about 2-3 days to finish, from an initial 4-hour estimate; is that normal? Iulian |
strongboes Joined: 3 Mar 20 Posts: 27 Credit: 5,394,270 RAC: 0 |
See below. There are no 4.07 tasks left showing; there were 9000 yesterday and only 400 today. The mini was taking around an hour, but it gives an idea. The 4.07 tasks were averaging a 40 min runtime, at a rate of 1 credit per 11.5 secs of runtime on average: 3600/11.5 = 313 credits per hour. The last 4.12 is running at 1 credit per 59.95 seconds of runtime, about 5.2 times slower. https://boinc.bakerlab.org/rosetta/results.php?hostid=3800945&offset=340&show_names=0&state=4&appid= |
JoshuaScholar Joined: 26 Mar 20 Posts: 18 Credit: 232,183 RAC: 0 |
I know this affects so few people that it won't matter much, but: I have an older 2-socket Xeon system (Sandy Bridge era E5-2690s). Let me tell you what DOESN'T work properly with the Windows client on my Windows 10 Pro setup: 1) NUMA. With two sockets, the most common way to run Windows is with each processor preferentially accessing the memory that's directly attached to it. This is called NUMA, and it's slightly faster. But with NUMA enabled, the client picks the proper number of threads, as if it's going to use both sockets, but then it runs all of the threads on only ONE of the sockets. 2) Hyperthreading with NUMA off. [NUMA off is called "uniform memory access", by the way.] With NUMA off and hyperthreading enabled, the client creates the right number of threads for using both sockets, BUT it allocates both threads to the SAME hyperthread in each core. So each core has one empty hyperthread and one hyperthread shared by two threads. So on this old 2-socket Xeon system running Windows 10 Pro, the only efficient way to run the BOINC client is to turn off NUMA and also turn off hyperthreading. Then it works properly. On a machine this old, on a highly parallel workload, turning off hyperthreading is about a 20% throughput hit. On a newer processor it would be a greater hit. I'm not sure if there's any real hit from turning off NUMA, but it isn't a big one. Josh Scholar |
nastasache Joined: 24 Feb 07 Posts: 16 Credit: 171,383 RAC: 0 |
Hi, especially @Grant (SSSF). Where am I going wrong? I need twice as much time to finish tasks and get 50% of the GFLOPS on a similar i7-8700K CPU. Compare: - https://boinc.bakerlab.org/rosetta/host_app_versions.php?hostid=3933928 - https://boinc.bakerlab.org/rosetta/host_app_versions.php?hostid=3914491 Thanks in advance. |
robertmiles Joined: 16 Jun 08 Posts: 1234 Credit: 14,338,560 RAC: 2,014 |
strongboes, "[snip] I'm saying it doesn't look productive because the decoys are taking approximately 4 to 6 times longer to process. If you watch the graphics, it gets to a certain number of steps and then almost stops, taking 30-60 minutes for each additional step." You are assuming that each decoy does an equal amount of work, and that each step does an equal amount of work. I don't expect that to be true. Generally, the first decoy is only for checking that your computer works correctly and is the same every time. The second decoy starts the useful work. |
robertmiles Joined: 16 Jun 08 Posts: 1234 Credit: 14,338,560 RAC: 2,014 |
One thing to watch for when using CPUs with especially high numbers of cores: the bandwidth from the CPU to the memory may not be adequate to run all of the cores very well. This could leave each core in use waiting for access to memory most of the time. If so, it can be useful to reduce the number of cores BOINC is allowed to use and see if that speeds up the work enough to more than compensate for fewer cores in use. |
JoshuaScholar Joined: 26 Mar 20 Posts: 18 Credit: 232,183 RAC: 0 |
That might be because of the bugs I noticed. Make sure that every thread is really allocated to its own hyperthread, because BOINC doesn't leave it up to the OS. |
strongboes Joined: 3 Mar 20 Posts: 27 Credit: 5,394,270 RAC: 0 |
"One thing to watch for when using CPUs with especially high numbers of cores - the bandwidth from the CPU to the memory may not be adequate to run all of the cores very well. This could leave each core in use waiting for access to memory most of the time." If you read previous posts you will see that I'm not hyperthreading and have a large L3 cache and RAM; I tried running just 10 cores as well. It isn't that. They run roughly 4 times slower than 4.07 if they start with rb. It will be obvious soon enough. |
JoshuaScholar Joined: 26 Mar 20 Posts: 18 Credit: 232,183 RAC: 0 |
Oh, you're right. I just looked at my task list. Time per WU has jumped from 8 hours to 16 hours! The cores are running cooler than with the last version too, which suggests a bottleneck. Note 2: I just noticed that the most recent few are fast again. Maybe there was just a run of WUs for a harder problem. |
robertmiles Joined: 16 Jun 08 Posts: 1234 Credit: 14,338,560 RAC: 2,014 |
A typical cause here for harder problems is larger proteins. |
Sid Celery Joined: 11 Feb 08 Posts: 2141 Credit: 41,525,460 RAC: 10,413 |
"see below, there are no 4.07 tasks left showing, there was 9000 yesterday only 400 today, the mini was taking around an hour but gives an idea. the 4.07 were averaging a 40 min runtime, with a rate of 1 credit for 11.5 secs of runtime on average. 3600/11.5 = 313" I didn't look back that far earlier. What I notice now is that starting today, 2-Apr, the scoring for mini-Rosetta has plunged to 75/hr, down from 300/hr, and 4.12 tasks are at 300/4hr, so 75/hr too. It looks like something has happened to *all* scoring from today, a step change down, but consistent between the two on validation. Very odd. |
Sid Celery Joined: 11 Feb 08 Posts: 2141 Credit: 41,525,460 RAC: 10,413 |
"see below, there are no 4.07 tasks left showing, there was 9000 yesterday only 400 today, the mini was taking around an hour but gives an idea. the 4.07 were averaging a 40 min runtime, with a rate of 1 credit for 11.5 secs of runtime on average. 3600/11.5 = 313" Oh, you're not going to like this... I've just checked my own PC to see how my dribble of tasks has performed on a mere FX8370. 1 Apr: Mini & 4.12 tasks around 45/hr, 280-340 per 8hr task. Better than I usually get, tbh. 2 Apr: Mini only (4.12 not reported yet), 110-120/hr, 890-950 per 8hr task. Lol. Nothing I can say to that... |
entity Joined: 8 May 18 Posts: 19 Credit: 6,122,942 RAC: 5,437 |
"Oh you're right." This is a known problem in Rosetta that the developers have acknowledged but probably haven't fixed yet. They indicated that it would take a major rewrite of the code. The L3 cache tends to become over-utilized and the CPU waits for data to make the trip from main memory, hence the CPU runs cooler (more waiting). There was a post by a developer in another project that suggested limiting the number of tasks run concurrently. They indicated that each task uses about 4 MB of L3 cache. Concerning the run time, I noticed that the run parameters include something like cpu_seconds=57500. That is about 16 hours. They are ignoring the Target CPU runtime setting. |
Stephen "Heretic" Joined: 2 Apr 20 Posts: 21 Credit: 11,028 RAC: 0 |
"Hello, I have just joined this project but it seems there is no work to do at the moment. Is this a common state of affairs or have I struck a bad moment to join??" "Work being done has increased by 500% over the last 2 and a bit weeks, so there's not much work available as demand is far exceeding supply." . . I'm guessing fellow refugees from S@H ... oh well, I'll just have to be patient ... Stephen :( |
Mod.Sense Volunteer moderator Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
I've tried to summarize the new work unit runtimes in a new thread. Please post concerns about "performance" of the new v4.12, or about estimated time to completion, over there. Rosetta Moderator: Mod.Sense |
BetelgeuseFive Joined: 10 Aug 10 Posts: 4 Credit: 1,443,980 RAC: 382 |
I'm having a problem with 4.12 on Linux (CentOS 7). I found out my computer was doing nothing while there were plenty of tasks "Ready to start". I first rebooted the system, but this did not change anything. I then enabled cpu_sched_debug in the event log, and the messages indicated it was trying to start v4.12 tasks, but nothing actually started. When I suspended the v4.12 tasks, other v4.08 tasks started immediately without any problems. Any clues? Thanks, Tom |
Mod.Sense Volunteer moderator Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
How much memory have you allowed BOINC to use, when active? when idle? Rosetta Moderator: Mod.Sense |
BetelgeuseFive Joined: 10 Aug 10 Posts: 4 Credit: 1,443,980 RAC: 382 |
"How much memory have you allowed BOINC to use, when active? when idle?" The system has 6 GB configured (running inside a VM). I just checked the settings; it has: When in use, use at most 50%. When not in use, use at most 90%. That should have been plenty to start at least one task. |
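
As a rough check of two figures quoted earlier in the thread (strongboes's 1 credit per 11.5 s for 4.07 versus 1 credit per 59.95 s for 4.12, and the cpu_seconds=57500 value that entity mentions), here is a minimal Python sketch. The function and constant names are made up for illustration and are not part of BOINC or Rosetta; only the three input numbers come from the posts.

```python
# Figures taken from the posts above; everything else is illustrative.
SECONDS_PER_CREDIT_4_07 = 11.5    # strongboes: average for 4.07 tasks
SECONDS_PER_CREDIT_4_12 = 59.95   # strongboes: last 4.12 task
CPU_SECONDS_PARAMETER = 57500     # entity: value seen in the run parameters


def credits_per_hour(seconds_per_credit: float) -> float:
    """Convert 'seconds of runtime per credit' into credits per hour."""
    return 3600.0 / seconds_per_credit


def slowdown(old_seconds_per_credit: float, new_seconds_per_credit: float) -> float:
    """How many times longer the new version takes to earn one credit."""
    return new_seconds_per_credit / old_seconds_per_credit


def seconds_to_hours(seconds: float) -> float:
    """Convert a duration in seconds to hours."""
    return seconds / 3600.0


if __name__ == "__main__":
    print(f"4.07: {credits_per_hour(SECONDS_PER_CREDIT_4_07):.0f} credits/hour")
    print(f"4.12: {credits_per_hour(SECONDS_PER_CREDIT_4_12):.0f} credits/hour")
    ratio = slowdown(SECONDS_PER_CREDIT_4_07, SECONDS_PER_CREDIT_4_12)
    print(f"4.12 takes {ratio:.1f}x as long per credit")
    print(f"cpu_seconds={CPU_SECONDS_PARAMETER} is "
          f"{seconds_to_hours(CPU_SECONDS_PARAMETER):.1f} hours")
```

With these inputs the rates come out at roughly 313 and 60 credits per hour, and 57500 seconds is just under 16 hours. This only compares credit per unit of runtime; as noted in the thread, credit scoring for a new version can shift, so it is not a direct measure of scientific work done.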
©2024 University of Washington
https://www.bakerlab.org