Possible bug in the "Average Processing Rate" calculation.

Bryn Mawr Joined: 26 Dec 18 Posts: 440 Credit: 15,189,162 RAC: 1,742
Message 95996 - Posted: 4 May 2020, 8:57:10 UTC - in response to Message 95942.
The claimed credit is described here https://boinc.berkeley.edu/trac/wiki/CreditNew
If this is taking up too much time / has been done to death previously then please feel free to ignore me, but two things confuse me:
- The reliance on wu.fpops.est which, for Rosetta, appears to be a fixed 80,000 regardless of preferred run time.
- The statement “Then the average credit per job should be the same for all hosts.” Surely in Rosetta, where the amount of work done by a host on a given WU is not fixed, this is not true.

Tomcat雄猫 Joined: 20 Dec 14 Posts: 180 Credit: 5,390,659 RAC: 0
Message 95997 - Posted: 4 May 2020, 9:38:00 UTC - in response to Message 95996. Last modified: 4 May 2020, 9:38:36 UTC
> The claimed credit is described here https://boinc.berkeley.edu/trac/wiki/CreditNew
> If this is taking up too much time / has been done to death previously then please feel free to ignore me, but two things confuse me:
> - The reliance on wu.fpops.est which, for Rosetta, appears to be a fixed 80,000 regardless of preferred run time.
> - The statement “Then the average credit per job should be the same for all hosts.” Surely in Rosetta, where the amount of work done by a host on a given WU is not fixed, this is not true.
Well, to be fair, that was the whole point behind this and another thread. This thread mainly deals with the issue that, since the estimated computation size for Rosetta tasks is fixed at 80,000 GFLOPs, the average processing rate appears to be inversely proportional to the target runtime. The other thread deals with the observation that very long-running tasks have a tendency to return absurdly low credits, probably linked to the thing you're talking about.
I hope the updated validator fixes the credit side of things. At least now I can rest assured that the issue is not my main rig.
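
To illustrate the inverse relationship described in this post, here is a minimal sketch. It assumes the apparent rate is simply the fixed work estimate divided by elapsed time, and that a task runs for exactly its target runtime; the real server-side averaging is more involved. The function and constant names are hypothetical, not BOINC identifiers.

```python
# Hypothetical sketch: a fixed work estimate divided by a variable runtime.
# Assumption: the 80,000 figure from the thread is a per-task estimate in GFLOPs.
FIXED_WORK_ESTIMATE_GFLOP = 80_000


def apparent_rate_gflops(target_runtime_hours: float) -> float:
    """Apparent processing rate = estimated work / elapsed time (simplified)."""
    elapsed_seconds = target_runtime_hours * 3600
    return FIXED_WORK_ESTIMATE_GFLOP / elapsed_seconds


# The same host looks "slower" the longer its target runtime is set.
for hours in (2, 8, 24):
    print(f"{hours:>2} h target -> {apparent_rate_gflops(hours):.2f} GFLOPS apparent")
```

Under these assumptions, quadrupling the target runtime cuts the apparent rate to a quarter even though the hardware is unchanged, which matches the pattern reported above.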

Tomcat雄猫 Joined: 20 Dec 14 Posts: 180 Credit: 5,390,659 RAC: 0
Message 95998 - Posted: 4 May 2020, 9:44:02 UTC - in response to Message 95973. Last modified: 4 May 2020, 9:44:21 UTC
> Sorry, that limit should be increased on Ralph@h for new jobs now. FYI, I updated both the scheduler and the validator on R@h. So hopefully these updates will address the job cache and the crediting issues that Tomcat has highlighted.
Ah, thanks a lot. All my machines are back online acrunchin', now that I've changed the URL.

Mod.Sense Volunteer moderator Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0
Message 96021 - Posted: 4 May 2020, 14:15:10 UTC - in response to Message 95996.
> The statement “Then the average credit per job should be the same for all hosts.” Surely in Rosetta, where the amount of work done by a host on a given WU is not fixed, this is not true.
I believe the statement you were quoting is talking about a hypothetical, fixed-size WU with an exact number of FPOPS required to complete it. R@h uses these credit claims to build up an average of credit claimed PER MODEL, and then awards credit based on the number of completed models, not on a per-WU basis.
The idea of both CreditNew and the R@h method of granting credit is that credit is NOT based on how long you run a work unit; rather, it is based on how much useful work (models) you produce. You might run the WU for half the time of another host, but your CPU is twice as fast. If both hosts produce 15 models on the same batch of work, then both get the same credit.
Rosetta Moderator: Mod.Sense
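
The per-model scheme described in this post could be sketched roughly as follows. This is only an illustration under stated assumptions: a plain mean over per-model claims is assumed, the claim values are made up, and the function names are hypothetical. How R@h actually accumulates and weights the average is not specified in the thread.

```python
from statistics import mean


def average_claim_per_model(claims):
    """Mean claimed credit per model over (claimed_credit, models) pairs.

    Assumption: a simple unweighted mean; the real averaging is not documented here.
    """
    return mean(credit / models for credit, models in claims if models > 0)


def granted_credit(models_completed, per_model_average):
    """Credit granted for a result: models produced times the per-model average."""
    return models_completed * per_model_average


# Two hypothetical hosts on the same batch, each producing 15 models.
# Their claims differ, but the grant depends only on the model count.
batch_claims = [(300.0, 15), (330.0, 15)]
per_model = average_claim_per_model(batch_claims)
slow_host = granted_credit(15, per_model)
fast_host = granted_credit(15, per_model)
print(per_model, slow_host, fast_host)
```

In this sketch, runtime never enters the grant: a host that needs twice as long to produce the same 15 models receives the same credit as the faster one.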

wolfman1360 Joined: 18 Feb 17 Posts: 73 Credit: 19,103,702 RAC: 0
Message 96082 - Posted: 5 May 2020, 1:48:54 UTC - in response to Message 96021.
>> The statement “Then the average credit per job should be the same for all hosts.” Surely in Rosetta, where the amount of work done by a host on a given WU is not fixed, this is not true.
> I believe the statement you were quoting is talking about a hypothetical, fixed-size WU with an exact number of FPOPS required to complete it. R@h uses these credit claims to build up an average of credit claimed PER MODEL, and then awards credit based on the number of completed models, not on a per-WU basis.
> The idea of both CreditNew and the R@h method of granting credit is that credit is NOT based on how long you run a work unit; rather, it is based on how much useful work (models) you produce. You might run the WU for half the time of another host, but your CPU is twice as fast. If both hosts produce 15 models on the same batch of work, then both get the same credit.
Thank you! This put it into language I can understand. My RAC is increasing daily, so I'm not going to complain. 12-hour runtimes seem to be doing just fine, though I'm still going to have aborted WUs. But they seem to be slowly getting more accurate as time goes on.

Tomcat雄猫 Joined: 20 Dec 14 Posts: 180 Credit: 5,390,659 RAC: 0
Message 96116 - Posted: 5 May 2020, 16:06:26 UTC Last modified: 5 May 2020, 16:17:41 UTC
Oh look! Whatever was changed, the APR calculations seem much better now.
https://boinc.bakerlab.org/rosetta/host_app_versions.php?hostid=3208606
Before and after the change. (My phone submitted that lone 4.20 task after the change.)
https://boinc.bakerlab.org/rosetta/host_app_versions.php?hostid=2263735
Slowly rising back up to sane values again.
https://boinc.bakerlab.org/rosetta/host_app_versions.php?hostid=3754624
Same over here.
https://boinc.bakerlab.org/rosetta/host_app_versions.php?hostid=3684192
In the case of hosts with short runtimes, it's dropping back to saner values. Wait a minute, why is it only slightly higher than my phone? Wait, no. It's my phone's value that seems way too high... Whatever, it still appears to be a huge improvement.
Credit-wise, it seems much more stable, but we need a larger sample size.

Admin Project administrator Joined: 1 Jul 05 Posts: 5146 Credit: 0 RAC: 0
Message 96148 - Posted: 6 May 2020, 5:36:22 UTC
Looks like my updates are helping, great.

Bryn Mawr Joined: 26 Dec 18 Posts: 440 Credit: 15,189,162 RAC: 1,742
Message 96155 - Posted: 6 May 2020, 8:03:23 UTC - in response to Message 96148.
> Looks like my updates are helping, great.
Always :-)

Tomcat雄猫 Joined: 20 Dec 14 Posts: 180 Credit: 5,390,659 RAC: 0
Message 96273 - Posted: 8 May 2020, 18:37:16 UTC - in response to Message 96148. Last modified: 8 May 2020, 18:39:03 UTC
> Looks like my updates are helping, great.
Just got another full broadside of tasks in. The credit per task is stable, and APR has remained sane. I think this has been completely fixed. Thanks!