Daily bandwidth usage for Rosetta@home

student_

Joined: 24 Sep 05
Posts: 34
Credit: 1,519,541
RAC: 0
Message 56097 - Posted: 30 Sep 2008, 4:23:12 UTC

For an average computer running Rosetta@home all day, what is the typical bandwidth usage in terms of bytes uploaded and downloaded? Is there a way to estimate usage based on number of workunits completed? This is clearly important for users with ISP-set limits on bandwidth usage and has been briefly covered in Rosetta@home forums before, but it seems like a good idea to revisit the topic for any new information.


dcdc

Joined: 3 Nov 05
Posts: 1605
Credit: 49,318,440
RAC: 32,064
Message 56108 - Posted: 30 Sep 2008, 12:44:42 UTC - in response to Message 56097.  

For an average computer running Rosetta@home all day, what is the typical bandwidth usage in terms of bytes uploaded and downloaded? Is there a way to estimate usage based on number of workunits completed? This is clearly important for users with ISP-set limits on bandwidth usage and has been briefly covered in Rosetta@home forums before, but it seems like a good idea to revisit the topic for any new information.


I couldn't give you any figures, as I've never monitored it, but usage is per transaction, so it depends entirely on the run time you set. Longer run time = less bandwidth. Feet1st has written BOINCproxy for Rosetta, which really reduces bandwidth use if you've got a few computers running...
Feet1st

Joined: 30 Dec 05
Posts: 1740
Credit: 3,444,703
RAC: 798
Message 56116 - Posted: 30 Sep 2008, 15:17:34 UTC
Last modified: 30 Sep 2008, 15:27:45 UTC

I have detailed stats from June through September for my 4 hosts. I typically have them set with a runtime preference of 24hrs, and about a day of cache. You will find that caching more tasks will also reduce your bandwidth slightly.

I generally run Rosetta 24x7; some hosts also have a 10% resource share to Ralph.

You can use any caching proxy, such as Squid. Mine has other features that I wanted, such as redirecting hosts to a URL of my choice for fetching their files.

I still have a few quirks and retries, and I have no stats on when my hosts go back to the project because of a failure on my proxy. I also don't proxy file uploads, so I don't have any numbers on those. Uploads are generally pretty minor (about 100K per task) compared to downloads. The numbers also vary dramatically: if a new application is released several times in a month, or if there is a wide range of task types, you end up with lots of downloads. Finally, I do not track specific tasks processed (yet), just the actual downloads required and the scheduler requests.

Let me show you my hosts and what my stats tell me (per month):

684483 min 70MB, max 340MB
684744 min 190MB, max 300MB
715317 min 151MB, max 490MB
836911 min 523MB, max 1049MB

This last one has 8 CPUs.

Seemed like June was a big month for downloads. And, in general, my average for each host was near the minimum figure above.
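
For anyone who wants the monthly figures above as daily numbers, here is a minimal Python sketch. The 30-day month is an assumption, and the names MONTHLY_MB and daily_range_mb are made up for illustration; only the MB figures come from the stats above.

```python
# Monthly download figures in MB (min, max) per host ID, copied from the post.
MONTHLY_MB = {
    684483: (70, 340),
    684744: (190, 300),
    715317: (151, 490),
    836911: (523, 1049),  # the 8-CPU host
}

DAYS_PER_MONTH = 30  # assumption: the post does not say how long each month was


def daily_range_mb(monthly_min, monthly_max, days=DAYS_PER_MONTH):
    """Spread a monthly (min, max) download range evenly over `days` days."""
    return monthly_min / days, monthly_max / days


if __name__ == "__main__":
    for host, (low, high) in MONTHLY_MB.items():
        day_low, day_high = daily_range_mb(low, high)
        print(f"{host}: {day_low:.1f} to {day_high:.1f} MB/day")
```

On these assumptions the 8-CPU host works out to roughly 17 to 35 MB per day, and the single-host machines to somewhere between about 2 and 16 MB per day. Real days will be lumpier than an even split suggests.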

If you use a proxy server, you can reduce the above significantly, even for a single host. There is only about 1-3MB per month received in replies from the scheduler. And the rest of the files have a high hit rate (over 50%) in a cache from any other active host. So, in rough terms, you could add 3 or 4 hosts through the same proxy and only double your bandwidth as compared to a single host. And after that, I would guess you could add about 10-20 more hosts before it would triple what a single host requires.
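
The arithmetic behind that can be sketched roughly as follows. This is a simplified model, not how BOINCproxy actually accounts for traffic: it assumes the hit rate applies to bytes rather than to requests, that scheduler replies are never served from cache, and that the scheduler traffic is the upper figure of 3 MB per month. The function name origin_mb is hypothetical.

```python
def origin_mb(requested_mb, hit_rate, scheduler_mb=3.0):
    """Estimate MB per month that still has to come from the project servers.

    requested_mb: MB of files the hosts ask the proxy for in a month.
    hit_rate:     fraction of those bytes served from the proxy cache (0 to 1).
    scheduler_mb: scheduler replies, assumed to bypass the cache entirely.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return requested_mb * (1.0 - hit_rate) + scheduler_mb


if __name__ == "__main__":
    # Example: hosts requesting 300 MB in a month with a 50% hit rate.
    print(origin_mb(300, 0.5))
```

With a 50% hit rate, 300 MB of requests turns into about 153 MB actually pulled from the project. The hit rate itself depends on the mix of work, so treat any single value as a guess.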

Your mileage will vary.
If having a DC project with BOINC is of interest to you, with volunteer or cloud computing resources, but have no time for the BOINC learning curve,
use a hosting service that understands BOINC projects: http://DeepSci.com
student_

Joined: 24 Sep 05
Posts: 34
Credit: 1,519,541
RAC: 0
Message 56124 - Posted: 30 Sep 2008, 19:09:48 UTC - in response to Message 56116.  


Let me show you my hosts and what my stats tell me (per month):

684483 min 70MB, max 340MB
684744 min 190MB, max 300MB
715317 min 151MB, max 490MB
836911 min 523MB, max 1049MB


Since you're using a proxy, do those figures reflect the bandwidth that Rosetta@home uses for the average user (who probably is not behind a proxy)?


Longer run time = less bandwidth.


How does increasing the time a work unit is allowed to run decrease bandwidth usage?
Feet1st

Joined: 30 Dec 05
Posts: 1740
Credit: 3,444,703
RAC: 798
Message 56127 - Posted: 30 Sep 2008, 19:48:16 UTC

I gather stats on files my clients are getting from the proxy as well as those that are not found on my proxy. So the figures I presented are my closest information about what a single stand-alone client requires.

The effect of preferred runtime is a bit ambiguous. You have the choice of just about any value between 1hr and 24hrs for each task. If you only need one task per day to keep your CPU busy, then you only have to download the files associated with one task. If you ran through 24 tasks each day, you'd have to download files associated with 24 tasks. But... many of the files are the same... and keeping a list of tasks on deck, ready to process, means you've got all their files. So, when you make your next scheduler request, you've already got a large set of files available and might be able to avoid downloading them.

So, it's actually possible that the larger list of tasks you maintain with a short runtime results in less downloaded overall, because you are keeping the files required for a larger number of different proteins around.

About the only thing we can say for certain is that 24hr tasks tend to make you perform fewer scheduler requests. So a client running a 24hr runtime preference tends to be friendlier to the project scheduler by not hitting it as many times each day/month. But you can't say definitively how the total bandwidth is affected. It just depends on the mix of work they are sending out.
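
The part that can be estimated from the number of tasks is the task count itself and the upload side. Here is a minimal Python sketch, assuming the CPU crunches Rosetta 24x7, every task runs for exactly the preferred runtime, and uploads are a flat 100 KB per task (the rough "100K" figure mentioned earlier in the thread). The function names are made up for illustration, and download volume is deliberately left out because it depends on the mix of work.

```python
UPLOAD_KB_PER_TASK = 100  # assumption: the rough per-task upload size from the thread


def tasks_per_day(runtime_hours, cores=1):
    """Tasks finished per day if each core runs back-to-back tasks 24x7."""
    if not 1 <= runtime_hours <= 24:
        raise ValueError("runtime preference is expected to be between 1 and 24 hours")
    return cores * 24 / runtime_hours


def daily_upload_kb(runtime_hours, cores=1):
    """Rough upload volume per day, in KB, under the flat per-task assumption."""
    return tasks_per_day(runtime_hours, cores) * UPLOAD_KB_PER_TASK


if __name__ == "__main__":
    for hours in (1, 8, 24):
        print(hours, tasks_per_day(hours), daily_upload_kb(hours))
```

So a single core goes from 24 tasks per day at a 1hr runtime down to 1 task per day at 24hrs, and uploads shrink in proportion. Each finished task also has to be reported to the scheduler, which is why longer runtimes mean fewer scheduler requests.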



©2018 University of Washington
http://www.bakerlab.org