Internet traffic and necessary data

Carlos_Pfitzner Joined: 22 Dec 05 Posts: 71 Credit: 138,867 RAC: 0	Message 10738 - Posted: 13 Feb 2006, 21:15:23 UTC Last modified: 13 Feb 2006, 21:22:18 UTC

> or do as SETI Beta are doing, and re-write the app so that it uses a basic method by default, but uses SSE when it detects that the processor is capable of handling that instruction set. Best of both worlds then, because I'm sure you'd get a lot of people complaining that Rosetta no longer runs on their older computers. Besides, Rosetta was still seeking more processing power last time I checked, so excluding hosts is a bad thing, especially if the app is just going to error out on a non-compatible host.

Here is some simple code to add to Rosetta to verify which instruction sets each CPU supports, so you can be sure the app will not crash when executing SSE. You will need to adapt it, e.g.:

    if the CPU is SSE capable, go to the sse_crunch routine
    else go to the non_sse_crunch routine

and maybe you will need both an sse_crunch and an sse2_crunch routine ...

*sure, no PC will get any app crash -:)

    //
    // chkcpu.c
    //
    // Check cpu extensions for Intel compatible cpu
    //
    // Tetsuji "Maverick" Rai

    #include <stdio.h>

    // 32-bit x86 only (GCC inline asm); build with e.g. gcc -m32 chkcpu.c
    int main(void)
    {
        unsigned long _ecx, _edx, _init_flags, _mod_flags;

        // Try to flip the ID bit (bit 21) of EFLAGS; if it changes, cpuid exists.
        __asm__ volatile (
            "pushf; pop %%eax; mov %%eax, %0;"
            "xor $0x200000, %%eax; push %%eax; popf;"
            "pushf; pop %%ebx; mov %%ebx, %1;"
            : "=m"(_init_flags), "=m"(_mod_flags)
            :
            : "eax", "ebx", "cc");

        printf("init flag = %08lx modified flags = %08lx\n", _init_flags, _mod_flags);
        if (!((_init_flags ^ _mod_flags) & 0x200000)) {
            printf("cpuid isn't available\n");
            return 1;
        }
        printf("\nok cpuid is available\n");

        // cpuid leaf 1 returns the feature flags in ECX and EDX.
        __asm__ volatile (
            "xor %%eax,%%eax; inc %%eax; cpuid; mov %%ecx,%0; mov %%edx,%1;"
            : "=m"(_ecx), "=m"(_edx)
            :
            : "eax", "ebx", "ecx", "edx");

        if (_edx & 0x8000) {            // EDX bit 15
            printf("cmov : Yes\n");
        } else {
            printf("cmov : No\n");
        }
        if (_edx & 0x02000000) {        // EDX bit 25
            printf("sse : Yes\n");
        } else {
            printf("sse : No\n");
        }
        if (_edx & 0x04000000) {        // EDX bit 26
            printf("sse2 : Yes\n");
        } else {
            printf("sse2 : No\n");
        }
        if (_ecx & 0x1) {               // ECX bit 0
            printf("sse3 : Yes\n");
        } else {
            printf("sse3 : No\n");
        }
        return 0;
    }

Lee Carre Joined: 6 Oct 05 Posts: 96 Credit: 79,331 RAC: 0	Message 10739 - Posted: 13 Feb 2006, 21:19:12 UTC - in response to Message 10708.

>> Different compression methods will only be adopted if they work across all platforms; if they don't, then they're not appropriate for BOINC use.

> Well, that's why I suggested bzip2 as it's an open-source, plug-in replacement for gzip. Works on all platforms.

Great stuff, perhaps suggest it on the BOINC dev mailing list. If it achieves consistently greater compression ratios, it'll help everyone :)

> Agreed, but as you correctly said, more processing power, NOT necessarily more HOSTS. Have a look at CPU stats.

True, but if you compare processing rate (using something like TeraFLOPS) against number of hosts, you'll get a positive correlation (more hosts = more processing).

> Personally, I'd be happy with offering a beta SSE-enabled Rosetta executable as an optional install, like the optimised BOINC apps many people install.

Now that's an idea, but obviously to get the most benefit for the cost you might as well deploy an app that will do it automatically; that's the best cost:benefit ratio. As a half-way thing then yeah, a separate app would probably help, but you'd need quite a lot of people using the optimised version to notice an improvement.

Lee Carre Joined: 6 Oct 05 Posts: 96 Credit: 79,331 RAC: 0	Message 10740 - Posted: 13 Feb 2006, 21:25:46 UTC - in response to Message 10738.

> Here is some simple code to add to Rosetta to verify which instruction sets each CPU supports, so you can be sure the app will not crash when executing SSE. You will need to adapt it, e.g.: if the CPU is SSE capable, go to the sse_crunch routine, else go to the non_sse_crunch routine. And maybe you will need both an sse_crunch and an sse2_crunch routine ...
> *sure, no PC will get any app crash -:)

I'm no programmer, so forgive the newbie question. I understand the instruction set selection method, but how hard would it be to have different versions of the routines in the same app? Would there be a lot of work involved, or is it quite simple?
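For illustration only, here is a minimal sketch of how one binary could carry two versions of a routine and choose between them at startup. It is not Rosetta's code: the names sse_crunch and non_sse_crunch are borrowed from the post above, the sum-of-squares workload and the crunch_fn type are made up, and the CPU check relies on the GCC/Clang builtin __builtin_cpu_supports rather than hand-written cpuid assembly.

```c
#include <stddef.h>

/* Hypothetical signature shared by every version of the routine. */
typedef double (*crunch_fn)(const double *data, size_t n);

/* Portable fallback: plain C, runs on any CPU. */
double non_sse_crunch(const double *data, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += data[i] * data[i];
    return sum;
}

/* Placeholder for an SSE-optimised version. A real one would use
   intrinsics or assembly, but it must return the same result. */
double sse_crunch(const double *data, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += data[i] * data[i];
    return sum;
}

/* Pick the implementation once, at startup, and call it through
   the returned pointer from then on. */
crunch_fn select_crunch(void)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("sse"))
        return sse_crunch;
#endif
    return non_sse_crunch;  /* non-x86 or no SSE: safe default */
}
```

The selection itself is a few lines; the real work would be writing and testing the optimised routine so that it gives the same answers as the basic one.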

dcdc Joined: 3 Nov 05 Posts: 1833 Credit: 123,905,237 RAC: 22,698	Message 10782 - Posted: 15 Feb 2006, 16:34:10 UTC

Can anyone tell me roughly how much bandwidth Rosetta will use, post installation, on a Sempron 2600+ machine that is on ~3hrs a day, running XP? I can probably add this machine, but it's on capped broadband so the bandwidth is all important.

cheers
Danny

SwZ Joined: 1 Jan 06 Posts: 37 Credit: 169,775 RAC: 0	Message 10783 - Posted: 15 Feb 2006, 16:37:57 UTC - in response to Message 10782.

> Can anyone tell me roughly how much bandwidth Rosetta will use, post installation, on a Sempron 2600+ machine that is on ~3hrs a day, running XP? I can probably add this machine, but it's on capped broadband so the bandwidth is all important.
> cheers
> Danny

About 10Mb per day

dcdc Joined: 3 Nov 05 Posts: 1833 Credit: 123,905,237 RAC: 22,698	Message 10787 - Posted: 15 Feb 2006, 21:28:23 UTC - in response to Message 10783.

>> Can anyone tell me roughly how much bandwidth Rosetta will use, post installation, on a Sempron 2600+ machine that is on ~3hrs a day, running XP? I can probably add this machine, but it's on capped broadband so the bandwidth is all important.
>> cheers
>> Danny

> About 10Mb per day

Cheers. Unfortunately, I think it'd need to be less than 100MB/month to be viable on that machine.
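The arithmetic behind that reply can be sketched as follows. The thread does not say whether "10Mb" means megabytes or megabits, so both readings are shown; the 30-day month and the function names are assumptions made for the example.

```c
/* Megabytes transferred in a month, given a daily figure. */
double monthly_mb(double mb_per_day, int days_in_month)
{
    return mb_per_day * days_in_month;
}

/* Convert megabits to megabytes (8 bits per byte). */
double megabits_to_megabytes(double megabits)
{
    return megabits / 8.0;
}

/* Returns 1 if the monthly total stays within the cap, 0 otherwise. */
int fits_cap(double mb_per_day, int days_in_month, double cap_mb)
{
    return monthly_mb(mb_per_day, days_in_month) <= cap_mb;
}
```

Read as megabytes, 10 per day comes to 300 MB over 30 days, three times the 100 MB cap. Read as megabits, it is 1.25 MB per day, or 37.5 MB over 30 days, which would fit.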

BennyRop Joined: 17 Dec 05 Posts: 555 Credit: 140,800 RAC: 0	Message 10790 - Posted: 15 Feb 2006, 21:57:33 UTC

Keep an eye on the boards, as later this week we'll be given a new app that lets us select how many hours of jobs to download with each project. So, for a 24 hour run (on always-on systems), we can supposedly download 24 hours of work in one project, instead of eight 3-hour projects or forty-eight 30-minute projects. If this gives the option of up to a week's worth of work per project, then it'll be very easy to get the bandwidth usage below your target.

With better compression added to the client, it'll be possible to reduce the bandwidth usage to between 50% and 33% of the current project downloads. So there's still room for improvement.

Reducing bandwidth usage to 1/8th or 1/48th (depending on the type of projects being handed out at the time) just by switching to 24 hours of jobs per project (the default will be 8) will be a tremendous reduction for those with usage caps on always-on 24/7 machines.
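The 1/8th and 1/48th figures follow from a simple ratio, sketched below. The sketch assumes every project downloads roughly the same amount of data regardless of how long it runs, which the thread implies but does not state; the function names are made up.

```c
/* Number of projects needed to fill 24 hours of always-on crunching,
   rounded to the nearest whole project. */
int tasks_per_day(double task_hours)
{
    return (int)(24.0 / task_hours + 0.5);
}

/* Fraction of the old download volume that remains after lengthening
   the run time, assuming equal download size per project. */
double download_fraction(double old_task_hours, double new_task_hours)
{
    return old_task_hours / new_task_hours;
}
```

Moving from 3-hour to 24-hour projects gives 3/24 = 1/8 of the downloads, and moving from 30-minute projects gives 0.5/24 = 1/48.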

dcdc Joined: 3 Nov 05 Posts: 1833 Credit: 123,905,237 RAC: 22,698	Message 10819 - Posted: 16 Feb 2006, 20:12:11 UTC - in response to Message 10790.

> Keep an eye on the boards, as later this week we'll be given a new app that lets us select how many hours of jobs to download with each project. So, for a 24 hour run (on always-on systems), we can supposedly download 24 hours of work in one project, instead of eight 3-hour projects or forty-eight 30-minute projects. If this gives the option of up to a week's worth of work per project, then it'll be very easy to get the bandwidth usage below your target.
> With better compression added to the client, it'll be possible to reduce the bandwidth usage to between 50% and 33% of the current project downloads. So there's still room for improvement.
> Reducing bandwidth usage to 1/8th or 1/48th (depending on the type of projects being handed out at the time) just by switching to 24 hours of jobs per project (the default will be 8) will be a tremendous reduction for those with usage caps on always-on 24/7 machines.

Yeah, we've ordered the parts for the machine, so it'll be easiest to install it when I build it, but I can ask him to install it at a later date if the bandwidth requirements can be controlled to a suitable level for him.

Astro Joined: 2 Oct 05 Posts: 987 Credit: 500,253 RAC: 0	Message 11346 - Posted: 24 Feb 2006, 21:13:57 UTC

File compression may soon be offered through BOINC. See the email below from Dr. Anderson:

> David Anderson to boinc_projects, boinc_dev, 4:01 pm (6 minutes ago)
>
> Libcurl has the ability to handle HTTP replies that are compressed using the 'deflate' and 'gzip' encoding types. Previously the BOINC client didn't enable this feature, but starting with the next version of the client (5.4) it does.
> This means that BOINC projects will be able to reduce network bandwidth to data servers (and possibly server disk space) by using HTTP compression, without mucking around with applications.
> This is described here: http://boinc.berkeley.edu/files.php#compression
> -- David

Interesting.

Hoelder1in Joined: 30 Sep 05 Posts: 169 Credit: 3,915,947 RAC: 0	Message 11351 - Posted: 24 Feb 2006, 22:56:41 UTC - in response to Message 11346.

> Libcurl has the ability to handle HTTP replies that are compressed using the 'deflate' and 'gzip' encoding types. Previously the BOINC client didn't enable this feature, but starting with the next version of the client (5.4) it does.
> This means that BOINC projects will be able to reduce network bandwidth to data servers (and possibly server disk space) by using HTTP compression, without mucking around with applications.

I am not familiar with 'deflate', but since the Rosetta files are already gzipped, gzipping them a second time wouldn't have any additional benefit. ;-)
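One way a sender could avoid compressing a file twice is to look for the gzip signature before adding another layer. The sketch below checks for the two magic bytes 0x1f 0x8b that begin every gzip stream. It is an illustration only, not how BOINC or Rosetta decide this, and both function names are hypothetical.

```c
#include <stddef.h>

/* Returns 1 if the buffer starts with the gzip magic bytes 0x1f 0x8b. */
int looks_gzipped(const unsigned char *buf, size_t len)
{
    return len >= 2 && buf[0] == 0x1f && buf[1] == 0x8b;
}

/* Decide whether a transport-level compression pass is worthwhile:
   skip it for data that is already a gzip stream. */
int should_compress(const unsigned char *buf, size_t len)
{
    return !looks_gzipped(buf, len);
}
```

A signature check only recognises gzip files; data compressed in another format would need its own test.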

Astro Joined: 2 Oct 05 Posts: 987 Credit: 500,253 RAC: 0	Message 11355 - Posted: 24 Feb 2006, 23:50:55 UTC

Uh oh, this just in from Bruce Allen at Einstein:

> Bruce Allen <xxxxxx@gravity.phys.uwm.edu> to David, boinc_projects, boinc_dev, 5:17 pm (1 hour ago)
>
> David, some projects (including E@H) are already sending/returning files which are 'zipped'. We need to make sure that the cgi file_upload_handler program does not automatically uncompress files unless this has been requested specifically by the project.
> Cheers,

Then later, Dr. A came out with:

> [boinc_alpha] compression bug in 5.3.21
> David Anderson to boinc_alpha, 6:24 pm (25 minutes ago)
>
> We quickly found that the support for gzip compression breaks Einstein@home and CPDN, which do their own compression. We're fixing this and it will be in 5.3.22.
> -- David

The point here is that if Rosetta uses this compression, users shouldn't just jump to the latest dev client until testers have worked out the bugs. I am a BOINC alpha tester and will soon find out if this is a problem. LOL, I still have three 4.81s before I get on with the 4.82s.

BennyRop Joined: 17 Dec 05 Posts: 555 Credit: 140,800 RAC: 0	Message 11363 - Posted: 25 Feb 2006, 2:50:02 UTC

Ask them what level of compression they're using for these transfers, since gzip lets you specify a compression level ranging from fast to best. For large files, one hopes they're using the highest compression possible.

> `--fast' `--best' `-n'
> Regulate the speed of compression using the specified digit n, where `-1' or `--fast' indicates the fastest compression method (less compression) and `--best' or `-9' indicates the slowest compression method (optimal compression). The default compression level is `-6' (that is, biased towards high compression at expense of speed).

from http://www.math.utah.edu/docs/info/gzip_4.html#SEC7

For that matter, is Rosetta using -9/--best with the zlib compression it currently uses?

alvin Joined: 19 Jul 15 Posts: 5 Credit: 6,550,555 RAC: 0	Message 78722 - Posted: 8 Sep 2015, 2:22:28 UTC

I am currently running this project and it's all fine except for one thing: the amount of data downloaded. Here is the monthly report:

    address        download          upload            total
    bakerlab.org   24.0 GB (5.4 %)   6.00 GB (6.7 %)   30.0 GB (5.6 %)

It's strange, as I have the opposite issue with other projects - they have a huge download:upload ratio of 1:5 or more.

The issue is the amount of traffic: could I ask you to pack results on the client side if possible? Could compressing data be an option in settings?

I suppose that years ago no one cared about those amounts, but why is the imbalance between incoming and outgoing data so huge?

Anyway, I think some action could be taken, either on the project side or on the BOINC side as a whole, to restore the balance and minimise traffic.

Sid Celery Joined: 11 Feb 08 Posts: 2397 Credit: 45,958,211 RAC: 24,267	Message 78725 - Posted: 8 Sep 2015, 4:05:05 UTC - in response to Message 78722.

> I am currently running this project and it's all fine except for one thing: the amount of data downloaded. Here is the monthly report:
>
>     address        download          upload            total
>     bakerlab.org   24.0 GB (5.4 %)   6.00 GB (6.7 %)   30.0 GB (5.6 %)
>
> It's strange, as I have the opposite issue with other projects - they have a huge download:upload ratio of 1:5 or more.
> The issue is the amount of traffic: could I ask you to pack results on the client side if possible? Could compressing data be an option in settings?
> I suppose that years ago no one cared about those amounts, but why is the imbalance between incoming and outgoing data so huge?
> Anyway, I think some action could be taken, either on the project side or on the BOINC side as a whole, to restore the balance and minimise traffic.

I think all tasks are already packed.

I notice you keep a 7 day buffer, which is much bigger than necessary. I get away with just 2 days quite comfortably. It's rare to need anything more.

But your biggest problem is that you use only a 1 hour runtime, for each of your 32 PCs! 500 or more tasks each makes 16,000! There's your problem!

First, cut your buffer down to 2 days in BOINC under Computing Preferences - or whatever you're comfortable with. Leave it for 5 days to let your buffers run down, then go online and increase your run time for each machine. But do this slowly, otherwise tasks will miss their deadline. So just increase from 1 hour to 2 hours at first and leave it a few days again before increasing to 4 hours.

Reducing your buffers will mean you'll only have uploads for 5 days, no downloads at all. Doubling your runtime to 2 hrs will halve your previous volume of downloads while (I think) not changing the upload size (or by very little if it does increase). Doubling to 4hrs will halve downloads again.

It's up to you if you decide to increase to 6hr runtimes, which is the default. If you do, your downloads will reduce again in proportion.
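As a rough sketch of why a long buffer combined with a short runtime produces so many queued tasks, the estimate below multiplies the cores on a host by the hours of buffer and divides by the runtime per task. BOINC's real work-fetch logic is more involved than this, and the 4-core host used in the figures afterwards is a made-up number, not taken from the thread.

```c
/* Rough count of tasks a host queues to fill its work buffer,
   assuming every core runs one task at a time around the clock. */
double queued_tasks(int cores, double buffer_days, double runtime_hours)
{
    return cores * buffer_days * 24.0 / runtime_hours;
}
```

Under those assumptions a 4-core host with a 7 day buffer and 1 hour tasks would queue about 672 tasks. Cutting the buffer to 2 days brings that to 192, and raising the runtime to 4 hours brings it to 48.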

alvin Joined: 19 Jul 15 Posts: 5 Credit: 6,550,555 RAC: 0	Message 78726 - Posted: 8 Sep 2015, 4:14:41 UTC - in response to Message 78725. Last modified: 8 Sep 2015, 4:18:49 UTC

Thanks, man. So is the CPU run time equivalent to the portion of a task executed? Does it slice tasks into portions, so that when I have more runtime it just crunches one task, or portions of it, for longer? Am I correct?
------
Those 7 or 10 days of buffered tasks appeared after my fight to get tasks for projects when they claim "not a priority project" etc. Let's see.

Timo Joined: 9 Jan 12 Posts: 185 Credit: 45,661,974 RAC: 0	Message 78728 - Posted: 8 Sep 2015, 5:28:17 UTC

Basically, to properly query the energy space of a structure, many decoys of said structure need to be simulated - a longer runtime means it will simulate more decoys before reporting work to/pestering for work from the server. It's...

a) more efficient on your end, as there's less time spent doing disk I/O switching between models
b) more efficient for the project servers, as they can bulk load in results/create bigger bulk work faster
c) going to use less bandwidth, as more resources are shared between decoy runs because the target models don't change as frequently

Mod.Sense Volunteer moderator Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0	Message 78736 - Posted: 8 Sep 2015, 15:50:22 UTC Last modified: 8 Sep 2015, 15:59:00 UTC

The bandwidth will not vary one-to-one with runtime, because sometimes you get several tasks that use the same underlying database. And in that sense, having a large number of tasks improves your odds of already having a similar task on deck somewhere. But I agree with the suggestions (and actually just responded with the same in response to IMs from Costa).

With that many machines, a night-and-day difference in download bandwidth will be achieved by using a caching proxy server. This will give the same effect as described above, where already having another task from the same batch of work avoids a large DB download, but it now applies across all of the hosts using the proxy, rather than just within a single host.

The project also changed application levels recently, and so without a caching proxy, each host had to download its own copy of the new executables and libraries.

Rosetta Moderator: Mod.Sense

Sid Celery Joined: 11 Feb 08 Posts: 2397 Credit: 45,958,211 RAC: 24,267	Message 78742 - Posted: 9 Sep 2015, 2:18:39 UTC - in response to Message 78726. Last modified: 9 Sep 2015, 2:21:34 UTC

> Thanks, man. So is the CPU run time equivalent to the portion of a task executed? Does it slice tasks into portions, so that when I have more runtime it just crunches one task, or portions of it, for longer? Am I correct?

What the others said is right. I looked at one of your tasks on one machine and it reported that it completed 5 "decoys" in 1 hour. If you increased to 2 hours it would run 10 and you'd get double the credit. I think the maximum allowed in a task is 99. The number of "decoys" varies a lot, but obviously the default 6 hour runs manage it fine.

> Those 7 or 10 days of buffered tasks appeared after my fight to get tasks for projects when they claim "not a priority project" etc. Let's see.

I think you run a lot of projects. When you do, BOINC goes a bit weird. Increasing the number of days makes things worse, so I read, so cutting down to 2 days (or less) will help. I think the default is actually 0.25 days, so you could cut it down even more if you like. If tasks ever dry up on Rosetta (which happens only once every 6 months or so), you have plenty of other project tasks to take up the slack.

It's all a learning curve. No harm done. Thanks for committing so many machines to Rosetta!

alvin Joined: 19 Jul 15 Posts: 5 Credit: 6,550,555 RAC: 0	Message 78743 - Posted: 9 Sep 2015, 2:53:09 UTC Last modified: 9 Sep 2015, 2:55:15 UTC

I started with LHC, but when it ran out of tasks I had to have something else, and then it all built up. Also, some GPU-based projects' tasks are longer than 5 days on their own, so it is a huge mess and mix. Finally, BOINC really plays up with different combinations of projects, not getting tasks even when they are available, for whatever reason (Berkeley support wasn't helpful with that). My goal is to have crunchers performing instead of idling, so let's see.