Feedback, .. bandwidth usage :-(

Message boards : Rosetta@home Science : Feedback, .. bandwidth usage :-(

SwZ
Joined: 1 Jan 06
Posts: 37
Credit: 169,775
RAC: 0
Message 8404 - Posted: 5 Jan 2006, 6:01:49 UTC - in response to Message 8403.  

http://www.7-zip.org/download.html
and the source code for 7zip is available for win/linux/mac as well.


Thanks! I suspected that 7zip is open source and available for any platform, but I got tired of arguing the point :(
ID: 8404 · Rating: 0
Nothing But Idle Time
Joined: 28 Sep 05
Posts: 209
Credit: 139,545
RAC: 0
Message 8422 - Posted: 5 Jan 2006, 13:11:11 UTC - in response to Message 8404.  

Thanks! I suspected that 7zip is open source and available for any platform, but I got tired of arguing the point :(


Don't be discouraged; your efforts are not wasted nor unnoticed.
ID: 8422 · Rating: 1
SwZ
Joined: 1 Jan 06
Posts: 37
Credit: 169,775
RAC: 0
Message 8423 - Posted: 5 Jan 2006, 13:24:33 UTC - in response to Message 8422.  


Don't be discouraged; your efforts are not wasted nor unnoticed.

:-)
I wrote a small program for encoding/decoding Rosetta files and got about 1.5 MB as a binary file, comparable to 7zip, but it does not preserve the information exactly. So 7zip is better! ;-)
And optionally increasing the computational cost of a WU would be better still!

ID: 8423 · Rating: 0
blackbird
Joined: 4 Nov 05
Posts: 15
Credit: 93,414
RAC: 0
Message 8532 - Posted: 7 Jan 2006, 12:23:50 UTC

Preprocessing the WU file before compression saves about 20% in size.
E.g. :

aa1r69_09_05.400_v1_3 8609797 bytes (Uncompressed)
aa1r69_09_05.400_v1_3.gz 2783855 bytes
aa1r69_09_05.400_v1_3.7z 1029164 bytes (-mx7)

After preprocessing with the program described below:
aa1r69_09_05.400_v1_3 8609797 bytes (Uncompressed original file)

aa1r69_09_05.400_v1_3.cr 2289588 bytes (Uncompressed coordinates in integers)
aa1r69_09_05.400_v1_3.ot 1038759 bytes (Uncompressed other information, can be reduced)

Compressed with 7z (-mx7 -mlc=4 -mlp=2)
aa1r69_09_05.400_v1_3.cr.7z 663003 bytes
aa1r69_09_05.400_v1_3.ot.7z 164619 bytes

Thus, 827731 bytes after converting versus 1029164 bytes (-19.5%).
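
The percentages quoted above can be rechecked with a few lines of Python. This is only a sketch of the arithmetic using the byte counts from this post; the constant and function names are made up for illustration. Note that the two listed archive sizes add up to 827622 bytes, slightly less than the 827731 quoted; the post does not explain the difference, so both figures are shown.

```python
# Byte counts copied from the post above; all names are illustrative only.
ORIGINAL = 8_609_797                  # uncompressed WU file
GZIP = 2_783_855                      # .gz
PLAIN_7Z = 1_029_164                  # .7z with -mx7, no preprocessing
SPLIT_7Z_PARTS = (663_003, 164_619)   # the two 7z archives after preprocessing
SPLIT_7Z_QUOTED = 827_731             # total as quoted in the post


def saving(before: int, after: int) -> float:
    """Return the size reduction from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before


if __name__ == "__main__":
    print(f"gzip vs original:      {saving(ORIGINAL, GZIP):.1f}% smaller")
    print(f"7z vs original:        {saving(ORIGINAL, PLAIN_7Z):.1f}% smaller")
    print(f"split (quoted) vs 7z:  {saving(PLAIN_7Z, SPLIT_7Z_QUOTED):.1f}% smaller")
    print(f"split (summed) vs 7z:  {saving(PLAIN_7Z, sum(SPLIT_7Z_PARTS)):.1f}% smaller")
```

Either total gives a reduction of a little under 20% relative to plain 7z, which matches the claim in the post.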

Of course, the scheduler should assign only one type of WU to a host when work is requested, not 8 different ones with 20 MB of traffic!

ID: 8532 · Rating: 0
SwZ
Joined: 1 Jan 06
Posts: 37
Credit: 169,775
RAC: 0
Message 8534 - Posted: 7 Jan 2006, 12:32:40 UTC - in response to Message 8532.  

Blackbird, this is good and very interesting information, but most likely the authors of the Rosetta code are not following this discussion.
I don't see any feedback from them.

ID: 8534 · Rating: 0
BennyRop
Joined: 17 Dec 05
Posts: 555
Credit: 140,800
RAC: 0
Message 8535 - Posted: 7 Jan 2006, 14:20:07 UTC

Perhaps it's time to ask which person on the project deals with the client-server communications programming, and if they've seen this and other recent discussions on the matter and are willing to comment on what we're suggesting. Have them post a list of objections or problems they see that we can then counter, overcome, or make further suggestions on possibilities that will help the project.
ID: 8535 · Rating: 0
FluffyChicken
Joined: 1 Nov 05
Posts: 1260
Credit: 369,635
RAC: 0
Message 8538 - Posted: 7 Jan 2006, 14:55:14 UTC

If you read the earlier posts, Jack Schonbrun (Project Developer & Scientist, btw) has been reading and posting in this thread.


Team mauisun.org
ID: 8538 · Rating: 0
SwZ
Joined: 1 Jan 06
Posts: 37
Credit: 169,775
RAC: 0
Message 8539 - Posted: 7 Jan 2006, 14:56:21 UTC

If we open the page http://staff.washington.edu/laidig/
we see the BOINC server for the Rosetta@home project listed.
So maybe he (Keith E. Laidig) is responsible for the project.
And maybe we can send him an e-mail? laidig@u.washington.edu

My next questions:
1) Can we optionally control the computational cost of a WU (keeping the data transfer per WU constant)?
2) Can we change from gzip to 7zip packing?
3) Who will translate some pages into Russian?
ID: 8539 · Rating: 0
SwZ
Joined: 1 Jan 06
Posts: 37
Credit: 169,775
RAC: 0
Message 8540 - Posted: 7 Jan 2006, 15:03:10 UTC - in response to Message 8538.  

If you read the earlier posts, Jack Schonbrun (Project Developer & Scientist, btw) has been reading and posting in this thread.


Aha, thanks! So, Jack Schonbrun. Very nice!
It may be worth placing contact information (an e-mail address, for example) on the main page. At least in a small font :)
ID: 8540 · Rating: 0
SwZ
Joined: 1 Jan 06
Posts: 37
Credit: 169,775
RAC: 0
Message 8541 - Posted: 7 Jan 2006, 15:10:18 UTC - in response to Message 8540.  

From Jack Schonbrun profile:
"In addition of my regular research on protein folding, I've always been interested in the power of visualization to help us understand abstract concepts. So I've been helping out on developing the graphics for the screen saver portion of Rosetta@home."

4) How about my suggestion about the graphics in the thread
https://boinc.bakerlab.org/rosetta/forum_thread.php?id=849
"I see problems in the rotating view of the proteins.
a) The views are not synchronized. The algorithm that evaluates RMSD finds a rotation matrix and a translation vector, which could be used to rotate all structures on the screen to a single point of view.
b) The center of rotation is not the center of gravity of the protein.
c) The rotation is "not right". It is done around the model axes, but it is usually more comfortable to rotate around the scene axes (as if we move the mouse across the nearest screen plane and the protein ball rolls under it :-)"
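
Points b) and c) can be illustrated with a small sketch. It is not taken from the Rosetta screensaver code; the function names are hypothetical, and the unweighted centroid is used as a stand-in for the center of gravity (atomic masses are ignored, which is an assumption). The rotation is applied around the fixed scene Y axis, passing through the centroid, rather than around an axis attached to the model.

```python
import math


def centroid(points):
    """Unweighted mean position of a list of (x, y, z) tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def rotate_about_centroid_y(points, angle_rad):
    """Rotate all points by angle_rad around the scene Y axis through the centroid."""
    cx, cy, cz = centroid(points)
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    rotated = []
    for x, y, z in points:
        # Shift so the centroid is the origin, rotate, then shift back.
        dx, dy, dz = x - cx, y - cy, z - cz
        rotated.append((cx + c * dx + s * dz, cy + dy, cz - s * dx + c * dz))
    return rotated


if __name__ == "__main__":
    atoms = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (1.0, 3.0, 0.0)]
    turned = rotate_about_centroid_y(atoms, math.pi / 2)
    print("centroid before:", centroid(atoms))
    print("centroid after: ", centroid(turned))
```

Because the rotation axis passes through the centroid, the structure stays in place on screen while it turns, which is the behaviour point b) asks for.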

ID: 8541 · Rating: 0
blackbird
Joined: 4 Nov 05
Posts: 15
Credit: 93,414
RAC: 0
Message 8648 - Posted: 9 Jan 2006, 15:00:21 UTC
Last modified: 9 Jan 2006, 15:09:31 UTC

As for me, I'm temporarily switching to P@H until the scheduler and traffic issues are resolved. Jack?
ID: 8648 · Rating: 0
David Baker
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 17 Sep 05
Posts: 705
Credit: 559,847
RAC: 0
Message 8652 - Posted: 9 Jan 2006, 15:33:36 UTC - in response to Message 8532.  

[full quote of blackbird's Message 8532: the preprocessing results and the suggestion that the scheduler assign only one type of WU per host]


This makes a lot of sense. Does anybody know if BOINC allows this?

ID: 8652 · Rating: 0
SwZ
Joined: 1 Jan 06
Posts: 37
Credit: 169,775
RAC: 0
Message 8653 - Posted: 9 Jan 2006, 15:45:40 UTC - in response to Message 8652.  

This makes a lot of sense. Does anybody know if BOINC allows this?


These operations (packing/unpacking) are done by the Rosetta server and client application, independently of BOINC. So it is enough to change the gzip algorithm to the 7zip algorithm.
ID: 8653 · Rating: 0
River~~
Joined: 15 Dec 05
Posts: 761
Credit: 285,578
RAC: 0
Message 8659 - Posted: 9 Jan 2006, 17:31:49 UTC - in response to Message 8652.  
Last modified: 9 Jan 2006, 17:35:32 UTC


[different zip method] ...

This makes a lot of sense. Does anybody know if BOINC allows this?


hi David,

if BOINC doesn't allow it, you could zip the files before they go on the server, and unzip them from the app when the app starts.
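
A minimal sketch of that idea, under stated assumptions: the input file is packed once before it is placed on the server, and the application unpacks it when it starts. Python's standard lzma module is used here only for illustration. It writes .xz files rather than .7z archives, although LZMA is the algorithm family 7-Zip uses by default. The function names pack and unpack and the file names are hypothetical, and the real Rosetta wrapper is C/C++ code that is not shown in this thread.

```python
import lzma
import os
import shutil
import tempfile


def pack(src_path: str, dst_path: str) -> None:
    """Server side: compress src_path into an .xz file at dst_path."""
    with open(src_path, "rb") as src, lzma.open(dst_path, "wb", preset=7) as dst:
        shutil.copyfileobj(src, dst)


def unpack(src_path: str, dst_path: str) -> None:
    """Client side: restore the original file when the app starts."""
    with lzma.open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        raw = os.path.join(workdir, "fragments.txt")        # stand-in input file
        packed = os.path.join(workdir, "fragments.txt.xz")
        restored = os.path.join(workdir, "fragments.out")
        with open(raw, "wb") as f:
            f.write(b"  12.345  -67.890  179.999\n" * 10000)
        pack(raw, packed)
        unpack(packed, restored)
        print("raw bytes:   ", os.path.getsize(raw))
        print("packed bytes:", os.path.getsize(packed))
```

Since BOINC would only see an opaque download file, this approach needs no change on the BOINC side, at the cost of extra CPU time and temporary disk space on the client at startup.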

If you take that approach you may well want to wait to test the new pre-zipped WUs until the Ralph (Rosetta Alpha) sub-project is running. Good as it would be to have better compression, none of us (project or donors) is in the mood for another series of bad WUs just now ;-)

I also think a fair point has been made that WUs should be matched to the files a user already has on their hard drive. Einstein does this, so you may like to ask your colleagues to talk to the folks at Einstein to see if you can do it as well. This will save your bandwidth as well as the users'. It also entails making absolutely sure that if a file has the same name it has the same contents.
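
The matching idea can be sketched as follows. This is not BOINC's or Einstein@home's actual scheduler code; the data structures and the function pick_work_unit are hypothetical. Each input file is identified by name plus a content digest, so a file on the host only counts as already present when both match, which covers the "same name, same contents" requirement.

```python
import hashlib


def digest(data: bytes) -> str:
    """Content fingerprint used to confirm that a same-named file is identical."""
    return hashlib.sha256(data).hexdigest()


def bytes_to_download(host_files: dict, work_unit: dict) -> int:
    """Sum the sizes of input files the host lacks or holds with different content."""
    return sum(
        size
        for name, (file_digest, size) in work_unit["inputs"].items()
        if host_files.get(name) != file_digest
    )


def pick_work_unit(host_files: dict, work_units: list) -> dict:
    """Choose the work unit that needs the fewest bytes transferred to this host."""
    return min(work_units, key=lambda wu: bytes_to_download(host_files, wu))


if __name__ == "__main__":
    frag_a = b"fragment data for protein A"
    frag_b = b"fragment data for protein B"
    host = {"aa_protA_v1_3.gz": digest(frag_a)}   # files already on the host
    queue = [
        {"name": "wu_B_001", "inputs": {"aa_protB_v1_3.gz": (digest(frag_b), 2_700_000)}},
        {"name": "wu_A_017", "inputs": {"aa_protA_v1_3.gz": (digest(frag_a), 2_800_000)}},
    ]
    print(pick_work_unit(host, queue)["name"])
```

A host that already holds the fragment file for protein A is handed another protein A work unit, so nothing large has to be downloaded again.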

Both these changes would make a big difference to people on a metered connection (about 75% of UK internet users are either on modem or on ADSL with a usage cap and financial penalties for overrun).

River~~
ID: 8659 · Rating: 0
Strop
Joined: 2 Nov 05
Posts: 6
Credit: 305,041
RAC: 0
Message 8673 - Posted: 9 Jan 2006, 19:41:09 UTC

I agree that the bandwidth used is a lot.
Even I, on cable with an upload limit of 4.5GB per 30 days, am having problems.
Don't forget, people also use their internet connection to surf, download, read e-mail and so on...
I had to stop Rosetta on my computers at home because of the danger of being throttled to narrowband.
There is no extra payment here, but you are put on narrowband.
I can certainly understand that people on metered connections will have much bigger problems, like paying extra or even being cut off from the net completely.
So, again, if you could, please make this one of your priorities :-)
It really is holding back a lot of people, in my opinion.


BOINC.BE The team for Belgians who love the smell of glowing red cpu's in the morning.
ID: 8673 · Rating: 0
Moderator7
Volunteer moderator
Joined: 27 Dec 05
Posts: 10
Credit: 0
RAC: 0
Message 8679 - Posted: 9 Jan 2006, 20:44:30 UTC

I don't think the decision has been made as to whether to better compress the files, or assign result types based on the host to reduce the number of files downloaded, or both - but the bandwidth issue is on the "to-do" list, near or in the "top ten". I don't know how long it will take to get to it though. Limited number of people to do the work...

ID: 8679 · Rating: 0
David E K
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Jul 05
Posts: 1018
Credit: 4,334,829
RAC: 0
Message 8680 - Posted: 9 Jan 2006, 20:46:15 UTC

The suggestions mentioned in this thread are great, and we thought about these issues at the start of project development. Keep in mind that our staff is small and we are limited in development time. Also, some of these suggestions have to be implemented in BOINC rather than Rosetta (i.e. changes in the scheduler and compressing the executable without having to provide another app).

The easiest way for us to reduce bandwidth would be:

1. to use a better compression method like bzip2 or 7zip. We'll look into this but it will take time to develop and test. At the start of project development, we added zlib to rosetta specifically for boinc and chose it over other packages because of cross-platform issues, available wrappers/support libraries, and ease of use.

2. use smaller fragment files (the _v1_3 files). We are currently testing whether larger fragment files improve results and the jury is still out but if they improve results by a small fraction do we continue to use them?

3. to implement locality scheduling provided by boinc. This would require changes to rosetta to read in a single large fragment file. We'll look into this.
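
For option 1, a rough comparison of the candidate methods can be run with Python's standard library, which ships bindings for zlib, bzip2 and LZMA (the algorithm family behind 7zip). This is a sketch under stated assumptions: the sample data is synthetic text made of random three-column numbers, not a real fragment file, so the ratios it prints say nothing definite about real work units. The helper names make_sample and compare are hypothetical.

```python
import bz2
import lzma
import random
import zlib


def make_sample(n_lines: int = 5000, seed: int = 0) -> bytes:
    """Build synthetic numeric text; a stand-in, not the real fragment format."""
    rng = random.Random(seed)
    lines = [
        " ".join(f"{rng.uniform(-180, 180):9.3f}" for _ in range(3))
        for _ in range(n_lines)
    ]
    return "\n".join(lines).encode("ascii")


# Each entry maps a label to a (compress, decompress) pair.
CODECS = {
    "zlib": (lambda d: zlib.compress(d, 9), zlib.decompress),
    "bz2": (lambda d: bz2.compress(d, 9), bz2.decompress),
    "lzma": (lambda d: lzma.compress(d, preset=7), lzma.decompress),
}


def compare(data: bytes) -> dict:
    """Return the compressed size per codec after checking a lossless round trip."""
    sizes = {}
    for name, (compress, decompress) in CODECS.items():
        packed = compress(data)
        if decompress(packed) != data:
            raise ValueError(f"{name} round trip failed")
        sizes[name] = len(packed)
    return sizes


if __name__ == "__main__":
    sample = make_sample()
    print("raw:", len(sample), "bytes")
    for name, size in compare(sample).items():
        print(f"{name}: {size} bytes ({100.0 * size / len(sample):.1f}% of raw)")
```

Running the same comparison on an actual _v1_3 file would be the meaningful test; the figures posted earlier in this thread (gzip versus 7z on a real file) are the better guide to what to expect.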

some comments:

Due to the nature of our application and the large input files that are required, R@h is not suitable for dial-up users who mind the long wait required for downloads.

The application cannot be compressed unless we provide another application that uncompresses it which adds a bit of complexity since we would have to use a compound application. This can be done but would require development and testing. A better option would be to add compression to boinc so that it would always send compressed data to the clients and back but this is a task for boinc developers.

Download files are considered immutable so they should always have the same content - we follow this boinc rule. We could easily make the large fragment files "sticky" like we do with the rosetta database files (which are not work unit specific) but then we would use up client disk space and would have to manage the files on every client somehow to make sure they are removed when work unit batches are finished since they are specific to each batch. It would be better to use locality scheduling which is provided by boinc and should handle file management, and as mentioned above, we will look into modifying rosetta for this.

IF THERE ARE ANY DEVELOPERS OUT THERE INTERESTED IN ADDING A BETTER COMPRESSION METHOD LIKE 7zip or bzip2 THAT IS CROSS-PLATFORM COMPATIBLE, PLEASE SEND ME AN EMAIL AND I CAN PROVIDE YOU WITH OUR COMPRESSION WRAPPER CODE. THE WRAPPER CODE CAN BE USED, EXTENDED, AND TESTED INDEPENDENT OF ROSETTA.

dekim at u dot washington dot edu

The best we could do in the short term is to use a better compression method and use locality scheduling but this would still require some development.


ID: 8680 · Rating: 1
blackbird
Joined: 4 Nov 05
Posts: 15
Credit: 93,414
RAC: 0
Message 8706 - Posted: 10 Jan 2006, 8:43:34 UTC

UPX can help with compressing the Rosetta application:
rosetta_4.80_i686-pc-linux-gnu 8323696 (uncompressed)
rosetta_4.80_i686-pc-linux-gnu 3257302 (compressed)

From my basic understanding of how Rosetta works, I suspect that the best way to decrease the traffic would be to assign many random points to one protein. E.g., a host downloads one protein and then 20 WUs with random points, instead of downloading 20 different proteins, thus reducing traffic (this would be the best solution).

Another way to decrease the traffic is WU compression. As I have mentioned before, deep knowledge of the WU structure is required to find the most appropriate compression method. In fact, if a 2-digit mantissa for the trajectories is enough for the computations, then the trajectories can be stored as words, which can give a compressed file about 30% smaller.
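
The storage idea can be sketched like this, with loud assumptions: "words" is read as signed 16-bit integers, "2 digit mantissa" is read as two digits after the decimal point, and the values are taken to lie within about ±327, which is what fits in 16 bits at that precision. None of this is confirmed by the Rosetta file format, and the names encode and decode are hypothetical. The conversion is lossy by design, with an error of at most half of the last kept digit.

```python
import random
import struct
import zlib

SCALE = 100  # assumption: keep two digits after the decimal point


def encode(values) -> bytes:
    """Pack floats as little-endian signed 16-bit fixed-point integers."""
    ints = [round(v * SCALE) for v in values]
    # struct.pack raises struct.error if a value does not fit in 16 bits.
    return struct.pack(f"<{len(ints)}h", *ints)


def decode(blob: bytes) -> list:
    """Inverse of encode(), up to the rounding applied there."""
    count = len(blob) // 2
    return [i / SCALE for i in struct.unpack(f"<{count}h", blob)]


if __name__ == "__main__":
    rng = random.Random(0)
    values = [rng.uniform(-180.0, 180.0) for _ in range(20000)]
    as_text = "\n".join(f"{v:.3f}" for v in values).encode("ascii")
    as_words = encode(values)
    print("text, compressed: ", len(zlib.compress(as_text, 9)), "bytes")
    print("words, compressed:", len(zlib.compress(as_words, 9)), "bytes")
    worst = max(abs(a - b) for a, b in zip(values, decode(as_words)))
    print("largest rounding error:", worst)
```

Whether this actually yields the 30% mentioned above depends on the real data and on whether the lost digits matter to the computation, which only the project developers can judge.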

Traffic is an often forgotten problem for scientists when intranet computations are moved to an internet-based solution. I believe it is a very important problem, because more users mean more bandwidth demand on the servers. In effect, you must choose between building a new server and optimising transfers.


ID: 8706 · Rating: 0
SwZ
Joined: 1 Jan 06
Posts: 37
Credit: 169,775
RAC: 0
Message 8708 - Posted: 10 Jan 2006, 9:42:43 UTC - in response to Message 8680.  

IF THERE ARE ANY DEVELOPERS OUT THERE INTERESTED IN ADDING A BETTER COMPRESSION METHOD LIKE 7zip or bzip2 THAT IS CROSS-PLATFORM COMPATIBLE, PLEASE SEND ME AN EMAIL AND I CAN PROVIDE YOU WITH OUR COMPRESSION WRAPPER CODE. THE WRAPPER CODE CAN BE USED, EXTENDED, AND TESTED INDEPENDENT OF ROSETTA.


Send this Wrapper code to <gionov@mail.ru>, please.

With best wishes, Gennady Ionov.


ID: 8708 · Rating: 0
FluffyChicken
Joined: 1 Nov 05
Posts: 1260
Credit: 369,635
RAC: 0
Message 8722 - Posted: 10 Jan 2006, 14:53:31 UTC - in response to Message 8708.  
Last modified: 10 Jan 2006, 14:55:52 UTC

IF THERE ARE ANY DEVELOPERS OUT THERE INTERESTED IN ADDING A BETTER COMPRESSION METHOD LIKE 7zip or bzip2 THAT IS CROSS-PLATFORM COMPATIBLE, PLEASE SEND ME AN EMAIL AND I CAN PROVIDE YOU WITH OUR COMPRESSION WRAPPER CODE. THE WRAPPER CODE CAN BE USED, EXTENDED, AND TESTED INDEPENDENT OF ROSETTA.


Send this Wrapper code to <gionov@mail.ru>, please.

With best wishes, Gennady Ionov.




You need to send an email his way

PLEASE SEND ME AN EMAIL AND I CAN PROVIDE YOU WITH OUR COMPRESSION WRAPPER CODE. THE WRAPPER CODE CAN BE USED, EXTENDED, AND TESTED INDEPENDENT OF ROSETTA.

dekim at u dot washington dot edu





As for application compression, does anyone have contacts at CPDN? They seem to do it.
Team mauisun.org
ID: 8722 · Rating: 0



©2024 University of Washington
https://www.bakerlab.org