Here are some interesting results from a small experiment I performed.
The same data was copied between the same two machines, under similar network conditions and loads, using different methods, with and without compression. Both Fedora installations are new and the home partitions are almost empty (I have no idea at the moment what would happen if they were partly filled). Two dry runs were carried out before the actual tests so that every method got the same cache benefit.
Data being copied: a 1.6GB MATLAB 2008a (Unix) installation. It contains a bunch of megabyte-sized AVI videos, a moderate number of jar files, and many small .m files.
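The warm-up procedure described above can be sketched roughly as follows. This is only an illustration, not the script used for the measurements: the function name `warm_then_time` is hypothetical, and it reports whole seconds via `date +%s` rather than the real/user/sys breakdown shown in the results.

```shell
#!/bin/sh
# Hypothetical helper (not from the original experiment):
# run a command WARMUPS times untimed, then time one more run.
# Usage: warm_then_time WARMUPS COMMAND [ARGS...]
warm_then_time() {
    warmups=$1
    shift
    i=0
    while [ "$i" -lt "$warmups" ]; do
        "$@" >/dev/null 2>&1      # dry run: warms the cache, not timed
        i=$((i + 1))
    done
    start=$(date +%s)             # whole-second resolution only
    "$@" >/dev/null 2>&1          # the measured run
    status=$?
    end=$(date +%s)
    echo "elapsed_s=$((end - start)) exit=$status"
    return "$status"
}
```

A pipeline such as the tar+ssh variants would have to be wrapped, e.g. `warm_then_time 2 sh -c 'tar -cf- matlab_install | ssh ...'`. The target directory also needs to be cleared (or changed) between runs, otherwise a tool like rsync would find the data already in place and the measured run would not be a full copy.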
source machine:
Pentium D 820@2.8GHz, 2GB DDR2 dual channel@667, Intel 946 mobo, on board SATA controller; Samsung 160GB SATA hard disk drive@7200rpm and 8M cache, Linux f10 2.6.27.12-170.2.5.fc10.i686,
35GB ext3 source partition (fresh).
target machine:
Core 2 Duo E4500@2.2GHz, 2GB DDR2 dual channel@667, Intel 965 mobo, on board SATA controller; Samsung 160GB SATA hard disk drive@7200rpm and 2M cache, Linux f10 2.6.27.12-170.2.5.fc10.i686,
100GB ext3 copy-to partition (fresh).
Results
Method | Command | Timing | Avg. CPU util. (%) |
uncompressed recursive scp | scp -rq matlab_install prashant@10.105.41.19:matlab_install_regular_scp_unc | real 9m54.554s user 0m23.204s sys 0m15.103s | 20 |
compressed recursive scp | scp -Crq matlab_install prashant@10.105.41.19:matlab_install_regular_scp | real 11m8.391s user 3m48.508s sys 0m25.200s | 85 |
uncompressed recursive rsync | rsync -a matlab_install prashant@10.105.41.19:matlab_install_regular_rsync_unc | real 3m3.604s user 0m26.709s sys 0m21.664s | 40 |
compressed recursive rsync | rsync -az matlab_install prashant@10.105.41.19:matlab_install_regular_rsync | real 4m11.651s user 3m11.847s sys 0m31.892s | 90 |
uncompressed tar+ssh | tar -cf- matlab_install | ssh prashant@10.105.41.19 'tar -xf- -C ~/matlab_install_hack_unc' | real 2m59.706s user 0m21.428s sys 0m14.020s | 20 |
compressed tar+ssh | tar -cf- matlab_install | gzip -f1 | ssh prashant@10.105.41.19 'tar -xzf- -C ~/matlab_install_hack_compr' | real 2m44.349s user 2m7.709s sys 0m18.114s | 60 |
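To compare rows of the table directly, the "real" times can be converted to seconds and divided. A small sketch, with the helper names `to_seconds` and `ratio` made up for illustration; it assumes times in the `NmS.SSSs` form shown above.

```shell
#!/bin/sh
# Hypothetical helpers for comparing the "real" timings in the table.

# Convert a time such as 9m54.554s into seconds (594.554).
to_seconds() {
    echo "$1" | awk -F'm' '{ sub(/s$/, "", $2); printf "%.3f\n", $1 * 60 + $2 }'
}

# Print how many times longer the first time is than the second.
ratio() {
    a=$(to_seconds "$1")
    b=$(to_seconds "$2")
    awk -v a="$a" -v b="$b" 'BEGIN { printf "%.2f\n", a / b }'
}
```

For example, `ratio 9m54.554s 2m59.706s` prints 3.31: uncompressed scp took a bit more than three times as long as uncompressed tar+ssh.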
Conclusion
As seen from the timings, rsync and tar+ssh perform comparably for a first-time copy, though tar+ssh beats rsync here. On the other hand, when updating a huge tree that already exists on the target, rsync wins hands down: insane speedups!
scp is not to be used with more than a hundred files. Period.
What's the point of enabling compression if it is going to be SLOWer...
Compression works nicely only when the files being copied are compressible, which is usually the case when copying large plaintext files.
I guess the cost/time of compression is more than the saved transfer time in this case.
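One way to check the points raised in these comments before a transfer is to measure how well the tree actually compresses. A rough sketch, using the same `gzip -1` level as the compressed tar+ssh test; the function name `gzip1_percent` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the gzip -1 size of a directory tree
# as a percentage of its uncompressed tar size.
# Usage: gzip1_percent DIRECTORY
gzip1_percent() {
    raw=$(tar -cf - "$1" | wc -c)                 # bytes in the plain tar stream
    packed=$(tar -cf - "$1" | gzip -1 | wc -c)    # bytes after gzip -1
    echo $((packed * 100 / raw))
}
```

A result close to 100 means there is little to gain (already-compressed data such as videos or jar files), so the CPU time spent compressing is unlikely to be repaid by a shorter transfer. A low number only says the data is compressible; whether compression wins still depends on how fast the link is compared to the CPU.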