[clue-tech] Reading rsync results, was: CLUE Talk Mailing list
mbox file too big to rsynch
Angelo Bertolli
angelo.bertolli at gmail.com
Sun Feb 1 13:01:58 MST 2009
On Sun, Feb 1, 2009 at 2:16 PM, Angelo Bertolli
<angelo.bertolli at gmail.com> wrote:
> On Sun, Feb 1, 2009 at 1:41 PM, David L. Anselmi <anselmi at anselmi.us> wrote:
>
>> Angelo Bertolli wrote:
>>
>>> The problem with rsyncing an ever-growing mbox is that it always needs to be
>>> copied since it will always be different.
>>>
>>
>> It always needs to be copied but only blocks that don't match are copied
>> (in this case only the new data at the end of the file). Which is exactly
>> what we want.
>>
>
> So rdiff then?
>
> Normally rsync doesn't compare the two files: it just compares the
> timestamps and sizes, then copies the whole file if they don't match. You
> can force a comparison check, but that usually takes even longer.
>
> Maybe rdiff will copy just the changed blocks, but it will still have to
> perform the diff. Yeah, I know there's supposed to be some "magic" to
> efficiently do diffs over the net (maybe using checksums?)
>
>
I decided to check up on myself a little bit. Online docs say that rsync
sends only the differences and "information about structure," but my
experience seems to indicate otherwise. So I ran a little test of my own on a
file where I changed only the timestamp, not the file contents:
Bertolli at galileo ~
$ cp -a testfile-100M destfile
Bertolli at galileo ~
$ rsync -av testfile-100M destfile
sending incremental file list
sent 56 bytes received 12 bytes 8.00 bytes/sec
total size is 104857600 speedup is 1542023.53
Bertolli at galileo ~
$ touch testfile-100M
Bertolli at galileo ~
$ rsync -av testfile-100M destfile
sending incremental file list
testfile-100M
sent 104870495 bytes received 31 bytes 113804.15 bytes/sec
total size is 104857600 speedup is 1.00
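A note on the second run above: both paths are local, and the rsync man page documents --whole-file as the default when source and destination are both local paths, so the delta-transfer algorithm is skipped and the whole file is sent. Below is a minimal sketch for repeating the test with the delta algorithm forced on. The function name retest_delta is hypothetical, the file names are the ones from the transcript, and no particular result is claimed here.

```shell
# Hypothetical helper: repeat the "touch, then rsync" test with delta-transfer
# forced on. Assumes rsync is installed and that testfile-100M and destfile
# exist in the current directory, as in the transcript above.
retest_delta() {
    # Change only the timestamp, not the contents.
    touch testfile-100M
    # --no-whole-file turns off the whole-file default used for local copies.
    rsync -av --no-whole-file testfile-100M destfile
}
```

Calling retest_delta and comparing its "sent ... bytes" figure with the 104870495 bytes above would show whether the whole-file default explains the result.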
I didn't time it, but the initial cp easily took a third of the time the final
rsync did; I would estimate more like a quarter or a fifth. I've always
read "speedup is 1.00" as "this is equivalent to having copied the whole
file," meaning I gained no speedup over just transmitting the file
regularly, as calculated by the algorithm.
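The speedup figures in both runs are consistent with total size divided by the sum of bytes sent and bytes received. Here is a small sketch to check that arithmetic against the numbers in the transcript. The function name speedup is hypothetical, and the formula is inferred from the output above rather than taken from the rsync source.

```shell
# Hypothetical helper: speedup = total size / (bytes sent + bytes received)
speedup() {
    awk -v total="$1" -v sent="$2" -v recv="$3" \
        'BEGIN { printf "%.2f\n", total / (sent + recv) }'
}

speedup 104857600 56 12          # first run: prints 1542023.53
speedup 104857600 104870495 31   # second run: prints 1.00
```

A value of 1.00 therefore means roughly as many bytes crossed the wire as the file contains, which matches the "copied the whole file" reading.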
Angelo