[clue-tech] Reading rsync results, was: CLUE Talk Mailing list mbox file too big to rsynch

Angelo Bertolli angelo.bertolli at gmail.com
Sun Feb 1 13:01:58 MST 2009


On Sun, Feb 1, 2009 at 2:16 PM, Angelo Bertolli
<angelo.bertolli at gmail.com> wrote:

> On Sun, Feb 1, 2009 at 1:41 PM, David L. Anselmi <anselmi at anselmi.us> wrote:
>
>> Angelo Bertolli wrote:
>>
>>> The problem with rsyncing an ever-growing mbox is that it always needs to
>>> be copied since it will always be different.
>>>
>>
>> It always needs to be copied but only blocks that don't match are copied
>> (in this case only the new data at the end of the file).  Which is exactly
>> what we want.
>>
>
> So rdiff then?
>
> Normally rsync doesn't compare the two files:  it just compares the
> timestamps and sizes, then copies the whole file if they don't match.  You
> can force a comparison check, but that usually takes even longer.
>
> Maybe rdiff will copy just the changed blocks, but it will still have to
> perform the diff.  Yeah, I know there's supposed to be some "magic" to
> efficiently do diffs over the net (maybe using checksums?)
>
>
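For reference, the block-matching idea David describes can be sketched
roughly as follows. This is a simplified illustration, not rsync's actual
code: it compares fixed-offset blocks only, without the rolling checksum
rsync uses to find data that has shifted, and the names `BLOCK`,
`block_digests` and `blocks_to_send` are made up for the example. For a
file that only grows at the end, such as an mbox, only the trailing block
differs.

```python
import hashlib

BLOCK = 4096  # hypothetical block size chosen for this example


def block_digests(data, block=BLOCK):
    """Return one digest per fixed-size block of data."""
    return [hashlib.sha256(data[i:i + block]).digest()
            for i in range(0, len(data), block)]


def blocks_to_send(old, new, block=BLOCK):
    """Return indices of blocks in `new` that do not match `old`."""
    old_sums = block_digests(old, block)
    changed = []
    for idx, digest in enumerate(block_digests(new, block)):
        # A block must be sent if it is past the end of the old file
        # or its digest differs from the old block at the same offset.
        if idx >= len(old_sums) or old_sums[idx] != digest:
            changed.append(idx)
    return changed


# An "mbox" of ten full blocks, then the same file with 100 bytes appended.
old_mbox = b"m" * (10 * BLOCK)
new_mbox = old_mbox + b"n" * 100
changed = blocks_to_send(old_mbox, new_mbox)
print(changed)  # [10]
```

Only block 10 (the appended data) would need to go over the wire in this
sketch; the first ten blocks match by digest.
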
I decided to check up on myself a little.  The online docs say that rsync
sends only the differences plus "information about structure," but my
experience seems to indicate otherwise.  So I ran a little test of my own on
a file where I changed only the timestamp, not the file contents:

Bertolli at galileo ~
$ cp -a testfile-100M destfile

Bertolli at galileo ~
$ rsync -av testfile-100M destfile
sending incremental file list

sent 56 bytes  received 12 bytes  8.00 bytes/sec
total size is 104857600  speedup is 1542023.53

Bertolli at galileo ~
$ touch testfile-100M

Bertolli at galileo ~
$ rsync -av testfile-100M destfile
sending incremental file list
testfile-100M

sent 104870495 bytes  received 31 bytes  113804.15 bytes/sec
total size is 104857600  speedup is 1.00
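
The two runs above are consistent with rsync's default quick check, which
skips a file when its size and modification time both match the
destination.  A rough sketch of that decision, assuming a comparison at
whole-second resolution (`quick_check_skips` is a made-up name, and this is
an illustration rather than rsync's real logic):

```python
import os
import shutil
import tempfile


def quick_check_skips(src, dst):
    """Return True if size and mtime (whole seconds) both match."""
    s, d = os.stat(src), os.stat(dst)
    return s.st_size == d.st_size and int(s.st_mtime) == int(d.st_mtime)


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "testfile")
    dst = os.path.join(tmp, "destfile")
    with open(src, "wb") as f:
        f.write(b"x" * 1024)

    # Like `cp -a`: copy the contents and preserve the timestamp.
    shutil.copy2(src, dst)
    same_after_copy = quick_check_skips(src, dst)

    # Like `touch`, but bump mtime by a fixed 60 s so the result is
    # deterministic; the contents are left unchanged.
    st = os.stat(src)
    os.utime(src, (st.st_atime, st.st_mtime + 60))
    same_after_touch = quick_check_skips(src, dst)

print(same_after_copy, same_after_touch)  # True False
```

After the copy the file would be skipped; after the timestamp change it is
flagged for transfer even though the bytes are identical.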

I didn't time it, but the initial cp easily took a third of the time the
final rsync did; I would estimate more like a quarter or a fifth.  I've
always read "speedup is 1.00" as "this is equivalent to having copied the
whole file," meaning that, by rsync's own calculation, I gained nothing over
simply transmitting the file.
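
The "speedup" figure can be reproduced from the other numbers rsync prints.
A minimal sketch, assuming speedup is the total size divided by bytes sent
plus bytes received (the `speedup` helper is my own name, not part of
rsync):

```python
def speedup(total_size, sent, received):
    """Ratio of total file size to bytes actually exchanged."""
    return total_size / (sent + received)


# First run: nothing needed to be transferred.
first = speedup(104857600, 56, 12)
# Second run: after touching the source file.
second = speedup(104857600, 104870495, 31)

print(f"{first:.2f}")   # 1542023.53
print(f"{second:.2f}")  # 1.00
```

Both values match the transcript, so a speedup of 1.00 here does mean that
roughly the whole file's worth of bytes was sent.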


Angelo