[clue-tech] Mirroring Debian testing, i386.

David L. Anselmi anselmi at anselmi.us
Mon Dec 24 12:58:56 MST 2007


Yeah, I keep fooling with this so you keep hearing about it.  Hit delete 
now.

David L. Anselmi wrote:
[...]
 > So finally I got the official mirror script:
 >
 > http://www.debian.org/mirror/ftpmirror

Well, I warned you to hit delete. ;-)

The ftpmirror script is a little painful.  Each run seems to take many 
hours, even if it isn't updating much.  I'm guessing it's rsync overhead.

But rsync seems like the wrong way to mirror a repository.  It's good at 
updating changed files efficiently (and there's nothing wrong with how 
it copies over new files--superior to the way cp/rcp work).  But in a 
repository, files never change once they're there, and since rsync 
matches files up by name I don't think it can figure out that 90% of 
the blocks in foo_1.2_i386.deb are the same as those in 
foo_1.1_i386.deb.  So its overhead for checking for changes is wasted.

All you really want is a list of new and removed files in the repository 
compared to your mirror.  You can get that off the Packages files.
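Something like the following sketch shows the idea (this is not 
apt-mirror's actual code; the two-package Packages stanza and the pool 
layout are made-up demo data).  The Filename: fields tell you what the 
archive should contain, and comm against a local file list gives you the 
fetch and delete lists directly:

```shell
#!/bin/sh
# Sketch: derive a mirror's fetch/delete lists from a Packages file
# instead of letting rsync checksum the whole pool.  Demo data only.
set -e
work=$(mktemp -d); cd "$work"

# A two-package Packages file, as the remote archive would publish it:
cat > Packages <<'EOF'
Package: foo
Filename: pool/main/f/foo/foo_1.2_i386.deb

Package: bar
Filename: pool/main/b/bar/bar_2.0_i386.deb
EOF

# Our local mirror still has the old foo and is missing bar entirely:
mkdir -p mirror/pool/main/f/foo
touch mirror/pool/main/f/foo/foo_1.1_i386.deb

# Files the archive says should exist, one path per line:
grep '^Filename: ' Packages | cut -d' ' -f2 | sort > remote.list
# Files we actually have:
( cd mirror && find pool -type f ) | sort > local.list

comm -13 local.list remote.list > to-fetch.list   # remote only: download
comm -23 local.list remote.list > to-delete.list  # local only: stale

echo "fetch:";  cat to-fetch.list
echo "delete:"; cat to-delete.list
```

No per-file checksumming at all--the only network cost is fetching the 
Packages files themselves.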

apt-mirror seems to work that way so I tried switching over to it.  (I 
should have copied the pool directory first but I got confused about 
disk space and deleted it instead.)

> I copied all the pool/ stuff from the CDs by hand (should be close 
> enough) and tried apt-mirror. It insists on putting all the files
> under the URL you're mirroring, so using it with multiple sources is
> a pain.

I should be able to use symlinks to get different URLs to put their 
files in the same place (and I'll probably need another link to make 
this look like a sane repository for apt to use).  So this may not be 
such a big deal--except that my external drive is still vfat, so no 
links allowed.
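The link trick would look something like this sketch (the host name and 
paths are examples, not my actual setup--and again, none of this works 
on vfat, which has no symlinks):

```shell
#!/bin/sh
# Sketch: apt-mirror files everything under a per-URL directory, e.g.
# <base>/mirror/ftp.us.debian.org/debian/.  A symlink can expose that
# tree at a saner path for sources.list to point at.  Example paths.
set -e
base=$(mktemp -d)      # stand-in for the mirror's base directory
mkdir -p "$base/mirror/ftp.us.debian.org/debian/pool"

# Present the per-URL tree as a plain /debian directory:
ln -s "$base/mirror/ftp.us.debian.org/debian" "$base/debian"
ls "$base/debian"      # shows: pool
```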

> It also doesn't create the 3 directories it needs. The install
> scripts make them under /var but if you decide you want the mirror
> put on your big external drive you get errors until you make them
> yourself. Ought to file a bug on that.

So I've reported this bug; once you know about it, it's easy to work 
around.  Let's see...
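The work-around is just to create the directories yourself before the 
first run.  A sketch, assuming base_path in /etc/apt/mirror.list has 
been pointed at the external drive (the path below is an example, and I 
believe the three subdirectories apt-mirror wants are mirror, skel, and 
var, matching the mirror_path, skel_path, and var_path settings):

```shell
#!/bin/sh
# Work-around sketch: pre-create the directories apt-mirror expects
# when base_path isn't the default /var/spool/apt-mirror.
set -e
base=$(mktemp -d)/apt-mirror   # stand-in for e.g. /mnt/external/apt-mirror
mkdir -p "$base/mirror" "$base/skel" "$base/var"
ls "$base"
```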

The i386 binary repository is 18GiB and after almost 24 hours of 
downloading my drive crashed.  Restarting seems easy enough -- 
regenerating the list of what needs downloading is quick compared to 
what rsync was doing.  And it kindly tells me 5.9GiB to go.

I set up munin to see what my network utilization was.  It seems to be 
running right up there at 1.3Mbps (near max for a 1.5Mbps DSL line I'd 
guess).  It'll be interesting to see what munin has to say over the rest 
of the d/l.

Isn't it nice that you can have 20 wget processes maxing your d/l 
bandwidth, and when you decide to download something else, that process 
jumps right in and takes its share?

Dave
