[clue-tech] Mirroring Debian testing, i386.
David L. Anselmi
anselmi at anselmi.us
Mon Dec 24 12:58:56 MST 2007
Yeah, I keep fooling with this so you keep hearing about it. Hit delete
now.
David L. Anselmi wrote:
[...]
> So finally I got the official mirror script:
>
> http://www.debian.org/mirror/ftpmirror
Well, I warned you to hit delete. ;-)
The ftpmirror script is a little painful. Seems that each run takes
many hours, even if it isn't updating much. I'm guessing rsync overhead.
But rsync seems like the wrong way to mirror a repository. It's good at
updating changed files efficiently (and there's nothing wrong with how
it copies over new files--superior to the way cp/rcp work). But in a
repository, files don't change once they're there, and I don't think
rsync can figure out that 90% of the blocks in foo_1.2_i386.deb are the
same as those in foo_1.1_i386.deb. So its overhead of checking for
changes is wasted.
All you really want is a list of new and removed files in the repository
compared to your mirror. You can get that off the Packages files.
apt-mirror seems to work that way so I tried switching over to it. (I
should have copied the pool directory first but I got confused about
disk space and deleted it instead.)
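That Packages-driven approach can be sketched in a few lines of shell.
Each stanza in a Packages index carries a Filename: field naming the
pool path it expects, so a sort/comm against what's actually on disk
gives you the fetch list and the delete list (the file and directory
names here are made-up examples, not apt-mirror's actual internals):

```shell
# Compare what the repository index references against what's mirrored.
# "Packages" and "pool" are hypothetical local paths.
awk '/^Filename:/ { print $2 }' Packages | sort > wanted
find pool -type f | sort > have
comm -23 wanted have   # referenced but not on disk: download these
comm -13 wanted have   # on disk but no longer referenced: delete these
```

Presumably apt-mirror does something like this (plus size checks),
which would explain why regenerating its download list is so quick.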
> I copied all the pool/ stuff from the CDs by hand (should be close
> enough) and tried apt-mirror. It insists on putting all the files
> under the URL you're mirroring, so using it with multiple sources is
> a pain.
I should be able to make links to get different URLs to put their files
in the same place (and I'll likely have to make a link to make this look
like a sane repository for apt to use). So this may not be such a big
deal (except that my external drive is still vfat, so no links allowed).
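Once the drive is on a link-capable filesystem, the merge could look
something like this -- the mirror root and the hostname below are
hypothetical, just to show the shape of it:

```shell
# Merge per-URL directories into one shared pool via a symlink.
# $MIRROR and the hostname are hypothetical examples; vfat can't hold
# symlinks, so this assumes ext3 or similar on the mirror drive.
MIRROR=${MIRROR:-/tmp/mirror}
mkdir -p "$MIRROR/pool" "$MIRROR/ftp.us.debian.org/debian"
ln -sfn "$MIRROR/pool" "$MIRROR/ftp.us.debian.org/debian/pool"
```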
> It also doesn't create the 3 directories it needs. The install
> scripts make them under /var but if you decide you want the mirror
> put on your big external drive you get errors until you make them
> yourself. Ought to file a bug on that.
So I've reported this bug, and now that I know about it, it's easy to
work around. Let's see...
The i386 binary repository is 18GiB and after almost 24 hours of
downloading my drive crashed. Restarting seems easy enough --
regenerating the list of what needs downloading is quick compared to
what rsync was doing. And it kindly tells me 5.9GiB to go.
I set up munin to see what my network utilization was. It seems to be
running right up there at 1.3Mbps (near max for a 1.5Mbps DSL line I'd
guess). It'll be interesting to see what munin has to say over the rest
of the d/l.
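A quick back-of-the-envelope check on the remaining time, using the
figures above (5.9GiB left at 1.3Mbps):

```shell
# Remaining download time: 5.9 GiB at 1.3 Mbps (decimal megabits,
# the way DSL lines are rated).
awk 'BEGIN {
    bits = 5.9 * 2^30 * 8    # GiB -> bits
    rate = 1.3 * 1e6         # Mbps -> bits per second
    printf "%.1f hours\n", bits / rate / 3600
}'
# prints: 10.8 hours
```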
Isn't it nice that you can have 20 wget processes maxing your d/l
bandwidth, and when you decide to download something else, that process
jumps right in and takes its share?
Dave