[CLUE-Talk] Import large fixed length records into mysql (in a single pass)?
Jed S. Baer
thag at frii.com
Fri Oct 8 10:04:59 MDT 2004
On Fri, 8 Oct 2004 00:31:01 -0700 (PDT)
dave price <dp_kinaole at yahoo.com> wrote:
> The data is one record per line, but rather than any
> delimiter, we just know the character positions of
> each field in the line. i.e.:
>
> position length field_name
> ======== ====== ==========
> 1        4      field_1
> 5        10     field_2
> 11       10     field_3
Here are three approaches from the book "Data Munging With Perl".
#!/usr/bin/perl
my @cols = qw(n n n n ...); # n is the end position (0-based) of each column
while (<STDIN>) {
    my @rec;
    my $prev = 0;
    foreach my $col (@cols) {
        push @rec, substr($_, $prev, $col - $prev);
        $prev = $col;
    }
    print join('|', @rec);
    print "\n";
}
+++++++++++++++++++++
#!/usr/bin/perl
my @widths = qw(n n n n n ...); # column lengths
my $regex;
$regex .= "(.{$_})" foreach @widths;
while (<STDIN>) {
    my @rec = /$regex/;
    print join('|', @rec);
    print "\n";
}
++++++++++++++++++++++
#!/usr/bin/perl
my $template = 'a5a20a8a6aa8'; # for unpack
while (<STDIN>) {
    my @rec = unpack($template, $_);
    print join('|', @rec);
    print "\n";
}
The author says the unpack method is best. You probably want to use the
"A" element instead of "a" in the unpack template, so you'll get strings
trimmed on the right.
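For the field layout you posted (lengths 4, 10, 10), the unpack template would look something like this -- a sketch, assuming the three fields sit back to back on each line:

```perl
#!/usr/bin/perl
# Template matching the quoted layout: field_1 is 4 chars,
# field_2 and field_3 are 10 chars each.
# 'A' (rather than 'a') strips trailing whitespace from each field.
my $template = 'A4A10A10';
while (<STDIN>) {
    chomp;
    my @rec = unpack($template, $_);
    print join('|', @rec), "\n";
}
```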
So once you've built your template (or modified one of the other two
scripts appropriately), then you just run the file through Perl (script
name is renoberate.pl):
$ renoberate.pl < original.file > psv.file
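From there, something along these lines should pull the pipe-separated file into mysql -- assuming a table (here called my_table, with columns matching your three fields) already exists:

LOAD DATA LOCAL INFILE 'psv.file'
INTO TABLE my_table
FIELDS TERMINATED BY '|'
LINES TERMINATED BY '\n';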
Obviously, fix the /path/to/perl on the shebang line if/as appropriate.
jed
--
http://s88369986.onlinehome.us/freedomsight/
... it is poor civic hygiene to install technologies that could someday
facilitate a police state. -- Bruce Schneier