[CLUE-Tech] Web Statistics Programs

Jed S. Baer thag at frii.com
Tue Mar 16 20:29:20 MST 2004


On Tue, 16 Mar 2004 18:11:44 -0700
David Anselmi <anselmi at anselmi.us> wrote:

> > Speaking of hacking up Apache logs, does anyone know of a way to tell
> > Perl to split a string, but consider /regex/ characters as quoting
> > characters?
> 
> What do you mean by "quoting characters"?  The split prototype is:
> 
> split /REGEX/, EXPR, LIMIT
> 
> so the first argument is a regex that defines what characters you're 
> splitting on.  But you must know that already.
> 
> I just can't grok what you're asking.

Hey, just look at an Apache access log. OK ... it's like this.

token token [multi-part token] token "quoted token" token

So, anything enclosed by /[\[\]"]/ is a single token (uh, if I've properly
escaped the character class inside the regex, that is). And, the
[ultra]split function would then still split on regex, except it would
recognize quoted strings (using whatever is defined as quoting characters)
as single items. It'd be:

  ultrasplit quotelist (or regex), delim, expr, limit

Worst case, I dig it out of the code for awstats, I guess.
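
For what it's worth, the ultrasplit idea above could be sketched roughly
like this. This is only a sketch under assumptions, not something Perl
ships and not what awstats does: the name ultrasplit comes from the
description above, the quote list is assumed to be a hashref mapping each
opening character to its closing character, the limit argument is left
out, and the delimiter regex is assumed never to match the empty string.
The sample line is just an illustration in Common Log Format.

```perl
use strict;
use warnings;

# Hypothetical ultrasplit (not a Perl builtin).
#   $quotes : hashref, opening quote char => closing quote char (assumed non-empty)
#   $delim  : regex to split on, as with split (must not match "")
#   $expr   : the string to tokenize
sub ultrasplit {
    my ($quotes, $delim, $expr) = @_;
    my $open = join '', map { quotemeta($_) } keys %$quotes;
    my @tokens;
    pos($expr) = 0;
    while (pos($expr) < length($expr)) {
        # Skip over a delimiter sitting at the current position.
        next if $expr =~ /\G(?:$delim)/gc;
        if ($expr =~ /\G([$open])/gc) {
            # Quoted token: take everything up to the matching close char.
            my $close = quotemeta($quotes->{$1});
            $expr =~ /\G(.*?)$close/gc
                or die "unterminated quote near offset " . pos($expr) . "\n";
            push @tokens, $1;
        }
        elsif ($expr =~ /\G((?:(?!$delim)[^$open])+)/gc) {
            # Bare token: runs until the next delimiter or opening quote.
            push @tokens, $1;
        }
    }
    return @tokens;
}

my $line = '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
         . '"GET /apache_pb.gif HTTP/1.0" 200 2326';
my @fields = ultrasplit({ '[' => ']', '"' => '"' }, qr/\s+/, $line);
print "$_\n" for @fields;
```

The quote characters themselves are dropped from the returned tokens, so
the timestamp and the request come back as one field each. Escaped quotes
inside a quoted field are not handled here.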

jed
-- 
http://s88369986.onlinehome.us/freedomsight/

... it is poor civic hygiene to install technologies that could someday
facilitate a police state. -- Bruce Schneier


