[clue-tech] [spam?] text processing howto
Collins Richey
crichey at gmail.com
Wed Oct 6 17:36:39 MDT 2010
On Wed, Oct 6, 2010 at 7:36 AM, Jim Ockers <ockers at ockers.net> wrote:
> Hi David,
>
> [ockers at agadez ~]$ echo "this that1 that2 that3 theotherthing" | awk '{print $NF}'
> theotherthing
>
> [ockers at agadez ~]$ echo "this that1 that2 that3 theotherthing" | awk '{print $2 "," $3 "," $4}'
> that1,that2,that3
>
> NF means "number of fields" and holds the number of whitespace-separated
> fields on the current line, so $NF is the last field. Here "theotherthing"
> is $5. The -F command line option to awk sets the field separator; by
> default awk splits on whitespace.
>
> If you want to do something fancy you should know that awk is very powerful
> and supports "for" loops. If you want to know the loop iterator syntax I
> can suggest that too, but you didn't ask for that.
>
> No perl! :) awk is great for simple text processing.
>
> HTH,
> Jim
>
> --
> Jim Ockers, P.Eng. (ockers at ockers.net)
> Contact info: http://www.ockers.ca/pason.html
>
> David L. Willson wrote:
>
> given lines of the form:
> this that1 that2 that3 theotherthing
>
> where the field separator is any combination of spaces and tabs
> and there may be 0-9 that's
>
> how do I reliably capture theotherthing, and make a packed, comma-separated
> list of all the that's.
>
> This is where I really wish I'd paid more attention in perl class.
> Bonus point for not using any perl... :-)
>
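For the variable number of that's in David's question, the "for" loop Jim
mentions can be sketched roughly as below. This is only a sketch under
assumptions: split_line is a made-up helper name, the last= and thats=
labels are arbitrary, and lines with fewer than two fields are simply
skipped. Default awk field splitting already treats any run of spaces and
tabs as one separator, so no -F is needed.

```shell
# split_line is a hypothetical helper name; it reads lines on stdin.
split_line() {
    awk 'NF >= 2 {
        list = ""
        # Fields 2 .. NF-1 are the "thats"; the loop body never runs
        # when there are none.
        for (i = 2; i < NF; i++)
            list = list (i > 2 ? "," : "") $i
        print "last=" $NF
        print "thats=" list
    }'
}

# Mixed spaces and tabs between fields:
printf 'this \t that1  that2\tthat3   theotherthing\n' | split_line
# last=theotherthing
# thats=that1,that2,that3
```

With no that's at all (a line like "this theotherthing"), the list comes
out empty and last= still holds theotherthing.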
You can do the same thing in two lines of Perl. I won't bore you with
the power of Perl-compatible regex engines, but that style of regex has
been adopted by the likes of PHP, Ruby, and even Windows PowerShell (ugh!!!).
BTW, I would love to see a thorough presentation of AWK as a topic for
a CLUE meeting. AWK is indeed a powerful utility, but I, for one, have
always been too lazy to learn it!
--
Collins Richey
If you fill your heart with regrets of yesterday and the worries
of tomorrow, you have no today to be thankful for.