[CLUE-Tech] vim's join command on large documents answer

David Anselmi anselmi at americanisp.net
Fri May 10 22:47:08 MDT 2002


Kevin Cullis wrote:

[...]


> :g/[.,a-z]$/j
>
> It sort of works every third try and I have to do it a number of times
> to get a good finished product.  While it works, why the number of times
> versus just once on the whole document?

Let me see if I understand what you want.  You paste text from a browser
window into vim.  That gives you paragraphs made up of several lines.  Each
line starts with several spaces and paragraphs are separated by a blank
line.

You want to end up with one long line where each paragraph was, with a blank
line between paragraphs.  So you need something different than gq as we
originally assumed.

As it happens, the last line of each paragraph ends with a space.  Otherwise
the above command would swallow the blank lines too and you'd join all the
paragraphs into one.  That being the case, here's a shorter version of what
you have:

:g/[^ ]$/j

which means "lines that end with other than a space".
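
To see which lines each pattern picks, here is a small Python sketch.  It
assumes Python's re module treats these two simple patterns the same way vim
does (a bracket expression followed by $), and the sample lines are made up
for illustration.

```python
import re

# Made-up sample: two indented paragraph lines, a last line that ends
# with a space, and the blank separator line.
lines = [
    "   The quick brown fox",
    "   jumps over the lazy dog,",
    "   and then goes home. ",
    "",
]

original = re.compile(r"[.,a-z]$")  # the pattern from the question
shorter = re.compile(r"[^ ]$")      # the shorter version

def selected(pattern, lines):
    """Return the indexes of the lines the pattern would pick for a join."""
    return [i for i, line in enumerate(lines) if pattern.search(line)]

print(selected(original, lines))
print(selected(shorter, lines))
```

Neither pattern picks the line with the trailing space or the blank line
(the bracket expression needs at least one character to match).  The shorter
one also picks lines ending in capitals, digits, or "?", which the original
skips.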

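This needs to be run more than once because the global command first marks
every matching line, then visits each marked line once and does a single
join there.  It doesn't stay on a line and keep joining until the line stops
matching, and a marked line that was just absorbed by a join is skipped.  So
each pass only about halves the number of lines in a paragraph.  Few of the
Unix line mode tools (sed, awk, etc.) will do what you want without a little
programming.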
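
Here is a rough Python model of one pass, to show why a single run isn't
enough.  It is only a sketch: join_pass and join_until_stable are names I
made up, and the join is simplified (it strips the indent of the absorbed
line and inserts one space, ignoring 'joinspaces' and vim's special case for
a line starting with a closing parenthesis).

```python
import re

def join_pass(lines, pattern=r"[^ ]$"):
    """One simplified pass of :g/pattern/j over a list of lines."""
    regex = re.compile(pattern)
    # First scan: mark every line that matches, before any joining happens.
    marked = [bool(regex.search(line)) for line in lines]
    result = []
    i = 0
    while i < len(lines):
        line = lines[i]
        if marked[i] and i + 1 < len(lines):
            # Second scan: a marked line gets exactly one join.  The line
            # below is absorbed, so it is never visited even if it was marked.
            below = lines[i + 1].lstrip()
            if below:
                line = line + " " + below
            i += 2
        else:
            i += 1
        result.append(line)
    return result

def join_until_stable(lines):
    """Repeat join_pass until nothing changes; return (lines, passes used)."""
    passes = 0
    while True:
        joined = join_pass(lines)
        if joined == lines:
            return lines, passes
        lines, passes = joined, passes + 1

# Made-up input: a four-line paragraph, a blank line, a two-line paragraph.
sample = ["  one", "  two", "  three", "  four ", "", "  five", "  six "]
final, passes = join_until_stable(sample)
print(passes)  # 2
print(final)
```

The four-line paragraph needs two passes: the first pass joins lines in
pairs, the second joins the pairs.  Longer paragraphs need more passes.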

Here's something closer, I think:

:g /[^ ]$/ /[^ ]$/ j | /[^ ]$/ j

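This says, on each line that ends with other than a space, do two joins.
Strictly speaking, a /[^ ]$/ in front of j is an address, so each join
happens at the next line (searching forward from the current one) that ends
with other than a space, not necessarily on the current line.  I'm a little
fuzzy on why this works so well; it's getting late.  If anyone wants to
explain, I'm all ears.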

Note that the downside of this is that if a paragraph happens to end on a
line without a space, you'll lose the blank line afterwards.  And it's a lot
of mung to type.

A perhaps better way is gq.  Just do gggqG (or whatever we said earlier) to
run the gq command on each paragraph in the file.  The trick is that first
you use :set tw=<large num> (where <large num> is a number larger than the
character count of any paragraph).  Then you get your reformatting but each
paragraph is turned into one line.
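
The idea behind the large width can be sketched in Python with a plain
greedy reflow.  This is not how vim implements gq, just an illustration
under that assumption; reflow is a made-up helper, and unlike gq it drops
the indent.

```python
def reflow(paragraph, width):
    """Greedy reflow: pack words onto lines of at most `width` characters."""
    lines, current = [], ""
    for word in paragraph.split():
        candidate = word if not current else current + " " + word
        if current and len(candidate) > width:
            # The word doesn't fit, so finish this line and start a new one.
            lines.append(current)
            current = word
        else:
            current = candidate
    if current:
        lines.append(current)
    return lines

# Made-up paragraph as pasted: indented, broken across three lines.
pasted = "   The quick brown\n   fox jumps over\n   the lazy dog \n"

print(reflow(pasted, 20))    # a normal width gives several lines
print(reflow(pasted, 1000))  # a width larger than the paragraph gives one
```

Once the width is larger than the whole paragraph, no word ever fails to
fit, so everything lands on one line.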

Finally, both of these leave the final (one-line) paragraphs indented as the
original, but you already know how to fix that.

I hope all that helps (whew!).  If not, maybe you need to restate the problem
more clearly.

Dave




