[CLUE-Tech] PHP/XML and CDATA Woes: The Answer

Jed S. Baer thag at frii.com
Fri Feb 6 15:03:31 MST 2004


I have discovered the answer to pulling "<![CDATA[ ... ]]>" sections out
of XML using the PHP Expat parser. I guess it was so obvious that it
didn't bear any mention in any of the docs.

What happens is that the declared "character data" handler does, in fact,
handle both CDATA and PCDATA. But since the element handlers don't fire on
the CDATA section markers, it wasn't obvious (to me anyway) that the
character data handler would also receive the data inside a CDATA section.

However, the "default" handler does trip on the CDATA open/close tags,
thus providing an event to use for setting up handling of the enclosed
data.
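
For illustration, here is a minimal sketch of that flag-setting approach. It
is not my PHP code: it uses Python's xml.parsers.expat binding to the same
Expat library, because the handler behavior is easy to check there. The
class and function names (CdataSplitter, split_cdata) are made up for the
example. In PHP the corresponding registration calls would be
xml_set_default_handler() and xml_set_character_data_handler(); whether the
default handler sees the CDATA markers there likely depends on PHP being
built against Expat.

```python
import xml.parsers.expat


class CdataSplitter:
    """Separate CDATA text from ordinary character data using a flag."""

    def __init__(self):
        self.in_cdata = False  # the "indicator" toggled by the default handler
        self.cdata = []        # chunks seen inside <![CDATA[ ... ]]>
        self.pcdata = []       # chunks seen outside any CDATA section

    def on_default(self, data):
        # With no dedicated CDATA-section handlers installed, Expat hands
        # the raw open/close markers to the default handler.
        if data == "<![CDATA[":
            self.in_cdata = True
        elif data == "]]>":
            self.in_cdata = False
        # Anything else (tags, comments, ...) is ignored in this sketch.

    def on_chars(self, data):
        # The character data handler receives both kinds of text;
        # the flag says which bucket the chunk belongs in.
        (self.cdata if self.in_cdata else self.pcdata).append(data)


def split_cdata(xml_text):
    """Return (cdata_text, other_text) for a small XML document."""
    state = CdataSplitter()
    parser = xml.parsers.expat.ParserCreate()
    parser.DefaultHandler = state.on_default
    parser.CharacterDataHandler = state.on_chars
    parser.Parse(xml_text, True)
    # Expat may deliver text in several chunks, so join them at the end.
    return "".join(state.cdata), "".join(state.pcdata)
```

Calling split_cdata("<doc>plain <![CDATA[a < b]]> text</doc>") returns the
CDATA text and the remaining character data as two separate strings.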

I note that getting a parser running this way reminds me of writing RPG,
where one sets "indicators" in various code sections, in order to affect
processing in a different code section (or sections).

I also note that writing a specific parser this way impresses me as being
less than robust, and likely points to using XSLT as a better solution for
XML transformations. I suppose it depends upon what you're planning on
doing with the data, but for changing the display/storage format, I
suspect XSLT wins 99.9% of the time.

jed
-- 
http://s88369986.onlinehome.us/freedomsight/

... it is poor civic hygiene to install technologies that could someday
facilitate a police state. -- Bruce Schneier


