XSL, CDATA, and ME

I haven’t been concentrating on CSS much lately. I’ve been immersed in the world of XSL and recovering from bad television. I did come across a very confusing problem in XSL the other day, and a co-worker was able to point me to the answer.

XSL allows you to take raw XML and convert it to HTML, PDF files, or maybe even paper airplanes that can fly for miles. However, this chunk of XML that I was working with had HTML in a field that I needed to insert into my new, shiny XHTML pages. Further, the HTML was wrapped in a CDATA section. (CDATA sections allow you to insert code that isn’t pretty without throwing the validators for a loop. And this code wasn’t pretty!)

The resulting pages showed the HTML tags, such as links and br tags, as literal text instead of rendering them. How was I going to take this chunk and turn it into happy, shiny people-friendly elements? I tried the standard tools but the CDATA wrapping was blocking my every effort.

Finally, co-workerX sent me a link to Ned Batchelder’s “CDATA isn’t special” post. The answer is actually pretty simple. At least, I want to think it is.

Ned suggests using an attribute, disable-output-escaping, to make the XSL processor write the HTML tags out as real markup instead of escaped text, and uses XPath to select the text node of the XML element instead of the entire element. In other words, we are stepping past the CDATA wrapper, grabbing the text within it, and displaying it to the world unescaped.

Here’s the final snippet:

<xsl:value-of disable-output-escaping="yes" select="braincells/caffeine_powered/text()"/>
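To see why this works, here is a rough sketch in Python (standard library only, not XSL) of what I think is going on. The sample XML and the link inside it are made up for illustration; only the element names come from the snippet above. It shows that the parser hands back the same text whether the HTML was wrapped in CDATA or escaped with entities, and that it is the default escaping on output that turns the tags into literal text.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: the same field, once wrapped in CDATA and
# once written with escaped entities.
WITH_CDATA = (
    "<braincells><caffeine_powered>"
    '<![CDATA[<a href="x.html">link</a><br/>]]>'
    "</caffeine_powered></braincells>"
)
WITH_ENTITIES = (
    "<braincells><caffeine_powered>"
    '&lt;a href="x.html"&gt;link&lt;/a&gt;&lt;br/&gt;'
    "</caffeine_powered></braincells>"
)


def field_text(xml_source):
    """Return the text node of the caffeine_powered element."""
    root = ET.fromstring(xml_source)
    return root.find("caffeine_powered").text


cdata_text = field_text(WITH_CDATA)
entity_text = field_text(WITH_ENTITIES)

# The CDATA wrapper is gone after parsing; both forms yield the same string.
same_text = cdata_text == entity_text

# Default serialization escapes the text, so the tags show up as literal text.
holder = ET.Element("div")
holder.text = cdata_text
escaped_output = ET.tostring(holder, encoding="unicode")

# Writing the text out untouched is roughly what disable-output-escaping
# asks the XSL processor to do.
raw_output = "<div>" + cdata_text + "</div>"

print(same_text)
print(escaped_output)
print(raw_output)
```

The catch is the same in both worlds: writing text out raw only gives you valid XHTML if the HTML inside the field was well formed to begin with.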

Am I barking up the wrong tree?

Is my logic correct? I hope this is the logic behind this solution. If not, please leave a comment to set the record straight.

7 thoughts on “XSL, CDATA, and ME”

  1. Hey Ted,

    I’ve recently done some playing around with XSL and XML for a CMS I’ve just developed. I don’t, however, consider myself to be an expert on the technology.

    From my experience, I’ve tried to avoid using CDATA. If you can ensure that your HTML is well formed, then it should probably be integrated into your XML.

    This way you can traverse the document tree of your XML source.

    For example:

    Publications: Blah Company

    Page Under Development

    Blah Company
    55 555 555 555

    ...

    I always make sure my XSL outputs my well-formed XML source as XML, not HTML. XHTML is an XML-derived markup language, so if the source is well formed, it may as well stay XML.

    Like so:

    Then I can treat the content part in my XSL like so:

    Notice the line? This will actually take a copy of the XML structure within the “content” element, not just the data – this is how you can retain an XML document once you’ve transformed the file to XHTML.

    The only time I’ve found it necessary to use CDATA is for my doctype declaration – the XSL parser seems to wig out otherwise:

    ]]>

    Hope this is helpful – sorry for being so lengthy.

    PaulH

  2. So, your problem is that you want to clean up the HTML in the CDATA section? Sounds like you’re on the right track, but it would probably be helpful if you showed us the XML file you’re processing. The code snippet means nothing without the XML to put it into context.

  3. Dude, what’s up with your comment form…all the fields are jumbly, speaking of people-friendly elements…

    I never bothered to get much into XSL. I understand it has incredible power, but there doesn’t seem to be a great point in learning it unless you’re doing software stuff. Or is there a way to implement it to the web over a broad range of browsers that I’m not aware of?

  4. I know my comment form is a mess. It’s one of those things I’ve been meaning to fix. I’ve got some buggy WordPress templates and have been waiting for the new version to be released before taking the site back under the knife. Thanks for plugging through.

  5. Ben: I can’t show more code than I did, proprietary, top secret stuff…. Essentially, I’m grabbing XML from an API and plugging the data into HTML. There was one field in one of the APIs that included HTML code wrapped in CDATA sections. I needed to get that information to work as normal HTML, not as literal bracket a href=”” close bracket (spelling it out so that the blog doesn’t make it disappear). The technique I used allowed this to happen.

    Edward: XSL can be used effectively on the web today. You can use it to create mobile pages, pdf (eek) for the web, HTML pages, and much more.

Comments are closed.