@mislav
Created May 12, 2010 08:31

Revisions

  1. mislav revised this gist May 12, 2010. 1 changed file with 7 additions and 0 deletions.
    7 changes: 7 additions & 0 deletions html.rb
    @@ -0,0 +1,7 @@
    require 'nokogiri'

    ugly = Nokogiri::HTML(ARGF.read)
    tidy = Nokogiri::XSLT(File.read('tidy.xsl'))
    nice = tidy.transform(ugly).to_html

    puts nice
  2. mislav created this gist May 12, 2010.
    46 changes: 46 additions & 0 deletions tidy.xsl
    @@ -0,0 +1,46 @@
    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" encoding="UTF-8"/>
    <xsl:param name="indent-increment" select="' '"/>

    <xsl:template name="newline">
      <!-- Emit one line feed; a character reference survives re-indentation of this file. -->
      <xsl:text>&#10;</xsl:text>
    </xsl:template>

    <xsl:template match="comment() | processing-instruction()">
      <xsl:param name="indent" select="''"/>
      <xsl:call-template name="newline"/>
      <xsl:value-of select="$indent"/>
      <xsl:copy />
    </xsl:template>

    <xsl:template match="text()">
      <xsl:param name="indent" select="''"/>
      <xsl:call-template name="newline"/>
      <xsl:value-of select="$indent"/>
      <xsl:value-of select="normalize-space(.)"/>
    </xsl:template>

    <xsl:template match="text()[normalize-space(.)='']"/>

    <xsl:template match="*">
      <xsl:param name="indent" select="''"/>
      <xsl:call-template name="newline"/>
      <xsl:value-of select="$indent"/>
      <xsl:choose>
        <xsl:when test="count(child::*) > 0">
          <xsl:copy>
            <xsl:copy-of select="@*"/>
            <xsl:apply-templates select="node()">
              <xsl:with-param name="indent" select="concat($indent, $indent-increment)"/>
            </xsl:apply-templates>
            <xsl:call-template name="newline"/>
            <xsl:value-of select="$indent"/>
          </xsl:copy>
        </xsl:when>
        <xsl:otherwise>
          <xsl:copy-of select="."/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:template>
    </xsl:stylesheet>