nu.xom.samples
Class StreamingXHTMLPurifier

java.lang.Object
  extended by nu.xom.NodeFactory
      extended by nu.xom.samples.StreamingXHTMLPurifier

public class StreamingXHTMLPurifier
extends nu.xom.NodeFactory

Demonstrates a custom NodeFactory that strips out all non-XHTML elements. It's easy enough to drop any elements that are not in the XHTML namespace. However, in the case of SVG, MathML, and most other applications, you'll want to remove the content of these elements as well. I'll assume that the namespace for text is the same as the namespace of the parent element. (This is not at all clear from the namespaces specification, but it makes sense in many cases.) To track the nearest namespace for non-elements, startMakingElement() pushes the element's namespace onto a stack and finishMakingElement() pops it off. Peeking at the top of the stack tells you what namespace the nearest element uses. This is modeled after Example 8-9 in Processing XML with Java.
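The namespace-stack idea described above can be sketched with the SAX API that ships with the JDK, so it runs without XOM on the classpath. This is only an illustration of the technique, not the source of this class: the class name NamespaceStackSketch and the helper xhtmlTextOf are hypothetical, and the value given for XHTML_NAMESPACE is the standard XHTML namespace URI, assumed here because this page does not show the constant's value.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: keeps only text whose nearest enclosing element
// is in the XHTML namespace. Uses plain SAX, not XOM.
final class NamespaceStackSketch extends DefaultHandler {

    // Assumption: the standard XHTML namespace URI.
    static final String XHTML_NAMESPACE = "http://www.w3.org/1999/xhtml";

    // Namespace URIs of the currently open elements, innermost on top.
    private final Deque<String> namespaces = new ArrayDeque<>();
    private final StringBuilder keptText = new StringBuilder();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // Element start: push its namespace ("" when it has none).
        namespaces.push(uri);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // Element end: pop the namespace pushed by the matching start.
        namespaces.pop();
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Text is treated as belonging to the namespace of its parent element.
        if (XHTML_NAMESPACE.equals(namespaces.peek())) {
            keptText.append(ch, start, length);
        }
    }

    // Parses the XML string and returns the concatenated text that survived.
    static String xhtmlTextOf(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            NamespaceStackSketch handler = new NamespaceStackSketch();
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
            return handler.keptText.toString();
        } catch (Exception ex) {
            throw new IllegalStateException(ex);
        }
    }
}
```

In a NodeFactory subclass the same push, pop, and peek steps would presumably live in startMakingElement(), finishMakingElement(), and makeText() respectively.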

Version:
1.0
Author:
Elliotte Rusty Harold

Field Summary
static java.lang.String XHTML_NAMESPACE
           
 
Constructor Summary
StreamingXHTMLPurifier()
           
 
Method Summary
 nu.xom.Nodes finishMakingElement(nu.xom.Element element)
           Signals the end of an element.
static void main(java.lang.String[] args)
           
 nu.xom.Nodes makeAttribute(java.lang.String name, java.lang.String URI, java.lang.String value, nu.xom.Attribute.Type type)
           Returns a new Nodes object containing an attribute in the specified namespace with the specified name and type.
 nu.xom.Nodes makeComment(java.lang.String data)
           Returns a new Nodes object containing a comment with the specified text.
 nu.xom.Nodes makeDocType(java.lang.String rootElementName, java.lang.String publicID, java.lang.String systemID)
           Returns a new Nodes object containing a DocType object with the specified root element name, system ID, and public ID.
 nu.xom.Nodes makeProcessingInstruction(java.lang.String target, java.lang.String data)
           Returns a new Nodes object containing a new ProcessingInstruction object with the specified target and data.
 nu.xom.Nodes makeText(java.lang.String data)
           Returns a new Nodes object containing a text node with the specified content.
 nu.xom.Element startMakingElement(java.lang.String name, java.lang.String namespace)
           Creates a new Element in the specified namespace with the specified name.
 
Methods inherited from class nu.xom.NodeFactory
finishMakingDocument, makeRootElement, startMakingDocument
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

XHTML_NAMESPACE

public static final java.lang.String XHTML_NAMESPACE
See Also:
Constant Field Values
Constructor Detail

StreamingXHTMLPurifier

public StreamingXHTMLPurifier()
Method Detail

makeText

public nu.xom.Nodes makeText(java.lang.String data)
Description copied from class: nu.xom.NodeFactory

Returns a new Nodes object containing a text node with the specified content.

Subclasses may change the content or other characteristics of the text returned. Subclasses may also change the nodes returned from this method. They may return a Nodes object containing any number of nodes which are added or appended to the current parent node. This Nodes object must not contain any Document nodes. All of the nodes returned must be parentless. Subclasses may return an empty Nodes to indicate the text should not be included in the finished document.

Overrides:
makeText in class nu.xom.NodeFactory
Parameters:
data - the complete text content of the node
Returns:
the nodes to be added to the tree

makeComment

public nu.xom.Nodes makeComment(java.lang.String data)
Description copied from class: nu.xom.NodeFactory

Returns a new Nodes object containing a comment with the specified text.

Subclasses may change the content or other characteristics of the comment returned. Subclasses may change the nodes returned from this method. They may return a Nodes object containing any number of children and attributes which are appended and added to the current parent element. This Nodes object should not contain any Document objects. All of the nodes returned must be parentless. Subclasses may return an empty Nodes to indicate the comment should not be included in the finished document.

Overrides:
makeComment in class nu.xom.NodeFactory
Parameters:
data - the complete text content of the comment
Returns:
the nodes to be added to the tree

startMakingElement

public nu.xom.Element startMakingElement(java.lang.String name,
                                         java.lang.String namespace)
Description copied from class: nu.xom.NodeFactory

Creates a new Element in the specified namespace with the specified name.

Subclasses may change the name, namespace, content, or other characteristics of the Element returned. Subclasses may return null to indicate the Element should not be created. However, doing so will only remove the element's start-tag and end-tag from the result tree. Any content inside the element will be attached to the element's parent by default, unless it too is filtered. To remove an entire element, return an empty Nodes object from the finishMakingElement() method.
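The difference between dropping only an element's tags and dropping the element together with its content can be modeled outside XOM. The following is a rough sketch using the JDK's SAX parser, not XOM's own mechanism: the class UnwrapVersusPrune and its filter method are hypothetical names, the XHTML namespace URI is an assumption, and the output is deliberately simplified (no attributes, namespace declarations, or escaping). With prune set to false it behaves like returning null from startMakingElement(), so content moves up to the parent. With prune set to true it behaves like returning an empty Nodes from finishMakingElement(), so the whole subtree disappears.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical model of the two filtering behaviors; plain SAX, not XOM.
final class UnwrapVersusPrune extends DefaultHandler {

    // Assumption: the standard XHTML namespace URI.
    static final String XHTML_NAMESPACE = "http://www.w3.org/1999/xhtml";

    private final boolean prune;  // true: drop whole subtree; false: drop tags only
    private final StringBuilder out = new StringBuilder();
    private int foreignDepth = 0; // number of open non-XHTML ancestors

    UnwrapVersusPrune(boolean prune) {
        this.prune = prune;
    }

    // In prune mode everything inside a non-XHTML element is suppressed.
    private boolean suppressed() {
        return prune && foreignDepth > 0;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (!XHTML_NAMESPACE.equals(uri)) {
            foreignDepth++;       // the foreign element's tags are never written
        } else if (!suppressed()) {
            out.append('<').append(localName).append('>');
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (!XHTML_NAMESPACE.equals(uri)) {
            foreignDepth--;
        } else if (!suppressed()) {
            out.append("</").append(localName).append('>');
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (!suppressed()) {
            out.append(ch, start, length); // simplified: no escaping of < or &
        }
    }

    // Parses the XML string and returns the simplified filtered markup.
    static String filter(String xml, boolean prune) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            UnwrapVersusPrune handler = new UnwrapVersusPrune(prune);
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
            return handler.out.toString();
        } catch (Exception ex) {
            throw new IllegalStateException(ex);
        }
    }
}
```

For a purifier that must also discard SVG or MathML content, the pruning behavior is the one that matches the class description.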

Overrides:
startMakingElement in class nu.xom.NodeFactory
Parameters:
name - the qualified name of the element
namespace - the namespace URI of the element
Returns:
the new element

finishMakingElement

public nu.xom.Nodes finishMakingElement(nu.xom.Element element)
Description copied from class: nu.xom.NodeFactory

Signals the end of an element. This method should return the Nodes to be added to the tree. They need not contain the Element that was passed to this method, though most often they will. By default the Nodes returned contain only the built element. However, subclasses may return a list containing any number of nodes, all of which will be added to the tree at the current position in the order given by the list, subject to the usual well-formedness constraints. For instance, the list should not contain a DocType object unless the element is the root element and the document does not already have a DocType. All of the nodes returned must be parentless. If this method returns an empty list, then the element (including all its contents) is not included in the finished document.

To process one element at a time, override this method in a subclass so that it functions as a callback. When you're done processing the Element, return an empty list so that it will be removed from the tree and garbage collected. Be careful not to return an empty list for the root element, though. That is, when the element passed to this method is the root element, the list returned must contain exactly one Element object. The simplest way to check this is to test whether element.getParent() instanceof Document.

Do not detach element or any of its ancestors while inside this method. Doing so can royally muck up the build.

Overrides:
finishMakingElement in class nu.xom.NodeFactory
Parameters:
element - the finished Element
Returns:
the nodes to be added to the tree

makeDocType

public nu.xom.Nodes makeDocType(java.lang.String rootElementName,
                                java.lang.String publicID,
                                java.lang.String systemID)
Description copied from class: nu.xom.NodeFactory

Returns a new Nodes object containing a DocType object with the specified root element name, system ID, and public ID.

Subclasses may change the root element name, public ID, system ID, or other characteristics of the DocType returned. Subclasses may change the nodes returned from this method. They may return a Nodes object containing any number of comments and processing instructions which are appended to the current parent node. This Nodes object may not contain any Document, Element, Attribute, or Text objects. All of the nodes returned must be parentless. Subclasses may return an empty Nodes to indicate the DocType should not be included in the finished document.

Overrides:
makeDocType in class nu.xom.NodeFactory
Parameters:
rootElementName - the declared, qualified name for the root element
publicID - the public ID of the external DTD subset
systemID - the URL of the external DTD subset
Returns:
the nodes to be added to the document

makeProcessingInstruction

public nu.xom.Nodes makeProcessingInstruction(java.lang.String target,
                                              java.lang.String data)
Description copied from class: nu.xom.NodeFactory

Returns a new Nodes object containing a new ProcessingInstruction object with the specified target and data.

Subclasses may change the target, data, or other characteristics of the ProcessingInstruction returned. Subclasses may change the nodes returned from this method. They may return a Nodes object containing any number of nodes which are added or appended to the current parent node. This Nodes object must not contain any Document nodes. If the processing instruction appears in the prolog or epilog of the document, then it must also not contain any Element, Attribute, or Text objects. All of the nodes returned must be parentless. Subclasses may return an empty Nodes to indicate the processing instruction should not be included in the finished document.

Overrides:
makeProcessingInstruction in class nu.xom.NodeFactory
Parameters:
target - the target of the processing instruction
data - the data of the processing instruction
Returns:
the nodes to be added to the tree

makeAttribute

public nu.xom.Nodes makeAttribute(java.lang.String name,
                                  java.lang.String URI,
                                  java.lang.String value,
                                  nu.xom.Attribute.Type type)
Description copied from class: nu.xom.NodeFactory

Returns a new Nodes object containing an attribute in the specified namespace with the specified name and type.

Subclasses may change the nodes returned from this method. They may return a Nodes object containing any number of children and attributes which are appended and added to the current parent element. This Nodes object may not contain any Document objects. All of the nodes returned must be parentless. Subclasses may return an empty Nodes to indicate the attribute should not be created.

Overrides:
makeAttribute in class nu.xom.NodeFactory
Parameters:
name - the prefixed name of the attribute
URI - the namespace URI
value - the attribute value
type - the attribute type
Returns:
the nodes to be added to the tree

main

public static void main(java.lang.String[] args)


Copyright 2002-2005 Elliotte Rusty Harold
elharo@metalab.unc.edu