com.ibm.wbi.markuplanguage.html
Class HtmlTokenizer

java.lang.Object
  |
  +--java.io.Reader
        |
        +--java.io.FilterReader
              |
              +--com.ibm.wbi.markuplanguage.html.HtmlTokenizer

public class HtmlTokenizer
extends java.io.FilterReader

Tokenizer class for reading tokens from an HTML input stream. HtmlTokenizer reads tokens such as tags, text, or scripts from an HTML source.


Field Summary
protected  int ibIndex
           
protected  int ibSize
           
protected  int increment
           
protected  char[] inputBuffer
           
protected  int inputBufferSize
           
protected  char[] outputBuffer
           
 
Fields inherited from class java.io.FilterReader
in
 
Fields inherited from class java.io.Reader
lock
 
Constructor Summary
HtmlTokenizer(java.io.InputStream s)
          Deprecated. Use HtmlTokenizer(Reader) instead.
HtmlTokenizer(java.io.Reader r)
          Create an HtmlTokenizer on top of a base Reader.
 
Method Summary
protected  void backupOne()
          Decrement the buffer index of the HtmlTokenizer.
protected  HtmlItem Done()
          Called when the end of input is reached.
protected  int getNextChar()
          Read the next character from the stream.
static void main(java.lang.String[] args)
          Testing code
 HtmlItem nextToken()
          Get the next token from the stream.
protected  HtmlItem ReadTag(char ch)
          Reads the next HTML tag from the input stream.
protected  HtmlItem ReadText(char ch)
          Reads a text section from the input stream.
 
Methods inherited from class java.io.FilterReader
close, mark, markSupported, read, read, ready, reset, skip
 
Methods inherited from class java.io.Reader
read
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

outputBuffer

protected char[] outputBuffer

increment

protected final int increment

inputBuffer

protected char[] inputBuffer

inputBufferSize

protected final int inputBufferSize

ibIndex

protected int ibIndex

ibSize

protected int ibSize

Constructor Detail

HtmlTokenizer

public HtmlTokenizer(java.io.InputStream s)
              throws java.io.UnsupportedEncodingException
Deprecated. Use HtmlTokenizer(Reader) instead.

Create an HtmlTokenizer from an InputStream of bytes, decoding them as ISO-8859-1. This is discouraged because nothing guarantees that the bytes actually represent an ISO-8859-1-encoded String.
Throws:
java.io.UnsupportedEncodingException - If iso-8859-1 encoding is not supported on this platform.
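Because the InputStream constructor is deprecated, callers need a Reader whose encoding they have chosen themselves. The following is a minimal sketch that uses only java.io and java.nio.charset; the class name ReaderFromStream and its helper methods are hypothetical and not part of this package. The resulting Reader is what would be passed to HtmlTokenizer(Reader).

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of com.ibm.wbi.markuplanguage.html.
class ReaderFromStream {
    /** Wraps a byte stream in a Reader that decodes with the given charset. */
    static Reader open(InputStream in, Charset charset) {
        return new InputStreamReader(in, charset);
    }

    /** Drains a Reader into a String; used here only to show the decoded result. */
    static String readAll(Reader r) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[256];
        int n;
        while ((n = r.read(buf)) != -1) {
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] utf8 = "<p>caf\u00e9</p>".getBytes(StandardCharsets.UTF_8);
        Reader reader = open(new ByteArrayInputStream(utf8), StandardCharsets.UTF_8);
        // With the library on the classpath, the reader would instead be handed
        // to the documented constructor: new HtmlTokenizer(reader)
        System.out.println(readAll(reader));
    }
}
```

Decoding the same bytes with StandardCharsets.ISO_8859_1 would yield different characters for the accented letter, which is the failure mode the deprecation note warns about.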

HtmlTokenizer

public HtmlTokenizer(java.io.Reader r)
Create an HtmlTokenizer on top of a base Reader.
Parameters:
r - the Reader.

Method Detail

nextToken

public HtmlItem nextToken()
Get the next token from the stream.
Returns:
The next token.

Done

protected HtmlItem Done()
Called when the end of input is reached.
Returns:
Always null.

ReadText

protected HtmlItem ReadText(char ch)
                     throws java.io.IOException
Reads a text section from the input stream.
Parameters:
ch - (In/Out) If set to 0, an empty string is returned. If non-zero, ch holds the current character after the text is returned.
Returns:
The text from the current stream position up to the next HTML tag or an empty string if parameter ch is 0.

ReadTag

protected HtmlItem ReadTag(char ch)
                    throws java.io.IOException
Reads the next HTML tag from the input stream.
Parameters:
ch - (In/Out) If set to 0 an empty string is returned.
Returns:
The next tag or an empty string if parameter ch is 0.
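The division of labour between ReadText (text up to the next tag) and ReadTag (the tag itself) can be illustrated with a simplified sketch. This is not the HtmlTokenizer implementation: the class SimpleHtmlSplitter is hypothetical, it returns plain strings rather than HtmlItem objects, and it assumes every '<' starts a tag that ends at the next '>', ignoring comments, scripts, and '>' characters inside attribute values.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; the real class returns HtmlItem objects.
class SimpleHtmlSplitter {
    /** Splits the input into alternating tag and text strings. */
    static List<String> split(String html) throws IOException {
        PushbackReader in = new PushbackReader(new StringReader(html), 1);
        List<String> tokens = new ArrayList<>();
        int c;
        while ((c = in.read()) != -1) {
            StringBuilder sb = new StringBuilder();
            sb.append((char) c);
            if (c == '<') {
                // Tag: collect through the closing '>' (or end of input).
                while ((c = in.read()) != -1) {
                    sb.append((char) c);
                    if (c == '>') {
                        break;
                    }
                }
            } else {
                // Text: collect until the next '<', which is pushed back
                // so that the tag branch sees it on the next iteration.
                while ((c = in.read()) != -1) {
                    if (c == '<') {
                        in.unread(c);
                        break;
                    }
                    sb.append((char) c);
                }
            }
            tokens.add(sb.toString());
        }
        return tokens;
    }

    public static void main(String[] args) throws IOException {
        for (String token : split("<p>Hello <b>world</b></p>")) {
            System.out.println("[" + token + "]");
        }
    }
}
```

Pushing the '<' back plays the same role here that a one-character backup plays in a buffered tokenizer: the character that ends a text run is also the first character of the following tag.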

getNextChar

protected int getNextChar()
                   throws java.io.IOException
Read the next character from the stream.
Returns:
The next character as int.
Throws:
java.io.IOException - if an exception occurred while reading from the stream.

backupOne

protected void backupOne()
Decrement the buffer index of the HtmlTokenizer.
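How getNextChar and backupOne might cooperate over the inputBuffer, ibIndex, and ibSize fields can be sketched as follows. This is an assumption about the general technique (buffered reading with a one-step backup), not the actual HtmlTokenizer source; the class BufferedCharSource is hypothetical, and its 4-character buffer is deliberately tiny so that refills occur on short inputs.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical stand-in, not the library class.
class BufferedCharSource extends FilterReader {
    protected final char[] inputBuffer = new char[4]; // small on purpose, to force refills
    protected int ibIndex = 0; // next position to read in inputBuffer
    protected int ibSize = 0;  // number of valid chars currently in inputBuffer

    BufferedCharSource(Reader r) {
        super(r);
    }

    /** Returns the next character as an int, or -1 at end of input. */
    protected int getNextChar() throws IOException {
        if (ibIndex >= ibSize) {
            // Buffer exhausted: refill from the wrapped Reader.
            int n = in.read(inputBuffer, 0, inputBuffer.length);
            if (n <= 0) {
                return -1;
            }
            ibSize = n;
            ibIndex = 0;
        }
        return inputBuffer[ibIndex++];
    }

    /**
     * Steps back one position so the last character is returned again.
     * Only meaningful directly after a successful getNextChar() call.
     */
    protected void backupOne() {
        if (ibIndex > 0) {
            ibIndex--;
        }
    }

    public static void main(String[] args) throws IOException {
        BufferedCharSource src = new BufferedCharSource(new StringReader("<b>hi</b>"));
        int first = src.getNextChar(); // peek at the first character
        src.backupOne();               // and put it back
        System.out.println((char) first + " then " + (char) src.getNextChar());
    }
}
```

Because a successful getNextChar always leaves ibIndex at 1 or higher, a single backupOne immediately afterwards never crosses a refill boundary in this sketch.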

main

public static void main(java.lang.String[] args)
Testing code