com.ibm.wbi.markuplanguage.html
Class HtmlTokenizer
java.lang.Object
  |
  +--java.io.Reader
        |
        +--java.io.FilterReader
              |
              +--com.ibm.wbi.markuplanguage.html.HtmlTokenizer
- public class HtmlTokenizer
- extends java.io.FilterReader
Tokenizer class for reading tokens from an HTML input stream.
HtmlTokenizer reads tokens such as tags, text, or scripts from an HTML source.
Fields inherited from class java.io.FilterReader
in
Fields inherited from class java.io.Reader
lock
Constructor Summary
HtmlTokenizer(java.io.InputStream s)
Deprecated. Use HtmlTokenizer(Reader) instead.
HtmlTokenizer(java.io.Reader r)
Create the HtmlTokenizer off a base Reader.
Method Summary
protected void backupOne()
Decrement the buffer index of the HtmlTokenizer.
protected HtmlItem Done()
Called if end of input reached.
protected int getNextChar()
Read the next character from the stream.
static void main(java.lang.String[] args)
Testing code
HtmlItem nextToken()
Get the next token from the stream.
protected HtmlItem ReadTag(char ch)
Reads the next HTML tag from the input stream.
protected HtmlItem ReadText(char ch)
Reads a text section from the input stream.
Methods inherited from class java.io.FilterReader
close, mark, markSupported, read, read, ready, reset, skip
Methods inherited from class java.io.Reader
read
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Field Detail
outputBuffer
protected char[] outputBuffer
increment
protected final int increment
inputBuffer
protected char[] inputBuffer
inputBufferSize
protected final int inputBufferSize
ibIndex
protected int ibIndex
ibSize
protected int ibSize
Constructor Detail
HtmlTokenizer
public HtmlTokenizer(java.io.InputStream s)
throws java.io.UnsupportedEncodingException
- Deprecated. Use HtmlTokenizer(Reader) instead.
- Create an HtmlTokenizer from an InputStream of bytes using the
iso-8859-1 encoding. This is a bad idea, because there is no
way to guarantee the bytes represent an iso-8859-1-encoded
String.
- Throws:
java.io.UnsupportedEncodingException - If iso-8859-1 encoding
is not supported on this platform.
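The deprecation note above recommends the Reader constructor. A minimal sketch of how a caller might decode the bytes with a known charset before constructing the tokenizer, using only java.io and java.nio.charset. The class CharsetAwareReaders and its methods are hypothetical names for this illustration, and the HtmlTokenizer call appears only in a comment because the WBI library is assumed not to be on the classpath.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;

// Hypothetical helper for illustration; not part of the documented API.
class CharsetAwareReaders {

    // Wrap a byte stream in a Reader that decodes with an explicitly chosen charset,
    // instead of relying on a fixed iso-8859-1 assumption.
    static Reader open(InputStream in, Charset charset) {
        return new InputStreamReader(in, charset);
    }

    // Drain a Reader into a String; used here only to inspect the decoded result.
    static String readAll(Reader r) {
        try {
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[1024];
            int n;
            while ((n = r.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Assumed usage with the documented constructor (not compiled here):
    //   Reader r = CharsetAwareReaders.open(stream, StandardCharsets.UTF_8);
    //   HtmlTokenizer tok = new HtmlTokenizer(r);
}
```

Decoding UTF-8 bytes as iso-8859-1 does not fail; it silently turns each multi-byte character into several wrong characters, which is why the charset should be chosen by the caller.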
HtmlTokenizer
public HtmlTokenizer(java.io.Reader r)
- Create the HtmlTokenizer off a base Reader.
- Parameters:
r - the Reader.
Method Detail
nextToken
public HtmlItem nextToken()
- Get the next token from the stream.
- Returns:
- The next token.
Done
protected HtmlItem Done()
- Called if end of input reached.
- Returns:
- Always null.
ReadText
protected HtmlItem ReadText(char ch)
throws java.io.IOException
- Reads a text section from the input stream.
- Parameters:
ch - (In/Out) If set to 0, an empty string is returned. If nonzero, ch holds
the current character after the text was returned.
- Returns:
- The text from the current stream position up to the next HTML tag or an empty string if
parameter ch is 0.
ReadTag
protected HtmlItem ReadTag(char ch)
throws java.io.IOException
- Reads the next HTML tag from the input stream.
- Parameters:
ch - (In/Out) If set to 0, an empty string is returned.
- Returns:
- The next tag or an empty string if parameter ch is 0.
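ReadText is documented as returning the text up to the next HTML tag, and ReadTag as returning the next tag. The following sketch illustrates that division of input on a plain String. It is a hypothetical stand-in, not the HtmlTokenizer implementation: HtmlItem is not described on this page, so tokens are returned as strings, and the class name TagTextSplitSketch is an assumption. Comments, scripts, and quoted attribute values containing '>' are ignored here and may be handled differently by the real class.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of splitting markup at tag boundaries.
class TagTextSplitSketch {

    // Split markup into tag tokens ("<...>") and the text runs between them.
    static List<String> tokens(String html) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < html.length()) {
            int stop;
            if (html.charAt(i) == '<') {
                // Tag: runs through the next '>' (or to end of input if unterminated).
                int end = html.indexOf('>', i);
                stop = (end == -1) ? html.length() : end + 1;
            } else {
                // Text: runs up to, but not including, the next '<'.
                int next = html.indexOf('<', i);
                stop = (next == -1) ? html.length() : next;
            }
            out.add(html.substring(i, stop));
            i = stop;
        }
        return out;
    }
}
```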
getNextChar
protected int getNextChar()
throws java.io.IOException
- Read the next character from the stream.
- Returns:
- The next character as int.
- Throws:
java.io.IOException - if an exception occurred while reading from the stream.
backupOne
protected void backupOne()
- Decrement the buffer index of the HtmlTokenizer.
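The fields inputBuffer, inputBufferSize, ibIndex, and ibSize suggest a buffered reader with a one-character pushback. The sketch below shows one way getNextChar() and backupOne() could cooperate under that assumption. BackupReaderSketch is a hypothetical class, the buffer size of 8 is chosen only to force refills in a short example, and none of this is taken from the HtmlTokenizer source.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical stand-in; field names mirror the documented ones, behavior is assumed.
class BackupReaderSketch extends FilterReader {
    protected final int inputBufferSize = 8;             // deliberately tiny for the demo
    protected char[] inputBuffer = new char[inputBufferSize];
    protected int ibIndex = 0;                           // index of the next unread char
    protected int ibSize = 0;                            // number of valid chars in inputBuffer

    BackupReaderSketch(Reader r) {
        super(r);
    }

    // Return the next character, refilling the buffer when it is exhausted; -1 at end of input.
    protected int getNextChar() throws IOException {
        if (ibIndex >= ibSize) {
            int n = in.read(inputBuffer, 0, inputBufferSize);
            if (n <= 0) {
                return -1;
            }
            ibSize = n;
            ibIndex = 0;
        }
        return inputBuffer[ibIndex++];
    }

    // Step back one character so the next getNextChar() returns it again.
    // Only meaningful directly after a successful getNextChar().
    protected void backupOne() {
        if (ibIndex > 0) {
            ibIndex--;
        }
    }

    // Demo: peek at the first character, back up, then read everything.
    static String peekThenReadAll(String s) {
        try (BackupReaderSketch r = new BackupReaderSketch(new StringReader(s))) {
            int first = r.getNextChar();
            if (first == -1) {
                return "";
            }
            r.backupOne();
            StringBuilder out = new StringBuilder();
            out.append((char) first).append(':');
            int c;
            while ((c = r.getNextChar()) != -1) {
                out.append((char) c);
            }
            return out.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A tokenizer can use this pattern to inspect one character, decide whether it starts a tag or text, and then hand the same character to the routine that reads the token.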
main
public static void main(java.lang.String[] args)
- Testing code