Class TokenParser

java.lang.Object
io.sf.jclf.text.TokenParser
All Implemented Interfaces:
Iterator<String>

public final class TokenParser extends Object implements Iterator<String>
A StringTokenizer replacement with an Iterator interface and more flexibility.

Separates a String into tokens separated by sep, but grouped by delim.

Example: with the default constructor, a,b,"c,d",e,f gives 5 tokens.

If the separator contains a space character (' '), consecutive separators are ignored.
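
The grouping rule in the example above can be illustrated with a small stand-alone sketch. It neither uses nor reproduces TokenParser: the class QuoteAwareSplit and its split method are hypothetical names, and the sketch assumes a single-character separator and a single quote-like delimiter that is dropped from the returned tokens.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the TokenParser implementation.
public class QuoteAwareSplit {

    /**
     * Splits line at each occurrence of sep that is not inside a pair of
     * delim characters. The delim characters are dropped from the tokens.
     */
    public static List<String> split(String line, char sep, char delim) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean grouped = false; // true while inside a delimited group
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == delim) {
                grouped = !grouped;             // enter or leave a group
            } else if (c == sep && !grouped) {
                tokens.add(current.toString()); // separator outside a group ends the token
                current.setLength(0);
            } else {
                current.append(c);              // ordinary character, or separator inside a group
            }
        }
        tokens.add(current.toString());         // last token has no trailing separator
        return tokens;
    }

    public static void main(String[] args) {
        List<String> tokens = split("a,b,\"c,d\",e,f", ',', '"');
        System.out.println(tokens.size()); // 5
        System.out.println(tokens);        // [a, b, c,d, e, f]
    }
}
```

The quoted group "c,d" stays together as one token, which is why the input yields 5 tokens rather than 6.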

  • Constructor Summary

    Constructors
    Constructor
    Description
    TokenParser(String line)
    Separates a String into tokens separated by commas, and grouped by double quotes.
    TokenParser(String line, String separator)
    Separates a String into tokens separated by separator, and grouped by double quotes.
    TokenParser(String line, String separator, String delimiters)
    Separates a String into tokens separated by separator, and grouped by delimiters.
    TokenParser(String line, String separator, String delimiters, boolean keepDelimiters)
    Separates a String into tokens separated by separator, and grouped by delimiters.
  • Method Summary

    Modifier and Type
    Method
    Description
    char
    getLastSeparator()
    Get the last separator that was found in the text just before the latest extracted token.
    char
    getNextSeparator()
    Get the next separator used to delimit the latest extracted token.
    boolean
    hasMoreTokens()
    Returns true if the iteration has more tokens.
    boolean
    hasMultipleTokens()
    Tests whether this parser has more than one token (i.e. it has at least one separator and more than one token, including empty tokens).
    boolean
    hasNext()
    Returns true if the iteration has more elements (more tokens).
    String
    next()
    Returns the next token in the iteration.
    String
    nextToken()
    Returns the next token in the iteration.
    void
    remove()
    Operation not supported.
    static String[]
    tokenize(String l, String sep, char[] delim, int init_size)
    Tokenizes a string l using sep as a token separator, while delim[] delimits a single token.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

    Methods inherited from interface java.util.Iterator

    forEachRemaining
  • Constructor Details

    • TokenParser

      public TokenParser(String line)
      Separates a String into tokens separated by commas, and grouped by double quotes.
      Parameters:
      line - the String to separate into tokens
    • TokenParser

      public TokenParser(String line, String separator)
      Separates a String into tokens separated by separator, and grouped by double quotes.
      Parameters:
      line - the String to separate into tokens
      separator - the separator String
      Throws:
      IllegalArgumentException - if the separator contains a null (\0) character.
    • TokenParser

      public TokenParser(String line, String separator, String delimiters)
      Separates a String into tokens separated by separator, and grouped by delimiters.
      Parameters:
      line - the String to separate into tokens
      separator - the separator String
      delimiters - a String of token delimiters. Note that using '(' as a delimiter implies ')', and likewise for the "[]" and "{}" pairs. Delimiters are removed from the tokens before returning them.
      Throws:
      IllegalArgumentException - if the separator contains a null (\0) character.
    • TokenParser

      public TokenParser(String line, String separator, String delimiters, boolean keepDelimiters)
      Separates a String into tokens separated by separator, and grouped by delimiters.
      Parameters:
      line - the String to separate into tokens
      separator - the separator String
      delimiters - a String of token delimiters. Note that using '(' as a delimiter implies ')', and likewise for the "[]" and "{}" pairs.
      keepDelimiters - true to keep the delimiters in the returned tokens, false to remove them.
      Throws:
      IllegalArgumentException - if the separator contains a null (\0) character.
  • Method Details

    • hasNext

      public boolean hasNext()
      Returns true if the iteration has more elements (more tokens).
      Specified by:
      hasNext in interface Iterator<String>
      Returns:
      true if the iteration has more tokens, false otherwise.
    • hasMoreTokens

      public boolean hasMoreTokens()
      Returns true if the iteration has more tokens.

      This method exists for compatibility with StringTokenizer.

      Returns:
      true if the iteration has more tokens, false otherwise.
    • next

      public String next()
      Returns the next token in the iteration.
      Specified by:
      next in interface Iterator<String>
      Returns:
      the next token in the iteration.
      Throws:
      NoSuchElementException - if no more tokens are available in the iterator.
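
      Because the class implements Iterator<String>, the usual hasNext()/next() loop applies. The sketch below shows that pattern against a stand-in iterator obtained from a List, so that it runs without the library; the class IterationSketch and its helper methods are hypothetical names. With the library on the classpath, a TokenParser instance could presumably be passed wherever the stand-in iterator is used.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical illustration of the Iterator<String> contract; uses a stand-in iterator.
public class IterationSketch {

    /** Consumes the iterator and joins its elements with '|'. */
    public static String drain(Iterator<String> it) {
        StringBuilder sb = new StringBuilder();
        while (it.hasNext()) {          // check before every call to next()
            if (sb.length() > 0) {
                sb.append('|');
            }
            sb.append(it.next());
        }
        return sb.toString();
    }

    /** Returns true if calling next() on the exhausted iterator throws NoSuchElementException. */
    public static boolean throwsWhenExhausted(Iterator<String> it) {
        while (it.hasNext()) {
            it.next();
        }
        try {
            it.next();                  // nothing left to return
            return false;
        } catch (NoSuchElementException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(drain(Arrays.asList("a", "b", "c,d").iterator())); // a|b|c,d
        System.out.println(throwsWhenExhausted(Arrays.asList("a").iterator())); // true
    }
}
```

      Guarding each next() call with hasNext() avoids the NoSuchElementException documented above.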
    • nextToken

      public String nextToken()
      Returns the next token in the iteration.

      This method exists for compatibility with StringTokenizer.

      Returns:
      the next token in the iteration.
      Throws:
      NoSuchElementException - if no more tokens are available in the iterator.
    • remove

      public void remove()
      Operation not supported.
      Specified by:
      remove in interface Iterator<String>
    • getLastSeparator

      public char getLastSeparator()
      Get the last separator that was found in the text just before the latest extracted token.

      The current 'last separator' was the 'next separator' for the previous token.

      Returns:
      the latest separator.
      Throws:
      IllegalStateException - if called before next().
    • getNextSeparator

      public char getNextSeparator()
      Get the next separator used to delimit the latest extracted token.
      Returns:
      the next separator.
      Throws:
      IllegalStateException - if no separators are present in the text.
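
      The relationship between the 'last' and 'next' separators can be pictured with a stand-alone sketch. It is not the library's implementation: the class SeparatorTrackingSketch, its Entry type and its scan method are hypothetical, delimiters are left out, and it assumes that each character of the separator string acts as a separator on its own. A missing separator (before the first token or after the final one) is represented here by null, which is a choice of this sketch only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the TokenParser implementation.
public class SeparatorTrackingSketch {

    /** A token with the separator found before it and the one after it (null if none). */
    public static final class Entry {
        public final Character last;
        public final String token;
        public final Character next;

        Entry(Character last, String token, Character next) {
            this.last = last;
            this.token = token;
            this.next = next;
        }
    }

    /** Splits line at any character contained in separators, recording the surrounding separators. */
    public static List<Entry> scan(String line, String separators) {
        List<Entry> out = new ArrayList<>();
        Character last = null;                   // no separator precedes the first token
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (separators.indexOf(c) >= 0) {
                out.add(new Entry(last, current.toString(), c));
                last = c;                        // this 'next' becomes the following token's 'last'
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        out.add(new Entry(last, current.toString(), null)); // final token has no 'next'
        return out;
    }

    public static void main(String[] args) {
        for (Entry e : scan("a,b;c", ",;")) {
            System.out.println(e.last + " " + e.token + " " + e.next);
        }
        // null a ,
        // , b ;
        // ; c null
    }
}
```

      For the middle token b, the 'last' separator is ',' and the 'next' separator is ';', and that ';' in turn becomes the 'last' separator of c.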
    • hasMultipleTokens

      public boolean hasMultipleTokens()
      Tests whether this parser has more than one token (i.e. it has at least one separator and more than one token, including empty tokens).
      Returns:
      true if this parser has more than one token, including empty tokens.
    • tokenize

      public static String[] tokenize(String l, String sep, char[] delim, int init_size)
      Tokenizes a string l using sep as a token separator, while delim[] delimits a single token.

      Example: a,b,"c,d",e,f gives 5 tokens.

      Warning: to preserve backwards-compatibility, this static method does not behave exactly as the (newer and recommended) Iterator version (see comment below).

      Parameters:
      l - the input line to tokenize.
      sep - the separator. For compatibility with legacy applications, this static method treats sep as one separator, which may consist of multiple characters.
      delim - the delimiter (generally {'"'}).
      init_size - a guess of the number of tokens to be found, used to set the initial size of the array.
      Returns:
      the array with the separated tokens, or null if the input line is empty or null.