com.aptana.ide.lexer
Class AbstractLexer

java.lang.Object
  extended by com.aptana.ide.lexer.AbstractLexer
All Implemented Interfaces:
ILexer
Direct Known Subclasses:
AsciiLexer, CodeBasedLexer, MatcherLexer, UnicodeLexer

public abstract class AbstractLexer
extends java.lang.Object
implements ILexer

Author:
Kevin Lindsey

Field Summary
protected  int currentOffset
          Current offset within the source code where the next match will begin
protected  int lastMatchedTokenIndex
          The token index of the last match.
 
Constructor Summary
AbstractLexer()
          Create a new instance of Lexer
 
Method Summary
 void addLanguage(ITokenList tokens)
          Add the given language tokens to this lexer
protected  Lexeme createLexeme(IToken token, java.lang.String text, int offset)
          Create a new lexeme.
abstract  Range find(java.lang.String groupName)
          Use the currently active language and group to locate the first position that matches that group.
protected  Lexeme getCachedLexeme()
          Get a lexeme from the lexeme cache
abstract  char getCharacterAt(int offset)
          Return the character at the given offset
 int getCurrentOffset()
          Return the current offset within the source text where the next token will be matched
 ITokenList getCurrentTokenList()
          Get the token list that is currently active in this lexer
 int getEOFOffset()
          Returns the offset that is considered the EOF
 java.lang.String getGroup()
          Returns the name of the group that is currently active
 java.lang.String getLanguage()
          Get the language type this lexer is targeting
 java.lang.String[] getLanguages()
          Get all languages that are contained in this Lexer
 Lexeme getNextLexeme()
          Get the next token from the source text
abstract  java.lang.String getSource()
          Get the text being processed by this lexer
protected abstract  char[] getSourceCharacters()
          Get the character array for the current source
abstract  int getSourceLength()
          Return the number of characters in this lexer's source code
 ITokenList getTokenList(java.lang.String language)
          Get the token list for the specified language
 boolean isEOS()
          Determine if we've processed all the source text
protected abstract  int match()
          Attempt to match a token at the current offset
 void seal()
          Build all of the token regexes and build the lexer group state machines
 void setCurrentOffset(int offset)
          Set the current offset within the source text where the next token will be matched
 void setEOFOffset(int offset)
          Set the offset that is considered the end of the input stream.
 void setGroup(java.lang.String groupName)
          Set the currently active lexer group
 void setIgnoreSet(java.lang.String language, int[] set)
          Set the list of Token types to skip when scanning the source text
 void setLanguage(java.lang.String language)
          Set the language name this lexer targets
 void setLanguageAndGroup(java.lang.String language, java.lang.String group)
          Set the current language and the group within that language
 void setLexemeCache(LexemeList lexemeCache)
          Set the lexeme cache
 void setLexerState(java.lang.String group, char[] source, int offset, LexemeList cache)
          Set properties of the lexer to prepare for rescanning of modified source
 void setLexerState(java.lang.String group, int offset)
          Set properties of the lexer to prepare for rescanning of unmodified source
abstract  void setSource(char[] value)
          Set the text to be processed by this lexer
 void setSource(java.lang.String value)
          Set the text to be processed by this lexer
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface com.aptana.ide.lexer.ILexer
getSourceUnsafe
 

Field Detail

lastMatchedTokenIndex

protected int lastMatchedTokenIndex
The token index of the last match. This value will be -1 if getNextLexeme did not find a new lexeme


currentOffset

protected int currentOffset
Current offset within the source code where the next match will begin

Constructor Detail

AbstractLexer

public AbstractLexer()
Create a new instance of Lexer

Method Detail

setLexemeCache

public void setLexemeCache(LexemeList lexemeCache)
Description copied from interface: ILexer
Set the lexeme cache

Specified by:
setLexemeCache in interface ILexer
Parameters:
lexemeCache - The list of previously scanned lexemes
See Also:
ILexer.setLexemeCache(com.aptana.ide.lexer.LexemeList)

getCachedLexeme

protected Lexeme getCachedLexeme()
Get a lexeme from the lexeme cache

Returns:
Lexeme

getCharacterAt

public abstract char getCharacterAt(int offset)
Description copied from interface: ILexer
Return the character at the given offset

Specified by:
getCharacterAt in interface ILexer
Parameters:
offset - The offset within the source
Returns:
The character at the given offset
See Also:
ILexer.getCharacterAt(int)

getCurrentOffset

public int getCurrentOffset()
Description copied from interface: ILexer
Return the current offset within the source text where the next token will be matched

Specified by:
getCurrentOffset in interface ILexer
Returns:
The current offset within the source text
See Also:
ILexer.getCurrentOffset()

getCurrentTokenList

public ITokenList getCurrentTokenList()
Description copied from interface: ILexer
Get the token list that is currently active in this lexer

Specified by:
getCurrentTokenList in interface ILexer
Returns:
ITokenList
See Also:
ILexer.getCurrentTokenList()

setCurrentOffset

public void setCurrentOffset(int offset)
Description copied from interface: ILexer
Set the current offset within the source text where the next token will be matched

Specified by:
setCurrentOffset in interface ILexer
Parameters:
offset - The new offset value
See Also:
ILexer.setCurrentOffset(int)

getEOFOffset

public int getEOFOffset()
Description copied from interface: ILexer
Returns the offset that is considered the EOF

Specified by:
getEOFOffset in interface ILexer
Returns:
The EOF offset
See Also:
ILexer.getEOFOffset()

setEOFOffset

public void setEOFOffset(int offset)
Description copied from interface: ILexer
Set the offset that is considered the end of the input stream. The offset will not be included in the stream

Specified by:
setEOFOffset in interface ILexer
Parameters:
offset - The new EOF offset for the current stream of text being processed by this lexer
See Also:
ILexer.setEOFOffset(int)
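
The exclusive nature of the EOF offset can be illustrated with a minimal sketch. EofOffsetSketch and visibleText are hypothetical names written for this example and are not part of com.aptana.ide.lexer; the sketch only assumes that a scan stops before the character at the EOF offset.

```java
public class EofOffsetSketch {
    /**
     * Hypothetical helper: collects the characters a scan would see
     * when eofOffset is treated as an exclusive upper bound.
     */
    static String visibleText(char[] source, int eofOffset) {
        StringBuilder visible = new StringBuilder();
        // Never read past the real end of the array, even if eofOffset is larger.
        int limit = Math.min(eofOffset, source.length);
        for (int offset = 0; offset < limit; offset++) {
            visible.append(source[offset]);
        }
        return visible.toString();
    }

    public static void main(String[] args) {
        // Offsets 0..2 are visible; the character at offset 3 is not.
        System.out.println(visibleText("var x;".toCharArray(), 3)); // prints var
    }
}
```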

getGroup

public java.lang.String getGroup()
Description copied from interface: ILexer
Returns the name of the group that is currently active

Specified by:
getGroup in interface ILexer
Returns:
Returns the current group name
See Also:
ILexer.getGroup()

setGroup

public void setGroup(java.lang.String groupName)
              throws LexerException
Description copied from interface: ILexer
Set the currently active lexer group

Specified by:
setGroup in interface ILexer
Parameters:
groupName - The name of the group to activate
Throws:
LexerException
See Also:
ILexer.setGroup(java.lang.String)

setIgnoreSet

public void setIgnoreSet(java.lang.String language,
                         int[] set)
Description copied from interface: ILexer
Set the list of Token types to skip when scanning the source text

Specified by:
setIgnoreSet in interface ILexer
Parameters:
language - The target language to which this ignore set applies
set - The set of token types to skip
See Also:
ILexer.setIgnoreSet(java.lang.String, int[])
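
The effect of an ignore set can be sketched as a filter over scanned token types. Everything below is an assumption made for illustration: the type constants, the IgnoreSetSketch class, and the sorted-array lookup are hypothetical, and the real lexer may store and consult its ignore set differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class IgnoreSetSketch {
    // Hypothetical token type indices; the real values come from the token list.
    static final int WHITESPACE = 0;
    static final int IDENTIFIER = 1;
    static final int COMMENT = 2;

    /** Returns the scanned types that are not listed in ignoreSet. */
    static List<Integer> filter(int[] scannedTypes, int[] ignoreSet) {
        // Sort a copy so the caller's array is left untouched.
        int[] sorted = ignoreSet.clone();
        Arrays.sort(sorted);
        List<Integer> kept = new ArrayList<>();
        for (int type : scannedTypes) {
            // binarySearch returns a negative value when the type is absent.
            if (Arrays.binarySearch(sorted, type) < 0) {
                kept.add(type);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        int[] scanned = { IDENTIFIER, WHITESPACE, COMMENT, IDENTIFIER };
        int[] ignore = { COMMENT, WHITESPACE };
        System.out.println(filter(scanned, ignore)); // prints [1, 1]
    }
}
```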

getLanguage

public java.lang.String getLanguage()
Description copied from interface: ILexer
Get the language type this lexer is targeting

Specified by:
getLanguage in interface ILexer
Returns:
The language this lexer is targeting
See Also:
ILexer.getLanguage()

getLanguages

public java.lang.String[] getLanguages()
Description copied from interface: ILexer
Get all languages that are contained in this Lexer

Specified by:
getLanguages in interface ILexer
Returns:
Returns a string array of all languages in this lexer
See Also:
ILexer.getLanguages()

setLanguage

public void setLanguage(java.lang.String language)
                 throws LexerException
Description copied from interface: ILexer
Set the language name this lexer targets

Specified by:
setLanguage in interface ILexer
Parameters:
language - The language this lexer will target
Throws:
LexerException
See Also:
ILexer.setLanguage(java.lang.String)

setLanguageAndGroup

public void setLanguageAndGroup(java.lang.String language,
                                java.lang.String group)
                         throws LexerException
Description copied from interface: ILexer
Set the current language and the group within that language

Specified by:
setLanguageAndGroup in interface ILexer
Parameters:
language - The new language
group - The group within the language
Throws:
LexerException
See Also:
ILexer.setLanguageAndGroup(java.lang.String, java.lang.String)

getSource

public abstract java.lang.String getSource()
Description copied from interface: ILexer
Get the text being processed by this lexer

Specified by:
getSource in interface ILexer
Returns:
The source text
See Also:
ILexer.getSource()

getSourceCharacters

protected abstract char[] getSourceCharacters()
Get the character array for the current source

Returns:
char[]

getSourceLength

public abstract int getSourceLength()
Description copied from interface: ILexer
Return the number of characters in this lexer's source code

Specified by:
getSourceLength in interface ILexer
Returns:
Returns the source code character count
See Also:
ILexer.getSourceLength()

setSource

public abstract void setSource(char[] value)
Description copied from interface: ILexer
Set the text to be processed by this lexer

Specified by:
setSource in interface ILexer
Parameters:
value - The new source text
See Also:
ILexer.setSource(char[])

setSource

public void setSource(java.lang.String value)
Description copied from interface: ILexer
Set the text to be processed by this lexer

Specified by:
setSource in interface ILexer
Parameters:
value - The new source text
See Also:
ILexer.setSource(java.lang.String)

getTokenList

public ITokenList getTokenList(java.lang.String language)
Description copied from interface: ILexer
Get the token list for the specified language

Specified by:
getTokenList in interface ILexer
Returns:
Returns the token list for the specified language
See Also:
ILexer.getTokenList(java.lang.String)

isEOS

public boolean isEOS()
Description copied from interface: ILexer
Determine if we've processed all the source text

Specified by:
isEOS in interface ILexer
Returns:
Returns true if we have processed all of the source text
See Also:
ILexer.isEOS()

addLanguage

public void addLanguage(ITokenList tokens)
Description copied from interface: ILexer
Add the given language tokens to this lexer

Specified by:
addLanguage in interface ILexer
See Also:
ILexer.addLanguage(com.aptana.ide.lexer.ITokenList)

seal

public void seal()
          throws LexerException
Description copied from interface: ILexer
Build all of the token regexes and build the lexer group state machines

Specified by:
seal in interface ILexer
Throws:
LexerException
See Also:
ILexer.seal()

createLexeme

protected Lexeme createLexeme(IToken token,
                              java.lang.String text,
                              int offset)
Create a new lexeme. Sub-classes will need to override this method to create their own lexeme sub-classes

Parameters:
token - The token class for this lexeme
text - The token's associated text
offset - The token's offset within the source file
Returns:
Returns a newly created lexeme
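
The override pattern described for createLexeme is a factory method. A minimal sketch of that pattern follows; BaseLexer, BaseLexeme, TaggedLexer, and TaggedLexeme are hypothetical stand-ins, and the IToken parameter of the real method is omitted to keep the example self-contained.

```java
public class CreateLexemeSketch {
    /** Hypothetical stand-in for Lexeme. */
    static class BaseLexeme {
        final String text;
        final int offset;

        BaseLexeme(String text, int offset) {
            this.text = text;
            this.offset = offset;
        }
    }

    /** Hypothetical lexeme sub-class that carries extra data. */
    static class TaggedLexeme extends BaseLexeme {
        final String tag;

        TaggedLexeme(String text, int offset, String tag) {
            super(text, offset);
            this.tag = tag;
        }
    }

    /** Hypothetical stand-in for the lexer: all lexemes go through createLexeme. */
    static class BaseLexer {
        protected BaseLexeme createLexeme(String text, int offset) {
            return new BaseLexeme(text, offset);
        }

        BaseLexeme emit(String text, int offset) {
            return createLexeme(text, offset);
        }
    }

    /** Sub-class that swaps in its own lexeme type by overriding the factory method. */
    static class TaggedLexer extends BaseLexer {
        @Override
        protected BaseLexeme createLexeme(String text, int offset) {
            return new TaggedLexeme(text, offset, "js");
        }
    }

    public static void main(String[] args) {
        BaseLexeme lexeme = new TaggedLexer().emit("var", 0);
        System.out.println(lexeme instanceof TaggedLexeme); // prints true
    }
}
```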

find

public abstract Range find(java.lang.String groupName)
                    throws LexerException
Description copied from interface: ILexer
Use the currently active language and group to locate the first position that matches that group. Note that token switchTo's are ignored by this method

Specified by:
find in interface ILexer
Parameters:
groupName - The current language's group name that contains the patterns to find
Returns:
Returns a Range whose starting offset is the point where the match occurred and whose ending offset is the position where the match ended. If there is no match, an empty Range is returned
Throws:
LexerException
See Also:
ILexer.find(java.lang.String)
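
The return contract of find can be sketched with java.util.regex standing in for a group's patterns. SimpleRange and FindSketch are hypothetical names, not the real Range or lexer classes, and the sketch assumes that the ending offset is exclusive and that an empty range is one whose start equals its end.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FindSketch {
    /** Hypothetical stand-in for Range; empty when start == end. */
    static final class SimpleRange {
        final int start;
        final int end;

        SimpleRange(int start, int end) {
            this.start = start;
            this.end = end;
        }

        boolean isEmpty() {
            return start == end;
        }
    }

    /** Locates the first match at or after fromOffset, or returns an empty range. */
    static SimpleRange find(String source, int fromOffset, Pattern groupPattern) {
        Matcher matcher = groupPattern.matcher(source);
        if (matcher.find(fromOffset)) {
            return new SimpleRange(matcher.start(), matcher.end());
        }
        // No match: signal it with an empty range at the starting offset.
        return new SimpleRange(fromOffset, fromOffset);
    }

    public static void main(String[] args) {
        SimpleRange hit = find("a /* c */ b", 0, Pattern.compile("/\\*"));
        System.out.println(hit.start + ".." + hit.end); // prints 2..4
        SimpleRange miss = find("a /* c */ b", 0, Pattern.compile("#"));
        System.out.println(miss.isEmpty()); // prints true
    }
}
```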

match

protected abstract int match()
Attempt to match a token at the current offset

Returns:
Returns the position of the last failed or successful match

setLexerState

public void setLexerState(java.lang.String group,
                          int offset)
                   throws LexerException
Description copied from interface: ILexer
Set properties of the lexer to prepare for rescanning of unmodified source

Specified by:
setLexerState in interface ILexer
Parameters:
group - The group to switch to before beginning the rescan
offset - The offset within the source at which to begin rescanning
Throws:
LexerException
See Also:
ILexer.setLexerState(java.lang.String, int)

setLexerState

public void setLexerState(java.lang.String group,
                          char[] source,
                          int offset,
                          LexemeList cache)
                   throws LexerException
Description copied from interface: ILexer
Set properties of the lexer to prepare for rescanning of modified source

Specified by:
setLexerState in interface ILexer
Parameters:
group - The group to switch to before beginning the next lex
source - The new source code to lex
offset - The offset within the source at which to begin re-lexing
cache - The lexeme cache from the last lex
Throws:
LexerException
See Also:
ILexer.setLexerState(java.lang.String, char[], int, com.aptana.ide.lexer.LexemeList)

getNextLexeme

public Lexeme getNextLexeme()
Description copied from interface: ILexer
Get the next token from the source text

Specified by:
getNextLexeme in interface ILexer
Returns:
The next lexeme in the token stream
See Also:
ILexer.getNextLexeme()
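
A typical driver loop over isEOS and getNextLexeme might look like the sketch below. MiniLexer and WhitespaceLexer are hypothetical stand-ins written for this example and are not part of com.aptana.ide.lexer. Returning null on a failed match is an assumption; the documented signal for that case is lastMatchedTokenIndex being -1.

```java
import java.util.ArrayList;
import java.util.List;

public class ScanLoopSketch {
    /** Hypothetical stand-in with only the two calls used here; not the real ILexer. */
    interface MiniLexer {
        boolean isEOS();

        String getNextLexeme(); // assumption: null when nothing matches
    }

    /** Toy lexer that treats each run of non-whitespace characters as one lexeme. */
    static final class WhitespaceLexer implements MiniLexer {
        private final String source;
        private int currentOffset = 0;

        WhitespaceLexer(String source) {
            this.source = source;
        }

        public boolean isEOS() {
            return currentOffset >= source.length();
        }

        public String getNextLexeme() {
            // Skip leading whitespace, then consume up to the next whitespace.
            while (!isEOS() && Character.isWhitespace(source.charAt(currentOffset))) {
                currentOffset++;
            }
            int start = currentOffset;
            while (!isEOS() && !Character.isWhitespace(source.charAt(currentOffset))) {
                currentOffset++;
            }
            return start == currentOffset ? null : source.substring(start, currentOffset);
        }
    }

    /** Pulls lexemes until the source is exhausted or a match fails. */
    static List<String> scanAll(MiniLexer lexer) {
        List<String> lexemes = new ArrayList<>();
        while (!lexer.isEOS()) {
            String lexeme = lexer.getNextLexeme();
            if (lexeme == null) {
                break; // nothing matched; stop instead of looping forever
            }
            lexemes.add(lexeme);
        }
        return lexemes;
    }

    public static void main(String[] args) {
        System.out.println(scanAll(new WhitespaceLexer("var x = 1 "))); // prints [var, x, =, 1]
    }
}
```

The null check matters for the trailing space in the sample input: the source is not yet exhausted, but no further lexeme can be produced.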