morfologik.fsa
Class FSAVer5Impl

java.lang.Object
  extended by morfologik.fsa.FSA
      extended by morfologik.fsa.FSAVer5Impl
All Implemented Interfaces:
java.lang.Iterable<java.nio.ByteBuffer>

public final class FSAVer5Impl
extends FSA

Traversal implementation for finite state automaton (FSA) dictionaries stored in version 5 of the FSA format.

Version 5 indicates that the dictionary was built with the flags FSAFlags.FLEXIBLE, FSAFlags.STOPBIT and FSAFlags.NEXTBIT. The internal representation of the FSA must therefore follow the layout below. Note that the layout describes a single transition (arc) only, not the entire dictionary file.

 Byte
       +-+-+-+-+-+-+-+-+\
     0 | | | | | | | | | +------ label
       +-+-+-+-+-+-+-+-+/
 
                  +------------- node pointed to is next
                  | +----------- the last arc of the node
                  | | +--------- the arc is final
                  | | |
             +-----------+
             |    | | |  |
         ___+___  | | |  |
        /       \ | | |  |
       MSB           LSB |
        7 6 5 4 3 2 1 0  |
       +-+-+-+-+-+-+-+-+ |
     1 | | | | | | | | | \ \
       +-+-+-+-+-+-+-+-+  \ \  LSB
       +-+-+-+-+-+-+-+-+     +
     2 | | | | | | | | |     |
       +-+-+-+-+-+-+-+-+     |
     3 | | | | | | | | |     +----- target node address (in bytes)
       +-+-+-+-+-+-+-+-+     |      (not present except for the byte
       : : : : : : : : :     |       with flags if the node pointed to
       +-+-+-+-+-+-+-+-+     +       is next)
   gtl | | | | | | | | |    /  MSB
       +-+-+-+-+-+-+-+-+   /
 gtl+1                           (gtl = gotoLength)
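
The layout above can be sketched as a small decoder. This is an illustration only, not the code of this class: the class name Fsa5ArcSketch, the constant names, and the flag bit positions (bits 0 to 2 of the first address byte) are assumptions read off the diagram, as is the value 1 for the offset of the address field.

```java
/** Illustrative decoder for a single arc in the layout drawn above. */
final class Fsa5ArcSketch {
    // Assumed flag positions, taken from the diagram (lowest three bits of byte 1).
    static final int BIT_FINAL = 1 << 0; // the arc is final
    static final int BIT_LAST  = 1 << 1; // the last arc of the node
    static final int BIT_NEXT  = 1 << 2; // node pointed to is next

    // Assumed offset of the address field: directly after the label byte.
    static final int GOTO_OFFSET = 1;

    static byte label(byte[] arcs, int arc) {
        return arcs[arc];
    }

    static boolean isFinal(byte[] arcs, int arc) {
        return (arcs[arc + GOTO_OFFSET] & BIT_FINAL) != 0;
    }

    static boolean isLast(byte[] arcs, int arc) {
        return (arcs[arc + GOTO_OFFSET] & BIT_LAST) != 0;
    }

    static boolean isNext(byte[] arcs, int arc) {
        return (arcs[arc + GOTO_OFFSET] & BIT_NEXT) != 0;
    }

    /**
     * Reads gotoLength bytes, least significant byte first, and shifts out
     * the three flag bits. Only meaningful when isNext(...) is false, because
     * otherwise only the flags byte is stored.
     */
    static int targetAddress(byte[] arcs, int arc, int gotoLength) {
        int value = 0;
        for (int i = gotoLength - 1; i >= 0; i--) {
            value = (value << 8) | (arcs[arc + GOTO_OFFSET + i] & 0xFF);
        }
        return value >>> 3;
    }
}
```

For example, with gotoLength equal to 2, the three bytes 'a', 0x23, 0x03 describe an arc labeled 'a' that is final, is the last arc of its node, and points to address 100, since (0x0323 >>> 3) == 100.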
 


Field Summary
protected  byte[] arcs
          An array of bytes with the internal representation of the automaton.
protected  int arcSize
          Size of a single arc (in bytes).
protected static int gotoOffset
          An offset in the arc structure, where the address field begins.
 
Fields inherited from class morfologik.fsa.FSA
filler, gotoLength, version, VERSION_5
 
Constructor Summary
FSAVer5Impl(java.io.InputStream fsaStream, java.lang.String dictionaryEncoding)
          Creates a new automaton by reading it from a stream in FSA format, version 5.
 
Method Summary
 int getArc(int node, byte label)
          Returns the identifier of an arc leaving node and labeled with label.
 byte getArcLabel(int arc)
          Returns the label associated with a given arc.
 int getEndNode(int arc)
          Returns the end node pointed to by a given arc.
 int getFirstArc(int node)
          Returns the identifier of the first arc leaving node or 0 if the node has no outgoing arcs.
 int getNextArc(int node, int arc)
          Returns the identifier of the next arc after arc and leaving node.
 int getNumberOfArcs()
          Returns the number of arcs in this automaton.
 int getNumberOfNodes()
          Returns the number of nodes in this automaton.
 int getRootNode()
          Returns the start node of this automaton.
 boolean isArcFinal(int arc)
          Returns true if the destination node at the end of this arc corresponds to an input sequence created when building this automaton.
 boolean isArcTerminal(int arc)
          Returns true if this arc does not have a terminating node.
protected  void readHeader(java.io.DataInput in, long fileSize)
          Reads an FSA header from a stream.
 
Methods inherited from class morfologik.fsa.FSA
getAnnotationSeparator, getFillerCharacter, getFlags, getInstance, getInstance, getTraversalHelper, getVersion, iterator, readFully
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

arcSize

protected int arcSize
Size of a single arc (in bytes).


gotoOffset

protected static final int gotoOffset
An offset in the arc structure, where the address field begins. For this version of the automaton, this is a constant value.

See Also:
Constant Field Values

arcs

protected byte[] arcs
An array of bytes with the internal representation of the automaton. Please see the documentation of this class for more information on how this structure is organized.

Constructor Detail

FSAVer5Impl

public FSAVer5Impl(java.io.InputStream fsaStream,
                   java.lang.String dictionaryEncoding)
            throws java.io.IOException
Creates a new automaton by reading it from a stream in FSA format, version 5.

Throws:
java.io.IOException

Method Detail

getNumberOfArcs

public int getNumberOfArcs()
Returns the number of arcs in this automaton. This method performs a full scan of all arcs in this automaton.

Specified by:
getNumberOfArcs in class FSA

getNumberOfNodes

public int getNumberOfNodes()
Returns the number of nodes in this automaton. This method performs a full scan of all arcs in this automaton.

Specified by:
getNumberOfNodes in class FSA

getRootNode

public int getRootNode()
Returns the start node of this automaton. May return 0 if the start node is also an end node.

Specified by:
getRootNode in class FSA
See Also:
FSA.getTraversalHelper()

readHeader

protected void readHeader(java.io.DataInput in,
                          long fileSize)
                   throws java.io.IOException
Reads an FSA header from a stream.

Overrides:
readHeader in class FSA
Throws:
java.io.IOException - If the stream is not a dictionary, or if the version is not supported.

getFirstArc

public final int getFirstArc(int node)
Description copied from class: FSA
Returns the identifier of the first arc leaving node or 0 if the node has no outgoing arcs.

Specified by:
getFirstArc in class FSA
See Also:
FSA.getTraversalHelper()

getNextArc

public final int getNextArc(int node,
                            int arc)
Description copied from class: FSA
Returns the identifier of the next arc after arc and leaving node. Zero is returned if no more arcs are available for the node.

Specified by:
getNextArc in class FSA
See Also:
FSA.getTraversalHelper()
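
The zero sentinel returned by getFirstArc and getNextArc lends itself to a simple loop over the outgoing arcs of a node. The sketch below shows that pattern. To stay self-contained it does not use this class: ArcSource is a hypothetical stand-in that declares only the two documented signatures, and toy() is a made-up source used for demonstration.

```java
import java.util.ArrayList;
import java.util.List;

final class OutgoingArcsSketch {
    /** Hypothetical stand-in exposing only the two methods used here. */
    interface ArcSource {
        int getFirstArc(int node);
        int getNextArc(int node, int arc);
    }

    /** Collects the identifiers of all arcs leaving the given node. */
    static List<Integer> outgoingArcs(ArcSource fsa, int node) {
        List<Integer> result = new ArrayList<>();
        // An identifier of 0 means "no arc", so it ends the loop.
        for (int arc = fsa.getFirstArc(node); arc != 0; arc = fsa.getNextArc(node, arc)) {
            result.add(arc);
        }
        return result;
    }

    /** Toy source: node 1 owns arcs 10, 20 and 30; other nodes have none. */
    static ArcSource toy() {
        return new ArcSource() {
            public int getFirstArc(int node) {
                return node == 1 ? 10 : 0;
            }

            public int getNextArc(int node, int arc) {
                return (node == 1 && arc < 30) ? arc + 10 : 0;
            }
        };
    }
}
```

The same loop should apply to any object that follows the contract described above, since it relies only on the two documented return values.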

getArc

public int getArc(int node,
                  byte label)
Description copied from class: FSA
Returns the identifier of an arc leaving node and labeled with label. An identifier equal to 0 means the node has no outgoing arc labeled label.

Specified by:
getArc in class FSA
See Also:
FSA.getTraversalHelper()

getEndNode

public int getEndNode(int arc)
Description copied from class: FSA
Returns the end node pointed to by a given arc. Terminal arcs (those that point to a terminal state) have no end node representation; calling this method on such an arc throws a runtime exception.

Specified by:
getEndNode in class FSA
See Also:
FSA.getTraversalHelper()

getArcLabel

public byte getArcLabel(int arc)
Description copied from class: FSA
Returns the label associated with a given arc.

Specified by:
getArcLabel in class FSA

isArcFinal

public boolean isArcFinal(int arc)
Description copied from class: FSA
Returns true if the destination node at the end of this arc corresponds to an input sequence created when building this automaton.

Specified by:
isArcFinal in class FSA

isArcTerminal

public boolean isArcTerminal(int arc)
Description copied from class: FSA
Returns true if this arc does not have a terminating node.

Specified by:
isArcTerminal in class FSA
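
Taken together, getRootNode, getArc, isArcFinal, isArcTerminal and getEndNode are enough to test whether a byte sequence is stored in an automaton. The following is a minimal sketch of such a lookup under stated assumptions: Automaton is a hypothetical interface that mirrors the documented signatures, and toy() is a made-up automaton holding the sequences "a" and "ab". It is not the library's own lookup code, and it does not treat a root node of 0 specially.

```java
final class FsaLookupSketch {
    /** Hypothetical stand-in declaring the documented signatures used below. */
    interface Automaton {
        int getRootNode();
        int getArc(int node, byte label);
        int getEndNode(int arc);
        boolean isArcFinal(int arc);
        boolean isArcTerminal(int arc);
    }

    /** Returns true if the automaton contains exactly this sequence. */
    static boolean accepts(Automaton fsa, byte[] sequence) {
        int node = fsa.getRootNode();
        for (int i = 0; i < sequence.length; i++) {
            int arc = fsa.getArc(node, sequence[i]);
            if (arc == 0) {
                return false; // no outgoing arc with this label
            }
            if (i == sequence.length - 1) {
                return fsa.isArcFinal(arc); // the last symbol must end on a final arc
            }
            if (fsa.isArcTerminal(arc)) {
                return false; // input continues, but getEndNode would throw here
            }
            node = fsa.getEndNode(arc);
        }
        return false; // empty input
    }

    /** Toy automaton: node 1 --a--> node 2 --b--> (terminal); both arcs are final. */
    static Automaton toy() {
        return new Automaton() {
            public int getRootNode() { return 1; }

            public int getArc(int node, byte label) {
                if (node == 1 && label == 'a') return 1;
                if (node == 2 && label == 'b') return 2;
                return 0;
            }

            public int getEndNode(int arc) {
                if (arc == 1) return 2;
                throw new IllegalStateException("terminal arc has no end node");
            }

            public boolean isArcFinal(int arc) { return arc == 1 || arc == 2; }

            public boolean isArcTerminal(int arc) { return arc == 2; }
        };
    }
}
```

Checking isArcTerminal before calling getEndNode avoids the runtime exception described for terminal arcs.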