com.fatdog.xmlEngine
Class XQEngine

java.lang.Object
  extended by com.fatdog.xmlEngine.XQEngine

public class XQEngine
extends java.lang.Object

The main driver for the query engine.

Version:
0.66
Author:
Howard Katz, howardk@fatdog.com

Field Summary
static int AGGRESSIVE
           
static java.lang.String DEFAULT_EXPLICIT_DOCNAME
           
static int MEEK
           
static int MILD
           
 
Constructor Summary
XQEngine()
           
 
Method Summary
 boolean doSerializeIndex()
           
static double elapsedTime(long elapsedMillis)
           
static java.lang.String fileToURLString(java.io.File file)
          Utility routine borrowed (thanks) from James Clark.
 java.lang.String getDocumentName(int docId)
          Get the name of the document corresponding to its docId.
 IndexManager getIndexManager()
          Return the IndexManager for this session.
 int[] getNodeTypeCounts()
          Get a breakdown of node types in the index.
 int getNumDocuments()
          Get the number of documents in the index.
 int getNumTotalElementNames()
           
 int getNumUniqueElementNames()
           
 SAXHandler getSaxHandler()
          Get the SAXHandler currently in use.
 java.lang.String getSerializationDirectory()
           
 long getTotalFileLength()
          Return the aggregate file length of all files in the index.
 boolean getUseLexicalPrefixes()
           
 boolean isDebugOutputToConsole()
          Queries whether the debug-output-to-console flag has been set.
 boolean isDistributedQNameDictionaries()
           
 boolean isShowFileIndexing()
          Queries whether file-indexing progress is to be written to the console.
 boolean isWordIndexing()
           
 void printSessionStats(long elapsedMillis, java.io.PrintWriter dest)
          Print a summary of indexing information.
 void printSessionStats(long elapsedMillis, java.io.PrintWriter destination, boolean doTextBlocks)
          Print a summary of indexing information.
 void registerProtocolHandler(java.lang.String scheme, IProtocolHandler yourHandler)
          Register an IProtocolHandler object to receive, via its IProtocolHandler.content(java.lang.String) method, the address argument passed in to either the setDocument(String) method or the setExplicitDocument(String) function.
 void setDebugOutputToConsole(boolean debugOutput)
          Set a flag indicating whether debug output is desired.
 int setDocument(java.lang.String name)
           
 int setDocument(java.lang.String name, boolean inQuery)
          Index this document.
 int setExplicitDocument(java.lang.String explicitContent)
          Add explicit (literal angle-bracket) XML to the index.
 void setIsDistributedQNameDictionaries(boolean isDistributed)
           
 void setIsWordIndexing(boolean wordIndexing)
          NYI.
 void setMinIndexableWordLength(int length)
          Set the minimum length of word to index.
 ResultList setQuery(java.lang.String query)
          Pass an XQuery query to the engine, get a ResultList back.
 ResultList setQueryFromFile(java.lang.String file)
          An alternative form of setQuery(String).
 void setShowFileIndexing(boolean showFileIndexing)
          Set whether file-indexing progress is to be written to the console.
 void setUseLexicalPrefixes(boolean useLexicalPrefixes)
          Indicate whether lexical (non-standard) namespace searching is in effect.
 IndexManager setXMLReader(org.xml.sax.XMLReader reader)
          Indicate the SAX2 XMLReader you want to use for parsing XML documents.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

DEFAULT_EXPLICIT_DOCNAME

public static final java.lang.String DEFAULT_EXPLICIT_DOCNAME
See Also:
Constant Field Values

AGGRESSIVE

public static int AGGRESSIVE

MILD

public static int MILD

MEEK

public static int MEEK

Constructor Detail

XQEngine

public XQEngine()

Method Detail

setIsDistributedQNameDictionaries

public void setIsDistributedQNameDictionaries(boolean isDistributed)

isDistributedQNameDictionaries

public boolean isDistributedQNameDictionaries()

setIsWordIndexing

public void setIsWordIndexing(boolean wordIndexing)
NYI. To be used when explicit word indexing is required.

Parameters:
wordIndexing -

isWordIndexing

public boolean isWordIndexing()

setUseLexicalPrefixes

public void setUseLexicalPrefixes(boolean useLexicalPrefixes)
Indicate whether lexical (non-standard) namespace searching is in effect.

Lexical prefix/namespace searching allows specification of location-path queries such as //biblio:book -- the QName prefix will be searched for exactly as is. A namespace declaration is not required (and is an error if present).

If lexical prefix searching is false, a namespace declaration for the prefix is required.

Parameters:
useLexicalPrefixes - a boolean flag indicating whether non-standard lexical prefix searching is allowed.
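The difference between the two modes can be sketched outside the engine. The class below is hypothetical and is not part of XQEngine; it only contrasts comparing a prefixed name exactly as written with comparing it after resolving each prefix through a namespace declaration. The prefix-to-URI maps stand in for whatever declarations the query and the document carry.

```java
import java.util.Map;

// Illustrative only: QNameMatchSketch and its methods are hypothetical
// and do not reproduce XQEngine's internal matching logic.
class QNameMatchSketch {

    // Lexical mode: the prefixed name is compared exactly as written.
    static boolean matchesLexically(String queryQName, String documentQName) {
        return queryQName.equals(documentQName);
    }

    // Namespace mode: each prefix is resolved through its declarations,
    // then namespace URI and local name are compared.
    static boolean matchesByNamespace(String queryQName, Map<String, String> queryDecls,
                                      String documentQName, Map<String, String> documentDecls) {
        String[] q = split(queryQName);
        String[] d = split(documentQName);
        String queryUri = queryDecls.get(q[0]);
        if (queryUri == null) {
            // Mirrors the rule that a declaration is required in this mode.
            throw new IllegalArgumentException("No namespace declaration for prefix: " + q[0]);
        }
        return queryUri.equals(documentDecls.get(d[0])) && q[1].equals(d[1]);
    }

    // Splits "prefix:local" into { prefix, local }; an unprefixed name
    // is treated as having the empty prefix in this sketch.
    private static String[] split(String qname) {
        int colon = qname.indexOf(':');
        return colon < 0
                ? new String[] { "", qname }
                : new String[] { qname.substring(0, colon), qname.substring(colon + 1) };
    }
}
```

In lexical mode "b:book" and "biblio:book" never match, whereas in namespace mode they match whenever both prefixes are bound to the same URI.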

getUseLexicalPrefixes

public boolean getUseLexicalPrefixes()

setDebugOutputToConsole

public void setDebugOutputToConsole(boolean debugOutput)
Set a flag indicating whether debug output is desired.

This sets a flag in the engine that calling routines can interrogate via isDebugOutputToConsole(). This flag is only informational and doesn't actually force debug output (which results from printing the result of ResultList.toString() to the console).

Parameters:
debugOutput - boolean toggle for indicating desired output type

isDebugOutputToConsole

public boolean isDebugOutputToConsole()
Queries whether the debug-output-to-console flag has been set.


isShowFileIndexing

public boolean isShowFileIndexing()
Queries whether file-indexing progress is to be written to the console.


setShowFileIndexing

public void setShowFileIndexing(boolean showFileIndexing)
Set whether file-indexing progress is to be written to the console.

See Also:
isShowFileIndexing()

getIndexManager

public IndexManager getIndexManager()
Return the IndexManager for this session.


setMinIndexableWordLength

public void setMinIndexableWordLength(int length)
Set the minimum length of word to index.

Default is 1.

Parameters:
length - Only index words of this length or greater
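As a rough illustration of what a minimum word length means, the sketch below keeps only words of at least the given length. How the engine actually tokenizes text is not stated here, so the whitespace splitting and the class name MinWordLengthSketch are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: not the engine's tokenizer.
class MinWordLengthSketch {

    // Returns the whitespace-separated words of text whose length
    // is at least minLength, in their original order.
    static List<String> indexableWords(String text, int minLength) {
        List<String> kept = new ArrayList<String>();
        for (String word : text.trim().split("\\s+")) {
            if (!word.isEmpty() && word.length() >= minLength) {
                kept.add(word);
            }
        }
        return kept;
    }
}
```

With a minimum length of 1 (the documented default) every non-empty word is kept; raising the value drops short words such as "a" or "on".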

setXMLReader

public IndexManager setXMLReader(org.xml.sax.XMLReader reader)
Indicate the SAX2 XMLReader you want to use for parsing XML documents.

Parameters:
reader - The XMLReader
Returns:
The IndexManager for the query engine
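One common way to obtain a SAX2 XMLReader is through the JDK's JAXP factory, sketched below. The helper class name is hypothetical, and enabling namespace awareness is an assumption; whether the engine requires it is not stated here.

```java
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.XMLReader;

// Illustrative only: one way to create a reader that could then be
// handed to setXMLReader(org.xml.sax.XMLReader).
class XmlReaderSketch {

    static XMLReader newNamespaceAwareReader() {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // assumption: namespace processing is wanted
            return factory.newSAXParser().getXMLReader();
        } catch (Exception e) {
            // ParserConfigurationException or SAXException from the factory
            throw new IllegalStateException("No usable SAX parser available", e);
        }
    }
}
```

The returned reader would be passed as the reader argument of setXMLReader.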

getSaxHandler

public SAXHandler getSaxHandler()
Get the SAXHandler currently in use.

Returns:
current SAXHandler

registerProtocolHandler

public void registerProtocolHandler(java.lang.String scheme,
                                    IProtocolHandler yourHandler)
Register an IProtocolHandler object to receive, via its IProtocolHandler.content(java.lang.String) method, the address argument passed in to either the setDocument(String) method or the setExplicitDocument(String) function. The query engine will have determined that the address is prefixed by the scheme prefix being registered by this function. The engine uses the scheme to distinguish custom-protocol-based addresses from normal filepath-based addresses, so you should ensure the engine can use your scheme to disambiguate between the two.

Parameters:
scheme - An arbitrary string prefix of your own devising
yourHandler - An IProtocolHandler object
Throws:
java.lang.IllegalArgumentException - If scheme or yourHandler are null, or the scheme is already registered
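The registration and dispatch rules above can be sketched with a small stand-alone registry. Everything in this block is hypothetical: the Handler interface stands in for IProtocolHandler (only the content(String) method name comes from this page, and its String return type is an assumption), and the prefix test is one plausible way to tell custom-protocol addresses from file paths.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: not the engine's implementation.
class SchemeRegistrySketch {

    // Hypothetical stand-in for IProtocolHandler.
    interface Handler {
        String content(String address);
    }

    private final Map<String, Handler> handlers = new LinkedHashMap<String, Handler>();

    // Rejects null arguments and duplicate schemes, as documented above.
    void register(String scheme, Handler handler) {
        if (scheme == null || handler == null) {
            throw new IllegalArgumentException("scheme and handler must be non-null");
        }
        if (handlers.containsKey(scheme)) {
            throw new IllegalArgumentException("scheme already registered: " + scheme);
        }
        handlers.put(scheme, handler);
    }

    // Returns the handler's content if the address starts with a registered
    // scheme; returns null to signal "treat this as an ordinary file path".
    String resolve(String address) {
        for (Map.Entry<String, Handler> entry : handlers.entrySet()) {
            if (address.startsWith(entry.getKey())) {
                return entry.getValue().content(address);
            }
        }
        return null;
    }
}
```

A scheme such as "mydb:" is unlikely to collide with a file path, which is the kind of disambiguation the description asks for; a scheme like "c" or "/" would not be.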

setExplicitDocument

public int setExplicitDocument(java.lang.String explicitContent)
                        throws CantParseDocumentException,
                               MissingOrInvalidSaxParserException
Add explicit (literal angle-bracket) XML to the index.

Can be called repeatedly to add different explicit documents to the index. Unlike setDocument(java.lang.String), this method takes literal, angle-bracket XML content and not a filename as an argument.

Parameters:
explicitContent - The literal "angle bracket" XML content you want indexed
Returns:
The integer document ID that identifies the document in the index
Throws:
CantParseDocumentException - If the SAX parser can't parse the document
MissingOrInvalidSaxParserException - If the SAX parser is missing or invalid
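To make "literal angle-bracket XML" concrete, the sketch below feeds an XML string directly to a SAX parser from the JDK and counts the elements it reports. It does not use the engine; the class name and the element counting are assumptions chosen only to show that the input is the markup itself rather than a file name.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Illustrative only: shows SAX parsing of XML held in a String.
class ExplicitXmlSketch {

    // Parses literal XML content and returns the number of elements seen.
    static int countElements(String explicitContent) {
        final int[] count = { 0 };
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                count[0]++;
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(explicitContent)), handler);
        } catch (Exception e) {
            // Malformed content surfaces here, much as a parse failure would
            // surface as CantParseDocumentException in the engine.
            throw new IllegalArgumentException("Could not parse explicit XML content", e);
        }
        return count[0];
    }
}
```

A string such as "<a><b/><b/></a>" is valid input here, while a path like "docs/a.xml" is not, since it is not markup.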

setDocument

public int setDocument(java.lang.String name)
                throws java.io.FileNotFoundException,
                       CantParseDocumentException,
                       MissingOrInvalidSaxParserException
Throws:
java.io.FileNotFoundException
CantParseDocumentException
MissingOrInvalidSaxParserException

setDocument

public int setDocument(java.lang.String name,
                       boolean inQuery)
                throws java.io.FileNotFoundException,
                       CantParseDocumentException,
                       MissingOrInvalidSaxParserException
Index this document.

The document can be either :

Parameters:
name - The name of the document to be indexed
Returns:
The unique integer docID assigned this document by the query engine
Throws:
java.io.FileNotFoundException - If the file couldn't be located
CantParseDocumentException - If the SAX parser complained
MissingOrInvalidSaxParserException - If no SAX parser has been registered

elapsedTime

public static double elapsedTime(long elapsedMillis)

getNodeTypeCounts

public int[] getNodeTypeCounts()
Get a breakdown of node types in the index.

Returns:
A 3-entry integer array containing the node-count breakdowns for the index. In order, counts represent:
  1. element nodes
  2. attribute nodes
  3. text nodes
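Since the list above is numbered from 1 while Java arrays are indexed from 0, a small helper can make the mapping explicit. The class and method names below are hypothetical; only the ordering of the three counts comes from this page.

```java
// Illustrative only: labels the three counts returned by getNodeTypeCounts().
class NodeTypeCountsSketch {

    static String describe(int[] counts) {
        if (counts == null || counts.length != 3) {
            throw new IllegalArgumentException("expected a 3-entry array");
        }
        // Index 0: element nodes, index 1: attribute nodes, index 2: text nodes.
        return "elements=" + counts[0]
                + ", attributes=" + counts[1]
                + ", text=" + counts[2];
    }
}
```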

getTotalFileLength

public long getTotalFileLength()
Return the aggregate file length of all files in the index.

Returns:
The aggregate file length of all files indexed to date

printSessionStats

public void printSessionStats(long elapsedMillis,
                              java.io.PrintWriter dest)
Print a summary of indexing information.

Parameters:
elapsedMillis - Elapsed time in milliseconds since the indexing session began.
dest - PrintWriter where output should be directed.

printSessionStats

public void printSessionStats(long elapsedMillis,
                              java.io.PrintWriter destination,
                              boolean doTextBlocks)
Print a summary of indexing information.

Parameters:
elapsedMillis - Elapsed time in milliseconds since the indexing session began.
destination - PrintWriter where output should be directed.
doTextBlocks - Show space allocated in NodeTrees for concatenated element and attribute text
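Both overloads write to a caller-supplied PrintWriter, so the output can go to the console or be captured in memory. The sketch below shows the capture pattern with a hypothetical stand-in printer of the same parameter shape; the text it writes is invented for the example and is not the engine's report format.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

// Illustrative only: printStats is a stand-in, not printSessionStats itself.
class SessionStatsSketch {

    // Hypothetical printer with the same (elapsedMillis, PrintWriter) shape.
    static void printStats(long elapsedMillis, PrintWriter dest) {
        dest.println("elapsed: " + elapsedMillis + " ms");
        dest.flush();
    }

    // Captures the printed summary in a String instead of the console.
    static String capture(long elapsedMillis) {
        StringWriter buffer = new StringWriter();
        printStats(elapsedMillis, new PrintWriter(buffer));
        return buffer.toString();
    }
}
```

For console output, new PrintWriter(System.out, true) serves as the destination, and the elapsed time is typically the difference between two System.currentTimeMillis() readings taken around the indexing calls.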

fileToURLString

public static java.lang.String fileToURLString(java.io.File file)
Utility routine borrowed (thanks) from James Clark.

Parameters:
file - The java File object representation of the file.
Returns:
The name translated to a 'file:/ ...' URL in String form
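For comparison, the JDK can perform a similar File-to-URL-string translation on its own. The sketch below uses java.io.File.toURI() and is not the borrowed routine; the class name is hypothetical, and the exact string may differ from what fileToURLString returns.

```java
import java.io.File;

// Illustrative only: a JDK-based translation, not James Clark's routine.
class FileUrlSketch {

    // Resolves the file against the current directory and renders it as a URL string.
    static String toUrlString(File file) {
        return file.getAbsoluteFile().toURI().toString();
    }
}
```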

setQueryFromFile

public ResultList setQueryFromFile(java.lang.String file)
                            throws InvalidQueryException
An alternative form of setQuery(String).

Parameters:
file - Name of the file containing the XQuery query
Returns:
a ResultList object representing the results
Throws:
InvalidQueryException - If the query was invalid
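If this method amounts to reading the file's text and handing it to setQuery(String), which is an assumption based on the description above, the reading step might look like the sketch below. The class name and the choice of UTF-8 are also assumptions.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Illustrative only: loads query text that could then be passed to setQuery(String).
class QueryFileSketch {

    static String readQuery(String fileName) {
        try {
            // Assumption: the query file is UTF-8 encoded.
            return new String(Files.readAllBytes(Paths.get(fileName)), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new IllegalArgumentException("Could not read query file: " + fileName, e);
        }
    }
}
```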

setQuery

public ResultList setQuery(java.lang.String query)
                    throws InvalidQueryException
Pass an XQuery query to the engine, get a ResultList back.

Parameters:
query - A valid XQuery query in String format
Returns:
a ResultList object
Throws:
InvalidQueryException

getNumDocuments

public int getNumDocuments()
Get the number of documents in the index.


getDocumentName

public java.lang.String getDocumentName(int docId)
Get the name of the document corresponding to its docId.

Parameters:
docId - The integer ID assigned to the document when it was indexed.
See Also:
setDocument(java.lang.String), setExplicitDocument(java.lang.String)

getNumUniqueElementNames

public int getNumUniqueElementNames()

getNumTotalElementNames

public int getNumTotalElementNames()

doSerializeIndex

public boolean doSerializeIndex()

getSerializationDirectory

public java.lang.String getSerializationDirectory()