DLESE Tools
v1.6.0

org.dlese.dpc.index.reader
Class FileIndexingServiceDocReader

java.lang.Object
  extended by org.dlese.dpc.index.reader.DocReader
      extended by org.dlese.dpc.index.reader.FileIndexingServiceDocReader
All Implemented Interfaces:
Serializable
Direct Known Subclasses:
ErrorDocReader, SimpleFileIndexingServiceDocReader, XMLDocReader

public abstract class FileIndexingServiceDocReader
extends DocReader
implements Serializable

An abstract bean for accessing the data stored in a Lucene Document that was created by a FileIndexingServiceWriter. This class may be extended for each Document type that might be returned in a search.

Author:
John Weatherley
See Also:
Serialized Form

Field Summary
 
Fields inherited from class org.dlese.dpc.index.reader.DocReader
conf, doc, resultDoc, score
 
Constructor Summary
protected FileIndexingServiceDocReader()
          Constructor that initializes an empty DocReader.
protected FileIndexingServiceDocReader(org.apache.lucene.document.Document doc)
          Constructor that may be used programmatically to wrap a reader around a Lucene Document that was created by a DocWriter.
 
Method Summary
 boolean fileExists()
          Determine whether the file associated with this Document exists.
 Date getDateFileWasIndexed()
          Gets the date this record was indexed.
 String getDateFileWasIndexedString()
          Gets the date and time this record was indexed, as a String.
protected static String getDateStamp()
          Returns a string for the current time and date, suitable for display in log files and output to standard out.
 String getDeleted()
          Determine whether the status of this Document is deleted, indicated by a return value of "true".
 String getDocDir()
          Gets the absolute path of the directory that contained the File used to index the Document.
 String getDocsource()
          Gets the absolute path of the file that was used to index the Document.
 String getDocsourceEncoded()
          Gets the absolute path of the file that was used to index the Document, encoded.
 String getDoctype()
          Gets the doctype associated with the Document, for example 'dlese_ims', 'adn', or 'html'.
 File getFile()
          Gets the File that was used to index the Document.
 String getFileExists()
          Determine whether the file associated with this Document exists, indicated by a return value of "true".
 String getFileName()
          Gets the name of the File that was used to index the Document.
 String getFullContent()
          Gets the full content of the file that was used to index the Document.
 String getFullContentEncodedAs(String characterEncoding)
          Gets the full content of the file that was used to index the Document, returned in the given character encoding, for example UTF-8.
 long getLastModified()
          Gets the File modification time of the File used to index the Document.
 String getLastModifiedAsUTC()
          Gets the file modification date in UTC format for the given record.
 String getLastModifiedString()
          Gets a String representation of the File modification time of the File used to index the Document.
 boolean isDeleted()
          Determine whether the status of this Document is deleted.
protected static void prtln(String s)
          Output a line of text to standard out, with datestamp, if debug is set to true.
protected static void prtlnErr(String s)
          Output a line of text to error out, with datestamp.
protected static void setDebug(boolean db)
          Sets the debug attribute.
 
Methods inherited from class org.dlese.dpc.index.reader.DocReader
doInit, getAttribute, getDocMap, getDocument, getIndex, getLazyDocMap, getQuery, getReaderType, getRepositoryManager, getScore, init, setDoc
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

FileIndexingServiceDocReader

protected FileIndexingServiceDocReader(org.apache.lucene.document.Document doc)
Constructor that may be used programmatically to wrap a reader around a Lucene Document that was created by a DocWriter.

Parameters:
doc - A Lucene Document.
See Also:
DocWriter

FileIndexingServiceDocReader

protected FileIndexingServiceDocReader()
Constructor that initializes an empty DocReader.

Method Detail

getFullContent

public final String getFullContent()
Gets the full content of the file that was used to index the Document. This includes all XML or HTML tags, etc.

Returns:
The full content as text, or empty string if unable to process.

getFullContentEncodedAs

public final String getFullContentEncodedAs(String characterEncoding)
Gets the full content of the file that was used to index the Document, returned in the given character encoding, for example UTF-8.

Parameters:
characterEncoding - The character encoding to return, for example 'UTF-8'
Returns:
The full content as text, or empty string if unable to process.
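
The documentation does not say how the encoding is applied. One plausible reading is that the source file's bytes are decoded with the named charset, and that any failure yields an empty string, as the Returns clause states. The sketch below illustrates that reading only; `ContentSketch` and `readAs` are hypothetical names and not part of org.dlese.dpc.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical stand-in; not the DLESE implementation.
class ContentSketch {
    // Decode the file's bytes with the named charset.
    // Returns "" if the file cannot be read or the charset name is invalid,
    // mirroring the documented "empty string if unable to process".
    static String readAs(Path source, String characterEncoding) {
        try {
            byte[] raw = Files.readAllBytes(source);
            return new String(raw, Charset.forName(characterEncoding));
        } catch (IOException | IllegalArgumentException e) {
            return "";
        }
    }
}
```

Because failure and an empty file both produce an empty string under this reading, a caller that needs to tell them apart could check fileExists() first.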

getDoctype

public String getDoctype()
Gets the doctype associated with the Document, for example 'dlese_ims', 'adn', or 'html'. Note that to support wildcard searching, the doctype is indexed with a '0' prepended. This method strips the leading zero before returning.

Returns:
The doctype value.
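
The leading-zero convention described for getDoctype can be sketched as follows. This is a minimal illustration, not the library's code: `DoctypeSketch` and both helper methods are hypothetical, and the sketch assumes the indexed value is the doctype with exactly one '0' prepended.

```java
// Hypothetical sketch; not part of org.dlese.dpc.
class DoctypeSketch {
    // Assumed indexing convention: prepend a single '0' to the doctype.
    static String toIndexedForm(String doctype) {
        return "0" + doctype;
    }

    // Strip one leading '0', if present; otherwise return the value unchanged.
    static String stripLeadingZero(String indexed) {
        if (indexed != null && indexed.startsWith("0")) {
            return indexed.substring(1);
        }
        return indexed;
    }

    public static void main(String[] args) {
        String indexed = toIndexedForm("adn");
        System.out.println(indexed + " -> " + stripLeadingZero(indexed));
    }
}
```

Only the first character is removed, so a doctype that itself begins with '0' would survive the round trip under this assumption.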

getDeleted

public String getDeleted()
Determine whether the status of this Document is deleted, indicated by a return value of "true". This does not necessarily mean the file has been deleted.

Returns:
The String "true" if the status is deleted, else "false".

isDeleted

public boolean isDeleted()
Determine whether the status of this Document is deleted. This does not necessarily mean the file has been deleted.

Field: status [true]

Returns:
True if the status is deleted.

getFileExists

public String getFileExists()
Determine whether the file associated with this Document exists, indicated by a return value of "true".

Returns:
The String "true" if the file exists, else "false".

fileExists

public boolean fileExists()
Determine whether the file associated with this Document exists.

Returns:
True if the file exists, else false.
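
getFileExists and fileExists (like getDeleted and isDeleted) expose the same fact as a String and as a boolean. A minimal sketch of how such a pair might relate is shown below, assuming the String form simply delegates to the boolean form; `ExistsSketch` is a hypothetical stand-in and the delegation is an assumption, not something the documentation confirms.

```java
import java.io.File;

// Hypothetical stand-in; not the DLESE class.
class ExistsSketch {
    private final File file;

    ExistsSketch(File file) {
        this.file = file;
    }

    // Boolean form, convenient in Java code.
    boolean fileExists() {
        return file != null && file.exists();
    }

    // String form: "true" or "false", derived from the boolean form.
    String getFileExists() {
        return String.valueOf(fileExists());
    }
}
```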

getDateFileWasIndexedString

public String getDateFileWasIndexedString()
Gets the date and time this record was indexed, as a String.

Returns:
The date and time this record was indexed

getDateFileWasIndexed

public Date getDateFileWasIndexed()
Gets the date this record was indexed.

Returns:
The date this record was indexed

getLastModifiedString

public String getLastModifiedString()
Gets a String representation of the File modification time of the File used to index the Document. Note that while this represents the File modification time, this date stamp does not get updated until the File is re-indexed by the indexer.

Returns:
The File modification time.

getLastModifiedAsUTC

public String getLastModifiedAsUTC()
Gets the file modification date in UTC format for the given record.

Returns:
The file modification date value.
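
The exact "UTC format" is not specified here. As an illustration only, the sketch below renders an epoch-millisecond modification time as an ISO 8601 UTC string with second precision; the format choice, `UtcSketch`, and `toUtcString` are all assumptions rather than the library's behavior.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;

// Hypothetical sketch; the real output format may differ.
class UtcSketch {
    // Convert epoch milliseconds to an ISO 8601 UTC string, dropping sub-second precision.
    static String toUtcString(long lastModifiedMillis) {
        return Instant.ofEpochMilli(lastModifiedMillis)
                .truncatedTo(ChronoUnit.SECONDS)
                .toString();
    }

    public static void main(String[] args) {
        System.out.println(toUtcString(0L)); // prints 1970-01-01T00:00:00Z
    }
}
```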

getLastModified

public long getLastModified()
Gets the File modification time of the File used to index the Document. Note that while this represents the File modification time, this date stamp does not get updated until the File is re-indexed by the indexer.

Returns:
The File modification time.
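
Because the stored value is only refreshed when the file is re-indexed, it can lag behind the file on disk. A caller could detect that by comparing the indexed value with the file's current modification time, as in this sketch. `StalenessSketch` and `changedSinceIndexed` are hypothetical; in real use the two arguments would presumably come from getLastModified() and getFile().

```java
import java.io.File;

// Hypothetical helper; not part of org.dlese.dpc.
class StalenessSketch {
    // True if the file still exists and its on-disk modification time
    // is newer than the modification time recorded in the index.
    static boolean changedSinceIndexed(long indexedLastModified, File source) {
        return source != null
                && source.exists()
                && source.lastModified() > indexedLastModified;
    }
}
```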

getFile

public File getFile()
Gets the File that was used to index the Document.

Returns:
The source File.

getFileName

public String getFileName()
Gets the name of the File that was used to index the Document.

Returns:
The source File name.

getDocsource

public String getDocsource()
Gets the absolute path of the file that was used to index the Document.

Returns:
The absolute path to the underlying file.

getDocsourceEncoded

public String getDocsourceEncoded()
Gets the absolute path of the file that was used to index the Document, encoded.

Returns:
The absolute path to the underlying file, encoded.

getDocDir

public String getDocDir()
Gets the absolute path of the directory that contained the File used to index the Document.

Returns:
The docDir value.

getDateStamp

protected static final String getDateStamp()
Returns a string for the current time and date, suitable for display in log files and output to standard out.

Returns:
The dateStamp value

setDebug

protected static final void setDebug(boolean db)
Sets the debug attribute.

Parameters:
db - The new debug value

prtlnErr

protected static void prtlnErr(String s)
Output a line of text to error out, with datestamp.

Parameters:
s - The text that will be output to error out.

prtln

protected static void prtln(String s)
Output a line of text to standard out, with datestamp, if debug is set to true.

Parameters:
s - The String that will be output.
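
The relationship between setDebug, getDateStamp, prtln, and prtlnErr can be sketched as below: standard-out messages are gated on the debug flag, while error messages are always written. This is a stand-in written from the method descriptions alone; `DebugLogSketch`, its `format` helper, and the date pattern are assumptions, not the library's code.

```java
import java.text.SimpleDateFormat;
import java.util.Date;

// Hypothetical stand-in mirroring the documented method names.
class DebugLogSketch {
    private static boolean debug = false;

    static void setDebug(boolean db) {
        debug = db;
    }

    // Current date and time as a display string (pattern is an assumption).
    static String getDateStamp() {
        return new SimpleDateFormat("MMM d, yyyy h:mm:ss a zzz").format(new Date());
    }

    // Prefix a message with the datestamp.
    static String format(String s) {
        return getDateStamp() + " " + s;
    }

    // Written to standard out only when debug is true.
    static void prtln(String s) {
        if (debug) {
            System.out.println(format(s));
        }
    }

    // Always written to standard error.
    static void prtlnErr(String s) {
        System.err.println(format(s));
    }
}
```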
