DLESE Tools v1.6.0
java.lang.Object org.dlese.dpc.oai.harvester.Harvester
public class Harvester
Harvests metadata from an OAI data provider, saving the results to file or returning the raw XML as an array of Strings. Supports data providers that use resumption tokens for flow control, selective harvesting by date or set, gzip response compression, and other protocol features. Supports OAI protocol versions 1.1 and 2.0.
To perform a harvest, use one of the following methods:
harvest(java.lang.String, java.lang.String, java.lang.String, java.util.Date, java.util.Date, java.lang.String, boolean, org.dlese.dpc.oai.harvester.HarvestMessageHandler, org.dlese.dpc.oai.harvester.OAIChangeListener, boolean, boolean, boolean, int),
main(java.lang.String[]), or
doHarvest(java.lang.String, java.lang.String, java.lang.String, java.util.Date, java.util.Date, java.lang.String, boolean, java.lang.String, java.lang.String, boolean, boolean, boolean).
See Also:
HarvestMessageHandler, OAIChangeListener
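The selective-harvesting and resumption-token features described above map onto OAI-PMH 2.0 ListRecords requests. The following is a minimal sketch of how such request URLs could be assembled with the JDK alone. It is not the Harvester's own implementation: the class `ListRecordsUrlSketch` and its methods are hypothetical names introduced for illustration, and the datestamp format shown is the seconds-granularity UTC form of protocol version 2.0 only.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Date;

// Hypothetical helper for illustration; not part of DLESE Tools.
final class ListRecordsUrlSketch {

    // OAI-PMH 2.0 UTC datestamp with seconds granularity, e.g. 2003-12-31T23:59:59Z.
    private static final DateTimeFormatter OAI_DATESTAMP =
            DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss'Z'").withZone(ZoneOffset.UTC);

    static String formatDate(Date date) {
        return OAI_DATESTAMP.format(date.toInstant());
    }

    // First request of a harvest: metadataPrefix is required; set, from and until are optional.
    static String initialRequest(String baseURL, String metadataPrefix,
                                 String setSpec, Date from, Date until) {
        StringBuilder url = new StringBuilder(baseURL)
                .append("?verb=ListRecords&metadataPrefix=").append(encode(metadataPrefix));
        if (setSpec != null) {
            url.append("&set=").append(encode(setSpec));
        }
        if (from != null) {
            url.append("&from=").append(encode(formatDate(from)));
        }
        if (until != null) {
            url.append("&until=").append(encode(formatDate(until)));
        }
        return url.toString();
    }

    // Follow-up request: resumptionToken is an exclusive argument, so no other
    // arguments besides the verb are sent with it.
    static String resumeRequest(String baseURL, String resumptionToken) {
        return baseURL + "?verb=ListRecords&resumptionToken=" + encode(resumptionToken);
    }

    // URLEncoder writes spaces as '+'; rewrite them as %20 for use in a URL.
    private static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8).replace("+", "%20");
    }
}
```

A harvester following this pattern would issue `initialRequest(...)` once and then call `resumeRequest(...)` with each token the data provider returns, until a response arrives with an empty or absent resumption token.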
Constructor Summary | |
---|---|
Harvester()
Creates a Harvester that uses no HarvestMessageHandler or OAIChangeListener. |
|
Harvester(HarvestMessageHandler msgHandler,
OAIChangeListener oaiChangeListener,
int timeOutMilliseconds)
Creates a Harvester that uses the given HarvestMessageHandler, OAIChangeListener and response timeout. |
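To illustrate what a response timeout and gzip response compression mean at the HTTP level, here is a small sketch using the JDK's `HttpURLConnection`. How the Harvester applies `timeOutMilliseconds` internally is not documented on this page, so applying the value to both the connect and the read timeout is an assumption, and `HarvestConnectionSketch` is a hypothetical class, not part of DLESE Tools.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.zip.GZIPInputStream;

// Hypothetical helper for illustration; not part of DLESE Tools.
final class HarvestConnectionSketch {

    // Configures a connection without sending the request yet.
    static HttpURLConnection prepare(String requestUrl, int timeOutMilliseconds) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(requestUrl).toURL().openConnection();
            // Assumption: one timeout value bounds both connecting and waiting for data.
            conn.setConnectTimeout(timeOutMilliseconds);
            conn.setReadTimeout(timeOutMilliseconds);
            // Tell the data provider that a gzip-compressed response is acceptable.
            conn.setRequestProperty("Accept-Encoding", "gzip");
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Sends the request and returns the body, decompressing it only if the
    // provider actually answered with Content-Encoding: gzip.
    static InputStream openBody(HttpURLConnection conn) throws IOException {
        InputStream raw = conn.getInputStream();
        return "gzip".equalsIgnoreCase(conn.getContentEncoding())
                ? new GZIPInputStream(raw)
                : raw;
    }
}
```

With this arrangement, a provider that does not respond within the timeout causes `openBody` to fail with a `java.net.SocketTimeoutException` rather than blocking indefinitely.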
Method Summary | |
---|---|
String[][] |
doHarvest(String baseURL,
String metadataPrefix,
String setSpec,
Date from,
Date until,
String outdir,
boolean splitBySet,
String zipName,
String zDir,
boolean writeHeaders,
boolean harvestAll,
boolean harvestAllIfNoDeletedRecord)
Performs the harvest. |
void |
error(SAXParseException exc)
Handles errors. |
void |
fatalError(SAXParseException exc)
Handles fatal errors. |
long |
getEndTime()
Gets the endTime when the harvest completed, either because of an error or at the end of a successful harvest. |
String |
getHarvestedRecordsDir()
Gets the harvestedRecordsDir attribute of the Harvester object |
long |
getHarvestUid()
Returns a unique ID for this harvest. |
int |
getNumRecordsHarvested()
Gets the current number of records that have been harvested by this harvester. |
int |
getNumResumptionTokensIssued()
Gets the number of resumption tokens that have currently been issued by the data provider. |
long |
getStartTime()
Gets the startTime when the harvest began, or 0 if it has not begun yet. |
static String[][] |
harvest(String baseURL,
String metadataPrefix,
String setSpec,
Date from,
Date until,
String outdir,
boolean splitBySet,
HarvestMessageHandler msgHandler,
OAIChangeListener oaiChangeListener,
boolean writeHeaders,
boolean harvestAll,
boolean harvestAllIfNoDeletedRecord,
int timeOutMilliseconds)
Harvest the given provider, saving the resulting metadata to file or returning the results as an array of Strings. |
boolean |
isRunning()
Determines whether this Harvester is currently running or not. |
void |
kill()
Gracefully kills the harvest after the current record is finished being harvested. |
static void |
main(String[] args)
Command line interface for the harvester. |
static void |
setDebug(boolean db)
Sets the debug attribute. |
void |
setNumRecordsForNotification(int numRecords)
Sets the number of records harvested before statusMessage notifications to the HarvestMessageHandler are made. |
void |
warning(SAXParseException exc)
Handles warnings. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
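The summaries of kill(), isRunning(), getNumRecordsHarvested() and setNumRecordsForNotification(int) describe a harvest that stops only between records and reports status every fixed number of records. The sketch below shows one common way to get that behavior, a cooperative stop flag checked once per record. It is an illustration under stated assumptions, not the Harvester's actual code: `CooperativeHarvestSketch`, its `run` method, the default interval of 100 and the `IntConsumer` status callback are all hypothetical stand-ins.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntConsumer;

// Hypothetical stand-in for illustration; not the DLESE Harvester.
final class CooperativeHarvestSketch {

    private final AtomicInteger numRecordsHarvested = new AtomicInteger();
    private volatile boolean killed = false;
    private volatile boolean running = false;
    private volatile int numRecordsForNotification = 100; // assumed default

    // Requests a stop; the loop notices it before starting the next record.
    void kill() {
        killed = true;
    }

    boolean isRunning() {
        return running;
    }

    int getNumRecordsHarvested() {
        return numRecordsHarvested.get();
    }

    void setNumRecordsForNotification(int numRecords) {
        if (numRecords <= 0) {
            throw new IllegalArgumentException("numRecords must be positive");
        }
        numRecordsForNotification = numRecords;
    }

    // Processes records one at a time, reporting the running total to
    // statusHandler each time another numRecordsForNotification records are done.
    void run(List<String> records, IntConsumer statusHandler) {
        running = true;
        try {
            for (String record : records) {
                if (killed) {
                    break; // the record in progress always finishes first
                }
                process(record);
                int count = numRecordsHarvested.incrementAndGet();
                if (count % numRecordsForNotification == 0) {
                    statusHandler.accept(count);
                }
            }
        } finally {
            running = false;
        }
    }

    private void process(String record) {
        // Placeholder for saving or collecting one harvested record.
    }
}
```

Because the flag is only read at the top of the loop, a record that is being written when kill() is called is never left half finished, which is the property the kill() summary describes.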
Constructor Detail |
---|
public Harvester()
public Harvester(HarvestMessageHandler msgHandler, OAIChangeListener oaiChangeListener, int timeOutMilliseconds)
Parameters:
msgHandler
- The HarvestMessageHandler that will receive messages as the harvest progresses, or null if none.
oaiChangeListener
- The OAIChangeListener that will receive notifications, or null for none.
timeOutMilliseconds
- Number of milliseconds the harvester will wait for a response from the data provider before timing out.
Method Detail |
---|
public static void main(String[] args)
Arguments (required arguments must be in this order, optional arguments may be in any order):
Parameters:
args
- The command line arguments
public static String[][] harvest(String baseURL, String metadataPrefix, String setSpec, Date from, Date until, String outdir, boolean splitBySet, HarvestMessageHandler msgHandler, OAIChangeListener oaiChangeListener, boolean writeHeaders, boolean harvestAll, boolean harvestAllIfNoDeletedRecord, int timeOutMilliseconds) throws Hexception, OAIErrorException
Harvest the given provider, saving the resulting metadata to file or returning the results as an array of Strings. Use a SimpleHarvestMessageHandler to have harvest messages sent to standard out. An OAIChangeListener may be specified to receive messages about changes to harvested records.
Parameters:
baseURL
- The baseURL of the data provider, for example "http://www.dlese.org/oai/provider"
metadataPrefix
- The metadataPrefix, for example "oai_dc"
setSpec
- The set to harvest, for example "testset", or null to harvest all sets
from
- The from date, for example "2003-12-31T23:59:59Z", or null for none
until
- The until date, for example "2003-12-31T23:59:59Z", or null for none
outdir
- The path of the output dir. If null or "", the String[][] array is returned; if specified, null is returned
msgHandler
- A handler for status messages that occur during the harvest, or null to ignore messages
oaiChangeListener
- The OAIChangeListener that will receive notifications, or null for none
writeHeaders
- True to have OAI headers written to the output, false not to
harvestAll
- True to delete previous harvested record files and harvest all records again from scratch; false to preserve previous record files and replace or delete only those that have changed
harvestAllIfNoDeletedRecord
- True to harvest all record files from scratch if deleted records are not supported
splitBySet
- True to save each record in separate directories split by set inside outdir, false to save all records to the root of outdir
timeOutMilliseconds
- Number of milliseconds the harvester will wait for a response from the data provider before timing out
Throws:
Hexception
- If a serious error occurs
OAIErrorException
- If an OAI error is returned by the data provider
public void kill()
public void setNumRecordsForNotification(int numRecords)
Parameters:
numRecords
- The new numRecordsForNotification value
public long getStartTime()
public String getHarvestedRecordsDir()
public long getHarvestUid()
public long getEndTime()
public int getNumRecordsHarvested()
public int getNumResumptionTokensIssued()
public boolean isRunning()
public String[][] doHarvest(String baseURL, String metadataPrefix, String setSpec, Date from, Date until, String outdir, boolean splitBySet, String zipName, String zDir, boolean writeHeaders, boolean harvestAll, boolean harvestAllIfNoDeletedRecord) throws Hexception, OAIErrorException
Parameters:
metadataPrefix
- The metadataPrefix, e.g., "oai_dc", or null to harvest all formats
setSpec
- The set, e.g., "testset", or null for none
from
- The from date. May be null.
until
- The until date. May be null.
outdir
- The path of the output dir. If null or "", the String[][] array is returned; if specified, null is returned.
writeHeaders
- True to have OAI headers written to file, false not to
baseURL
- The baseURL of the data provider
splitBySet
- True to split the output by set
zipName
- Name of the zip file to save to, or null for no zipping
zDir
- Directory of the zip file
harvestAll
- True to delete previous harvested records and harvest all records again from scratch
harvestAllIfNoDeletedRecord
- True to harvest all records from scratch if deleted records are not supported
Throws:
Hexception
- If a serious error occurs
OAIErrorException
- If an OAI error was returned by the data provider
public static void setDebug(boolean db)
Parameters:
db
- The new debug value
public void fatalError(SAXParseException exc)
Specified by:
fatalError in interface ErrorHandler
Parameters:
exc
- The Exception thrown
public void error(SAXParseException exc)
Specified by:
error in interface ErrorHandler
Parameters:
exc
- The Exception thrown
public void warning(SAXParseException exc)
Specified by:
warning in interface ErrorHandler
Parameters:
exc
- The Exception thrown
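The fatalError, error and warning methods come from the SAX `org.xml.sax.ErrorHandler` interface, which an XML parser calls while reading a document such as an OAI response. What the Harvester does inside these callbacks is not described on this page, so the following is only a sketch of one conventional policy: record warnings and recoverable errors, and rethrow fatal errors to abort the parse. `CollectingErrorHandlerSketch` and its `parses` helper are hypothetical names for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;

// Hypothetical handler for illustration; not the Harvester's implementation.
final class CollectingErrorHandlerSketch implements ErrorHandler {

    final List<String> messages = new ArrayList<>();

    @Override
    public void warning(SAXParseException exc) {
        messages.add(describe("warning", exc));
    }

    @Override
    public void error(SAXParseException exc) {
        messages.add(describe("error", exc));
    }

    @Override
    public void fatalError(SAXParseException exc) throws SAXException {
        messages.add(describe("fatal", exc));
        throw exc; // the document is not well-formed, so stop parsing
    }

    private static String describe(String level, SAXParseException exc) {
        return level + " at line " + exc.getLineNumber()
                + ", column " + exc.getColumnNumber() + ": " + exc.getMessage();
    }

    // Parses the XML with the given handler installed; returns false if the
    // parse was aborted by a SAXException.
    static boolean parses(String xml, ErrorHandler handler) {
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setErrorHandler(handler);
            reader.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Passing a response with a mismatched closing tag to `parses` makes the parser invoke `fatalError`, so the handler's `messages` list ends up holding an entry that starts with "fatal" and the method returns false.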