java.lang.Object
  de.pangaea.metadataportal.harvester.Harvester
      de.pangaea.metadataportal.harvester.SingleFileEntitiesHarvester

public abstract class SingleFileEntitiesHarvester
extends Harvester
Abstract harvester class for single file entities (such as files from a web page or from a local directory). The harvester makes it possible
to add XML documents given by a Source to the index. These are harvested, but if a fatal parse error occurs, the harvester will either stop harvesting
(as it would with OAI-PMH), ignore the document, or delete it (if it exists in the index), depending on the harvester property "parseErrorAction".
This panFMP harvester supports the following harvester properties in addition to the default ones:
parseErrorAction: what to do if a parse error occurs. Can be STOP, IGNOREDOCUMENT, or DELETEDOCUMENT (default is to ignore the document).
deleteMissingDocuments: remove documents after harvesting that were deleted from the source (may be a heavy operation). (default: true)
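The three parseErrorAction values described above can be illustrated with a minimal sketch. This is not panFMP code: the class ParseErrorPolicySketch, its methods, and the use of a plain List as the "index" are hypothetical stand-ins, and only the value names and the default come from the description above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in, not part of panFMP: models the documented
// values of the "parseErrorAction" harvester property.
class ParseErrorPolicySketch {
    enum ParseErrorAction { STOP, IGNOREDOCUMENT, DELETEDOCUMENT }

    // Parses the property value; a missing value falls back to the
    // documented default (ignore the document).
    static ParseErrorAction parse(String value) {
        if (value == null || value.trim().isEmpty()) {
            return ParseErrorAction.IGNOREDOCUMENT;
        }
        return ParseErrorAction.valueOf(value.trim());
    }

    // Applies the chosen action to a document that failed to parse.
    // The List stands in for the index (assumption for illustration).
    static String onParseError(ParseErrorAction action, String identifier, List<String> index) {
        switch (action) {
            case STOP:
                throw new IllegalStateException("Harvesting stopped at " + identifier);
            case DELETEDOCUMENT:
                index.remove(identifier); // no effect if the document is not in the index
                return "deleted";
            default:
                return "ignored";
        }
    }

    public static void main(String[] args) {
        List<String> index = new ArrayList<>(Arrays.asList("doc-1", "doc-2"));
        System.out.println(onParseError(parse("DELETEDOCUMENT"), "doc-1", index));
        System.out.println(onParseError(parse(null), "doc-2", index));
        System.out.println(index);
    }
}
```

Modelling the property as an enum keeps the set of accepted values closed, so an unknown value fails early in parse rather than in the middle of a harvesting run.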
| Field Summary |
|---|
| Fields inherited from class de.pangaea.metadataportal.harvester.Harvester |
|---|
| fromDateReference, harvestCount, harvestMessageStep, iconfig, index, log |
| Constructor Summary |
|---|
| SingleFileEntitiesHarvester() |
| Method Summary | |
|---|---|
| protected void | addDocument(String identifier, Date lastModified, Source xml): Adds a document to the Harvester.index working in the background. |
| protected void | addDocument(String identifier, long lastModified, Source xml): Adds a document to the Harvester.index working in the background. |
| protected void | cancelMissingDocumentDelete(): Disables the property "deleteMissingDocuments" for this instance. |
| void | close(boolean cleanShutdown): Closes harvester. |
| protected void | enumerateValidHarvesterPropertyNames(Set<String> props): This method is used by subclasses to enumerate all available harvester properties that are implemented by them. |
| void | open(SingleIndexConfig iconfig): Opens harvester for harvesting documents into the index described by the given SingleIndexConfig. |
| Methods inherited from class de.pangaea.metadataportal.harvester.Harvester |
|---|
| addDocument, createMetadataDocumentInstance, getValidHarvesterPropertyNames, harvest, isClosed, isDocumentOutdated, isDocumentOutdated, main, runHarvester, runHarvester, setHarvestingDateReference |
| Methods inherited from class java.lang.Object |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Constructor Detail |
|---|
public SingleFileEntitiesHarvester()
| Method Detail |
|---|
public void open(SingleIndexConfig iconfig)
          throws Exception
Description copied from class: Harvester
Opens harvester for harvesting documents into the index described by the given SingleIndexConfig.
Opens Harvester.index for use in the Harvester.harvest() method.
Overrides: open in class Harvester
Throws: Exception - if an exception occurs during opening (various types of exceptions can be thrown).
public void close(boolean cleanShutdown)
           throws Exception
Description copied from class: Harvester
Closes harvester. Harvester.index is closed.
Overrides: close in class Harvester
Parameters: cleanShutdown - enables writing of status information to the index for the next harvesting. If an error occurred during harvesting, this should not be done.
Throws: Exception - if an exception occurs during closing (various types of exceptions can be thrown).
Exceptions can be thrown asynchronously and may not relate to the current document.
protected final void addDocument(String identifier,
                                 Date lastModified,
                                 Source xml)
                          throws Exception
Adds a document to the Harvester.index working in the background. If a parsing error occurs, the document is handled according to parseErrorAction.
It is also added to the valid identifiers (if unseen documents should be deleted).
Parameters:
identifier - the document's identifier in the index
lastModified - the last-modification date, which is used to calculate the next harvesting start date. If the document is older than the last harvesting, it is skipped.
xml - the transformer source of the document; null to only update the document status (lastModified) and add it to the valid identifiers
Throws: Exception
See Also: Harvester.addDocument(MetadataDocument)
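The parameter rules for addDocument can be sketched with a small stand-in class. This is an illustration only, not the panFMP implementation: AddDocumentSketch, its fields, and the boolean return value are hypothetical, and the sketch assumes that a skipped or status-only document is still registered as a valid identifier and that the newest lastModified value feeds the next harvesting start date.

```java
import java.io.StringReader;
import java.util.Date;
import java.util.HashSet;
import java.util.Set;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;

// Hypothetical stand-in, not part of panFMP.
class AddDocumentSketch {
    final Date lastHarvest;                           // assumed: date of the previous run, may be null
    final Set<String> validIdentifiers = new HashSet<>();
    Date newestSeen;                                  // assumed: basis for the next start date
    int indexed = 0;

    AddDocumentSketch(Date lastHarvest) {
        this.lastHarvest = lastHarvest;
    }

    // Returns true if the document content would be handed to the index.
    boolean addDocument(String identifier, Date lastModified, Source xml) {
        validIdentifiers.add(identifier);             // seen, so not a deletion candidate
        if (lastModified != null && (newestSeen == null || lastModified.after(newestSeen))) {
            newestSeen = lastModified;
        }
        if (xml == null) {
            return false;                             // status update only
        }
        if (lastHarvest != null && lastModified != null && lastModified.before(lastHarvest)) {
            return false;                             // older than the last harvesting: skipped
        }
        indexed++;
        return true;
    }

    public static void main(String[] args) {
        AddDocumentSketch h = new AddDocumentSketch(new Date(1000L));
        Source xml = new StreamSource(new StringReader("<doc/>"));
        System.out.println(h.addDocument("old", new Date(500L), xml));
        System.out.println(h.addDocument("new", new Date(2000L), xml));
        System.out.println(h.addDocument("status-only", new Date(2000L), null));
        System.out.println(h.validIdentifiers.size());
    }
}
```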
protected void addDocument(String identifier,
                           long lastModified,
                           Source xml)
                    throws Exception
Adds a document to the Harvester.index working in the background.
Throws: Exception
See Also: addDocument(String,Date,Source)

protected void cancelMissingDocumentDelete()
Disables the property "deleteMissingDocuments" for this instance.
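How "deleteMissingDocuments" and cancelMissingDocumentDelete() might interact can be sketched as follows. The class MissingDocumentSketch and its methods seen and findMissing are hypothetical and do not reflect panFMP internals; the sketch only assumes that identifiers seen during harvesting are collected and that anything in the index but not in that set becomes a deletion candidate, unless the deletion was cancelled.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in, not part of panFMP.
class MissingDocumentSketch {
    // null means deletion of missing documents has been cancelled
    private Set<String> validIdentifiers = new HashSet<>();

    // Records an identifier that was seen during this harvesting run.
    void seen(String identifier) {
        if (validIdentifiers != null) {
            validIdentifiers.add(identifier);
        }
    }

    // Mirrors the documented method name: turns the deletion off for this instance.
    void cancelMissingDocumentDelete() {
        validIdentifiers = null;
    }

    // Returns the indexed identifiers that were not seen; empty if cancelled.
    Set<String> findMissing(Set<String> indexedIdentifiers) {
        Set<String> missing = new TreeSet<>();
        if (validIdentifiers == null) {
            return missing;
        }
        missing.addAll(indexedIdentifiers);
        missing.removeAll(validIdentifiers);
        return missing;
    }

    public static void main(String[] args) {
        Set<String> indexed = new HashSet<>(Arrays.asList("a", "b", "c"));
        MissingDocumentSketch run = new MissingDocumentSketch();
        run.seen("a");
        run.seen("c");
        System.out.println(run.findMissing(indexed));
        run.cancelMissingDocumentDelete();
        System.out.println(run.findMissing(indexed));
    }
}
```

A subclass might cancel the deletion when it cannot enumerate the full source, for example after a partial listing, since an incomplete set of seen identifiers would otherwise mark live documents as missing.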
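protected void enumerateValidHarvesterPropertyNames(Set<String> props)
Description copied from class: Harvester
This method is used by subclasses to enumerate all available harvester properties that are implemented by them. The property names are collected in the supplied Set.
The public API for client code requesting property names is Harvester.getValidHarvesterPropertyNames().
Overrides: enumerateValidHarvesterPropertyNames in class Harvester
See Also: Harvester.getValidHarvesterPropertyNames()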
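The override pattern for enumerateValidHarvesterPropertyNames can be sketched with stand-in classes. None of the classes below are panFMP classes: BaseHarvester and SingleFileBase only imitate the documented method names, the return type of getValidHarvesterPropertyNames is an assumption, and "exampleDefaultProperty" and "exampleDirectory" are placeholder property names. Only "parseErrorAction" and "deleteMissingDocuments" come from the class description.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-ins, not part of panFMP.
class PropertyNamesSketch {
    static abstract class BaseHarvester {
        protected void enumerateValidHarvesterPropertyNames(Set<String> props) {
            props.add("exampleDefaultProperty"); // placeholder for the default properties
        }

        // Public entry point for client code; collects names from the whole hierarchy.
        public final Set<String> getValidHarvesterPropertyNames() {
            Set<String> props = new TreeSet<>();
            enumerateValidHarvesterPropertyNames(props);
            return props;
        }
    }

    static abstract class SingleFileBase extends BaseHarvester {
        @Override
        protected void enumerateValidHarvesterPropertyNames(Set<String> props) {
            super.enumerateValidHarvesterPropertyNames(props); // keep inherited names
            props.add("parseErrorAction");
            props.add("deleteMissingDocuments");
        }
    }

    static class MyHarvester extends SingleFileBase {
        @Override
        protected void enumerateValidHarvesterPropertyNames(Set<String> props) {
            super.enumerateValidHarvesterPropertyNames(props);
            props.add("exampleDirectory"); // placeholder for a subclass-specific property
        }
    }

    public static void main(String[] args) {
        System.out.println(new MyHarvester().getValidHarvesterPropertyNames());
    }
}
```

Calling the superclass method first in each override is what keeps the inherited names in the result; an override that skips it would silently drop them.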