public class PanFMP1IndexHarvester extends SingleFileEntitiesHarvester
Since panFMP was upgraded to use Elasticsearch 2.0, it is no longer possible to directly read the old Lucene 3 indexes used by panFMP 1.x. To use this harvester, you first have to download the latest Apache Lucene 4.10.x release and run its IndexUpgrader command-line tool on the old index. After converting the index, you can harvest it with this harvester.
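The conversion step can be sketched as a small Java helper that assembles the IndexUpgrader command line. This is only an illustration: the class name `IndexUpgradeCommand`, the JAR file name, and the index path are placeholders, not part of panFMP. Since the upgrade rewrites the index in place, running it on a copy of the old index is a sensible precaution.

```java
import java.util.ArrayList;
import java.util.List;

class IndexUpgradeCommand {

  /**
   * Builds the command line for Lucene's IndexUpgrader tool.
   * Both paths are supplied by the caller; nothing is executed here.
   */
  static List<String> build(String luceneCoreJar, String indexDir, boolean verbose) {
    List<String> cmd = new ArrayList<>();
    cmd.add("java");
    cmd.add("-cp");
    cmd.add(luceneCoreJar);
    cmd.add("org.apache.lucene.index.IndexUpgrader");
    if (verbose) {
      cmd.add("-verbose");
    }
    cmd.add(indexDir);
    return cmd;
  }

  public static void main(String[] args) {
    // Placeholder paths: substitute the Lucene 4.10.x core JAR you downloaded
    // and a copy of the old panFMP 1.x index directory.
    List<String> cmd = build("lucene-core-4.10.x.jar", "/path/to/copy-of-old-index", true);
    System.out.println(String.join(" ", cmd));
    // To run it for real, something like:
    // new ProcessBuilder(cmd).inheritIO().start().waitFor();
  }
}
```

The directory passed as the last argument is the one that would afterwards be configured as `indexDir` for this harvester.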
This harvester supports the following additional harvester properties:
indexDir
: file system directory with the old panFMP v1 index
query
: query that matches all documents to harvest (default: all documents)
analyzerClass
: class name of Analyzer to use for the above query string (default: "org.apache.lucene.analysis.standard.StandardAnalyzer")
queryParserClass
: class name of QueryParser to use for the above query string (default: "org.apache.lucene.queryparser.classic.QueryParser")
defaultQueryParserOperator
: default operator when parsing above query string (AND/OR) (default: "AND")
identifierPrefix
: This prefix is added in front of all identifiers from the foreign index (default: "")
luceneMatchVersion
: The Version constant passed to the analyzer and query parser of the foreign index (default is Version.LUCENE_CURRENT)

Modifier and Type | Field and Description |
---|---|
static String | FIELDNAME_CONTENT |
static String | FIELDNAME_DATESTAMP |
static String | FIELDNAME_IDENTIFIER |
static String | FIELDNAME_XML |
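As an illustration of the harvester properties described for this class, the following sketch collects the documented defaults in a `java.util.Properties` object and overlays configured values on top. The class `PanFMP1PropertyDefaults` and its methods are hypothetical helpers, not panFMP API; representing the `Version` constant by its name and the way the prefix is applied are assumptions based on the property descriptions.

```java
import java.util.Properties;

class PanFMP1PropertyDefaults {

  /** Defaults as stated in the property list; indexDir and query have no string default. */
  static Properties defaults() {
    Properties d = new Properties();
    d.setProperty("analyzerClass", "org.apache.lucene.analysis.standard.StandardAnalyzer");
    d.setProperty("queryParserClass", "org.apache.lucene.queryparser.classic.QueryParser");
    d.setProperty("defaultQueryParserOperator", "AND");
    d.setProperty("identifierPrefix", "");
    // Assumption: the Version constant is referred to by its name.
    d.setProperty("luceneMatchVersion", "LUCENE_CURRENT");
    return d;
  }

  /** Returns the configured values, falling back to the documented defaults. */
  static Properties resolve(Properties configured) {
    Properties p = new Properties(defaults());
    p.putAll(configured);
    return p;
  }

  /** Assumed behaviour of identifierPrefix: prepend it to the foreign identifier. */
  static String targetIdentifier(Properties p, String foreignId) {
    return p.getProperty("identifierPrefix") + foreignId;
  }

  public static void main(String[] args) {
    Properties configured = new Properties();
    configured.setProperty("indexDir", "/path/to/upgraded-index"); // placeholder path
    configured.setProperty("identifierPrefix", "old:");
    Properties p = resolve(configured);
    System.out.println(p.getProperty("defaultQueryParserOperator"));
    System.out.println(targetIdentifier(p, "doc-1"));
  }
}
```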
Fields inherited from class Harvester:
fromDateReference, harvestCount, HARVESTER_METADATA_FIELD_LAST_HARVESTED, harvestMessageStep, iconfig, log, processor
Constructor and Description |
---|
PanFMP1IndexHarvester(HarvesterConfig iconfig) |
Modifier and Type | Method and Description |
---|---|
void | close(boolean cleanShutdown): Closes harvester. |
protected void | enumerateValidHarvesterPropertyNames(Set<String> props): This method is used by subclasses to enumerate all available harvester properties that are implemented by them. |
void | harvest(): This method is called by the harvester after it has been opened with Harvester.open(de.pangaea.metadataportal.processor.ElasticsearchConnection, java.lang.String). |
void | open(ElasticsearchConnection es, String targetIndex): Opens harvester for harvesting documents described by the given HarvesterConfig. |
Methods inherited from class SingleFileEntitiesHarvester:
addDocument, addDocument, cancelMissingDocumentDelete
Methods inherited from class Harvester:
addDocument, createMetadataDocumentInstance, deleteDocument, finishReindex, getValidHarvesterPropertyNames, isAllIndexes, isClosed, isDocumentOutdated, main, prepareReindex, runHarvester, runHarvester, setHarvestingDateReference, setValidIdentifiers
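The order in which the methods summarized above are called (open, then harvest, then close with a flag that reports whether harvesting succeeded) can be sketched as follows. The `Lifecycle` interface is a simplified, hypothetical stand-in that omits the ElasticsearchConnection argument; how panFMP itself drives a harvester is not shown on this page, so the error handling here is an assumption.

```java
class HarvestLifecycleSketch {

  /** Hypothetical stand-in for the three lifecycle methods of a harvester. */
  interface Lifecycle {
    void open(String targetIndex) throws Exception;
    void harvest() throws Exception;
    void close(boolean cleanShutdown) throws Exception;
  }

  /**
   * Opens, harvests, and always closes. cleanShutdown is true only if
   * open and harvest both completed without an exception.
   */
  static boolean run(Lifecycle h, String targetIndex) {
    boolean clean = false;
    try {
      h.open(targetIndex);
      h.harvest();
      clean = true;
    } catch (Exception e) {
      System.err.println("harvesting failed: " + e);
    } finally {
      try {
        h.close(clean);
      } catch (Exception e) {
        System.err.println("close failed: " + e);
        clean = false;
      }
    }
    return clean;
  }

  public static void main(String[] args) {
    boolean clean = run(new Lifecycle() {
      @Override public void open(String targetIndex) { System.out.println("open " + targetIndex); }
      @Override public void harvest() { System.out.println("harvest"); }
      @Override public void close(boolean cleanShutdown) {
        System.out.println("close, cleanShutdown=" + cleanShutdown);
      }
    }, "target-index");
    System.out.println("clean run: " + clean);
  }
}
```

Passing `false` to `close` after a failure matches the documented intent that status information for the next harvesting should not be written when an error occurred.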
public static final String FIELDNAME_CONTENT
public static final String FIELDNAME_IDENTIFIER
public static final String FIELDNAME_DATESTAMP
public static final String FIELDNAME_XML
public PanFMP1IndexHarvester(HarvesterConfig iconfig) throws Exception
Throws:
Exception
public void open(ElasticsearchConnection es, String targetIndex) throws Exception
Description copied from class: Harvester
Opens harvester for harvesting documents described by the given HarvesterConfig. Opens Harvester.processor for usage in the Harvester.harvest() method.
public void close(boolean cleanShutdown) throws Exception
Description copied from class: Harvester
Closes harvester. Harvester.processor is closed.
Overrides:
close in class SingleFileEntitiesHarvester
Parameters:
cleanShutdown - enables writing of status information to the Elasticsearch instance for the next harvesting. If an error occurred during harvesting, this should not be done.
Throws:
Exception - if an exception occurs during closing (various types of exceptions can be thrown). Exceptions can be thrown asynchronously and may not affect the correct document.
public void harvest() throws Exception
Description copied from class: Harvester
This method is called by the harvester after it has been opened with Harvester.open(de.pangaea.metadataportal.processor.ElasticsearchConnection, java.lang.String). Override this method in your harvester class. This method should harvest files from somewhere, generate MetadataDocuments and add them with Harvester.addDocument(de.pangaea.metadataportal.processor.MetadataDocument).
protected void enumerateValidHarvesterPropertyNames(Set<String> props)
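Description copied from class: Harvester
This method is used by subclasses to enumerate all available harvester properties that are implemented by them, adding the names to the given Set. The public API for client code requesting property names is Harvester.getValidHarvesterPropertyNames().
Overrides:
enumerateValidHarvesterPropertyNames in class SingleFileEntitiesHarvester
See Also:
Harvester.getValidHarvesterPropertyNames()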
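The property enumeration pattern can be sketched with two stand-in classes. `SketchBaseHarvester` and `SketchIndexHarvester` are hypothetical and only model the plumbing: a subclass adds its own property names after calling the superclass implementation, and client code reads the combined set through the public getter. The name `baseProperty` is a placeholder; the seven subclass names are the properties listed for this harvester.

```java
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;

/** Stand-in for a base harvester; only the property-name plumbing is modelled. */
abstract class SketchBaseHarvester {

  protected void enumerateValidHarvesterPropertyNames(Set<String> props) {
    props.add("baseProperty"); // placeholder for properties of the base class
  }

  /** Public entry point for client code requesting property names. */
  public Set<String> getValidHarvesterPropertyNames() {
    Set<String> props = new TreeSet<>();
    enumerateValidHarvesterPropertyNames(props);
    return Collections.unmodifiableSet(props);
  }
}

/** Stand-in subclass that contributes the properties documented for this harvester. */
class SketchIndexHarvester extends SketchBaseHarvester {

  @Override
  protected void enumerateValidHarvesterPropertyNames(Set<String> props) {
    // Keep the names contributed by superclasses, then add our own.
    super.enumerateValidHarvesterPropertyNames(props);
    Collections.addAll(props,
        "indexDir", "query", "analyzerClass", "queryParserClass",
        "defaultQueryParserOperator", "identifierPrefix", "luceneMatchVersion");
  }
}

class PropertyNamesDemo {
  public static void main(String[] args) {
    for (String name : new SketchIndexHarvester().getValidHarvesterPropertyNames()) {
      System.out.println(name);
    }
  }
}
```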
Copyright ©2007-2013 panFMP Developers c/o Uwe Schindler