Class PanFMP1IndexHarvester


  • public class PanFMP1IndexHarvester
    extends SingleFileEntitiesHarvester
    This harvester supports replicating XML contents from a legacy panFMP 1.x installation. It can replicate indexes into a different XML schema (by applying a transformation to the harvested XML content) or replicate only subsets of other indexes, selected by a query string.

    Since panFMP was upgraded to use Elasticsearch 2.0, it is no longer possible to directly read the old Lucene 3 indexes used by panFMP 1.x. To use this harvester, first download the latest Apache Lucene 4.10.x release and run its IndexUpgrader command line tool on the old index directory. Once the index has been converted, it can be harvested with this harvester.
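The upgrade step described above can be sketched as follows. This is a minimal illustration, not part of panFMP: it only assembles the command line for Lucene's `org.apache.lucene.index.IndexUpgrader` main class and prints it. The jar file name, the index path, and the helper class `UpgraderCommand` are assumptions chosen for the example; adjust them to the Lucene 4.10.x release you actually downloaded.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of panFMP or Lucene): builds the command line
// that invokes Lucene's IndexUpgrader tool on an old index directory.
final class UpgraderCommand {

    // Returns the argument list; nothing is executed here.
    static List<String> build(String luceneCoreJar, String indexDir, boolean verbose) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-cp");
        cmd.add(luceneCoreJar);                          // assumed path to the lucene-core 4.10.x jar
        cmd.add("org.apache.lucene.index.IndexUpgrader"); // main class of the upgrade tool
        if (verbose) {
            cmd.add("-verbose");                          // optional: print upgrade progress
        }
        cmd.add(indexDir);                                // directory holding the panFMP 1.x index
        return cmd;
    }

    public static void main(String[] args) {
        // Both values below are placeholders for illustration only.
        List<String> cmd = build("lucene-core-4.10.4.jar", "/path/to/old/index", true);
        System.out.println(String.join(" ", cmd));
        // To actually run the upgrade, one could pass the list to
        // new ProcessBuilder(cmd).inheritIO().start().waitFor();
    }
}
```

Work on a copy of the old index directory, since the upgrade rewrites the index in place.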

    This harvester supports the following additional harvester properties:

    • indexDir: file system directory with the old panFMP v1 index
    • query: query that matches all documents to harvest (default: all documents)
    • analyzerClass: class name of Analyzer to use for the above query string (default: "org.apache.lucene.analysis.standard.StandardAnalyzer")
    • queryParserClass: class name of QueryParser to use for the above query string (default: "org.apache.lucene.queryparser.classic.QueryParser")
    • defaultQueryParserOperator: default operator when parsing the above query string (AND/OR) (default: "AND")
    • identifierPrefix: prefix added in front of all identifiers from the foreign index (default: "")
    • luceneMatchVersion: the Version constant passed to the analyzer and query parser of the foreign index (default: Version.LUCENE_CURRENT)
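As a rough illustration of how the properties listed above and their documented defaults fit together, the sketch below models them with `java.util.Properties`. It does not show how panFMP itself reads its configuration; the class `HarvesterPropertyDefaults`, its methods, and the sample values are hypothetical. The `query` and `luceneMatchVersion` properties are left unset because the text gives their defaults only descriptively ("all documents", `Version.LUCENE_CURRENT`).

```java
import java.util.Properties;

// Hypothetical illustration (not panFMP code): the harvester properties
// documented above, with the string defaults stated in the list.
final class HarvesterPropertyDefaults {

    // Defaults copied from the property list; query and luceneMatchVersion are
    // omitted because their literal property values are not given in the text.
    static Properties defaults() {
        Properties d = new Properties();
        d.setProperty("analyzerClass", "org.apache.lucene.analysis.standard.StandardAnalyzer");
        d.setProperty("queryParserClass", "org.apache.lucene.queryparser.classic.QueryParser");
        d.setProperty("defaultQueryParserOperator", "AND");
        d.setProperty("identifierPrefix", "");
        return d;
    }

    // Layers site-specific settings over the defaults; lookups for keys that
    // are not set explicitly fall back to defaults().
    static Properties forIndex(String indexDir, String identifierPrefix) {
        Properties p = new Properties(defaults());
        p.setProperty("indexDir", indexDir);
        p.setProperty("identifierPrefix", identifierPrefix);
        return p;
    }

    public static void main(String[] args) {
        // Placeholder values for illustration only.
        Properties p = forIndex("/path/to/upgraded/index", "legacy:");
        for (String key : new String[] {"indexDir", "analyzerClass", "queryParserClass",
                "defaultQueryParserOperator", "identifierPrefix"}) {
            System.out.println(key + " = " + p.getProperty(key));
        }
    }
}
```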
    Author:
    Uwe Schindler