de.pangaea.metadataportal.harvester
Class ZipFileHarvester

java.lang.Object
  extended by de.pangaea.metadataportal.harvester.Harvester
      extended by de.pangaea.metadataportal.harvester.SingleFileEntitiesHarvester
          extended by de.pangaea.metadataportal.harvester.ZipFileHarvester

public class ZipFileHarvester
extends SingleFileEntitiesHarvester

Harvester for unzipping ZIP files and reading their contents. Identifiers look like: "zip:<identifierPrefix><entryFilename>"
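
The identifier format above can be illustrated with a small, self-contained sketch. It is not the panFMP implementation: the class name ZipIdentifierSketch, its helper methods, the sample prefix "demo:" and the choice to skip directory entries are all assumptions made for illustration. Only the "zip:<identifierPrefix><entryFilename>" pattern comes from this page; the ZIP handling uses the standard java.util.zip classes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper class; not part of panFMP.
class ZipIdentifierSketch {

    /** Builds an identifier of the documented form "zip:<identifierPrefix><entryFilename>". */
    static String identifierFor(String identifierPrefix, String entryFilename) {
        return "zip:" + identifierPrefix + entryFilename;
    }

    /** Lists identifiers for all entries of a ZIP stream (skipping directories is an assumption). */
    static List<String> listIdentifiers(InputStream in, String identifierPrefix) throws IOException {
        List<String> ids = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(in)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!entry.isDirectory()) {
                    ids.add(identifierFor(identifierPrefix, entry.getName()));
                }
                zip.closeEntry();
            }
        }
        return ids;
    }

    /** Creates a small in-memory ZIP archive with two entries for demonstration. */
    static byte[] sampleZip() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream out = new ZipOutputStream(bytes)) {
            out.putNextEntry(new ZipEntry("a.xml"));
            out.write("<doc/>".getBytes(StandardCharsets.UTF_8));
            out.closeEntry();
            out.putNextEntry(new ZipEntry("sub/b.xml"));
            out.write("<doc/>".getBytes(StandardCharsets.UTF_8));
            out.closeEntry();
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Prints one identifier per archive entry.
        for (String id : listIdentifiers(new ByteArrayInputStream(sampleZip()), "demo:")) {
            System.out.println(id);
        }
    }
}
```

With the sample archive, the two entries map to "zip:demo:a.xml" and "zip:demo:sub/b.xml".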

This harvester supports the following additional harvester properties:

Author:
Uwe Schindler

Field Summary
static int DEFAULT_RETRY_COUNT
           
static int DEFAULT_RETRY_TIME
           
static int DEFAULT_TIMEOUT
           
protected  int retryCount
          the retryCount from configuration
protected  int retryTime
          the retryTime from configuration
protected  int timeout
          the timeout from configuration
 
Fields inherited from class de.pangaea.metadataportal.harvester.Harvester
fromDateReference, harvestCount, harvestMessageStep, iconfig, index, log
 
Constructor Summary
ZipFileHarvester()
           
 
Method Summary
protected  void enumerateValidHarvesterPropertyNames(Set<String> props)
          This method is used by subclasses to enumerate all available harvester properties that are implemented by them.
 void harvest()
          This method is called by the harvester after it has been opened with Harvester.open(de.pangaea.metadataportal.config.SingleIndexConfig).
 void open(SingleIndexConfig iconfig)
          Opens harvester for harvesting documents into the index described by the given SingleIndexConfig.
 
Methods inherited from class de.pangaea.metadataportal.harvester.SingleFileEntitiesHarvester
addDocument, addDocument, cancelMissingDocumentDelete, close
 
Methods inherited from class de.pangaea.metadataportal.harvester.Harvester
addDocument, createMetadataDocumentInstance, getValidHarvesterPropertyNames, isClosed, isDocumentOutdated, isDocumentOutdated, main, runHarvester, runHarvester, setHarvestingDateReference
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

DEFAULT_RETRY_TIME

public static final int DEFAULT_RETRY_TIME
See Also:
Constant Field Values

DEFAULT_RETRY_COUNT

public static final int DEFAULT_RETRY_COUNT
See Also:
Constant Field Values

DEFAULT_TIMEOUT

public static final int DEFAULT_TIMEOUT
See Also:
Constant Field Values

retryCount

protected int retryCount
the retryCount from configuration


retryTime

protected int retryTime
the retryTime from configuration


timeout

protected int timeout
the timeout from configuration

Constructor Detail

ZipFileHarvester

public ZipFileHarvester()
Method Detail

open

public void open(SingleIndexConfig iconfig)
          throws Exception
Description copied from class: Harvester
Opens harvester for harvesting documents into the index described by the given SingleIndexConfig. Opens Harvester.index for usage in Harvester.harvest() method.

Overrides:
open in class SingleFileEntitiesHarvester
Throws:
Exception - if an exception occurs during opening (various types of exceptions can be thrown).

harvest

public void harvest()
             throws Exception
Description copied from class: Harvester
This method is called by the harvester after it has been opened with Harvester.open(de.pangaea.metadataportal.config.SingleIndexConfig). Override this method in your harvester class. It should harvest files from some source, generate MetadataDocuments, and add them with Harvester.addDocument(de.pangaea.metadataportal.harvester.MetadataDocument).

Specified by:
harvest in class Harvester
Throws:
Exception - of any type.
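
The call order described above (open, then harvest, then close) can be sketched with stand-in classes. This is a minimal illustration under stated assumptions, not the panFMP API: LifecycleHarvester, DemoHarvester and LifecycleDemo are hypothetical names, open takes a plain String instead of a SingleIndexConfig, and the exact signature and behavior of the real close() are not documented on this page.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in with the same open/harvest/close shape; NOT the panFMP Harvester class.
abstract class LifecycleHarvester {
    protected final List<String> events = new ArrayList<>();
    private boolean closed = true;

    /** Prepares the harvester; the real method takes a SingleIndexConfig instead of a String. */
    public void open(String indexName) throws Exception {
        closed = false;
        events.add("open:" + indexName);
    }

    /** Subclasses override this to fetch files and add documents. */
    public abstract void harvest() throws Exception;

    /** Releases resources; recorded here only to show the call order. */
    public void close() throws Exception {
        closed = true;
        events.add("close");
    }

    public boolean isClosed() {
        return closed;
    }
}

// Hypothetical subclass that only records that harvest() ran.
class DemoHarvester extends LifecycleHarvester {
    @Override
    public void harvest() throws Exception {
        if (isClosed()) {
            throw new IllegalStateException("harvester is not open");
        }
        events.add("harvest");
    }
}

class LifecycleDemo {
    /** Runs one harvesting cycle and always closes the harvester afterwards. */
    static List<String> run(LifecycleHarvester harvester, String indexName) throws Exception {
        harvester.open(indexName);
        try {
            harvester.harvest();
        } finally {
            harvester.close();
        }
        return harvester.events;
    }
}
```

The try/finally block is a design choice of this sketch: it keeps the harvester from staying open when harvest() throws, which matters because harvest() is declared to throw any Exception.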

enumerateValidHarvesterPropertyNames

protected void enumerateValidHarvesterPropertyNames(Set<String> props)
Description copied from class: Harvester
This method is used by subclasses to enumerate all available harvester properties that they implement. Override this method in your own implementation and append all harvester property names to the supplied Set. The public API for client code requesting property names is Harvester.getValidHarvesterPropertyNames().

Overrides:
enumerateValidHarvesterPropertyNames in class SingleFileEntitiesHarvester
See Also:
Harvester.getValidHarvesterPropertyNames()
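
The override pattern described for enumerateValidHarvesterPropertyNames can be sketched as follows. The classes StandInHarvester and StandInZipHarvester and the property names "baseOption" and "exampleZipOption" are hypothetical placeholders, not panFMP classes or real harvester properties. Calling the superclass method first is an assumption about how inherited names are kept.

```java
import java.util.Set;
import java.util.TreeSet;

// Stand-in for the real Harvester base class; NOT the panFMP implementation.
abstract class StandInHarvester {

    /** Subclasses append the property names they support to the supplied set. */
    protected void enumerateValidHarvesterPropertyNames(Set<String> props) {
        props.add("baseOption"); // hypothetical property name
    }

    /** Public entry point for client code: collects the names from the whole class hierarchy. */
    public Set<String> getValidHarvesterPropertyNames() {
        Set<String> props = new TreeSet<>();
        enumerateValidHarvesterPropertyNames(props);
        return props;
    }
}

// Hypothetical subclass adding one property of its own.
class StandInZipHarvester extends StandInHarvester {
    @Override
    protected void enumerateValidHarvesterPropertyNames(Set<String> props) {
        // Keep the inherited names, then append the ones implemented here.
        super.enumerateValidHarvesterPropertyNames(props);
        props.add("exampleZipOption"); // hypothetical property name
    }
}
```

In this sketch, client code only calls getValidHarvesterPropertyNames(); the protected method stays an extension point for subclasses.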


Copyright ©2007-2009 panFMP Developers c/o Uwe Schindler