Class DirectoryHarvester


  • public class DirectoryHarvester
    extends SingleFileEntitiesHarvester
    Harvester for traversing file system directories. Identifiers are built from the path of each file relative to the base directory.

    This harvester supports the following additional harvester properties:

    • directory: file system directory to harvest
    • recursive: also traverse subdirectories (default: false)
    • identifierPrefix: prefix prepended to every relative file system path (these paths are the identifiers of the documents) (default: "")
    • filenameFilter: regex to match the filename (default: none)
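
    The interplay of these properties can be illustrated with a minimal, self-contained sketch. It is not the harvester's actual implementation: the class DirectoryIdentifierSketch and the method listIdentifiers are hypothetical names, and the sketch assumes that the filter must match the whole file name, that a null filter means "none", and that path separators in identifiers are normalized to "/".

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class DirectoryIdentifierSketch {

    /**
     * Hypothetical helper (not part of the harvester API): returns the
     * identifiers that the property descriptions above suggest.
     */
    public static List<String> listIdentifiers(Path directory, boolean recursive,
                                               String identifierPrefix, String filenameFilter)
            throws IOException {
        // Assumption: a null filter stands for "default: none" and accepts every file.
        Pattern filter = (filenameFilter == null) ? null : Pattern.compile(filenameFilter);
        // Depth 1 visits only the entries directly inside the base directory.
        int maxDepth = recursive ? Integer.MAX_VALUE : 1;
        try (Stream<Path> paths = Files.walk(directory, maxDepth)) {
            return paths
                .filter(Files::isRegularFile)
                // Assumption: the regex is applied to the bare file name, as a full match.
                .filter(p -> filter == null
                        || filter.matcher(p.getFileName().toString()).matches())
                // Identifier = prefix + path relative to the base directory.
                .map(p -> identifierPrefix
                        + directory.relativize(p).toString().replace(File.separatorChar, '/'))
                .sorted()
                .collect(Collectors.toList());
        }
    }
}
```

    Under these assumptions, a base directory containing a.xml, b.txt and sub/c.xml yields the identifiers "doc:a.xml" and "doc:sub/c.xml" when recursive is true, identifierPrefix is "doc:" and filenameFilter is ".*\.xml".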
    Author:
    Uwe Schindler