Class MetadataDocument
- java.lang.Object
-
- de.pangaea.metadataportal.processor.MetadataDocument
-
- Direct Known Subclasses:
OAIMetadataDocument
public class MetadataDocument extends Object
This class holds all information harvested and provides methods for DocumentProcessor to create an XContentBuilder instance from it.
- Author:
- Uwe Schindler
-
-
Nested Class Summary
Modifier and Type | Class | Description
class | MetadataDocument.XMLConverter | This class handles the transformation from any source to the "official" metadata format and can even validate it
-
Field Summary
Modifier and Type | Field | Description
protected Instant | datestamp |
protected boolean | deleted |
protected HarvesterConfig | iconfig | The harvester configuration.
protected String | identifier |
-
Constructor Summary
Constructor | Description
MetadataDocument(HarvesterConfig iconfig) | Constructor that creates an empty instance for the supplied index configuration.
-
Method Summary
Modifier and Type | Method | Description
protected void | addField(KeyValuePairs kv, FieldConfig f, String val) | Helper method to add a field in the correct format to given XContentBuilder.
protected void | addFields(KeyValuePairs kv) | Helper method that adds all fields to the given XContentBuilder instance.
protected void | addSystemVariables(Map<QName,Object> vars) | Helper method to register all standard variables for the XPath/Templates evaluation.
protected KeyValuePairs | createEmptyKeyValuePairs() | Helper method that generates an empty XContentBuilder instance.
protected NodeList | evaluateTemplate(ExpressionConfig expr) | Helper method to evaluate a template.
protected String | evaluateTemplateAsXHTML(FieldConfig expr) | Helper method to evaluate a template and return result as XHTML.
protected void | finalizeKeyValuePairs(KeyValuePairs kv) | Helper method that finalizes the JSON document.
MetadataDocument.XMLConverter | getConverter() | Returns a converter instance that does transformation and validation according to index config.
Instant | getDatestamp() |
Document | getFinalDOM() | Returns XML contents as DOM tree.
String | getIdentifier() |
KeyValuePairs | getKeyValuePairs() | Converts this instance to an Elasticsearch JSON node.
String | getXML() | Returns XML contents as String (a cache is used).
boolean | isDeleted() | Returns deletion status.
void | loadFromElasticSearchHit(org.elasticsearch.search.SearchHit hit) | "Harvests" an Elasticsearch SearchHit from index for re-parsing.
protected boolean | processFilters() | Helper method that evaluates all filters.
protected void | processXPathVariables() | Helper method to process all user supplied variables for the XPath/Templates evaluation.
void | setDatestamp(Instant datestamp) | Set the datestamp (last modification time of document file).
void | setDeleted(boolean deleted) | Marks a harvested document as deleted.
void | setFinalDOM(Document dom) | Sets the final (transformed) XML contents as DOM tree.
void | setIdentifier(String identifier) | Set the document identifier.
String | toString() |
protected void | walkNodeTexts(StringBuilder sb, Node n, boolean topLevel) | Helper method to walk through a DOM tree node (n) and collect strings.
-
-
-
Field Detail
-
deleted
protected boolean deleted
- See Also:
setDeleted(boolean)
-
datestamp
protected Instant datestamp
- See Also:
setDatestamp(java.time.Instant)
-
identifier
protected String identifier
- See Also:
setIdentifier(java.lang.String)
-
iconfig
protected final HarvesterConfig iconfig
The harvester configuration.
-
-
Constructor Detail
-
MetadataDocument
public MetadataDocument(HarvesterConfig iconfig)
Constructor that creates an empty instance for the supplied index configuration.
-
-
Method Detail
-
loadFromElasticSearchHit
public void loadFromElasticSearchHit(org.elasticsearch.search.SearchHit hit) throws Exception
"Harvests" an Elasticsearch SearchHit from index for re-parsing. Extracts XML blob, identifier and datestamp from Document. Stored fields are not restored. They are regenerated by re-executing all XPath and Templates. HarvesterConfig is used for index specific conversions.
- Throws:
Exception
-
getXML
public String getXML() throws Exception
Returns XML contents as String (a cache is used).
- Throws:
Exception
-
setFinalDOM
public void setFinalDOM(Document dom)
Sets the final (transformed) XML contents as DOM tree. Invalidates cache.
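The interplay between the cached getXML() result and setFinalDOM(Document) can be illustrated with a minimal sketch. CachedXmlHolder, its serializations counter and the newDocument helper are hypothetical stand-ins written for this illustration; they are not part of the library, and the real serialization settings may differ.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;

// Hypothetical stand-in, NOT the real MetadataDocument: it only mirrors the
// documented contract "getXML() uses a cache, setFinalDOM() invalidates it".
final class CachedXmlHolder {
  private Document dom;     // final (transformed) DOM tree
  private String xmlCache;  // serialized form; null means "stale or never built"
  int serializations = 0;   // counts real serializations (for demonstration only)

  void setFinalDOM(Document dom) {
    this.dom = dom;
    this.xmlCache = null;   // invalidate the cached string
  }

  Document getFinalDOM() {
    return dom;
  }

  String getXML() {
    if (dom == null) {
      return null;          // assumption: no DOM means no XML
    }
    if (xmlCache == null) {
      try {
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(dom), new StreamResult(out));
        xmlCache = out.toString();
        serializations++;
      } catch (Exception e) {
        throw new IllegalStateException("XML serialization failed", e);
      }
    }
    return xmlCache;
  }

  // Hypothetical helper: builds a document with a single empty root element.
  static Document newDocument(String rootName) {
    try {
      Document d = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
      d.appendChild(d.createElement(rootName));
      return d;
    } catch (Exception e) {
      throw new IllegalStateException("Cannot create DOM document", e);
    }
  }
}
```

In this sketch, repeated getXML() calls reuse the cached string, and only a new setFinalDOM call forces a fresh serialization.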
-
getFinalDOM
public Document getFinalDOM()
Returns XML contents as DOM tree.
-
getConverter
public MetadataDocument.XMLConverter getConverter()
Returns a converter instance that does transformation and validation according to index config.
-
setDeleted
public void setDeleted(boolean deleted)
Marks a harvested document as deleted. A deleted document is not indexed and will be explicitly deleted from index. A deleted document should not contain XML data; if there is XML data, it will be ignored.
-
isDeleted
public boolean isDeleted()
Returns deletion status.
- See Also:
setDeleted(boolean)
-
setDatestamp
public void setDatestamp(Instant datestamp)
Set the datestamp (last modification time of document file).
-
getDatestamp
public Instant getDatestamp()
- See Also:
setDatestamp(java.time.Instant)
-
setIdentifier
public void setIdentifier(String identifier)
Set the document identifier.
-
getIdentifier
public String getIdentifier()
- See Also:
setIdentifier(java.lang.String)
-
getKeyValuePairs
public KeyValuePairs getKeyValuePairs() throws Exception
Converts this instance to an Elasticsearch JSON node.
- Returns:
XContentBuilder or null, if doc was deleted.
- Throws:
Exception - if an exception occurs during transformation (various types of exceptions can be thrown).
IllegalStateException - if index configuration is unknown
-
createEmptyKeyValuePairs
protected KeyValuePairs createEmptyKeyValuePairs() throws Exception
Helper method that generates an empty XContentBuilder instance. The standard fields are set to the doc properties (identifier, datestamp).
- Returns:
XContentBuilder or null, if doc was deleted.
- Throws:
Exception - if an exception occurs during transformation (various types of exceptions can be thrown).
IllegalStateException - if identifier is empty.
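The documented contract (standard fields taken from the doc properties, null for deleted documents, IllegalStateException for an empty identifier) might look roughly like the following sketch. EmptyPairsSketch is a hypothetical stand-in that uses a plain Map instead of KeyValuePairs. The key names "identifier" and "datestamp" and the order of the two checks are assumptions, not taken from the library.

```java
import java.time.Instant;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in, NOT the real MetadataDocument or KeyValuePairs.
final class EmptyPairsSketch {
  String identifier;
  Instant datestamp;
  boolean deleted;

  Map<String, Object> createEmptyKeyValuePairs() {
    // Assumption: the identifier check comes before the deletion check.
    if (identifier == null || identifier.isEmpty()) {
      throw new IllegalStateException("identifier is empty");
    }
    if (deleted) {
      return null; // deleted documents produce no JSON node
    }
    Map<String, Object> kv = new LinkedHashMap<>();
    kv.put("identifier", identifier);               // assumed key name
    if (datestamp != null) {
      kv.put("datestamp", datestamp.toString());    // assumed key name, ISO-8601 text
    }
    return kv;
  }
}
```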
-
finalizeKeyValuePairs
protected void finalizeKeyValuePairs(KeyValuePairs kv) throws Exception
Helper method that finalizes the JSON document.
- Throws:
Exception
-
addFields
protected void addFields(KeyValuePairs kv) throws Exception
Helper method that adds all fields to the given XContentBuilder instance. This method executes all XPath/Templates and converts the results.
- Throws:
Exception - if an exception occurs during transformation (various types of exceptions can be thrown).
-
processFilters
protected boolean processFilters() throws Exception
Helper method that evaluates all filters. This method executes the XPath and converts the results to a boolean. The results of all filters are combined according to the ACCEPT/DENY type.
- Throws:
Exception - if an exception occurs during transformation (various types of exceptions can be thrown).
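One plausible way to combine boolean XPath results by ACCEPT/DENY type is sketched below. The exact combination rule is not specified here, so the rule used (start from a default verdict, and let each matching filter overwrite it so the last match wins) is an assumption. FilterSketch, Filter, FilterType and the parse helper are hypothetical names introduced for this illustration.

```java
import java.io.StringReader;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical sketch, NOT the library's filter implementation.
final class FilterSketch {
  enum FilterType { ACCEPT, DENY }

  static final class Filter {
    final FilterType type;
    final String xpath;

    Filter(FilterType type, String xpath) {
      this.type = type;
      this.xpath = xpath;
    }
  }

  // Assumed rule: begin with the default verdict; every filter whose XPath
  // evaluates to true replaces the verdict with its own type.
  static boolean accept(Document dom, List<Filter> filters, FilterType defaultType) {
    XPath xp = XPathFactory.newInstance().newXPath();
    boolean accept = (defaultType == FilterType.ACCEPT);
    try {
      for (Filter f : filters) {
        Boolean matched = (Boolean) xp.evaluate(f.xpath, dom, XPathConstants.BOOLEAN);
        if (matched) {
          accept = (f.type == FilterType.ACCEPT);
        }
      }
    } catch (Exception e) {
      throw new IllegalStateException("Filter evaluation failed", e);
    }
    return accept;
  }

  // Hypothetical helper: parses an XML string into a DOM document.
  static Document parse(String xml) {
    try {
      return DocumentBuilderFactory.newInstance().newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml)));
    } catch (Exception e) {
      throw new IllegalStateException("Cannot parse XML", e);
    }
  }
}
```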
-
addSystemVariables
protected void addSystemVariables(Map<QName,Object> vars)
Helper method to register all standard variables for the XPath/Templates evaluation. Override this method to register any special variables dependent on the MetadataDocument implementation. The variables must be registered in the supplied Map.
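A subclass that registers an extra variable could follow the pattern sketched below. BaseDocumentSketch and SetAwareDocumentSketch are hypothetical stand-ins, and the namespace urn:example:variables and the variable names docIdentifier and docSet are invented for this illustration. The names of the library's real standard variables are not listed here.

```java
import java.util.HashMap;
import java.util.Map;
import javax.xml.namespace.QName;

// Hypothetical stand-in for the base class; namespace and names are invented.
class BaseDocumentSketch {
  static final String NS = "urn:example:variables";
  protected String identifier;

  protected void addSystemVariables(Map<QName, Object> vars) {
    vars.put(new QName(NS, "docIdentifier"), identifier);
  }

  // Collects all variables the way an evaluation step might before running XPath.
  final Map<QName, Object> collectVariables() {
    Map<QName, Object> vars = new HashMap<>();
    addSystemVariables(vars);
    return vars;
  }
}

// Hypothetical subclass that adds an implementation-specific variable.
class SetAwareDocumentSketch extends BaseDocumentSketch {
  protected String setName;

  @Override
  protected void addSystemVariables(Map<QName, Object> vars) {
    super.addSystemVariables(vars);               // keep the standard variables
    vars.put(new QName(NS, "docSet"), setName);   // register into the supplied Map
  }
}
```

Calling super.addSystemVariables first keeps the standard variables available while the subclass adds its own entries to the same Map.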
-
processXPathVariables
protected final void processXPathVariables() throws Exception
Helper method to process all user supplied variables for the XPath/Templates evaluation. The variables are stored in thread local storage.
- Throws:
Exception - if an exception occurs during transformation (various types of exceptions can be thrown).
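Storing variables in thread local storage and resolving them during XPath evaluation can be sketched with the standard javax.xml.xpath API. ThreadLocalVariablesSketch and its method names are hypothetical; how the library itself wires its thread local storage to the XPath engine is not shown here.

```java
import java.util.HashMap;
import java.util.Map;
import javax.xml.namespace.QName;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

// Hypothetical sketch, NOT the library's implementation.
final class ThreadLocalVariablesSketch {
  // Each thread sees its own variable map.
  private static final ThreadLocal<Map<QName, Object>> VARS =
      ThreadLocal.withInitial(HashMap::new);

  static void setVariable(QName name, Object value) {
    VARS.get().put(name, value);
  }

  static void clearVariables() {
    VARS.remove();
  }

  static int variableCount() {
    return VARS.get().size();
  }

  // Evaluates an XPath expression against an empty document; $variables are
  // looked up in the current thread's map through an XPathVariableResolver.
  static String evaluateAsString(String expression) {
    try {
      Document context =
          DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
      XPath xpath = XPathFactory.newInstance().newXPath();
      xpath.setXPathVariableResolver(name -> VARS.get().get(name));
      return xpath.evaluate(expression, context);
    } catch (Exception e) {
      throw new IllegalStateException("XPath evaluation failed: " + expression, e);
    }
  }
}
```

A caller would typically clear the variables once a document has been processed, so that values do not leak into the next document handled by the same thread.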
-
evaluateTemplate
protected NodeList evaluateTemplate(ExpressionConfig expr) throws TransformerException
Helper method to evaluate a template. This method is called by variables and fields, when a template is used instead of an XPath. For internal use only!
- Throws:
TransformerException
-
evaluateTemplateAsXHTML
protected String evaluateTemplateAsXHTML(FieldConfig expr) throws TransformerException, IOException
Helper method to evaluate a template and return result as XHTML. This method is called by fields with datatype XHTML. For internal use only!
- Throws:
TransformerException
IOException
-
walkNodeTexts
protected void walkNodeTexts(StringBuilder sb, Node n, boolean topLevel)
Helper method to walk through a DOM tree node (n) and collect strings. For internal use only!
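A recursive text collection over a DOM node can be sketched as follows. NodeTextWalkerSketch is a hypothetical stand-in. The role of topLevel is not documented here, so the sketch assumes that an attribute value is collected only when the attribute itself is the top-level node. The newline separator appended after each child is also an assumption.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical sketch, NOT the library's implementation.
final class NodeTextWalkerSketch {
  static void walkNodeTexts(StringBuilder sb, Node n, boolean topLevel) {
    if (n == null) {
      return;
    }
    switch (n.getNodeType()) {
      case Node.ELEMENT_NODE:
      case Node.DOCUMENT_NODE:
      case Node.DOCUMENT_FRAGMENT_NODE:
        // Container nodes: descend into every child and separate the results.
        for (Node c = n.getFirstChild(); c != null; c = c.getNextSibling()) {
          walkNodeTexts(sb, c, false);
          sb.append('\n');
        }
        break;
      case Node.ATTRIBUTE_NODE:
        // Assumption: attribute values count only if the attribute is the top-level node.
        if (topLevel) {
          sb.append(n.getNodeValue());
        }
        break;
      case Node.TEXT_NODE:
      case Node.CDATA_SECTION_NODE:
        sb.append(n.getNodeValue());
        break;
      default:
        break; // comments, processing instructions etc. are skipped
    }
  }

  // Hypothetical helper: parses an XML string into a DOM document.
  static Document parse(String xml) {
    try {
      return DocumentBuilderFactory.newInstance().newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml)));
    } catch (Exception e) {
      throw new IllegalStateException("Cannot parse XML", e);
    }
  }
}
```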
-
addField
protected void addField(KeyValuePairs kv, FieldConfig f, String val) throws Exception
Helper method to add a field in the correct format to given XContentBuilder. The format is defined by the FieldConfig. The value is given as string. For internal use only!
- Throws:
Exception - if an exception occurs during transformation (various types of exceptions can be thrown).
-
-