Elasticsearch can index media files by using external parsers to extract their text content.
A parser can be a command line tool or a web service. It must return plain text or a JSON structure.
Define one parser for each file extension that should be processed. For command line tools, use ''%in%'' as the placeholder for the input file. Web services must accept the input file as POST data.
Here's a short example:
<code>
pdf /usr/bin/pdftotext %in% -
docx http://givemetext.okfnlabs.org/tika/rmeta
</code>
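To illustrate how such a configuration could be interpreted, here is a minimal Python sketch. It is not the plugin's actual implementation; the function names ''parse_parser_config'', ''is_web_service'' and ''build_command'' are hypothetical, and it assumes each line holds an extension followed by whitespace and then either a command or a URL.

```python
import shlex
from urllib.parse import urlparse


def parse_parser_config(text):
    """Map each file extension to its parser target (a command line or a URL)."""
    parsers = {}
    for line in text.splitlines():
        parts = line.strip().split(None, 1)
        if len(parts) != 2:
            continue  # skip blank or incomplete lines
        ext, target = parts
        parsers[ext.lower()] = target.strip()
    return parsers


def is_web_service(target):
    """Assumption: a target starting with http:// or https:// is a web service."""
    return urlparse(target).scheme in ("http", "https")


def build_command(target, input_path):
    """Replace the %in% placeholder with the quoted input path and split into argv."""
    return shlex.split(target.replace("%in%", shlex.quote(input_path)))


config = """
pdf /usr/bin/pdftotext %in% -
docx http://givemetext.okfnlabs.org/tika/rmeta
"""
parsers = parse_parser_config(config)
for ext, target in parsers.items():
    kind = "web service" if is_web_service(target) else "command line tool"
    print(ext, "->", kind)
```

Quoting the input path before substitution keeps file names with spaces intact as a single argument. A web service target would instead receive the file contents as the body of a POST request.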
|
DokuWiki markup allowed
|
Namespace:
|
|
Namespaces:
|
|
Last modified:
|
|
Language:
|
|
by:
|
|
Results found: %s
|
|
Elasticsearch
|
|
Elasticsearch servers, one per line. Add the port number after a colon and an optional proxy after a comma.
|
|
Elasticsearch username. Required if security is enabled in Elasticsearch (the default since version 8).
|
|
Elasticsearch password. Required if security is enabled in Elasticsearch (the default since version 8).
|
|
Index name to use. The index must already exist or be created with the cli.php tool.
|
|
Text to show in search result snippets
|
|
Search in wiki syntax in addition to page content
|
|
How many hits to show per page
|
|
Translation plugin support: search in current language namespace by default
|
|
Disable quick search (page id suggestions)
|
|