Prerequisites
Java Runtime Environment
CogStack Pipeline requires Java SE Runtime Environment version 8 or later to be installed on the system.
It is usually recommended to use the official Oracle Java SE JDK. However, the OpenJDK variant of Java Runtime Environment should also work here.
External applications
Selected components of CogStack Pipeline rely on additional, external applications when processing data. These need to be installed on the system prior to running CogStack:
- Tesseract OCR – for extracting text from images,
- ImageMagick – for converting between image formats.
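The prerequisites listed above can be checked before the first run. The sketch below assumes the usual executable names `java`, `tesseract` and `convert` (ImageMagick 6; newer ImageMagick releases may install `magick` instead), so the names may need adjusting for a particular system.

```shell
#!/bin/sh
# Print the name of every given command that cannot be found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Assumed executable names: java, tesseract, convert (ImageMagick).
missing=$(missing_tools java tesseract convert)
if [ -n "$missing" ]; then
  # $missing is left unquoted on purpose so the names print on one line.
  echo "Missing prerequisites:" $missing
else
  echo "All prerequisites found."
fi
```

The check only confirms that the executables are present; it does not verify their versions.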
Running
CogStack Pipeline is run as a command-line application – just type:
java [parameters] -jar cogstack-*.jar <directory>
where <directory> specifies the directory that holds the CogStack configuration file(s) to be parsed by the CogStack Pipeline application. This is the only mandatory parameter.
Moreover, CogStack Pipeline provides a number of optional [parameters]:
- -DLOG_LEVEL=<level> (default: INFO; available: DEBUG | INFO | ERROR) – specifies the verbosity level of the log messages displayed on standard output,
- -DLOG_FILE_NAME=<name> – specifies the filename where the application logs will be stored (in HTML format),
- -DFILE_LOG_LEVEL=<level> (default: INFO; available: DEBUG | INFO | ERROR) – specifies the verbosity level of the log messages written to the file.
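To illustrate how the optional parameters combine with the mandatory directory, the sketch below assembles the full command line and prints it. The function name `cogstack_command`, the directory `./cogstack_conf` and the log file name `cogstack-log.html` are hypothetical placeholders and not part of CogStack.

```shell
#!/bin/sh
# Print the java command line for CogStack Pipeline.
# Usage: cogstack_command <conf_dir> [log_level] [file_log_level] [log_file]
# The log levels default to INFO, as documented; the default log file name
# is a hypothetical placeholder.
cogstack_command() {
  conf_dir=$1
  log_level=${2:-INFO}
  file_log_level=${3:-INFO}
  log_file=${4:-cogstack-log.html}
  echo "java -DLOG_LEVEL=$log_level -DLOG_FILE_NAME=$log_file -DFILE_LOG_LEVEL=$file_log_level -jar cogstack-*.jar $conf_dir"
}

# Verbose console output, errors only in the log file.
cogstack_command ./cogstack_conf DEBUG ERROR
# prints: java -DLOG_LEVEL=DEBUG -DLOG_FILE_NAME=cogstack-log.html -DFILE_LOG_LEVEL=ERROR -jar cogstack-*.jar ./cogstack_conf
```

The printed line can be pasted into a terminal opened in the directory that contains the CogStack jar file. Note that the -D options have to precede -jar, as in the general form shown earlier.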
Running as a containerised app
The CogStack Pipeline application can also be run inside a container, using the official Docker image available from the cogstacksystems organisation on Docker Hub. This is the highly recommended method of running CogStack Pipeline. Docker can provide lightweight virtualisation of a variety of microservices that CogStack makes use of. Hence, when coupled with the Docker Compose microservice orchestration technology, all of the components required to use CogStack can be set up with a few simple commands.
There are two images available to use: cogstacksystems/cogstack-pipeline:latest (stable) and cogstacksystems/cogstack-pipeline:dev-latest (development) – see: Building CogStack for more information.
The Dockerfile used to build both images is available in the main CogStack pipeline directory.
Prerequisites
The only prerequisite is to have Docker version 1.13 or later installed on the system.
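The version requirement can be verified from the command line. The following is a minimal sketch that compares only the major and minor version numbers; the helper `version_at_least` is a hypothetical name introduced here for illustration.

```shell
#!/bin/sh
# Succeed if version $1 is at least version $2 (numeric major.minor only).
version_at_least() {
  have_major=${1%%.*}
  have_rest=${1#*.}
  have_minor=${have_rest%%.*}
  need_major=${2%%.*}
  need_rest=${2#*.}
  need_minor=${need_rest%%.*}
  [ "$have_major" -gt "$need_major" ] ||
    { [ "$have_major" -eq "$need_major" ] && [ "$have_minor" -ge "$need_minor" ]; }
}

# Ask the Docker client for its version; stays empty if docker is unavailable.
docker_version=$(docker version --format '{{.Client.Version}}' 2>/dev/null || true)

if [ -z "$docker_version" ]; then
  echo "Docker client not found or version unavailable."
elif version_at_least "$docker_version" 1.13; then
  echo "Docker $docker_version satisfies the >= 1.13 requirement."
else
  echo "Docker $docker_version is older than 1.13."
fi
```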
Running
CogStack Pipeline can be run either as a single container or as part of an ecosystem in which it communicates with other microservices.
Using docker run
To run CogStack Pipeline inside a single container using Docker, one can type:
docker run -it cogstacksystems/cogstack-pipeline:latest /bin/bash
This will launch the CogStack container and spawn a bash console. From the console, one can launch CogStack Pipeline as explained in Running locally.
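A container started this way has no access to configuration files kept on the local machine unless a directory is mounted into it. The sketch below prints a docker run command that mounts a local directory read-only at the container path used in the Docker Compose example of this documentation. The function name `cogstack_run_command` is hypothetical, and reusing that mount path for a single container is an assumption. The directory is resolved to an absolute path first, because Docker bind mounts expect one.

```shell
#!/bin/sh
# Print a docker run command that mounts the given local directory
# read-only as the CogStack configuration directory in the container.
# The container path is taken from the Docker Compose example; using it
# with plain docker run is an assumption.
cogstack_run_command() {
  host_dir=$(cd -- "$1" >/dev/null 2>&1 && pwd) || {
    echo "No such directory: $1" >&2
    return 1
  }
  echo "docker run -it -v $host_dir:/usr/src/docker-cogstack/cogstack/cogstack_conf:ro cogstacksystems/cogstack-pipeline:latest /bin/bash"
}

# Show the command for the current directory.
cogstack_run_command .
```

If the resolved path contains spaces, the -v argument needs to be quoted when the printed command is run.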
Using docker-compose
Running CogStack Pipeline as a container within a configured stack of microservices relies on Docker Compose and a microservices configuration file (the Docker Compose file, in YAML format). Multiple sample configurations are covered in the Examples part of the documentation.
For example, in the docker-compose.yml file from Example 2, the CogStack Pipeline service is defined as:
cogstack:
  image: cogstacksystems/cogstack-pipeline:latest
  volumes:
    - ./cogstack:/usr/src/docker-cogstack/cogstack/cogstack_conf:ro
  environment:
    - LOG_LEVEL=info
    - FILE_LOG_LEVEL=off
  depends_on:
    - pgsamples
    - postgres
    - elasticsearch
It uses the latest version of the cogstack-pipeline image from Docker Hub. It also maps the ./cogstack directory on the local machine to the /usr/src/docker-cogstack/cogstack/cogstack_conf directory inside the container, mounted read-only (this is where the CogStack Pipeline configuration file(s) usually reside). When deployed, the service will launch the CogStack Pipeline application and process the data according to the pipeline configuration file found in the mounted /usr/src/docker-cogstack/cogstack/cogstack_conf directory.
To deploy CogStack Pipeline as one of the services defined in the microservices configuration, one only needs to type the following in the directory that contains the YAML file:
docker-compose up
For more examples of deploying the services, please see the Examples part.
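A typical session with the stack can be sketched as follows. The script defaults to a dry run that only prints the commands, so it can be tried without Docker; the helper names `run`, `deploy_stack` and `stop_stack` and the `DRY_RUN` variable are hypothetical and were introduced for this illustration. The service name `cogstack` is the one used in the Docker Compose file shown earlier.

```shell
#!/bin/sh
# Print the command when DRY_RUN=1 (the default here), otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Start the stack and inspect the output of the CogStack Pipeline service.
deploy_stack() {
  run docker-compose up -d          # start all services in the background
  run docker-compose logs cogstack  # show the log output of the pipeline
}

# Stop the services and remove their containers.
stop_stack() {
  run docker-compose down
}

deploy_stack
```

Setting DRY_RUN=0 before running the script executes the commands instead of printing them; it has to be run from the directory that contains the docker-compose.yml file.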