Dataflow | How to Install and configure NIFI

What is it and what is it for?

NiFi is a web-based tool that we can use to perform data ingestion.

It allows us to listen for, format, and apply a first filter to the messages (data) it receives, using its many built-in processors.

It can likewise process and distribute data.

Download Packages and checks

Download the packages from:

There we will find different versions of the product.

Then we compute a checksum and compare it with the value published alongside the download, to verify that the package was not corrupted or altered along the way.

[amercado.nbfor101200] ➤ md5sum nifi-1.4.0-bin.tar.gz
28c5511073452cf59e9ec1b278a1a7e4 nifi-1.4.0-bin.tar.gz
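
The comparison itself can be scripted instead of checked by eye. The following is a minimal sketch, assuming GNU md5sum and awk are available; verify_md5 is a hypothetical helper name, and the expected value should come from the official download page (the hash in the usage comment is simply the one shown above).

```shell
# verify_md5 FILE EXPECTED_MD5
# Hypothetical helper: succeeds only when the MD5 of FILE equals EXPECTED_MD5.
verify_md5() {
    file="$1"
    expected="$2"
    # md5sum prints "<hash>  <name>"; keep only the hash field.
    actual=$(md5sum "$file" | awk '{print $1}')
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file (expected $expected, got $actual)" >&2
        return 1
    fi
}

# Usage, with the file and hash from the example above:
# verify_md5 nifi-1.4.0-bin.tar.gz 28c5511073452cf59e9ec1b278a1a7e4
```

A non-zero return code makes it easy to stop an installation script before unpacking a damaged archive.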


Unpack the package in the path selected for the installation. In my case, I use /opt for the installation of all my Big Data packages.

cd /opt
tar -xvzf nifi-1.4.0-bin.tar.gz

In my case, the user that owns my Big Data infrastructure is hadoop, so I assign ownership of the installation to that user. You can use any user you have available, or create a dedicated user called nifi.

chown -R hadoop: /opt/nifi-1.4.0/

As a recommendation, I propose creating a symbolic link in order to have a simple name.

This makes it easier to locate the installation and to reference it from environment variables, independently of the specific version that we have.

ln -s /opt/nifi-1.4.0/ /opt/nifi
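
The benefit of the link shows up when a newer version is unpacked next to the old one: only the link has to change. A minimal sketch, assuming GNU ln; point_link is a hypothetical helper name, not part of NiFi.

```shell
# point_link TARGET_DIR LINK_PATH
# Hypothetical helper: create the link, or repoint it if it already exists.
point_link() {
    target="$1"
    link="$2"
    # -s: symbolic link; -f: replace an existing link;
    # -n: treat an existing link to a directory as a file instead of descending into it.
    ln -sfn "$target" "$link"
    # Print where the link points now, as a quick check.
    readlink "$link"
}

# Usage, with the paths from this installation:
# point_link /opt/nifi-1.4.0 /opt/nifi
```

Anything that refers to /opt/nifi keeps working after the link is repointed to another version directory.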

Let’s configure NIFI

Before starting the application, we should review the following configuration file:

vi /opt/nifi/conf/
  • Here we could change, for example, the default port, which is 8080.
  • It also holds the settings for cluster mode, Kerberos and ZooKeeper.
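
Before editing, it can help to read the current value of a setting from the command line. The sketch below assumes the configuration file uses Java-style key=value lines; get_prop is a hypothetical helper, and nifi.web.http.port is the key NiFi 1.x uses for the HTTP port as far as I know, so check it against your version.

```shell
# get_prop FILE KEY
# Hypothetical helper: print the value of KEY from a key=value properties file.
get_prop() {
    file="$1"
    key="$2"
    # Split on "=", compare the key literally, then strip "key=" and print
    # the remainder, so values that contain "=" are kept whole.
    awk -F= -v k="$key" '$1 == k { sub(/^[^=]*=/, ""); print; exit }' "$file"
}

# Usage (pass the path of the properties file under /opt/nifi/conf/):
# get_prop "$PROPERTIES_FILE" nifi.web.http.port
```

The function prints nothing when the key is absent, which a calling script can test with [ -z ... ].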

NIFI Prerequisites

  • Have Java Installed.
  • Have the Java environment variables set in the .bash_profile file.
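
Both prerequisites can be checked quickly before starting NiFi. This is a minimal sketch for a POSIX shell; check_java_env is a hypothetical helper name, and it only checks that JAVA_HOME points to a directory containing an executable bin/java, not which Java version it is.

```shell
# check_java_env
# Hypothetical helper: fail when JAVA_HOME is unset or has no executable bin/java.
check_java_env() {
    if [ -z "${JAVA_HOME:-}" ]; then
        echo "JAVA_HOME is not set" >&2
        return 1
    fi
    if [ ! -x "$JAVA_HOME/bin/java" ]; then
        echo "No executable java found under $JAVA_HOME/bin" >&2
        return 1
    fi
    echo "JAVA_HOME=$JAVA_HOME"
}

# Usage:
# check_java_env && "$JAVA_HOME/bin/java" -version
```

Note that variables exported in .bash_profile are only visible to login shells of that user, so a service started by the init system may not see them.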

Starting the Service

To start the NiFi service, we can run it manually or as a Linux service. Let's explore both options.

Starting the Service Manually

 /opt/nifi/ start
 /opt/nifi/ status
 /opt/nifi/ stop

Starting the Service, as a Linux Service

First we must install NiFi as a service, so let's start by going to the NiFi home, where the binaries are located.

cd /opt/nifi/
bin/ install

We can choose the service name that will identify our NiFi instance. Since NiFi is a dataflow tool, we could name the service dataflow, or use any other name that we find convenient as administrators.

cd /opt/nifi/
bin/ install dataflow

If we do not choose a name, the service is installed under the default name, nifi.

Once it is installed, we manage the service with the usual commands:

service nifi start
service nifi status
service nifi stop

Let’s see an example

# service nifi start
# service nifi status
● nifi.service - Apache NiFi
   Loaded: loaded (/etc/systemd/system/nifi.service; disabled; vendor preset: enabled)
   Active: active (running) since Thu 2017-11-23 15:33:50 -03; 52s ago
  Process: 26296 ExecStart=/opt/nifi/bin/ start (code=exited, status=0/SUCCESS)
 Main PID: 26314 (
    Tasks: 91
   Memory: 1.8G
      CPU: 1min 12.788s
   CGroup: /system.slice/nifi.service
           ├─26314 /bin/sh /opt/nifi/bin/ start
           ├─26316 /usr/bin/java -cp /opt/nifi/conf:/opt/nifi/lib/bootstrap/* -Xms12m -Xmx24m -Dorg.apache.nifi.bootstrap.config.log.dir=/opt/nifi/logs -Dorg.apache.nifi.bootstrap.config
           └─26333 java -classpath /opt/nifi/./conf:/opt/nifi/./lib/logback-core-1.2.3.jar:/opt/nifi/./lib/nifi-properties-1.4.0.jar:/opt/nifi/./lib/nifi-framework-api-1.4.0.jar:/opt/nif

Nov 23 15:33:47 srvhadoopt1 systemd[1]: Starting Apache NiFi...
Nov 23 15:33:47 srvhadoopt1[26296]: /opt/nifi/bin/ 88: /opt/nifi/bin/ source: not found
Nov 23 15:33:47 srvhadoopt1[26296]: JAVA_HOME not set; results may vary
Nov 23 15:33:47 srvhadoopt1[26296]: Java home:
Nov 23 15:33:47 srvhadoopt1[26296]: NiFi home: /opt/nifi
Nov 23 15:33:47 srvhadoopt1[26296]: Bootstrap Config File: /opt/nifi/conf/bootstrap.conf
Nov 23 15:33:50 srvhadoopt1 systemd[1]: Started Apache NiFi.

Finally, we open the NiFi URL in a browser, using the port set in the configuration (8080 by default).