Loading data into Neo4j often starts with a handful of simple scripts. As your project grows, so do the number and size of those scripts, and they quickly become hard to scale, manage, and maintain. Apache Hop is an open-source data orchestration platform that was designed from the ground up to tackle these types of problems.
Apache Hop allows data engineers and data developers to visually design workflows and pipelines to build powerful solutions. No other data engineering platform currently has the integration with Neo4j that Apache Hop offers.
More detailed information can be found here: https://hop.apache.org/manual/latest/getting-started/hop-what-is-hop.html
How to Download and Install Hop
Prerequisites: Apache Hop requires Java 11 to be installed on the system. Check the Java docs to download and install a Java runtime for your operating system.
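Before installing Hop, it can help to confirm which Java version is on your PATH. The following is a minimal sketch, not part of Hop itself: the helper names `parse_java_major` and `installed_java_major` are hypothetical, and treating any version newer than 11 as acceptable is an assumption you should check against the documentation for your Hop release.

```python
import re
import subprocess
from typing import Optional

# Taken from the prerequisite above; adjust it if your Hop release differs.
REQUIRED_MAJOR = 11


def parse_java_major(version_output: str) -> Optional[int]:
    """Extract the major Java version from the text printed by `java -version`."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    major = int(match.group(1))
    # Java 8 and earlier report themselves as "1.8.0_x", so the real
    # major version is the second number.
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major


def installed_java_major() -> Optional[int]:
    """Run `java -version`; return the major version, or None if Java is missing."""
    try:
        result = subprocess.run(["java", "-version"], capture_output=True, text=True)
    except FileNotFoundError:
        return None
    # Most runtimes write the version banner to stderr, so check both streams.
    return parse_java_major(result.stderr + result.stdout)


if __name__ == "__main__":
    major = installed_java_major()
    if major is None:
        print("No Java runtime found on PATH")
    elif major < REQUIRED_MAJOR:
        # Assumption: versions newer than REQUIRED_MAJOR are acceptable.
        print(f"Java {major} found, but Java {REQUIRED_MAJOR} is expected")
    else:
        print(f"Java {major} found")
```

Running the script prints a one-line summary of what it found, so you know whether to install a Java runtime before unzipping Hop.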
How to download: Apache Hop can be downloaded from the official downloads page: http://hop.apache.org/download/
How to install:
Unzip Hop to a folder of your choice. From that folder, you’ll be able to run the different Hop tools through their corresponding scripts.
To open the Hop user interface:
On Windows: double-click the hop-gui.bat file to open the Hop GUI.
On Linux: run the hop-gui.sh script.
A few basics you need to know before you start working with Hop:
Transform: A Transform is a unit of work performed in a Pipeline.
Pipeline: A Pipeline is where the actual data work is performed. It consists of Transforms linked by Hops.
Workflow: A Workflow is a set of operations (Actions) that are performed sequentially by default.
Action: An Action is one operation performed in a Workflow. Actions are executed sequentially by default, with parallel execution as a configuration option. An Action returns a true or false exit code, which can be used (or ignored) in the Workflow’s execution.
Hop: A Hop links Actions in a Workflow or Transforms in a Pipeline.
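To make these terms concrete, here is a toy model in Python. It is a sketch for illustration only and is not Hop's API: the classes and their fields are hypothetical, the Pipeline passes rows from one Transform to the next in a simple loop (a real engine may stream rows between Transforms concurrently), and the Workflow uses each Action's true/false result in the simplest possible way, by stopping at the first failure.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List

Row = Dict[str, object]


@dataclass
class Transform:
    """One unit of work in a pipeline: takes rows in, hands rows out."""
    name: str
    apply: Callable[[Iterable[Row]], Iterable[Row]]


@dataclass
class Pipeline:
    """Transforms in order; each link between neighbours plays the role of a Hop."""
    transforms: List[Transform] = field(default_factory=list)

    def run(self, rows: Iterable[Row]) -> List[Row]:
        for transform in self.transforms:
            rows = transform.apply(rows)
        return list(rows)


@dataclass
class Action:
    """One operation in a workflow; it reports success as True or False."""
    name: str
    execute: Callable[[], bool]


@dataclass
class Workflow:
    """Actions run one after another; this toy version stops at the first failure."""
    actions: List[Action] = field(default_factory=list)

    def run(self) -> List[str]:
        executed = []
        for action in self.actions:
            executed.append(action.name)
            if not action.execute():
                break
        return executed


pipeline = Pipeline([
    Transform("uppercase name", lambda rows: ({**r, "name": str(r["name"]).upper()} for r in rows)),
    Transform("adults only", lambda rows: (r for r in rows if r["age"] >= 18)),
])
pipeline_result = pipeline.run([{"name": "ann", "age": 30}, {"name": "bob", "age": 12}])

workflow = Workflow([
    Action("load", lambda: True),
    Action("validate", lambda: False),
    Action("notify", lambda: True),
])
workflow_result = workflow.run()
```

Here `pipeline_result` holds the single row for "ANN", and `workflow_result` lists only the "load" and "validate" actions because "validate" returned false. In Hop itself, whether a false result stops the Workflow or is ignored depends on how you configure it.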
Aura Connection Configuration:
To configure an Aura connection in Hop, follow the steps below:
- Go to `File -> New -> Neo4j Connection`, or click the `Metadata -> Neo4j Connection` icon on the left side of the Hop GUI, as shown in the screenshot below.
- Configure the Aura connection with the details for your environment.
In the Basic tab:
Connection Name: Enter a connection name of your choice.
Automatic?: Uncheck this checkbox.
Server or IP Address: Enter your Aura connection URI. For details on where to find your URI, refer to this article: https://aura.support.neo4j.com/hc/en-us/articles/4402613544851
Database name: neo4j (the default database name for Aura)
Username: Enter your Neo4j Aura database username (the default is neo4j).
Password: Enter your Neo4j Aura database password.
- Go to the Protocol tab:
Version 4 database: Check this if you are running Neo4j version 4 on AuraDB.
Use routing, neo4j:// protocol?: Check this checkbox.
Routing Policy: Leave this blank.
Use encryption?: Check this checkbox.
Trust all certificates?: Leave unchecked (this ensures the connection is validated against a public, registered certificate).
- Click `Test` to make sure the connection is successful.
- Click the `Save` icon at the top left of the screen to save this connection configuration.
Because metadata is set up at the project level, you can reuse this connection in your pipelines and workflows to connect to Neo4j Aura without having to fill in the details again each time.
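The Protocol tab settings correspond to the URI schemes that Neo4j drivers understand: routing selects `neo4j` rather than `bolt`, encryption adds `+s`, and trusting all certificates would use `+ssc` instead. The sketch below shows that mapping. The function `neo4j_uri` and the host name are hypothetical placeholders, and whether Hop assembles its URI in exactly this way internally is an assumption.

```python
from typing import Optional


def neo4j_uri(host: str, routing: bool, encryption: bool,
              trust_all: bool, port: Optional[int] = None) -> str:
    """Build a Neo4j driver URI from checkbox-style connection settings."""
    # "Use routing" picks the neo4j scheme; otherwise a direct bolt connection.
    scheme = "neo4j" if routing else "bolt"
    if encryption:
        # +s validates the certificate chain; +ssc accepts any certificate.
        scheme += "+ssc" if trust_all else "+s"
    authority = host if port is None else f"{host}:{port}"
    return f"{scheme}://{authority}"


# Hypothetical Aura host name; replace it with the one from your Aura console.
aura_uri = neo4j_uri("xxxxxxxx.databases.neo4j.io",
                     routing=True, encryption=True, trust_all=False)
print(aura_uri)
```

With the settings recommended above (routing on, encryption on, trust all certificates off), the result uses the `neo4j+s` scheme, which matches the form of the connection URI shown in the Aura console.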