The Neo4j Data Warehouse (DWH) Connector has been developed to simplify interoperability between Neo4j (including AuraDB and AuraDS), Spark, and popular data warehouse technologies.
The Neo4j DWH Connector builds on the existing functionality of the Apache Spark Connector. It simplifies building Spark-based ETL workflows that move data between Neo4j and popular data warehouses, through a simple, high-level API that runs on top of Spark.
How to use the connector
The DWH Connector can be used to move data between a data warehouse and Neo4j in either direction.
To use it, you create a high-level configuration that includes the source and target details, and then run the DWH Connector with PySpark or Scala Spark.
The DWH Connector can also be run as a Spark Submit job by providing a JSON configuration file that contains the source and target database configurations.
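As a rough illustration of what such a JSON configuration could contain, the Python sketch below assembles a job description with a source and a target section and serializes it to JSON. The helper `build_job_config`, the key names (`name`, `source`, `target`, `format`, `options`, `mode`), the table name, and the connection URL are all assumptions made for this example, not the connector's confirmed schema; check the repository documentation for the exact fields.

```python
import json


def build_job_config(name, source, target):
    """Assemble a job description with a source and a target section.

    Hypothetical helper: the key names are illustrative assumptions,
    not the connector's documented schema.
    """
    for section_name, section in (("source", source), ("target", target)):
        # Each side of the job must say which Spark data source to use
        # and which options to pass to it.
        if "format" not in section or "options" not in section:
            raise ValueError(f"{section_name} needs 'format' and 'options'")
    return {"name": name, "source": source, "target": target}


config = build_job_config(
    name="Load customers from BigQuery into Neo4j",
    source={
        "format": "bigquery",  # assumed Spark data source name for the warehouse
        "options": {"table": "my_project.my_dataset.customers"},  # placeholder table
    },
    target={
        "format": "org.neo4j.spark.DataSource",  # Neo4j Spark Connector data source
        "options": {
            "url": "neo4j://localhost:7687",  # placeholder connection URL
            "labels": ":Customer",  # write each row as a Customer node
        },
        "mode": "Append",
    },
)

# Serialize the configuration; this string is what would be saved to a file
# and handed to the Spark Submit job.
config_json = json.dumps(config, indent=2)
print(config_json)
```

Swapping the `source` and `target` sections would describe a move in the opposite direction, from Neo4j to the data warehouse.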
Supported Data Warehouse Technologies
The DWH Connector allows you to interact with the following popular data warehouse technologies:
- Google BigQuery
- Amazon Redshift
- Azure Synapse
For more information about the Neo4j Data Warehouse Connector, refer to the documentation in the GitHub repository. An introductory article on the Neo4j Developer Blog also includes some examples.