If you have loaded a lot of data and then deleted it, the disk usage of your Aura instance may not be accurately reflected in the metrics displayed in the Monitoring tab. This behaviour is normal and results from an optimisation of the datastore: node ID reuse. When a node is deleted, it essentially becomes unreferenced; instead of triggering a write operation (which would consume IOPS and involve a system call), the deletion only affects the metadata. As a result, the datastore does not shrink on disk accordingly but becomes essentially "uncompacted", with a large amount of disk space reserved but not used.
See more information on space reuse
Aura does not currently offer compaction as a standard feature. While we work to deliver this important management feature, you can reclaim the space by following these steps:
- (on-prem) Set up a Neo4j installation with the same minor version as your Aura instance (e.g. version 4.3.5 for Aura 4.3). We recommend installing Neo4j through Neo4j Desktop, which makes it easy to select the version and administer the database. Using the Enterprise edition also ensures you have access to the database administration tools needed for the following steps.
- (Aura) Download a dump file (Export) from your Aura instance.
- (on-prem) Load the dump file with neo4j-admin load. This creates the store, which should require the same disk space as in your Aura instance (i.e. before compaction). See details here.
- (on-prem) Start the database (from Desktop) to validate that it was imported properly.
- (on-prem) Stop the database (from Desktop)
- (on-prem) Run neo4j-admin copy --compact-node-store to copy and compact the datastore. See details here. Note the following:
  - Indexes are not copied; instead, a file is generated containing the commands to recreate them later.
  - Your machine needs a lot of memory and excellent disk performance (see note on IOPS).
  - If your datastore is large, allocate more memory so that larger chunks are processed at a time: --to-pagecache=4G and --from-pagecache=16G (as close as possible to the size of the source datastore).
  - Disk space: you need room for the original store plus the compacted store (or use different disks/partitions).
- (on-prem) Start the compacted database to validate it starts and that the copy operation worked successfully.
- (on-prem) Stop the database.
- (on-prem) Create a dump file (it will be based on the compacted data) with neo4j-admin dump. See details here.
- (on-prem) Run neo4j-admin push-to-cloud. The size validation that happens client-side should now succeed.
- (Aura) Once the dump is loaded, check the Metrics -> Storage Used (%) chart; the size change should be reflected there (you may need to select the 6 hours time range to see it).
- (Aura) Connect to your AuraDB instance and run the Cypher commands to re-create the indexes, using the output file generated by the copy step (Step #6.1).
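The load, copy, dump and push-to-cloud steps above can be sketched as a small dry-run shell script. It only prints the commands so you can review them before running anything. The file paths, database names and Aura URI are hypothetical placeholders. The --force and --overwrite flags are assumptions about your setup: a local database that may already exist, and an Aura instance whose current data you intend to replace.

```shell
#!/bin/sh
# Dry-run sketch: prints the neo4j-admin commands instead of executing them.
# Every value below is a hypothetical placeholder; adjust to your environment.
AURA_DUMP="${AURA_DUMP:-/tmp/aura-export.dump}"      # dump exported from Aura
SOURCE_DB="${SOURCE_DB:-neo4j}"                      # database the dump is loaded into
COMPACT_DB="${COMPACT_DB:-compacted}"                # target of the compacting copy
COMPACT_DUMP="${COMPACT_DUMP:-/tmp/compacted.dump}"  # dump created from the compacted copy
AURA_URI="${AURA_URI:-neo4j+s://xxxxxxxx.databases.neo4j.io}"

print_compaction_plan() {
  # Load the Aura dump (--force replaces an existing database of the same name).
  echo "neo4j-admin load --from=$AURA_DUMP --database=$SOURCE_DB --force"
  # Copy and compact; the page cache sizes are the example values from the notes above.
  echo "neo4j-admin copy --from-database=$SOURCE_DB --to-database=$COMPACT_DB --compact-node-store --from-pagecache=16G --to-pagecache=4G"
  # Dump the compacted copy.
  echo "neo4j-admin dump --database=$COMPACT_DB --to=$COMPACT_DUMP"
  # Upload to Aura (--overwrite replaces the data currently in the instance).
  echo "neo4j-admin push-to-cloud --dump=$COMPACT_DUMP --bolt-uri=$AURA_URI --overwrite"
}

print_compaction_plan
```

The database must be stopped before the copy and dump commands are run, and the script deliberately leaves out the start/stop validation steps, which are done from Desktop.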