Neo4j
Whether you’re setting up a retrieval pipeline or building a knowledge graph, connecting Neo4j to UltiHash lets you efficiently retrieve the raw data behind your graph nodes, like documents, images, or videos. Neo4j stores relationships and metadata, but not the actual file content. That’s where UltiHash comes in.
To make this work, you’ll need to set up a middleware app: it accepts graph queries, runs them against Neo4j, resolves the returned node or relationship metadata to actual file references, and pulls the raw data from UltiHash on demand.
Your typical pipeline will look like:
1. Store your raw data (e.g., PDFs, videos) in UltiHash.
2. Ingest metadata and relationships into Neo4j, including references (e.g., file paths or IDs) to the data stored in UltiHash.
3. A middleware app handles queries to Neo4j and resolves the relevant nodes or relationships.
4. Based on the references, the middleware retrieves the raw files from UltiHash.
5. The raw data is then served to the user or passed to downstream applications.
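The first two steps above (storing the raw file, then recording a reference to it in the graph) can be sketched as follows. This is a minimal illustration, not a prescribed ingestion method: the function name `ingest_poster` is hypothetical, and the `Movies` label and the `movie_title` / `poster_filename` properties are assumed to match the example query used later on this page. It expects a boto3 S3 client pointed at UltiHash and an open Neo4j session.

```python
def ingest_poster(s3_client, session, bucket, key, data, movie_title):
    """Store raw bytes in UltiHash, then record a reference to them in Neo4j.

    Hypothetical helper: s3_client is a boto3 S3 client configured with the
    UltiHash endpoint, session is an open neo4j driver session.
    """
    # Step 1: upload the raw file through the S3 API.
    s3_client.put_object(Bucket=bucket, Key=key, Body=data)

    # Step 2: keep only the object key on the graph node, never the bytes.
    # The label and property names are assumptions taken from the example query.
    session.run(
        "MERGE (m:Movies {movie_title: $title}) "
        "SET m.poster_filename = $key",
        {"title": movie_title, "key": key},
    )
    return key
```

With the `s3` client and `driver` from the API example, a call could look like `ingest_poster(s3, session, "movies", "poster.jpg", file_bytes, "Some Title")` inside a `with driver.session() as session:` block.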
Here’s an example API that handles both the query to Neo4j and the retrieval of raw data from UltiHash, so you can serve the right files directly from your knowledge graph.
Flask API
from io import BytesIO

import boto3
from flask import Flask, request, jsonify
from neo4j import GraphDatabase
from PIL import Image

# Neo4j connection settings (placeholders: replace with your own values)
NEO4J_URI = "neo4j+s://your-link.neo4j.io"
NEO4J_USER = "user"
NEO4J_PASSWORD = "your_password_here"
driver = GraphDatabase.driver(NEO4J_URI, auth=(NEO4J_USER, NEO4J_PASSWORD))

# UltiHash connection settings (placeholders), accessed through a boto3 S3 client
ULTIHASH_ENDPOINT = "endpoint_url"
ULTIHASH_BUCKET = "movies"  # my bucket name
ULTIHASH_ACCESS_KEY = "YOUR-ACCESS-KEY"
ULTIHASH_SECRET_KEY = "YOUR-SECRET-KEY"
s3 = boto3.client(
    "s3",
    endpoint_url=ULTIHASH_ENDPOINT,
    aws_access_key_id=ULTIHASH_ACCESS_KEY,
    aws_secret_access_key=ULTIHASH_SECRET_KEY,
)

app = Flask(__name__)


@app.route("/query", methods=["POST"])
def query():
    # Expect a JSON body such as {"query": "<Cypher>", "fetch_image": true}
    data = request.get_json(silent=True) or {}
    cypher_query = data.get("query")
    if not cypher_query:
        return jsonify({"error": "Missing query parameter"}), 400

    # Run the query and materialize the records before the session closes
    with driver.session() as session:
        results = session.run(cypher_query)
        records = [record.data() for record in results]

    if not records:
        return jsonify({"error": "No results found"}), 404

    # Collect the UltiHash object references returned by the query, if any
    poster_filenames = [
        record["poster_filename"] for record in records if record.get("poster_filename")
    ]
    # "fetch_image" is optional and defaults to true
    if poster_filenames and data.get("fetch_image", True):
        fetch_and_open_images(poster_filenames)

    return jsonify({"results": records})


def fetch_and_open_images(filenames):
    for filename in filenames:
        # my objects have file extensions
        if not filename.lower().endswith((".jpg", ".jpeg")):
            filename += ".jpg"
        try:
            # Pull the raw bytes from UltiHash and decode them as an image
            response = s3.get_object(Bucket=ULTIHASH_BUCKET, Key=filename)
            file_data = response["Body"].read()
            image = Image.open(BytesIO(file_data))
            # Opens the default image viewer on the machine running this app
            image.show()
        except Exception as e:
            print(f"Error fetching/opening {filename}: {e}")


if __name__ == "__main__":
    app.run(debug=True)
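The example above opens each image with `image.show()` on the machine running the Flask app, which is handy for a local demo. To serve the raw data to the caller instead (the last step of the pipeline), one option is to embed the file bytes in the JSON response. The sketch below is one possible approach under stated assumptions, not part of the original example: `normalize_key`, `attach_posters`, and the `poster_base64` / `poster_error` fields are hypothetical names, and the `.jpg` rule simply mirrors the extension handling in the code above.

```python
import base64


def normalize_key(filename):
    """Append ".jpg" when the reference has no JPEG extension."""
    if not filename.lower().endswith((".jpg", ".jpeg")):
        filename += ".jpg"
    return filename


def attach_posters(records, s3_client, bucket):
    """Return copies of the records with poster bytes embedded as base64 text.

    Hypothetical helper: s3_client is a boto3 S3 client pointed at UltiHash.
    """
    enriched = []
    for record in records:
        item = dict(record)  # leave the caller's records untouched
        filename = record.get("poster_filename")
        if filename:
            try:
                response = s3_client.get_object(
                    Bucket=bucket, Key=normalize_key(filename)
                )
                raw = response["Body"].read()
                item["poster_base64"] = base64.b64encode(raw).decode("ascii")
            except Exception as exc:
                # Keep the record and report why the file could not be fetched
                item["poster_error"] = str(exc)
        enriched.append(item)
    return enriched
```

Inside `query()`, the final line could then become `return jsonify({"results": attach_posters(records, s3, ULTIHASH_BUCKET)})`. Base64 inflates the payload by roughly a third, so for large files such as videos a streaming response is likely a better fit.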
API Query
Neo4j is built around Cypher, a query language designed specifically for graph data. It is optimized for pattern-based queries, which makes it easy to express relationships and navigate connected data. For this setup, I wanted to keep the user experience seamless: you interact with Neo4j through Cypher just as you normally would. The API takes any Cypher query, executes it on the database, and returns structured results, so you don’t have to learn new syntax or change how you query. The only difference is that when your query returns associated files (like movie posters), they are retrieved directly from UltiHash, so you don’t have to manage raw data retrieval separately and manually.
curl -X POST "http://127.0.0.1:5000/query" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "MATCH (a:Actors {actor: \"Brad Pitt\"})-[:`Acts in`]->(m:Movies)<-[:`Directed`]-(d:Directors {director: \"Quentin Tarantino\"}) RETURN m.movie_title AS movie_title, m.poster_filename AS poster_filename",
    "fetch_image": true
  }'
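The same request can be built from Python, which avoids escaping the quotes inside the Cypher string by hand. This is a minimal sketch using only the standard library; `build_query_request` is a hypothetical helper, and the URL assumes the Flask app is running locally on its default port, as in the curl call above.

```python
import json
import urllib.request


def build_query_request(cypher, fetch_image=True, base_url="http://127.0.0.1:5000"):
    """Build (but do not send) the POST request for the /query endpoint."""
    # json.dumps takes care of escaping the double quotes inside the query
    body = json.dumps({"query": cypher, "fetch_image": fetch_image}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/query",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


cypher = (
    'MATCH (a:Actors {actor: "Brad Pitt"})-[:`Acts in`]->(m:Movies)'
    '<-[:`Directed`]-(d:Directors {director: "Quentin Tarantino"}) '
    "RETURN m.movie_title AS movie_title, m.poster_filename AS poster_filename"
)
req = build_query_request(cypher)

# Sending the request requires the Flask app to be running:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```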