Adding a Graph Database to Comind
Claude summary: This post covers the new Neo4j integration we built for Comind. We added service management scripts, batch sync tools, and real-time updates so you can query and explore relationships between concepts, thoughts, and emotions as a graph instead of digging through individual ATProto records.
Cameron’s note
This is another collaborative post between Claude and me. We spent a bunch of time building comprehensive graph database support for Comind. It’s pretty cool - you can finally see how all the concepts relate to each other without having to manually trace through record references.
The Problem
Comind generates a lot of structured data - concepts like “distributed systems” or “municipal broadband,” emotions like curiosity and frustration, thoughts that analyze content. Each piece is stored as an ATProto record with proper references to related content.
The issue is that exploring those relationships is a pain. If you want to see how concepts connect to each other, or track emotional patterns around certain topics, you end up writing custom code to traverse record references. The data is all there, but getting insights out of it requires a lot of manual work.
We’ve been thinking about this for a while. Comind creates all these interesting conceptual relationships, but there wasn’t a good way to explore them interactively.
Graph Database Solution
Graph databases are built for exactly this kind of relationship-heavy data. Instead of tables and foreign keys, you get nodes and edges - concepts connected to thoughts, emotions linked to content, spheres containing related ideas.
We went with Neo4j because it has solid query language support and good tooling for exploration. Most importantly, we could add it as an additional query layer without messing with Comind’s core ATProto architecture.
The key insight was treating it as secondary storage - your ATProto records stay as the source of truth, but the graph database gives you a convenient way to explore relationships.
What We Built
We didn’t just bolt on graph support - we built a complete integration system:
Service Management
First, we needed reliable ways to run Neo4j alongside Comind. The ./scripts/services.sh
script handles everything:
# Start just the database
./scripts/services.sh start database
# Start everything (database + inference)
./scripts/services.sh start all
# Check what's running
./scripts/services.sh status
This plugs into our existing Docker setup, so Neo4j runs alongside the inference services with proper networking.
Batch Sync Tools
For existing data, we built comprehensive sync tools in scripts/graph_sync.py
:
# Set up database schema
python scripts/graph_sync.py --setup-schema
# Sync all existing records
python scripts/graph_sync.py --sync-all
# Sync specific types
python scripts/graph_sync.py --sync-collection me.comind.concept
The sync process maps ATProto records to graph nodes and creates typed relationships. It handles edge cases like missing references and circular dependencies.
Real-Time Updates
The trickiest part was real-time sync. Every time Comind creates a new concept, thought, or emotion, it should immediately show up in the graph database.
This required deep integration with the RecordManager
class:
# Enable graph sync when creating records
record_manager = RecordManager(client, enable_graph_sync=True)
The important part is this never breaks record creation - if the graph database is down, your ATProto records still get created successfully. We built careful error handling to maintain data integrity.
Data Model
We spent time figuring out how to map Comind’s cognitive content to graph structures:
- Concepts become nodes with their text content
- Thoughts become nodes connected to referenced concepts
- Emotions become nodes linked to triggering content
- Relationships become explicit edges with semantic types
The tricky part was handling Comind’s relationship complexity while keeping the graph structure queryable.
What You Get
The result is what we’re calling a “living knowledge graph” - a dynamic representation of your cognitive engagement that grows as you process new content.
When you like a post about municipal broadband, Comind generates concepts like “internet infrastructure,” “local government,” and “digital equity.” These become nodes in your graph. When you encounter another post about rural internet access, new concepts emerge and connect to existing ones, building a network of related ideas.
Interactive Exploration
With Neo4j Browser, you can explore these patterns immediately:
// Find concepts that appear across multiple conversations
MATCH (c:Concept)<-[:CONCEPT_RELATION]-(source1)
MATCH (c)<-[:CONCEPT_RELATION]-(source2)
WHERE source1 <> source2
RETURN c.text, count(DISTINCT source1) + count(DISTINCT source2) as connections
ORDER BY connections DESC
This shows concepts that bridge different contexts - often the most interesting ideas in your graph.
Tracking Evolution
You can see how your understanding develops over time:
// Concept discovery over the last month
MATCH (c:Concept)
WHERE c.createdAt > datetime() - duration({days: 30})
RETURN date(c.createdAt) as day, count(*) as new_concepts
ORDER BY day
This reveals the rhythm of your cognitive engagement - periods of intense concept generation usually mean you’re discovering new topics or diving deep into familiar ones.
Emotional Patterns
The graph reveals emotional patterns that aren’t obvious from individual records:
// What emotions technical concepts trigger
MATCH (c:Concept)<-[:CONCEPT_RELATION]-(content)
MATCH (content)-[:EMOTION_RELATION]->(e:Emotion)
WHERE c.text CONTAINS "system" OR c.text CONTAINS "network"
RETURN e.emotionType, count(*) as frequency
ORDER BY frequency DESC
You might discover that technical content consistently triggers curiosity, or that certain system discussions generate frustration.
Technical Details
The implementation balances several concerns:
Data Integrity
ATProto records remain the source of truth. The graph database is always secondary - if there’s a conflict, the ATProto record wins. This preserves Comind’s decentralized architecture.
Fault Tolerance
Graph sync failures never prevent record creation. If Neo4j is unavailable, records still get created in your ATProto repository. The sync process can resume and self-heal.
Configuration
You can enable/disable the integration at multiple levels:
# Environment variable
COMIND_GRAPH_SYNC_ENABLED=true
# Database connection settings
COMIND_NEO4J_URI=bolt://localhost:7687
COMIND_NEO4J_USER=neo4j
COMIND_NEO4J_PASSWORD=comind123
Use Cases
Research and Analysis
The graph is great for exploratory analysis. Start with a concept and traverse its neighborhood to discover related ideas, or find bridge concepts that connect different topics.
Content Discovery
When you’re interested in a particular concept, the graph shows you related content you might have forgotten about. It’s especially good at surfacing connections across time.
Pattern Recognition
Large-scale patterns become visible that would be hard to spot in individual records. Which concepts cluster together? What emotional patterns emerge around different content types?
Building Tools
The graph provides a foundation for custom analysis tools. Want to build concept recommendations? Emotional pattern dashboards? Export relationship data? The graph structure makes these much easier.
What Doesn’t Change
This integration doesn’t change Comind’s core operation at all. Your ATProto records are still canonical. The lexicons remain unchanged. The jetstream consumer works exactly as before.
The graph database is purely additive. If you never enable it, Comind works exactly as it always has. If you enable it and later disable it, you lose the graph query capabilities but keep all your ATProto data.
We made this choice deliberately - we wanted powerful new capabilities without creating dependencies or lock-in.
Getting Started
Want to try it?
- Start Neo4j:
./scripts/services.sh start database
- Sync existing data:
./scripts/services.sh sync
- Enable real-time sync: Add
COMIND_GRAPH_SYNC_ENABLED=true
to your.env
- Explore: Open Neo4j Browser at http://localhost:7474
The Graph Database documentation has detailed setup and query examples.
Future Ideas
This foundation enables some interesting possibilities:
Adaptive concept generation - using graph topology to influence how Comind creates new concepts and relationships. Dense clusters might suggest synthesis opportunities, sparse areas might indicate knowledge gaps.
Cross-instance networks - multiple Comind instances could contribute to shared knowledge graphs while preserving individual data sovereignty.
Better analysis tools - the graph structure enables sophisticated analysis of knowledge evolution, concept lifecycles, and cognitive patterns.
Thoughts
Building this has been interesting because it reveals patterns in cognitive content that aren’t obvious from individual records. The graph structure makes the hidden connections between ideas visible, along with emotional resonance and understanding evolution.
We’re particularly excited about the real-time aspect - watching your knowledge graph grow as you engage with new content creates a kind of cognitive awareness that’s both useful and engaging. You start noticing which content generates new concepts versus reinforcing existing ones, how emotional responses cluster around topics, and how ideas from different areas unexpectedly connect.
The graph database doesn’t change what Comind does, but it changes how you can interact with what Comind discovers. It transforms isolated cognitive artifacts into a connected knowledge landscape you can explore and learn from.
– Cameron & Claude