Claude's context window is 200K tokens. That's roughly 150,000 words — enough for a decent-sized novel. Surely that's enough context for any project, right? Not even close.
The Scale of Team Knowledge
A moderately active team generates thousands of observations across their AI conversations every week. Architecture discussions, design explorations, debugging sessions, product decisions, customer insights, technical research — all of it producing context that's valuable to future conversations.
After a month of active use, a typical SLEDS space contains context equivalent to 2-3 million tokens. After six months, it can be 10-20 million. No context window can hold that. And it shouldn't have to.
The Right Context, Not All Context
The insight behind SLEDS isn't "make the window bigger." It's "make the window smarter." When you start a conversation in Claude connected to a sled, you don't get 10 million tokens of context dumped in. You get the 5,000-10,000 tokens that are most relevant to what you're working on right now.
This works through a three-stage retrieval pipeline. First, SLEDS uses the current conversation's topic to perform a semantic search across all threads and assets in the space. Second, it ranks results by recency, relevance, and connection strength in the knowledge graph. Third, it synthesizes the top results into a compact context bundle that fits comfortably within any model's window.
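The pipeline above can be sketched in a few lines of Python. This is an illustrative sketch, not SLEDS's actual implementation: the `Thread` fields, the scoring weights, and the 4-characters-per-token estimate are all assumptions.

```python
from dataclasses import dataclass

@dataclass
class Thread:
    id: str
    text: str
    similarity: float = 0.0      # stage 1: semantic match score
    recency: float = 0.0         # stage 2: how recent (0..1)
    graph_strength: float = 0.0  # stage 2: knowledge-graph connection strength

def rank(threads, w_sim=0.5, w_rec=0.3, w_graph=0.2):
    # Stage 2: blend relevance, recency, and graph strength into one score.
    # The weights here are hypothetical.
    score = lambda t: w_sim * t.similarity + w_rec * t.recency + w_graph * t.graph_strength
    return sorted(threads, key=score, reverse=True)

def build_bundle(ranked, budget_tokens=8000):
    # Stage 3: pack top-ranked threads until the token budget is spent,
    # producing a compact context bundle.
    bundle, used = [], 0
    for t in ranked:
        cost = len(t.text) // 4  # rough 4-chars-per-token estimate
        if used + cost > budget_tokens:
            break
        bundle.append(t)
        used += cost
    return bundle
```

The budget cap is the key design point: no matter how large the space grows, the bundle stays within the 5,000–10,000 token range the text describes.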
Semantic Search + Knowledge Graph
The combination of vector search (pgvector with HNSW indexing) and knowledge graph traversal is what makes this work. Vector search finds conceptually related content: if you're discussing payment integration, it surfaces threads about Stripe, PCI compliance, and pricing — even if those threads don't share keywords.
The knowledge graph adds structural relationships. If the Stripe integration thread is linked to the database schema asset, and the database schema is linked to the migration planning thread, those connections surface relevant context that pure semantic search might miss.
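To make the interplay concrete, here is a minimal sketch of vector search followed by one-hop graph expansion. In production, pgvector's HNSW index would handle the nearest-neighbor step at scale; this toy version uses a linear scan, and the data structures are assumptions for illustration.

```python
import math

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query_vec, embeddings, edges, k=3):
    # Step 1: vector search — the k threads most similar to the query.
    seeds = sorted(embeddings, key=lambda i: cosine(query_vec, embeddings[i]),
                   reverse=True)[:k]
    # Step 2: graph expansion — pull in directly linked threads/assets,
    # surfacing context that pure semantic search would miss.
    expanded = set(seeds)
    for node in seeds:
        expanded |= set(edges.get(node, []))
    return expanded
```

In the Stripe example, the database schema asset has low semantic similarity to a payment query, but the graph edge from the Stripe thread pulls it into the result set anyway.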
Dynamic Relevance
The context bundle isn't static. As a conversation evolves, the relevant context shifts. If you start by discussing the payment flow and then pivot to error handling, the context system can surface different threads for the second half of the conversation. This happens transparently — you don't need to ask for different context.
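One way to implement this kind of drift-tracking is to rebuild the retrieval query from recent conversation turns, weighting newer turns more heavily. The decay scheme below is a hypothetical sketch, not a description of SLEDS internals.

```python
import math

def rolling_query(turn_embeddings, decay=0.5):
    # Combine per-turn embeddings into one query vector, newest turn
    # weighted highest, so retrieval follows the conversation's pivots.
    # The decay factor is an assumption.
    dim = len(turn_embeddings[0])
    acc = [0.0] * dim
    weight = 1.0
    for vec in reversed(turn_embeddings):  # iterate newest-first
        acc = [a + weight * v for a, v in zip(acc, vec)]
        weight *= decay
    norm = math.sqrt(sum(x * x for x in acc)) or 1.0
    return [x / norm for x in acc]  # unit-normalize for cosine search
```

After a pivot from payment flow to error handling, the newest turns dominate the query vector, so the next retrieval pass surfaces error-handling threads without the user asking.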
The Result: Functionally Infinite Memory
From the user's perspective, the AI knows everything the team knows. You never hit a wall where the AI says "I don't have context about that." You never need to paste in background information. You never need to tell the AI about a decision that was made in a different conversation.
The context window is still 200K tokens. But the memory behind it is unlimited. That's the difference SLEDS makes — not a bigger window, but a smarter one.