Context windows are now absurdly long, but retrieval still matters.

Long contexts are impressive, but retrieval isn't obsolete yet.

What changed
Context windows expanded to 200k-2M tokens across major providers
Accuracy tends to degrade for information placed in the middle of very long contexts
Cost and latency scale roughly linearly with context size
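
To make the linear-scaling point concrete, here is a rough back-of-the-envelope sketch. It assumes input tokens are billed at a flat per-token rate; the price, token counts, and request volume below are illustrative placeholders, not any provider's actual figures, and the function name is hypothetical.

```python
def monthly_input_cost(tokens_per_request: int,
                       requests_per_day: int,
                       usd_per_million_tokens: float,
                       days: int = 30) -> float:
    """Estimate monthly input-token spend under flat per-token pricing."""
    cost_per_request = tokens_per_request / 1_000_000 * usd_per_million_tokens
    return cost_per_request * requests_per_day * days


# Placeholder numbers: substitute your provider's price and your own traffic.
PRICE = 3.0        # assumed USD per million input tokens
REQUESTS = 10_000  # assumed requests per day

# Sending a 500k-token corpus with every request...
full_context = monthly_input_cost(500_000, REQUESTS, PRICE)
# ...versus sending about 4k tokens of retrieved passages.
retrieved = monthly_input_cost(4_000, REQUESTS, PRICE)

print(f"full context: ${full_context:,.0f}/month")
print(f"retrieval:    ${retrieved:,.0f}/month")
print(f"ratio:        {full_context / retrieved:.0f}x")
```

Under these assumptions the cost ratio is simply the ratio of tokens sent per request, which is why the gap widens as the context grows. The sketch ignores output tokens, prompt caching discounts, and the cost of running retrieval itself.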

Who it affects
Developers building RAG systems
Teams analyzing large documents
Anyone considering ditching retrieval for huge contexts

What to do now
Test needle-in-haystack performance with your actual content
Calculate cost at scale before committing to large contexts
Keep using retrieval for knowledge bases unless your own testing shows long contexts perform at least as well
Organize long contexts with clear structure and navigation aids such as headings, section labels, and a table of contents
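
A needle-in-haystack test can be sketched in a few lines: place a known fact at different depths in a long context and check whether the model retrieves it. The harness below is a minimal sketch, not a standard benchmark. `ask_model` is a hypothetical callable you would wire to your own provider's client, and `toy_model` is a stand-in that only reads the start and end of the context so the example runs without an API; it makes no claim about how any real model behaves. In practice, replace the synthetic filler with your actual documents.

```python
from typing import Callable, Dict, List


def build_haystack(filler: List[str], needle: str, depth: float) -> str:
    """Insert `needle` among `filler` paragraphs at a relative depth in [0, 1]."""
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0 and 1")
    position = round(depth * len(filler))
    paragraphs = filler[:position] + [needle] + filler[position:]
    return "\n\n".join(paragraphs)


def run_needle_test(ask_model: Callable[[str, str], str],
                    filler: List[str],
                    needle: str,
                    question: str,
                    expected: str,
                    depths: List[float]) -> Dict[float, bool]:
    """Return, for each depth, whether the model's answer contains `expected`."""
    results = {}
    for depth in depths:
        context = build_haystack(filler, needle, depth)
        answer = ask_model(context, question)
        results[depth] = expected.lower() in answer.lower()
    return results


def toy_model(context: str, question: str) -> str:
    """Stand-in for a real model call: only 'sees' the first and last 2000 characters."""
    visible = context[:2000] + context[-2000:]
    return "The code is 7421." if "7421" in visible else "I could not find it."


# Synthetic filler for illustration only; use your real content here.
filler = [f"Paragraph {i}: routine notes about quarterly planning." for i in range(400)]
needle = "The access code for the archive room is 7421."

results = run_needle_test(
    toy_model,
    filler,
    needle,
    question="What is the access code for the archive room?",
    expected="7421",
    depths=[0.0, 0.25, 0.5, 0.75, 1.0],
)
for depth, found in results.items():
    print(f"depth {depth:.2f}: {'found' if found else 'missed'}")
```

Running several needles and several context lengths, rather than a single fact, gives a more reliable picture; a substring check is also a crude scorer and may need replacing for free-form answers.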