Retrieval-Augmented Generation (RAG) systems often suffer from the "lost in the middle" phenomenon and limited context windows. When you retrieve 20+ documents, JSON's structural overhead (repeated keys, brackets, quotes) can consume 30-40% of your context tokens.
By standardizing on TOON for your vector database metadata, you significantly increase the information density of your retrieved chunks.
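To see the overhead concretely, you can measure both encodings with a tokenizer. Below is a minimal sketch using the `tiktoken` library with the `cl100k_base` encoding; the two sample chunks are illustrative, and your exact savings will depend on how long the content values are relative to the structure.

import json
import tiktoken

# Two retrieved chunks, encoded once as JSON and once as TOON.
docs = [
    {"id": "doc_1", "score": 0.89, "content": "Fiscal policy tightening..."},
    {"id": "doc_2", "score": 0.85, "content": "Market reaction was..."},
]
as_json = json.dumps(docs)
as_toon = (
    "[2]{id,score,content}:\n"
    "  doc_1,0.89,Fiscal policy tightening...\n"
    "  doc_2,0.85,Market reaction was..."
)

enc = tiktoken.get_encoding("cl100k_base")
json_tokens = len(enc.encode(as_json))
toon_tokens = len(enc.encode(as_toon))
print(f"JSON: {json_tokens} tokens, TOON: {toon_tokens} tokens")
print(f"saved: {1 - toon_tokens / json_tokens:.0%}")

The gap widens as the document count grows, because JSON repeats the keys for every record while TOON states them once.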
Implementation Strategy
Instead of storing raw JSON in your vector store's `metadata` field, store a TOON-formatted string. When retrieving, pass it directly to the LLM. Recent models (GPT-4, Claude 3) generally read TOON's tabular layout without an explicit decoding step, since the header row declares the fields once.
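As a concrete sketch of the storage side, here is one way to do it with Chroma; the collection name and the `toon` metadata key are illustrative, and any vector store with string metadata fields follows the same pattern.

import chromadb

client = chromadb.Client()
collection = client.create_collection("articles")  # illustrative name

# Store each chunk with a pre-formatted TOON row in its metadata,
# so retrieval hands back strings ready to drop into the prompt.
collection.add(
    ids=["doc_1", "doc_2"],
    documents=["Fiscal policy tightening...", "Market reaction was..."],
    metadatas=[
        {"toon": "doc_1,Fiscal policy tightening..."},
        {"toon": "doc_2,Market reaction was..."},
    ],
)

# At query time, join the stored rows under a single TOON header.
res = collection.query(query_texts=["fiscal policy"], n_results=2)
rows = [f"  {m['toon']}" for m in res["metadatas"][0]]
toon_block = f"[{len(rows)}]{{id,content}}:\n" + "\n".join(rows)

If you want relevance scores in the table, splice them in from the query-time distances rather than storing them, since they vary per query.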
params = {
    # Traditional approach: verbose, since JSON repeats every key per record.
    "docs_json": [
        {"id": "doc_1", "score": 0.89, "content": "Fiscal policy tightening..."},
        {"id": "doc_2", "score": 0.85, "content": "Market reaction was..."},
    ],
    # TOON approach: dense, since one header declares the fields and each
    # row carries only values. Rows sit indented beneath the header.
    "docs_toon": """[2]{id,score,content}:
  doc_1,0.89,Fiscal policy tightening...
  doc_2,0.85,Market reaction was...""",
}
# The TOON prompt spends roughly 40% fewer tokens on structure,
# leaving room for 1-2 extra documents in the same window.
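Rather than hand-writing these strings, you can generate the block from the same list of dicts. The encoder below is a hand-rolled sketch, not the official TOON library; it assumes flat dicts with uniform keys and applies a CSV-style quoting rule for cells containing the delimiter (check the TOON spec for the full escaping rules).

def to_toon(items: list[dict]) -> str:
    """Encode a uniform list of flat dicts as a TOON tabular block."""
    keys = list(items[0])
    header = f"[{len(items)}]{{{','.join(keys)}}}:"

    def cell(value) -> str:
        s = str(value)
        # Quote cells that contain the delimiter, CSV-style.
        return f'"{s}"' if "," in s else s

    rows = ["  " + ",".join(cell(item[k]) for k in keys) for item in items]
    return "\n".join([header, *rows])

docs = [
    {"id": "doc_1", "score": 0.89, "content": "Fiscal policy tightening..."},
    {"id": "doc_2", "score": 0.85, "content": "Market reaction was..."},
]
print(to_toon(docs))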