latent-briefing


This skill should be used when the user asks to "share memory between agents", "KV cache compaction for multi-agent", "orchestrator worker context", "latent briefing", "reduce worker tokens", "cross-agent memory without summarization", or discusses Attention Matching compaction, recursive language models with workers, or token explosion in hierarchical agents.

AI & Automation 895 stars 164 forks Updated today MIT


Quality Score: 93/100

Stars (weight 20%): 98
Recency (weight 20%): 100
Frontmatter (weight 20%): 70
Documentation (weight 15%): 100
Issue Health (weight 10%): 50
License (weight 10%): 100
Description (weight 5%): 100

Skill Content

# Latent Briefing and KV Cache Memory Sharing

Hierarchical multi-agent systems often pay for the same context twice. The orchestrator accumulates a long reasoning trajectory, but each worker usually receives only a narrow text handoff such as a subtask prompt plus raw document slices. Passing the full trajectory fixes coverage but drives token cost up on every worker call. Summarization introduces latency and information loss. Retrieval helps with document access but does not preserve the orchestrator's evolving reasoning state.

Latent Briefing addresses this by sharing memory at the **representation level** rather than the text level. The core idea is to compact the orchestrator trajectory in the worker model's KV cache, keeping positions that are most relevant to the **current worker task**. The method builds on **Attention Matching (AM)** KV cache compaction and adapts it for inference-time multi-agent handoff with task-guided queries, a shared token mask across heads, and robust thresholding.

## When to Activate

Activate this skill when:

- Designing orchestrator-worker or supervisor-specialist systems where workers need access to prior orchestrator state without replaying the full trajectory as text
- Evaluating alternatives to LLM summarization or RAG for cross-agent state transfer
- Implementing or studying **KV cache compaction** as a first-class inference primitive, not only prefix caching of identical prompts
- Debugging token explosion in recursive, hierarchica...
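
The three adaptations named in the overview above (task-guided queries, a shared token mask across heads, and robust thresholding) can be illustrated with a small pure-Python sketch. This is not the Attention Matching algorithm and not this skill's implementation: it only scores cached positions by the attention mass they receive from task queries and then selects which ones to keep. The function names (`position_scores`, `shared_mask`, `compact`), the parameter `k`, and the median-plus-MAD cutoff are all assumptions made for illustration; the source does not specify which robust threshold is used.

```python
import math
from statistics import median


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def position_scores(keys, queries):
    """Score each cached position by the attention it gets from task queries.

    keys[h][t] is the key vector of position t in head h.
    queries[h][q] is a query vector derived from the worker task (hypothetical input).
    Returns one score per position, averaged over all heads and queries,
    so a single score list drives one mask shared by every head.
    """
    num_pos = len(keys[0])
    scores = [0.0] * num_pos
    count = 0
    for head_keys, head_queries in zip(keys, queries):
        scale = 1.0 / math.sqrt(len(head_keys[0]))
        for q in head_queries:
            weights = softmax([dot(q, k) * scale for k in head_keys])
            for t, w in enumerate(weights):
                scores[t] += w
            count += 1
    return [s / count for s in scores]


def shared_mask(scores, k=1.0):
    """Assumed robust rule: keep positions scoring above median + k * MAD."""
    med = median(scores)
    mad = median([abs(s - med) for s in scores])
    threshold = med + k * mad
    return [s > threshold for s in scores]


def compact(keys, values, mask):
    """Apply the same keep-mask to every head's keys and values."""
    keep = [t for t, m in enumerate(mask) if m]
    compact_keys = [[head[t] for t in keep] for head in keys]
    compact_values = [[head[t] for t in keep] for head in values]
    return keep, compact_keys, compact_values


# Toy usage: two heads, five cached positions, one task query per head.
keys = [
    [[0.0, 0.0], [0.0, 0.0], [4.0, 0.0], [0.0, 0.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, 0.0], [0.0, 4.0], [0.0, 0.0], [0.0, 0.0]],
]
values = [[[float(t)] for t in range(5)] for _ in range(2)]
queries = [[[4.0, 0.0]], [[0.0, 4.0]]]

scores = position_scores(keys, queries)
keep, compact_keys, compact_values = compact(keys, values, shared_mask(scores))
print(keep)  # indices of the positions retained for the worker
```

Averaging scores before thresholding is what makes the mask shared: every head drops the same positions, so the compacted cache keeps a single consistent sequence length. A median-based cutoff is one plausible way to stay stable when a few positions absorb most of the attention mass. If all scores are equal, this strict comparison keeps nothing, so a real system would need a fallback such as a minimum keep count.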

Details

Author: guanyang
Repository: guanyang/antigravity-skills
Created: 5 months ago
Last Updated: today
Language: TypeScript
License: MIT
