paper-writing-bench
Install: claude install-skill Ar9av/PaperOrchestra
# PaperWritingBench (§3)
Faithful implementation of the PaperWritingBench dataset construction
procedure from PaperOrchestra (Song et al., 2026, arXiv:2604.05018, §3 and
App. C, F.2).
The original benchmark contains 200 papers (100 CVPR 2025 + 100 ICLR 2025).
For each paper, the authors reverse-engineer the (I, E) tuple by stripping
narrative flow from the original PDF using the three prompts in App. F.2.
You can use this skill to reverse-engineer your own benchmark cases from
any paper PDF.
## What this skill does
Given an existing AI research paper (PDF or markdown extract), produce:
- `idea.md` (Sparse variant) — high-level concept note, no math, no
experimental results
- `idea.md` (Dense variant) — detailed technical proposal with LaTeX
equations and variable definitions, but still no experimental results
- `experimental_log.md` — exhaustive raw experimental setup, numeric data,
and qualitative observations, with all narrative references stripped
Each `idea.md` variant, paired with `experimental_log.md`, forms a complete
(I, E) input for the paper-orchestra pipeline. You can then run the pipeline
and compare its output to the original paper using `paper-autoraters`.
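
The content constraints listed above (no math in the Sparse idea, no narrative references in the experimental log) can be spot-checked with a few regular expressions. The following is a minimal sketch, not part of the skill or the paper: the function names and both patterns are hypothetical heuristics, and they will produce false positives (for example, dollar amounts in prose look like inline math).

```python
import re

# Hypothetical heuristics; neither pattern comes from the paper or the skill.
# Matches inline $...$ math, \( or \[ delimiters, and equation/align environments.
MATH_PATTERN = re.compile(r"\$[^$]+\$|\\\(|\\\[|\\begin\{(?:equation|align)\*?\}")

# Matches references back to the source paper's layout, e.g. "Table 2", "Fig. 3".
NARRATIVE_REF_PATTERN = re.compile(
    r"\b(?:Figure|Fig\.|Table|Section|Sec\.)\s*\d+", re.IGNORECASE
)


def check_sparse_idea(text: str) -> list[str]:
    """Return a list of problems found in a Sparse-variant idea note."""
    problems = []
    if MATH_PATTERN.search(text):
        problems.append("sparse idea contains LaTeX math")
    return problems


def check_experimental_log(text: str) -> list[str]:
    """Return one problem entry per narrative reference left in the log."""
    return [
        f"narrative reference: {match.group(0)}"
        for match in NARRATIVE_REF_PATTERN.finditer(text)
    ]


if __name__ == "__main__":
    print(check_sparse_idea("Loss is $L = x^2$."))
    print(check_experimental_log("As shown in Table 2, accuracy improves."))
```

An empty list from both functions means only that these heuristics found nothing; a manual read of the generated files is still needed to confirm that no experimental results leaked into either idea variant.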
## Inputs
- A paper PDF or extracted markdown text. The PaperOrchestra authors use
MinerU (Wang et al., 2024) for PDF→markdown extraction; you (the host agent)
should use whatever PDF extractor your environment provides.
- For controlled experiments, you may also extract figures separately
(the paper uses PDFFigures 2.0 for this).
## Outputs
- `bench/