Example results
Human vs synthetic reviewer alignment, argument by argument.
This page compares synthetic reviewer output with real human reviewer feedback on published manuscripts. Every human critique in the dataset is matched against the synthetic outputs and labeled as found, partially found, or missed.
Use the manuscript coverage strips to focus on one paper, then drill down in the table to inspect each individual argument pair: the original human issue and the corresponding synthetic reviewer evidence.
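As a rough illustration of how the three labels could roll up into a per-manuscript coverage summary, here is a minimal sketch. The record shape (a manuscript ID paired with a label) and the function names `coverage_by_manuscript` and `coverage_share` are assumptions made for this example; they are not taken from the page's actual data or code.

```python
from collections import Counter, defaultdict

# The three outcomes named on this page; the exact strings are an assumption.
LABELS = ("found", "partially found", "missed")


def coverage_by_manuscript(pairs):
    """Count labels per manuscript.

    `pairs` is an iterable of (manuscript_id, label) tuples,
    one tuple per human critique (hypothetical record shape).
    """
    counts = defaultdict(Counter)
    for manuscript_id, label in pairs:
        if label not in LABELS:
            raise ValueError(f"unknown label: {label!r}")
        counts[manuscript_id][label] += 1
    # Emit every label for every manuscript, including zero counts.
    return {
        manuscript_id: {label: tally[label] for label in LABELS}
        for manuscript_id, tally in counts.items()
    }


def coverage_share(tally):
    """Convert one manuscript's label counts into fractions of its critiques."""
    total = sum(tally.values())
    if total == 0:
        return {label: 0.0 for label in tally}
    return {label: count / total for label, count in tally.items()}
```

Under these assumptions, each manuscript's shares are the kind of proportions a coverage strip could display, while the individual pairs remain available for the argument-level table.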