References
Aczel, Balazs, Barnabas Szaszi, and Alex O Holcombe, “A
billion-dollar donation: Estimating the cost of researchers’ time spent
on peer review,” Research integrity and peer review, 6
(2021), 1–8 (Springer).
Asher, Samuel G. Z., Janet Malzahn, Jessica M. Persano, Elliot J.
Paschal, Andrew C. W. Myers, and Andrew B. Hall, “Do claude code
and codex p-hack? Sycophancy and statistical analysis in
large language models,” 2026.
Asirvatham, Hemanth, Elliott Mokski, and Andrei Shleifer, “GPT as a
measurement tool,” {NBER} Working Paper, 2026 (National
Bureau of Economic Research).
Choi, Byungjin, Tae Joon Jun, Joung Won Sung, Il Woo Park, Jeong-Moo
Lee, Soo Ick Cho, Hyung Jun Park, Ro Woon Lee, and Jungyo Suh, “Invisible text
injection and peer review by AI models,” JAMA Network
Open, 9 (2026), e2552099.
Elsevier, “Generative
AI policies for journals” (Feb. 19, 2026).
Hsu, Chao-Chun, and Chenhao Tan, “OpenAIReview:
Open-source AI-assisted academic paper reviewing,” 2026.
IsItCredible.com, “Is it
credible?” (Feb. 19, 2026).
Leung, Tiffany I., “LLMs in peer
review—how publishing policies must advance,” JAMA
Network Open, 9 (2026), e2552042.
QED Science, “QED science:
Critical thinking AI for research,” 2026.
Rajakumar, Hamrish Kumar, Kailash Abhishek Sankaran, Manasi Pillai
Ashok, and Srinivas Rachoori, “Peer review in the age of
artificial intelligence: A comparative study of human and AI-generated
review reports,” Postgraduate Medical Journal,
(2026), qgag005.
Refine, “FAQ -
refine” (Feb. 19, 2026).
Spitzer, Markus Wolfgang Hermann, “The emerging
submission crisis in behavioral science,” Trends in
Neuroscience and Education, 42 (2026), 100276.
Thomas, Llewellyn D. W., Angelo Kenneth G. Romasanta, and Laia Pujol
Priego, “Jagged
competencies: Measuring the reliability of generative AI in academic
research,” Journal of Business Research, 203 (2026),
115804.
Wang, Yuehan, Jinyan Huang, Lun Du, Yuxin Guo, Ying Liu, and Rong Wang,
“Evaluating
large language models as raters in large-scale writing assessments: A
psychometric framework for reliability and validity,”
Computers and Education: Artificial Intelligence, 9 (2025),
100481.
Zhang, Tianmai M, and Neil F Abernethy, “Reviewing scientific papers
for critical problems with reasoning LLMs: Baseline approaches and
automatic evaluation,” arXiv preprint
arXiv:2505.23824, (2025).