Automated Grading for Efficiently Evaluating the Dual-Use Biological Capabilities of Large Language Models

Bria Persaud, Ying-Chiang Jeffrey Lee, Jordan Despanie, Helin Hernandez, Henry Alexander Bradley, Sarah L. Gebauer, Greg McKelvey, Jr.

Published Feb 28, 2025

The authors of this working paper developed a proof-of-concept automated grader and used it to assess how well large language models answer knowledge-based questions and generate protocols for common laboratory techniques that could be used to create proxies for biological threats.

Document Details

Citation

RAND Style Manual

Persaud, Bria, Ying-Chiang Jeffrey Lee, Jordan Despanie, Helin Hernandez, Henry Alexander Bradley, Sarah L. Gebauer, and Greg McKelvey, Jr., Automated Grading for Efficiently Evaluating the Dual-Use Biological Capabilities of Large Language Models, RAND Corporation, WR-A3124-1, 2025. As of April 8, 2025: https://www.rand.org/pubs/working_papers/WRA3124-1.html

Chicago Manual of Style

Persaud, Bria, Ying-Chiang Jeffrey Lee, Jordan Despanie, Helin Hernandez, Henry Alexander Bradley, Sarah L. Gebauer, and Greg McKelvey, Jr., Automated Grading for Efficiently Evaluating the Dual-Use Biological Capabilities of Large Language Models. Santa Monica, CA: RAND Corporation, 2025. https://www.rand.org/pubs/working_papers/WRA3124-1.html.

Research conducted by

This work was independently initiated and conducted within the Technology and Security Policy Center of RAND Global and Emerging Risks using income from operations and gifts from philanthropic supporters. A complete list of donors and funders is available at www.rand.org/TASP.

This publication is part of the RAND working paper series. RAND working papers are intended to share researchers' latest findings and to solicit informal peer review. They have been approved for circulation by RAND but may not have been formally edited or peer reviewed.

This document and trademark(s) contained herein are protected by law. This representation of RAND intellectual property is provided for noncommercial use only. Unauthorized posting of this publication online is prohibited; linking directly to this product page is encouraged. Permission is required from RAND to reproduce, or reuse in another form, any of its research documents for commercial purposes. For information on reprint and reuse permissions, please visit www.rand.org/pubs/permissions.

RAND is a nonprofit institution that helps improve policy and decisionmaking through research and analysis. RAND's publications do not necessarily reflect the opinions of its research clients and sponsors.