SyntaxGym is a unified platform for targeted syntactic evaluation of language models. The Gym supports all steps of the evaluation process, from designing test suites to visualizing final results. Our goal is to make psycholinguistic assessment of language models more standardized, reproducible, and accessible to a wide variety of researchers.

Test suites

Create new psycholinguistic test suites, or browse existing ones in our database.

Language models

Evaluate a set of neural language models that vary in architecture and size.


Visualizations

Visualize results across models and test suites through interactive charts.

Not sure where to start? Read our FAQ or take a look at the documentation.