About the data – OpenAlex

OpenAlex is more than just a catalog of research publications. We do the work of disambiguating and connecting scholarly works, authors, institutions, sources, and other entities. We then offer the data, and the analytics built on top of it, through three channels, depending on your needs:

OpenAlex Web — Our friendly web user interface
OpenAlex API — A fast, modern REST API to get the data programmatically
Data Snapshot — A periodic snapshot of the data, available to download in its entirety, for free
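
As a rough illustration of the API channel, here is a minimal Python sketch that builds a search request against the works endpoint. It assumes the API root is https://api.openalex.org and that the `search` and `per-page` query parameters and the `results` and `title` response fields behave as described in the API documentation. The helper names `build_works_url` and `fetch_titles` are made up for this example and are not part of any OpenAlex library.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed API root; check the API documentation before relying on it.
API_ROOT = "https://api.openalex.org"


def build_works_url(search, per_page=5):
    """Build a URL that asks the works endpoint for a text search."""
    query = urlencode({"search": search, "per-page": per_page})
    return f"{API_ROOT}/works?{query}"


def fetch_titles(url):
    """Fetch one page of results and return the title of each work on it."""
    with urlopen(url) as response:
        payload = json.load(response)
    # Assumes the response is a JSON object with a "results" list.
    return [work.get("title") for work in payload.get("results", [])]


if __name__ == "__main__":
    url = build_works_url("open access")
    print(url)
    # Uncomment to make a live request (requires network access):
    # print(fetch_titles(url))
```

Building the URL separately from fetching it keeps the example easy to inspect offline. For bulk analysis, the Data Snapshot is likely a better fit than paging through the API.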

Data overview

At the heart of OpenAlex is our dataset—a catalog of works. A work is any sort of scholarly output. A research article is one kind of work, but there are others such as datasets, books, and dissertations. We keep track of these works—their titles (and abstracts and full text in many cases), when they were created, etc. But that's not all we do. We also keep track of the connections between these works, finding associations through things like journals, authors, institutional affiliations, citations, topics, and funders. There are hundreds of millions of works out there, and tens of thousands more being created every day, so it's important that we have these relationships to help us make sense of research at a large scale.
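
To make the idea of connected works more concrete, here is a small Python sketch of a work record that points to other entities by ID, plus two helpers that follow those links. The `Work` class, its field names, and the IDs used are all hypothetical and chosen for illustration; they are not the OpenAlex schema.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Work:
    """A hypothetical, simplified work record with links to other entities."""
    id: str
    title: str
    author_ids: List[str] = field(default_factory=list)
    institution_ids: List[str] = field(default_factory=list)
    source_id: Optional[str] = None  # e.g. the journal the work appeared in
    cited_work_ids: List[str] = field(default_factory=list)


def citation_counts(works: List[Work]) -> Counter:
    """Count how many times each work ID is cited by the given works."""
    counts: Counter = Counter()
    for work in works:
        counts.update(work.cited_work_ids)
    return counts


def works_by_author(works: List[Work]) -> Dict[str, List[str]]:
    """Group work IDs under each author ID that appears on them."""
    index: Dict[str, List[str]] = {}
    for work in works:
        for author_id in work.author_ids:
            index.setdefault(author_id, []).append(work.id)
    return index


if __name__ == "__main__":
    first = Work("W1", "An earlier article", author_ids=["A1"])
    second = Work("W2", "A later dataset", author_ids=["A1", "A2"],
                  cited_work_ids=["W1"])
    print(citation_counts([first, second]))
    print(works_by_author([first, second]))
```

Storing links as IDs rather than nested objects is one common way to let the same author, institution, or source be shared across many works.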

Our data sources

OpenAlex aggregates and standardizes data from a whole bunch of other great projects, like a river fed by many tributaries. Our two most important data sources are MAG (Microsoft Academic Graph) and Crossref. Other key sources include:

ORCID
ROR
DOAJ
Unpaywall
PubMed
PubMed Central
The ISSN International Centre
Internet Archive
Web crawls
Subject-area and institutional repositories from arXiv to Zenodo and many in between

Learn more about how OpenAlex gets its works: Where do works in OpenAlex come from?