This page collects information about our author disambiguation system. It will be updated as new information becomes available.
Authors Overview
Our information about authors comes from MAG, Crossref, PubMed, ORCID, and publisher websites. We use an algorithm to disambiguate authors; it uses an author’s name, their publication record, their citation patterns, and (where available) their ORCID. For example, if J. Schmidt and John Jacob Jingleheimer Schmidt both write about 19th-century ketchup production, we’ll treat them as one author, but we won’t include the JJJ Schmidt who writes about weasel migration (even though his name is their name, too).
In late July 2023, we switched to a new, more accurate author disambiguation system, with a better machine-learning model to identify authors, a smarter strategy for assigning authors to new works, and much better integration with ORCID data when it is available. As part of that switch, we deprecated all of the old OpenAlex Author IDs and assigned new Author IDs to all authors. You can find the old Author IDs, along with their associated works, as a data dump here. The numeric component of every new Author ID is greater than 5000000000. The new Author IDs have been in use in the API since late July 2023 and in the data snapshots since August 2023.
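The ID rule above can be checked with a few lines of Python. This is only a minimal sketch: the function names and the regular expression are illustrative assumptions, not part of any OpenAlex library. It assumes an Author ID is written either in short form (`A` followed by digits) or as a URL ending in that short form.

```python
import re

# Threshold taken from the text above: new Author IDs have a numeric
# component greater than 5000000000.
NEW_ID_THRESHOLD = 5_000_000_000

# Assumed ID shape: "A" followed by digits, either alone or at the end of a URL.
_AUTHOR_ID_RE = re.compile(r"(?:^|/)A(\d+)$")


def author_id_number(openalex_id: str) -> int:
    """Return the numeric component of an Author ID (hypothetical helper)."""
    match = _AUTHOR_ID_RE.search(openalex_id.strip())
    if match is None:
        raise ValueError(f"not an Author ID: {openalex_id!r}")
    return int(match.group(1))


def is_new_author_id(openalex_id: str) -> bool:
    """True if the ID falls in the range used since the late July 2023 switch."""
    return author_id_number(openalex_id) > NEW_ID_THRESHOLD
```

A check like this may be useful when reconciling older exports against current data, since an ID at or below the threshold would need to be looked up in the data dump of old Author IDs.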
If you would like to learn more about the author object itself (such as making API requests or filtering authors in the API) you can go to our author entity page.
Author Profile Curation
If you see an error in your author profile, you can submit a request using our author curation form. Please see this help article, which explains how to submit a request and what information to include.
Code, Data, and Methods
Our methods, code, and models are all, of course, fully open! If you would like to view the Python code, learn about the methods used, or download the training data, you can go to the name-disambiguation GitHub repo. For the latest live disambiguation code, you can go to the live-disambiguation repo.
Null Authors (A9999999999) and Deleted Authors (A5317838346)
You may come across these two OpenAlex Author IDs, particularly if you are using the data snapshot. Please see this article for more information.
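If you are processing the snapshot yourself, you may want to set these two IDs aside before counting or grouping by author. The sketch below is one hedged way to do that. The helper names are hypothetical, and it assumes you have already extracted the Author IDs as strings in either short or URL form.

```python
# The two placeholder IDs named in the heading above.
PLACEHOLDER_AUTHOR_IDS = {
    "A9999999999": "null author",
    "A5317838346": "deleted author",
}


def short_id(openalex_id: str) -> str:
    """Reduce an ID or ID URL to its short form (hypothetical helper)."""
    return openalex_id.strip().rstrip("/").rsplit("/", 1)[-1]


def drop_placeholder_authors(author_ids):
    """Return the IDs that are neither the null nor the deleted author."""
    return [a for a in author_ids if short_id(a) not in PLACEHOLDER_AUTHOR_IDS]
```

Whether to drop these rows or keep them under a separate label depends on the analysis; the filter simply makes the choice explicit.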