sosia: Automatic author matching in Scopus on-line

sosia is a Python library to match authors in the Scopus database, using the pybliometrics package.


Install sosia from PyPI:

$ pip install sosia

or directly from the GitHub repository (may be unstable):

$ pip install git+

To access the Scopus database using pybliometrics, you need to obtain credentials. Install and configure pybliometrics before using sosia.


sosia performs a series of queries against the Scopus database using the pybliometrics package. After configuring your local pybliometrics installation (providing access credentials and, optionally, setting cache directories), you can use sosia:

>>> import sosia
>>> sosia.create_fields_sources_list()  # Necessary only once
>>> sosia.create_cache()  # Necessary only once
>>> stefano = sosia.Original(55208373700, 2017)  # Scopus ID and year
>>> stefano.define_search_sources()  # Sources similar to the scientist's
>>> stefano.define_search_group()  # Authors publishing in similar sources
>>> matches = stefano.find_matches()  # List of namedtuples
>>> matches[0]
Match(ID='53164702100', name='Sapprasert, Koson', first_year=2011,
num_coauthors=7, num_publications=6, country='Norway', language='eng',
reference_sim=0.0212, abstract_sim=0.1695)
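
Because the matches are namedtuples, they can be ranked and converted with plain Python. The sketch below does not call sosia or Scopus: it rebuilds a `Match` namedtuple with the fields shown in the output above as a stand-in, and the second record is a made-up placeholder added only so there is something to sort.

```python
from collections import namedtuple

# Stand-in for the namedtuples returned by find_matches(); the field
# names are copied from the example output above (an assumption about
# the general shape, not sosia's own definition).
Match = namedtuple(
    "Match",
    "ID name first_year num_coauthors num_publications "
    "country language reference_sim abstract_sim",
)

matches = [
    Match("53164702100", "Sapprasert, Koson", 2011, 7, 6,
          "Norway", "eng", 0.0212, 0.1695),
    # Hypothetical record, invented purely for illustration.
    Match("00000000000", "Placeholder, Author", 2012, 5, 7,
          "Sweden", "eng", 0.0100, 0.2000),
]

# Rank matches by abstract similarity, most similar first.
ranked = sorted(matches, key=lambda m: m.abstract_sim, reverse=True)

# Convert each namedtuple to a dict, e.g. to write a CSV or build a table.
rows = [m._asdict() for m in ranked]

for row in rows:
    print(row["ID"], row["name"], row["abstract_sim"])
```

Sorting on `reference_sim` instead, or filtering on `country`, works the same way, since every field is available as an attribute.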

Full reference:

Original(scientist, year[, year_margin, …]): Class to represent a scientist for whom we want to find a control group.
