Match Authors in Scopus automatically with sosia
sosia (Italian for doppelgänger) finds researchers who are similar to a given researcher. Use the matched researcher as a control in Diff-in-Diff analyses. sosia is developed and described by econometricians for scientists of science.
sosia does not pre-compute annual characteristics to find controls as in Coarsened Exact Matching. Instead, sosia searches the entire Scopus database via pybliometrics. Once pybliometrics is configured (Scopus access key, etc.), let sosia find a match for you.
Example
Install sosia from PyPI using the console or command line interpreter:
$ pip install sosia
In Python, set up sosia (and, if necessary, pybliometrics), then search for similar scientists using their Scopus Author Profile IDs.
>>> import sosia
>>>
>>> # You need the Scopus ID and the year, optionally set a database path
>>> stefano = sosia.Original(55208373700, 2018)
>>> # Sources similar to those stefano publishes in
>>> stefano.define_search_sources()
>>> # Authors publishing in search sources every 2 years
>>> stefano.identify_candidates_from_sources(first_year_margin=1, chunk_size=2)
>>> # Find candidates whose characteristics fall within margins
>>> stefano.filter_candidates(same_discipline=True, first_year_margin=1,
...                           pub_margin=0.2, cits_margin=0.2,
...                           coauth_margin=0.15)
>>> print(stefano.matches)
[55567912500]
>>> # Optional step to provide additional information
>>> stefano.inform_matches()
>>> print(stefano.matches[0])
Match(ID=55567912500, name='Eling, Katrin', first_name='Katrin',
surname='Eling', first_year=2013, last_year=2018, num_coauthors=9,
num_publications=8, num_citations=56, subjects=['BUSI', 'COMP', 'ENGI'],
affiliation_country='Netherlands', affiliation_id='60032882',
affiliation_name='Technische Universiteit Eindhoven',
affiliation_type='univ', language='eng', num_cited_refs=0)
Full reference: sosia.Original, the representation of a scientist for whom to find a control scientist.
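The margins passed to filter_candidates in the example above can be illustrated with a minimal, standalone sketch. It does not call sosia and is not sosia's implementation. It assumes that a float margin is read as a share of the original scientist's value (rounded up) and an integer margin as an absolute count. The helper within_margin and all numbers are hypothetical.

```python
import math


def within_margin(value, reference, margin):
    """Check whether value lies within margin around reference.

    Assumption (not taken from sosia's code): a float margin is a share
    of the reference value, rounded up; an int margin is an absolute count.
    """
    if isinstance(margin, float):
        width = math.ceil(reference * margin)
    else:
        width = margin
    return reference - width <= value <= reference + width


# Hypothetical characteristics of the original scientist and two candidates
original = {"publications": 8, "citations": 50}
candidates = {
    "A": {"publications": 9, "citations": 56},
    "B": {"publications": 12, "citations": 52},
}

# Keep candidates whose publication and citation counts both fall
# within 20% of the original scientist's counts
matches = [
    name
    for name, traits in candidates.items()
    if within_margin(traits["publications"], original["publications"], 0.2)
    and within_margin(traits["citations"], original["citations"], 0.2)
]
print(matches)
```

With these made-up numbers, candidate B is dropped because 12 publications lies outside the band of 6 to 10 around the original's 8 publications. Tighter margins give closer matches but fewer of them.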
Citation
If sosia helped you get data for your research, please cite our corresponding paper:
Rose, Michael E. and Stefano H. Baruffaldi: “Finding Doppelgängers in Scopus: How to Build Scientists Control Groups Using Sosia”, Max Planck Institute for Innovation & Competition Research Paper No. 20-20.
Citing the paper helps the development of sosia, because it justifies funneling resources into it. It also signals that you created your control group in a transparent and replicable way.