API

Access the API online here.

Sample URLs:

  • Fetch data on a specific person: James Joyce
  • Fetch data on a specific city: Dublin, IRL
  • Fetch data on a specific country: Ireland
  • Fetch data on a specific occupation: Writer
  • Fetch data on a specific era: Newspaper Era

People:

Identifiers:

  • en_curid : English Wikipedia page ID.
  • name : English Wikipedia page title.
  • wd_id : Wikidata entity ID.

Birth and death:

  • birth_place : Birth place.
  • lat,lon : Coordinates of birth place.
  • birth_year : Year of birth (or best estimate).
  • birth_town : Closest settlement with more than 15,000 inhabitants to birth_place (identified by its geonameid from geonames.org).
  • birth_civ : Ancient civilization of birth according to the birth_place and birth_year.
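
The birth_town lookup described above can be sketched as a nearest-neighbour search over settlements. This is only an illustration: the source does not say which distance measure is used, so great-circle (haversine) distance is an assumption, and the function names, the town records, and their ids, coordinates, and populations are hypothetical toy values, not geonames.org data.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def closest_town(lat, lon, towns, min_population=15000):
    """Return the nearest town above the population threshold, or None."""
    candidates = [t for t in towns if t["population"] > min_population]
    if not candidates:
        return None
    return min(candidates, key=lambda t: haversine_km(lat, lon, t["lat"], t["lon"]))


# Hypothetical records (illustrative values only).
towns = [
    {"geonameid": 1, "name": "Dublin", "lat": 53.35, "lon": -6.26, "population": 1000000},
    {"geonameid": 2, "name": "Cork", "lat": 51.90, "lon": -8.47, "population": 200000},
    {"geonameid": 3, "name": "Small village", "lat": 53.31, "lon": -6.28, "population": 900},
]

# The village is nearer but falls below the threshold, so Dublin is chosen.
birth_town = closest_town(53.31, -6.27, towns)
print(birth_town["geonameid"], birth_town["name"])
```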

Occupation:

  • occupation : Occupation that best describes the person’s major field of contribution.
  • prob_ratio : Score of the quality of the estimation.

Memorability indicators (static):

  • L : Number of Wikipedia language editions that have a biography of this person.
  • pv : Total English Wikipedia pageviews from XX 2007 until XX 2016.
  • HPI : Historical Popularity Index following Yu et al., using pageview data from July 2015 until June 2016.

Memorability indicators (dynamic):

  • L : Number of Wikipedia language editions as a function of time.
  • l : Number of Wikipedia language editions divided by the total number of Wikipedia editions expressed in Pantheon.
  • l* : Number of Wikipedia language editions normalized by the mean number of editions of all people in Pantheon.
  • C : Article coverage. For each language in each month, the language’s coverage is the number of people with a page in that language divided by the number of people with a page in any language. The coverage of a person is the total coverage of the language editions in which that person appears.
  • c : Article coverage normalized by the total coverage of all language editions expressed in the dataset.
  • S : Score. The score of each language edition in each month is one over its coverage. The score of an article is the total score of the language editions in which it appears. This measure is meant to assign more value to rare language editions.
  • s : Score normalized by the total score of all language editions expressed in the dataset.
  • R : Number of revisions of the English Wikipedia page as a function of time.
  • r : Number of revisions of the English Wikipedia normalized by all the revisions in Pantheon.
  • r* : Number of revisions of the English Wikipedia normalized by the average number of revisions.
  • pv : Monthly pageviews for the English Wikipedia page.
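
The definitions of L, C, and S above can be sketched for a single month as follows. This is a minimal illustration rather than the Pantheon implementation: the input mapping `pages` (language edition code to the set of people with a biography in that edition), the function names, and the toy data are all hypothetical.

```python
def language_coverage(pages):
    """Coverage per language: people with a page in that language
    divided by people with a page in any language."""
    everyone = set().union(*pages.values())
    return {lang: len(people) / len(everyone) for lang, people in pages.items()}


def person_indicators(pages, person):
    """Return (L, C, S) for one person in one month."""
    cov = language_coverage(pages)
    langs = [lang for lang, people in pages.items() if person in people]
    L = len(langs)                             # number of language editions
    C = sum(cov[lang] for lang in langs)       # total coverage of those editions
    S = sum(1 / cov[lang] for lang in langs)   # rare editions contribute more
    return L, C, S


# Hypothetical toy data: three editions, four people.
pages = {
    "en": {"joyce", "yeats", "wilde", "beckett"},
    "fr": {"joyce", "beckett"},
    "ga": {"joyce"},
}

print(person_indicators(pages, "joyce"))  # present in all three editions
print(person_indicators(pages, "yeats"))  # present only in the largest edition
```

In this toy month the rarest edition has coverage 0.25, so it adds only 0.25 to C but 4 to S, which is the sense in which the score favours rare language editions.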

Reference indicators (dynamic):

  • nw : Total number of Wikipedia editions expressed in the dataset.
  • L_av : Average number of language editions of biographies in the dataset.
  • cov : Total coverage of all the language editions expressed in the dataset.
  • score : Total score of all language editions expressed in the dataset.
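
One plausible reading of how these reference values relate to the normalized indicators l, l*, c, and s is a plain division of each raw indicator by its reference value for the same month. The sketch below assumes that reading; the function name and the numbers are hypothetical and not taken from the dataset.

```python
def normalize_indicators(L, C, S, nw, L_av, cov, score):
    """Divide a person's raw indicators by the month's reference values."""
    return {
        "l": L / nw,       # share of all editions expressed in the dataset
        "l*": L / L_av,    # relative to the average biography
        "c": C / cov,      # share of total coverage
        "s": S / score,    # share of total score
    }


# Hypothetical month: 3 editions, a mean of 1.75 editions per biography,
# total coverage 1.75 and total score 7.0 across all editions.
reference = {"nw": 3, "L_av": 1.75, "cov": 1.75, "score": 7.0}

# A person present in 2 editions with raw coverage 1.5 and raw score 3.0.
normalized = normalize_indicators(L=2, C=1.5, S=3.0, **reference)
print(normalized)
```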

Town:

  • birth_town : geonameid identifier.
  • population : Current population estimate according to geonames.org.
  • lat,lon : Coordinates according to geonames.org.
  • city : Big city the town belongs to, following the DBSCAN algorithm to group towns into clusters.
  • country : Country the town belongs to (country code), according to modern borders.

Country:

  • country : Country code identifier.
  • founded : Country foundation year according to Wikipedia.
  • region : UN region the country belongs to.
  • continent : Continent the country belongs to.

Groupings:

  • least_developed : Countries that are the least developed according to the UN.
  • language : Countries grouped according to shared language.
  • colonial : Countries grouped according to shared colonial past.
  • industry : Second level of aggregation for occupations.
  • domain : Third level of aggregation for occupations.
  • group : Alternative aggregation for occupations (used to track changes in the composition of history due to changes in media).