The image to the left shows the HistoryNavigator desktop application, which displays Wikidata historical information textually and geographically using the Bing Maps API.
The program permits the user to select:
a topic (defining a set of historical entities (people, groups, etc.), events, or movements),
a timeframe, and
a region for analysis and geographical display.
The HistoryNavigator program is currently a desktop application written in C# (.NET Windows Presentation Foundation), and I hope to migrate it to a web solution. That said, the desktop environment provides a lot of benefits (I am old-fashioned).
HistoryNavigator Screenshots: The screenshots in the slide show below show the first set of extracts from Wikidata, capturing all Philosophers and Scientists, based on the Occupation property.
How this was done: I downloaded the Wikidata dump of 4/27/15, which was around 4 gigabytes compressed and expanded into a giant 44 gigabyte JSON file of 17 million or so items, each expressed as a JSON element, one per line for ease of processing. A Wikidata item is identified by a Q-number and carries a set of statements about it. For instance, Stephen Dodson Ramseur, a Civil War general, is Q2339228, accessible at www.wikidata.org/wiki/Q2339228 or, in JSON format, at https://www.wikidata.org/wiki/Special:EntityData/Q2339228.json. Statements are Claims, each consisting of a property (for instance, P106 is "has the occupation of") and a Q-numbered object to which the property applies. A custom program, ReadWikiData (borrowing code from a very nice C# tool, https://github.com/ValterVB/VBot), processes this dump file to extract data based on the desired criteria. Additionally, Wikidata may be accessed online, mainly via a URL wdq.wmflabs.org/api?q=... followed by a query.
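As an illustration of the line-by-line processing described above, here is a minimal Python sketch. It is not the ReadWikiData code (which is C#); the function names are hypothetical, the sample item ids are placeholders, and it assumes the dump layout of one JSON entity per line inside a surrounding array, with item values carrying a "numeric-id" field.

```python
import json

def parse_dump_line(line):
    """Parse one dump line; return the entity dict, or None for the array brackets."""
    text = line.strip().rstrip(",")  # entity lines end with a comma inside the array
    if text in ("", "[", "]"):
        return None
    return json.loads(text)

def claim_item_ids(entity, prop):
    """Return the Q-numbers that the entity claims for the given property."""
    ids = []
    for statement in entity.get("claims", {}).get(prop, []):
        snak = statement.get("mainsnak", {})
        if snak.get("snaktype") != "value":
            continue  # "somevalue" / "novalue" snaks carry no datavalue
        value = snak["datavalue"]["value"]
        if isinstance(value, dict) and "numeric-id" in value:
            ids.append("Q%d" % value["numeric-id"])
    return ids

def entities_with_claim(lines, prop, wanted):
    """Yield the id of every entity whose claims for prop hit the wanted set."""
    for line in lines:
        entity = parse_dump_line(line)
        if entity is not None and wanted & set(claim_item_ids(entity, prop)):
            yield entity["id"]

def occupation_statement(numeric_id):
    """Build a P106 statement in the dump's shape (for the sample data only)."""
    value = {"entity-type": "item", "numeric-id": numeric_id}
    datavalue = {"value": value, "type": "wikibase-entityid"}
    return {"mainsnak": {"snaktype": "value", "property": "P106", "datavalue": datavalue}}

# Placeholder ids, not real Wikidata items.
sample_lines = [
    "[",
    json.dumps({"id": "Q-sample-1", "claims": {"P106": [occupation_statement(901)]}}) + ",",
    json.dumps({"id": "Q-sample-2", "claims": {}}),
    "]",
]
matches = list(entities_with_claim(sample_lines, "P106", {"Q901", "Q4964182"}))
print(matches)
```

For the real file, the same generator could be fed from a file object opened on the dump (for example with gzip.open in text mode), so the 44 gigabyte file never has to be held in memory at once.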
In the case of the Philosophers and Scientists in the slide show below, I performed an online selection of all items with Occupation (P106) Philosopher (Q4964182) or Scientist (Q901). This returns all matching entities (people are the only ones assumed to have occupations). The queries use the following URLs, in which the letters P and Q are omitted:
Online Query of Wikidata: wdq.wmflabs.org/api?q=CLAIM[106:901] and wdq.wmflabs.org/api?q=CLAIM[106:4964182]
This yielded 9194 Philosophers and 1467 Scientists, and it turned out that 58 were both, as can be confirmed by a combined query:
http://wdq.wmflabs.org/api?q=CLAIM[106:901]%20AND%20CLAIM[106:4964182]
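The query URLs and the counts above can be checked with a small Python sketch. The helper names are hypothetical, and the code only builds the URL strings (it makes no network call); the arithmetic is the usual inclusion-exclusion count for two overlapping sets.

```python
from urllib.parse import quote

WDQ_API = "http://wdq.wmflabs.org/api?q="

def claim(prop, item):
    """Format one CLAIM term; the letters P and Q are omitted in this syntax."""
    return "CLAIM[%d:%d]" % (prop, item)

def wdq_url(*terms):
    """Join the terms with AND and percent-encode the spaces only."""
    return WDQ_API + quote(" AND ".join(terms), safe="[]:")

def union_size(count_a, count_b, count_both):
    """Items in either set: add both counts, then subtract the double-counted overlap."""
    return count_a + count_b - count_both

combined = wdq_url(claim(106, 901), claim(106, 4964182))
print(combined)   # http://wdq.wmflabs.org/api?q=CLAIM[106:901]%20AND%20CLAIM[106:4964182]

total = union_size(9194, 1467, 58)
print(total)      # 10603
```

The same helper would express a restriction to humans by adding a third term, claim(31, 5), for "Instance Of" (P31) Human (Q5).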
I learned from this that Spider-Man is a scientist! However, qualifying the query with the requirement that all selected items are "Instances Of" (P31) Human (Q5) will ensure that non-humans are excluded. The search above yielded 10,603 people (9,194 + 1,467 - 58). I used ReadWikiData to process the whole JSON dump file and identify all who had both a birth date and a birth location, which ended up at just under 5,000. The birth location is a Q-numbered identifier, so a second pass against the dump file looked up these locations and retrieved their geographic coordinates. The display below uses blue dots for philosophers, red for scientists (shown on top of the blue), and yellow for both (shown on top). Displaying the data on Bing Maps allows the Bing zoom commands to select various parts of the earth. While the depiction below shows all philosophers and scientists found, it omits any without a birth place and date, or whose birth place lacks geographic coordinates (of which there were only 58, such as West Prussia, Q161947). Another issue is that, if a biochemist (Q2919046) is not also classified as a scientist (Q901), the current program will not pick up the biochemist. However, I am hoping to build a set of rules to ascertain whether all sub-classes of a class are also associated with the class (or to enhance the search to include all sub-classes of a class).
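The two passes and the dot coloring described above can be sketched in Python as follows. This is an illustration under assumptions, not the ReadWikiData code: the records are simplified dictionaries rather than raw dump entities, the person ids are placeholders, and the field and function names are hypothetical.

```python
PHILOSOPHER = "Q4964182"
SCIENTIST = "Q901"
DRAW_ORDER = ["blue", "red", "yellow"]  # later colors are drawn on top

def dot_color(occupations):
    """Blue for philosophers, red for scientists, yellow for both."""
    is_philosopher = PHILOSOPHER in occupations
    is_scientist = SCIENTIST in occupations
    if is_philosopher and is_scientist:
        return "yellow"
    if is_scientist:
        return "red"
    if is_philosopher:
        return "blue"
    return None

def first_pass(people):
    """Keep people with both birth date and birth place; collect the place ids."""
    kept = [p for p in people
            if p.get("birth_date") is not None and p.get("birth_place") is not None]
    return kept, {p["birth_place"] for p in kept}

def second_pass(entities, wanted_places):
    """Look up coordinates for the wanted places that actually have them."""
    return {e["id"]: e["coord"] for e in entities
            if e["id"] in wanted_places and e.get("coord") is not None}

def build_dots(kept, coords):
    """Return (color, coord, person id) dots in draw order, plus unplaced person ids."""
    dots, unplaced = [], []
    for person in kept:
        color = dot_color(person["occupations"])
        if color is None:
            continue
        coord = coords.get(person["birth_place"])
        if coord is None:
            unplaced.append(person["id"])  # birth place has no coordinates
        else:
            dots.append((color, coord, person["id"]))
    dots.sort(key=lambda dot: DRAW_ORDER.index(dot[0]))
    return dots, unplaced

# Placeholder people; Q90 is Paris (approximate coordinates), Q161947 is West Prussia.
people = [
    {"id": "person-A", "occupations": {PHILOSOPHER, SCIENTIST}, "birth_date": "1623", "birth_place": "Q90"},
    {"id": "person-B", "occupations": {PHILOSOPHER}, "birth_date": "1712", "birth_place": "Q90"},
    {"id": "person-C", "occupations": {SCIENTIST}, "birth_date": "1800", "birth_place": "Q161947"},
    {"id": "person-D", "occupations": {SCIENTIST}, "birth_date": None, "birth_place": "Q90"},
]
places = [{"id": "Q90", "coord": (48.86, 2.35)}, {"id": "Q161947", "coord": None}]

kept, wanted = first_pass(people)
coords = second_pass(places, wanted)
dots, unplaced = build_dots(kept, coords)
print(dots)
print(unplaced)
```

Sorting by draw order before plotting is one simple way to get the layering described in the text, with yellow dots ending up on top.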
The Vision: The vision behind the structured history concept is to be able to assess questions of historical veracity formally or programmatically. It is difficult to prove that a historical narrative is accurate and true, but it may be possible by these means to identify and disprove false narratives. The requirements for this are a structured body of knowledge, which Wikidata provides (in capability but not yet in content), one or more robust ontological frameworks in which inferences and deductions can be made, and tools for accessing and analyzing the data. Realizing this concept will also necessitate, I believe, a better way to deal with contentious issues. For instance, where various schools of thought are in contention, there should be a method to identify which school of thought is asserting a given claim. A current example is the renewed debate in Anthropology between the Recent Out of Africa and the Multiregional camps. The other aim is to be able to capture historical data at a high level of granularity: for instance, to track military campaigns down to the regiment level, and to have detailed information on George Washington for every day of his life.
The HistoryNavigator Slideshow: Birth location and "occupation" are shown by nine arbitrary time periods in the slides below, first showing births within a period and then showing cumulative births through the end of the period. Where a location has multiple births, this fact is not indicated on the map. In the birth location popularity contest, Paris takes the lead at 136 (perhaps it is easier to be deemed a philosopher in Paris?), with Berlin, Vienna, New York City and Moscow following, with 105, 78, 59 and 58 births, respectively. The sole red dot in Africa in the earliest period is Imhotep (2800 BCE, Q131171), whose birthplace is shown as Africa. Should this be changed to Egypt?
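The per-period and cumulative views, and the birth location ranking, can be illustrated with a short Python sketch. The period boundaries and the birth records here are hypothetical (the slides use nine periods whose boundaries are not given in this text), and the names are illustrative only.

```python
from bisect import bisect_right
from collections import Counter
from itertools import accumulate

# Hypothetical period start years (negative years are BCE); not the slides' periods.
PERIOD_STARTS = [-3000, 0, 1000, 1500, 1700, 1900]

def period_index(year):
    """Index of the period containing the year, or None if before the first period."""
    index = bisect_right(PERIOD_STARTS, year) - 1
    return index if index >= 0 else None

def births_per_period(births):
    """Count births in each period; births are (year, place) pairs."""
    counts = [0] * len(PERIOD_STARTS)
    for year, _place in births:
        index = period_index(year)
        if index is not None:
            counts[index] += 1
    return counts

# Made-up sample records for illustration.
births = [(-2800, "place-1"), (1596, "place-2"), (1712, "place-3"),
          (1724, "place-2"), (1905, "place-2")]

per_period = births_per_period(births)
cumulative = list(accumulate(per_period))           # births through the end of each period
top_places = Counter(place for _year, place in births).most_common(2)

print(per_period)   # [1, 0, 0, 1, 2, 1]
print(cumulative)   # [1, 1, 1, 2, 4, 5]
print(top_places)   # [('place-2', 3), ('place-1', 1)]
```

A per-location count like top_places could also drive the display, for example by scaling a dot where several births share one location.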