Databases and surroundings: Wikidata, SPARQL & Scarlett Johansson
Hi Guys,
Wikidata is therefore responsible for cataloging all this information through a series of properties and classifiers that users can define.
Wikidata
Sometimes, for example, a new property is proposed.
In this case the proposal is discussed and then put to a vote; if the majority votes in favor, the property is created.
Just as the element "planet Earth" has the properties "population", "highest point" and so on, the element Scarlett Johansson, being a "human being", has others, for example:
- "Instance of" (P31) with value "human" (Q5)
- "Image" (P18), which contains an image as its value
- "Sex" (P21) with value "female" (Q6581072)
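To make the property/value idea concrete, here is a minimal Python sketch that models two of the statements above as a plain dictionary. The names `statements` and `describe` are made up for illustration; this is not how Wikidata stores its data internally, only a way to picture it.

```python
# Each statement pairs a property ID (P...) with a value, which is often
# another item ID (Q...). The labels are kept next to the IDs for readability.
statements = {
    "P31": ("instance of", "Q5", "human"),
    "P21": ("sex or gender", "Q6581072", "female"),
}


def describe(statements):
    """Return one readable line per statement (hypothetical helper)."""
    return [
        f'"{plabel}" ({pid}) with value "{vlabel}" ({qid})'
        for pid, (plabel, qid, vlabel) in statements.items()
    ]


for line in describe(statements):
    print(line)
```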
Now that we have all this data, how do we query it?
How to query Wikidata: SPARQL
# Living American actresses
SELECT ?item ?itemLabel ?itemDescription ?height (SAMPLE(?img) AS ?image) (SAMPLE(?dob) AS ?dob) ?sl
WHERE {
  ?item wdt:P106 wd:Q33999 ;      # occupation: actor
        wdt:P27 wd:Q30 ;          # country of citizenship: United States of America
        wdt:P21 wd:Q6581072 .     # sex or gender: female
  MINUS { ?item wdt:P570 [] }     # exclude items that have a date of death
  OPTIONAL { ?item wdt:P2048 ?height }
  OPTIONAL { ?item wdt:P18 ?img }
  OPTIONAL { ?item wdt:P569 ?dob }
  OPTIONAL { ?item wikibase:sitelinks ?sl }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" }
}
GROUP BY ?item ?itemLabel ?itemDescription ?height ?sl
ORDER BY DESC(?sl)
If you insert this query and then press the blue arrow, you will obtain the result.
This is not a SPARQL course, so let’s just outline the query we just used.
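The blue arrow runs the query in the browser, but a query can also be sent from a script. The following is a rough sketch, assuming the public endpoint of the Wikidata Query Service at https://query.wikidata.org/sparql and its JSON output. The function names `build_request` and `run_query` and the User-Agent text are placeholders of mine, not part of any official client.

```python
import json
import urllib.parse
import urllib.request

# Public SPARQL endpoint of the Wikidata Query Service.
ENDPOINT = "https://query.wikidata.org/sparql"


def build_request(query: str) -> urllib.request.Request:
    """Build a GET request that asks the endpoint for JSON results."""
    params = urllib.parse.urlencode({"query": query, "format": "json"})
    return urllib.request.Request(
        f"{ENDPOINT}?{params}",
        headers={
            # Placeholder text: replace it with something that identifies you.
            "User-Agent": "sparql-blog-example/0.1 (placeholder contact)",
            "Accept": "application/sparql-results+json",
        },
    )


def run_query(query: str) -> list:
    """Send the query over the network and return the list of result rows."""
    with urllib.request.urlopen(build_request(query), timeout=60) as response:
        return json.load(response)["results"]["bindings"]
```

Calling `run_query` with the text of the query should return one dictionary per result row, with keys such as `item` and `itemLabel`. Large queries can take a while or time out, so adding a `LIMIT` while experimenting is a sensible precaution.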
If you know the SQL language you will recognize some common elements:
The SELECT, WHERE, GROUP BY and ORDER BY keywords also exist in SPARQL.
After that, the logic becomes a little different.
"?item" is a variable that can stand for any element of Wikidata, so you need to apply some filters to extract the data you want.
You can do this by adding conditions, as shown below.
The filter in our query applies three conditions.
The first condition is:
?item wdt:P106 wd:Q33999;
wdt:P27 wd:Q30 ;
wdt:P21 wd:Q6581072 .
In this case we are asking to extract only those items where the property P106 is equal to the item Q33999.
Property P106 means "occupation of a person" and Q33999 is the entry "actor".
Similarly, the second condition means that "country of citizenship" (P27) must be "United States of America" (Q30):
?item wdt:P106 wd:Q33999;
wdt:P27 wd:Q30 ;
wdt:P21 wd:Q6581072 .
Finally, the third condition means that "sex or gender" (P21) must be "female" (Q6581072):
?item wdt:P106 wd:Q33999;
wdt:P27 wd:Q30 ;
wdt:P21 wd:Q6581072 .
That’s really nice, isn’t it?
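The way the three conditions hang together can be sketched in a few lines of Python. The helper `triple_pattern` is hypothetical and only does string formatting: it shows that every condition is a property/value pair on the same subject, that ";" repeats the subject for the next pair, and that the final "." closes the pattern.

```python
def triple_pattern(subject: str, conditions: dict) -> str:
    """Join property/value pairs into one SPARQL pattern on the same subject."""
    pairs = [f"wdt:{prop} wd:{value}" for prop, value in conditions.items()]
    # ';' means "same subject, next property"; '.' ends the whole pattern.
    return f"{subject} " + " ;\n      ".join(pairs) + " ."


conditions = {
    "P106": "Q33999",   # occupation: actor
    "P27": "Q30",       # country of citizenship: United States of America
    "P21": "Q6581072",  # sex or gender: female
}

pattern = triple_pattern("?item", conditions)
print(pattern)
```

Swapping a value in `conditions`, for example a different country item, would give the pattern for a different filter without touching the rest of the query.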
Another basic thing to know is how to extract and display information.
For example, we have the list of actors and we want to see their date of birth or height.
The OPTIONAL keyword lets you ask for a property, in this case P2048 (height), without discarding the items that do not have it.
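When results are read as JSON, a variable that found no value is simply missing from that row. Here is a small sketch of how a script might cope with that, assuming the standard SPARQL JSON results layout. The two rows are invented sample data, not real Wikidata values, and `height_of` is a hypothetical helper.

```python
# Invented sample rows in the SPARQL JSON results shape. The second row has
# no "height" key, which is what a row looks like when OPTIONAL found nothing.
sample_rows = [
    {
        "itemLabel": {"type": "literal", "value": "Example Actress A"},
        "height": {"type": "literal", "value": "1.70"},
    },
    {
        "itemLabel": {"type": "literal", "value": "Example Actress B"},
    },
]


def height_of(row):
    """Return the height as a float, or None when the row has no height."""
    binding = row.get("height")
    return float(binding["value"]) if binding is not None else None


heights = {row["itemLabel"]["value"]: height_of(row) for row in sample_rows}
print(heights)
```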
Of course, as mentioned, this is just an introduction. SPARQL syntax can do much more and become much more complex.
But this weekend you can practice searching for the information that interests you most directly in Wikidata.
I hope you enjoyed this post!