Data Set 3.7
This page provides downloads of the DBpedia datasets. The DBpedia datasets are licensed under the terms of the Creative Commons Attribution-ShareAlike License and the GNU Free Documentation License. The downloads are provided as N-Triples and N-Quads; the N-Quads version contains additional provenance information for each statement. All files are compressed with bzip2 (.bz2).
Older Versions: DBpedia 3.6, DBpedia 3.5.1, DBpedia 3.5, DBpedia 3.4, DBpedia 3.3, DBpedia 3.2, DBpedia 3.1, DBpedia 3.0, DBpedia 3.0RC, DBpedia 2.0
See also the change log for recent changes and developments.
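Because the files are bz2-packed N-Triples and N-Quads, they can be streamed line by line without unpacking them to disk first. The following is a minimal sketch, not an official DBpedia tool: the names `parse_line` and `iter_statements` are hypothetical, and the regular expression is a simplification that covers common lines rather than the full N-Triples grammar. For N-Quads files, the fourth term is the provenance context; for N-Triples files it is `None`.

```python
import bz2
import re

# Simplified pattern for one N-Triples / N-Quads line: subject, predicate,
# object, and an optional fourth term (the provenance context in N-Quads).
# Assumption: this handles common lines only, not every corner of the grammar.
LINE_RE = re.compile(
    r'^(<[^>]*>|_:\S+)\s+'      # subject: IRI or blank node
    r'(<[^>]*>)\s+'             # predicate: IRI
    r'(<[^>]*>|_:\S+|"(?:[^"\\]|\\.)*"(?:\^\^<[^>]*>|@[A-Za-z0-9-]+)?)'  # object
    r'(?:\s+(<[^>]*>))?'        # optional context (N-Quads only)
    r'\s*\.\s*$'
)


def parse_line(line):
    """Return (subject, predicate, object, context), or None for blanks/comments."""
    line = line.strip()
    if not line or line.startswith('#'):
        return None
    match = LINE_RE.match(line)
    if match is None:
        raise ValueError(f"unparseable line: {line!r}")
    return match.groups()


def iter_statements(path):
    """Stream statements from a bz2-packed .nt or .nq file."""
    with bz2.open(path, mode='rt', encoding='utf-8') as handle:
        for line in handle:
            parsed = parse_line(line)
            if parsed is not None:
                yield parsed
```

For production use, a full RDF parser is the safer choice; the sketch is only meant to show the shape of the data.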
Content
1 Wikipedia Input Files
The datasets were extracted from Wikipedia dumps generated in late July 2011 (see also all specific dates and times).
2 Core Datasets
NOTE: You can find DBpedia dumps in 97 languages at our DBpedia download server.
Click on the dataset names to obtain additional information.
Dataset | en | ca | de | el | es | fr | ga | hr | hu | it | nl | pl | pt | ru | sl | tr |
DBpedia Ontology ( preview ) | owl | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- |
Ontology Infobox Types ( preview ) | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq |
Ontology Infobox Properties ( preview ) | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq |
Ontology Infobox Properties (Specific) ( preview ) | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | -- | nt nq | nt nq | -- | nt nq | nt nq |
Dataset | en | ca | de | el | es | fr | ga | hr | hu | it | nl | pl | pt | ru | sl | tr |
Articles Categories ( preview ) | nt nq | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- |
Categories (Labels) ( preview ) | nt nq | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- |
Categories (Skos) ( preview ) | nt nq | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- |
External Links ( preview ) | nt nq | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- |
Links to Wikipedia Article ( preview ) | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq |
Wikipedia Pagelinks ( preview ) | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq |
Redirects ( preview ) | nt nq | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- |
Disambiguation Links ( preview ) | nt nq | nt nq | nt nq | nt nq | nt nq | -- | -- | -- | -- | nt nq | -- | nt nq | nt nq | nt nq | -- | -- |
Page IDs ( preview ) | nt nq | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- |
Revision IDs ( preview ) | nt nq | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- |
3 i18n Datasets
These datasets contain all articles of the respective Wikipedia, including those that have no equivalent English article.
CAUTION: the URIs in these dumps have language-specific namespaces (e.g. http://el.dbpedia.org/...).
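When mixing these dumps with the core datasets, it can help to tell the namespaces apart programmatically. The sketch below is illustrative only: `dbpedia_language` is a hypothetical helper, and treating the un-prefixed `dbpedia.org` host as English is an assumption rather than something this page states.

```python
from urllib.parse import urlparse


def dbpedia_language(uri, default='en'):
    """Guess the language code from a DBpedia URI's host name.

    Assumption: 'xx.dbpedia.org' means language 'xx', and the bare
    'dbpedia.org' host is treated as `default`.
    Returns None for URIs outside dbpedia.org.
    """
    host = urlparse(uri).hostname or ''
    if host == 'dbpedia.org':
        return default
    if host.endswith('.dbpedia.org'):
        return host[: -len('.dbpedia.org')]
    return None
```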
Click on the dataset names to obtain additional information.
Dataset | ca | de | el | es | fr | ga | hr | hu | it | nl | pl | pt | ru | sl | tr |
Ontology Infobox Types | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq |
Ontology Infobox Properties | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq |
Ontology Infobox Properties (Specific) | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | -- | nt nq | nt nq | -- | nt nq | nt nq |
Dataset | ca | de | el | es | fr | ga | hr | hu | it | nl | pl | pt | ru | sl | tr |
Titles | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq |
Short Abstracts | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq |
Extended Abstracts | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq |
Images | -- | nt nq | nt nq | nt nq | -- | -- | -- | -- | -- | -- | -- | nt nq | nt nq | -- | -- |
Geographic Coordinates | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | -- | nt nq |
Raw Infobox Properties | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq |
Raw Infobox Property Definitions | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq |
Homepages | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | -- | -- | -- | -- | nt nq | nt nq | nt nq | -- | -- |
Persondata | -- | nt nq | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- |
PND | -- | nt nq | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- |
Dataset | ca | de | el | es | fr | ga | hr | hu | it | nl | pl | pt | ru | sl | tr |
Links to Wikipedia Article | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq |
Wikipedia Pagelinks | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq |
Disambiguation Links | nt nq | nt nq | nt nq | nt nq | -- | -- | -- | -- | nt nq | -- | nt nq | nt nq | nt nq | -- | -- |
Inter-language links | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq | nt nq |
4 External Links
Click on the dataset names to obtain additional information.
5 Dataset Descriptions
DBpedia Ontology
The DBpedia ontology in OWL. See our JWS paper for more details.
Ontology Infobox Types
Contains triples of the form $object rdf:type $class from the ontology-based extraction.
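As an illustration of how `$object rdf:type $class` triples might be consumed, the sketch below counts instances per class. It assumes statements are already available as tuples of N-Triples terms (subject, predicate, object, and optionally a context); `count_instances_per_class` is a hypothetical name, not part of any DBpedia tooling.

```python
from collections import Counter

RDF_TYPE = '<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>'


def count_instances_per_class(statements):
    """Count how many subjects are typed with each class.

    `statements` yields tuples whose first three items are the subject,
    predicate and object terms; any further items (e.g. an N-Quads
    context) are ignored.
    """
    counts = Counter()
    for _subject, predicate, obj, *_rest in statements:
        if predicate == RDF_TYPE:
            counts[obj] += 1
    return counts
```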
Ontology Infobox Properties
High-quality data extracted from Infoboxes using the strict ontology-based extraction. The predicates in this dataset are in the /ontology/ namespace.
Ontology Infobox Properties (Specific)
Infobox data from the loose ontology-based extraction.
Titles
Titles of all Wikipedia Articles in the corresponding language
Short Abstracts
Short Abstracts (max. 500 chars long) of Wikipedia Articles
Extended Abstracts
Additional, extended English abstracts.
Images
Thumbnail Links from Wikipedia Articles
Geographic Coordinates
Geographic coordinates extracted from Wikipedia.
Raw Infobox Properties
Information that has been extracted from Wikipedia infoboxes. Note that this data is in the less clean /property/ namespace. The Ontology Infobox Properties (/ontology/ namespace) should always be preferred over this data.
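One way to apply this preference when both datasets are loaded is to keep a subject's /ontology/ statements when it has any, and fall back to its /property/ statements otherwise. The sketch below assumes the two namespaces expand to `http://dbpedia.org/ontology/` and `http://dbpedia.org/property/`; the function names are hypothetical, and the per-subject fallback policy is just one possible reading of "preferred".

```python
# Assumed full namespace IRIs for the /ontology/ and /property/ namespaces.
ONTOLOGY_NS = 'http://dbpedia.org/ontology/'
PROPERTY_NS = 'http://dbpedia.org/property/'


def predicate_kind(predicate):
    """Classify a predicate term as 'ontology', 'raw' or 'other'."""
    iri = predicate.strip('<>')
    if iri.startswith(ONTOLOGY_NS):
        return 'ontology'
    if iri.startswith(PROPERTY_NS):
        return 'raw'
    return 'other'


def preferred_statements(statements):
    """Per subject, keep ontology statements if any exist, else the raw ones."""
    by_subject = {}
    for statement in statements:
        kind = predicate_kind(statement[1])
        if kind == 'other':
            continue  # not infobox data; ignored in this sketch
        groups = by_subject.setdefault(statement[0], {'ontology': [], 'raw': []})
        groups[kind].append(statement)
    result = []
    for groups in by_subject.values():
        result.extend(groups['ontology'] or groups['raw'])
    return result
```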
Raw Infobox Property Definitions
All properties / predicates used in infoboxes.
Homepages
Links to external webpages.
Persondata
Information about persons (date and place of birth etc.) extracted from the English and German Wikipedia, represented using the FOAF vocabulary.
PND
Dataset containing PND (Personennamendatei) identifiers.
Articles Categories
Links from concepts to categories using the SKOS vocabulary.
Categories (Labels)
Labels for Categories.
Categories (Skos)
Information about which concepts are categories and how categories are related, using the SKOS vocabulary.
External Links
Links to external web pages about a concept.
Links to Wikipedia Article
Links to corresponding Articles in Wikipedia
Wikipedia Pagelinks
Dataset containing internal links between DBpedia instances. The dataset was created from the internal pagelinks between Wikipedia articles. The dataset might be useful for structural analysis, data mining or for ranking DBpedia instances using PageRank or similar algorithms.
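To make the ranking use case concrete, here is a minimal power-iteration PageRank over (source, target) link pairs. It is a sketch for small in-memory graphs only: the damping factor of 0.85 and the fixed iteration count are conventional defaults chosen for illustration, and the full pagelinks dump would need a more memory-conscious implementation.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Compute PageRank scores from an iterable of (source, target) pairs."""
    out_links = {}
    nodes = set()
    for source, target in links:
        out_links.setdefault(source, set()).add(target)
        nodes.update((source, target))
    if not nodes:
        return {}

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by pages without outgoing links is spread over all pages.
        dangling = sum(rank[node] for node in nodes if node not in out_links)
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for source, targets in out_links.items():
            share = damping * rank[source] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank
```

The scores sum to 1, so they can be read as the share of a random surfer's visits that each instance receives.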
Redirects
Dataset containing redirects between Articles in Wikipedia
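Redirects can point at other redirects, so consumers often want the final target. The sketch below assumes the redirects have been loaded into a plain dict mapping each source URI to its target URI; `resolve_redirect` and the `max_hops` limit are hypothetical, and a cycle simply stops the walk at the last URI reached.

```python
def resolve_redirect(uri, redirects, max_hops=10):
    """Follow redirects from `uri` until a non-redirecting URI is reached.

    `redirects` maps a source URI to its redirect target. The walk stops
    early on a cycle or after `max_hops` steps.
    """
    seen = {uri}
    current = uri
    for _ in range(max_hops):
        target = redirects.get(current)
        if target is None:
            return current      # not a redirect: final target found
        if target in seen:
            return current      # cycle detected: stop here
        seen.add(target)
        current = target
    return current
```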
Disambiguation Links
Extraction from Disambiguation Templates
Page IDs
Dataset containing the Wikipedia Page IDs.
Revision IDs
Dataset containing the Wikipedia Revision IDs.
Inter-language links
Dataset containing links between the different DBpedia URIs for various languages.
Links to RDF Bookmashup
Links between books in DBpedia and data about them provided by the RDF Book Mashup. Provided by Georgi Kobilarov. Update mechanism: unclear/copy over from previous release.
Links to Bricklink
Links between DBpedia and Bricklink.
Links to DailyMed
Links between DBpedia and DailyMed. Update mechanism: unclear/copy over from previous release.
Links to DBLP
Links between computer scientists in DBpedia and their publications in the DBLP database. Links were created manually. Update mechanism: Copy over from previous release.
Links to Diseasome
Links between DBpedia and Diseasome. Update mechanism: unclear/copy over from previous release.
Links to DrugBank
Links between DBpedia and DrugBank. Update mechanism: unclear/copy over from previous release.
Links to EUnis
TODO
Links to Eurostat
Links between countries and regions in DBpedia and data about them from Eurostat. Links were created manually. Update mechanism: Copy over from previous release.
Links to CIA Factbook
Links between countries in DBpedia and data about them from CIA Factbook. Links were created manually. Update mechanism: Copy over from previous release.
Links to flickr wrappr
Links between DBpedia concepts and photo collections depicting them generated by the flickr wrappr. Update mechanism: script in Mercurial.
Links to Freebase
Links between DBpedia and Freebase (MIDs). Update mechanism: script in Mercurial.
Links to GADM
Links between places in DBpedia and GADM.
Links to Geonames
Links between geographic places in DBpedia and data about them in the Geonames database. Provided by the Geonames people. Update mechanism: unclear/copy over from previous release.
Links to GeoSpecies
Links between species in DBpedia and GeoSpecies.
Links to Project Gutenberg
Links between writers in DBpedia and data about them from Project Gutenberg. Update mechanism: script in Mercurial. Since this requires manual changes of files and a D2R installation, it will be copied over from the previous DBpedia version and updated between releases by the maintainers (Piet Hensel and Georgi Kobilarov).
Links to Italian Public Schools
Links between DBpedia and Italian Public Schools.
Links to LinkedMDB
TODO
Links to MusicBrainz
Links between artists, albums and songs in DBpedia and data about them from MusicBrainz. Created manually using the result of SPARQL queries. Update mechanism: unclear/copy over from previous release.
Links to New York Times
Links between New York Times subject headings and DBpedia concepts.
Links to Cyc
Links between DBpedia and Cyc concepts. Update mechanism: awk script.
Links to Revyu
Links to Reviews about things in Revyu. Created manually by Tom Heath. Update mechanism: unclear/copy over from previous release.
Links to SIDER
Links between DBpedia and SIDER. Update mechanism: unclear/copy over from previous release.
Links to TCMGeneDIT
Links between DBpedia and TCMGeneDIT. Update mechanism: unclear/copy over from previous release.
Links to Umbel
TODO
Links to US Census
Links between US cities and states in DBpedia and data about them from US Census. Update mechanism: unclear/copy over from previous release.
Links to WikiCompany
Links between companies in DBpedia and companies in Wikicompany. Update mechanism: script in Mercurial.
Links to WordNet
Classification links to RDF representations of WordNet classes. Update mechanism: unclear/copy over from previous release.
Links to YAGO2
Dataset containing links between DBpedia and YAGO, YAGO type information for DBpedia resources and the YAGO class hierarchy. Currently maintained by Johannes Hoffart.
6 NLP Datasets
DBpedia also includes a number of NLP Datasets, which are specifically targeted at supporting Computational Linguistics and Natural Language Processing (NLP) tasks. Among those, we highlight the Lexicalization Dataset, Topic Signatures, Thematic Concepts and Grammatical Genders.