Data Set 2.0

This page provides the DBpedia dataset for download. The dataset was extracted from the July 16th, 2007 database dump of Wikipedia (enwiki20070716).

The DBpedia dataset is licensed under the terms of the GNU Free Documentation License.

Because of its size, the dataset has been split into several files. All files are bzip2-compressed (bz2) and contain N-Triples data.
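
Since every file is a bz2-packed N-Triples dump, it can be read as a stream without unpacking it to disk first. The following is a minimal Python sketch of that idea, not an official DBpedia tool. The function names `iter_triples` and `count_predicates` and the file name in the usage comment are assumptions made for illustration, and the regular expression is a simplification that only handles lines whose subject and predicate are URIs.

```python
import bz2
import re
from collections import Counter

# Simplified N-Triples line pattern: <subject> <predicate> object .
# Assumption: this sketch ignores blank nodes and does not decode escapes.
TRIPLE_RE = re.compile(r'^<([^>]*)>\s+<([^>]*)>\s+(.+?)\s*\.\s*$')


def iter_triples(path):
    """Yield (subject, predicate, object) tuples from a bz2-packed N-Triples file."""
    # bz2.open in text mode decompresses on the fly, line by line.
    with bz2.open(path, mode="rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # skip blank lines and comments
            match = TRIPLE_RE.match(line)
            if match:
                yield match.groups()


def count_predicates(path):
    """Count how often each predicate URI occurs in the file."""
    return Counter(predicate for _, predicate, _ in iter_triples(path))


# Usage (hypothetical file name, substitute the file you downloaded):
#   for predicate, total in count_predicates("articles.nt.bz2").most_common(5):
#       print(total, predicate)
```

Streaming keeps memory use flat, which matters for the larger files whose unpacked size runs to several gigabytes.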


Link | Size (packed/unpacked/triples) | Description | Properties
Articles | 197MB/1.3GB/7.6M | Descriptions of all 1.95 million concepts within the English version of Wikipedia, including English titles, English short abstracts (max. 500 characters), thumbnails, and links to the corresponding articles in the English Wikipedia. This is the DBpedia base file, which should be loaded into every DBpedia repository. | rdfs:label, rdfs:comment, foaf:image, foaf:depiction, foaf:page
Extended Abstracts | 380MB/1.39GB/2.1M | Additional, extended English abstracts (max. 3000 characters). | dbpedia:abstract
External Links | 30MB/221MB/1.63M | Links to external web pages about a concept. | dbpedia:reference
Articles Categories | 42MB/780MB/5.2M | Links from concepts to categories using the SKOS vocabulary. | skos:subject
Additional Languages | 171MB/2.98GB/5.7M | Additional titles, short abstracts and Wikipedia article links in 13 languages (German, French, Spanish, Italian, Portuguese, Polish, Swedish, Dutch, Japanese, Chinese, Russian, Finnish, Norwegian). | wikipage-{lang}
Languages Extended Abstracts | 355MB/1.6GB/1.9M | Extended abstracts in 13 languages. | dbpedia:abstract
Infoboxes | 118MB/2.03GB/15.5M | Information that has been extracted from Wikipedia infoboxes. | see Infobox Properties
Properties | 400kB/8MB/57k | All properties (predicates) used in infoboxes. | rdf:type
Categories (Labels) | 3MB/35MB/261k | Labels for categories. | rdfs:label
Categories (Skos) | 7MB/158MB/1M | Information on which concepts are categories and how categories are related, using the SKOS vocabulary. | skos:prefLabel, skos:broader, rdf:type
Persons | 4MB/65MB/560k | Information about 80,200 persons (date and place of birth etc.) extracted from the German Wikipedia, represented using the FOAF vocabulary. | foaf:name, foaf:givenname, foaf:surname, dbpedia:birthPlace, dbpedia:birth, dbpedia:deathPlace, dbpedia:death, dc:description, rdf:type
Yago Classes | 15MB/291MB/2M | rdf:type statements for all DBpedia instances, based on the YAGO classification. | rdf:type
Yago Class Hierarchy | 1MB/15MB/117k | RDFS hierarchy of all YAGO classes. | rdfs:label, rdfs:subClassOf
Wordnet Classes | 2MB/53MB/338k | Classification links to W3C WordNet. | dbpedia:wordnet_type
Geographic coordinates | 3MB/63MB/450k | Geographic coordinates extracted from Wikipedia. | geo:lat, geo:long, geonames:featureClass, geonames:featureCode
Homepages | 3MB/24MB/200k | Links to external webpages. | foaf:homepage
Links to Geonames | 800kB/10MB/86k | Links between geographic places in DBpedia and data about them in the Geonames database. | owl:sameAs
Links to RDF Bookmashup | 70kB/1.3MB/9k | Links between books in DBpedia and data about them provided by the RDF Book Mashup. | owl:sameAs
Links to DBLP | 3kB/30kB/200 | Links between computer scientists in DBpedia and their publications in the DBLP database. | owl:sameAs
Links to Eurostat | 2kB/20kB/137 | Links between countries and regions in DBpedia and data about them from Eurostat. | owl:sameAs
Links to CIA-Factbook | 3kB/30kB/230 | Links between countries in DBpedia and data about them from the CIA Factbook. | owl:sameAs
Links to Project Gutenberg | 40kB/440kB/2500 | Links between writers in DBpedia and data about them from Project Gutenberg. | owl:sameAs
Links to Musicbrainz | 600kB/4MB/23k | Links between artists, albums and songs in DBpedia and data about them from Musicbrainz. | owl:sameAs
Links to Quotationsbook | 22kB/300kB/2500 | Links between persons in DBpedia and data about them from Quotationsbook. | owl:sameAs
Links to Revyu |  | Links to reviews about things in Revyu. | owl:sameAs
Links to US Census | 150kB/2MB/12k | Links between US cities and states in DBpedia and data about them from the US Census. | owl:sameAs
Links to the flickr wrappr | 22MB/1.95M | Links between DBpedia concepts and photo collections depicting them, generated by the flickr wrappr. | dbpedia:hasPhotoCollection
PageLinks | 368MB/8.15GB/62.8M | Internal links between DBpedia instances, created from the internal page links between Wikipedia articles. The dataset might be useful for structural analysis, data mining, or for ranking DBpedia instances using PageRank or similar algorithms. Its data is NOT available at our SPARQL endpoint. | dbpedia:wikilink
Links to Wikicompany | 116kB/1.3MB/8k | Links between companies in DBpedia and companies in Wikicompany. Needs to be updated to the latest version: wikicompany-opendata-current.bz2 | owl:sameAs
Links to Cyc | 485kB/6.1MB/45k | Links between DBpedia and Cyc concepts. | owl:sameAs
Links to QDOS | 270kB/2MB/8k | Links between data about people in DBpedia and QDOS. | owl:sameAs
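
The PageLinks entry above mentions ranking DBpedia instances with PageRank. As a rough illustration of what that could look like, here is a small power-iteration sketch in Python. It assumes the dbpedia:wikilink triples have already been reduced to (source, target) pairs. The function name `pagerank`, the damping factor of 0.85 and the fixed iteration count are illustrative choices, not something this page specifies, and an in-memory dictionary like this is unlikely to scale to all 62.8M links without further work.

```python
from collections import defaultdict


def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over an iterable of (source, target) pairs."""
    out_links = defaultdict(set)
    nodes = set()
    for source, target in edges:
        out_links[source].add(target)
        nodes.update((source, target))
    if not nodes:
        return {}

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by pages without outgoing links is spread evenly
        # over all pages so that the total rank stays at 1.
        dangling = sum(rank[node] for node in nodes if node not in out_links)
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for source, targets in out_links.items():
            share = damping * rank[source] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Usage with a toy graph (made-up node names):
#   ranks = pagerank([("A", "B"), ("C", "B"), ("B", "A")])
#   best = max(ranks, key=ranks.get)
```

For the full dataset, a sparse-matrix or disk-backed representation would probably be a better fit than nested Python sets.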