Best Practice Benefits DBpedia with DataID Statement
1. Provide metadata

Provide metadata for both human users and computer applications.

  • Reuse
  • Comprehension
  • Discoverability
  • Processability
DataID central concept of DataID
2. Provide descriptive metadata

Provide metadata that describes the overall features of datasets and distributions.

  • Reuse
  • Comprehension
  • Discoverability
DataID central concept of DataID
3. Provide structural metadata

Provide metadata that describes the schema and internal structure of a distribution.

  • Reuse
  • Comprehension
  • Processability
DataID, DBpedia using void:vocabulary to point out the DBpedia ontology in use
4. Provide data license information

Provide a link to or copy of the license agreement that controls use of the data.

  • Reuse
  • Trust
DataID licensing of data can be provided via dct:license and odrl:Policy instances on Dataset and Distribution level (in the case of DBpedia: http://purl.oclc.org/NET/rdflicense/cc-by-sa3.0)
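
A minimal sketch of the license statement described in this row, written as a single N-Triples line. The dataset URI is hypothetical; the license URI is the one quoted above, and `http://purl.org/dc/terms/license` is the expanded form of dct:license. The odrl:Policy variant is not shown.

```python
# Sketch only: the dataset URI used below is hypothetical.
DCT_LICENSE = "http://purl.org/dc/terms/license"  # expanded form of dct:license
LICENSE_URI = "http://purl.oclc.org/NET/rdflicense/cc-by-sa3.0"  # quoted in the row above


def license_triple(subject_uri: str, license_uri: str = LICENSE_URI) -> str:
    """Return one N-Triples line attaching a license to a dataset or distribution."""
    return f"<{subject_uri}> <{DCT_LICENSE}> <{license_uri}> ."


dataset = "http://example.org/dataid/dataset"  # hypothetical dataset URI
print(license_triple(dataset))
```

The same helper can be applied to a distribution URI, since the row states that licensing can be given on both levels.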
5. Provide data provenance information

Provide complete information about the origins of the data and any changes you have made.

  • Reuse
  • Comprehension
  • Trust
DataID central concept of DataID; for DBpedia: complete record of involved Agents and source Dataset
6. Provide data quality information

Provide information about data quality and fitness for particular purposes.

  • Reuse
  • Trust
None not supported by DataID core - one of the DataID extension ontologies will cover DQ by importing DQV
7. Provide a version indicator

Assign and indicate a version number or date for each dataset.

  • Reuse
  • Trust
DataID version numbers are provided in the query component of the DataID/Dataset/Distribution URI; without that parameter the URI references the latest version of the resource (all prov:Entities have pointers to the next, previous and latest version)
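
The versioning scheme in this row can be sketched as follows. The source does not name the query parameter, so `version` is an assumption, as are the example URI and the version string; only the idea (version in the query, no parameter means latest) comes from the row above.

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit

VERSION_PARAM = "version"  # hypothetical parameter name, not given in the source


def with_version(uri: str, version: str) -> str:
    """Return the URI with the version placed in its query component."""
    parts = urlsplit(uri)
    query = parse_qs(parts.query)
    query[VERSION_PARAM] = [version]
    return urlunsplit(parts._replace(query=urlencode(query, doseq=True)))


def version_of(uri: str):
    """Return the requested version, or None when the URI means 'latest'."""
    values = parse_qs(urlsplit(uri).query).get(VERSION_PARAM)
    return values[0] if values else None


base = "http://example.org/dataid/dataset"  # hypothetical DataID URI
print(with_version(base, "2015-10"))
print(version_of(base))
```

Under these assumptions a client treats a `None` result as a request for the latest version.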
8. Provide version history

Provide a complete version history that explains the changes made in each version.

  • Reuse
  • Trust
DBpedia indirectly provided by the diff between successive DataIDs and the general documentation of new releases
9. Use persistent URIs as identifiers of datasets

Identify each dataset by a carefully chosen, persistent URI.

  • Reuse
  • Linkability
  • Discoverability
  • Interoperability
DataID true; URIs defined in a DataID graph
10. Use persistent URIs as identifiers within datasets

Reuse other people's URIs as identifiers within datasets where possible.

  • Reuse
  • Linkability
  • Discoverability
  • Interoperability
DBpedia DBpedia resource URIs
11. Assign URIs to dataset versions and series

Assign URIs to individual versions of datasets as well as to the overall series.

  • Reuse
  • Discoverability
  • Trust
DataID see: Provide a version indicator
12. Use machine-readable standardized data formats

Make data available in a machine-readable, standardized data format that is well suited to its intended or potential use.

  • Reuse
  • Processability
DBpedia DBpedia is published as linked data in RDF
13. Use locale-neutral data representations

Use locale-neutral data structures and values, or, where that is not possible, provide metadata about the locale used by data values.

  • Reuse
  • Comprehension
DataID, DBpedia partially true for DBpedia (e.g. dates)
14. Provide data in multiple formats

Make data available in multiple formats when more than one format suits its intended or potential use.

  • Reuse
  • Processability
DBpedia DBpedia is published in multiple RDF serializations & on a public SPARQL endpoint
15. Reuse vocabularies, preferably standardized ones

Use terms from shared vocabularies, preferably standardized ones, to encode data and metadata.

  • Reuse
  • Processability
  • Comprehension
  • Trust
  • Interoperability
DBpedia DBpedia: rdfs, dct and others
16. Choose the right formalization level

Opt for a level of formal semantics that fits both data and the most likely applications.

  • Reuse
  • Comprehension
  • Interoperability
DBpedia Difficult to address, since DBpedia is a community effort. In general we try to keep the DBpedia ontology as shallow as possible.
17. Provide bulk download

Enable consumers to retrieve the full dataset with a single request.

  • Reuse
  • Access
DBpedia true for sub-datasets; whole language editions cannot be collected with one click
18. Provide Subsets for Large Datasets

If your dataset is large, enable users and applications to readily work with useful subsets of your data.

  • Reuse
  • Linkability
  • Access
  • Processability
DBpedia true: DataIDs are structured in 'Main Datasets' for each DBpedia language edition containing multiple sub datasets.
19. Use content negotiation for serving data available in multiple formats

Use content negotiation in addition to file extensions for serving data available in multiple formats.

  • Reuse
  • Access
DBpedia Yes as far as the official endpoint is concerned.
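
As an illustration of content negotiation, a client can ask for a specific serialization through the HTTP `Accept` header. This is a sketch, not a description of how the endpoint is configured: the resource URI and the media type are illustrative choices, and the request is only constructed here, not sent.

```python
from urllib.request import Request


def negotiated_request(resource_uri: str, media_type: str = "text/turtle") -> Request:
    """Build a GET request that asks the server for one particular format."""
    return Request(resource_uri, headers={"Accept": media_type})


# Illustrative resource URI; pass the request to urllib.request.urlopen to send it.
req = negotiated_request("http://dbpedia.org/resource/Berlin")
print(req.full_url, req.get_header("Accept"))
```

Changing `media_type` to another RDF media type requests a different representation of the same resource URI.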
20. Provide real-time access

When data is produced in real time, make it available on the Web in real time or near real-time.

  • Reuse
  • Access
DBpedia Provided through DBpedia-live. The official DBpedia releases are snapshots of the data.
21. Provide data up to date

Make data available in an up-to-date manner, and make the update frequency explicit.

  • Reuse
  • Access
DBpedia see: Provide real-time access
22. Provide an explanation for data that is not available

For data that is not available, provide an explanation about how the data can be accessed and who can access it.

  • Reuse
  • Trust
None The primary data are static dump files, which should always be accessible for every release. For data not represented in the public endpoint, no explanation of its absence is provided there.
23. Make data available through an API

Offer an API to serve data if you have the resources to do so.

  • Reuse
  • Processability
  • Interoperability
  • Access
DBpedia Some of the data (mostly from the English language edition) is available via the official SPARQL endpoint of DBpedia.
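
Access through a SPARQL endpoint can be sketched as building a SPARQL Protocol GET URL, where the query text travels in the `query` parameter. The endpoint address and the example query are assumptions made for illustration; the URL is only constructed, not fetched.

```python
from urllib.parse import urlencode

ENDPOINT = "https://dbpedia.org/sparql"  # assumed address of the public endpoint


def sparql_get_url(query: str, endpoint: str = ENDPOINT) -> str:
    """Return a GET URL carrying the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


# Illustrative query: a few properties of one resource.
query = "SELECT ?p ?o WHERE { <http://dbpedia.org/resource/Berlin> ?p ?o } LIMIT 5"
url = sparql_get_url(query)
print(url)
```

A client would typically combine this with an `Accept` header to choose the result format.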
24. Use Web Standards as the foundation of APIs

When designing APIs, use an architectural style that is founded on the technologies of the Web itself.

  • Reuse
  • Linkability
  • Interoperability
  • Discoverability
  • Access
  • Processability
DBpedia true: SPARQL endpoint sponsored by OpenLink
25. Provide complete documentation for your API

Provide complete information on the Web about your API. Update documentation as you add features or make changes.

  • Reuse
  • Trust
DBpedia outside the scope of DBpedia; the official endpoint conforms to SPARQL 1.1, and the API documentation is provided by OpenLink, the provider of the endpoint.
26. Avoid Breaking Changes to Your API

Avoid changes to your API that break client code, and communicate any changes in your API to your developers when evolution happens.

  • Trust
  • Interoperability
DBpedia outside the scope of DBpedia; since the official DBpedia endpoint follows the SPARQL 1.1 specification, breaking changes should not occur
27. Preserve identifiers

When removing data from the Web, preserve the identifier and provide information about the archived resource.

  • Reuse
  • Trust
DBpedia DBpedia follows Wikipedia when it comes to deleted wiki pages, providing dbo:redirect, pointing out the resource Wikipedia is redirecting to. The identifier itself is preserved.
28. Assess dataset coverage

Assess the coverage of a dataset prior to its preservation.

  • Reuse
  • Trust
None difficult to realize
29. Gather feedback from data consumers

Provide a readily discoverable means for consumers to offer feedback.

  • Reuse
  • Comprehension
  • Trust
DBpedia DBpedia is in the process of providing a triple-level feedback loop. At the moment, feedback is collected via multiple mailing lists.
30. Make feedback available

Make consumer feedback about datasets and distributions publicly available.

  • Reuse
  • Trust
DBpedia All current and future means of feedback will be readily available for anyone.
31. Enrich data by generating new data

Enrich your data by generating new data when doing so will enhance its value.

  • Reuse
  • Comprehension
  • Trust
  • Processability
DBpedia New data is being generated, for example by applying NLP algorithms to the wiki page texts.
32. Provide Complementary Presentations

Enrich data by presenting it in complementary, immediately informative ways, such as visualizations, tables, Web applications, or summaries.

  • Reuse
  • Comprehension
  • Access
  • Trust
DBpedia This is a task for the DBpedia community. We do provide DBpedia releases as tables though.
33. Provide Feedback to the Original Publisher

Let the original publisher know when you are reusing their data. If you find an error or have suggestions or compliments, let them know.

  • Reuse
  • Interoperability
  • Trust
None Difficult to extend the feedback loop to Wikipedia editors.
34. Follow Licensing Terms

Find and follow the licensing requirements from the original publisher of the dataset.

  • Reuse
  • Trust
DBpedia We follow the licenses put in place by Wikipedia.
35. Cite the Original Publication

Acknowledge the source of your data in metadata. If you provide a user interface, include the citation visibly in the interface.

  • Reuse
  • Discoverability
  • Trust
DataID, DBpedia We point out the original source in the dataset metadata (orig. XML dump), as well as on the triple level (orig. Wikipedia page).