DKAN Overview

DKAN is an open-source open data platform inspired by CKAN (Comprehensive Knowledge Archive Network) and built on top of Drupal, a widely used CMS (Content Management System).


Structure

DKAN is a Drupal module that adds data management functionality.

Within DKAN, additional modules organize the internal subsystems. Information about the subsystems/components in DKAN can be found on the Components page.

DKAN’s modules and subsystems are organized around four main data functions:

  1. Management

  2. Aggregation

  3. Discoverability

  4. Usability

Data Management

The main function of any open data platform is to help manage data. Making data public is simple: anyone can place a file on a web-accessible server. Making data open takes a bit more work. True open data is accessible, discoverable, machine-readable, linked to other resources that provide context, and published in an open format under an open license.

This is what we mean by data management: providing tools that empower data publishers to make data open, which empowers data consumers to find and use the data they need.

Note

For more on the fundamentals of open data, read the Open Definition and 5-Star Open Data.

Most data management functions in DKAN are provided by the DKAN Metastore module.
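The open-data criteria above can be read as a checklist against a dataset's metadata record. The following is a minimal sketch, not DKAN's validation logic: the function name openness_gaps and the OPEN_FORMATS set are hypothetical, and the field names (license, keyword, distribution, downloadURL, format) assume a DCAT-US style record, which a given catalog's schema may or may not use.

```python
# Illustrative checklist only; not part of DKAN.
# OPEN_FORMATS is an assumed, non-exhaustive set of open formats.
OPEN_FORMATS = {"csv", "json", "xml", "geojson"}


def openness_gaps(dataset: dict) -> list:
    """Return a list of reasons a metadata record falls short of 'open'."""
    gaps = []
    if not dataset.get("license"):
        gaps.append("no open license declared")
    if not dataset.get("description") or not dataset.get("keyword"):
        gaps.append("too little metadata to be discoverable")
    distributions = dataset.get("distribution", [])
    if not any(d.get("downloadURL") for d in distributions):
        gaps.append("no web-accessible download")
    if not any(d.get("format", "").lower() in OPEN_FORMATS for d in distributions):
        gaps.append("no machine-readable open format")
    return gaps


record = {
    "title": "Restaurant inspections",
    "description": "Inspection scores by establishment.",
    "keyword": ["health", "restaurants"],
    "license": "https://creativecommons.org/publicdomain/zero/1.0/",
    "distribution": [
        {"format": "pdf", "downloadURL": "https://example.org/inspections.pdf"}
    ],
}
print(openness_gaps(record))  # ['no machine-readable open format']
```

The example record is public (it has a download URL) but not yet fully open, because its only distribution is a PDF.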

Data Aggregation

Many open data catalogs are aggregations of other sources of data. DKAN provides tools to allow any DKAN catalog to host aggregated or federated datasets alongside originally sourced data. A very large real-world example of this is Data.gov, a catalog that aggregates datasets from across the U.S. federal government.

Aggregating or importing datasets from different remote sources into a catalog is often known as harvesting. DKAN's extensible harvesting functionality lives in the DKAN Harvest module.
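The idea of harvesting can be sketched in a few lines. This is a conceptual illustration under stated assumptions, not the DKAN Harvest module's implementation: the harvest function, the "source:identifier" key scheme, and the harvest_source field are all hypothetical.

```python
def harvest(local_catalog: dict, source_name: str, remote_datasets: list) -> dict:
    """Merge remote dataset records into local_catalog and report what changed.

    Records are keyed by "<source_name>:<identifier>" (a hypothetical scheme)
    so harvested datasets never collide with originally sourced ones.
    """
    summary = {"created": 0, "updated": 0, "unchanged": 0}
    for record in remote_datasets:
        key = f"{source_name}:{record['identifier']}"
        # Tag the record with its origin so it can be told apart later.
        harvested = {**record, "harvest_source": source_name}
        existing = local_catalog.get(key)
        if existing is None:
            summary["created"] += 1
        elif existing != harvested:
            summary["updated"] += 1
        else:
            summary["unchanged"] += 1
        local_catalog[key] = harvested
    return summary


# One originally sourced dataset plus one harvested from a remote catalog.
catalog = {"budget-2024": {"identifier": "budget-2024", "title": "City budget"}}
remote = [{"identifier": "a", "title": "Air quality"}]

print(harvest(catalog, "state-portal", remote))
# {'created': 1, 'updated': 0, 'unchanged': 0}
print(harvest(catalog, "state-portal", remote))
# {'created': 0, 'updated': 0, 'unchanged': 1}
```

Running the same harvest twice leaves the catalog unchanged, which is the property that makes repeated, scheduled harvests safe.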

Discoverability

Data is only useful and open to the degree to which it can be found and understood. This is why many of the modules in DKAN are dedicated to helping make data more accessible.

The DKAN Metastore helps data publishers give context (metadata) to their data. The DKAN Search module provides a configurable way to allow data consumers to use metadata and find what they need.

The searchable metadata provided by the metastore_search module helps users narrow down their search, but ultimately they have to look at the data itself.
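As a rough illustration of how metadata narrows a search, the sketch below combines a full-text match with exact-match facets. It is a toy model, not the metastore_search module's API: the search function and its fulltext parameter are hypothetical, and facet fields such as keyword are assumed to hold lists of values.

```python
def search(datasets, fulltext=None, **facets):
    """Return datasets whose title or description contains `fulltext`
    and whose list-valued metadata fields contain every facet value."""
    results = []
    for ds in datasets:
        if fulltext:
            haystack = f"{ds.get('title', '')} {ds.get('description', '')}".lower()
            if fulltext.lower() not in haystack:
                continue
        # Every requested facet value must appear in the matching field.
        if all(value in ds.get(field, []) for field, value in facets.items()):
            results.append(ds)
    return results


catalog = [
    {"title": "Air quality readings", "description": "Hourly PM2.5 levels.",
     "keyword": ["environment", "air"]},
    {"title": "Water quality samples", "description": "Monthly river tests.",
     "keyword": ["environment", "water"]},
    {"title": "City budget", "description": "Adopted budget by department.",
     "keyword": ["finance"]},
]

print([d["title"] for d in search(catalog, fulltext="quality")])
# ['Air quality readings', 'Water quality samples']
print([d["title"] for d in search(catalog, fulltext="quality", keyword="water")])
# ['Water quality samples']
```

Each added facet shrinks the result set, but the last step, deciding whether a dataset actually answers the question, still requires looking at the data.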

Usability

Data in files isn’t naturally searchable, but the DKAN Datastore module parses and stores data in a more explorable format. DKAN can then use the datastore to provide direct access to the data through tools like the Datastore Query Endpoint.
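The difference between a file and a datastore can be shown with a small sketch that parses a CSV and loads it into a queryable table. This only illustrates the general idea using Python's standard csv and sqlite3 modules; the import_csv helper and the table and column names are hypothetical, and nothing here reflects how the DKAN Datastore module is actually implemented.

```python
import csv
import io
import sqlite3


def import_csv(conn, table, csv_text):
    """Parse CSV text and load it into a new table with one TEXT column
    per header field. Table and column names are trusted in this sketch."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    columns = ", ".join(f'"{name}" TEXT' for name in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    placeholders = ", ".join("?" for _ in header)
    # Remaining rows from the reader are inserted as-is.
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', reader)
    conn.commit()


conn = sqlite3.connect(":memory:")
import_csv(conn, "inspections", "city,score\nAustin,91\nBoston,78\nAustin,85\n")

# Once loaded, rows can be filtered and sorted without reading the whole file.
rows = conn.execute(
    "SELECT score FROM inspections WHERE city = ? ORDER BY score", ("Austin",)
).fetchall()
print(rows)  # [('85',), ('91',)]
```

Every value is stored as text here to keep the sketch short; a real datastore would typically also infer or accept column types.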


DKAN is actively maintained by CivicActions.

To learn more about the DKAN community, visit DKAN Discussions.