Linked Data

Linked Data is a method of publishing structured information on the web using standard formats and linking that data to external sources, enabling automatic discovery and integration across diverse systems.

Linked Data applies web principles to structured information. Traditional web content (HTML documents) links to other documents via hyperlinks. Linked Data links data to data: a dataset identifies entities with standard URIs and describes them in standard formats, so automated systems can discover and traverse relationships between them. The four principles of Linked Data, as formulated by Tim Berners-Lee, are: use URIs to name things; use HTTP URIs so those names can be looked up; when a URI is looked up, return useful information in a standard format (such as RDF); and include links to other URIs so that related things can be discovered.
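As a rough illustration of the four principles, the sketch below names an entity with an HTTP URI, describes it with triples, links it to an external URI, and serializes the result in a simplified N-Triples style. The example.org and example.net identifiers and the helper names (Literal, term, to_ntriples) are hypothetical; only the FOAF and OWL predicate URIs are real vocabulary terms. It uses plain Python rather than an RDF library and handles only URIs and simple string literals.

```python
from dataclasses import dataclass

# Real vocabulary terms.
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# Hypothetical identifiers, used for illustration only.
ALICE = "http://example.org/people/alice"
EXTERNAL_ALICE = "http://other.example.net/researchers/42"


@dataclass(frozen=True)
class Literal:
    """A plain string value, as opposed to a URI."""
    value: str


def term(t):
    """Render one triple term: quoted for literals, <...> for URIs."""
    if isinstance(t, Literal):
        escaped = t.value.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'
    return f"<{t}>"


def to_ntriples(triples):
    """Serialize (subject, predicate, object) tuples, one statement per line."""
    return "\n".join(f"{term(s)} {term(p)} {term(o)} ." for s, p, o in triples)


triples = [
    (ALICE, FOAF_NAME, Literal("Alice")),   # describe the entity
    (ALICE, OWL_SAME_AS, EXTERNAL_ALICE),   # link to an external URI
]
ntriples = to_ntriples(triples)
print(ntriples)
```

In a real deployment the URI for Alice would also resolve over HTTP and return a description like this one, which is what makes the identifier resolvable rather than merely unique.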

Linked Data initiatives have created large, interconnected knowledge repositories. DBpedia extracts structured data from Wikipedia and links it to other datasets. Wikidata provides a centralized, collaboratively edited knowledge base linked to external sources. Because these datasets share identifiers and links, a query such as "all authors who won the Nobel Prize and published at least 10 papers" can traverse several of them automatically.
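A query of that kind might be sketched as follows, assuming two small in-memory datasets that use different URIs for the same author and are connected by owl:sameAs links. All data, the example.org URIs, the wonAward predicate, and the function name prolific_laureates are hypothetical; a production system would more likely run SPARQL against real endpoints.

```python
from collections import Counter

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
DCTERMS_CREATOR = "http://purl.org/dc/terms/creator"
WON_AWARD = "http://example.org/vocab/wonAward"      # hypothetical predicate
NOBEL_PRIZE = "http://example.org/awards/nobel"      # hypothetical award URI

# Dataset A (hypothetical): awards, plus links to dataset B's author URIs.
awards = {
    ("http://a.example.org/person/1", WON_AWARD, NOBEL_PRIZE),
    ("http://a.example.org/person/1", OWL_SAME_AS, "http://b.example.org/author/x"),
    ("http://a.example.org/person/2", WON_AWARD, NOBEL_PRIZE),
    ("http://a.example.org/person/2", OWL_SAME_AS, "http://b.example.org/author/y"),
}

# Dataset B (hypothetical): author x has 10 papers, author y has 3.
publications = {
    (f"http://b.example.org/paper/x{i}", DCTERMS_CREATOR, "http://b.example.org/author/x")
    for i in range(10)
} | {
    (f"http://b.example.org/paper/y{i}", DCTERMS_CREATOR, "http://b.example.org/author/y")
    for i in range(3)
}


def prolific_laureates(awards, publications, min_papers=10):
    """Return laureates from dataset A with at least min_papers in dataset B."""
    laureates = {s for s, p, o in awards if p == WON_AWARD and o == NOBEL_PRIZE}
    # Simplification: assumes at most one sameAs link per person.
    same_as = {s: o for s, p, o in awards if p == OWL_SAME_AS}
    paper_counts = Counter(o for _, p, o in publications if p == DCTERMS_CREATOR)
    return sorted(a for a in laureates if paper_counts[same_as.get(a, a)] >= min_papers)


result = prolific_laureates(awards, publications)
print(result)
```

The owl:sameAs links are what let the query cross from one dataset to the other; without them, the two author URIs would look like unrelated entities.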

Linked Data is foundational to the semantic web vision: the web as not just documents but interconnected knowledge that machines can interpret and reason about. However, Linked Data faces practical challenges: creating and maintaining high-quality linked data requires significant effort, and uptake varies by domain (strong in research and government, weaker in private industry).

Key Characteristics

  • Uses URIs for globally unique identification, enabling unambiguous references
  • Publishes information in standardized machine-readable formats (RDF)
  • Links data to external sources, enabling integrated querying across sources
  • Makes information resolvable via HTTP, enabling automated discovery
  • Follows standard ontologies and vocabularies, enabling shared semantic interpretation
  • Enables automated integration without point-to-point mappings

Why It Matters

  • Enables automated discovery and integration of information across organizations
  • Reduces data integration costs by using standardized formats and linking
  • Provides foundation for knowledge discovery across distributed data sources
  • Facilitates open data initiatives by standardizing how data is published
  • Supports AI systems by providing interconnected, structured knowledge
  • Enables compliance and transparency through explicit, machine-readable relationships

Example

A Linked Data initiative for research publishes datasets about researchers, publications, and organizations as RDF, with a URI for each entity. Researcher Alice is linked to Organization X through the FOAF predicate foaf:workplaceHomepage, which points to the organization's homepage. Publication Y names Researcher Alice as its author through the Dublin Core predicate dc:creator. Because both datasets use the same URI for Alice, an automated system can answer "What publications are by researchers at this organization?" across them without custom integration.
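The example above can be sketched in a few lines of plain Python. The two datasets, the example.org URIs, and the function name publications_for_org are hypothetical; the predicate URIs are the standard expansions of foaf:workplaceHomepage and dc:creator. Since both datasets are just sets of triples that share URIs, "integration" reduces to a set union followed by a join.

```python
FOAF_WORKPLACE_HOMEPAGE = "http://xmlns.com/foaf/0.1/workplaceHomepage"
DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"

# Hypothetical identifiers, used for illustration only.
ALICE = "http://example.org/researchers/alice"
BOB = "http://example.org/researchers/bob"
ORG_X = "http://org-x.example.org/"
ORG_Z = "http://org-z.example.org/"
PUBLICATION_Y = "http://example.org/publications/y"
PUBLICATION_W = "http://example.org/publications/w"

# Dataset 1: who works where.
people_dataset = {
    (ALICE, FOAF_WORKPLACE_HOMEPAGE, ORG_X),
    (BOB, FOAF_WORKPLACE_HOMEPAGE, ORG_Z),
}

# Dataset 2: who created which publication.
publications_dataset = {
    (PUBLICATION_Y, DC_CREATOR, ALICE),
    (PUBLICATION_W, DC_CREATOR, BOB),
}


def publications_for_org(triples, org_homepage):
    """Return publications whose creator works at the given organization."""
    staff = {s for s, p, o in triples
             if p == FOAF_WORKPLACE_HOMEPAGE and o == org_homepage}
    return sorted(s for s, p, o in triples if p == DC_CREATOR and o in staff)


# Shared URIs mean merging the datasets is a plain set union.
merged = people_dataset | publications_dataset
org_x_publications = publications_for_org(merged, ORG_X)
print(org_x_publications)
```

No mapping table between the datasets is needed here: the join works only because both use the same URI for Alice.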

Coginiti Perspective

Coginiti's Analytics Catalog implements linked data principles for analytics, where semantic models become standardized, shareable knowledge assets linked across projects and teams through version control and promotion workflows. By publishing SMDL models with explicit relationships, Coginiti enables organizations to build interconnected analytics knowledge that developers and analysts can discover, reuse, and extend without rebuilding definitions.
