Linked data is a way of describing information so that data published by different parties can refer to the same things by name and be combined without prior coordination. The underlying model is the Resource Description Framework (RDF), a W3C standard that represents every statement as a triple — subject, predicate, object — with each part identified by an IRI, or, for the object, by a literal value.

The four linked-data principles

In a 2006 design note, Tim Berners-Lee set out four principles for publishing linked data on the web:

Use IRIs as names for things.
Use HTTP IRIs so those names can be looked up.
When an IRI is dereferenced, return useful information in a standard format — typically RDF in some serialisation.
Include links to other IRIs so a client can discover more.

These principles apply equally to a public knowledge base, a schema.org snippet inside an HTML page, and a single user's bookmark file in a Solid pod.

Triples

The atomic statement in RDF is the triple. A bookmark is described by a set of triples that share the same subject:

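@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix bookmark: <http://www.w3.org/2002/01/bookmark#> .
<bookmark-iri> rdf:type bookmark:Bookmark .
<bookmark-iri> dc:title "The article title" .
<bookmark-iri> bookmark:recalls <https://example.com/article> .
<bookmark-iri> bookmark:hasTopic <topic-rust> .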

The predicate is always an IRI. The subject is an IRI or a blank node. The object is an IRI (a link to another resource), a blank node, or a literal: a string, number, boolean, or typed value such as an xsd:dateTime. Literals can carry a language tag ("colour"@en) or a datatype IRI.

A subject that doesn't need an external name can be a blank node — useful for intermediate structures such as a postal address whose street, city, and postcode are the useful triples, not the structure itself.
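
To make the model concrete, the sketch below writes the bookmark triples above as plain Python tuples. It is only an illustration built on the standard library, not an RDF library: the IRI, BlankNode, and Literal classes, the describe helper, and the https://pod.example/ identifiers are all assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import Optional, Set, Tuple, Union


@dataclass(frozen=True)
class IRI:
    value: str


@dataclass(frozen=True)
class BlankNode:
    label: str  # only meaningful inside one document


@dataclass(frozen=True)
class Literal:
    lexical: str
    datatype: Optional[IRI] = None
    language: Optional[str] = None


Subject = Union[IRI, BlankNode]
Object = Union[IRI, BlankNode, Literal]
Triple = Tuple[Subject, IRI, Object]

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"
BOOKMARK = "http://www.w3.org/2002/01/bookmark#"

# Hypothetical bookmark IRI, standing in for <bookmark-iri>.
bm = IRI("https://pod.example/bookmarks#b1")

graph: Set[Triple] = {
    (bm, IRI(RDF + "type"), IRI(BOOKMARK + "Bookmark")),
    (bm, IRI(DC + "title"), Literal("The article title")),
    (bm, IRI(BOOKMARK + "recalls"), IRI("https://example.com/article")),
    (bm, IRI(BOOKMARK + "hasTopic"), IRI("https://pod.example/bookmarks#topic-rust")),
}


def describe(subject, triples):
    """Return every (predicate, object) pair stated about one subject."""
    return {(p, o) for s, p, o in triples if s == subject}
```

Calling describe(bm, graph) returns the four predicate/object pairs that together make up the bookmark.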

IRIs

RDF uses IRIs (Internationalized Resource Identifiers) rather than URIs, so identifiers can contain non-ASCII characters. In practice most IRIs are HTTP(S) URIs, which means they can be dereferenced with a regular HTTP GET. Dereferencing is what turns linked data from "structured data with names" into "data on the web that can be navigated": a client follows an IRI and gets back a description of the resource it names, typically RDF in some serialisation.
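
As a rough sketch of what dereferencing looks like from the client side, the function below prepares an HTTP GET that asks for Turtle or JSON-LD through the Accept header. It assumes the server supports content negotiation, which the text above does not state; the function name and the pod.example IRI are hypothetical. The request is built but deliberately not sent.

```python
from urllib.request import Request


def build_dereference_request(iri: str) -> Request:
    """Prepare (but do not send) a GET asking for an RDF serialisation."""
    # The fragment (#b1) names a thing inside the document and is never
    # sent to the server, so the request targets the document itself.
    document_url = iri.split("#", 1)[0]
    return Request(
        document_url,
        headers={"Accept": "text/turtle, application/ld+json;q=0.9"},
        method="GET",
    )


req = build_dereference_request("https://pod.example/bookmarks#b1")
```

Passing req to urllib.request.urlopen would perform the lookup; the response body would then be handed to an RDF parser.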

Vocabularies and ontologies

A vocabulary defines a set of classes (types of thing) and properties (relationships between things). Vocabularies are themselves published as RDF documents at well-known IRIs. A few cover most needs:

RDF + RDFS — the bootstrap layer that defines rdf:type, rdfs:Class, rdfs:label, rdfs:subClassOf.
OWL — extends RDFS with richer class and property axioms (cardinality, equivalence, restrictions).
Dublin Core (dc: / dct:) — cross-cutting metadata: title, creator, description, date.
FOAF — people, accounts, and their relationships.
schema.org — the vocabulary used for structured-data markup on web pages.
SKOS — thesauri, taxonomies, and classification systems (concepts, broader/narrower, related).
W3C Bookmark (bookmark:) — bookmark:Bookmark, bookmark:Topic, bookmark:recalls, bookmark:hasTopic. mnera.io uses this vocabulary.

Most documents mix vocabularies. A bookmark file uses W3C Bookmark for the type system, Dublin Core for title and description, and schema.org for the cover-image IRI. Triples from different vocabularies coexist in the same graph because every predicate is itself a fully-qualified IRI.
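
The short prefixed names used throughout (dc:title, bookmark:recalls) are only abbreviations for full IRIs, which is why vocabularies can be mixed freely. The sketch below shows the expansion. The namespace IRIs are the commonly published ones; the expand helper and the choice of schema:image for the cover image are assumptions for illustration.

```python
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "schema": "https://schema.org/",
    "bookmark": "http://www.w3.org/2002/01/bookmark#",
}


def expand(name: str, prefixes: dict = PREFIXES) -> str:
    """Turn a prefixed name such as 'dc:title' into a full IRI."""
    prefix, sep, local = name.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"unknown prefix in {name!r}")
    return prefixes[prefix] + local


# Three vocabularies describing one bookmark, each predicate a distinct IRI.
predicates = [expand(n) for n in ("bookmark:recalls", "dc:title", "schema:image")]
```

Because each expanded predicate is globally unique, two vocabularies that both define a "title" property never collide.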

Graphs and datasets

A set of triples is a graph. A graph can be named with its own IRI to track provenance — useful when triples come from multiple sources and need to be queried or updated independently. An RDF dataset (RDF 1.1) is a default graph plus zero or more named graphs.
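
A dataset can be pictured as a mapping from graph name to a set of triples. The sketch below is a simplified in-memory model, not how any particular triple store is implemented; the helper names, graph IRIs, and sample data are hypothetical, and terms are written as plain strings for brevity.

```python
DEFAULT_GRAPH = None  # key used here for the unnamed default graph


def build_dataset(quads):
    """Group (subject, predicate, object, graph_name) quads by graph."""
    dataset = {DEFAULT_GRAPH: set()}  # the default graph always exists
    for s, p, o, graph_name in quads:
        dataset.setdefault(graph_name, set()).add((s, p, o))
    return dataset


def drop_graph(dataset, graph_name):
    """Remove one named graph without touching triples from other sources."""
    return {name: triples for name, triples in dataset.items() if name != graph_name}


quads = [
    ("ex:b1", "rdf:type", "bookmark:Bookmark", DEFAULT_GRAPH),
    ("ex:b1", "dc:title", "Ownership in Rust", "https://pod.example/bookmarks.ttl"),
    ("ex:b1", "dc:title", "Rust ownership explained", "https://other.example/import.ttl"),
]
dataset = build_dataset(quads)
without_import = drop_graph(dataset, "https://other.example/import.ttl")
```

Keeping the imported title in its own named graph means it can be discarded later without affecting the title stored in the pod's own graph.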

Serialisations

RDF is an abstract model; the same triples can be written in several concrete formats:

Turtle (.ttl): compact, human-readable. mnera.io stores bookmarks as Turtle.
JSON-LD (.jsonld): JSON with a @context mapping keys to predicates. The most widely used format for embedding schema.org markup in HTML.
N-Triples (.nt): one triple per line with fully expanded IRIs. Trivial to stream-parse.
TriG, N-Quads: extensions of Turtle and N-Triples that carry named graphs.
RDF/XML: the original XML-based serialisation, first standardised in 1999 and revised in 2004; still encountered in legacy systems.

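Conversion between serialisations preserves the triples themselves. Surface details such as prefix declarations, comments, and statement order are not kept, and named graphs survive only in formats that can carry them (TriG, N-Quads, JSON-LD). Tools usually accept any input and emit one preferred format.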
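
To illustrate that a serialisation is only a way of writing down a set of triples, here is a minimal N-Triples writer. It is a sketch under simplifying assumptions: an IRI is passed as a plain string, a literal as a (text, language) tuple, and blank nodes and datatyped literals are left out. The function names and sample data are hypothetical.

```python
def nt_term(term):
    """Render one term: a str is an IRI, a (text, lang) tuple is a literal."""
    if isinstance(term, str):
        return f"<{term}>"
    text, lang = term
    escaped = (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"' + (f"@{lang}" if lang else "")


def to_ntriples(triples):
    """Write one triple per line; sorting makes the output order-independent."""
    lines = [f"{nt_term(s)} {nt_term(p)} {nt_term(o)} ." for s, p, o in triples]
    return "\n".join(sorted(lines)) + "\n"


triples = [
    ("https://pod.example/bookmarks#b1",
     "http://purl.org/dc/elements/1.1/title",
     ('The "quoted" title', "en")),
    ("https://pod.example/bookmarks#b1",
     "http://www.w3.org/2002/01/bookmark#recalls",
     "https://example.com/article"),
]
output = to_ntriples(triples)
```

Feeding the same triples in a different order produces byte-identical output, which is the sense in which the triples, not the file layout, are the data.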

Querying

SPARQL is the W3C query language for RDF. The model is graph pattern matching rather than relational join: a query describes a shape of triples to find, with variables in any position. "Every bookmark whose topic is a sub-topic of Programming, returning its title" is a few lines of SPARQL against a triple store.

A SPARQL query can run against a remote endpoint over HTTP, or against an in-memory store. mnera.io takes the second route: the bookmark file is loaded into a client-side triple store and queried locally.
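
The pattern-matching idea can be sketched without a SPARQL engine. The toy matcher below treats any term starting with "?" as a variable and finds every binding that satisfies all patterns at once. It is an illustration of basic graph pattern matching only, not how mnera.io or any real store evaluates queries. The sample data, the use of prefixed names as plain strings, and the bookmark:subTopicOf property linking topics are assumptions.

```python
def is_var(term):
    return isinstance(term, str) and term.startswith("?")


def match(pattern, triples, binding=None):
    """Yield variable bindings for which every pattern triple is in `triples`."""
    binding = binding or {}
    if not pattern:
        yield binding
        return
    first, rest = pattern[0], pattern[1:]
    for triple in triples:
        candidate = dict(binding)
        for want, have in zip(first, triple):
            if is_var(want):
                # Bind the variable, or reject if it is already bound elsewhere.
                if candidate.setdefault(want, have) != have:
                    break
            elif want != have:
                break
        else:
            yield from match(rest, triples, candidate)


data = {
    ("b1", "rdf:type", "bookmark:Bookmark"),
    ("b1", "dc:title", "Ownership in Rust"),
    ("b1", "bookmark:hasTopic", "topic-rust"),
    ("topic-rust", "bookmark:subTopicOf", "topic-programming"),
    ("b2", "rdf:type", "bookmark:Bookmark"),
    ("b2", "dc:title", "Sourdough basics"),
    ("b2", "bookmark:hasTopic", "topic-baking"),
    ("topic-baking", "bookmark:subTopicOf", "topic-cooking"),
}

# Roughly: SELECT ?title WHERE { ?b a bookmark:Bookmark ; bookmark:hasTopic ?t ;
#          dc:title ?title . ?t bookmark:subTopicOf <topic-programming> . }
pattern = [
    ("?b", "rdf:type", "bookmark:Bookmark"),
    ("?b", "bookmark:hasTopic", "?t"),
    ("?t", "bookmark:subTopicOf", "topic-programming"),
    ("?b", "dc:title", "?title"),
]
titles = sorted(b["?title"] for b in match(pattern, data))
```

This sketch follows only direct sub-topic links; in SPARQL itself a property path could follow the relationship through any number of levels.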

Linked data in a Solid pod

A Solid pod is a personal data store that exposes its resources, most of them RDF, over HTTP. Each resource, typically a Turtle document, is identified by its own IRI. Triples in one document can reference subjects in another document, in the same pod or a different pod. A Solid app authenticates on behalf of a user identified by a WebID, requests the resources it needs, parses the triples, and writes back. The data layer is linked data; the access-control, identity, and notification layers are what make Solid distinct.
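
Because triples can point into other documents, a client often has to work out which further resources to request. The sketch below does only that bookkeeping step, with no authentication or network access. The function name and the alice.pod.example and bob.pod.example URLs are hypothetical.

```python
from urllib.parse import urldefrag, urljoin


def documents_to_fetch(current_doc, object_iris):
    """Return the other documents referenced from `current_doc`, sorted."""
    docs = set()
    for iri in object_iris:
        absolute = urljoin(current_doc, iri)  # resolve relative references
        doc, _fragment = urldefrag(absolute)  # the document that holds the triples
        if doc != current_doc:
            docs.add(doc)
    return sorted(docs)


current = "https://alice.pod.example/bookmarks/index.ttl"
linked = documents_to_fetch(
    current,
    [
        "#topic-rust",                             # same document
        "topics.ttl#rust",                         # another document, same pod
        "https://bob.pod.example/profile/card#me"  # a different pod
    ],
)
```

Each returned URL would then be requested separately, subject to whatever access the owning pod grants.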

A working definition: Linked data is structured data described as triples in a published vocabulary, identified by IRIs that can be dereferenced over HTTP.

Bookmarks stored as linked data

mnera.io stores bookmarks as W3C-standard Turtle in your Solid pod — open vocabularies, no proprietary schema, readable by any Solid-aware app.

