RDF Core Specification Overview

Graham Klyne, 19-Mar-2002

[[[Work in progress]]]

S1. Introduction

[From existing primer?]

S2. RDF overview

RDF is a data format for representing metadata about Web resources, and other information. It uses well-established ideas from the Knowledge Representation branch of Artificial Intelligence, with recognizable relationships to Conceptual Graphs, Frames, and logic-based and relational-database forms of knowledge representation [I3,I4,I5,I6,I7]. (John Sowa [I3] argues compellingly that these are pretty much equivalent for the purposes of knowledge representation.)

RDF builds on XML, which provides a syntactic framework for representing documents and other information. RDF itself has a simple graph-based data model and a formal semantics with a rigorously defined notion of entailment, which in turn provides a basis for well-founded deductions in RDF data.

The real value of RDF comes not from any single "killer" application, but from sharing data between applications. The value of information thus increases as it becomes accessible to more and more applications across the entire Internet.

S2.1 Motivation

The design of RDF has been motivated by the following uses, among others:

Web metadata: providing information about web resources and the systems that use them (e.g. content rating, capability descriptions, privacy preferences, etc.)
Applications that require open rather than constrained information formats (e.g. scheduling activities, describing organizational processes, annotation of Web resources, etc.)
To do for machine processable information (application data) what the World Wide Web has done for hypertext: to allow data to be processed outside the particular environment in which it was created, in a fashion that can work at Internet scale.
Interworking between applications: combining data from several applications to deduce new information.
Automated processing of Web information by software agents: the Web is moving from having just human-readable information to being a world-wide network of cooperating processes. RDF provides a world-wide lingua franca for these processes.

S2.2 Design goals

The design of RDF is intended to meet the following specific goals:

A simple data model
Formal semantics and well-founded inference
Extensible URI-based vocabulary
XML-based syntax
Use XML schema datatypes
Anyone can say anything about anything
Universal expression of ground facts
A basis for legally binding agreements

S2.2.1 A simple data model

RDF has a simple data model that is easy for applications to process and manipulate. The data model is independent of any specific serialization syntax.

NOTE: the term "model" used here in "data model" has a completely different sense to its use in the term "model theory". See the RDF model theory specification [N1] or a textbook on logical semantics (e.g. [I8]) for more information about what logicians call "model theory".

S2.2.2 Formal semantics and well-founded inference

RDF has a formal semantics, in the form of a model theory, which provides a formal basis for reasoning about the meaning of an RDF expression. In particular, it supports a rigorously defined notion of entailment which provides a basis for defining reliable rules of inference in RDF data.
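As a minimal sketch of what entailment means in the simplest case, assume that graphs are held as Python sets of (subject, predicate, object) tuples, that every node is labelled, and that no reserved vocabulary is given special meaning. Under those assumptions a graph entails each of its subsets, so the check reduces to a subset test. The function name `entails_ground` and the example.org URIrefs are illustrative only and do not come from the RDF specifications.

```python
# Hypothetical illustration: a graph is a set of (subject, predicate, object) tuples.
# Assumption: all nodes are labelled and no reserved vocabulary is interpreted,
# so "g entails h" is checked here simply as "h is a subset of g".

def entails_ground(g, h):
    """Return True if every statement in h also appears in g."""
    return h <= g

g = {
    ("http://example.org/doc", "http://example.org/terms#author", "Alice"),
    ("http://example.org/doc", "http://example.org/terms#title", "Overview"),
}
h = {("http://example.org/doc", "http://example.org/terms#author", "Alice")}

print(entails_ground(g, h))  # g says everything h says
print(entails_ground(h, g))  # h says less than g
```

The model theory [N1] covers the general case, including unlabelled nodes and the schema vocabulary, which this sketch deliberately ignores.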

S2.2.3 Extensible URI-based vocabulary

The vocabulary is fully extensible, being based on URIs with optional fragment identifiers (URIrefs). URIrefs are used for naming all kinds of things in RDF data. The only other kind of label that appears in RDF data is a literal string.

S2.2.4 XML-based syntax

RDF has an XML-based serialization form which, if used appropriately, allows a wide range of "ordinary" XML data to be interpreted as RDF [I9].

S2.2.5 Use XML schema datatypes

RDF can be used with XML schema datatypes, thus assisting the exchange of information between RDF and other XML applications.

S2.2.6 Anyone can say anything about anything

To allow operation at Internet scale, RDF is an open-world framework that allows anyone to say anything about anything. In general, it is not assumed that all information about any topic is available. A consequence of this is that RDF cannot prevent anyone from making nonsensical or inconsistent assertions, and applications that build upon RDF must find ways to deal with conflicting sources of information. (This is where RDF departs from the XML approach to data representation, which is generally quite prescriptive and aims to present an application with information that is well-formed and complete for the application's needs.)

S2.2.7 Universal expression of ground facts

Through its use of extensible URI-based vocabularies, RDF aims to provide for universal expression of ground facts; i.e. assertions of specific properties about specific named things.

RDF itself does not provide the machinery of inference, but provides the raw data upon which such machinery can operate. Other work is looking for ways to build more expressive languages on the basic capabilities of the RDF core language.

S2.2.8 A basis for legally binding agreements

RDF is intended to convey assertions that are meaningful to the extent that they may, in appropriate contexts, be used to express the terms of binding agreements.

... more words? legal frameworks, etc? ...

S2.3 Key concepts

RDF uses the following key concepts:

Graph data model
URI-based vocabulary
XML serialization syntax

S2.3.1 Graph data model

The underlying structure of any RDF expression is a directed labelled graph, which consists of labelled nodes and labelled directed arcs that link pairs of nodes. The formal semantics for RDF is defined in terms of this graph syntax. An RDF expression is sometimes called an RDF graph. The graph can conveniently be represented as a set of triples, where each triple contains two node labels and an arc label.

[[[picture of node --arc--> node]]]

The nodes and arcs of the graph carry labels that indicate what they denote. Each arc corresponds to an assertion of a relationship between the nodes that it links. The meaning of an RDF expression is the conjunction (i.e. logical AND) of all the statements that it contains.
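The triple view of a graph can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a prescribed data structure: the `Triple` type, the `EX` prefix and all example.org names are made up, every node is assumed to be labelled, and labels are held as plain strings.

```python
# Hypothetical illustration; the Triple type and all example.org names are made up.
from typing import NamedTuple

class Triple(NamedTuple):
    subject: str    # label of the node the arc leaves
    predicate: str  # label of the arc
    obj: str        # label of the node the arc enters

EX = "http://example.org/terms#"

graph_a = {
    Triple("http://example.org/doc", EX + "author", "http://example.org/people#alice"),
}
graph_b = {
    Triple("http://example.org/people#alice", EX + "name", "Alice"),
}

# A graph asserts the conjunction of its statements, so asserting two graphs
# together is modelled here as the union of their triple sets.
# Assumption: every node is labelled; unlabelled nodes would need extra care.
merged = graph_a | graph_b

for t in sorted(merged):
    print(t.subject, "--", t.predicate, "-->", t.obj)
```

Using a set means that repeating a statement adds nothing, which matches the reading of a graph as a logical AND of its statements.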

S2.3.2 URI-based vocabulary

Nodes and arcs in an RDF graph are labelled with URIs with optional fragment identifiers (URIrefs). (Nodes may also be labelled with literal strings, or nothing at all.)

The label on a node indicates what that node is meant to represent. The label on an arc names the relationship that is asserted to hold between the nodes connected by that arc. Some URIrefs may indicate web resources, and a node thus labelled is presumed to denote that resource. Other URIrefs may represent abstract ideas or values rather than a retrievable Web resource. RDF thus leverages the universal naming space of URIs [I10].
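The three kinds of node described above (URIref-labelled, literal-labelled and unlabelled) can be kept apart with distinct types. The sketch below is one possible modelling, not something the specifications define: the class names `URIRef`, `Literal` and `Blank`, the helper `split_uriref` and the example.org URIref are all assumptions made for illustration.

```python
# Hypothetical illustration; class and function names are not from the RDF specs.
from dataclasses import dataclass
from urllib.parse import urldefrag

@dataclass(frozen=True)
class URIRef:
    value: str      # a URI with an optional fragment identifier

@dataclass(frozen=True)
class Literal:
    value: str      # a literal string label

@dataclass(frozen=True)
class Blank:
    local_id: str   # bookkeeping handle for an unlabelled node; not a label

def split_uriref(ref):
    """Split a URIref into its URI part and its (possibly empty) fragment."""
    uri, fragment = urldefrag(ref.value)
    return uri, fragment

alice = URIRef("http://example.org/people#alice")
print(split_uriref(alice))

# Distinct types keep a literal string apart from a URIref with the same text.
print(URIRef("http://example.org/doc") == Literal("http://example.org/doc"))
```

Keeping literals and URIrefs as separate types avoids confusing the string "http://example.org/doc" with the resource that the URIref denotes.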

S2.3.3 XML serialization syntax

RDF has a specific serialization syntax based on XML. There are several ways in which a given RDF graph can be represented in XML: these various forms allow RDF to be represented in ways that are amenable to specific XML applications. In this way, XML application data can easily be designed to be accessible to generic RDF processors [I11].

Other syntaxes for RDF graphs are possible (e.g. [I12]), but only the XML syntax is normatively specified and recommended for use to exchange information between Internet applications.
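To give a feel for the correspondence between the XML form and graph triples, the sketch below reads a very small RDF/XML fragment with Python's standard `xml.etree.ElementTree` module. It is not a conforming RDF/XML parser: it assumes only `rdf:Description` elements with an `rdf:about` attribute, whose children are property elements carrying either text or an `rdf:resource` attribute. The function name `simple_triples` and the example.org vocabulary are made up for illustration.

```python
# Hypothetical illustration: handles only a tiny subset of RDF/XML.
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

DOC = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://example.org/terms#">
  <rdf:Description rdf:about="http://example.org/doc">
    <ex:title>Overview</ex:title>
    <ex:author rdf:resource="http://example.org/people#alice"/>
  </rdf:Description>
</rdf:RDF>
"""

def simple_triples(xml_text):
    """Extract (subject, predicate, object) tuples from the subset used in DOC."""
    root = ET.fromstring(xml_text)
    triples = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree writes qualified names as "{namespace}local";
            # joining the two parts gives the property URIref.
            namespace, local = prop.tag[1:].split("}")
            resource = prop.get(f"{{{RDF_NS}}}resource")
            # An rdf:resource attribute names the object node;
            # otherwise the element text is taken as a literal.
            obj = resource if resource is not None else (prop.text or "")
            triples.append((subject, namespace + local, obj))
    return triples

for triple in simple_triples(DOC):
    print(triple)
```

The XML syntax document [N2] defines the full mapping; real applications would normally rely on an RDF parser rather than hand-written code like this.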

S3. RDF specification

The RDF specification consists of:

Graph syntax and model theory (normative) [N1]
XML syntax (normative) [N2]
Schema and datatypes (normative) [N3]
Test cases (non-normative) [I1]
Primer (non-normative) [I2]

S3.1 Graph syntax and model theory

The RDF abstract graph syntax and model theory [N1] are at the heart of the RDF specifications. This document specifies the essential elements of RDF abstract syntax, and the associated model theoretic semantics. The syntax is specified in terms of a directed labelled graph and an equivalent representation as <subject,predicate,object> triples. Also given are entailment lemmas and their proofs. The entailment lemmas form the basis of RDF-based deduction.

Building on the core language and semantics, this specification also calls out the RDF reserved vocabulary (URIrefs) for RDF schema and RDF datatyping, also with model theoretic semantics, entailment lemmas and proofs.

This document contains a fair amount of formal mathematical content, necessary to meet some of the stated goals for RDF. Because RDF is such a simple language, the document actually serves as a quite accessible introduction to formal semantics. Developers whose sole concern is to write software that processes RDF may prefer to work from the XML syntax and RDF schema specifications, referring to this formal semantics specification to resolve occasional questions about validity of deductions.

S3.2 XML syntax

The RDF XML Syntax document [N2] defines the XML serialized forms for RDF graphs. The XML syntax is described in terms of the XML infoset, and its correspondence to RDF graph triples.

This document, together with the model theory, provides a formal definition of all of the RDF core language.

S3.3 RDF schema and datatypes

The RDF Schema and Datatypes document [N3] introduces the RDF schema and datatype vocabularies, which are used to describe the classes and types of the things that an RDF vocabulary talks about. The essential information in this document is covered formally in the model theory; this specification provides a less formal account of these features of RDF.

S3.4 Test cases

The RDF Test Cases document [I1] supplements the normative RDF description set with specific examples of XML syntax and the corresponding RDF graph triples. To achieve this, it introduces a particular syntax for RDF graph triples, a very much simplified variant of Notation 3 [I12], which is used to describe RDF graphs in a very direct and intuitive fashion. The test cases themselves are also published in machine-readable form at Web locations referenced by this document, so developers may use these as the basis for some automated testing of RDF software.
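A line-per-triple notation of this kind is easy to produce. The sketch below assumes an N3-style layout in which URIrefs are written in angle brackets, literals in double quotes, and each statement ends with a full stop; the exact syntax is whatever the Test Cases document [I1] defines, so treat the details here as an assumption. The function name `to_line` and the example.org names are made up for illustration.

```python
# Hypothetical illustration; the exact triple syntax is defined in [I1], not here.

def to_line(subject, predicate, obj, obj_is_literal=False):
    """Format one triple as a single line in an assumed N3-style notation."""
    if obj_is_literal:
        # Escape backslashes first, then double quotes, inside the literal.
        escaped = obj.replace("\\", "\\\\").replace('"', '\\"')
        obj_text = f'"{escaped}"'
    else:
        obj_text = f"<{obj}>"
    return f"<{subject}> <{predicate}> {obj_text} ."

print(to_line("http://example.org/doc",
              "http://example.org/terms#title",
              "Overview",
              obj_is_literal=True))
print(to_line("http://example.org/doc",
              "http://example.org/terms#author",
              "http://example.org/people#alice"))
```

One statement per line makes expected results easy to compare textually, which suits automated testing.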

The test cases document also contains a number of entailment tests, which indicate entailments that applications are licensed by the RDF specification to use as the basis of deductions in RDF data. Many of these entailments relate to inferences that can be drawn from RDF schema and RDF datatyping information.
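One familiar schema-based inference is that an instance of a class is also an instance of each of its superclasses. The sketch below applies just that one rule, repeatedly, to a graph held as a set of (subject, predicate, object) tuples. It is an illustration and not a complete reasoner; the function name `subclass_closure` and the example.org names are assumptions, while the two reserved URIrefs are the usual `rdf:type` and `rdfs:subClassOf`.

```python
# Hypothetical illustration: applies only the "type follows subClassOf" rule.

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_SUBCLASSOF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"

def subclass_closure(graph):
    """Return graph plus every (x, rdf:type, D) implied by subClassOf arcs."""
    inferred = set(graph)
    changed = True
    while changed:                      # repeat until no new triple appears
        changed = False
        snapshot = list(inferred)
        for (x, p, c) in snapshot:
            if p != RDF_TYPE:
                continue
            for (c2, q, d) in snapshot:
                if q == RDFS_SUBCLASSOF and c2 == c:
                    new = (x, RDF_TYPE, d)
                    if new not in inferred:
                        inferred.add(new)
                        changed = True
    return inferred

EX = "http://example.org/zoo#"
graph = {
    (EX + "rex", RDF_TYPE, EX + "Dog"),
    (EX + "Dog", RDFS_SUBCLASSOF, EX + "Mammal"),
    (EX + "Mammal", RDFS_SUBCLASSOF, EX + "Animal"),
}

closure = subclass_closure(graph)
print((EX + "rex", RDF_TYPE, EX + "Animal") in closure)
```

An entailment test can then be read as a check that the expected conclusion appears in the closure of the premises.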

The test cases are not a complete specification of RDF, and are not intended to take precedence over the normative specification documents. However, they are intended to illustrate the intent of the working group with respect to the design of RDF, and developers may find these helpful should the specification wording be unclear on any point of detail.

S3.5 Primer

The RDF Primer [I2] serves two purposes:

it provides a tutorial introduction to RDF, and
it offers advice, particularly to RDF information designers, about how RDF is expected to be used to represent different kinds of information.

S4. References

S4.1 Normative references

[N1] Graph syntax and model theory

[N2] XML syntax

[N3] RDF schema and datatypes

S4.2 Informative references

[I1] RDF Test Cases

[I2] RDF Primer

[I3] John Sowa, Knowledge Representation, ...

[I4] Conceptual Graphs (spec)

[I5] Luger and Stubblefield, Artificial Intelligence

[I6] Pat Hayes... in defense of logic

[I7] Peter Gray, Logic, Algebra and Databases,...

[I8] Geoffrey Hunter, Metalogic ...

[I9] Dan Brickley, Striped XML syntax for RDF, ...

[I10] Tim Berners-Lee, (DesignIssues note on universal naming with URIs)

[I11] (example of generic XML data that is RDF compatible - draft-klyne-xxx-rfc822-xml-xxx is one...)

[I12] Tim Berners-Lee, DesignIssues note on N3, ...

