rdf-tabular

Tabular Data RDF Reader and JSON serializer

CSV reader for RDF.rb and full JSON serializer.


Features

RDF::Tabular parses CSV or other Tabular Data into RDF and JSON using the W3C CSVW specifications, currently undergoing development.

Installation

Install with gem install rdf-tabular

Description

RDF::Tabular parses CSVs, TSVs, and potentially other tabular data formats. Using rules defined for W3C CSVW, it can also parse metadata files (in JSON-LD format) to find a set of tabular data files, or locate a metadata file given a CSV.
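As a rough illustration of locating a metadata file from a CSV, the sketch below derives one conventional candidate location. The helper name `metadata_candidate` is hypothetical and not part of RDF::Tabular. The `-metadata.json` suffix is taken from the provenance output further down, where doap.csv is paired with doap.csv-metadata.json. Other lookup locations defined by the CSVW specifications are not covered.

```ruby
require "uri"

# Hypothetical helper (not the RDF::Tabular API): returns the location
# formed by appending "-metadata.json" to the CSV location.
def metadata_candidate(csv_url)
  uri = URI.parse(csv_url)
  uri.fragment = nil # a fragment such as #row=2 is not part of the file location
  "#{uri}-metadata.json"
end

puts metadata_candidate("etc/doap.csv")
puts metadata_candidate("http://example.org/data/doap.csv#row=2")
```

A reader would still need to check whether a file actually exists at the candidate location before using it.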

Metadata can then provide datatypes for the columns, express foreign key relationships, and associate subjects and predicates with columns. An example metadata file for the project DOAP description is:

{
  "@context": "http://www.w3.org/ns/csvw",
  "url": "doap.csv",
  "tableSchema": {
    "aboutUrl": "https://rubygems.org/gems/rdf-tabular",
    "propertyUrl": "http://usefulinc.com/ns/doap#{_name}",
    "null": "",
    "columns": [
      {"titles": "name"},
      {"titles": "type", "propertyUrl": "rdf:type", "valueUrl": "{+type}"},
      {"titles": "homepage", "valueUrl": "{+homepage}"},
      {"titles": "license", "valueUrl": "{+license}"},
      {"titles": "shortdesc", "lang": "en"},
      {"titles": "description", "lang": "en"},
      {"titles": "created", "datatype": {"base": "date", "format": "M/d/yyyy"}},
      {"titles": "programming_language", "propertyUrl": "http://usefulinc.com/ns/doap#programming-language"},
      {"titles": "implements", "valueUrl": "{+implements}"},
      {"titles": "category", "valueUrl": "{+category}"},
      {"titles": "download_page", "propertyUrl": "http://usefulinc.com/ns/doap#download-page", "valueUrl": "{+download_page}"},
      {"titles": "mailing_list", "propertyUrl": "http://usefulinc.com/ns/doap#mailing-list", "valueUrl": "{+mailing_list}"},
      {"titles": "bug_database", "propertyUrl": "http://usefulinc.com/ns/doap#bug-database", "valueUrl": "{+bug_database}"},
      {"titles": "blog", "valueUrl": "{+blog}"},
      {"titles": "developer", "valueUrl": "{+developer}"},
      {"titles": "maintainer", "valueUrl": "{+maintainer}"},
      {"titles": "documenter", "valueUrl": "{+documenter}"},
      {"titles": "maker", "propertyUrl": "foaf:maker", "valueUrl": "{+maker}"},
      {"titles": "dc_title", "propertyUrl": "dc:title"},
      {"titles": "dc_description", "propertyUrl": "dc:description", "lang": "en"},
      {"titles": "dc_date", "propertyUrl": "dc:date", "datatype": {"base": "date", "format": "M/d/yyyy"}},
      {"titles": "dc_creator", "propertyUrl": "dc:creator", "valueUrl": "{+dc_creator}"},
      {"titles": "isPartOf", "propertyUrl": "dc:isPartOf", "valueUrl": "{+isPartOf}"}
    ]
  }
}

This associates the metadata with the CSV doap.csv and creates a common subject for all rows in the file. It also defines a common predicate using the URI Template http://usefulinc.com/ns/doap#{_name}, which uses the name of each column (defaulted from titles) to construct a URI in the DOAP vocabulary, and constructs object URIs for object-valued properties from the contents of the column cells. In some cases, the predicate is changed on a per-column basis by setting a different propertyUrl on that column.
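The two cell-level conversions this metadata relies on, URI Template expansion and date parsing, can be sketched with the Ruby standard library alone. This is an illustration under stated assumptions, not how RDF::Tabular implements them. The helper names are hypothetical, only the `{var}` and `{+var}` template forms are handled, and the cell text "1/5/2015" is an assumed example of the M/d/yyyy format.

```ruby
require "date"

# Characters that simple expansion ({var}) leaves untouched.
UNRESERVED = /[A-Za-z0-9\-._~]/

# Hypothetical helper: expands {var} (percent-encoding the value) and
# {+var} (passing the value through unchanged, a simplification).
def expand_template(template, vars)
  template.gsub(/\{(\+?)(\w+)\}/) do
    reserved = !Regexp.last_match(1).empty?
    value = vars.fetch(Regexp.last_match(2)).to_s
    next value if reserved

    value.each_char.map { |c|
      c.match?(UNRESERVED) ? c : c.bytes.map { |b| format("%%%02X", b) }.join
    }.join
  end
end

# Hypothetical helper: reads a cell written as M/d/yyyy and returns the
# xsd:date lexical form.
def cell_to_xsd_date(cell)
  Date.strptime(cell, "%m/%d/%Y").iso8601
end

# Single quotes keep Ruby from interpolating "#{_name}" itself.
predicate = expand_template('http://usefulinc.com/ns/doap#{_name}', "_name" => "homepage")
object    = expand_template('{+homepage}', "homepage" => "http://ruby-rdf.github.com/rdf-tabular")

puts predicate
puts object
puts cell_to_xsd_date("1/5/2015")
```

Under the default template, the programming_language column would yield a predicate ending in programming_language, which is why the metadata overrides propertyUrl for that column to point at doap#programming-language.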

This results in the following Turtle:

@prefix csvw: <http://www.w3.org/ns/csvw#> .
@prefix dc: <http://purl.org/dc/terms/> .
@prefix doap: <http://usefulinc.com/ns/doap#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<https://rubygems.org/gems/rdf-tabular> a doap:Project,
     <http://www.w3.org/ns/earl#TestSubject>,
     <http://www.w3.org/ns/earl#Software>;
   dc:title "RDF::Tabular";
   dc:creator <http://greggkellogg.net/foaf#me>;
   dc:date "2015-01-05"^^xsd:date;
   dc:description "RDF::Tabular processes tabular data with metadata creating RDF or JSON output."@en;
   dc:isPartOf <https://rubygems.org/gems/rdf>;
   doap:blog <http://greggkellogg.net/>;
   doap:bug-database <https://github.com/ruby-rdf/rdf-tabular/issues>;
   doap:category <http://dbpedia.org/resource/Resource_Description_Framework>,
     <http://dbpedia.org/resource/Ruby_(programming_language)>;
   doap:created "2015-01-05"^^xsd:date;
   doap:description "RDF::Tabular processes tabular data with metadata creating RDF or JSON output."@en;
   doap:developer <http://greggkellogg.net/foaf#me>;
   doap:documenter <http://greggkellogg.net/foaf#me>;
   doap:download-page <https://rubygems.org/gems/rdf-tabular>;
   doap:homepage <http://ruby-rdf.github.com/rdf-tabular>;
   doap:implements <http://www.w3.org/TR/tabular-data-model/>,
     <http://www.w3.org/TR/tabular-metadata/>,
     <http://www.w3.org/TR/csv2rdf/>,
     <http://www.w3.org/TR/csv2json/>;
   doap:license <https://unlicense.org/1.0/>;
   doap:mailing-list <http://lists.w3.org/Archives/Public/public-rdf-ruby/>;
   doap:maintainer <http://greggkellogg.net/foaf#me>;
   doap:name "RDF::Tabular";
   doap:programming-language "Ruby";
   doap:shortdesc "Tabular Data RDF Reader and JSON serializer."@en;
   foaf:maker <http://greggkellogg.net/foaf#me> .

 [
    a csvw:TableGroup;
    csvw:table [
      a csvw:Table;
      csvw:row [
        a csvw:Row;
        csvw:describes <https://rubygems.org/gems/rdf-tabular>;
        csvw:rownum 1;
        csvw:url <file://users/gregg/Projects/rdf-tabular/etc/doap.csv#row=2>
      ],  [
        a csvw:Row;
        csvw:describes <https://rubygems.org/gems/rdf-tabular>;
        csvw:rownum 2;
        csvw:url <file://users/gregg/Projects/rdf-tabular/etc/doap.csv#row=3>
      ],  [
        a csvw:Row;
        csvw:describes <https://rubygems.org/gems/rdf-tabular>;
        csvw:rownum 3;
        csvw:url <file://users/gregg/Projects/rdf-tabular/etc/doap.csv#row=4>
      ],  [
        a csvw:Row;
        csvw:describes <https://rubygems.org/gems/rdf-tabular>;
        csvw:rownum 4;
        csvw:url <file://users/gregg/Projects/rdf-tabular/etc/doap.csv#row=5>
      ];
      csvw:url <file://users/gregg/Projects/rdf-tabular/etc/doap.csv>
    ];
    prov:wasGeneratedBy [
      a prov:Activity;
      prov:endedAtTime "2015-04-11T12:33:26Z"^^xsd:dateTime;
      prov:qualifiedUsage [
        a prov:Usage;
        prov:entity <file://users/gregg/Projects/rdf-tabular/etc/doap.csv>;
        prov:hadRole csvw:csvEncodedTabularData
      ],  [
        a prov:Usage;
        prov:entity <file://users/gregg/Projects/rdf-tabular/etc/doap.csv-metadata.json>;
        prov:hadRole csvw:tabularMetadata
      ];
      prov:startedAtTime "2015-04-11T12:33:25Z"^^xsd:dateTime;
      prov:wasAssociatedWith <https://rubygems.org/gems/rdf-tabular>
    ]
 ] .
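In the output above, each csvw:Row pairs a csvw:rownum with a csvw:url whose `#row=` fragment is one greater, because the fragment counts rows in the source file, including the header row. A minimal sketch of that relationship follows. The helper name and the `header_rows` parameter are assumptions for illustration only.

```ruby
# Hypothetical helper: builds a row URL from the table URL and the row number.
# header_rows defaults to 1, matching the doap.csv example above.
def row_url(table_url, rownum, header_rows: 1)
  "#{table_url}#row=#{rownum + header_rows}"
end

table = "file://users/gregg/Projects/rdf-tabular/etc/doap.csv"
(1..4).each { |n| puts row_url(table, n) }
```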

The provenance and table-source information can be excluded by passing the :minimal option to the reader.

It can also generate JSON output (not complete JSON-LD, but compatible with it), using the {RDF::Tabular::Reader#to_json} method:

{
  "table": [
    {
      "url": "file://users/gregg/Projects/rdf-tabular/etc/doap.csv",
      "row": [
        {
          "url": "file://users/gregg/Projects/rdf-tabular/etc/doap.csv#row=2",
          "rownum": 1,
          "describes": [
            {
              "@id": "https://rubygems.org/gems/rdf-tabular",
              "http://usefulinc.com/ns/doap#name": "RDF::Tabular",
              "@type": "http://usefulinc.com/ns/doap#Project",
              "http://usefulinc.com/ns/doap#homepage": "http://ruby-rdf.github.com/rdf-tabular",
              "http://usefulinc.com/ns/doap#license": "https://unlicense.org/1.0/",
              "http://usefulinc.com/ns/doap#shortdesc": "Tabular Data RDF Reader and JSON serializer.",
              "http://usefulinc.com/ns/doap#description": "RDF::Tabular processes tabular data with metadata creating RDF or JSON output.",
              "http://usefulinc.com/ns/doap#created": "2015-01-05",
              "http://usefulinc.com/ns/doap#programming-language": "Ruby",
              "http://usefulinc.com/ns/doap#implements": "http://www.w3.org/TR/tabular-data-model/",
              "http://usefulinc.com/ns/doap#category": "http://dbpedia.org/resource/Resource_Description_Framework",
              "http://usefulinc.com/ns/doap#download-page": "https://rubygems.org/gems/rdf-tabular",
              "http://usefulinc.com/ns/doap#mailing-list": "http://lists.w3.org/Archives/Public/public-rdf-ruby/",
              "http://usefulinc.com/ns/doap#bug-database": "https://github.com/ruby-rdf/rdf-tabular/issues",
              "http://usefulinc.com/ns/doap#blog": "http://greggkellogg.net/",
              "http://usefulinc.com/ns/doap#developer": "http://greggkellogg.net/foaf#me",
              "http://usefulinc.com/ns/doap#maintainer": "http://greggkellogg.net/foaf#me",
              "http://usefulinc.com/ns/doap#documenter": "http://greggkellogg.net/foaf#me",
              "foaf:maker": "http://greggkellogg.net/foaf#me",
              "dc:title": "RDF::Tabular",
              "dc:description": "RDF::Tabular processes tabular data with metadata creating RDF or JSON output.",
              "dc:date": "2015-01-05",
              "dc:creator": "http://greggkellogg.net/foaf#me",
              "dc:isPartOf": "https://rubygems.org/gems/rdf"
            }
          ]
        },
        {
          "url": "file://users/gregg/Projects/rdf-tabular/etc/doap.csv#row=3",
          "rownum": 2,
          "describes": [
            {
              "@id": "https://rubygems.org/gems/rdf-tabular",
              "@type": "http://www.w3.org/ns/earl#TestSubject",
              "http://usefulinc.com/ns/doap#implements": "http://www.w3.org/TR/tabular-metadata/",
              "http://usefulinc.com/ns/doap#category": "http://dbpedia.org/resource/Ruby_(programming_language)"
            }
          ]
        },
        {
          "url": "file://users/gregg/Projects/rdf-tabular/etc/doap.csv#row=4",
          "rownum": 3,
          "describes": [
            {
              "@id": "https://rubygems.org/gems/rdf-tabular",
              "@type": "http://www.w3.org/ns/earl#Software",
              "http://usefulinc.com/ns/doap#implements": "http://www.w3.org/TR/csv2rdf/"
            }
          ]
        },
        {
          "url": "file://users/gregg/Projects/rdf-tabular/etc/doap.csv#row=5",
          "rownum": 4,
          "describes": [
            {
              "@id": "https://rubygems.org/gems/rdf-tabular",
              "http://usefulinc.com/ns/doap#implements": "http://www.w3.org/TR/csv2json/"
            }
          ]
        }
      ]
    }
  ]
}
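Because each row is reported separately, a multi-valued property such as doap:implements is spread across several row objects. The sketch below gathers those values using only Ruby's standard JSON library. The embedded document is an abridged copy of the output above, and `collect_property` is a hypothetical helper, not part of RDF::Tabular.

```ruby
require "json"

DOAP = "http://usefulinc.com/ns/doap#"

# Abridged copy of the output above: one table, four rows, doap:implements only.
output = JSON.parse(<<~'DOC')
  {
    "table": [{
      "url": "file://users/gregg/Projects/rdf-tabular/etc/doap.csv",
      "row": [
        {"rownum": 1, "describes": [{"@id": "https://rubygems.org/gems/rdf-tabular",
          "http://usefulinc.com/ns/doap#implements": "http://www.w3.org/TR/tabular-data-model/"}]},
        {"rownum": 2, "describes": [{"@id": "https://rubygems.org/gems/rdf-tabular",
          "http://usefulinc.com/ns/doap#implements": "http://www.w3.org/TR/tabular-metadata/"}]},
        {"rownum": 3, "describes": [{"@id": "https://rubygems.org/gems/rdf-tabular",
          "http://usefulinc.com/ns/doap#implements": "http://www.w3.org/TR/csv2rdf/"}]},
        {"rownum": 4, "describes": [{"@id": "https://rubygems.org/gems/rdf-tabular",
          "http://usefulinc.com/ns/doap#implements": "http://www.w3.org/TR/csv2json/"}]}
      ]
    }]
  }
DOC

# Hypothetical helper: collects every value of one property across all rows.
def collect_property(output, property)
  output.fetch("table").flat_map do |table|
    table.fetch("row").flat_map do |row|
      # Array() tolerates a missing property (nil) as well as a single value.
      row.fetch("describes").flat_map { |subject| Array(subject[property]) }
    end
  end
end

implements = collect_property(output, DOAP + "implements")
puts implements
```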

Tutorials

Command Line

When the linkeddata gem is installed, RDF.rb includes an rdf executable, which acts as a wrapper to perform a number of different operations on RDF files using available readers and writers, including RDF::Tabular. The command specific to RDF::Tabular is tabular-json, which appears in the example usage below.

To use RDF::Tabular-specific features, you must pass the --input-format tabular option to the rdf executable.

Other rdf commands and options treat CSV as a standard RDF format.

Example usage:

rdf serialize https://raw.githubusercontent.com/ruby-rdf/rdf-tabular/develop/etc/doap.csv \
  --output-format ttl
rdf tabular-json --input-format tabular https://raw.githubusercontent.com/ruby-rdf/rdf-tabular/develop/etc/doap.csv
rdf validate https://raw.githubusercontent.com/ruby-rdf/rdf-tabular/develop/etc/doap.csv --validate

Note that the --validate option must be used with the validate (or other) command to detect parse-time errors in addition to validating any resulting RDF triples.

RDF Reader

RDF::Tabular also acts as a normal RDF reader, using the standard RDF.rb Reader interface:

graph = RDF::Graph.load("etc/doap.csv", minimal: true)

Documentation

Full documentation is available on RubyDoc.

Principal Classes

Dependencies

Installation

The recommended installation method is via RubyGems. To install the latest official release of the RDF::Tabular gem, do:

% [sudo] gem install rdf-tabular

Mailing List

Author

Contributing

License

This is free and unencumbered public domain software. For more information, see https://unlicense.org/ or the accompanying {file:UNLICENSE} file.