Class: RDF::N3::Writer

Inherits:
Writer
  • Object
Includes:
Terminals, Util::Logger
Defined in:
lib/rdf/n3/writer.rb

Overview

A Notation-3 serialiser in Ruby

Note that the natural interface is to write a whole graph at a time. Writing individual statements or triples creates a graph to hold them, and that graph is serialized once writing finishes.

The writer records prefix definitions as they are needed and uses them both to emit @prefix declarations and to mint pnames (prefixed names).
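
The buffer-then-serialize behaviour described above can be pictured with a small stand-alone sketch. BufferingWriter and its methods are hypothetical names used only for illustration and are not part of RDF::N3; statements are modelled as plain three-element arrays of strings.

```ruby
require "stringio"

# Hypothetical illustration: collect statements first, serialize once at the end.
class BufferingWriter
  def initialize(output)
    @output = output
    @statements = []   # stands in for the writer's internal repository
  end

  # Accept one statement at a time; nothing is written yet.
  def <<(statement)
    @statements << statement
    self
  end

  # Group statements by subject and emit everything in one pass.
  def finish
    @statements.group_by(&:first).each do |subject, group|
      body = group.map { |_, pred, obj| "#{pred} #{obj}" }.join(";\n    ")
      @output.write("#{subject} #{body} .\n")
    end
  end
end

out = StringIO.new
writer = BufferingWriter.new(out)
writer << [":alice", "a", "foaf:Person"] << [":alice", "foaf:name", '"Alice"']
writer.finish
puts out.string
```

Because output is deferred until finish, statements sharing a subject can be grouped even when they arrive one at a time.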

Examples:

Obtaining an N3 writer class

RDF::Writer.for(:n3)         #=> RDF::N3::Writer
RDF::Writer.for("etc/test.n3")
RDF::Writer.for(file_name:      "etc/test.n3")
RDF::Writer.for(file_extension: "n3")
RDF::Writer.for(content_type:   "text/n3")

Serializing RDF graph into an N3 file

RDF::N3::Writer.open("etc/test.n3") do |writer|
  writer << graph
end

Serializing RDF statements into an N3 file

RDF::N3::Writer.open("etc/test.n3") do |writer|
  graph.each_statement do |statement|
    writer << statement
  end
end

Serializing RDF statements into an N3 string

RDF::N3::Writer.buffer do |writer|
  graph.each_statement do |statement|
    writer << statement
  end
end

Creating @base and @prefix definitions in output

RDF::N3::Writer.buffer(base_uri: "http://example.com/", prefixes: {
    nil => "http://example.com/ns#",
    foaf: "http://xmlns.com/foaf/0.1/"}
) do |writer|
  graph.each_statement do |statement|
    writer << statement
  end
end

Author:

Constant Summary

Constants included from Terminals

Terminals::ANON, Terminals::BASE, Terminals::BLANK_NODE_LABEL, Terminals::DECIMAL, Terminals::DOUBLE, Terminals::ECHAR, Terminals::ESCAPE_CHAR4, Terminals::ESCAPE_CHAR8, Terminals::EXPONENT, Terminals::INTEGER, Terminals::IPLSTART, Terminals::IRIREF, Terminals::IRI_RANGE, Terminals::LANGTAG, Terminals::PERCENT, Terminals::PLX, Terminals::PNAME_LN, Terminals::PNAME_NS, Terminals::PN_CHARS, Terminals::PN_CHARS_BASE, Terminals::PN_CHARS_BODY, Terminals::PN_CHARS_U, Terminals::PN_LOCAL, Terminals::PN_LOCAL_BODY, Terminals::PN_LOCAL_ESC, Terminals::PN_PREFIX, Terminals::PREFIX, Terminals::QUICK_VAR_NAME, Terminals::STRING_LITERAL_LONG_QUOTE, Terminals::STRING_LITERAL_LONG_SINGLE_QUOTE, Terminals::STRING_LITERAL_QUOTE, Terminals::STRING_LITERAL_SINGLE_QUOTE, Terminals::UCHAR, Terminals::U_CHARS1, Terminals::U_CHARS2, Terminals::WS

Instance Attribute Summary

Class Method Summary

Instance Method Summary

Constructor Details

#initialize(output = $stdout, **options) {|writer| ... } ⇒ Writer

Initializes the N3 writer instance.

Parameters:

  • output (IO, File) (defaults to: $stdout)

    the output stream

  • options (Hash{Symbol => Object})

    any additional options

Options Hash (**options):

  • :encoding (Encoding) — default: Encoding::UTF_8

    the encoding to use on the output stream (Ruby 1.9+)

  • :canonicalize (Boolean) — default: false

    whether to canonicalize literals when serializing

  • :prefixes (Hash) — default: Hash.new

    the prefix mappings to use (not supported by all writers)

  • :base_uri (#to_s) — default: nil

    the base URI to use when constructing relative URIs

  • :max_depth (Integer) — default: 3

    Maximum depth for recursively defining resources

  • :standard_prefixes (Boolean) — default: false

    Add standard prefixes to @prefixes, if necessary.

  • :default_namespace (String) — default: nil

    URI to use as the default namespace; equivalent to a :prefixes entry with a nil key

  • :unique_bnodes (Boolean) — default: false

    Use unique node identifiers. By default, the writer uses the identifier the node was originally initialized with (if any).

Yields:

  • (writer)

    self

Yield Parameters:

  • writer (RDF::Writer)

Yield Returns:

  • (void)


# File 'lib/rdf/n3/writer.rb', line 110

def initialize(output = $stdout, **options, &block)
  @repo = RDF::N3::Repository.new
  @uri_to_pname = {}
  @uri_to_prefix = {}
  super do
    if base_uri
      @uri_to_prefix[base_uri.to_s.end_with?('#', '/') ? base_uri : RDF::URI("#{base_uri}#")] = nil
    end
    reset
    if block_given?
      case block.arity
        when 0 then instance_eval(&block)
        else block.call(self)
      end
    end
  end
end
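
The block handling in the constructor (instance_eval for a block without parameters, a plain call otherwise) can be tried in isolation. MiniWriter and emit below are hypothetical stand-ins, not library classes; only the arity dispatch mirrors the listing above.

```ruby
# Hypothetical stand-in for a writer; only the block handling is modelled.
class MiniWriter
  attr_reader :lines

  def initialize(&block)
    @lines = []
    return unless block

    if block.arity.zero?
      instance_eval(&block)   # block runs with self set to the writer
    else
      block.call(self)        # block receives the writer as its argument
    end
  end

  def emit(line)
    @lines << line
  end
end

# Without a block parameter, methods are called on the writer implicitly.
implicit = MiniWriter.new { emit "no block parameter" }

# With a block parameter, the writer is passed in explicitly.
explicit = MiniWriter.new { |w| w.emit "explicit block parameter" }
```

The instance_eval form is shorter to write, but inside such a block self is the writer, so instance variables and private helpers of the calling object are not visible.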

Instance Attribute Details

#formula_names ⇒ Array<RDF::Node>

Returns formulae names.

Returns:

  • (Array<RDF::Node>)

    Formulae names



# File 'lib/rdf/n3/writer.rb', line 62

def formula_names
  @formula_names
end

#graph ⇒ RDF::Graph

Returns Graph being serialized.

Returns:

  • (RDF::Graph)

    Graph being serialized



# File 'lib/rdf/n3/writer.rb', line 59

def graph
  @graph
end

#repo ⇒ RDF::Repository

Returns Repository of statements serialized.

Returns:

  • (RDF::Repository)

    Repository of statements serialized



# File 'lib/rdf/n3/writer.rb', line 56

def repo
  @repo
end

Class Method Details

.options ⇒ Object

N3 Writer options



# File 'lib/rdf/n3/writer.rb', line 67

def self.options
  super + [
    RDF::CLI::Option.new(
      symbol: :max_depth,
      datatype: Integer,
      on: ["--max-depth"],
      description: "Maximum depth for recursively defining resources, defaults to 3.") {|arg| arg.to_i},
    RDF::CLI::Option.new(
      symbol: :default_namespace,
      datatype: RDF::URI,
      on: ["--default-namespace URI", :REQUIRED],
      description: "URI to use as default namespace, same as prefixes.") {|arg| RDF::URI(arg)},
  ]
end
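
As a rough picture of how these two options behave on a command line, the sketch below uses Ruby's standard OptionParser directly rather than RDF::CLI::Option. The DEPTH placeholder and the options hash are assumptions made for the example; the listing above declares --max-depth without an argument placeholder.

```ruby
require "optparse"

# Assumed defaults for the illustration only.
options = { max_depth: 3 }

parser = OptionParser.new do |opts|
  # Integer coercion plays the role of the {|arg| arg.to_i} block above.
  opts.on("--max-depth DEPTH", Integer,
          "Maximum depth for recursively defining resources") do |depth|
    options[:max_depth] = depth
  end

  opts.on("--default-namespace URI",
          "URI to use as default namespace") do |uri|
    options[:default_namespace] = uri
  end
end

parser.parse!(%w[--max-depth 5 --default-namespace http://example.com/ns#])
```

After parsing, options holds the depth as an Integer and the namespace as a String; the library version additionally wraps the namespace in RDF::URI.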

Instance Method Details

#format_literal(literal, **options) ⇒ String

Returns the N3 representation of a literal.

Parameters:

  • literal (RDF::Literal, String, #to_s)
  • options (Hash{Symbol => Object})

Returns:

  • (String)


# File 'lib/rdf/n3/writer.rb', line 271

def format_literal(literal, **options)
  literal = literal.dup.canonicalize! if @options[:canonicalize]
  case literal
  when RDF::Literal
    case literal.valid? ? literal.datatype : false
    when RDF::XSD.boolean
      %w(true false).include?(literal.value) ? literal.value : literal.canonicalize.to_s
    when RDF::XSD.integer
      literal.value.match?(/^[\+\-]?\d+$/) && !canonicalize? ? literal.value : literal.canonicalize.to_s
    when RDF::XSD.decimal
      literal.value.match?(/^[\+\-]?\d+\.\d+?$/) && !canonicalize? ?
        literal.value :
        literal.canonicalize.to_s
    when RDF::XSD.double
      if literal.nan? || literal.infinite?
        quoted(literal.value) + "^^#{format_uri(literal.datatype)}"
      else
        in_form = case literal.value
        when /[\+\-]?\d+\.\d*E[\+\-]?\d+$/i then true
        when /[\+\-]?\.\d+E[\+\-]?\d+$/i    then true
        when /[\+\-]?\d+E[\+\-]?\d+$/i      then true
        else false
        end && !canonicalize?

        in_form ? literal.value : literal.canonicalize.to_s.sub('E', 'e')
      end
    else
      text = quoted(literal.value)
      text << "@#{literal.language}" if literal.has_language?
      text << "^^#{format_uri(literal.datatype)}" if literal.has_datatype?
      text
    end
  else
    quoted(literal.to_s)
  end
end
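
The numeric branches above decide whether a lexical form can be written bare (unquoted) or must be canonicalized first. A minimal sketch of that decision on plain strings follows; BARE_FORMS and bare_form? are hypothetical names, and the patterns are fully anchored simplifications of the ones in the listing.

```ruby
# Hypothetical helper: can this lexical form be written without quotes?
BARE_FORMS = {
  integer: /\A[+-]?\d+\z/,
  decimal: /\A[+-]?\d+\.\d+\z/,
  # Mantissa may be "1.", "1.5", ".5" or "1"; an exponent is always required.
  double:  /\A[+-]?(?:\d+\.\d*|\.\d+|\d+)e[+-]?\d+\z/i
}.freeze

def bare_form?(lexical, type)
  BARE_FORMS.fetch(type).match?(lexical)
end

bare_form?("42", :integer)     # accepted as written
bare_form?("3.", :decimal)     # rejected: no digits after the point
bare_form?("1.0E3", :double)   # accepted: exponent present
bare_form?("1.5", :double)     # rejected: a double needs an exponent
```

Forms that are rejected are not errors; in the listing they fall through to literal.canonicalize, which produces an acceptable spelling.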

#format_node(node, **options) ⇒ String

Returns the N3 representation of a blank node.

Parameters:

  • node (RDF::Node)
  • options (Hash{Symbol => Object})

Returns:

  • (String)


# File 'lib/rdf/n3/writer.rb', line 326

def format_node(node, **options)
  if node.id.match(/^([^_]+)_[^_]+_([^_]+)$/)
    sn, seq = $1, $2.to_i
    seq = nil if seq == 0
    "_:#{sn}#{seq}"
  elsif options[:unique_bnodes]
    node.to_unique_base
  else
    node.to_base
  end
end
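
The first branch shortens identifiers made of three underscore-separated parts. The sketch below applies the same rule to plain strings; short_bnode_label is a hypothetical helper, and the unique_bnodes branch is left out.

```ruby
# Hypothetical helper reproducing the label-shortening rule on plain strings.
def short_bnode_label(id)
  if (md = id.match(/\A([^_]+)_[^_]+_([^_]+)\z/))
    seq = md[2].to_i
    # A sequence number of zero is dropped entirely.
    "_:#{md[1]}#{seq.zero? ? '' : seq}"
  else
    "_:#{id}"   # anything else keeps its identifier unchanged
  end
end

short_bnode_label("b_abc_0")   # first and last parts kept, zero dropped
short_bnode_label("b_abc_2")   # sequence number appended
short_bnode_label("g12")       # no underscores, so no shortening
```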

#format_uri(uri, **options) ⇒ String

Returns the N3 representation of a URI reference.

Parameters:

  • uri (RDF::URI)
  • options (Hash{Symbol => Object})

Returns:

  • (String)


# File 'lib/rdf/n3/writer.rb', line 314

def format_uri(uri, **options)
  md = uri == base_uri ? '' : uri.relativize(base_uri)
  log_debug("relativize") {"#{uri.to_sxp} => <#{md.inspect}>"} if md != uri.to_s
  md != uri.to_s ? "<#{md}>" : (get_pname(uri) || "<#{uri}>")
end

#get_pname(resource) ⇒ String?

Return a pname for the URI, or nil. Adds namespace of pname to defined prefixes

Parameters:

  • resource (RDF::Resource)

Returns:

  • (String, nil)

    value to use to identify URI



# File 'lib/rdf/n3/writer.rb', line 203

def get_pname(resource)
  case resource
  when RDF::Node
    return options[:unique_bnodes] ? resource.to_unique_base : resource.to_base
  when RDF::URI
    uri = resource.to_s
  else
    return nil
  end

  #log_debug {"get_pname(#{resource}), std?}"}
  pname = case
  when @uri_to_pname.key?(uri)
    return @uri_to_pname[uri]
  when u = @uri_to_prefix.keys.detect {|u| uri.index(u.to_s) == 0}
    # Use a defined prefix
    prefix = @uri_to_prefix[u]
    unless u.to_s.empty?
      prefix(prefix, u) unless u.to_s.empty?
      #log_debug("get_pname") {"add prefix #{prefix.inspect} => #{u}"}
      uri.sub(u.to_s, "#{prefix}:")
    end
  when @options[:standard_prefixes] && vocab = RDF::Vocabulary.each.to_a.detect {|v| uri.index(v.to_uri.to_s) == 0}
    prefix = vocab.__name__.to_s.split('::').last.downcase
    @uri_to_prefix[vocab.to_uri.to_s] = prefix
    prefix(prefix, vocab.to_uri) # Define for output
    #log_debug {"get_pname: add standard prefix #{prefix.inspect} => #{vocab.to_uri}"}
    uri.sub(vocab.to_uri.to_s, "#{prefix}:")
  else
    nil
  end

  # Make sure pname is a valid pname
  if pname
    md = PNAME_LN.match(pname) || PNAME_NS.match(pname)
    pname = nil unless md.to_s.length == pname.length
  end

  @uri_to_pname[uri] = pname
end
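
The core of the lookup (find a namespace the IRI starts with, substitute its prefix, validate the result, cache it) can be sketched on plain strings. mint_pname is a hypothetical helper, and LOCAL_NAME is an ASCII-only stand-in for the PNAME_LN terminal, which accepts a much wider character range.

```ruby
# Hypothetical, simplified pname minting on plain strings.
# LOCAL_NAME is an assumed ASCII-only approximation, not the real terminal.
LOCAL_NAME = /\A[A-Za-z_][A-Za-z0-9_\-]*\z/

def mint_pname(iri, namespaces, cache = {})
  return cache[iri] if cache.key?(iri)

  namespace, prefix = namespaces.find { |ns, _| iri.start_with?(ns) }
  pname = nil
  if namespace
    local = iri.delete_prefix(namespace)
    # Reject results that would not parse back as a prefixed name.
    pname = "#{prefix}:#{local}" if local.empty? || LOCAL_NAME.match?(local)
  end
  cache[iri] = pname   # nil results are cached as well
end

namespaces = {
  "http://xmlns.com/foaf/0.1/" => "foaf",
  "http://example.com/ns#"     => nil    # nil prefix renders as ":name"
}

mint_pname("http://xmlns.com/foaf/0.1/name", namespaces)
mint_pname("http://example.com/ns#thing", namespaces)
mint_pname("http://xmlns.com/foaf/0.1/a/b", namespaces)   # "/" is not a valid local name here
```

When no pname can be minted the method returns nil, and the caller is expected to fall back to the full IRI in angle brackets, as format_uri does.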

#indent(modifier = 0) ⇒ String (protected)

Returns indent string multiplied by the depth

Parameters:

  • modifier (Integer) (defaults to: 0)

    Increase depth by specified amount

Returns:

  • (String)

    A number of spaces, depending on current depth



# File 'lib/rdf/n3/writer.rb', line 455

def indent(modifier = 0)
  " " * (@options.fetch(:log_depth, log_depth) * 2 + modifier)
end

#order_subjects ⇒ Array<Resource> (protected)

Order subjects for output. Override this to output subjects in another order.

Uses #top_classes and #base_uri.

Returns:

  • (Array<Resource>)

    Ordered list of subjects



# File 'lib/rdf/n3/writer.rb', line 372

def order_subjects
  seen = {}
  subjects = []

  # Start with base_uri
  if base_uri && @subjects.keys.select(&:uri?).include?(base_uri)
    subjects << base_uri
    seen[base_uri] = true
  end

  # Add distinguished classes
  top_classes.each do |class_uri|
    graph.query({predicate: RDF.type, object: class_uri}).
      map {|st| st.subject}.sort.uniq.each do |subject|
        log_debug("order_subjects") {subject.to_sxp}
        subjects << subject
        seen[subject] = true
      end
  end

  # Add formulae which are subjects in this graph
  @formulae.each_key do |bn|
    next unless @subjects.key?(bn)
    subjects << bn
    seen[bn] = true
  end

  # Mark as seen lists that are part of another list
  @lists.values.flatten.each do |v|
    seen[v] = true if @lists.key?(v)
  end

  list_elements = []  # Lists may be top-level elements

  # Sort subjects by resources over bnodes, ref_counts and the subject URI itself
  recursable = (@subjects.keys - list_elements).
    select {|s| !seen.include?(s)}.
    map {|r| [r.node? ? 1 : 0, ref_count(r), r]}.
    sort

  subjects += recursable.map{|r| r.last}
end
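
The final sorting step relies on Ruby comparing arrays element by element. A minimal sketch with plain strings, assuming blank nodes are recognisable by a "_:" prefix and reference counts are supplied in a hash (order_remaining is a hypothetical name):

```ruby
# Hypothetical stand-ins: subjects are strings, blank nodes start with "_:".
def order_remaining(subjects, ref_counts)
  subjects
    .map { |s| [s.start_with?("_:") ? 1 : 0, ref_counts.fetch(s, 0), s] }
    .sort          # compares the kind flag, then ref count, then the name
    .map(&:last)
end

ref_counts = {
  "http://example.com/a" => 2,
  "http://example.com/b" => 2
}

ordered = order_remaining(
  ["_:x", "http://example.com/b", "http://example.com/c", "http://example.com/a"],
  ref_counts
)
```

IRIs sort ahead of blank nodes, less-referenced subjects come first within each group, and the identifier itself breaks ties.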

#predicate_order ⇒ Array<URI> (protected)

Defines the order of predicates to emit at the beginning of a resource description. Defaults to [rdf:type, rdfs:label, rdfs:comment, dcterms:title, dcterms:description, owl:sameAs, log:implies].

Returns:

  • (Array<URI>)


# File 'lib/rdf/n3/writer.rb', line 356

def predicate_order
  [
    RDF.type,
    RDF::RDFS.label,
    RDF::RDFS.comment,
    RDF::URI("http://purl.org/dc/terms/title"),
    RDF::URI("http://purl.org/dc/terms/description"),
    RDF::OWL.sameAs,
    RDF::N3::Log.implies
  ]
end

#preprocess ⇒ Object (protected)

Perform any preprocessing of statements required



# File 'lib/rdf/n3/writer.rb', line 416

def preprocess
  # Load defined prefixes
  (@options[:prefixes] || {}).each_pair do |k, v|
    @uri_to_prefix[v.to_s] = k
  end
  @options[:prefixes] = {}  # Will define actual used when matched

  prefix(nil, @options[:default_namespace]) if @options[:default_namespace]

  @options[:prefixes] = {}  # Will define actual used when matched
  repo.each {|statement| preprocess_statement(statement)}
end

#preprocess_graph_statement(statement) ⇒ Object (protected)

Perform graph-specific preprocessing

Parameters:

  • statement (RDF::Statement)



# File 'lib/rdf/n3/writer.rb', line 444

def preprocess_graph_statement(statement)
  bump_reference(statement.object)
  # Count properties of this subject
  @subjects[statement.subject] ||= {}
  @subjects[statement.subject][statement.predicate] ||= 0
  @subjects[statement.subject][statement.predicate] += 1
end
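
The bookkeeping here is a nested counter keyed by subject and predicate, plus a reference count per object. A sketch with triples as plain arrays; the references hash is an assumption about what bump_reference maintains, since its body is not shown on this page.

```ruby
statements = [
  [":alice", "foaf:knows", ":bob"],
  [":alice", "foaf:knows", ":carol"],
  [":bob",   "foaf:knows", ":carol"]
]

subjects   = {}            # subject => { predicate => count }
references = Hash.new(0)   # object  => times referenced (assumed behaviour)

statements.each do |subject, predicate, object|
  references[object] += 1
  subjects[subject] ||= {}
  subjects[subject][predicate] ||= 0
  subjects[subject][predicate] += 1
end
```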

#preprocess_statement(statement) ⇒ Object (protected)

Perform any statement preprocessing required. This is used to perform reference counts and determine required prefixes.

Parameters:

  • statement (RDF::Statement)



# File 'lib/rdf/n3/writer.rb', line 432

def preprocess_statement(statement)
  #log_debug("preprocess") {statement.inspect}

  # Pre-fetch pnames, to fill prefixes
  get_pname(statement.subject)
  get_pname(statement.predicate)
  get_pname(statement.object)
  get_pname(statement.object.datatype) if statement.object.literal? && statement.object.datatype
end

#quoted(string) ⇒ String (protected)

Use single- or multi-line quotes. If the literal contains \t, \n, or \r, use a multi-line (triple-quoted) string; otherwise, use a single-line string.

Parameters:

  • string (String)

Returns:

  • (String)


# File 'lib/rdf/n3/writer.rb', line 473

def quoted(string)
  if string.to_s.match(/[\t\n\r]/)
    string = string.gsub('\\', '\\\\\\\\').gsub('"""', '\\"\\"\\"')
    %("""#{string}""")
  else
    "\"#{escaped(string)}\""
  end
end
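
The quoting rule can be exercised without the library. quote_literal below is a hypothetical simplification: its single-line branch escapes only backslashes and double quotes, whereas the escaped helper used above handles more cases. The block form of gsub is used so the replacement text is taken literally.

```ruby
# Hypothetical simplification of the quoting rule.
def quote_literal(string)
  if string.match?(/[\t\n\r]/)
    # Long form: double each backslash and break up any embedded """.
    body = string.gsub("\\") { "\\\\" }.gsub('"""') { '\"\"\"' }
    %("""#{body}""")
  else
    # Short form: backslash-escape quotes and backslashes.
    escaped = string.gsub(/["\\]/) { |ch| "\\#{ch}" }
    %("#{escaped}")
  end
end

puts quote_literal("plain text")
puts quote_literal("two\nlines")
```

The long form keeps tabs and line breaks readable in the output instead of turning them into escape sequences.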

#reset ⇒ Object (protected)

Reset internal helper instance variables



# File 'lib/rdf/n3/writer.rb', line 460

def reset
  @lists = {}
  @references = {}
  @serialized = {}
  @graphs = {}
  @subjects = {}
end

#sort_properties(properties) ⇒ Array<RDF::Term>

Take a hash from predicate uris to lists of values. Sort the lists of values. Return a sorted list of properties.

Parameters:

  • properties (Hash{RDF::Term => Array<RDF::Term>})

    A hash of Property to Resource mappings

Returns:

  • (Array<RDF::Term>)

    Ordered list of properties. Uses predicate_order.



# File 'lib/rdf/n3/writer.rb', line 248

def sort_properties(properties)
  # Make sorted list of properties
  prop_list = []

  predicate_order.each do |prop|
    next unless properties.key?(prop)
    prop_list << prop
  end

  properties.keys.sort.each do |prop|
    next if prop_list.include?(prop)
    prop_list << prop
  end

  prop_list
end
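
The same ordering (preferred predicates first, in their declared order, then everything else sorted) in a compact stand-alone form. PREFERRED is a shortened, hypothetical preference list and predicates are plain strings.

```ruby
# Hypothetical short preference list; the writer uses predicate_order.
PREFERRED = %w[rdf:type rdfs:label rdfs:comment].freeze

def sort_predicates(properties, preferred = PREFERRED)
  first = preferred.select { |pred| properties.key?(pred) }
  first + (properties.keys - first).sort
end

properties = {
  "foaf:name"  => ['"Alice"'],
  "rdfs:label" => ['"A"'],
  "dc:creator" => [":bob"],
  "rdf:type"   => ["foaf:Person"]
}

sorted = sort_predicates(properties)
```

Preferred predicates that the subject does not use are skipped, so the result never contains keys absent from the input hash.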

#start_document ⇒ Object (protected)

Output @base and @prefix definitions



# File 'lib/rdf/n3/writer.rb', line 340

def start_document
  @output.write("@base <#{base_uri}> .\n") unless base_uri.to_s.empty?

  log_debug("start_document: prefixes") { prefixes.inspect}
  prefixes.keys.sort_by(&:to_s).each do |prefix|
    @output.write("@prefix #{prefix}: <#{prefixes[prefix]}> .\n")
  end
end
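
To see what this header looks like, the sketch below repeats the same two steps against a StringIO. write_header is a hypothetical free-standing version; the base URI and prefixes are the ones from the class-level example.

```ruby
require "stringio"

# Hypothetical stand-alone version of the header output.
def write_header(output, base_uri, prefixes)
  output.write("@base <#{base_uri}> .\n") unless base_uri.to_s.empty?

  # nil.to_s is "", so the default (nil) prefix sorts first and prints as ":".
  prefixes.keys.sort_by(&:to_s).each do |prefix|
    output.write("@prefix #{prefix}: <#{prefixes[prefix]}> .\n")
  end
end

out = StringIO.new
write_header(out, "http://example.com/",
             { foaf: "http://xmlns.com/foaf/0.1/", nil => "http://example.com/ns#" })
puts out.string
```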

#top_classes ⇒ Array<URI> (protected)

Defines rdf:type of subjects to be emitted at the beginning of the graph. Defaults to rdfs:Class

Returns:

  • (Array<URI>)


# File 'lib/rdf/n3/writer.rb', line 351

def top_classes; [RDF::RDFS.Class]; end

#write_epilogue

This method returns an undefined value.

Outputs the N3 representation of all stored triples.

See Also:



# File 'lib/rdf/n3/writer.rb', line 157

def write_epilogue
  @max_depth = @options[:max_depth] || 3

  self.reset

  log_debug("\nserialize: repo:") {repo.size}

  preprocess

  start_document

  @formula_names = repo.graph_names(unique: true)

  with_graph(nil) do
    count = 0
    order_subjects.each do |subject|
      unless is_done?(subject)
        statement(subject, count)
        count += 1
      end
    end

    # Output any formulae not already serialized using owl:sameAs
    formula_names.each do |graph_name|
      next if graph_done?(graph_name)

      # Add graph_name to @formulae
      @formulae[graph_name] = true

      log_debug {"formula(#{graph_name})"}
      @output.write("\n#{indent}")
      p_term(graph_name, :subject)
      @output.write(" ")
      predicate(RDF::OWL.sameAs)
      @output.write(" ")
      formula(graph_name, :graph_name)
      @output.write(" .\n")
    end
  end

  super
end

#write_quad(subject, predicate, object, graph_name)

This method returns an undefined value.

Adds a quad to be serialized

Parameters:

  • subject (RDF::Resource)
  • predicate (RDF::URI)
  • object (RDF::Value)
  • graph_name (RDF::Resource)


# File 'lib/rdf/n3/writer.rb', line 147

def write_quad(subject, predicate, object, graph_name)
  statement = RDF::Statement.new(subject, predicate, object, graph_name: graph_name)
  repo.insert(statement)
end

#write_triple(subject, predicate, object)

This method is abstract.

This method returns an undefined value.

Adds a triple to be serialized

Parameters:

  • subject (RDF::Resource)
  • predicate (RDF::URI)
  • object (RDF::Value)

Raises:

  • (NotImplementedError)

    unless implemented in subclass



# File 'lib/rdf/n3/writer.rb', line 136

def write_triple(subject, predicate, object)
  repo.insert(RDF::Statement(subject, predicate, object))
end