JSON-delta by example

Consider the example JSON-LD entry for John Lennon from http://json-ld.org/:

{
 "@context": "http://json-ld.org/contexts/person.jsonld",
 "@id": "http://dbpedia.org/resource/John_Lennon",
 "name": "John Lennon",
 "born": "1940-10-09",
 "spouse": "http://dbpedia.org/resource/Cynthia_Lennon"
}

Suppose we have a piece of software that updates this record to show his date of death, like so:

{
 "@context": "http://json-ld.org/contexts/person.jsonld",
 "@id": "http://dbpedia.org/resource/John_Lennon",
 "name": "John Lennon",
 "born": "1940-10-09",

 "died": "1980-12-07",

 "spouse": "http://dbpedia.org/resource/Cynthia_Lennon"
}

Further suppose that we wish to communicate this update to another piece of software whose only job is to store information about John Lennon in JSON-LD format. (Yes, I know this is getting unlikely, but stay with me.) If this Lennon-record-keeper accepts updates in json-delta format, all we have to do is send the following over the wire:

[[["died"],"1980-12-07"]]

This is a complete diff in json-delta format. It is itself a JSON-serializable data structure: specifically, it is a sequence of what I refer to, for some reason, as diff stanzas. The format for a diff stanza is [<key path>, (<update>)]. (The parentheses mean that the <update> part is optional; I’ll get to that in a minute.) A key path is a sequence of keys specifying where in the data structure the node you want to alter is found, much like the paths emitted by JSON.sh. The stanza may be thought of as an instruction to update the node found at that path so that its content is equal to <update>.
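To make the stanza idea concrete, here is a minimal Python sketch of what the receiving end might do with the diff above. The helper name apply_stanza is made up for illustration and is not taken from any json-delta implementation; the sketch assumes every stanza carries an update and that any list index in the key path already exists.

```python
import copy
import json


def apply_stanza(structure, stanza):
    """Apply one [key_path, update] stanza and return the resulting structure.

    Hypothetical helper for illustration only; it handles just the
    two-element form of a stanza.
    """
    key_path, update = stanza
    if not key_path:
        # An empty key path addresses the whole structure.
        return update
    node = structure
    # Walk down to the parent of the node named by the key path.
    for key in key_path[:-1]:
        node = node[key]
    # Set the node so that its content equals the update.
    node[key_path[-1]] = update
    return structure


record = {
    "@context": "http://json-ld.org/contexts/person.jsonld",
    "@id": "http://dbpedia.org/resource/John_Lennon",
    "name": "John Lennon",
    "born": "1940-10-09",
    "spouse": "http://dbpedia.org/resource/Cynthia_Lennon",
}

# The diff exactly as it would arrive over the wire.
diff = json.loads('[[["died"],"1980-12-07"]]')

updated = copy.deepcopy(record)  # leave the original record untouched
for stanza in diff:
    updated = apply_stanza(updated, stanza)

print(updated["died"])
```

The diff is parsed with the ordinary json module, which is the point of making it a JSON-serializable structure in the first place: no special parser is needed on either end.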

Now, let’s do some more supposing. Suppose the software we’re communicating with is dedicated to storing information about the Beatles in general. Also, suppose we’ve remembered that it was actually on the 8th of December 1980 that John Lennon died, not the 7th. Finally, suppose we live in an Orwellian dystopia, and Cynthia Lennon has been declared a non-person who must be expunged from all records. Unfortunately, json-delta is incapable of overthrowing corrupt and despotic governments, so let’s make one last supposition, that what we’re interested in is updating the record kept by the software on the other end of the wire, which looks like this:

[
 {
  "@context": "http://json-ld.org/contexts/person.jsonld",
  "@id": "http://dbpedia.org/resource/John_Lennon",
  "name": "John Lennon",
  "born": "1940-10-09",

  "died": "1980-12-07",

  "spouse": "http://dbpedia.org/resource/Cynthia_Lennon"
 },
 {"name": "Paul McCartney"},
 {"name": "George Harrison"},
 {"name": "Ringo Starr"}
]

(Allegations of bias in favor of specific Beatles on the part of the maintainer of this record are punished by the aforementioned despotic government. All glory to Arstotzka!)

To make the changes we’ve decided on (correcting John’s date of death, and expunging Cynthia Lennon from the record), we need to send the following sequence:

[
 [[0, "died"], "1980-12-08"],
 [[0, "spouse"]]
]

Now, of course, you see why I said the <update> part is optional: if a stanza includes no update material, it is interpreted as an instruction to delete the node the key path points to.

Note also that there is no difference between a stanza that adds a node and one that changes an existing node: in both cases, the node at the key path ends up equal to <update>.
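The following Python sketch shows one way the two-stanza diff for the Beatles record could be applied, including the delete case. The function name apply_diff is hypothetical, chosen for this example rather than borrowed from a real json-delta library, and the sketch assumes that a stanza with an empty key path always carries an update.

```python
import copy


def apply_diff(structure, diff):
    """Return a patched copy of structure (hypothetical helper, illustration only)."""
    result = copy.deepcopy(structure)
    for stanza in diff:
        key_path = stanza[0]
        if not key_path:
            # Empty key path: substitute the update for the whole structure.
            # (Assumption: a root stanza always has an update.)
            result = copy.deepcopy(stanza[1])
            continue
        parent = result
        for key in key_path[:-1]:
            parent = parent[key]
        last = key_path[-1]
        if len(stanza) == 1:
            # No update material: delete the node the key path points to.
            del parent[last]
        else:
            # Adding and changing are the same operation.
            parent[last] = stanza[1]
    return result


beatles = [
    {
        "@context": "http://json-ld.org/contexts/person.jsonld",
        "@id": "http://dbpedia.org/resource/John_Lennon",
        "name": "John Lennon",
        "born": "1940-10-09",
        "died": "1980-12-07",
        "spouse": "http://dbpedia.org/resource/Cynthia_Lennon",
    },
    {"name": "Paul McCartney"},
    {"name": "George Harrison"},
    {"name": "Ringo Starr"},
]

diff = [
    [[0, "died"], "1980-12-08"],
    [[0, "spouse"]],
]

patched = apply_diff(beatles, diff)

# The same stanza adds "died" where it is missing and changes it where it exists.
stanza = [["died"], "1980-12-08"]
added = apply_diff({"name": "John Lennon"}, [stanza])
changed = apply_diff({"name": "John Lennon", "died": "1980-12-07"}, [stanza])
```

After this runs, patched[0] has the corrected date and no "spouse" key, and added and changed are equal, which is all that "no difference between adding and changing" amounts to.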

The intention is to save as much communications bandwidth as possible without sacrificing the ability to communicate arbitrary modifications to the data structure (this format can be used to describe a change from any JSON-serialized object into any other). In the worst case, where the two structures have nothing in common, the format adds seven octets of overhead (five for the leading [[[], and two for the closing ]]), because a diff can always be expressed as [[[],<target>]], meaning “substitute <target> for the data structure that is to be modified”.
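The seven-octet figure is easy to check. The sketch below, with helper names invented for this example, builds the whole-structure replacement diff and compares its compact serialization with that of the target alone. It assumes both are serialized without optional whitespace.

```python
import json


def compact(value):
    """Serialize to JSON with no optional whitespace."""
    return json.dumps(value, separators=(",", ":"))


def whole_replacement_diff(target):
    """Build the fallback diff: one stanza with an empty key path."""
    return [[[], target]]


target = [{"name": "Paul McCartney"}, {"name": "George Harrison"}]
diff = whole_replacement_diff(target)

# Octets on the wire for the diff, minus octets for the bare target.
overhead = len(compact(diff).encode("utf-8")) - len(compact(target).encode("utf-8"))
print(overhead)  # 7
```

The result does not depend on the target, since the diff's serialization is always the target's serialization wrapped in the same seven characters.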