jsq
Take control of your JSON; slice, filter, map, transform, calculate — it's up to you. All in a purpose-designed, simple language. With GZIP, it's less than 5KB.
var o = {
"data": [
{"uid": 1, "grades": [5,7,8]},
{"uid": 2, "grades": [3,9,6]}
],
"users": {
1: {"name": "Bruce Willis"},
2: {"name": "Samuel L. Jackson"}
}
};
jsq(o, '.users as $u | .data[] | {$u[.uid].name: (.grades|max)}');
// » [{"Bruce Willis":8}, {"Samuel L. Jackson":9}]
Using jsq
Jsq allows you to mold your JSON data the way you want to. It does this through its own query language, also called jsq. A jsq query is made up of filters. A filter takes an input and produces an output. Every filter is essentially a little program performing a certain task. There's plenty of built-in functionality that allows you to slice, filter, map, transform and so on.
Filters can be combined in various ways to make them more powerful: you can pipe the output of one filter into another (as in UNIX), or collect the output of a filter into an array, for example.
Some filters produce multiple results; for instance, there's one that produces all the elements of its input array. Piping that filter into a second one runs the second filter for each element of the array. Generally, things that would be done with loops and iteration in other languages are done by gluing filters together in jsq.
It's important to remember that every filter has an input and an output. Even literals like "hello" or 42 are filters: they take an input but always produce the same literal as output. Operations that combine two filters, like addition, generally feed the same input to both and combine the results. So, you can implement an averaging filter as add / length, feeding the input array both to the add filter and the length filter and dividing the results.
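To make the add / length idea concrete, here is a minimal sketch in plain JavaScript. It is not jsq's implementation: it simply models a filter as a function from one input to an array of results, and the helper names literal, add, length and divide are made up for this illustration.

```javascript
// Hypothetical model: a filter is a function (input) -> array of results.
const literal = (value) => (input) => [value];            // ignores its input
const add = (input) => [input.reduce((a, b) => a + b)];    // sum of a non-empty array
const length = (input) => [input.length];

// A binary operator feeds the same input to both filters and combines the results.
const divide = (left, right) => (input) => {
  const results = [];
  for (const l of left(input)) {
    for (const r of right(input)) results.push(l / r);
  }
  return results;
};

const average = divide(add, length);
console.log(average([5, 7, 8, 4]));    // [ 6 ]
console.log(literal(42)("anything"));  // [ 42 ]
```

Both add and length receive the same array, which is why no variable is needed to hold it.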
The jsq function is defined as follows:
var output = jsq( input..., query, iterator(value, key, output)?, context? )
An arbitrary number of input objects can be passed (none are required), after which a query string is expected. Finally, an optional iterator function and calling context can be defined. Jsq always returns a results array, even when there is only a single result or none at all.
The iterator function is called for every result. It is passed three arguments: the value of the result, the position of the result in the result array, and the complete result array. Through the last argument, the iterator can manipulate the result array that jsq returns.
Jsq tries to mimic JavaScript's behavior as much as possible, especially with regard to data types, comparing values and logical operators. For example, "1" == 1 will return true while "1" === 1 will return false, 2 && 0 will return 0, and false || 2 will return 2.
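For reference, the JavaScript behaviour that jsq mirrors here can be checked directly. The snippet below is ordinary JavaScript, not a jsq query.

```javascript
// Plain JavaScript, shown for comparison with the jsq behaviour described above.
const checks = [
  "1" == 1,    // true: loose equality converts types
  "1" === 1,   // false: strict equality does not
  2 && 0,      // 0: && returns the first falsy operand
  false || 2   // 2: || returns the first truthy operand
];
console.log(checks); // [ true, false, 0, 2 ]
```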
However, in some cases jsq's result will be different. For instance, .foo = 2 will not return the same as JavaScript's foo = 2. The latter will evaluate to 2, while jsq will return the entire object, in which element 'foo' now has value 2.
Basic filters
.
The absolute simplest (and least interesting) filter is a single dot (.). This is a filter that takes its input and produces it unchanged as output.
. | |
Input | "Hello, world!" |
Output | "Hello, world!" |
.foo
The simplest useful filter is .foo. When given a JSON object as input, it produces the value at the key "foo", or nothing if there's none present.
.foo | |
Input | {"foo": 42, "bar": "less interesting data"} |
Output | [42] |
.foo | |
Input | {"notfoo": true, "alsonotfoo": false} |
Output | [] |
.["foo"]
You can also look up fields of an object using syntax like .["foo"] (.foo above is a shorthand version of this). This one works for arrays as well, if the key is an integer.
.[0] | |
Input | [{"name":"JSON", "good":true}, {"name":"XML", "good":false}] |
Output | {"name":"JSON", "good":true} |
.[2] | |
Input | [{"name":"JSON", "good":true}, {"name":"XML", "good":false}] |
Output | null |
.[]
If you use the .["foo"] syntax, but omit the index entirely, it will return all of the elements of an array. Running .[] with the input [1,2,3] will produce the numbers as a list of three results, rather than as a single array. This also works for objects.
.[] | |
Input | [{"name":"JSON", "good":true}, {"name":"XML", "good":false}] |
Output | [{"name":"JSON", "good":true}, {"name":"XML", "good":false}] |
.[] | |
Input | {"name":"JSON", "good":true} |
Output | ["JSON", true] |
.[] | |
Input | [] |
Output | [] |
,
If two filters are separated by a comma, then the input will be fed into both and there will be multiple outputs: first, all of the outputs produced by the left expression, and then all of the outputs produced by the right. For instance, the filter .foo, .bar produces both the "foo" fields and "bar" fields as separate outputs.
By combining expressions like this, you are creating what is called a list. You can read more about this in the next chapter, Types and Values.
.foo, .bar | |
Input | {"foo": 42, "bar": "something else", "baz": true} |
Output | [42, "something else"] |
.user, .projects[] | |
Input | {"user":"tjoekbezoer", "projects": ["jsq", "wikiflow"]} |
Output | ["tjoekbezoer", "jsq", "wikiflow"] |
.[4,2] | |
Input | ["a","b","c","d","e"] |
Output | ["e", "c"] |
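The comma operator can be pictured with a small plain-JavaScript model. This is only a sketch of the semantics described above, not jsq's own code; field and comma are hypothetical helper names, and a filter is assumed to be a function from an input to an array of results.

```javascript
// Hypothetical model: a filter is a function (input) -> array of results.
// field(name) mimics .name: one result if the key exists, none otherwise.
const field = (name) => (input) =>
  input != null && Object.prototype.hasOwnProperty.call(input, name)
    ? [input[name]]
    : [];

// The comma operator: same input to both sides, left results first.
const comma = (left, right) => (input) => left(input).concat(right(input));

const fooAndBar = comma(field("foo"), field("bar"));
console.log(fooAndBar({ foo: 42, bar: "something else", baz: true }));
// [ 42, 'something else' ]
```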
|
The | operator combines two filters by feeding the result(s) of the one on the left into the input of the one on the right. It's pretty much the same as the Unix shell's pipe, if you're used to that.
If the one on the left produces multiple results, the one on the right will be run for each of those results. So, the expression .[] | .foo retrieves the "foo" field of each result of the input expression.
.[] | .name | |
Input | [{"name":"JSON", "good":true}, {"name":"XML", "good":false}] |
Output | ["JSON", "XML"] |
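The way a pipe replaces an explicit loop can be sketched in plain JavaScript. The model below is an illustration only, with hypothetical helpers each, field and pipe; it assumes a filter is a function from an input to an array of results.

```javascript
// Hypothetical model: a filter is a function (input) -> array of results.
const each = (input) =>
  Array.isArray(input) ? input.slice() : Object.values(input);  // mimics .[]
const field = (name) => (input) => [input[name]];               // mimics .name

// The pipe: run the right filter once for every result of the left filter.
const pipe = (left, right) => (input) =>
  left(input).flatMap((result) => right(result));

const names = pipe(each, field("name"));
console.log(names([
  { name: "JSON", good: true },
  { name: "XML", good: false }
]));
// [ 'JSON', 'XML' ]
```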
Types and Values
jsq supports the same set of datatypes as JSON - strings, numbers, booleans, arrays, objects, and "null".
Simple values
Strings, numbers, booleans and null are written the same way as in JavaScript. Just like everything else in jsq, these simple values take an input and produce an output: 42 is a valid jsq expression that takes an input, ignores it, and returns 42 instead.
Like in JSON, strings are delimited using "double quotes". 'Single quotes' are not allowed.
Arrays — []
Similar to JavaScript, [] is used to construct arrays, as in [1,2,3]. The elements of the array can be any jsq expression. All of the results produced by all of the expressions are collected into one big array. You can use it to construct an array out of a known quantity of values (as in [.foo, .bar, .baz]) or to "collect" all the results of a filter into an array (as in [.items[].name]).
If you have a filter X that produces four results, then the expression [X] will produce a single result, an array of four elements.
[.user, .projects[]] | |
Input | {"user":"tjoekbezoer", "projects": ["jsq", "wikiflow"]} |
Output | [["tjoekbezoer", "jsq", "wikiflow"]] |
Lists — 1,2,3
Once you understand the , operator, you can look at jsq's array syntax in a different light: the expression [1,2,3] is not using a built-in syntax for comma-separated arrays, but is actually applying the [] operator (collect results) to the expression 1,2,3, which produces a list of 3 integers.
A list will concatenate all results from all elements of the list.
Objects — {}
Continuing JavaScript's familiarity, objects are constructed using {}, e.g.: {"a": 42, "b": 17}.
If the keys are "sensible" (all alphanumeric characters), then the quotes can be left off. The value can be any expression (although you may need to wrap it in parentheses if it's a complicated one, like when using a list), which gets applied to the {} expression's input (remember, all filters have an input and an output).
{foo: .bar} will produce the JSON object {"foo": 42} if given the JSON object {"bar":42, "baz":43}. You can use this to select particular fields of an object: if the input is an object with "user", "title", "id", and "content" fields and you just want "user" and "title", you can write
{user: .user, title: .title}
Because that's so common, there's a shortcut syntax: {user, title}.
If one of the expressions produces multiple results, multiple objects will be produced. If the input is
{
"user":"tjoekbezoer",
"titles":["JSQ Primer", "More JSQ"]
}
then the expression
{user, title: .titles[]}
will produce two outputs:
[
{"user":"tjoekbezoer", "title": "JSQ Primer"},
{"user":"tjoekbezoer", "title": "More JSQ"}
]
Putting parentheses around the key means it will be evaluated as an expression. With the same input as above,
{(.user): .titles}
produces
{"tjoekbezoer": ["JSQ Primer", "More JSQ"]}
Simple expressions, like a single filter, can also be used without the parentheses. A filter used as a key name can return only one result, or jsq will throw an error.
{user, title: .titles[]} | |
Input | {"user":"tjoekbezoer","titles":["JSQ Primer", "More JSQ"]} |
Output | [{"user":"tjoekbezoer", "title": "JSQ Primer"}, {"user":"tjoekbezoer", "title": "More JSQ"}] |
{(.user): .titles} | |
Input | {"user":"tjoekbezoer","titles":["JSQ Primer", "More JSQ"]} |
Output | [{"tjoekbezoer": ["JSQ Primer", "More JSQ"]}] |
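The "one object per result" behaviour of {user, title: .titles[]} can be modelled in plain JavaScript as follows. This is a sketch under the assumption that each value expression is a function returning an array of results; construct is a hypothetical helper, not part of jsq.

```javascript
// Build one object per combination of the results produced for each key.
// spec maps each key to a filter: a function (input) -> array of results.
function construct(spec, input) {
  let partials = [{}];
  for (const [key, filter] of Object.entries(spec)) {
    const next = [];
    for (const partial of partials) {
      for (const value of filter(input)) {
        next.push(Object.assign({}, partial, { [key]: value }));
      }
    }
    partials = next;
  }
  return partials;
}

const input = { user: "tjoekbezoer", titles: ["JSQ Primer", "More JSQ"] };
const result = construct(
  { user: (i) => [i.user], title: (i) => i.titles },
  input
);
console.log(result);
// [ { user: 'tjoekbezoer', title: 'JSQ Primer' },
//   { user: 'tjoekbezoer', title: 'More JSQ' } ]
```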
Builtin operators
Some jsq operators (for instance, +) do different things depending on the type of their arguments (arrays, numbers, etc.). Jsq will mimic JavaScript as much as possible, so "1"+1 will produce "11", and "1"-1 will produce 0.
If one or both expressions produce multiple results, the operation will be performed for every possible combination of results. So (1,2) + 3 will produce [4,5], and (1,2) * (3,4) will produce [3,4,6,8].
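The "every possible combination" rule amounts to a nested loop over the results of both sides. The plain-JavaScript sketch below illustrates this; combine is a hypothetical helper, and the iteration order (left side outermost) is an assumption chosen to match the outputs shown above.

```javascript
// Apply op to every pairing of a left result with a right result.
const combine = (op, lefts, rights) => {
  const out = [];
  for (const l of lefts) {
    for (const r of rights) out.push(op(l, r));
  }
  return out;
};

console.log(combine((a, b) => a + b, [1, 2], [3]));     // [ 4, 5 ]
console.log(combine((a, b) => a * b, [1, 2], [3, 4]));  // [ 3, 4, 6, 8 ]
```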
Addition - +
The operator + takes two filters, applies them both to the same input, and adds the results together. What "adding" means depends on the types involved:
Numbers are added by normal arithmetic.
Arrays are added by being concatenated into a larger array.
Strings are added by being joined into a larger string.
Objects are added by merging, that is, inserting all the key-value pairs from both objects into a single combined object. If both objects contain a value for the same key, the value from the object on the right of the + will be used.
.a + 1 | |
Input | {"a": 7} |
Output | [8] |
.a + .b | |
Input | {"a": [1,2], "b": [3,4]} |
Output | [[1,2,3,4]] |
.a + null | |
Input | {"a": 1} |
Output | [1] |
.a + 1 | |
Input | {} |
Output | [1] |
{a: 1} + {b: 2} + {c: 3} + {a: 42} | |
Input | null |
Output | [{"a": 42, "b": 2, "c": 3}] |
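The type-dependent rules for + can be summarised in a small plain-JavaScript function. This is a sketch consistent with the examples above, not jsq's source; plus is a hypothetical name, and treating null or a missing value as "no contribution" is an assumption inferred from the .a + null and .a + 1 examples.

```javascript
// Hypothetical helper mirroring the documented behaviour of +.
function plus(a, b) {
  if (a == null) return b;  // assumption: null/missing adds nothing
  if (b == null) return a;
  if (Array.isArray(a) && Array.isArray(b)) return a.concat(b);
  const isObject = (v) => typeof v === "object" && !Array.isArray(v);
  if (isObject(a) && isObject(b)) return Object.assign({}, a, b); // right side wins
  return a + b;             // numbers add, strings join
}

console.log(plus(7, 1));                 // 8
console.log(plus([1, 2], [3, 4]));       // [ 1, 2, 3, 4 ]
console.log(plus(undefined, 1));         // 1
const merged = [{ a: 1 }, { b: 2 }, { c: 3 }, { a: 42 }]
  .reduce((acc, obj) => plus(acc, obj));
console.log(merged);                     // { a: 42, b: 2, c: 3 }
```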
Subtraction - -
4 - .a | |
Input | {"a":3} |
Output | [1] |
. - ["xml", "yaml"] | |
Input | ["xml", "yaml", "json"] |
Output | [["json"]] |
Multiplication, division - * and /
These operators only work on numbers, and behave as expected.
10 / . * 3 | |
Input | 5 |
Output | [6] |
Binary - and, or and xor
These operators are similar to JavaScript's &, | and ^.
Assignment
Assignment works a little differently in jsq than in most programming languages. jsq doesn't distinguish between references to and copies of something - two objects or arrays are either equal or not equal, without any further notion of being "the same object" or "not the same object".
If an object has two fields which are arrays, .foo and .bar, and you append something to .foo, then .bar will not get bigger. Even if you've just set .bar = .foo.
=
The assignment operation .foo = 1 will take as input an object and produce as output an object with the "foo" field set to 1. There is no notion of "modifying" or "changing" something in jsq: all jsq values are immutable. For instance,
.foo = .bar | .foo.baz = 1
will not have the side effect of setting .bar.baz to 1, as the similar-looking program in JavaScript, Python, Ruby or other languages would. Unlike these languages, there is no notion of two arrays or objects being "the same array" or "the same object". They can be equal, or not equal, but if we change one of them, under no circumstances will the other change behind our backs.
This means that it's impossible to build circular values in jsq (such as an array whose first element is itself). This is quite intentional, and ensures that anything a jsq program can produce can be represented in JSON.
|=
As well as the assignment operator '=', jsq provides the "update" operator '|=', which takes a filter on the right-hand side and works out the new value for the property being assigned to by running the old value through this expression. For instance, .foo |= .+1 will build an object with the "foo" field set to the input's "foo" plus 1.
This example should show the difference between '=' and '|='. Provide input {"a": {"b": 10}, "b": 20} to .a = .b and it produces {"a": 20, "b": 20}. Provide the same input to .a |= .b and it produces {"a": 10, "b": 20}.
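The difference between the two operators comes down to what the right-hand side is evaluated against. The plain-JavaScript sketch below models this for a single top-level key; assign and update are hypothetical helpers, and both return a new object so the input stays untouched, in line with jsq's immutable values.

```javascript
// '=' evaluates the right-hand side against the whole input;
// '|=' evaluates it against the old value of the field.
const assign = (input, key, rhs) =>
  Object.assign({}, input, { [key]: rhs(input) });
const update = (input, key, rhs) =>
  Object.assign({}, input, { [key]: rhs(input[key]) });

const input = { a: { b: 10 }, b: 20 };
const getB = (value) => value.b;          // plays the role of .b

console.log(assign(input, "a", getB));    // { a: 20, b: 20 }
console.log(update(input, "a", getB));    // { a: 10, b: 20 }
console.log(input);                       // { a: { b: 10 }, b: 20 }
```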
+=, -=, *=, /=
jsq has a few operators of the form a op= b, which are all equivalent to a |= . op b. So for example, += 1 can be used to increment values.
.foo += 1 | |
Input | {"foo": 42} |
Output | {"foo": 43} |
Complex assignments
Many more things are allowed on the left-hand side of a jsq assignment than in most languages. We've already seen simple field accesses on the left-hand side, and it's no surprise that array accesses work just as well:
.posts[0].title = "JSQ Manual"
What may come as a surprise is that the expression on the left may produce multiple results, referring to different points in the input document:
.posts[].comments |= . + ["this is great"]
That example appends the string "this is great" to the "comments" array of each post in the input (where the input is an object with a field "posts" which is an array of posts).
When jsq encounters an assignment like 'a = b', it records the "path" taken to select a part of the input document while executing a. This path is then used to find which part of the input to change while executing the assignment. Any filter may be used on the left-hand side of an equals - whichever paths it selects from the input will be where the assignment is performed.
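The path-based approach can be illustrated in plain JavaScript. The sketch below is not how jsq is implemented internally; it hard-codes the paths that .posts[].comments would select and then rebuilds the document along each path. commentPaths and updateAt are hypothetical helpers.

```javascript
// Collect the paths selected by ".posts[].comments" in a given document.
function commentPaths(doc) {
  return doc.posts.map((_, index) => ["posts", index, "comments"]);
}

// Return a copy of value with the node at path replaced by fn(oldNode).
function updateAt(value, path, fn) {
  if (path.length === 0) return fn(value);
  const [head, ...rest] = path;
  const copy = Array.isArray(value) ? value.slice() : Object.assign({}, value);
  copy[head] = updateAt(value[head], rest, fn);
  return copy;
}

const doc = { posts: [{ comments: [] }, { comments: ["first"] }] };
const updated = commentPaths(doc).reduce(
  (acc, path) => updateAt(acc, path, (old) => old.concat(["this is great"])),
  doc
);
console.log(JSON.stringify(updated));
// {"posts":[{"comments":["this is great"]},{"comments":["first","this is great"]}]}
```

Only the nodes along each path are copied, and the original document is left as it was.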
Comparisons and Conditionals
==, !=, ===
For scalars, these comparison operators behave the same as their JavaScript equivalents. For arrays and objects, a deep comparison is performed.
.[] == 1 | |
Input | [1, 1.0, "1", "banana"] |
Output | true, true, true, false |
>, >=, <=, <
The comparison operators >, >=, <= and < behave as expected. They can also be used to compare arrays and objects.
The ordering is the same as that described for the built-in function sort, below.
. < 5 | |
Input | 2 |
Output | true |
if(cond, then[, else])
Consider the expression if(A, B, C). If A evaluates as truthy, B will be executed. Otherwise it will execute C, if provided.
If the condition A is an expression producing multiple results, it is considered "true" if any of those results is truthy. If it produces zero results, it's considered false.
More about truthy values: Truth, equality and JavaScript
if(. == 0, "zero", if(. == 1, "one", "many")) | |
Input | 2 |
Output | "many" |
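The rule for conditions with several results can be sketched in plain JavaScript. In this illustration a filter is a function from an input to an array of results, and ifFilter is a hypothetical helper. Returning no results when the condition is false and no else branch is given is an assumption, since the text above does not say what happens in that case.

```javascript
// cond, then and otherwise are filters: functions (input) -> array of results.
function ifFilter(cond, then, otherwise) {
  return (input) => {
    const truthy = cond(input).some(Boolean);  // zero results count as false
    if (truthy) return then(input);
    return otherwise ? otherwise(input) : [];  // assumption: no else, no output
  };
}

const describe = ifFilter(
  (n) => [n === 0],
  () => ["zero"],
  ifFilter((n) => [n === 1], () => ["one"], () => ["many"])
);
console.log(describe(2));  // [ 'many' ]

// A condition with several results is true if any of them is truthy.
const anyTruthy = ifFilter(() => [false, true], () => ["yes"], () => ["no"]);
console.log(anyTruthy(null));  // [ 'yes' ]
```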
&&, ||, !
jsq supports the normal Boolean operators && and ||. They have the same standard of truth as if expressions.
If an operand of one of these operators produces multiple results, the operator itself will produce a result for each input.
The boolean operators behave the same as JavaScript's, returning the value of the last operand evaluated. So for example 1 && 2 returns 2; !1 || 3 returns 3.
! simply negates the value that follows it.
42 && "a string" | |
Input | null |
Output | "a string" |
(true, false) || false | |
Input | null |
Output | true, false |
(true, false) && (true, false) | |
Input | null |
Output | true, false, false, false |
[true, false | not] | |
Input | null |
Output | [false, true] |
Variables
In jsq, all filters have an input and an output, so manual plumbing is not necessary to pass a value from one part of a program to the next. Many expressions, for instance a + b, pass their input to two distinct subexpressions (here a and b are both passed the same input), so variables aren't usually necessary in order to use a value twice.
For instance, calculating the average value of an array of numbers requires a few variables in most languages; at least one to hold the array, and perhaps one for each element or for a loop counter. In jsq, it's simply add / length: the add expression is given the array and produces its sum, and the length expression is given the array and produces its length.
So, there's generally a cleaner way to solve most problems in jsq than defining variables. Still, sometimes they do make things easier, so jsq lets you define variables using expression as $variable. All variable names start with $. Here's a slightly uglier version of the array-averaging example:
length as $array_length | add / $array_length
We'll need a more complicated problem to find a situation where using variables actually makes our lives easier.
Suppose we have an array of blog posts, with "author" and "title" fields, and another object which is used to map author usernames to real names. Our input looks like:
{
"posts": [
{"title": "Frist psot", "author": "anon"},
{"title": "A well-written article", "author": "person1"}],
"realnames": {
"anon": "Anonymous Coward",
"person1": "Person McPherson"
}
}
We want to produce the posts with the author field containing a real name, as in:
[
{"title": "Frist psot", "author": "Anonymous Coward"},
{"title": "A well-written article", "author": "Person McPherson"}
]
We use a variable, $names, to store the realnames object, so that we can refer to it later when looking up author usernames:
.realnames as $names | .posts[] | {title, author: $names[.author]}
The expression "foo as $x" runs foo, puts the result in $x, and returns the original input. Apart from the side-effect of binding the variable, it has the same effect as ".".
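For comparison, the same lookup written by hand in plain JavaScript looks roughly like this. The constant names plays the role of $names; nothing here uses jsq itself.

```javascript
const input = {
  posts: [
    { title: "Frist psot", author: "anon" },
    { title: "A well-written article", author: "person1" }
  ],
  realnames: {
    anon: "Anonymous Coward",
    person1: "Person McPherson"
  }
};

// Equivalent of: .realnames as $names | .posts[] | {title, author: $names[.author]}
const names = input.realnames;
const result = input.posts.map((post) => ({
  title: post.title,
  author: names[post.author]
}));
console.log(result);
// [ { title: 'Frist psot', author: 'Anonymous Coward' },
//   { title: 'A well-written article', author: 'Person McPherson' } ]
```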
Built-in functions
Jsq comes with a set of built-in functions that allow you to perform extra operations on your data, like filtering or transforming. Every function listed here is written in JavaScript, and defined in the public jsq.fn object, so it's possible to alter them, or add new ones of your own.
length
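The built-in function length gets the length of various different types of values:
The length of an array is the number of elements.
The length of a string is the number of characters it contains.
The length of an object is the number of key-value pairs.
The length of null is zero.
When trying to get the length of a number, it will return nothing.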
.[] | length | |
Input | [[1,2], "string", {"a":2}, null] |
Output | 2,6,1,0 |
keys
The built-in function keys, when given an object, returns its keys in a list. When given an array, it returns the valid indices for that array: the integers from 0 to length-1.
When 'keying' an object, do not rely on the sort order. The order in which the keys are returned is defined by the browser implementation of object iteration, which is not cross-browser consistent.
keys | |
Input | {"abc": 1, "abcd": 2, "Foo": 3} |
Output | ["Foo", "abc", "abcd"] |
keys | |
Input | [42,3,35] |
Output | [0,1,2] |
pairs
When passed an object, pairs will return a list of arrays, where every array is a key-value combination. This also works on arrays.
Like with the keys function, do not rely on sort order when 'pairing' an object.
select
The function select(foo) produces its input unchanged if foo returns true for that input, and produces no output otherwise.
It's useful for filtering lists, objects and arrays: [1,2,3] | map(select(. >= 2)) will give you [2,3].
map(select(. >= 2)) | |
Input | [1,5,3,0,7] |
Output | [5,3,7] |
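A plain-JavaScript model makes it clear why select combines so well with map: a rejected element simply contributes no result. The helpers select and mapFilter below are illustrative stand-ins, assuming a filter is a function from an input to an array of results.

```javascript
// select(predicate): pass the input through unchanged, or produce nothing.
const select = (predicate) => (input) => (predicate(input) ? [input] : []);

// mapFilter(filter): run the filter on every element and gather all results.
const mapFilter = (filter) => (input) =>
  input.flatMap((element) => filter(element));

const atLeastTwo = mapFilter(select((n) => n >= 2));
console.log(atLeastTwo([1, 5, 3, 0, 7]));  // [ 5, 3, 7 ]
console.log(atLeastTwo([1, 2, 3]));        // [ 2, 3 ]
```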
empty
empty returns no results. None at all. Not even null.
1, empty, 2 | |
Input | null |
Output | 1,2 |
[1,2,empty,3] | |
Input | null |
Output | [1,2,3] |
map(x)
For any filter x, map(x) will run that filter for each element of the input object/array, and produce a list containing the results for every element. map(.+1) will increment each element of an array of numbers.
map(x) is equivalent to .[] | x.
map(.+1) | |
Input | [1,2,3] |
Output | [2,3,4] |
add
The filter add takes as input an array, and produces as output the elements of the array added together. This might mean summed, concatenated or merged depending on the types of the elements of the input array; the rules are the same as those for the + operator (described above).
If the input is an empty array, add returns null.
add | |
Input | ["a","b","c"] |
Output | "abc" |
add | |
Input | [1, 2, 3] |
Output | 6 |
add | |
Input | [] |
Output | null |
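In plain JavaScript, the behaviour shown in these examples corresponds to a reduce with a special case for the empty array. The sketch below only covers numbers and strings (it relies on JavaScript's own +), and addAll is a hypothetical name.

```javascript
// Fold an array with +; an empty array gives null, as described above.
// Limited to numbers and strings for this illustration.
const addAll = (items) =>
  items.length === 0 ? null : items.reduce((a, b) => a + b);

console.log(addAll(["a", "b", "c"]));  // abc
console.log(addAll([1, 2, 3]));        // 6
console.log(addAll([]));               // null
```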
tonumber
The tonumber function will parse its input as a number. It's not able to parse objects and arrays, nor can it parse strings not starting with a number. In these cases, tonumber will return null. Parsing booleans results in 1 or 0.
.[] | tonumber | |
Input | [1, "1"] |
Output | 1,1 |
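The rules above can be approximated in plain JavaScript as follows. This is a guess at equivalent behaviour rather than jsq's actual code: toNumber is a hypothetical name, and the use of parseFloat for strings is an assumption based on the "strings not starting with a number" wording.

```javascript
// Approximation of the documented tonumber rules.
function toNumber(value) {
  if (typeof value === "boolean") return value ? 1 : 0;
  if (typeof value === "number") return value;
  if (typeof value === "string") {
    const parsed = parseFloat(value);       // assumption: leading number is enough
    return Number.isNaN(parsed) ? null : parsed;
  }
  return null;                               // objects, arrays, null
}

console.log([1, "1", true, "abc", [1]].map(toNumber));
// [ 1, 1, 1, null, null ]
```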
tostring
The tostring function prints its input as a string. Jsq uses JSON.stringify for this, so make sure it is available (in older browsers you might need to include json2.js).
.[] | tostring | |
Input | [1, "1", [1]] |
Output | "1" ,"1" ,"[1]" |
sort
The sort function sorts its input, which must be an array. Values are sorted in the following order:
null
false
true
The ordering for objects is a little complex: first they're compared by comparing their sets of keys (as arrays in sorted order), and if their keys are equal then the values are compared key by key.
sort | |
Input | [8,3,null,6] |
Output | [null,3,6,8] |
sort_by(.foo) | |
Input | [{"foo":4, "bar":10}, {"foo":3, "bar":100}, {"foo":2, "bar":1}] |
Output | [{"foo":2, "bar":1}, {"foo":3, "bar":100}, {"foo":4, "bar":10}] |
min, max
Find the minimum or maximum element of the input array. Providing a filter as argument allows you to specify a particular field or property to examine, e.g. min(.foo) finds the object with the smallest foo field.
min | |
Input | [5,4,2,7] |
Output | 2 |
max(.foo) | |
Input | [{"foo":1, "bar":14}, {"foo":2, "bar":3}] |
Output | {"foo":2, "bar":3} |
not
This function simply negates the input it receives.
not | |
Input | [0,1,2] |
Output | [true,false,false] |
recurse
The recurse function allows you to search through a recursive structure, and extract interesting data from all levels. Suppose your input represents a filesystem:
{"name": "/", "children": [
{"name": "/bin", "children": [
{"name": "/bin/ls", "children": []},
{"name": "/bin/sh", "children": []}
]},
{"name": "/home", "children": [
{"name": "/home/daan", "children": [
{"name": "/home/daan/jsq", "children": []}
]}
]}
]}
Now suppose you want to extract all of the filenames present. You need to retrieve .name, .children[].name, .children[].children[].name, and so on. You can do this with:
recurse(.children[]) | .name
recurse(.foo[]) | |
Input | {"foo":[{"foo": []}, {"foo":[{"foo":[]}]}]} |
Output | {"foo":[{"foo":[]},{"foo":[{"foo":[]}]}]} ,{"foo":[]} ,{"foo":[{"foo":[]}]} ,{"foo":[]} |
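What recurse does can be pictured as a depth-first walk that yields the current node before its descendants, which matches the order of the output above. The plain-JavaScript generator below is only an illustration of that traversal; its name and its children callback are assumptions of this sketch, not jsq's API.

```javascript
// Yield the node itself, then recurse into every child it produces.
function* recurse(node, children) {
  yield node;
  for (const child of children(node)) {
    yield* recurse(child, children);
  }
}

const fs = { name: "/", children: [
  { name: "/bin", children: [
    { name: "/bin/ls", children: [] },
    { name: "/bin/sh", children: [] }
  ] },
  { name: "/home", children: [
    { name: "/home/daan", children: [
      { name: "/home/daan/jsq", children: [] }
    ] }
  ] }
] };

// Equivalent of: recurse(.children[]) | .name
const names = Array.from(recurse(fs, (n) => n.children), (n) => n.name);
console.log(names);
// [ '/', '/bin', '/bin/ls', '/bin/sh', '/home', '/home/daan', '/home/daan/jsq' ]
```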
format(string)
Much like a simplified sprintf, format accepts an array of values to be integrated into a custom string.
Acknowledgements & thanks