tidy.js for R

Install tidyjs from GitHub:

# install.packages("remotes")
remotes::install_github("gadenbuie/tidyjs-r")

To use tidy.js in your R Markdown documents, call use_tidyjs(). The tidy functions are then available from the Tidy object, either in your Shiny app or inside a JavaScript (js) chunk of your R Markdown document:

```{r echo=FALSE}
tidyjs::use_tidyjs()
```

```{js}
const { tidy, mutate, arrange, desc } = Tidy;

const data = [
  { a: 1, b: 10 }, 
  { a: 3, b: 12 }, 
  { a: 2, b: 10 }
]

const results = tidy(
  data, 
  mutate({ ab: d => d.a * d.b }),
  arrange(desc('ab'))
)
```

The examples below actually run in your browser. The result of each executed JavaScript chunk is shown directly below that chunk, marked with a gray left border.

console.log('Ready to tidy?')


tidy.js

Tidy up your data with JavaScript! Inspired by dplyr and the tidyverse, tidy.js attempts to bring the ergonomics of data manipulation from R to JavaScript (and TypeScript). The primary goals of the project are:

Secondarily, this project aims to provide acceptable TypeScript types for its functions.

Be sure to check out a very similar project, Arquero, from UW Data.

Getting started

const { tidy, mutate, arrange, desc } = Tidy

window.data = [
  { a: 1, b: 10 }, 
  { a: 3, b: 12 }, 
  { a: 2, b: 10 }
]

const results = tidy(
  data, 
  mutate({ ab: d => d.a * d.b }),
  arrange(desc('ab'))
)

results
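For comparison, the flow above computes roughly what the following plain JavaScript does. This is only a sketch to show the intent of each verb, not how tidy.js is implemented; the names `withAb` and `sorted` are made up for illustration.

```javascript
// Plain-JavaScript comparison for the flow above (illustrative only;
// this is not how tidy.js is implemented internally).
const data = [
  { a: 1, b: 10 },
  { a: 3, b: 12 },
  { a: 2, b: 10 }
];

// mutate({ ab: d => d.a * d.b }): add a derived column to a copy of each row
const withAb = data.map(d => ({ ...d, ab: d.a * d.b }));

// arrange(desc('ab')): sort a copy of the rows by ab, largest first
const sorted = [...withAb].sort((x, y) => y.ab - x.ab);

console.log(sorted);
// [ { a: 3, b: 12, ab: 36 }, { a: 2, b: 10, ab: 20 }, { a: 1, b: 10, ab: 10 } ]
```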

All tidy.js code is wrapped in a tidy flow via the tidy() function. The first argument is the array of data, followed by the transformation verbs to run on the data. The actual functions passed to tidy() can be anything so long as they fit the form:

(items: object[]) => object[]

For example, the following is valid:

tidy(
  data, 
  items => items.filter((d, i) => i % 2 === 0),
  arrange(desc('value'))
)

All tidy verbs fit this style, with the exception of exports from groupBy, discussed below.
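Because a verb is just a function from an array of objects to an array of objects, you can write your own. The sketch below defines two hypothetical custom verbs, `everyOther` and `takeFirst(n)`. `runFlow` is a made-up stand-in for tidy() so the snippet runs without the library, on the assumption that tidy() applies its functions to the data in the order given.

```javascript
// Hypothetical custom verbs: each takes an array of objects and returns one.
const everyOther = items => items.filter((d, i) => i % 2 === 0);

// A verb factory: returns a verb configured with n.
const takeFirst = n => items => items.slice(0, n);

// Stand-in for tidy() (an assumption for illustration, not the library code):
// pass the data through each function in order.
const runFlow = (items, ...fns) => fns.reduce((acc, fn) => fn(acc), items);

const rows = [{ id: 1 }, { id: 2 }, { id: 3 }, { id: 4 }, { id: 5 }];
const picked = runFlow(rows, everyOther, takeFirst(2));

console.log(picked); // [ { id: 1 }, { id: 3 } ]
```

With the library loaded, the same two functions should be usable directly in a flow, for example tidy(rows, everyOther, takeFirst(2)).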

Grouping data with groupBy

Besides manipulating flat lists of data, tidy provides facilities for wrangling grouped data via the groupBy() function.

const { tidy, summarize, sum, groupBy } = Tidy

const data = [
  { key: 'group1', value: 10 }, 
  { key: 'group2', value: 9 }, 
  { key: 'group1', value: 7 }
]

tidy(
  data,
  groupBy('key', [
    summarize({ total: sum('value') })
  ])
)

The output is:

[
  { "key": "group1", "total": 17 },
  { "key": "group2", "total": 9 }
]
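As a rough mental model, grouping partitions the rows by key, runs the flow on each partition, and stitches the results back together. The sketch below reproduces the output above in plain JavaScript. `groupBySketch` and `sumValue` are hypothetical helpers written for this illustration; they do not reflect how tidy.js implements groupBy().

```javascript
// Conceptual stand-in for groupBy(key, flow); NOT the tidy.js implementation.
function groupBySketch(items, key, flow) {
  // 1. Partition rows by the value of `key`, keeping first-seen order.
  const groups = new Map();
  for (const item of items) {
    const k = item[key];
    if (!groups.has(k)) groups.set(k, []);
    groups.get(k).push(item);
  }
  // 2. Run the flow on each partition and re-attach the group key.
  const out = [];
  for (const [k, groupItems] of groups) {
    const result = flow.reduce((acc, fn) => fn(acc), groupItems);
    for (const row of result) out.push({ [key]: k, ...row });
  }
  return out;
}

// Hypothetical stand-in for summarize({ total: sum('value') }).
const sumValue = items => [{ total: items.reduce((s, d) => s + d.value, 0) }];

const data = [
  { key: 'group1', value: 10 },
  { key: 'group2', value: 9 },
  { key: 'group1', value: 7 }
];

const totals = groupBySketch(data, 'key', [sumValue]);
console.log(totals);
// [ { key: 'group1', total: 17 }, { key: 'group2', total: 9 } ]
```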

The groupBy() function works similarly to tidy() in that it takes a flow of functions as its second argument (wrapped in an array). Things get really fun when you use groupBy’s third argument for exporting the grouped data into different shapes.

For example, to export the data as a nested object, pass groupBy.object() as the third argument to groupBy().

const { tidy, mutate, groupBy } = Tidy

window.data = [
  { g: 'a', h: 'x', value: 5 },
  { g: 'a', h: 'y', value: 15 },
  { g: 'b', h: 'x', value: 10 },
  { g: 'b', h: 'x', value: 20 },
  { g: 'b', h: 'y', value: 30 },
]

tidy(
  data,
  groupBy(
    ['g', 'h'], 
    [
      mutate({ key: d => `${d.g}${d.h}`})
    ], 
    groupBy.object() // <-- specify the export
  )
);

The manually formatted output is:

{
  "a": {
    "x": [{"g": "a", "h": "x", "value": 5, "key": "ax"}],
    "y": [{"g": "a", "h": "y", "value": 15, "key": "ay"}]
  },
  "b": {
    "x": [
      {"g": "b", "h": "x", "value": 10, "key": "bx"},
      {"g": "b", "h": "x", "value": 20, "key": "bx"}
    ],
    "y": [{"g": "b", "h": "y", "value": 30, "key": "by"}]
  }
}
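To make the nested shape concrete, here is a plain-JavaScript sketch that builds the same structure by hand. `nestByKeys` is a hypothetical helper for illustration only; it is not part of tidy.js and says nothing about how groupBy.object() works internally.

```javascript
// Hypothetical helper: nest rows into objects, one level per key,
// with arrays of rows at the leaves. Assumes at least one key.
function nestByKeys(items, keys) {
  const root = {};
  for (const item of items) {
    let node = root;
    keys.forEach((key, depth) => {
      const k = item[key];
      const isLeaf = depth === keys.length - 1;
      if (!Object.prototype.hasOwnProperty.call(node, k)) {
        node[k] = isLeaf ? [] : {};
      }
      node = node[k];
    });
    node.push(item); // node is now the leaf array for this row
  }
  return root;
}

const rows = [
  { g: 'a', h: 'x', value: 5 },
  { g: 'a', h: 'y', value: 15 },
  { g: 'b', h: 'x', value: 10 },
  { g: 'b', h: 'x', value: 20 },
  { g: 'b', h: 'y', value: 30 },
].map(d => ({ ...d, key: `${d.g}${d.h}` })); // same derived key as the mutate step

const nested = nestByKeys(rows, ['g', 'h']);
console.log(nested.b.x.length); // 2
console.log(nested.a.y[0].key); // 'ay'
```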

Alternatively, export the data as { key, values } entries-objects via groupBy.entriesObject():

const { tidy, mutate, groupBy } = Tidy

tidy(data,
  groupBy(
    ['g', 'h'], 
    [
      mutate({ key: d => `${d.g}${d.h}`})
    ], 
    groupBy.entriesObject() // <-- specify the export
  )
);

The manually formatted output is:

[
  {
    "key": "a",
    "values": [
      {"key": "x", "values": [{"g": "a", "h": "x", "value": 5, "key": "ax"}]},
      {"key": "y", "values": [{"g": "a", "h": "y", "value": 15, "key": "ay"}]}
    ]
  },
  {
    "key": "b",
    "values": [
      {
        "key": "x",
        "values": [
          {"g": "b", "h": "x", "value": 10, "key": "bx"},
          {"g": "b", "h": "x", "value": 20, "key": "bx"}
        ]
      },
      {"key": "y", "values": [{"g": "b", "h": "y", "value": 30, "key": "by"}]}
    ]
  }
]
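The entries-object shape is convenient to walk recursively. The sketch below uses a hand-written sample in the same { key, values } shape (with the rows trimmed to a single field) and a hypothetical `leafCounts` helper to list how many rows sit under each key path.

```javascript
// Hand-written sample in the { key, values } shape shown above (rows trimmed).
const entries = [
  { key: 'a', values: [
    { key: 'x', values: [{ value: 5 }] },
    { key: 'y', values: [{ value: 15 }] },
  ] },
  { key: 'b', values: [
    { key: 'x', values: [{ value: 10 }, { value: 20 }] },
    { key: 'y', values: [{ value: 30 }] },
  ] },
];

// Hypothetical helper: descend `depth` levels of entries, then report
// how many rows each leaf group holds.
function leafCounts(nodes, depth, path = []) {
  if (depth === 0) return [{ path: path.join('/'), count: nodes.length }];
  return nodes.flatMap(n => leafCounts(n.values, depth - 1, [...path, n.key]));
}

const counts = leafCounts(entries, 2);
console.log(counts);
// [ { path: 'a/x', count: 1 }, { path: 'a/y', count: 1 },
//   { path: 'b/x', count: 2 }, { path: 'b/y', count: 1 } ]
```

The depth is passed explicitly because the rows in the output above also carry a key field, so the presence of key alone does not distinguish a group entry from a row.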

It’s common to be left with a single leaf in a groupBy set, especially after running summarize(). To keep the exported values from being wrapped in an array, pass the single option to the export function:

const { tidy, summarize, sum, groupBy } = Tidy

tidy(data,
  groupBy(['g', 'h'], [
    summarize({ total: sum('value') })
  ], groupBy.object({ single: true }))
);

The manually formatted output is:

{
  "a": {
    "x": {"total": 5, "g": "a", "h": "x"},
    "y": {"total": 15, "g": "a", "h": "y"}
  },
  "b": {
    "x": {"total": 30, "g": "b", "h": "x"},
    "y": {"total": 30, "g": "b", "h": "y"}
  }
}
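The practical difference shows up when reading values back out. The sketch below does not call the library; it uses hand-written literals, one copied from the output above and one in the array-wrapped shape that the text describes for exports without the single option (an assumed shape, written out here for comparison).

```javascript
// Shape with { single: true }, copied from the output above.
const single = {
  a: { x: { total: 5, g: 'a', h: 'x' }, y: { total: 15, g: 'a', h: 'y' } },
  b: { x: { total: 30, g: 'b', h: 'x' }, y: { total: 30, g: 'b', h: 'y' } },
};

// Assumed shape without the option: each leaf wrapped in a one-element array.
const wrapped = {
  a: { x: [single.a.x], y: [single.a.y] },
  b: { x: [single.b.x], y: [single.b.y] },
};

console.log(single.b.x.total);     // 30
console.log(wrapped.b.x[0].total); // 30, but needs the extra [0]
```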

Visit the API reference docs to learn more about how each function works and all the options they take. Be sure to check out the levels export, which can let you mix-and-match different export types based on the depth of the data. For quick reference, other available groupBy exports include:


Shout out to Netflix

I want to give a big shout out to Netflix, my current employer, for giving me the opportunity to work on this project and to open source it. It’s a great place to work and if you enjoy tinkering with data-related things, I’d strongly recommend checking out our analytics department. – Peter Beshai