Importing Readmill highlights into Readwise using `jq`

Readmill was a wonderful app for reading ebooks. Unfortunately it was short-lived and shut down after being acquired by Dropbox in 2014. At the time, I was frustrated that my preferred reading platform was being taken away, and I begrudgingly returned to Kindle, where I still do most of my reading.

Readmill was shut down in style though. The team did an amazing job of allowing their users to export all of their reading data in open and accessible formats (json, txt and xml). They even generated a Reading Journal for each departing user (in ePub, PDF and HTML formats) that contained a list of their books, as well as all of their highlighted passages and notes.

Readwise.io is an unrelated product that collates all of your reading highlights and makes them more useful. It ingests highlights from all types of sources (Kindle, Instapaper, Matter, Pocket and even paper books).

This post explains how to process the reading-data.json file exported from Readmill into a CSV of highlights that you can then import into Readwise.io. This is possible by using a command-line tool called jq.

TLDR: Here is the command I cobbled together. You can run it in the directory containing your reading-data.json file and it will output a highlights.csv file. You can then upload this CSV to Readwise.io to import all of your old Readmill highlights.

cat reading-data.json | jq -r ' ["Highlight", "Title", "Author", "Note", "Location", "Date"], ( .readings[] | .book.title as $title | .book.author as $author | .highlights[] | {highlight: .content, location: (.position * 100 | round), date: (.highlighted_at | gsub("[TZ]"; " ")), title: $title, author: $author} | [.highlight, .title, .author, .note, .location, .date]) | @csv ' > highlights.csv

Before you do this, please see the limitations below.

`jq`

To run the command, you’ll first need to install jq. On a Mac, you can do this using Homebrew with the following command:

brew install jq

jq is a command-line tool created by Stephen Dolan that processes json data and outputs it as json or a variety of other formats.

The `jq` command

Here is a breakdown of the jq command I put together.

Warning: I am very new to using jq. What follows is almost certainly not best practice.

Piping json data into a CSV

This part of the command is piping the reading-data.json file into jq, processing it according to the instructions (here replaced with ...) and saving it into the highlights.csv file.

cat reading-data.json | jq -r ' ... ' > highlights.csv
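The -r flag matters here. A minimal sketch of the difference, using a made-up two-item array rather than real Readmill data (the variable names are only for illustration):

```shell
# Without -r, jq prints each CSV line as a JSON string, with the quotes escaped.
with_json_quoting=$(jq -n '["a", "b"] | @csv')

# With -r (--raw-output), the same line is written as plain text.
with_raw_output=$(jq -n -r '["a", "b"] | @csv')

printf '%s\n' "$with_json_quoting"
printf '%s\n' "$with_raw_output"
```

The first line printed is a quoted JSON string, which is not what a CSV importer expects. The second is the bare CSV row. The -n flag just tells jq not to read any input, which is handy for small experiments like this one.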

Setting the CSV column headers

The instructions begin by listing the CSV column headers as an array: "Highlight", "Title"… etc. This array becomes the first line of the CSV. These are the column headers that Readwise.io specifies for CSV imports.

' ["Highlight", "Title", "Author", "Note", "Location", "Date"], ... | @csv '
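To see why the header array and the data rows can share a single @csv, here is a small sketch with an invented row. In jq the comma binds more tightly than the pipe, so both arrays are sent through @csv one after the other. The file name headers-demo.csv is just a placeholder.

```shell
# Two arrays in, two CSV lines out. The sample row is made up for illustration.
jq -n -r '
  ["Highlight", "Title"],
  ["He said \"hello\"", "Example Book"]
  | @csv
' > headers-demo.csv

cat headers-demo.csv
```

Each array becomes one line of the file. @csv wraps strings in double quotes and escapes any quote inside a string by doubling it, which is the usual CSV convention.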

Processing instructions for json

The next part is wrapped in parentheses and can be thought of as a self-contained function that ingests the reading-data.json file and outputs one array per highlight, containing the highlight and its associated metadata.

' ( .readings[] | .book.title as $title | .book.author as $author | .highlights[] | {highlight: .content, location: (.position * 100 | round), date: (.highlighted_at | gsub("[TZ]"; " ")), title: $title, author: $author} | [.highlight, .title, .author, .note, .location, .date]) '

Individual jq instructions are separated by a pipe (|) character. jq will “pipe” the output from one instruction to the input of the next in sequence. Below is an annotated version of the above instructions, split at each pipe:

.readings[] 
# For each reading in reading-data.json…

.book.title as $title 
# Set a variable $title as the book’s title

.book.author as $author 
# Set a variable $author as the book’s author

.highlights[]
# For each of the book’s highlights…

{highlight: .content, location: (.position * 100 | round), date: (.highlighted_at | gsub("[TZ]"; " ")), title: $title, author: $author}
# Construct a new json object with the following keys: highlight (the content that was highlighted), location (as a percentage of progress through the book), date (when the highlight was made), title and author  

[.highlight, .title, .author, .note, .location, .date] 
# Construct an array of metadata for each highlight (the object above has no note key, so .note is null and the Note column is left empty)
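If you want to try the instructions before running them on your real export, something like the following should work. The sample file below is invented: it only mirrors the field names the command reads (readings, book.title, book.author, highlights, content, position, highlighted_at), and a real reading-data.json will contain more than this. The file names are placeholders.

```shell
# Made-up sample in the shape the command expects (all values are invented).
cat > sample-reading-data.json <<'EOF'
{
  "readings": [
    {
      "book": {"title": "Example Book", "author": "A. Writer"},
      "highlights": [
        {"content": "First passage", "position": 0.254, "highlighted_at": "2013-05-01T10:20:30Z"},
        {"content": "Second passage", "position": 0.9, "highlighted_at": "2013-05-02T08:00:00Z"}
      ]
    }
  ]
}
EOF

# Same instructions as above, split at each pipe; jq reads the file directly.
jq -r '
  ["Highlight", "Title", "Author", "Note", "Location", "Date"],
  ( .readings[]
    | .book.title as $title
    | .book.author as $author
    | .highlights[]
    | {highlight: .content, location: (.position * 100 | round), date: (.highlighted_at | gsub("[TZ]"; " ")), title: $title, author: $author}
    | [.highlight, .title, .author, .note, .location, .date])
  | @csv
' sample-reading-data.json > sample-highlights.csv

cat sample-highlights.csv
```

This should give a header line followed by one row per highlight. The Note column is empty, and the position 0.254 becomes a location of 25. Because gsub swaps both the T and the trailing Z for a space, each date ends with a space.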

Limitations

I was unable to export comments from Readmill’s reading-data.json file while also exporting highlights that lacked comments. I am sure it is possible, but I only had a handful of Readmill comments, so I didn’t spend a lot of time trying to make it work.
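One possible approach, sketched here under a big assumption: that each highlight may carry a comments array whose items have a content field. That shape is hypothetical and not checked against a real Readmill export, so look at your own reading-data.json first. The idea is to use jq's alternative operator (//) to fall back to an empty array when a highlight has no comments, so the highlight is still emitted.

```shell
# ASSUMPTION: highlights may have a "comments" array of objects with a
# "content" field. This shape is hypothetical; check your own export.
cat > sample-comments.json <<'EOF'
{"readings": [{"book": {"title": "Example Book", "author": "A. Writer"},
  "highlights": [
    {"content": "With a note", "comments": [{"content": "My comment"}]},
    {"content": "Without a note"}
  ]}]}
EOF

# (.comments // []) yields an empty array when the key is missing, so
# highlights without comments still produce a row with an empty note.
jq -r '
  .readings[]
  | .book.title as $title
  | .highlights[]
  | [.content, $title, ((.comments // []) | map(.content) | join(" / "))]
  | @csv
' sample-comments.json > sample-notes.csv

cat sample-notes.csv
```

With this sample, both highlights appear in the output: the first with its comment in the third column, the second with an empty string there. Multiple comments on one highlight would be joined with " / ".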

Gratitude

Thank you to Stephen Dolan for making jq.
Thank you to the Readmill team for making sure your users could export their data in a future-proof format. 7 years after you shut down, I can still make use of my Readmill data.