Importing Readmill highlights into Readwise using jq
Readmill was a wonderful app for reading ebooks. Unfortunately it was short-lived and shut down after being acquired by Dropbox in 2014. At the time, I was frustrated that my preferred reading platform was being taken away, and I begrudgingly returned to Kindle, where I still do most of my reading.
Readmill was shut down in style though. The team did an amazing job of allowing their users to export all of their reading data in open and accessible formats (json, txt and xml). They even generated a Reading Journal for each departing user (in ePub, PDF and HTML formats) that contained a list of their books, as well as all of their highlighted passages and notes.
Readwise.io is an unrelated product that collates all of your reading highlights and makes them more useful. It ingests highlights from all types of sources (Kindle, Instapaper, Matter, Pocket and even paper books).
This post explains how to process the reading-data.json file exported from Readmill into a CSV of highlights that you can then import into Readwise.io. This is possible by using a command-line tool called jq.
TLDR: Here is the command I cobbled together. You can run it in the directory containing your reading-data.json file and it will output a highlights.csv file. You can then upload this CSV to Readwise.io to import all of your old Readmill highlights.
cat reading-data.json | jq -r ' ["Highlight", "Title", "Author", "Note", "Location", "Date"], ( .readings[] | .book.title as $title | .book.author as $author | .highlights[] | {highlight: .content, location: (.position * 100 | round), date: (.highlighted_at | gsub("[TZ]"; " ")), title: $title, author: $author} | [.highlight, .title, .author, .note, .location, .date]) | @csv ' > highlights.csv
Before you do this, please see the limitations below.
jq
To run the command, you’ll first need to install jq. On a Mac, you can do this using Homebrew with the following command:
brew install jq
jq is a command-line tool created by Stephen Dolan that processes json data and outputs it as json or a variety of other formats.
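Before running the full command, it can help to confirm that jq is installed and behaving as expected. The snippet below is a minimal check; the inline json document is made up for illustration.

```shell
# Confirm jq is installed and on the PATH
jq --version

# Pull a single field out of a tiny inline json document (made-up data)
echo '{"book": {"title": "Walden"}}' | jq -r '.book.title'
```

The second command should print the book title on its own line.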
The jq command
Here is a breakdown of the jq command I put together.
Warning: I am very new to using jq. What follows is almost certainly not best practice.
Piping json data into a CSV
This part of the command pipes the reading-data.json file into jq, processes it according to the instructions (here replaced with ...) and saves the result into the highlights.csv file.
cat reading-data.json | jq -r ' ... ' > highlights.csv
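The -r flag is what makes the output usable as a CSV: without it, jq prints each line as a json string, wrapped in quotes and with the inner quotes escaped. The sketch below shows the difference on a one-record sample. The file names sample.json and sample.csv and their contents are made up for illustration. It also passes the file name to jq as an argument, which is an alternative to piping it in with cat.

```shell
# Made-up one-record sample, not real Readmill data
printf '%s\n' '{"readings": [{"book": {"title": "Walden"}}]}' > sample.json

# Without -r: the CSV line is printed as a json string with escaped quotes
jq '.readings[] | [.book.title] | @csv' sample.json

# With -r: the raw CSV line is printed, ready to be saved to a file
jq -r '.readings[] | [.book.title] | @csv' sample.json > sample.csv
cat sample.csv
```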
Setting the CSV column headers
The instructions begin by listing the CSV column headers as an array: "Highlight", "Title", etc. This array becomes the first line of the CSV. These are the column headers that Readwise.io specifies for CSV imports.
' ["Highlight", "Title", "Author", "Note", "Location", "Date"], ... | @csv '
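To see how the header array and the data rows end up in the same CSV, here is a small sketch with a made-up row. The comma emits the two arrays one after the other, and @csv turns each array into one line. The -n flag tells jq not to read any input, since both arrays are written out literally.

```shell
# Emit a header row and one made-up data row, each formatted as a CSV line
jq -n -r '["Highlight", "Note", "Location"], ["Simplify, simplify.", null, 26] | @csv'
```

Strings are wrapped in double quotes (so the comma inside the highlight is safe), numbers are left bare, and null becomes an empty field.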
Processing instructions for json
The next part is wrapped in parentheses and can be thought of as a self-contained function that ingests the reading-data.json file and outputs one array per highlight, containing the highlight and its associated metadata.
' ( .readings[] | .book.title as $title | .book.author as $author | .highlights[] | {highlight: .content, location: (.position * 100 | round), date: (.highlighted_at | gsub("[TZ]"; " ")), title: $title, author: $author} | [.highlight, .title, .author, .note, .location, .date]) '
Individual jq instructions are separated by a pipe (|) character. jq will “pipe” the output from one instruction to the input of the next in sequence. Below is an annotated version of the above instructions, split at each pipe:
.readings[]
# For each reading in reading-data.json…
.book.title as $title
# Set a variable $title as the book’s title
.book.author as $author
# Set a variable $author as the book’s author
.highlights[]
# For each of the book’s highlights…
{highlight: .content, location: (.position * 100 | round), date: (.highlighted_at | gsub("[TZ]"; " ")), title: $title, author: $author}
# Construct a new json object with the following keys: highlight (the content that was highlighted), location (as a percentage of progress through the book), date (when the highlight was made), title and author
[.highlight, .title, .author, .note, .location, .date]
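# Construct an array of metadata for each highlight, in the same order as the column headers. The object built in the previous step has no note key, so .note is null and the Note column is left empty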
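To watch the whole chain run, here is a sketch against a small sample file. The file name sample-reading-data.json and the book and highlight values are invented for illustration; only the field names are taken from the command above. The -c flag prints each resulting array on one line.

```shell
# Made-up sample that mimics the fields the command relies on
cat > sample-reading-data.json <<'EOF'
{
  "readings": [
    {
      "book": {"title": "Walden", "author": "Henry David Thoreau"},
      "highlights": [
        {"content": "Simplify, simplify.", "position": 0.256, "highlighted_at": "2013-06-01T09:30:00Z"},
        {"content": "Give me truth.", "position": 0.914, "highlighted_at": "2013-06-08T21:05:00Z"}
      ]
    }
  ]
}
EOF

# Same steps as above, one per line, without the final @csv
jq -c '
  .readings[]
  | .book.title as $title
  | .book.author as $author
  | .highlights[]
  | {highlight: .content, location: (.position * 100 | round), date: (.highlighted_at | gsub("[TZ]"; " ")), title: $title, author: $author}
  | [.highlight, .title, .author, .note, .location, .date]
' sample-reading-data.json
```

The first line of output is ["Simplify, simplify.","Walden","Henry David Thoreau",null,26,"2013-06-01 09:30:00 "]. The variables are needed because once .highlights[] runs, jq is looking at a single highlight and can no longer reach the book's title or author. Note also that replacing the trailing Z with a space leaves a space at the end of the date.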
Limitations
- I was unable to export comments from Readmill’s reading-data.json file while also exporting highlights that lacked comments. I am sure it is possible, but I only had a handful of Readmill comments, so I didn’t spend a lot of time trying to make it work.
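One possible way around this is jq's alternative operator (//), which substitutes a fallback when a value is missing. The sketch below is untested against a real Readmill export. It assumes each highlight may carry a comments array whose entries have a content field; that structure, the file name sample-comments.json and the sample values are all guesses, so check them against your own reading-data.json first.

```shell
# Made-up sample; the "comments" structure is an assumption, not the confirmed Readmill schema
cat > sample-comments.json <<'EOF'
{"highlights": [
  {"content": "First passage", "comments": [{"content": "My note"}]},
  {"content": "Second passage"}
]}
EOF

# Fall back to an empty array when a highlight has no comments,
# then join any comment texts into a single Note field
jq -r '
  .highlights[]
  | [.content, ((.comments // []) | map(.content) | join(" / "))]
  | @csv
' sample-comments.json
```

With this sample, the highlight without comments still produces a row, with an empty string in the Note position.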
Further reading
- The jq project page and manual
- An introduction to jq by Adam Gordon Bell
- A jq tutorial that you complete on the command line by RJ Zaworski
- A guide to shutting down failing products, partly inspired by Readmill’s graceful exit
Gratitude
- Thank you to Stephen Dolan for making jq
- Thank you to the Readmill team for making sure your users could export their data in a future-proof format. Seven years after you shut down, I can still make use of my Readmill data.