A nushell script for the Seachess bulletin
This note captures my approach to interacting with my database of articles I've read and shared in my weekly bulletin.
# Overview
Back in 2020 I started a weekly bulletin where I share a few links that I found interesting. The original data structure was a mixture of CSV and TOML and the way to interact with it was with a command-line tool I built specifically for this purpose.
In 2023 I adopted Nushell as my interactive shell of choice. After a while I found it a much better experience than my handmade command-line tool: it lets me work not only with my original CSV and TOML files but also with a larger set of data I have stored in a variety of formats.
This note describes the simple data model and the Nushell scripts I use to record my readings and the process of compiling a bulletin issue.
# Data model
The fundamental concepts are:
- The trail. A set of trail entries capturing what I've read and when.
- The stash. A set of bulletin entries I flagged as candidates for the bulletin.
- The bulletin. A set of bulletin issues I have published.
# The trail entry
- date. The day I read the resource.
- url. The URL for the resource. Acts as the primary identifier.
- title. The name of the resource. Title of an article, paper or book, name of a tool, etc.
- summary. The description of the resource. Typically my own take on the resource.
- source. The place where I found the resource.
- tags. A set of tags to classify the resource.
An example:
key | value |
---|---|
date | 2020-01-23 |
url | https://github.com/flamegraph-rs/flamegraph |
title | Flamegraph |
summary | A tool to profile a running process and generate a flamegraph in SVG for the result. |
tags | [programming_language/rust, topic/tool] |
source |
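As a Nushell value, the example above could be written as a record literal. This is only a sketch: the field types (a date literal for `date`, an empty string for the missing `source`) are assumptions, not something the stored data guarantees.

```nushell
# Sketch of the example trail entry as a Nushell record.
# Assumptions: `date` as a date literal, `source` as an empty string.
let entry = {
    date: 2020-01-23
    url: "https://github.com/flamegraph-rs/flamegraph"
    title: "Flamegraph"
    summary: "A tool to profile a running process and generate a flamegraph in SVG for the result."
    tags: ["programming_language/rust", "topic/tool"]
    source: ""
}

# Fields are reached with cell paths, for example the number of tags.
$entry.tags | length
```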
# The bulletin issue
- id. The ISO week the bulletin issue was published on.
- publication_date. The date the bulletin issue was published on.
- summary. A short description of what the entries are about.
- type. The type of record. Always `bulletin`.
- entries. The list of entries for the bulletin issue.
A bulletin issue entry has a slightly different shape from a trail entry:
- content type. Whether the resource is text, video or pdf.
- url. The URL for the resource. Acts as the primary identifier.
- title. The name of the resource. Title of an article, paper or book, name of a tool, etc.
- summary. The description of the resource. Can be different from the trail entry, for example adding a bit of commentary on top of the description.
An example:
key | value |
---|---|
id | 2023-W01 |
publication_date | 2023-01-08 |
summary | This week has been about reactive libraries mechanics, graph […] |
type | bulletin |
entries | […] |
# Data storage
The data storage has evolved over time and as a consequence there is a variety of formats in the mix. The original idea was to store data in a text format such that Git could easily track changes. The data then would be imported and normalised into a SQLite database and queried with SQL. Currently, the picture looks like:
- The trail is stored as an append-only CSV (one file per year) where the tags column is encoded as a JSON array.
- The bulletin is stored as an append-only jsonl (JSON Lines), where each line is a single bulletin issue.
- The stash is stored as TOML.
- The sources are stored as a single CSV.
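Each of these text formats can be decoded with built-in Nushell commands. The sketch below uses inline strings instead of real files. The sample rows, the column selection and the `entries` key in the TOML are assumptions made for illustration.

```nushell
# Trail: CSV whose `tags` column holds a JSON array (sample row is illustrative).
let trail_csv = 'date,url,title,tags
2020-01-23,https://github.com/flamegraph-rs/flamegraph,Flamegraph,"[""programming_language/rust"",""topic/tool""]"'

# Parse the CSV, then decode the JSON-encoded tags into a proper list.
let trail = (
    $trail_csv
    | from csv
    | update tags {|row| $row.tags | from json }
)

# Bulletin: JSON Lines, one bulletin issue per line.
let bulletin_jsonl = '{"id": "2023-W01", "publication_date": "2023-01-08", "type": "bulletin"}'
let bulletin = ($bulletin_jsonl | lines | each {|line| $line | from json })

# Stash: TOML. The `entries` key is a hypothetical layout.
let stash_toml = '[[entries]]
url = "https://example.org/article"
title = "An article"'
let stash = ($stash_toml | from toml)
```

Once decoded, all three are ordinary Nushell tables and records, so the same commands (`where`, `select`, `join`) apply regardless of the format on disk.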
# Querying the data
Nushell makes querying data in disparate formats straightforward, but it can be a bit verbose. The main commands are about listing things (see Appendix A for the full implementation). For example, `trail list` is implemented as:
# sea/trail.nu
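A minimal sketch of what such a listing command could look like, assuming per-year CSV files inside a `trail` directory. The directory name and the `dir` parameter are hypothetical, and this is not the implementation from Appendix A.

```nushell
# Hypothetical sketch, not the author's implementation.
# Lists every trail entry found in the per-year CSV files under `dir`.
def "trail list" [dir: string = "trail"] {
    glob ($dir | path join "*.csv")
    | sort
    | each {|file| open $file }
    | flatten
    | update tags {|row| $row.tags | from json }
}
```

Because the command returns a plain table, it can be piped straight into other commands, for example `trail list | where source == "some source"`.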
Having a `{resource-type} list` command for every resource makes queries straightforward to compose. For example, one of the queries I used to build the Bulletin report (2023) looks like:
|
|
|
| { $in. | | }
|
|
Which results in:
year | published_sources |
---|---|
2020 | 20 |
2021 | 22 |
2022 | 22 |
2023 | 17 |
2024 | 7 |
If you squint at the query above, you'll likely see its resemblance to the following SQL:
select
year
, count(source)
from (
select distinct
bulletin.year
, trail.source
from
bulletin
join
trail using(url)
)
group by
year
order by
year;
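For comparison, a Nushell pipeline with the same shape as this SQL can be sketched as follows. It is not the query used for the report: the two tables are tiny made-up samples, and the bulletin is assumed to be already flattened to one row per entry with a `year` column.

```nushell
# Illustrative data only; URLs, years and sources are made up.
let bulletin = [
    {year: 2020, url: "https://example.org/a"}
    {year: 2020, url: "https://example.org/b"}
    {year: 2020, url: "https://example.org/c"}
    {year: 2021, url: "https://example.org/d"}
]
let trail = [
    {url: "https://example.org/a", source: "source-one"}
    {url: "https://example.org/b", source: "source-one"}
    {url: "https://example.org/c", source: "source-two"}
    {url: "https://example.org/d", source: "source-two"}
]

let report = (
    $bulletin
    | join $trail url                  # join ... using(url)
    | select year source
    | uniq                             # select distinct year, source
    | get year
    | uniq --count                     # group by year, count(source)
    | rename year published_sources
    | sort-by year                     # order by year
)
```

With this sample data, 2020 ends up with two distinct sources and 2021 with one.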
# Weekly actions
Each week there are a few common actions that help me build the next bulletin issue. The most common by far is adding a new entry to the trail. In an interactive shell I flow through the following:
# New entry record. It's an empty structure with the exception of the date.
mut entry =
# Add the URL
$entry. =
# Add the title
$entry. =
# Add the summary
$entry. =
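Spelled out with concrete values, these steps could look like the sketch below. The field names follow the trail entry described earlier; the initial record and the `%Y-%m-%d` date format are assumptions.

```nushell
# Hypothetical sketch: an empty structure with only today's date filled in.
mut entry = {
    date: (date now | format date "%Y-%m-%d")
    url: ""
    title: ""
    summary: ""
    source: ""
    tags: []
}

# Fill the fields one assignment at a time.
$entry.url = "https://github.com/flamegraph-rs/flamegraph"
$entry.title = "Flamegraph"
$entry.summary = "A tool to profile a running process and generate a flamegraph in SVG for the result."
```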
This part is as rudimentary as it gets: plain Nushell assignments against a record I could've created by hand. The next two bits are more interactive, though:
# Prompts a fuzzy search over the list of sources.
$entry =
# Prompts a fuzzy search over the list of previously used tags.
$entry =
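Prompts like these can be built on Nushell's `input list` command. The helpers below are a sketch under that assumption; the names `pick-source` and `pick-tags` are hypothetical, and the tag picker uses a multi-select prompt rather than a fuzzy one.

```nushell
# Hypothetical helpers; the actual commands may differ.

# Pick one source with a fuzzy-search prompt.
def pick-source [sources: list<string>] {
    $sources | input list --fuzzy "Source"
}

# Pick any number of tags from the ones used before.
def pick-tags [known_tags: list<string>] {
    $known_tags | input list --multi "Tags"
}
```

In an interactive session the result is assigned like any other value, for example `$entry.source = (pick-source $sources)`, where `$sources` is a list of source names.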
Once happy with the entry, I save it to the trail:
$entry |
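Saving has to reverse the decoding done when listing: the `tags` list goes back to a JSON array before the row is appended to the CSV. A sketch, where the command name `trail encode` and the file name in the final comment are assumptions:

```nushell
# Hypothetical sketch of turning an entry into one CSV row.
# The `tags` list is re-encoded as a JSON array, matching the trail format.
def "trail encode" [entry: record] {
    [$entry]
    | update tags {|row| $row.tags | to json --raw }
    | to csv --noheaders
}

let row = (trail encode {
    date: "2020-01-23"
    url: "https://github.com/flamegraph-rs/flamegraph"
    tags: ["topic/tool"]
})

# Appending to the yearly file would then be (file name is an assumption):
# $row | save --append trail/2020.csv
```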
And if the entry should be a candidate for the bulletin, I add it to the stash:
$entry | |
When I'm ready to publish a new bulletin issue, I do:
# Prompts the list of stash entries and lets me select the ones I need.
mut bulletin =
$bulletin |
Finally, I generate an HTML version I can paste into my newsletter management service:
$bulletin | |
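Rendering the issue to HTML can be done with string interpolation alone. The sketch below assumes the issue is a record with `summary` and `entries` fields as described in the data model; the command name `bulletin html` and the markup are hypothetical, and no HTML escaping is performed.

```nushell
# Hypothetical renderer: a summary paragraph and one list item per entry.
def "bulletin html" [issue: record] {
    let items = (
        $issue.entries
        | each {|e| $'<li><a href="($e.url)">($e.title)</a>: ($e.summary)</li>' }
        | str join "\n"
    )
    $"<p>($issue.summary)</p>\n<ul>\n($items)\n</ul>"
}
```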
# Closing thoughts
Nushell offers a convenient way to specialise an environment into a sort of DSL, which makes some command-line tools redundant. With this approach I can quickly iterate on my commands as I find new patterns or points of friction, up to and including completely changing the shape of the data.
# Appendix A
The commands used in this note are split across multiple files and imported into a single module. To use it, activate the module:
overlay use
Note that I automatically add the overlay using a conditional hook on entering the directory that has the data.
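Such a hook can be declared in `config.nu` along these lines. This is a sketch: `/path/to/bulletin-data` is a placeholder, the module file is assumed to be the `sea.nu` shown in the file structure, and the assignment replaces any `PWD` hooks already configured.

```nushell
# Sketch of a conditional PWD hook for config.nu.
# `/path/to/bulletin-data` is a placeholder for the directory holding the data.
$env.config = ($env.config | upsert hooks.env_change.PWD [
    {
        # Only fire when the new working directory is the data directory.
        condition: {|before, after| $after == "/path/to/bulletin-data" }
        # `code` is a string so that the overlay is activated in the shell itself.
        code: "overlay use sea.nu"
    }
])
```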
The file structure is as follows:
├── sea
│ ├── bulletin.nu
│ ├── sources.nu
│ ├── stash.nu
│ └── trail.nu
└── sea.nu
# sea.nu
# sea/sources.nu
# Lists available sources.
# sea/trail.nu
# avoids clashing names
# Creates a new trail entry.
# sea/stash.nu
# avoids clashing names
# Opens the bulletin stash.
# sea/bulletin.nu
# avoids clashing names
# use std formats * should do it but nu seems to lose context after `use sea.nu`