cfgpkg, or how to use age encryption with YAML
I have a case where I need to store a YAML file in an S3 bucket where it’s available to be copied to a server as part of an automated deployment process. The YAML file contains a few sensitive values, like API keys, so it’d be best to keep those values encrypted.
I’d prefer not to have to encrypt the entire file, because that makes it more difficult to work with and update. It would be nice if I could safely keep a local copy of the file, update it as needed, and only have to deal with encryption when I’m adding or changing a sensitive value.
I’m not aware of anything that can encrypt and decrypt just a subset of values in a YAML file, though I didn’t look too hard for one, since this seemed like a fun problem to solve. Let’s write a script.
age reached its 1.0.0 release recently, and it looks like a good tool for the job. It has a simple command line interface, and there are binaries for multiple platforms. Plus, age uses asymmetric encryption, which will let us keep a copy of the public key locally for encrypting new and updated values without having to worry about having a private key (capable of decrypting) lying around.
First, the requirements:
- Encrypt sensitive values locally and leave others alone
- Decrypt sensitive values on the server during deployment
That’s it, really. We should be able to open our .yml file in a text editor, add a key/value pair, mark it as sensitive, and then run the script to encrypt the sensitive value.
Both requirements imply the script will need a way to identify sensitive values: it needs to know what to encrypt and what to decrypt. Since config keys aren’t likely to include an exclamation mark, that seems like a good way to mark sensitive keys. Let’s do it like this: API_KEY!. And let’s surround encrypted values in an age “tag” like this: <age>c2VjcmV0</age>.
It’s probably easiest to visualize what we’re going for, so let’s start with a simple YAML file at each state we want to handle: (1) the original file with a cleartext sensitive value, (2) the encrypted version that will be stored where a deployment script can get it, and (3) the final decrypted file for our application to use.
- The original file:

    APP_HOST: app.example.com
    API_KEY!: secret_key_abc123

- The encrypted file:

    APP_HOST: app.example.com
    API_KEY: <age>c2VjcmV0X2tleV9hYmMxMjMK</age>

- The decrypted file:

    APP_HOST: app.example.com
    API_KEY: secret_key_abc123

The original file and decrypted file are identical except that the decrypted file does not contain a ! character marking the sensitive key.
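Because the ! marker only ever appears in the original file, it can double as a safety check before a file is uploaded. Here is a minimal sketch of that idea; the function name has_cleartext_secrets is made up for illustration and is not part of the script built in this post.

```shell
#!/bin/bash
# Hypothetical helper (not part of cfgpkg): succeeds (exit 0) if the
# file still contains a key marked with "!", meaning at least one
# sensitive value has not been encrypted yet.
# $1 = YAML file to check
has_cleartext_secrets() {
  grep -Eq '^[# ]*[a-zA-Z0-9_]+! *:' "$1"
}
```

A deployment helper could refuse to upload any file for which this check succeeds.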
Because the script needs to run locally and on our server, let’s write it as a Bash script. Ruby would surely make for simpler code, but that would introduce a deployment dependency our app may not already have.
Let’s start writing some code. At first we can ignore validation and usage instructions and just keep it simple. We know we need to be able to tell the program what to do (encrypt or decrypt), what YAML file to operate on, and what encryption key to use, so let’s start with a simple outline.
#!/bin/bash

cmd="$1"
yaml_file="$2"

if [[ "$cmd" == "enc" ]]; then
  public_key="$3"
  # TODO: encrypt sensitive values in $yaml_file
else
  private_key_file="$3"
  # TODO: decrypt encrypted values in $yaml_file
fi
Now would be a good time to look briefly at how to interact with the age program using stdin and stdout:
# generate a key pair
$ age-keygen -o key.txt
Public key: age1abc123...
# encrypt and base64-encode a string using
# public key from age-keygen output
$ echo "secret" | age -r age1abc123... | base64
xyz789...
# base64-decode and decrypt string using
# private key from age-keygen run
$ echo "xyz789..." | base64 -d | age -d -i key.txt
secret
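One portability detail is worth knowing before building on these commands: GNU base64 wraps its output at 76 characters by default, and age ciphertext is long enough to trigger that wrapping, while the macOS version typically emits a single line. A minimal sketch of a wrapper that always produces one line follows; the name b64_oneline is an assumption for illustration only.

```shell
#!/bin/bash
# Hypothetical helper: Base64-encode stdin as one unbroken line.
# Deleting newlines is safe here because Base64 output only contains
# them as line wrapping, never as data.
b64_oneline() {
  base64 | tr -d '\n'
}
```

This matters for the approach in this post because each encrypted value has to fit on a single line next to its key.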
Next, let’s implement the encryption part. sed seems like a good tool to use for our in-place file editing.[1] We need to identify each sensitive key, remove the ! key suffix, and encrypt the value and wrap it in our <age>...</age> marker. To keep things simple we’ll assume sensitive values are on a single line with their keys. Indentation won’t matter.[2] And to avoid accidentally leaving a sensitive value on a commented-out line, let’s process those lines too.
So, let’s first find and extract sensitive keys into an array. (Note that I’m writing this on macOS, so I’ll use sed -E, which is like sed -r on Linux. We’ll address this in our script later. I’m also including extra line breaks to make the script more readable without horizontal scrolling.)
# extract keys ending with '!' from the
# YAML file, including commented-out lines
while IFS='' read -r line; do
  keys+=("$line")
done < <( \
  grep -E '^[# ]*\w+! *:' "$yaml_file" | sed -E \
    '/[a-zA-Z0-9_]+! *:/ s/^[# ]*([a-zA-Z0-9_]+)! *:.*/\1/' \
)
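To make it easier to see what this pipeline produces, here is a sketch that wraps the same grep and sed steps in a function that prints the key names instead of filling an array. The function name sensitive_keys is only for illustration.

```shell
#!/bin/bash
# Illustrative wrapper: print the name of every key marked with "!",
# one per line, including keys on commented-out lines.
# $1 = YAML file to scan
sensitive_keys() {
  grep -E '^[# ]*[a-zA-Z0-9_]+! *:' "$1" \
    | sed -E 's/^[# ]*([a-zA-Z0-9_]+)! *:.*/\1/'
}
```

For a file containing the lines "APP_HOST: app.example.com", "API_KEY!: secret", and "# DB_PASS!: old", this prints API_KEY and DB_PASS, each on its own line.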
Now let’s iterate over our $keys array, removing the ! suffix from each key and encrypting its value.
# encrypt values for extracted keys,
# and remove the '!' from the keys
for key in "${keys[@]}"; do
  # find the key and extract its value
  val="$(grep -E "^[# ]*$key! *:" "$yaml_file" \
    | head -n 1 | sed -E "s/^[# ]*$key! *: *(.*)/\1/")"

  # encrypt the value and encode as Base64; strip the line
  # breaks GNU base64 inserts so the result stays on one line
  enc="$(echo -n "$val" \
    | age -r "$public_key" | base64 | tr -d '\n')"

  # replace the "key!: val" line with "key: <age>...</age>"
  [[ -n "$enc" ]] && sed -E -i \
    "/<age>/! s%^([# ]*$key)!( *: *).*%\1\2<age>$enc</age>%" \
    "$yaml_file"
done
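The substitution at the end of that loop is the densest part, so here it is in isolation as a sketch that reads lines from stdin and takes a stand-in payload instead of calling age. The function name tag_line and the sample payload are assumptions for illustration. Note the % delimiter: Base64 output can contain / and +, but never %, so the payload cannot end the s command early.

```shell
#!/bin/bash
# Illustrative only: rewrite "KEY!: value" as "KEY: <age>PAYLOAD</age>"
# on stdin, skipping any line that already contains an <age> tag.
# $1 = key name
# $2 = stand-in payload (the real script passes Base64-encoded age output)
tag_line() {
  sed -E "/<age>/! s%^([# ]*$1)!( *: *).*%\1\2<age>$2</age>%"
}
```

Piping "API_KEY!: secret" through tag_line API_KEY Zm9v yields "API_KEY: <age>Zm9v</age>", while a line that already contains <age> passes through unchanged.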
Ok, not too bad, as shell scripts go. How about the decryption part? It’ll be similar to encryption: we just need to build our $keys array by looking for encrypted values instead of sensitive keys.
# extract keys with <age> values from the
# YAML file, including commented-out lines
# (the sed address allows spaces before the colon, matching grep)
while IFS='' read -r line; do
  keys+=("$line")
done < <( \
  grep -E '^[# ]*\w+ *: *<age>.*</age>' "$yaml_file" | sed -E \
    '/[a-zA-Z0-9_]+ *:/ s/^[# ]*([a-zA-Z0-9_]+) *:.*/\1/' \
)
Then we’ll Base64-decode and decrypt each value.
# decrypt values for extracted keys
for key in "${keys[@]}"; do
  enc="$(grep -E "^[# ]*$key *:" "$yaml_file" \
    | sed -E 's:.*<age>(.*)</age>.*:\1:')"
  val="$(base64 -d <<< "$enc" | age -d -i "$private_key_file")"
  [[ -n "$val" ]] && sed -E -i \
    "s%^([# ]*$key *: *).*%\1$val%" "$yaml_file"
done
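One thing to watch in the last sed call: the decrypted value is pasted into the replacement text, where sed treats & as "the whole match", \ as an escape character, and % as the delimiter we chose. A secret containing any of those would be written back incorrectly. Here is a sketch of an escaping helper that could be applied to the value first; the name escape_repl is hypothetical, and the script in this post does not include it.

```shell
#!/bin/bash
# Hypothetical helper: escape a string so it can be used safely as the
# replacement part of a sed "s%...%...%" command.
# Puts a backslash before every backslash, ampersand, and percent sign.
# $1 = raw value
escape_repl() {
  printf '%s' "$1" | sed -e 's/[\\&%]/\\&/g'
}
```

With this in place, the replacement would use "$(escape_repl "$val")" instead of "$val". Values containing a newline would still need separate handling.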
Before putting it all together, let’s make a function for invoking sed using the arguments it expects based on the OS the script is running on. (I originally saw this in dehydrated.)
# use `-r` or `-E` depending on platform
_sed() {
  if [[ "$(uname)" = "Linux" ]]; then
    sed -r "${@}"
  else
    sed -E "${@}"
  fi
}

# use `-i` or `-i ''` depending on platform
_sed_i() {
  if [[ "$(uname)" = "Linux" ]]; then
    sed -r -i "${@}"
  else
    sed -E -i '' "${@}"
  fi
}
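A different technique, which this post does not use, is to avoid -i altogether by writing to a temporary file and copying the result back. Here is a rough sketch, assuming a sed that accepts -E (recent GNU and BSD versions do); the function name sed_via_tmp is made up for illustration.

```shell
#!/bin/bash
# Hypothetical alternative to _sed_i: apply a sed script to a file
# without relying on the platform-specific -i flag.
# $1 = sed script
# $2 = file to edit
sed_via_tmp() {
  local tmp status=0
  tmp="$(mktemp)" || return 1
  # write the edited text to a temp file, then copy the contents back
  # so the original file keeps its own permissions
  sed -E "$1" "$2" > "$tmp" && cat "$tmp" > "$2" || status=1
  rm -f "$tmp"
  return "$status"
}
```

The trade-off is a briefly existing temporary copy of the file, which matters when the file holds decrypted secrets.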
Let’s put together what we have so far, using our new _sed and _sed_i functions and adding some docs and input validation:
#!/bin/bash
# Encrypts and decrypts sensitive values in a YAML file.
#
# Values can be encrypted locally, such as in preparation
# for placing the file where it's available to a deployment
# process, and then decrypted on a server during deployment.
#
# Sub commands:
#
#   enc: identifies sensitive key-value pairs (any key
#        ending with "!"), then encrypts sensitive values
#        and removes trailing "!" from keys
#
#   dec: decrypts encrypted values
#
# Usage:
#
#   cfgpkg enc <config-file> <public-key>
#   cfgpkg dec <config-file> <key-file>
#
# Examples:
#
#   cfgpkg enc app.yml age1abc123...
#   cfgpkg dec app.yml /path/to/age.key

_sed() {
  if [[ "$(uname)" = "Linux" ]]; then
    sed -r "${@}"
  else
    sed -E "${@}"
  fi
}

_sed_i() {
  if [[ "$(uname)" = "Linux" ]]; then
    sed -r -i "${@}"
  else
    sed -E -i '' "${@}"
  fi
}

cmd="$1"
yaml_file="$2"

# exit if command, file, and key were not specified
if [[ $# -ne 3 ]] || [[ "$cmd" != "enc" && "$cmd" != "dec" ]]; then
  >&2 echo 'Usage: cfgpkg <enc|dec> <yaml-file> <pub-key/key-file>'
  exit 1
elif [[ ! -f "$yaml_file" ]]; then
  >&2 echo "$yaml_file does not exist"
  exit 1
fi

if [[ "$cmd" == "enc" ]]; then
  public_key="$3"

  # extract keys ending with '!' from the
  # YAML file, including commented-out lines
  while IFS='' read -r line; do
    keys+=("$line")
  done < <( \
    grep -E '^[# ]*\w+! *:' "$yaml_file" | _sed \
      '/[a-zA-Z0-9_]+! *:/ s/^[# ]*([a-zA-Z0-9_]+)! *:.*/\1/' \
  )

  # encrypt values for extracted keys,
  # and remove the '!' from the keys
  for key in "${keys[@]}"; do
    # find the key and extract its value
    val="$(grep -E "^[# ]*$key! *:" "$yaml_file" \
      | head -n 1 | _sed "s/^[# ]*$key! *: *(.*)/\1/")"

    # encrypt the value and encode as Base64; strip the line
    # breaks GNU base64 inserts so the result stays on one line
    enc="$(echo -n "$val" \
      | age -r "$public_key" | base64 | tr -d '\n')"

    # replace the "key!: val" line with "key: <age>...</age>"
    [[ -n "$enc" ]] && _sed_i \
      "/<age>/! s%^([# ]*$key)!( *: *).*%\1\2<age>$enc</age>%" \
      "$yaml_file"
  done
else
  private_key_file="$3"

  # extract keys with <age> values from the
  # YAML file, including commented-out lines
  # (the sed address allows spaces before the colon, matching grep)
  while IFS='' read -r line; do
    keys+=("$line")
  done < <( \
    grep -E '^[# ]*\w+ *: *<age>.*</age>' "$yaml_file" | _sed \
      '/[a-zA-Z0-9_]+ *:/ s/^[# ]*([a-zA-Z0-9_]+) *:.*/\1/' \
  )

  # decrypt values for extracted keys
  for key in "${keys[@]}"; do
    enc="$(grep -E "^[# ]*$key *:" "$yaml_file" \
      | _sed 's:.*<age>(.*)</age>.*:\1:')"
    val="$(base64 -d <<< "$enc" | age -d -i "$private_key_file")"
    [[ -n "$val" ]] && _sed_i \
      "s%^([# ]*$key *: *).*%\1$val%" "$yaml_file"
  done
fi
Ok, we have a working, (*nix) platform-independent script. I named it “cfgpkg” (config packager). Let’s give it a try using the sample YAML from the beginning of this post to ensure it works end-to-end. (You’ll need to install age if you haven’t already.)
- Create our .yml file with sample data and generate our age key:

    $ echo 'APP_HOST: app.example.com' > app.yml
    $ echo 'API_KEY!: secret_key_abc123' >> app.yml
    $ age-keygen -o age.key
    Public key: age1abc123...

- Encrypt the sensitive value in our .yml file:

    $ cfgpkg enc app.yml age1abc123...
    $ cat app.yml
    APP_HOST: app.example.com
    API_KEY: <age>c2VjcmV0X2tleV9hYmMxMjMK</age>

- Decrypt the sensitive value in our .yml file:

    $ cfgpkg dec app.yml age.key
    $ cat app.yml
    APP_HOST: app.example.com
    API_KEY: secret_key_abc123

There are two features I’d like to add and one known bug to fix, but we’ll save these for another time:
- [feature] the ability to “undo” an encryption run, essentially restoring our .yml file to its original pre-encryption state (same as the decrypted state, but with the ! sensitive key markers), possibly as a rev sub command
- [feature] the ability to extract a single encrypted value without modifying the file, possibly as an ext sub command
- [bug] duplicate sensitive keys cause a problem, including on commented-out lines, because sed replaces every matching occurrence it finds
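As a rough idea of what the read-only half of an ext sub command might look like, here is a sketch that prints the stored Base64 payload for a single key without touching the file. The function name ext_payload is hypothetical, and the eventual implementation may well look different.

```shell
#!/bin/bash
# Hypothetical sketch: print the Base64 payload stored for one key,
# leaving the YAML file unmodified. Only the first match is used.
# $1 = YAML file
# $2 = key name
ext_payload() {
  grep -E "^[# ]*$2 *: *<age>.*</age>" "$1" \
    | head -n 1 \
    | sed -E 's%.*<age>(.*)</age>.*%\1%'
}
```

Piping the result through base64 -d and age -d -i with the key file, as the decryption loop does, would then print the cleartext value.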
I plan to update this post when these three things have been addressed. If you find other bugs or have an idea for a useful feature, feel free to email me.
[1] When initially researching, I discovered sed has an e (execute) command that allows you to do substitutions using the output of a shell command that runs for each match. This did exactly what I needed, but I’ll save you the time you might have otherwise spent before finding this line in the docs: “This is a GNU sed extension.” This means it won’t work on the version of sed that ships with macOS, so I opted against using it here.

[2] If your YAML file contains a multi-line value, and one of the lines in that value starts with what looks like a sensitive key followed by a colon, this script will incorrectly process that line. To keep things simple, our script will process the file as text using regular expressions. In other words, it won’t use a YAML parser, because that would introduce a dependency.