Simple way to filter email
Chris Noxz
June 11, 2022
Having a lot of different types of email had bugged me for some
time, so I set out to find a solution. The logical approach was to
filter email automatically into directories based on the type of
email, e.g., sender, list ID, or subject.
I looked into various existing solutions such as procmail, notmuch,
and the like. The problem was that they were overly complicated for
what I needed to solve.
So, I did what I usually do and implemented my own solution
instead. As I only wanted to filter email, both new and old, by
various header data and by age, I only needed awk, grep, find, and
mv. To keep it somewhat structured, I wrote everything as a
POSIX-compliant shell script, which I programmed to accept a
configuration file. As I’m using maildir to store my email, I made
sure the script moves each email into the destination subdirectory
matching the one it resides in: new, cur, or tmp.
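The "subdirectory it resides in" part comes down to taking the name of a message's parent directory and reusing it at the destination. A minimal sketch, using a made-up message path and destination name purely for illustration:

```shell
#!/bin/sh
# Hypothetical message path; only its parent directory name matters here.
f="/home/user/mail/INBOX/new/1654900000.M1P1.host"

# dirname strips the file name, basename keeps the last path component
sub="$(basename "$(dirname "${f}")")"

echo "${sub}"            # new
echo "Archive/${sub}/"   # Archive/new/
```

A message found in INBOX/new therefore ends up in the new subdirectory of the destination, and likewise for cur and tmp.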
This is the script I came up with (feel free to use it as you like):
#!/bin/sh

# Copyright © 2022 Chris Noxz <chris@noxz.tech>
# This work is free. You can redistribute it and/or modify it under the
# terms of the Do What The Fuck You Want To Public License, Version 2,
# as published by Sam Hocevar. See http://www.wtfpl.net/ for more details.

# declare default configuration variables
MAILDIR="${MAILDIR:-"${HOME}/mail"}"
MAILFILTERRC="${MAILFILTERRC:-"${HOME}/.config/mfrc"}"

# This script is used for applying sorting/filtering rules to maildirs
# Use the following format for rules:
# src|pat|dst
#   src: source directory relative to MAILDIR e.g. INBOX or me@domain.com/INBOX
#   pat: regex pattern applied to mail header e.g. ^From:.*me@domain.com
#   dst: destination directory relative to MAILDIR e.g. Archive or My/Mail
# example rule: INBOX|^From:.*github.com|github
# this moves mail from INBOX to github retrieved from the domain github.com
#
# age based patterns can be used for archiving old emails ($AGE+<days>).
# example rule: INBOX|$AGE+365|Archive
# this archives mail from INBOX to Archive that are 365 days or older
#
# rules are read from ~/.config/mfrc by default

# extract rules from rule file based on pattern "src|pat|dst"
grep -o "^[^#][^|]\+|[^|]\+|[^|]\+$" "${MAILFILTERRC}" | while read -r rule; do
    # extract source directory
    src="${MAILDIR}/${rule%%|*}"
    # extract destination directory
    dst="${MAILDIR}/${rule##*|}"
    # extract pattern
    ptn="${rule#*|}"
    ptn="${ptn%%|*}"

    # validate input (at least somewhat)
    [ ! -d "${src}" ] && continue   # skip if source doesn't exist
    [ ! -d "${dst}" ] && continue   # skip if destination doesn't exist
    [ -z "${ptn}" ] && continue     # skip if pattern is empty

    # check if age based rule: ^$AGE+...
    if echo "${ptn}" | grep "^\$AGE+" >/dev/null; then
        age="${ptn##*+}"
        if [ -n "$age" ] && [ "$age" -eq "$age" 2>/dev/null ]; then
            # find file/mail based on age and move file/mail into correct
            # subdirectory at destination; file and destination are passed
            # as arguments so unusual file names cannot break the quoting
            find "${src}"/* -maxdepth 1 -mtime +"${age}" -type f -exec \
                sh -c 'mv "$1" "$2/$(basename "$(dirname "$1")")/"' sh {} "${dst}" \;
        fi
        continue # done with this rule, continue with the next one
    fi

    # find pattern in file/mail until first empty line (only check header)...
    awk "/^$/{nextfile} /${ptn}/{print FILENAME; nextfile}" "${src}"/*/* |\
    while read -r f; do
        # ...and move file/mail into correct subdirectory at destination
        mv "$f" "${dst}/$(basename "$(dirname "${f}")")/"
    done
done
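To see why the awk line only ever looks at headers, here is a small throwaway demo. The directory layout and both messages are made up for illustration. The first rule skips to the next file at the first empty line, so a match in the body is never reached:

```shell
#!/bin/sh
# Throwaway demo of the header-only match; all paths and messages are made up.
demo="$(mktemp -d)"
mkdir -p "${demo}/INBOX/cur" "${demo}/INBOX/new"

# msg1: the pattern appears in the header
printf 'From: notifications@github.com\nSubject: hello\n\nbody text\n' \
    > "${demo}/INBOX/new/msg1"
# msg2: the pattern appears only in the body, after the first empty line
printf 'From: someone@example.org\nSubject: fwd\n\nFrom: notifications@github.com\n' \
    > "${demo}/INBOX/cur/msg2"

ptn='^From:.*github.com'
# same awk program as in the script: stop reading a file at its first empty line
matches="$(awk "/^$/{nextfile} /${ptn}/{print FILENAME; nextfile}" "${demo}"/INBOX/*/*)"

match_name="$(basename "${matches}")"
echo "${match_name}"   # msg1

rm -rf "${demo}"
```

Only msg1 is reported, even though msg2 contains the same line further down.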
Rules created in the mfrc file can look like this:
# === chris@noxz.tech ===
# =======================
# move messages from github.com
chris@noxz.tech/INBOX|^From:.*github.com|chris@noxz.tech/github
# move messages from list: wiki.suckless.org
chris@noxz.tech/INBOX|^List-Id:.*wiki.suckless.org|chris@noxz.tech/suckless-mailing-list/wiki
# move 10 day old messages from INBOX to Archive
chris@noxz.tech/INBOX|$AGE+10|chris@noxz.tech/Archive
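Because the script skips any rule whose source or destination directory does not exist, and moves each message into the matching new, cur, or tmp subdirectory of the destination, the destinations named in the rules need to exist as maildirs before the first run. The helper below is not part of the script. It is a hypothetical sketch (the name make_destinations and its arguments are my own) that creates them from a rule file, tried here against a temporary directory:

```shell
#!/bin/sh
# Hypothetical helper: create cur/new/tmp for every destination in a rule file.
# $1: rule file, $2: maildir root
make_destinations() {
    grep -v '^#' "$1" | while IFS='|' read -r src ptn dst; do
        [ -z "${dst}" ] && continue   # not a complete src|pat|dst rule
        mkdir -p "$2/${dst}/cur" "$2/${dst}/new" "$2/${dst}/tmp"
    done
}

# try it against a throwaway directory with a made-up rule
demo="$(mktemp -d)"
printf '%s\n' '# a comment' 'INBOX|^From:.*github.com|github' > "${demo}/mfrc"
make_destinations "${demo}/mfrc" "${demo}/mail"

created="$(cd "${demo}/mail/github" && echo *)"
echo "${created}"   # cur new tmp

rm -rf "${demo}"
```

For real use it would be called with the actual rule file and maildir root, for example "${MAILFILTERRC}" and "${MAILDIR}".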
I can now run the script either on demand or automatically each
time I run mbsync to retrieve new email. As it turned out, it was
easier to write my own script than to use an existing solution.
This solution, however, is not the most efficient, because every
email is examined each time the script runs, rather than just the
new ones. This could, of course, be fixed, but it would mean adding
a separate function for running the filter on every email as well.
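One possible shape for that fix, as a sketch only: wrap the header match in a function that takes the subdirectory to scan, so a regular run can pass new while an occasional full run passes "*". The function name filter_headers and its argument order are hypothetical, and the demo messages are made up:

```shell
#!/bin/sh
# Hypothetical variant of the header match.
# $1: source maildir, $2: pattern, $3: destination maildir,
# $4: subdirectory to scan ("new" for fresh mail only, "*" for everything)
filter_headers() {
    # $4 is left unquoted on purpose so that "*" expands to every subdirectory
    awk "/^$/{nextfile} /$2/{print FILENAME; nextfile}" "$1"/$4/* |
    while read -r f; do
        mv "${f}" "$3/$(basename "$(dirname "${f}")")/"
    done
}

# throwaway demo: one unread and one already seen message from the same sender
demo="$(mktemp -d)"
mkdir -p "${demo}/INBOX/new" "${demo}/INBOX/cur" \
         "${demo}/github/new" "${demo}/github/cur"
printf 'From: a@github.com\n\nbody\n' > "${demo}/INBOX/new/fresh"
printf 'From: b@github.com\n\nbody\n' > "${demo}/INBOX/cur/seen"

# only scan INBOX/new
filter_headers "${demo}/INBOX" '^From:.*github.com' "${demo}/github" new

moved="$(cd "${demo}/github/new" && echo *)"
left="$(cd "${demo}/INBOX/cur" && echo *)"
echo "moved: ${moved}, left in place: ${left}"   # moved: fresh, left in place: seen

rm -rf "${demo}"
```

The age-based rules would still need to look at every message, since a message only becomes old while sitting in cur.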