Simple way to filter email

Dealing with many different types of email had bugged me for some time, so I went looking for a solution. The logical approach would be to filter email automatically into directories based on the type of email, e.g. sender, list ID, or subject.

I looked into various existing solutions such as procmail, notmuch, and the like. The problem was that they were overly complicated for what I needed.

So, I did what I usually do and implemented my own solution instead. As I only wanted to filter email, both new and old, by various header fields and by age, all I needed was awk, grep, find, and mv. To keep it somewhat structured, I wrote everything as a POSIX sh script that reads its rules from a configuration file (strictly speaking, it leans on a few widely supported extensions such as grep -o). As I use maildir to store my email, I made sure the script moves each email into the subdirectory it came from (new, cur, or tmp) at the destination.
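
For readers unfamiliar with the layout: a maildir folder is a directory with the three subdirectories cur, new, and tmp, and the mv calls in the script assume they already exist at the destination. A minimal sketch of a helper for creating such a folder could look like this. The function name make_maildir and the throwaway path are placeholders of mine and not part of the script.

```shell
#!/bin/sh
# Hypothetical helper (not part of the filter script): create a maildir
# folder with the three subdirectories a maildir consists of.
make_maildir() {
    # $1: path of the folder to create, e.g. "${MAILDIR}/github"
    mkdir -p "$1/cur" "$1/new" "$1/tmp"
}

# Example: prepare a destination folder below a throwaway directory
demo_root="$(mktemp -d)"
make_maildir "${demo_root}/github"
ls "${demo_root}/github"
```

Running something like this once per destination avoids rules being skipped because the destination directory is missing.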

This is the script I came up with (feel free to use it as you like):

#!/bin/sh 
 
# Copyright © 2022 Chris Noxz <chris@noxz.tech> 
# This work is free. You can redistribute it and/or modify it under the 
# terms of the Do What The Fuck You Want To Public License, Version 2, 
# as published by Sam Hocevar. See http://www.wtfpl.net/ for more details. 
 
# declare default configuration variables 
MAILDIR="${MAILDIR:-"${HOME}/mail"}" 
MAILFILTERRC="${MAILFILTERRC:-"${HOME}/.config/mfrc"}" 
 
# This script is used for applying sorting/filtering rules to maildirs 
# Use the following format for rules: 
# src|pat|dst 
# src: source directory relative to MAILDIR, e.g. INBOX or me@domain.com/INBOX
# pat: regex pattern applied to mail header, e.g. ^From:.*me@domain.com
# dst: destination directory relative to MAILDIR, e.g. Archive or My/Mail
# example rule: INBOX|^From:.*github.com|github
# this moves mail sent from the domain github.com from INBOX to github
# 
# age-based patterns can be used for archiving old emails ($AGE+<days>).
# example rule: INBOX|$AGE+365|Archive
# this moves mail that is more than 365 days old from INBOX to Archive
# 
# rules are read from ~/.config/mfrc by default 
 
# extract rules from rule file based on pattern "src|pat|dst" 
grep -o "^[^#][^|]\+|[^|]\+|[^|]\+$" "${MAILFILTERRC}" | while read -r rule; do 
    # extract source directory 
    src="${MAILDIR}/${rule%%|*}" 
 
    # extract destination directory 
    dst="${MAILDIR}/${rule##*|}" 
 
    # extract pattern 
    ptn="${rule#*|}" 
    ptn="${ptn%%|*}" 
 
    # validate input (at least somewhat) 
    [ ! -d "${src}" ] && continue   # skip if source doesn't exist 
    [ ! -d "${dst}" ] && continue   # skip if destination doesn't exist 
    [ -z "${ptn}" ] && continue     # skip if pattern is empty
 
    # check if age based rule: ^$AGE+... 
    if echo "${ptn}" | grep "^\$AGE+" >/dev/null; then 
        age="${ptn##*+}" 
        if [ -n "$age" ] && [ "$age" -eq "$age" 2>/dev/null ]; then 
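            # find file/mail based on age and move file/mail into correct
            # subdirectory at destination (file and destination are passed
            # as arguments instead of being pasted into the command string,
            # so unusual file names cannot break or alter the command)
            find "${src}"/* -maxdepth 1 -mtime +"${age}" -type f -exec \
                sh -c 'mv "$1" "$2/$(basename "$(dirname "$1")")/"' \
                sh {} "${dst}" \;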
        fi 
        continue # done with this rule, continue with the next one 
    fi 
 
    # find pattern in file/mail until first empty line (only check header)...
    # (the pattern is passed via the environment so that it may contain "/")
    PTN="${ptn}" awk '/^$/{nextfile} $0 ~ ENVIRON["PTN"] {print FILENAME; nextfile}' \
        "${src}"/*/* |
    while IFS= read -r f; do
        # ...and move file/mail into correct subdirectory at destination
        mv "${f}" "${dst}/$(basename "$(dirname "${f}")")/"
    done
done 
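
To see the header-only matching in isolation, here is a small sketch that builds two made-up messages and runs an awk program of the same shape over them. All file names and addresses are invented for the demonstration, and it assumes an awk that supports nextfile, as the script itself does.

```shell
#!/bin/sh
# Sketch: show that only header lines are tested against the pattern.
# All file names and addresses below are made up for the demonstration.
demo="$(mktemp -d)"
mkdir -p "${demo}/INBOX/new"

# message 1: the pattern occurs in the header
printf '%s\n' 'From: notifications@github.com' 'Subject: hello' '' \
    'body text' > "${demo}/INBOX/new/msg1"

# message 2: the pattern occurs only in the body, after the first empty line
printf '%s\n' 'From: someone@example.org' 'Subject: fwd' '' \
    'From: notifications@github.com' > "${demo}/INBOX/new/msg2"

# stop reading a file at the first empty line, and print the file name
# on the first header line that matches
matches="$(awk '/^$/{nextfile} /^From:.*github.com/{print FILENAME; nextfile}' \
    "${demo}"/INBOX/*/*)"
echo "${matches}"
```

Only msg1 is listed, because awk skips to the next file as soon as it reaches the empty line that ends the header of msg2.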

Rules created in the mfrc file can look like this:

# === chris@noxz.tech === 
# ======================= 
 
# move messages from github.com 
chris@noxz.tech/INBOX|^From:.*github.com|chris@noxz.tech/github 
 
# move messages from list: wiki.suckless.org 
chris@noxz.tech/INBOX|^List-Id:.*wiki.suckless.org|chris@noxz.tech/suckless-mailing-list/wiki 
 
# move messages more than 10 days old from INBOX to Archive
chris@noxz.tech/INBOX|$AGE+10|chris@noxz.tech/Archive 
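
How such a line is taken apart may be easier to follow in isolation. The sketch below splits one of the rules above with the same parameter expansions the script uses. The function name split_rule is a placeholder of mine.

```shell
#!/bin/sh
# Sketch: split one rule of the form src|pat|dst the same way the script
# does, using POSIX parameter expansion. split_rule is a hypothetical name.
split_rule() {
    rule="$1"
    src="${rule%%|*}"     # everything before the first "|"
    dst="${rule##*|}"     # everything after the last "|"
    ptn="${rule#*|}"      # drop "src|" ...
    ptn="${ptn%%|*}"      # ... then drop "|dst"
}

split_rule 'chris@noxz.tech/INBOX|^List-Id:.*wiki.suckless.org|chris@noxz.tech/suckless-mailing-list/wiki'
printf 'src=%s\nptn=%s\ndst=%s\n' "${src}" "${ptn}" "${dst}"
```

One consequence of this format is that a pattern cannot contain a "|" itself, so regex alternation is not available inside a rule. A line with more than two "|" characters does not match the rule format and is silently ignored.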

I can now run the script either on demand or automatically each time I run mbsync to retrieve new email. In the end, writing my own script turned out to be easier than adopting an existing tool. The solution is not the most efficient, however, because every email is examined each time the script runs, rather than just the new ones. This could, of course, be fixed, but the script would then need a separate mode for running the filter on every email as well.
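
One possible way to examine only new email, as a rough sketch rather than something the script does today, is to remember when the filter last ran in a stamp file and let find list only the files modified since then. The function name list_new_mail and the stamp file are hypothetical, and the sketch assumes that message files get a fresh modification time when they are delivered, which depends on how the mail is fetched.

```shell
#!/bin/sh
# Sketch, not part of the script above: list only the messages in a
# maildir folder that were modified since the previous run.
# list_new_mail and the stamp file location are hypothetical.
list_new_mail() {
    # $1: maildir folder, $2: stamp file remembering the previous run
    folder="$1"
    stamp="$2"

    # mark the start of this run before scanning, so mail that arrives
    # during the scan is picked up by the next run
    : > "${stamp}.next"

    if [ -f "${stamp}" ]; then
        # only files modified more recently than the previous run
        find "${folder}/new" "${folder}/cur" -type f -newer "${stamp}"
    else
        # first run: no stamp yet, so every message is a candidate
        find "${folder}/new" "${folder}/cur" -type f
    fi

    mv "${stamp}.next" "${stamp}"
}
```

The resulting file names could then be handed to the awk step in place of the glob over every file. Age-based rules would still need to look at all messages, which is the separate mode mentioned above.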