Creating a sed-based file substitution tool.
A few weeks ago, I was digging through my spam folder and found an email message that started out like this:
Dear #name# Congratulations on winning the $15.7 million lottery payout! To learn how to claim your winnings, please...
Obviously, it was a scam (does anyone actually fall for these?), but what captured my attention was the #name# sequence. Clearly that was a fail on the part of the sender, who presumably didn't know how to use AnnoyingSpamTool 1.3 or whatever the heck he or she was using.
The more general notation for bulk email and file transformations is pretty interesting, however. There are plenty of legitimate reasons to use this sort of substitution, ranging from email newsletters (like the one I send every week from AskDaveTaylor.com—check it out!) to stockholder announcements and much more.
With that as the inspiration, let's build a tool that offers just this capability.
The simple version will be a 1:1 substitution, so #name# becomes, say, "Rick Deckard", while #first# might be "Rick" and a matching last-name token might be "Deckard". Let's build on that, but let's start simple.
Simple Word Substitution in Linux
There are plenty of ways to tackle word substitution from the command line, ranging from Perl to awk, but here I'm using the original UNIX command sed (stream editor), which was designed for exactly this purpose. The general notation for a substitution is s/old/new/, and if you tack on a g at the end, it matches every occurrence on a line, not only the first, so the full command becomes s/old/new/g.
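To make the effect of the trailing g concrete, here's a minimal sketch. The sample line and the variable names (line, first_only, all) are made up for illustration; only the s/old/new/ and s/old/new/g notation comes from the discussion above.

```shell
# Hypothetical sample line in which the same token appears twice.
line='Dear #name#, yes you, #name#!'

# Without the g flag, sed replaces only the first match on each line.
first_only=$(printf '%s\n' "$line" | sed 's/#name#/Rick Deckard/')

# With the g flag, sed replaces every match on each line.
all=$(printf '%s\n' "$line" | sed 's/#name#/Rick Deckard/g')

printf '%s\n%s\n' "$first_only" "$all"
```

The first result still contains a stray #name#, which is exactly the kind of leftover that showed up in that spam message.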
Before going further, here's a simple document that has necessary substitutions embedded:
$ cat convertme.txt
#date#

Dear #name#, I wanted to start by again thanking you for your generous donation of #amount# in #month#. We couldn't do our work without support from humans like you, #first#.

This year we're looking at some unexpected expenses, particularly in Sector 5, which encompasses #state#, as you know. I'm hoping you can start the year with an additional contribution? Even #suggested# would be tremendously helpful.

Thanks for your ongoing support. With regards,

Rick Deckard
Society for the Prevention of Cruelty to Replicants
Scan through it, and you'll see there are a lot of substitutions to do: #date#, #name#, #amount#, #month#, #first#, #state# and #suggested#. It turns out that #date# will be replaced with the current date, and #suggested# is one that'll be calculated as the letter is processed, but that's for a bit later, so stay tuned for that.
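As a rough sketch of where this is heading, several substitutions can be chained in a single sed invocation with repeated -e options. Everything specific here is an assumption for illustration: the shortened template, the sample values, the variable names and the date format are not the finished tool. It also assumes the replacement values contain no / or & characters, both of which have special meaning in a sed replacement.

```shell
# Illustrative values only; a real tool would read these from somewhere.
name="Rick Deckard"
first="Rick"
today=$(date +"%B %d, %Y")   # assumed date format, e.g. month day, year

# A shortened stand-in for convertme.txt so the example is self-contained.
template='#date#
Dear #name#,
Thanks again, #first#.'

# Each -e adds one substitution; g replaces every match on a line.
result=$(printf '%s\n' "$template" | sed \
  -e "s/#date#/$today/g" \
  -e "s/#name#/$name/g" \
  -e "s/#first#/$first/g")

printf '%s\n' "$result"
```

Double quotes around each sed expression let the shell expand the variables before sed sees them, which is what makes the values pluggable.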