I've had a few emails recently get through my network of filters, and have been wondering how they managed it, as they used words that would trigger any spam filter, and all seem to have exactly the same text. (I do have a strange soft spot for the innovative and amusing kind of spam, but nobody wants thousands of identical emails promoting products one would never buy).
I just became annoyed enough to fiddle with the filters to stop them, and by checking the source, discovered why they had made it through.
The spam filters that filter primarily on content, work by reading your email looking for suspicious strings of letters. The spammers had got round this by adding random letters in the middle of all possibly-contentious words - so for example: Vibtlagnmbra. The random letters are different for every email, so the letter string is always different and cannot be filtered out.
Then they used HTML to colour the excess letters white, make them very small and float them away to the right margin, so that the message looked the same to the human user.
This is awfully cunning. And awfully irritating.
OK, I could turn off rendering HTML in emails, but this wouldn't stop the spams, it would just render them unreadable. I could filter out all HTML emails. But I don't want to do that, I get HTML emails that I do want to read.
I could do more validation of the sender's email address - but I don't really want to do that, because many spams come from genuine, valid addresses that are being forged. In a perfect world, there would be better systems for validating who sends emails so that I could be sure that an email coming from a different IP or SMTP server was spam, but sadly, that's not the case.
I do get HTML emails from oddlooking SMTP servers and varying IPs that I actually want to read, and the process of validating addresses using an SPF record is sufficiently complex that I can be pretty sure that many of the people who send me email will not be able to do that.
In the meanwhile, I've settled for filtering on the code that hides text and floats it to the right. This isn't a great way of doing things, as it's possible that a valid email might also contain that code, and also I can think of quite a number of permutations on this technique using different code - but it is the best 'least likely to lose desired mails' approach I can think of.
However, I suspect that ISPs handling vast amounts of email traffic will go instead for the validate sender approach - thus ensuring that genuine emails become even less likely to be reliablly delivered than they are now.
Hum. It is annoying that spammers insist on muddying their own water in this way. If they could just be a bit more restrained about it, then people would put up with them, but this sort of thing will eventually end up killing email, and driving people to more validated but less private and less universal messaging systems such as the various social media sites.