Deep Dive

Anatomy of a form spam attack, and the cheapest way to stop most of it

Real signals from production form spam: random-case gibberish, the Gmail dot trick, disposable domains. Why honeypots still work, and what they miss.

4 min read · spam · security · forms

A new form goes live. Within 48 hours the submissions table starts filling with names like uKguQGoDbyxINzxieQkVUmII and email addresses like o.w.on.u.x.o.b.i.c.a.334@gmail.com. Welcome to form spam. The good news: most of it is shockingly easy to filter out, even without a CAPTCHA.

~80% filtered by honeypot alone
6 common spam fingerprints
0 CAPTCHAs needed

The six fingerprints we see in production

Spam to a public form arrives in surprisingly predictable shapes. These are the patterns that show up over and over in real submission data:

  1. Honeypot fields filled. A hidden input that real users never touch but naive bots fill anyway. Still the cheapest, highest-yield spam filter ever invented.
  2. Random-case gibberish in name fields. Bots love random capitalisation, like uKguQGoDbyxINzxieQkVUmII. Humans do not type like this. The share of adjacent letter pairs that switch between upper and lower case is a strong tell.
  3. The Gmail dot trick. Gmail ignores dots in the local part, so o.w.on.u.x.o.b.i.c.a.334@gmail.com and owonuxobica334@gmail.com route to the same mailbox, and spammers use dotted variants to disguise duplicate signups. Five-plus dots in the local part of a Gmail address is essentially never legitimate.
  4. Disposable email domains. Mailinator, Yopmail, Guerrillamail, 10minutemail. There are about three hundred well-known throwaway providers, and checking a domain against a hashed block-list is an O(1) lookup.
  5. Every text field is gibberish at once. A submission where name, message, and subject are all unrelated random strings is almost always a bot. Humans either fill the form properly or skip optional fields.
  6. Submissions arriving milliseconds after page load. Real humans need at least a second or two to fill in a form. Programmatic submissions from a fresh page load are often sub-100ms.
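
As a rough illustration, fingerprints 2, 3, 4, and 6 can each be reduced to a small predicate. The sketch below is Python; the function names, the 0.5 case-flip cutoff, the eight-letter minimum, and the four-entry domain sample are assumptions for illustration, not values taken from production. How the elapsed time since page load gets measured is left out.

```python
# Hypothetical sample only; a real block-list would hold a few hundred domains.
DISPOSABLE_DOMAINS = frozenset({
    "mailinator.com", "yopmail.com", "guerrillamail.com", "10minutemail.com",
})


def case_transition_ratio(text: str) -> float:
    """Fraction of adjacent letter pairs that switch between upper and lower case."""
    letters = [c for c in text if c.isalpha()]
    if len(letters) < 2:
        return 0.0
    flips = sum(a.isupper() != b.isupper() for a, b in zip(letters, letters[1:]))
    return flips / (len(letters) - 1)


def looks_random_case(text: str, min_letters: int = 8, cutoff: float = 0.5) -> bool:
    """Fingerprint 2. Both defaults are assumed values that would need tuning."""
    letter_count = sum(c.isalpha() for c in text)
    return letter_count >= min_letters and case_transition_ratio(text) >= cutoff


def gmail_dot_count(email: str) -> int:
    """Fingerprint 3. Dots in the local part, counted only for gmail.com addresses."""
    local, _, domain = email.strip().lower().rpartition("@")
    return local.count(".") if domain == "gmail.com" else 0


def is_disposable(email: str) -> bool:
    """Fingerprint 4. Set membership, so the lookup is constant time on average."""
    return email.strip().lower().rpartition("@")[2] in DISPOSABLE_DOMAINS


def submitted_too_fast(elapsed_ms: float, floor_ms: float = 100.0) -> bool:
    """Fingerprint 6. True when the form came back faster than the floor."""
    return elapsed_ms < floor_ms
```

On the examples from the introduction, the gibberish name has 13 case flips across 23 letter pairs (about 0.57), and the dotted Gmail address has 10 dots. Names with internal capitals can also score high on the flip ratio, which is one reason to treat it as a weighted signal rather than a verdict.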

The honeypot trick, in 5 lines

The cheapest filter is also the oldest. Add a hidden input. Real users never see it. Bots fill it because they fill everything they can find:

HTML · contact.html
<form action="/submit" method="POST">
  <!-- Honeypot. CSS-hidden, no tab focus, no autocomplete -->
  <input type="text" name="_honey" style="display:none"
         tabindex="-1" autocomplete="off" />

  <!-- Real fields below -->
  <input type="text"  name="name"  required />
  <input type="email" name="email" required />
</form>

On the server, treat any submission with a non-empty _honey as spam. That single line catches the majority of low-effort form bots. Most don't parse CSS; they just loop through every input they find and fill it.
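
A minimal sketch of that server-side check in Python, assuming the parsed form body is available as a plain string-to-string mapping. The function name is hypothetical, and how the mapping is obtained depends on the web framework in use.

```python
from typing import Mapping

HONEYPOT_FIELD = "_honey"  # matches the input name in contact.html


def honeypot_tripped(form: Mapping[str, str]) -> bool:
    """True when the hidden field came back with any non-whitespace content."""
    return bool(form.get(HONEYPOT_FIELD, "").strip())
```

A missing field is treated the same as an empty one here, which is a design assumption: it avoids flagging clients that omit empty inputs.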

Content scoring catches what honeypots miss

Each fingerprint gets a numeric weight. A submission is flagged when the total crosses a threshold. Two mild signals together can be more damning than one strong signal alone, and importantly, a single mild signal won't flag a real submission. A simplified scoring table:

text · weights
honeypot filled              +90   // basically certain
gmail dot trick (5+ dots)    +60
random-case gibberish        +50
disposable email domain      +80
all-text-fields gibberish    +40
sub-100ms submission         +20

threshold to flag as spam:    70

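Two cooperating signals usually get you over the line: the Gmail dot trick plus a sub-100ms submission scores 80, and random-case gibberish plus all-fields gibberish scores 90. The one pair that falls short is all-fields gibberish plus a sub-100ms submission, at 60. A single ambiguous signal (say, a real-looking name with a slightly weird email) passes through. The trick is calibrating the weights and the threshold so legitimate submissions stay clean while obvious junk gets caught.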
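
The simplified table translates almost directly into code. The Python sketch below uses the weights and threshold listed above; the signal names are made up for illustration, and treating a score exactly equal to the threshold as spam is an assumption, since the text only says the total must cross it.

```python
from typing import Iterable

# Weights copied from the simplified table; the key names are hypothetical.
WEIGHTS = {
    "honeypot_filled": 90,
    "gmail_dot_trick": 60,
    "random_case_gibberish": 50,
    "disposable_domain": 80,
    "all_fields_gibberish": 40,
    "sub_100ms_submission": 20,
}
SPAM_THRESHOLD = 70


def spam_score(signals: Iterable[str]) -> int:
    """Sum the weights of the distinct signals that fired."""
    return sum(WEIGHTS[name] for name in set(signals))


def is_spam(signals: Iterable[str]) -> bool:
    """Assumption: a score equal to the threshold counts as spam."""
    return spam_score(signals) >= SPAM_THRESHOLD
```

With these numbers, `is_spam({"gmail_dot_trick"})` is false at 60, while adding `"sub_100ms_submission"` brings the total to 80 and flags the submission.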

Why we don't recommend CAPTCHAs

CAPTCHAs work, but they're a tax on every legitimate visitor to stop a problem caused by a tiny minority. They hurt accessibility, they hurt mobile completion rates, and modern multimodal models can solve many of them faster than humans can. They're also a commonly cited reason people abandon forms.

A honeypot is invisible to humans and obvious to most bots. A CAPTCHA is invisible to most bots and obvious to humans. The arithmetic is simple.

What to actually do

Add a honeypot, score each submission against the fingerprints above, and flag anything that crosses the threshold. That's the whole playbook. CAPTCHAs are the last resort for high-stakes forms (account signups, ticket sales) where the cost of letting one bot through is high. For a contact form, the cost is one extra row in the spam folder, and if you've done the above, that row will be very lonely.

Spam filtering you don’t have to think about

SaveForm runs honeypot detection plus six content-scoring signals on every submission. Spam stays out of your inbox, doesn't fire webhooks, and doesn't count toward your quota.
