Scraping 1secmail, mail.tm, and mail.gw inbox domains


Three of the most-used free disposable-email providers — 1secmail, mail.tm, mail.gw — expose their currently-active inbox domains via public APIs. They have to: their entire UX depends on giving the user a fresh, sometimes-rotating domain to use right now. The result is that the operators themselves publish the lookup table for their disposable inventory. We poll those tables once every 24 hours.

This post: which APIs we scrape, what the rotation cadence looks like in practice, how the captured domains feed the detection table, and what role this channel plays alongside the other freshness layers behind the disposable email checker API.

The three providers and their public APIs

1secmail. API endpoint at https://www.1secmail.com/api/v1/?action=getDomainList. Returns a JSON array of the currently-active inbox domains. The list rotates: some domains stay for months, some get cycled in and out weekly. Typical response: an array of 10-30 domain names. Sample patterns we’ve seen: 1secmail.com, 1secmail.net, 1secmail.org, plus harder-to-spot variants like wwjmp.com, vddaz.com, kzccv.com.

mail.tm. API endpoint at https://api.mail.tm/domains. RFC-compliant REST API, returns domains with pagination. Includes metadata: which domains are active, which are deprecated, plus isPrivate flags for paid-tier custom domains. Sample patterns: mail.tm itself, plus shifting backup pools.

mail.gw. API endpoint at https://api.mail.gw/domains. Same shape as mail.tm (the two services share a codebase ancestry). Returns the active inbox-domain inventory.

All three APIs are unauthenticated and publicly documented. The operators publish them because their own product needs to query them — there’s no realistic way to gate access without breaking the product.
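As a rough illustration of what polling these endpoints could look like, here is a minimal Python sketch using only the standard library. The response shapes are assumptions rather than confirmed contracts: a bare JSON array of strings for 1secmail, and either a list of objects or a "hydra:member" wrapper with domain and isActive fields for mail.tm and mail.gw. The User-Agent string and function names are hypothetical, and the sketch reads only the first page of a paginated response.

```python
import json
from urllib.request import Request, urlopen

# Endpoints as listed above.
ENDPOINTS = {
    "1secmail": "https://www.1secmail.com/api/v1/?action=getDomainList",
    "mail.tm": "https://api.mail.tm/domains",
    "mail.gw": "https://api.mail.gw/domains",
}


def parse_domains(payload):
    """Extract active domain names from a decoded JSON payload.

    Assumed shapes (not confirmed here): a bare list of strings, a list of
    {"domain", "isActive"} objects, or an object wrapping such a list under
    the "hydra:member" key.
    """
    if isinstance(payload, dict):
        payload = payload.get("hydra:member", [])
    domains = set()
    for item in payload:
        if isinstance(item, str):
            domains.add(item.strip().lower())
        elif isinstance(item, dict) and item.get("isActive", True):
            name = item.get("domain")
            if name:
                domains.add(name.strip().lower())
    return domains


def fetch_domains(provider, timeout=10):
    """Fetch one provider's first page; network errors propagate to the caller."""
    request = Request(
        ENDPOINTS[provider],
        headers={
            "Accept": "application/json",
            "User-Agent": "domain-poller-sketch/0.1",  # hypothetical identifier
        },
    )
    with urlopen(request, timeout=timeout) as response:
        return parse_domains(json.load(response))
```

Keeping the parser separate from the network call makes it possible to test the shape handling against saved responses without hitting the providers.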

What the rotation actually looks like

We’ve been polling for over a year. The observed patterns:

Across all three providers, the rotation rate matters because it’s the rate at which a static blocklist falls behind reality. An npm package maintainer who updates monthly captures roughly 50% of the live 1secmail inventory at any given moment. Quarterly refreshes drop to 25%. Daily scraping captures essentially 100%.
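The coverage figures above are a ratio: the share of the currently live inventory that a given blocklist snapshot already contains. A small sketch of that measurement, with hypothetical domain names and a hypothetical function name, might look like this:

```python
def coverage(snapshot, live):
    """Fraction of currently live domains that appear in a blocklist snapshot."""
    live = set(live)
    if not live:
        return 1.0  # nothing live means nothing is missed
    return len(live & set(snapshot)) / len(live)


# Hypothetical data: an older snapshot compared with today's live list.
snapshot = {"1secmail.com", "1secmail.net", "old-pool.example"}
live = {"1secmail.com", "1secmail.net", "new-a.example", "new-b.example"}
print(coverage(snapshot, live))  # 0.5
```

Domains that have rotated out (old-pool.example here) do not hurt coverage; only live domains missing from the snapshot do.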

What we do with the captured domains

The pipeline runs every 24 hours:

  1. Hit the three APIs, fetch the active-domain lists.
  2. Diff against last run. New domains get added to the queue; retired domains get marked as inactive (but stay in the disposable table — they may come back, and historical data is useful).
  3. Cross-reference against operator: each provider has a known operator entry (1secmail-com, mail-tm, mail-gw), so new domains automatically link to the existing operator.
  4. Update detection table. The new domains are now Tier-1 hits with confidence: 95 and detection_source: 'scraper:1secmail' | 'scraper:mail.tm' | 'scraper:mail.gw'.
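The diff-and-update steps above can be sketched in a few lines of Python. This is an illustration under assumptions, not the production pipeline: the in-memory dict standing in for the detection table, the DomainRow fields, and the apply_run name are all hypothetical. The operator keys, the confidence of 95, and the detection_source format come from the steps listed.

```python
from dataclasses import dataclass

# Operator entries as named in step 3.
OPERATORS = {"1secmail": "1secmail-com", "mail.tm": "mail-tm", "mail.gw": "mail-gw"}


@dataclass
class DomainRow:
    domain: str
    operator: str
    detection_source: str
    confidence: int = 95
    active: bool = True


def apply_run(table, provider, fetched):
    """Merge one provider's fetched domains into table (domain -> DomainRow).

    Returns (added, retired). Retired rows stay in the table, marked inactive,
    and are reactivated if the domain shows up again in a later run.
    """
    source = f"scraper:{provider}"
    fetched = set(fetched)
    known = {d for d, row in table.items() if row.detection_source == source}
    previously_active = {d for d in known if table[d].active}

    added = sorted(fetched - set(table))
    for domain in added:
        table[domain] = DomainRow(domain, OPERATORS[provider], source)

    for domain in fetched & known:
        table[domain].active = True  # a retired domain came back

    retired = sorted(previously_active - fetched)
    for domain in retired:
        table[domain].active = False
    return added, retired
```

Domains already present in the table under a different source are left untouched in this sketch; how the real table resolves such overlaps is not described here.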

Total runtime: under 60 seconds per 24-hour run. The bandwidth cost is negligible (a few hundred bytes per API response). The detection-table impact: roughly 5-15 net-new domain entries per week across the three providers.

What the source tag reveals

Looking at our detection_source field for all of disposable_mail_domains, the scraper sources are visible in the breakdown:

| Detection source | Domain count | What it is |
| --- | --- | --- |
| scraped-ui | 139,922 | Headless probe scrape of operator inbox-dropdown UIs |
| scraped-ui-orphan-mx | 16,127 | UI scrape catch where MX matched a known operator backend |
| scraped-ui-infra-mx | 5,258 | Cloudflare-fronted catch (see weighted-disposable post) |
| brand-apex | 507 | Provider’s own apex domain |
| tld-untouchable | 251 | TLD safety net override |
| (various allowlist) | 376 | Tier 0 legit-mail allowlist |
| dropdown-rescan | 1 | Re-poll catching net-new entries |

The big scraped-ui bucket (139,922) is the generic Playwright-probe channel. It visits each candidate disposable apex, renders the page, and extracts every domain the operator’s UI exposes — usually a dropdown of available inbox domains. This catches the long tail of operators that aren’t 1secmail/mail.tm/mail.gw but follow similar patterns.
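The extraction half of that probe can be illustrated without a browser. The sketch below assumes the page has already been rendered to an HTML string (by Playwright or anything else) and that the inbox dropdown is a plain select element with option children. Both are assumptions for illustration; real operator UIs vary, and the class and function names are hypothetical.

```python
import re
from html.parser import HTMLParser

# Matches a trailing domain-like token, with an optional leading "@".
DOMAIN_RE = re.compile(r"@?((?:[a-z0-9-]+\.)+[a-z]{2,})$", re.IGNORECASE)


class OptionDomainParser(HTMLParser):
    """Collect domain-like strings from <option> values and labels."""

    def __init__(self):
        super().__init__()
        self.domains = set()
        self._in_option = False

    def _collect(self, text):
        match = DOMAIN_RE.search((text or "").strip())
        if match:
            self.domains.add(match.group(1).lower())

    def handle_starttag(self, tag, attrs):
        if tag == "option":
            self._in_option = True
            self._collect(dict(attrs).get("value"))

    def handle_endtag(self, tag):
        if tag == "option":
            self._in_option = False

    def handle_data(self, data):
        if self._in_option:
            self._collect(data)


def extract_dropdown_domains(html):
    """Return the set of domains exposed by option elements in rendered HTML."""
    parser = OptionDomainParser()
    parser.feed(html)
    parser.close()
    return parser.domains
```

A custom dropdown built from div or li elements would need a different selector strategy; this sketch only covers the simplest case.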

The provider-specific tags (scraper:1secmail, scraper:mail.tm, scraper:mail.gw) identify domains captured directly from the API endpoints: a smaller, more reliable subset. Everything caught via this channel has high confidence, because the operator’s own API confirms the domain is theirs.

Why three providers specifically

The 1secmail / mail.tm / mail.gw set isn’t arbitrary. Three reasons:

  1. They publish public APIs. Most free temp-mail providers don’t — you have to scrape the UI. These three are the easiest to monitor at scale.

  2. They’re high-volume signup-form-attack vectors. Customer-traffic data shows these three plus mailinator account for ~25% of all disposable hits on real signup forms we protect. Direct API monitoring gives us essentially complete coverage of the three providers’ share of that quarter.

  3. Their rotation rate is fast enough to matter. A static blocklist captures the long-stable temp-mail brands (mailinator, guerrillamail) just fine. The fast-rotating ones are where the freshness gap shows — and where direct API monitoring is the only practical answer.

Other providers we monitor differently

A non-exhaustive sample of how other temp-mail operators get caught:

The direct-API-scrape channel exists specifically for providers that both rotate fast AND expose APIs. Other operator patterns get caught by other detection layers.

Why operators don’t lock down their APIs

A natural question: if disposable-mail operators don’t want their domains in detection databases, why do they expose unauthenticated APIs that publish their inventory?

Three reasons:

  1. Product UX requires it. The mail.tm web app needs to render a domain selector. The user picks one before getting their inbox URL. Hiding the list breaks the product.

  2. API access is part of the value prop. mail.tm’s developer-friendly API is one of their marketing angles. Locking it would alienate the developer segment of their users.

  3. The detection isn’t really the threat. Verifier APIs blocking their domains affects sign-up-form traffic, which is a small slice of overall temp-mail use (most users want temp-mail for newsletter signups, not for adversarial signup fraud). The operators have priced their business model around that reality.

The detection-vs-evasion arms race is shaped by these incentives. Operators that go to lengths to evade detection (Cloudflare fronting, MX cycling) become harder to detect, but at the cost of UX friction. The three providers covered here have settled on the “we publish; you detect; equilibrium” trade-off.

How this connects to the rest of the freshness pipeline

The scraper channel handles fast-rotating providers via known APIs. The CT-log scanner handles brand-new providers issuing first TLS certs. Customer-consensus promotion handles operators that hide from both channels. Static-list sync handles the slow-moving baseline.

The scraper channel specifically is the smallest by domain count but the most reliable per-entry — every domain captured here has the operator’s own API confirming ownership. The other channels involve more inference.

What this means for your signup form

When /v1/check hits a 1secmail / mail.tm / mail.gw domain captured via the scraper channel:

{
  "result": "undeliverable",
  "reason": "disposable",
  "reason_message": "This email provider doesn't deliver mail reliably. Please use a real address.",
  "disposable": true,
  "score": 0.0,
  "detection_source": "scraper:1secmail"
}

The detection_source field surfaces the scraper that captured the domain — useful for audit trails. The verdict itself is the same as any Tier-1 disposable hit.
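On the signup-form side, handling that response could look like the sketch below. It only parses a response body shaped like the example above; the HTTP call is left out because the request format is not shown here, and the should_block_signup helper is hypothetical.

```python
import json


def should_block_signup(response_body):
    """Return (block, audit_note) for a /v1/check JSON response body."""
    data = json.loads(response_body)
    block = bool(data.get("disposable")) or data.get("reason") == "disposable"
    # Keep the detection source for the audit trail only when blocking.
    note = data.get("detection_source", "unknown") if block else None
    return block, note


sample = """
{
  "result": "undeliverable",
  "reason": "disposable",
  "reason_message": "This email provider doesn't deliver mail reliably. Please use a real address.",
  "disposable": true,
  "score": 0.0,
  "detection_source": "scraper:1secmail"
}
"""
print(should_block_signup(sample))  # (True, 'scraper:1secmail')
```

The reason_message field is already written for end users, so a form could show it directly next to the email input when the check blocks.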

FAQ

Are you violating these providers’ terms of service by scraping them?

The APIs are publicly documented and unauthenticated. We make polite, low-volume requests (one query per day per provider) using a standard User-Agent, and we respect rate limits. We’re not depleting their inventory or DoS-ing the service. No provider has objected, and even if one did, the data is public information the provider itself chose to publish.

Can you catch a 1secmail domain on the same day it’s added to their rotation?

Within 24 hours, yes — the cron runs daily. For sub-24h freshness, the CT-log channel sometimes catches new 1secmail-cluster domains earlier if they’re issuing certs we’d see via crt.sh. Combined median latency on a new domain is ~12-18 hours.

What about providers that don’t publish a public API?

They’re caught via a headless Playwright probe of the operator’s UI (the scraped-ui source tag). The probe renders the page, finds the inbox-dropdown widget, and extracts every domain it lists. It’s slower per operator than API scraping, but it works on any operator with a visible UI.

Why daily and not hourly?

Cost-benefit. 1secmail’s rotation rate is multi-day; mail.tm’s is multi-week. Hourly polling would burn quota for almost no marginal catch rate vs daily. The trade-off favors a faster CT-log poll (hourly) and a slower scraper poll (daily) — they cover different freshness lanes.

Do you maintain a downloadable list of caught 1secmail domains?

No public list. Access is per query: you get the verdict on any domain you check through vrfymail’s API. Pre-publishing the full catch would let the operators game the rotation harder.

Get the freshest disposable coverage in one API call

Every /v1/check hits the production table including this morning’s scraper run. Free tier: 5,000 verifies/month, no card. Get an API key →