Brand safety you can trust
Built to meet brand safety standards, without bias against marginalized voices.
What we screen for
Every Urban Dictionary entry is scored across 15+ harm categories.
Core Toxicity
  • General toxicity
  • Severe toxicity
  • Obscene language
  • Insults
Identity & Hate
  • Identity-based attacks
  • Race, gender, religion, or sexuality-related content
Harmful Content
  • Threats
  • Sexually explicit language
  • Self-harm or suicide references
  • Violence or criminal content

Our classifier is based on Detoxify's unbiased-toxic-roberta model, which was trained on the Jigsaw Unintended Bias in Toxicity Classification dataset.

This model was selected for its high accuracy and its lower false-positive rate on content from marginalized groups. We calibrate our thresholds using guidance from Google Publisher Policies.

Our ABCD safety grades

Each page receives a composite safety grade based on the most concerning content on the page.

Grade A (Safest Content): Minimal risk. Eligible for all advertisers.
Grade B (Generally Safe): Minor concerns. Eligible for most advertisers.
Grade C (Moderate Risk): Elevated scores. Ad eligibility restricted.
Grade D (High Risk): Violates policy. Ads are blocked.

Note: Pages without a completed safety review default to Grade D to protect buyers.
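
For readers who want to see the grading rule in concrete terms, here is a minimal Python sketch. It assumes each definition on a page has a dictionary of category scores between 0 and 1, takes the single highest score on the page as "the most concerning content", and falls back to Grade D when no review exists. The cut-off values and the names GRADE_CUTOFFS, grade_score, and grade_page are hypothetical; the actual thresholds are not stated on this page.

```python
from typing import Mapping, Optional, Sequence

# Hypothetical cut-offs for illustration only; the real thresholds are not published here.
GRADE_CUTOFFS = [(0.10, "A"), (0.40, "B"), (0.70, "C")]


def grade_score(score: float) -> str:
    """Map one score in [0, 1] to a letter grade using the cut-offs above."""
    for cutoff, grade in GRADE_CUTOFFS:
        if score < cutoff:
            return grade
    return "D"


def grade_page(entries: Optional[Sequence[Mapping[str, float]]]) -> str:
    """Grade a page from the per-category scores of each definition on it.

    `entries` holds one {category: score} mapping per definition.
    None or an empty sequence means the page has not been reviewed.
    """
    if not entries:
        # Unreviewed pages default to Grade D, as the note above states.
        return "D"
    # The page grade follows the single worst score found anywhere on the page.
    worst = max(max(scores.values(), default=0.0) for scores in entries)
    return grade_score(worst)
```

In this sketch one high-scoring definition is enough to lower the grade of the whole page, which is the conservative reading of "most concerning content". It does not model compounding scores.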

How our safety system works

Urban Dictionary applies two distinct moderation layers to every piece of content: one for publication and another for advertiser safety. This keeps ads from appearing next to content that exceeds your risk tolerance.

1. Content moderation: Human review ensures entries meet Urban Dictionary's publication guidelines.
2. Advertiser-facing scoring: Published definitions are reanalyzed using unbiased-toxic-roberta, with calibrated thresholds across 15+ categories.
3. Grade assignment: Each page is assigned a Grade A–D based on the most severe or compounding scores on the page.
4. Default exclusions: Pages lacking classification are excluded from ad delivery until reviewed.

Flexible targeting options
Choose the safety level that fits your brand's risk tolerance:
Strict Tier
Most restrictive safety level
  • ✓ Only Grade A pages
  • ✓ Ideal for family-safe or regulated brands
Balanced Tier
Standard safety settings
  • ✓ Grades A & B
  • ✓ Suitable for most mainstream advertisers
Open Tier
Least restrictive
  • ✓ Grades A–C
  • ✓ Broadest reach, suitable for mature brands
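
The three tiers above amount to a lookup from tier to allowed grades. The short Python sketch below shows that mapping. The tier names and grade sets come from the list above, while TIER_GRADES and is_ad_eligible are hypothetical names chosen for illustration, not part of any published API.

```python
# Grade sets per tier, taken from the tier descriptions above.
TIER_GRADES = {
    "strict": {"A"},
    "balanced": {"A", "B"},
    "open": {"A", "B", "C"},
}


def is_ad_eligible(page_grade, tier):
    """Return True if a page with this grade may carry ads under the given tier."""
    # A missing grade is treated as D, mirroring the default for unreviewed pages.
    grade = page_grade or "D"
    return grade in TIER_GRADES[tier]
```

Grade D appears in no tier, so a Grade D or unreviewed page is never eligible, whichever tier a buyer selects.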

Custom controls

Set category-level thresholds (e.g. exclude only identity-based attacks or sexual content) to tailor safety enforcement.
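
As a rough illustration of category-level control, the Python sketch below checks one page's scores against a buyer's per-category ceilings. The category keys identity_attack and sexual_explicit follow the label names Detoxify returns for its unbiased model. The ceiling values, the function name passes_custom_controls, and the choice to fail closed when a score is missing are assumptions, not documented behavior.

```python
def passes_custom_controls(scores, ceilings):
    """Return True if every restricted category is at or below the buyer's ceiling.

    `scores` maps category -> score in [0, 1] for a page.
    `ceilings` maps category -> highest score the buyer accepts.
    Categories the buyer did not list are not checked.
    """
    # A restricted category with no score is treated as 1.0 (fail closed); this is an assumption.
    return all(scores.get(category, 1.0) <= limit
               for category, limit in ceilings.items())


# Hypothetical buyer settings: restrict only identity attacks and sexual content.
example_ceilings = {"identity_attack": 0.05, "sexual_explicit": 0.10}
```

With these example settings, a page with a high obscene score but low identity_attack and sexual_explicit scores would still pass, which is the "exclude only" behavior described above.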

Built for transparency

✓ Open methodology

  • Model details, thresholds, and scoring code are available for audit.

✓ Bias mitigation

  • We use fairness-optimized models and conduct regular audits to reduce disproportionate impact on content from marginalized groups.

✓ Full control

  • CSV exports, dashboard tools, and override options let you tune your brand safety settings.

Have a question about our brand safety policies?

Get detailed information about our moderation system, thresholds, and implementation.

Urban Dictionary Ads
Urban Dictionary for Business
The internet talks here. Your brand should too.
© 2025 Urban Dictionary LLC