Most language on the open web is formal: news, encyclopedias, product reviews, support forums. The vocabulary people use when they’re not being watched is mostly missing from training sets.
Urban Dictionary is the exception. The community has documented this layer of language for 26 years, in plain text, with definitions written by the people who use the words.
A model trained on it will know what “rizz” or “glaze” or “menty b” actually mean. A model that wasn’t, won’t.
The four assets above can be licensed separately or together. Commercial and research tiers are priced differently.
Delivery can be a one-time corpus dump, scheduled updates, or a live feed connection. Exclusivity is available for the live feed and for early access to new entries. The historical corpus is licensed non-exclusively.
Tell us what you’re trying to do. We’ll come back with terms.