Try and compare Deepgram with other providers

  • About - Deepgram provides developers with the tools they need to easily add AI speech recognition to applications. It handles practically any audio file format and returns transcripts quickly. Build better voice applications with faster, more accurate transcription through AI speech recognition.

  • Models -

  • Languages - Deepgram supports Dutch, English, French, German, Hindi, Indonesian, Italian, Japanese, Korean, Mandarin, Portuguese, Russian, Spanish, Swedish, Turkish, and Ukrainian.

  • Features -

    • Tiers - Deepgram’s Tier feature lets you associate your API requests with a specific tier, which indicates the level of model you would like to use in your request. For self-serve customers, Deepgram provides Enhanced and Base model tiers. Enhanced models are Deepgram’s newest, most powerful ASR models. They generally have higher accuracy and better word recognition than the Base models, and they handle uncommon words significantly better. Base models are built on Deepgram’s signature end-to-end deep learning speech model architecture and offer a solid combination of accuracy and cost effectiveness. Once you have chosen your tier and model, you can select an available language and a version. To learn more about models, see Models. To learn more about pricing, see Deepgram’s Pricing & Plans.
    • Punctuation - Deepgram’s Punctuation feature adds punctuation and capitalization to your transcript.
    • Profanity Filter - Deepgram’s Profanity Filter feature looks for recognized profanity and converts it to the nearest recognized non-profane word or removes it from the transcript completely.
    • Redaction - Deepgram’s Redaction feature redacts sensitive information, replacing redacted content with asterisks (*). For example, if you chose to redact social security numbers, "My social security number is five five five two two one one one one" would appear in your transcript as "My social security number is *********".
    • NER - Deepgram’s Named-Entity Recognition (NER) feature recognizes alphanumerics in audio and removes whitespace between the characters in the transcript. Some examples of use cases for NER include:
      • Tracking numbers for packages
      • Company names that are initialisms (for example, IBM)
      • Account numbers and customer codes
    • Keywords / Boosting - Just like a human listener, Deepgram can better understand mumbled, distorted, or otherwise hard-to-decipher speech when it knows the context of the conversation. When using Deepgram’s API to transcribe audio, you can specify keywords to which the model should pay particular attention to help it understand context; this is known as keyword boosting. Similarly, you can suppress keywords. Traditional keyword boosting works on in-vocabulary words: the model you are applying has been trained on the words, so they already exist in the model’s vocabulary. Out-of-vocabulary (OOV) keyword boosting can help your model recognize words it has yet to be trained on or hasn’t encountered frequently. Proper names, product names, industry-specific terms, and addresses are great candidates for keyword boosting.
  • Pricing - Deepgram gets you started with $150 in free credits, no credit card required. Reach out to their Sales Team about a custom Premium plan for tailored accuracy gains, volume pricing, private cloud and on-prem deployments, and dedicated support for your team.
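
The features listed above are selected per request. As a rough sketch of how that might look, the Python snippet below assembles a request URL with one query parameter per feature. The endpoint (https://api.deepgram.com/v1/listen), the parameter names (tier, language, punctuate, profanity_filter, redact, ner, keywords), the "word:boost" keyword format, and the use of a negative boost to suppress a keyword are all assumptions not stated on this page, and build_listen_url is a hypothetical helper; check them against Deepgram's API reference before relying on them. The snippet only builds the URL and does not send any audio.

```python
from urllib.parse import urlencode

# Assumed endpoint for transcription requests; not stated on this page.
BASE_URL = "https://api.deepgram.com/v1/listen"


def build_listen_url(tier="enhanced", language="en", punctuate=True,
                     profanity_filter=False, redact=None, ner=False,
                     keywords=None):
    """Hypothetical helper: return a URL with the chosen features as query parameters.

    All parameter names are assumptions about the API, used for illustration.
    """
    params = [("tier", tier), ("language", language)]
    if punctuate:
        params.append(("punctuate", "true"))
    if profanity_filter:
        params.append(("profanity_filter", "true"))
    # One "redact" parameter per category of sensitive information (e.g. "ssn").
    for category in redact or []:
        params.append(("redact", category))
    if ner:
        params.append(("ner", "true"))
    # Each keyword is sent as "word:boost"; a negative boost is assumed to suppress it.
    for word, boost in (keywords or {}).items():
        params.append(("keywords", f"{word}:{boost}"))
    return f"{BASE_URL}?{urlencode(params)}"


url = build_listen_url(redact=["ssn"], ner=True,
                       keywords={"Deepgram": 2, "um": -10})
print(url)
```

Repeating the keywords parameter once per term keeps each boost value attached to its own word, and leaving out a parameter (here profanity_filter) simply omits that feature from the request.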