API Reference

Overview

To streamline and standardize the experience across providers, our API exposes a common interface to many ASR providers, their transcription-related services, and the other APIs you need to build modern conversational apps. Under the hood, this interface maps provider services (such as AWS Transcribe) to a single workflow common to all providers, so you don't have to deal with provider-specific job APIs, webhooks, error handling, streaming endpoints, and so on, while still letting you mix and match components between providers.

Authentication

Authentication is handled via a Bearer token. Any request to a protected resource must include this token in the Authorization header.

Authorization: Bearer <your-token-goes-here>
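In practice, you build this header once and attach it to every request. A minimal sketch in Python (the token value is a placeholder, and any HTTP client will do):

```python
API_TOKEN = "your-token-goes-here"  # substitute the token you generated

# Every request to a protected resource carries the token
# in the Authorization header.
headers = {"Authorization": f"Bearer {API_TOKEN}"}
```

Pass `headers` to your HTTP client of choice when calling any endpoint below.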

Generating a token

You can generate an API token here.

Supported Providers

Supported Providers for ASR (speech to text):

  • AWS
  • Deepgram
  • AssemblyAI

Supported Providers for NLP (on the transcript from ASR):

  • AWS (Comprehend)
  • SymblAI

Language Codes

Whenever interfacing with the API, refer to languages by their BCP 47 codes. The table below shows the language codes we support and what each one maps to, under the hood, for every provider we support.

| API Language Code | AWS Transcribe | AWS Comprehend (NLP) | Deepgram | AssemblyAI | Symbl (NLP) | Symbl STT |
| --- | --- | --- | --- | --- | --- | --- |
| en | | en | en | en | en | en |
| af-ZA | af-ZA | | | | | |
| ar-AE | ar-AE | | | | | |
| ar-SA | ar-SA | ar* | | | | ar-SA |
| da-DK | da-DK* | | | | | |
| de-CH | de-CH | | | | | |
| de-DE | de-DE | de | de-DE | de | | de-DE |
| el-GR* | | | | | | |
| en-AB | en-AB | | | | | |
| en-AU | en-AU | | en-AU | en_au | | en-AU |
| en-GB | en-GB | | en-GB | en_uk | | en-GB |
| en-IE | en-IE | | | | | en-IE |
| en-IN | en-IN | | en-IN | | | en-IN |
| en-NZ | en-NZ | | en-NZ | | | |
| en-US | en-US | en | en-US | en_us | en-US | en-US |
| en-WL | en-WL | | | | | |
| en-ZA | en-ZA | | | | | en-ZA |
| es-ES | es-ES | es | es-ES | es | | es-ES |
| es-US | es-US | | | | | es-US |
| fa-IR | fa-IR | | | | | fa-IR |
| fr-CA | fr-CA | | fr-CA | | | fr-CA |
| fr-FR | fr-FR | | fr-FR | fr | | fr-FR |
| he-IL | he-IL* | | | | | |
| hi-IN | hi-IN | hi | hi-IN | hi | | hi-IN |
| id-ID | id-ID | | id-ID | | | |
| it-IT | it-IT | it | it-IT | it | | it-IT |
| ja-JP | ja-JP | ja | ja-JP | ja | | ja-JP |
| ko-KR | ko-KR | ko | ko-KR* | | | |
| ms-MY | ms-MY | | | | | |
| nl-NL | nl-NL | | nl-NL | nl | | nl-NL |
| no-NO* | | | | | | |
| pt-BR | pt-BR | | pt-BR | | | pt-BR |
| pt-PT | pt-PT | | pt-PT | pt | | pt-PT |
| ru-RU | ru-RU | | ru-RU* | | | ru-RU |
| sv-SE | sv-SE | | sv-SE* | | | |
| ta-IN | ta-IN | | | | | |
| te-IN | te-IN | | | | | |
| th-TH | th-TH | | | | | |
| tr-TR | tr-TR | | tr-TR | | | |
| uk-UA | uk-UA | | uk-UA | | | |
| zh-CN | zh-CN | zh | zh-CN* | | | |
| zh-TW | zh-TW | zh-TW | zh-TW | | | |

Transcription API

Get transcripts

Method: [GET]
https://api.exemplary.ai/v1/transcript

Returns all transcripts.
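A quick sketch of this call using Python's standard library, assuming the Bearer-token header from the Authentication section (the request is constructed but not sent here; pass it to `urlopen` to execute it):

```python
from urllib.request import Request

# GET /v1/transcript — list all transcripts for the authenticated account.
req = Request(
    "https://api.exemplary.ai/v1/transcript",
    headers={"Authorization": "Bearer <your-token-goes-here>"},
    method="GET",
)

# urllib.request.urlopen(req) would return the JSON array of transcripts.
```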

Create a new transcription job

Method: [POST]
https://api.exemplary.ai/v1/transcript

Creates a new transcription job to be processed, and returns the created transcript entry.

Supported ASR providers:

  • AWS Transcribe
  • Deepgram
  • AssemblyAI

Body Parameters:

url: string - required

URL of an input audio or video file.

provider: string - required

Specifies a supported ASR provider: one of aws, deepgram, or assemblyai.

speaker_labels: boolean

Returns labels that segment the different speakers in the source media.

punctuate: boolean

Specifies whether a punctuation model should be applied during ASR.

language: string

Specifies the language and variant of the source media as a BCP 47 code. We map these to each provider's own language codes, so you don't have to keep track of every provider's unique style. If an incompatible language code is supplied, we return an error.

sentiment: boolean

Specifies whether Sentiment Analysis should be run on the transcript.

summarize: boolean

Specifies whether to generate summaries for the transcript.

auto_highlights: boolean

Specifies whether to return highlighted keywords and phrases from the transcript.

entities: boolean

Specifies whether entities in the audio should be identified; such as person and company names, email addresses, dates, and locations.

filler_words: boolean

Specifies whether the returned transcript should include filler words.

content_moderation: boolean

Specifies whether sensitive content should be identified. Returns what sensitive content was found, with timestamps.

postprocessing: object

(NOTE: The postprocessing parameter will soon be deprecated in favor of a better UX for adding NLP steps to your transcript.)

Specifies which postprocessing (NLP) tasks to run, and through which providers, as shown below.

Supported NLP providers:

  • aws (AWS Comprehend)
  • symblai (SymblAI)

Possible options:

{
  "postprocessing": {
    "questions": <"symblai" | "aws">,
    "sentiment": <"symblai" | "aws">,
    "entities": <"aws">
  }
}
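Putting the body parameters together, a hypothetical create-transcription payload might look like the sketch below (the media URL is a placeholder; parameter names come from the list above):

```python
import json

# Example body for POST /v1/transcript. Only url and provider are required;
# the rest are optional feature flags described above.
payload = {
    "url": "https://example.com/meeting.mp3",  # placeholder media URL
    "provider": "deepgram",                    # one of: aws, deepgram, assemblyai
    "language": "en-US",                       # a supported BCP 47 code
    "speaker_labels": True,
    "punctuate": True,
    "postprocessing": {
        "questions": "symblai",
        "sentiment": "aws",
        "entities": "aws",
    },
}

# Basic client-side sanity check before sending.
assert payload["provider"] in {"aws", "deepgram", "assemblyai"}

body = json.dumps(payload)  # send this as the JSON request body
```

POST `body` to https://api.exemplary.ai/v1/transcript with your Authorization header to create the job.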

Our Intermediate Format

At the core of our API is the collection of transcripts and the items related to them. Every provider specific response is transformed into our intermediate representation — one that is readable, flexible, easily compressible, and provider agnostic, while supporting the transforms needed for performing NLP tasks on conversational data.

This allows you to think of each provider as just that: a provider, without all the custom job management, callback handling, and error reporting. Use our API to quickly take advantage of different providers, mix and match NLP tasks on the generated transcript, and build on the other niceties we provide.

A draft version of our Intermediate Format follows a straightforward structure:

{
  "words": [
    {
      "text": "Hey",
      "start": 0.54,
      "end": 0.81,
      "score": 0.9997
    },
    {
      "text": "everybody.",
      "start": 0.81,
      "end": 1.86,
      "score": 0.9982
    },
    ...
  ],
  "speakers": [
    {
      "speaker": "1",
      "start": 0.54,
      "end": 19.86
    },
    {
      "speaker": "2",
      "start": 22.34,
      "end": 33.26
    },
    ...
  ]
}
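Because the format is provider-agnostic, consuming it is simple. A sketch, assuming a parsed response in the draft shape above: rebuild the plain-text transcript and pick out the words spoken in a given speaker's segments by comparing timestamps.

```python
# A small intermediate-format document in the draft shape shown above.
transcript = {
    "words": [
        {"text": "Hey", "start": 0.54, "end": 0.81, "score": 0.9997},
        {"text": "everybody.", "start": 0.81, "end": 1.86, "score": 0.9982},
    ],
    "speakers": [
        {"speaker": "1", "start": 0.54, "end": 19.86},
    ],
}

def full_text(doc):
    """Concatenate word tokens into a readable transcript."""
    return " ".join(w["text"] for w in doc["words"])

def words_for_speaker(doc, speaker_id):
    """Words whose timestamps fall inside one of a speaker's segments."""
    segments = [s for s in doc["speakers"] if s["speaker"] == speaker_id]
    return [
        w["text"]
        for w in doc["words"]
        if any(s["start"] <= w["start"] and w["end"] <= s["end"] for s in segments)
    ]
```

The same two helpers work unchanged whichever ASR provider produced the transcript, which is the point of the intermediate representation.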

Getting help & Feature Requests

If any feature you'd like, provider-specific or otherwise, is missing, please file a feature request here, and we'll figure out the best way to represent a common interface for it.

Please search for or create issues on our GitHub Issues page.

Don't forget to join our Discord and feel free to ask any questions there!
