rxtranscribe (Speech-to-Text)

Description

This package provides real-time speech-to-text (S2T) functionality using a WebSocket that streams audio data to AWS Transcribe. The team at Buccaneer decided to write a package for it for three reasons:

  1. When this package was written in fall 2019, the Transcribe API in aws-sdk did not support real-time audio streaming, and Amazon did not provide an official JavaScript client for real-time subscription. For many speech-to-text use cases, real-time transcription is mandatory. (Tip: If your goal is to transcribe an audio file in its entirety and you can wait for the whole file to be processed, you can use the aws-sdk and you don't need this library!)

  2. This implementation makes it easy to plug AWS Transcribe into RxJS 6 pipelines.

  3. This package encapsulates the business logic of AWS Transcribe streaming into a separate npm package so that applications can focus on what they're good at and don't need to worry about the internals of how AWS Transcribe processes audio data streams.

Installation

yarn add @buccaneer/rxtranscribe
npm i --save @buccaneer/rxtranscribe

Compatibility

💡 This package depends on the Buffer object, so it is not set up to run on clients. It could perhaps be modified to work universally by polyfilling Buffer. The authors have not done this because our use case did not require it to run on clients. If you want to take a stab at implementing it, contact us!
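
As a rough illustration of what a universal approach might involve, the sketch below decodes a base64 string into bytes in either environment. The helper name base64ToBytes is hypothetical and not part of rxtranscribe, and whether a fallback like this would be enough to make the whole package run in browsers is an open assumption.

```javascript
// Hypothetical helper (not part of rxtranscribe): decode a base64 string
// into bytes, using Buffer where it exists and atob otherwise.
function base64ToBytes(base64) {
  if (typeof Buffer !== 'undefined') {
    // Node.js: Buffer is a subclass of Uint8Array.
    return Buffer.from(base64, 'base64');
  }
  // Browser: atob returns a binary string with one character per byte.
  const binary = atob(base64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}
```

Because both branches return a Uint8Array (or a subclass of it), downstream code that only reads bytes by index would not need to know which environment it is running in.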

Basic Usage

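import { map } from 'rxjs/operators';
import { transcribe } from '@buccaneer/rxtranscribe';

// chunk$ is an Observable of base64-encoded .wav audio chunks.
// transcribe() takes a stream of audio chunks (Buffer, String, Blob or Typed Array).
const buffer$ = chunk$.pipe(
  map(chunkStr => Buffer.from(chunkStr, 'base64')), // decode each chunk into a Buffer
  transcribe()
);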
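
The decoding step in the pipeline above assumes each incoming chunk is an independently base64-encoded slice of the audio. A minimal sketch of that round trip, using only the Node.js Buffer API, might look like the following. The helper names and the 4096-byte default chunk size are illustrative assumptions, not part of rxtranscribe.

```javascript
// Hypothetical helpers (not part of rxtranscribe) showing how audio bytes
// can be split into base64 chunks and decoded back into Buffers.

// Split an audio Buffer into base64-encoded chunks of at most chunkSize bytes.
function toBase64Chunks(audio, chunkSize = 4096) {
  const chunks = [];
  for (let offset = 0; offset < audio.length; offset += chunkSize) {
    chunks.push(audio.subarray(offset, offset + chunkSize).toString('base64'));
  }
  return chunks;
}

// Decode each chunk the same way the map() step does, then join the results.
function fromBase64Chunks(chunks) {
  return Buffer.concat(chunks.map(chunk => Buffer.from(chunk, 'base64')));
}
```

Each chunk is encoded on its own, so it can be decoded as soon as it arrives without waiting for the rest of the stream.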

💡 One limitation of the package is that it currently supports only .wav files as inputs, which are encoded as .mp4 files. It would be nice to support a wider variety of file types. If you know how to do this and want to help implement it, please contact us.

Documentation & Guides
