GreenPT Docs

Pre-Recorded Audio (STT)

Transcribe uploaded audio files with speaker diarization and multi-language support.

The API follows OpenAI- and Deepgram-compatible patterns.

SDK integration

Use the official Deepgram SDK with our API endpoint.

JavaScript

import { createClient } from '@deepgram/sdk';

const client = createClient('YOUR_API_KEY', {
  global: {
    fetch: { options: { url: 'https://api.greenpt.ai/v1' } },
  },
});

Python

from deepgram import DeepgramClient
from deepgram.environment import DeepgramClientEnvironment

# Create environment for GreenPT API
greenpt_env = DeepgramClientEnvironment(
    base="https://api.greenpt.ai",
    production="wss://api.greenpt.ai",
    agent="wss://api.greenpt.ai",
)

deepgram = DeepgramClient(api_key="YOUR_API_KEY", environment=greenpt_env)

Endpoint

POST https://api.greenpt.ai/v1/listen

Request body

Required and optional parameters.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| file | binary | Yes | The audio file to transcribe (WAV, MP3, FLAC, etc.). |
| model | string | No | Speech model to use. Defaults to "green-s". |
| language | string | No | Language code (e.g. "en", "fr", "de"). Auto-detected if not specified. |
| diarize | boolean | No | Enable speaker diarization to identify different speakers. |
| punctuate | boolean | No | Add punctuation and capitalization to the transcript. |
| smart_format | boolean | No | Apply formatting to the transcript output for improved readability. |
| filler_words | boolean | No | Include filler words like "uh" and "um" in the transcript. |
| numerals | boolean | No | Convert numbers from written format to numerical format. |
| sentiment | boolean | No | Analyze sentiment throughout the transcript. |
| topics | boolean | No | Detect topics throughout the transcript. |
| intents | boolean | No | Recognize speaker intent throughout the transcript. |
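
The curl examples on this page pass these options as query-string parameters on the endpoint URL. A minimal Python sketch of that, using only the standard library: `build_listen_url` is a hypothetical helper name, and serializing booleans as lowercase `true`/`false` is an assumption taken from the curl examples rather than a documented rule.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.greenpt.ai/v1/listen"  # endpoint from this page


def build_listen_url(**params):
    """Return the listen URL with the given options as a query string.

    Hypothetical helper: booleans become lowercase "true"/"false" (as in the
    curl examples) and options set to None are left out.
    """
    query = {}
    for name, value in params.items():
        if value is None:
            continue
        if isinstance(value, bool):
            value = "true" if value else "false"
        query[name] = value
    return f"{BASE_URL}?{urlencode(query)}" if query else BASE_URL


url = build_listen_url(model="green-s", language="en", diarize=True, punctuate=True)
print(url)
```

Leaving `language` out of the call omits it from the URL, which the table above describes as falling back to auto-detection.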

Example request: local file

curl \
  --request POST \
  --header 'Authorization: Token YOUR_API_KEY' \
  --header 'Content-Type: audio/wav' \
  --data-binary @youraudio.wav \
  --url 'https://api.greenpt.ai/v1/listen?model=green-s&language=en&diarize=true&punctuate=true'

Example request: URL / bucket

curl \
  --request POST \
  --header 'Authorization: Token YOUR_API_KEY' \
  --header 'Content-Type: application/json' \
  --data '{"url":"https://static.deepgram.com/examples/Bueller-Life-moves-pretty-fast.wav"}' \
  --url 'https://api.greenpt.ai/v1/listen?model=green-s&language=en&diarize=true&punctuate=true'
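
For readers not using curl or an SDK, the two requests above can be sketched with Python's standard library. This is an illustration, not an official client: `file_request` and `url_request` are hypothetical helper names, and the requests are only built here, not sent.

```python
import json
import urllib.request

LISTEN_URL = "https://api.greenpt.ai/v1/listen?model=green-s&language=en"


def file_request(api_key, audio_bytes, content_type="audio/wav"):
    """Build a POST carrying raw audio bytes, like `curl --data-binary`."""
    return urllib.request.Request(
        LISTEN_URL,
        data=audio_bytes,
        method="POST",
        headers={
            "Authorization": f"Token {api_key}",
            "Content-Type": content_type,
        },
    )


def url_request(api_key, audio_url):
    """Build a POST whose JSON body points at a hosted audio file."""
    body = json.dumps({"url": audio_url}).encode("utf-8")
    return urllib.request.Request(
        LISTEN_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Token {api_key}",
            "Content-Type": "application/json",
        },
    )


req = url_request(
    "YOUR_API_KEY",
    "https://static.deepgram.com/examples/Bueller-Life-moves-pretty-fast.wav",
)
# To send it: with urllib.request.urlopen(req) as resp: result = json.load(resp)
```

The two request types differ only in the `Content-Type` header and the body: raw audio bytes for a local file, a small JSON object for a hosted one.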

Example response

{
  "metadata": {
    "request_id": "a847f427-4ad5-4d67-9b95-db801e58251c",
    "duration": 25.933313,
    "channels": 1,
    "created": "2024-05-12T18:57:13.426Z"
  },
  "results": {
    "channels": [
      {
        "alternatives": [
          {
            "transcript": "Hello, this is a test of the speech to text API.",
            "confidence": 0.98,
            "words": [
              {
                "word": "Hello",
                "start": 0.5,
                "end": 0.8,
                "confidence": 0.99,
                "speaker": 0
              }
            ]
          }
        ]
      }
    ]
  }
}
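
A short Python sketch of reading this response: it takes the first alternative of the first channel and groups words by their `speaker` value. The helper names are hypothetical, and the sketch assumes the response shape shown above, with `speaker` present when diarization is enabled.

```python
# Trimmed copy of the example response above.
response = {
    "results": {
        "channels": [
            {
                "alternatives": [
                    {
                        "transcript": "Hello, this is a test of the speech to text API.",
                        "confidence": 0.98,
                        "words": [
                            {"word": "Hello", "start": 0.5, "end": 0.8,
                             "confidence": 0.99, "speaker": 0},
                        ],
                    }
                ]
            }
        ]
    }
}


def first_alternative(resp):
    """Return the first alternative of the first channel."""
    return resp["results"]["channels"][0]["alternatives"][0]


def words_by_speaker(alternative):
    """Map each speaker index to the list of words attributed to it."""
    speakers = {}
    for word in alternative.get("words", []):
        # "speaker" is missing when diarize is off; those words go under None.
        speakers.setdefault(word.get("speaker"), []).append(word["word"])
    return speakers


alt = first_alternative(response)
print(alt["transcript"])
print(words_by_speaker(alt))
```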

Complete SDK example

Full working example with the Deepgram SDK.

JavaScript

import { createClient } from '@deepgram/sdk';
import fs from 'fs';

const client = createClient('YOUR_API_KEY', {
  global: {
    fetch: { options: { url: 'https://api.greenpt.ai/v1' } },
  },
});

async function transcribeLocalFile() {
  const { result, error } = await client.listen.prerecorded.transcribeFile(
    fs.readFileSync('path/to/audio.wav'),
    {
      model: 'green-s',
      language: 'en',
      punctuate: true,
      diarize: true,
    },
  );

  if (error) throw error;
  console.log(result);
}

transcribeLocalFile();

Python

from deepgram import DeepgramClient
from deepgram.environment import DeepgramClientEnvironment

# Create environment for GreenPT API
greenpt_env = DeepgramClientEnvironment(
    base="https://api.greenpt.ai",
    production="wss://api.greenpt.ai",
    agent="wss://api.greenpt.ai",
)

deepgram = DeepgramClient(api_key="YOUR_API_KEY", environment=greenpt_env)

# For URL-based transcription
response = deepgram.listen.v1.media.transcribe_url(
    url="https://static.deepgram.com/examples/Bueller-Life-moves-pretty-fast.wav",
    model="green-s",
    language="en",
    punctuate=True,
    diarize=True,
    smart_format=True,
)

print(response)

Available models

Choose the model that fits your language and use case.

green-s: GreenS

Reliable speech-to-text for single-language audio. Great for recordings, podcasts, and archived content.

Supported languages: English, German, Spanish, French, Italian, Dutch, Portuguese, Romanian.

green-s-pro: GreenS Pro

Advanced model with automatic language detection. Ideal for international content and mixed-language recordings.

Supported languages: English, German, Dutch, Swedish, Turkish.

Multilingual: set language=multi to detect and transcribe multiple languages within the same file.

Multilingual mode

Transcribe conversations where speakers switch between languages.

With green-s-pro, set language=multi to transcribe audio where multiple languages are spoken. The model automatically detects and transcribes each language as speakers switch.

Languages supported in multilingual mode: English, Spanish, French, German, Hindi, Russian, Portuguese, Japanese, Italian, Dutch.

Price difference

Multilingual processing costs more than single-language processing. For pre-recorded audio, the rate is €0.60/hour for multilingual versus €0.52/hour for monolingual.

Output differences

With language=multi, the response adds a languages array and a language field per word:

"alternatives": [{
  "transcript": "No recuerdo mi bank password.",
  "languages": ["es", "en"],
  "words": [
    { "word": "no", "language": "es" },
    { "word": "recuerdo", "language": "es" },
    { "word": "bank", "language": "en" }
  ]
}]
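
One way to use the per-word `language` field is to split the transcript into consecutive same-language runs. The sketch below is illustrative: `language_runs` is a hypothetical helper, and it assumes every word carries a `language` key, as in the fragment above.

```python
from itertools import groupby

# Words from the example fragment above.
words = [
    {"word": "no", "language": "es"},
    {"word": "recuerdo", "language": "es"},
    {"word": "bank", "language": "en"},
]


def language_runs(words):
    """Collapse consecutive words in the same language into (language, text) pairs."""
    return [
        (lang, " ".join(w["word"] for w in group))
        for lang, group in groupby(words, key=lambda w: w["language"])
    ]


print(language_runs(words))
```

Because `groupby` only merges adjacent items, a speaker who switches back to an earlier language produces a new run rather than being folded into the first one.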

Available features

Add-on capabilities for pre-recorded transcription.

| Feature | Description | green-s | green-s-pro |
| --- | --- | --- | --- |
| Speaker diarization | Identify different speakers in the audio. | Yes | Yes |
| Entity detection | Detect names, dates, and other entities in the transcript. | Yes | Yes |
| Language detection | Automatically detect spoken language. | Yes | Yes |
| Profanity filter | Filter or mask profanity in the transcript. | Yes | Yes |
| Speech intent & topics | Detect topics and speaker intent. | Yes | Yes |
| Summarization | Generate a summary of the transcript (English recommended). | Yes | Yes |
| Smart formatting | Improved punctuation and readability (English only). | Yes | - |

Pricing

Pre-recorded transcription rates per hour of audio.

| Model | Rate |
| --- | --- |
| green-s: all supported languages | €0.52 / hour |
| green-s-pro: monolingual | €0.52 / hour |
| green-s-pro: multilingual | €0.60 / hour |

All prices in EUR, excl. taxes. green-s and green-s-pro monolingual share the same pre-recorded rate. Choose green-s-pro for multilingual mode or additional language options.
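
As a worked example of these rates, the sketch below estimates the cost of a recording from its duration in seconds (for instance, the `duration` field in the response metadata). It assumes pro-rata billing by the second and rounding to the cent, neither of which this page confirms, and `estimate_cost` is a hypothetical helper.

```python
from decimal import Decimal, ROUND_HALF_UP

# Rates from the pricing table, keyed by (model, multilingual).
RATES_EUR_PER_HOUR = {
    ("green-s", False): Decimal("0.52"),
    ("green-s-pro", False): Decimal("0.52"),
    ("green-s-pro", True): Decimal("0.60"),
}


def estimate_cost(model, duration_seconds, multilingual=False):
    """Estimate the pre-tax cost in EUR, assuming pro-rata per-second billing."""
    rate = RATES_EUR_PER_HOUR[(model, multilingual)]
    hours = Decimal(str(duration_seconds)) / Decimal(3600)
    return (rate * hours).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# A 90-minute recording (5400 seconds):
print(estimate_cost("green-s", 5400))                         # 0.78
print(estimate_cost("green-s-pro", 5400, multilingual=True))  # 0.90
```

Treat the results as estimates: actual invoices depend on billing granularity and taxes.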
