Pre-Recorded Audio (STT)
Transcribe uploaded audio files with advanced features like speaker diarization and multi-language support. Our API follows OpenAI- and Deepgram-compatible patterns.
SDK integration
Use the official Deepgram SDK with our API endpoint.
import { createClient } from '@deepgram/sdk';

const client = createClient('YOUR_API_KEY', {
  global: {
    fetch: { options: { url: 'https://api.greenpt.ai/v1' } },
  },
});

from deepgram import DeepgramClient
from deepgram.environment import DeepgramClientEnvironment
# Create environment for GreenPT API
greenpt_env = DeepgramClientEnvironment(
    base="https://api.greenpt.ai",
    production="wss://api.greenpt.ai",
    agent="wss://api.greenpt.ai",
)
deepgram = DeepgramClient(api_key="YOUR_API_KEY", environment=greenpt_env)
Endpoint
POST https://api.greenpt.ai/v1/listen
Request body
Required and optional parameters.
| Parameter | Type | Required | Description |
|---|---|---|---|
| file | binary | Yes | The audio file to transcribe (WAV, MP3, FLAC, etc.). |
| model | string | No | Speech model to use. Defaults to "green-s". |
| language | string | No | Language code (e.g. "en", "fr", "de"). Auto-detected if not specified. |
| diarize | boolean | No | Enable speaker diarization to identify different speakers. |
| punctuate | boolean | No | Add punctuation and capitalization to the transcript. |
| smart_format | boolean | No | Apply formatting to the transcript output for improved readability. |
| filler_words | boolean | No | Include filler words like "uh" and "um" in the transcript. |
| numerals | boolean | No | Convert numbers from written format to numerical format. |
| sentiment | boolean | No | Analyze sentiment throughout the transcript. |
| topics | boolean | No | Detect topics throughout the transcript. |
| intents | boolean | No | Recognize speaker intent throughout the transcript. |
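In the example requests on this page, options such as model, language, diarize, and punctuate are passed as query-string parameters on the endpoint URL. The following is a minimal Python sketch of assembling such a URL with the standard library. The helper name build_listen_url is hypothetical, and rendering booleans as lowercase true/false is an assumption based on the curl examples rather than a documented rule.

```python
from urllib.parse import urlencode

# Endpoint as documented on this page.
LISTEN_URL = "https://api.greenpt.ai/v1/listen"


def build_listen_url(**options):
    """Return the /v1/listen URL with the given options as query parameters.

    Hypothetical helper: booleans become lowercase "true"/"false" (as in the
    curl examples), and options set to None are skipped so that the server
    defaults apply.
    """
    query = {}
    for name, value in options.items():
        if value is None:
            continue
        if isinstance(value, bool):
            value = "true" if value else "false"
        query[name] = value
    if not query:
        return LISTEN_URL
    return f"{LISTEN_URL}?{urlencode(query)}"


url = build_listen_url(model="green-s", language="en", diarize=True, punctuate=True)
print(url)
```

Leaving language out of the call (or passing None) omits it from the URL, which corresponds to the auto-detection behavior described in the table.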
Example request: local file
curl \
  --request POST \
  --header 'Authorization: Token YOUR_API_KEY' \
  --header 'Content-Type: audio/wav' \
  --data-binary @youraudio.wav \
  --url 'https://api.greenpt.ai/v1/listen?model=green-s&language=en&diarize=true&punctuate=true'
Example request: URL / bucket
curl \
  --request POST \
  --header 'Authorization: Token YOUR_API_KEY' \
  --header 'Content-Type: application/json' \
  --data '{"url":"https://static.deepgram.com/examples/Bueller-Life-moves-pretty-fast.wav"}' \
  --url 'https://api.greenpt.ai/v1/listen?model=green-s&language=en&diarize=true&punctuate=true'
Example response
{
  "metadata": {
    "request_id": "a847f427-4ad5-4d67-9b95-db801e58251c",
    "duration": 25.933313,
    "channels": 1,
    "created": "2024-05-12T18:57:13.426Z"
  },
  "results": {
    "channels": [
      {
        "alternatives": [
          {
            "transcript": "Hello, this is a test of the speech to text API.",
            "confidence": 0.98,
            "words": [
              {
                "word": "Hello",
                "start": 0.5,
                "end": 0.8,
                "confidence": 0.99,
                "speaker": 0
              }
            ]
          }
        ]
      }
    ]
  }
}
Complete SDK example
Full working example with the Deepgram SDK.
import { createClient } from '@deepgram/sdk';
import fs from 'fs';

// Point the Deepgram SDK at the GreenPT API endpoint.
const client = createClient('YOUR_API_KEY', {
  global: {
    fetch: { options: { url: 'https://api.greenpt.ai/v1' } },
  },
});

async function transcribeLocalFile() {
  // Send the audio file contents together with the transcription options.
  const { result, error } = await client.listen.prerecorded.transcribeFile(
    fs.readFileSync('path/to/audio.wav'),
    {
      model: 'green-s',
      language: 'en',
      punctuate: true,
      diarize: true,
    },
  );
  if (error) throw error;
  console.log(result);
}

transcribeLocalFile();

from deepgram import DeepgramClient
from deepgram.environment import DeepgramClientEnvironment
# Create environment for GreenPT API
greenpt_env = DeepgramClientEnvironment(
    base="https://api.greenpt.ai",
    production="wss://api.greenpt.ai",
    agent="wss://api.greenpt.ai",
)
deepgram = DeepgramClient(api_key="YOUR_API_KEY", environment=greenpt_env)
# For URL-based transcription
response = deepgram.listen.v1.media.transcribe_url(
    url="https://static.deepgram.com/examples/Bueller-Life-moves-pretty-fast.wav",
    model="green-s",
    language="en",
    punctuate=True,
    diarize=True,
    smart_format=True,
)
print(response)
Available models
Choose the model that fits your language and use case.
green-s: GreenS
Reliable speech-to-text for single-language audio. Great for recordings, podcasts, and archived content.
Supported languages: English, German, Spanish, French, Italian, Dutch, Portuguese, Romanian.
green-s-pro: GreenS Pro
Advanced model with automatic language detection. Ideal for international content and mixed-language recordings.
Supported languages: English, German, Dutch, Swedish, Turkish.
Multilingual: use multi for automatic language detection across languages in the same file.
Multilingual mode
Transcribe conversations where speakers switch between languages.
With green-s-pro, set language=multi to transcribe audio where multiple
languages are spoken. The model automatically detects and transcribes each
language as speakers switch.
Languages supported in multilingual mode: English, Spanish, French, German, Hindi, Russian, Portuguese, Japanese, Italian, Dutch.
Price difference
Multilingual processing costs more than single-language processing. For pre-recorded audio, the rate is €0.60/hour for multilingual versus €0.52/hour for monolingual.
Output differences
With language=multi, the response adds a languages array and a language
field per word:
"alternatives": [{
  "transcript": "No recuerdo mi bank password.",
  "languages": ["es", "en"],
  "words": [
    { "word": "no", "language": "es" },
    { "word": "recuerdo", "language": "es" },
    { "word": "bank", "language": "en" }
  ]
}]
Available features
Add-on capabilities for pre-recorded transcription.
| Feature | Description | green-s | green-s-pro |
|---|---|---|---|
| Speaker diarization | Identify different speakers in the audio. | Yes | Yes |
| Entity detection | Detect names, dates, and other entities in the transcript. | Yes | Yes |
| Language detection | Automatically detect spoken language. | Yes | Yes |
| Profanity filter | Filter or mask profanity in the transcript. | Yes | Yes |
| Speech intent & topics | Detect topics and speaker intent. | Yes | Yes |
| Summarization | Generate a summary of the transcript (English recommended). | Yes | Yes |
| Smart formatting | Improved punctuation and readability (English only). | Yes | - |
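With speaker diarization enabled, each word in the example response on this page carries a speaker index. The sketch below shows one way to turn that word list into speaker turns in Python. The function name speaker_turns and the sample data are hypothetical, and the code assumes the response has already been parsed into a dictionary with the same layout as the example response.

```python
from itertools import groupby


def speaker_turns(response):
    """Return (speaker, text) pairs for runs of consecutive same-speaker words.

    Reads the first alternative of the first channel, following the layout
    of the example response on this page.
    """
    words = response["results"]["channels"][0]["alternatives"][0]["words"]
    turns = []
    for speaker, group in groupby(words, key=lambda w: w.get("speaker")):
        turns.append((speaker, " ".join(w["word"] for w in group)))
    return turns


# Hand-written sample in the documented shape (not real API output).
sample = {
    "results": {
        "channels": [
            {
                "alternatives": [
                    {
                        "words": [
                            {"word": "Hello", "speaker": 0},
                            {"word": "there", "speaker": 0},
                            {"word": "Hi", "speaker": 1},
                        ]
                    }
                ]
            }
        ]
    }
}

for speaker, text in speaker_turns(sample):
    print(f"Speaker {speaker}: {text}")
```

Grouping only consecutive words keeps the turns in conversation order, so a speaker who talks twice appears as two separate turns.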
Pricing
Pre-recorded transcription rates per hour of audio.
| Model | Rate |
|---|---|
| green-s: all supported languages | €0.52 / hour |
| green-s-pro: monolingual | €0.52 / hour |
| green-s-pro: multilingual | €0.60 / hour |
All prices in EUR, excl. taxes. green-s and green-s-pro monolingual share
the same pre-recorded rate. Choose green-s-pro for multilingual mode or
additional language options.
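As a rough illustration of the rates above, the following Python sketch estimates the cost of a recording from its duration in seconds (for example, the duration field in the response metadata). The function name estimate_cost_eur is hypothetical, and both the linear per-second proration and the rounding to whole cents are assumptions; this page does not state the billing granularity.

```python
from decimal import Decimal, ROUND_HALF_UP

# Rates in EUR per hour of audio, copied from the pricing table (excl. taxes).
# Keys are (model, multilingual) pairs.
RATES_EUR_PER_HOUR = {
    ("green-s", False): Decimal("0.52"),
    ("green-s-pro", False): Decimal("0.52"),
    ("green-s-pro", True): Decimal("0.60"),
}


def estimate_cost_eur(duration_seconds, model="green-s", multilingual=False):
    """Estimate the pre-recorded transcription cost in EUR.

    Assumes cost scales linearly with duration and rounds to whole cents;
    actual billing rules may differ.
    """
    rate = RATES_EUR_PER_HOUR[(model, multilingual)]
    hours = Decimal(str(duration_seconds)) / Decimal(3600)
    return (rate * hours).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# A 90-minute recording (5400 seconds):
print(estimate_cost_eur(5400))                                   # 0.78
print(estimate_cost_eur(5400, "green-s-pro", multilingual=True)) # 0.90
```

A combination that is not in the pricing table, such as green-s with multilingual=True, raises a KeyError instead of guessing a rate.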