GreenPT Docs

Models

Full GreenPT model catalog, GreenPT-branded models, direct provider chat models, embeddings, speech-to-text, and reranking.

Full GreenPT model catalog, including GreenPT-branded models, direct provider chat models, embeddings, speech-to-text, and reranking.

Catalog Notes

All models are served through the GreenPT API proxy with an OpenAI-compatible interface. Prices are in EUR per million tokens unless noted otherwise, and API pricing applies to API usage only.

Get Models

List available models and their capabilities.

GET /v1/models

Endpoint: https://api.greenpt.ai/v1/models

Example request

curl https://api.greenpt.ai/v1/models \
  -H "Authorization: Bearer sk-your_api_key"

Example response

{
  "object": "list",
  "data": [
    { "id": "green-r",       "object": "model", "created": 0, "owned_by": "greenpt"  },
    { "id": "green-r-raw",   "object": "model", "created": 0, "owned_by": "greenpt"  },
    { "id": "green-l",       "object": "model", "created": 0, "owned_by": "greenpt"  },
    { "id": "green-l-raw",   "object": "model", "created": 0, "owned_by": "greenpt"  },
    { "id": "mistral-small-3.2-24b-instruct-2506", "object": "model", "created": 0, "owned_by": "mistral" },
    { "id": "gpt-oss-120b",  "object": "model", "created": 0, "owned_by": "openai"   },
    { "id": "gemma-3-27b-it","object": "model", "created": 0, "owned_by": "google"   },
    { "id": "gemma4",        "object": "model", "created": 0, "owned_by": "greenpt"  }
  ]
}

GreenPT Models

GreenPT-branded chat models with tuned system prompts or raw variants for direct control. gemma4 is the recommended default for chat completions.

gemma4GreenPTRecommendedPreview
Backed by Google Gemma 4 31B

GreenPT-branded next-generation chat model with long-context multimodal reasoning. Recommended default for chat completions.

Input€0.50
Output€1.50
Billingper 1M tokens
Context256k
Max output32k
ChatMultimodal reasoningLong-contextDefault model
green-rGreenPT
Backed by GPT-OSS

GreenPT-branded reasoning model for advanced analysis, writing, and content generation.

Input€0.35
Output€0.95
Billingper 1M tokens
TextImagesDocumentsMultilingualAdvanced reasoningWriting & content generation
green-r-rawGreenPT
Backed by GPT-OSS

Direct access to the same reasoning stack as green-r without the GreenPT system prompt.

Input€0.35
Output€0.95
Billingper 1M tokens
TextImagesDocumentsMultilingualNo system prompt
green-lGreenPT
Backed by Mistral Small 3.2 24B

GreenPT-branded chat model tuned for multilingual writing, image understanding, and Dutch grammar guardrails.

Input€0.25
Output€0.80
Billingper 1M tokens
TextImagesDocumentsMultilingualWriting assistantDutch grammar guardrails
green-l-rawGreenPT
Backed by Mistral Small 3.2 24B

Direct access to the same GreenPT-backed model as green-l without the built-in system prompt.

Input€0.25
Output€0.80
Billingper 1M tokens
TextImagesDocumentsMultilingualNo system prompt

Raw vs Non-Raw

Non-raw models like green-r and green-l include GreenPT's Sustainability system prompt, optimized for our use cases. Raw models like green-r-raw and green-l-raw do not include a system prompt, so you can supply your own via the API.

Green Router Models

Direct access to foundation models without a GreenPT system prompt, served through Green Router.

qwen3.5-397b-a17bQwenRecommendedNew

State-of-the-art Qwen model optimized for code generation, agentic tasks, and logical reasoning.

Input€0.66
Output€3.96
Billingper 1M tokens
Context250k
Max output16k
ChatCode generationAgentic tasksLogical reasoningState-of-the-art (March 2026)
mistral-small-3.2-24b-instruct-2506Mistral

General-purpose model with balanced cost, function calling, and multimodal chat support.

Input€0.15
Output€0.35
Billingper 1M tokens
Context128k
Max output32k
ChatVisionFunction callingMultilingual
gpt-oss-120bOpenAI

Large open model with vision, function calling, and long-context reasoning support.

Input€0.15
Output€0.60
Billingper 1M tokens
Context128k
Max output32k
ChatVisionLong-context reasoningFunction calling
gemma-3-27b-itGoogle

Multimodal reasoning model from Google for chat and image-aware tasks.

Input€0.25
Output€0.50
Billingper 1M tokens
Context40k
Max output8k
ChatVisionMultimodal reasoning
qwen3-235b-a22b-instruct-2507Qwen

High-context multilingual reasoning model with a 250k token window.

Input€0.75
Output€2.25
Billingper 1M tokens
Context250k
Max output16k
ChatLong-contextReasoningMultilingual250k context window
qwen3-coder-30b-a3b-instructQwenCoding

Qwen coding model built for generation, completion, and debugging workflows.

Input€0.20
Output€0.80
Billingper 1M tokens
Context128k
Max output32k
ChatCode generationCompletionDebugging
devstral-2-123b-instruct-2512MistralCoding

Mistral coding model for multi-file reasoning and agentic software engineering tasks.

Input€0.60
Output€2.35
Billingper 1M tokens
Context200k
Max output16k
Code generationMulti-file reasoningAgentic coding tasks200k context
llama-3.3-70b-instructMeta

Meta instruction-following model with strong multilingual support for general chat use cases.

Input€0.90
Output€0.90
Billingper 1M tokens
Context100k
Max output16k
ChatInstruction followingMultilingual
llama-3.1-8b-instructMetaDeprecated

Fast, lightweight, and cost-efficient Meta model retained for compatibility.

Input€0.20
Output€0.20
Billingper 1M tokens
Context128k
Max output16k
ChatFast & lightweightCost-efficient inference
mistral-nemo-instruct-2407MistralDeprecated

Instruction-tuned multilingual Mistral model retained as a deprecated option.

Input€0.20
Output€0.20
Billingper 1M tokens
Context128k
Max output8k
ChatInstruction followingMultilingual
deepseek-r1-distill-llama-70bDeepSeekDeprecated

DeepSeek distilled reasoning model focused on math and code-heavy tasks.

Input€0.90
Output€0.90
Billingper 1M tokens
Context16k
Max output4k
Chain-of-thought reasoningMathCode
voxtral-small-24b-2507Mistral

Audio-aware chat model for transcription-adjacent and speech understanding use cases.

Input€0.15
Output€0.35
Billingper 1M tokens
Context32k
Max output16k
ChatAudio transcriptionSpeech understanding

Deprecated Chat Models

llama-3.1-8b-instruct, mistral-nemo-instruct-2407, and deepseek-r1-distill-llama-70b are deprecated as of January 2026 and remain available until EOL in April 2026.

Embedding Models

Dense vector embeddings for semantic search and retrieval workflows.

green-embeddingGreenPT
Backed by Qwen3-Embedding-4B

Dense multilingual embedding model for semantic search, retrieval, and RAG pipelines. Supports Matryoshka Representation Learning, so output dimensions are configurable from 32 up to 2560.

Input€0.20
OutputNo output charge
Billingper 1M tokens
Context32k
Dimensionsup to 2560
Dense embeddingsMultilingual (100+ languages)Semantic searchRAGMRL (custom dimensions)Instruction-aware

Speech-to-Text Models

Pre-recorded and live transcription models with optional multilingual support.

green-sGreenPT

Speech-to-text model for pre-recorded and live transcription.

Input€0.52/hr
Output€0.65/hr live
Billingaudio pricing
Pre-recorded transcriptionLive transcription
green-s-proGreenPTPro

Advanced speech-to-text model with multilingual transcription options.

Input€0.52/hr recorded
Output€0.78/hr live
Billingaudio pricing
Pre-recorded transcriptionLive transcriptionMultilingual input €0.60/hrMultilingual live €1.04/hr

Reranking Models

Models for document reranking and RAG result scoring.

green-rerankGreenPT
Backed by Qwen3-Reranker-4B

Cross-encoder reranking model that scores and reorders search or RAG results by relevance to a query.

Input€0.12
OutputNo output charge
Billingper 1M tokens
Context32k
Document rerankingRAG result scoringMultilingual (100+ languages)Instruction-aware

On this page