Carbon Calculations

Methodology for calculating the environmental impact of AI inference operations.

Alpha version notice: our calculations are currently based primarily on the technical specifications of our infrastructure and on assumptions about actual usage. We are looking for comparable cases to check our approach against. Are you measuring, calculating, or estimating the energy consumption and environmental footprint of your infrastructure? Let's have a chat!

Sustainability metrics overview

Our two-step process for measuring environmental impact.

Step 1: estimate energy

Estimate GPU power consumption.
Estimate CPU power consumption.
Factor in Power Usage Effectiveness (PUE).

Step 2: estimate environmental impact

Fetch the actual CO₂ intensity for the datacenter at the time of the request (via Nodera, see CO₂ intensity data).
Convert energy consumption to CO₂ equivalent.
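The conversion in Step 2 is a single multiplication. The sketch below is illustrative only: the function name and the example numbers are assumptions, not values from our production pipeline.

```python
def co2_grams(energy_kwh: float, intensity_g_per_kwh: float) -> float:
    """Convert energy (kWh) to CO2-equivalent (grams).

    intensity_g_per_kwh is the grid carbon intensity in gCO2e per kWh
    for the datacenter at the time of the request.
    """
    if energy_kwh < 0 or intensity_g_per_kwh < 0:
        raise ValueError("energy and intensity must be non-negative")
    return energy_kwh * intensity_g_per_kwh


# Hypothetical request: 0.002 kWh consumed while the grid is at 250 gCO2e/kWh.
example_g = co2_grams(0.002, 250.0)
print(f"{example_g:.3f} g CO2e")
```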

Practical implementation

Measure inference time.
Estimate GPU energy consumption.
Factor in CPU energy consumption to get "cluster energy consumption".

Energy estimation methodology

Based on measured inference runtime and hardware specifications.

Calculation factors

We multiply measured inference runtime by three key factors:

GPU power (kW): based on hardware specifications and load assumptions.
CPU overhead multiplier: includes GPU-supporting CPUs and cluster application CPUs.
Power Usage Effectiveness: datacenter efficiency factor.
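As a minimal sketch, the multiplication above could look like the following. The function and parameter names are hypothetical, and the overhead multiplier (1.5) and PUE (1.2) in the example are placeholders chosen for illustration, not our actual figures. The 0.3 kW GPU power corresponds to the ~300 W under-load estimate given for the H100 on this page.

```python
def request_energy_kwh(
    runtime_s: float,
    gpu_power_kw: float,
    cpu_overhead: float,
    pue: float,
) -> float:
    """Estimate energy for one inference request in kWh.

    runtime_s     measured inference runtime in seconds
    gpu_power_kw  assumed GPU power draw under load, in kW
    cpu_overhead  multiplier scaling GPU energy to cluster energy (>= 1)
    pue           datacenter Power Usage Effectiveness (>= 1)
    """
    if runtime_s < 0 or gpu_power_kw < 0:
        raise ValueError("runtime and GPU power must be non-negative")
    if cpu_overhead < 1 or pue < 1:
        raise ValueError("overhead multiplier and PUE must be at least 1")
    runtime_h = runtime_s / 3600.0  # seconds to hours
    return runtime_h * gpu_power_kw * cpu_overhead * pue


# Placeholder inputs: 2 s runtime, 0.3 kW GPU, overhead 1.5, PUE 1.2.
energy = request_energy_kwh(2.0, 0.3, 1.5, 1.2)
print(f"{energy:.6f} kWh")
```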

GPU power calculations

NVIDIA H100 PCIe specifications and power assumptions.

Hardware specifications

GPU: NVIDIA H100 PCIe.
TDP rating: 350 W.

Power consumption assumptions

TDP is a nominal design rating rather than a measure of actual draw. Our estimates of real-world consumption are:

State	Power	Description
Idle power	~100 W	Baseline consumption when not processing.
Under load	~300 W	Typical inference workload consumption.

Note: these estimates are based on expert analysis. The H100 can potentially draw more than its TDP rating under certain conditions.

CPU overhead calculations

Estimating total cluster energy consumption beyond GPU usage.

GPU instance CPUs

H100_1_80G instances are packaged with AMD Zen 4 CPUs. Power consumption estimates:

State	Power	Description
Idle	~40 W	Baseline CPU consumption.
Feeding GPU	~80 W	During inference operations.

Open question (vCPU allocation methodology): whether to attribute the power of shared CPUs by simple proportion (vCPUs / CPU cores).

Non-inference workload CPUs

POP2_4C_16G instances use AMD EPYC 7543 32-Core Processors (TDP: 225 W):

State	Power	% of TDP
Idle	~33.75 W	15%
Typical load	~67.5 W	30%
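The table values follow directly from the TDP percentages. The sketch below reproduces them and also illustrates the simple-proportion idea raised as an open question above. The helper names are hypothetical, the proportional attribution is not a settled method, and the 4-vCPU figure is an assumption read from the instance name.

```python
EPYC_7543_TDP_W = 225.0   # TDP from the text
EPYC_7543_CORES = 32      # core count from the processor name


def power_from_tdp(tdp_w: float, fraction: float) -> float:
    """Estimate power draw as a fraction of TDP."""
    return tdp_w * fraction


def proportional_share_w(cpu_power_w: float, vcpus: int, cores: int) -> float:
    """Attribute a share of CPU power by simple proportion (vCPUs / cores).

    This is only one candidate approach; the methodology is still open.
    """
    if cores <= 0 or vcpus < 0:
        raise ValueError("cores must be positive and vcpus non-negative")
    return cpu_power_w * vcpus / cores


idle_w = power_from_tdp(EPYC_7543_TDP_W, 0.15)   # about 33.75 W
load_w = power_from_tdp(EPYC_7543_TDP_W, 0.30)   # about 67.5 W

# Assumption: a POP2_4C_16G instance is allocated 4 vCPUs.
share_w = proportional_share_w(load_w, vcpus=4, cores=EPYC_7543_CORES)
print(f"idle={idle_w:.2f} W, load={load_w:.2f} W, 4-vCPU share={share_w:.2f} W")
```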

Overhead factor calculation

Methodology for scaling from GPU-only to total cluster energy consumption.

Usage assumptions

Typical load: 4 hours / day.
Idle time: 20 hours / day.

Calculation method

Step 1, calculate E_GPU: energy consumption estimation for GPU-only operations (excluding GPU-supporting CPUs).
Step 2, calculate E_Cluster: total energy consumption including GPUs and all CPU overhead.

Overhead factor formula

Overhead Factor = E_Cluster / E_GPU

This ratio lets us scale GPU energy consumption up to total cluster consumption.
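To make the formula concrete, here is a worked sketch over one day using the power estimates and usage assumptions from this page. It assumes a deliberately simplified cluster: one GPU, one GPU-supporting CPU, and one non-inference CPU, each attributed in full and all following the same 4 h load / 20 h idle schedule. The real cluster composition and CPU attribution may differ, so the resulting number is illustrative and is not our published overhead factor.

```python
LOAD_HOURS = 4.0    # typical load per day (usage assumption from the text)
IDLE_HOURS = 20.0   # idle time per day


def daily_energy_wh(load_w: float, idle_w: float) -> float:
    """Energy in Wh over one day for a component with given load/idle power."""
    return load_w * LOAD_HOURS + idle_w * IDLE_HOURS


# Step 1: GPU-only energy (H100 PCIe: ~300 W under load, ~100 W idle).
e_gpu = daily_energy_wh(300.0, 100.0)

# Step 2: add CPU overhead.
# Assumption: one GPU-supporting CPU (~80 W feeding GPU, ~40 W idle) and
# one non-inference CPU (~67.5 W typical load, ~33.75 W idle), both counted in full.
e_gpu_cpu = daily_energy_wh(80.0, 40.0)
e_app_cpu = daily_energy_wh(67.5, 33.75)
e_cluster = e_gpu + e_gpu_cpu + e_app_cpu

overhead_factor = e_cluster / e_gpu
print(f"E_GPU={e_gpu:.0f} Wh, E_Cluster={e_cluster:.0f} Wh, factor={overhead_factor:.3f}")
```

Under these assumptions E_GPU is 3200 Wh, E_Cluster is 5265 Wh, and the factor is roughly 1.645.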

Implementation

The overhead factor is applied to measured inference times to estimate the total environmental impact of each AI operation, including all supporting infrastructure.

CO₂ intensity data

We partner with Nodera to obtain 1-hour resolution CO₂ intensity data at the datacenter level. Our carbon calculations use the actual carbon intensity of the grid serving each datacenter at the time of the inference, not a regional average or annual estimate. See Nodera's carbon intelligence for more on their methodology.

These values feed directly into the CO₂ metrics shown in GreenPT and the GreenPT API, so every response you receive carries a figure based on what the grid actually looked like at that moment.

What this means in practice

The reported CO₂ cost of the same prompt can differ depending on when you send it. Grid carbon intensity changes throughout the day and between days as the share of renewable generation shifts. On a sunny afternoon, solar output pushes the carbon intensity of the grid down, so a request sent then will show a lower CO₂ figure than the same request sent late at night when there is little solar generation.

This is expected behaviour and reflects the real environmental impact of each inference, not a smoothed-out average.
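The effect can be sketched with a small hourly lookup. Everything here is hypothetical: the in-memory table stands in for the Nodera data feed (whose interface is not described on this page), the intensity values are made up, and the 0.0003 kWh request energy is a placeholder.

```python
from datetime import datetime, timezone

# Made-up hourly grid intensities in gCO2e/kWh, keyed by the start of each UTC hour.
HOURLY_INTENSITY = {
    datetime(2025, 6, 1, 2, tzinfo=timezone.utc): 420.0,   # night, little solar
    datetime(2025, 6, 1, 13, tzinfo=timezone.utc): 180.0,  # sunny afternoon
}


def intensity_at(ts: datetime) -> float:
    """Look up the intensity for the hour containing ts (ts must be timezone-aware)."""
    if ts.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    hour = ts.astimezone(timezone.utc).replace(minute=0, second=0, microsecond=0)
    return HOURLY_INTENSITY[hour]


def request_co2_g(energy_kwh: float, ts: datetime) -> float:
    """CO2e in grams for a request of the given energy at the given time."""
    return energy_kwh * intensity_at(ts)


ENERGY_KWH = 0.0003  # placeholder energy for one request

night_g = request_co2_g(ENERGY_KWH, datetime(2025, 6, 1, 2, 30, tzinfo=timezone.utc))
afternoon_g = request_co2_g(ENERGY_KWH, datetime(2025, 6, 1, 13, 15, tzinfo=timezone.utc))
print(f"night={night_g:.3f} g, afternoon={afternoon_g:.3f} g")
```

The same request energy yields a lower figure in the afternoon hour simply because the looked-up intensity is lower.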

Why 1-hour datacenter-level data matters

Using coarse averages can significantly over- or under-estimate the true carbon cost of an operation:

Temporal precision: grid carbon intensity fluctuates throughout the day as renewable generation varies. A request processed during peak solar hours may carry a fraction of the carbon cost of one processed at night.
Location precision: carbon intensity can vary considerably between regions and between datacenters in the same area, depending on the local grid mix and any on-site generation.
Accurate accounting: 1-hour granularity aligns with how electricity markets and grid operators report real marginal emissions, giving us numbers we can stand behind.
