GLM 5.2 Fast

A faster variant of GLM 5.2, with output throughput of 120-250 tokens per second (TPS).

Reasoning, Tool Use, Implicit Caching

index.ts

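import { streamText } from 'ai'

const result = streamText({
  model: 'zai/glm-5.2-fast',
  prompt: 'Why is the sky blue?'
})

// Consume the stream; print each text chunk as it arrives.
for await (const chunk of result.textStream) {
  process.stdout.write(chunk)
}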

Latency (last 24 hours)

P50 time to first token (TTFT) on live AI Gateway traffic, in milliseconds. View the docs for more info.
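
To make the metric concrete, here is a minimal client-side sketch of how TTFT and its P50 could be computed. It is not how AI Gateway measures latency, which this page does not describe. The helpers `timeToFirstChunk` and `p50` are hypothetical names introduced for illustration, and the first streamed text chunk is used as a stand-in for the first token.

```typescript
// Hypothetical helper: measures the time until the first chunk of any
// async text stream arrives, then drains the rest of the stream.
// `now` is injectable so the function can be exercised with a fake clock.
export async function timeToFirstChunk(
  stream: AsyncIterable<string>,
  now: () => number = () => performance.now()
): Promise<{ ttftMs: number | null; text: string }> {
  const start = now()
  let ttftMs: number | null = null
  let text = ''
  for await (const chunk of stream) {
    // Record the elapsed time only once, on the first chunk.
    if (ttftMs === null) ttftMs = now() - start
    text += chunk
  }
  // ttftMs stays null if the stream produced no chunks.
  return { ttftMs, text }
}

// Hypothetical helper: P50 is the median of the collected samples.
// For an even number of samples, the two middle values are averaged.
export function p50(samples: number[]): number {
  if (samples.length === 0) throw new Error('p50 requires at least one sample')
  const sorted = [...samples].sort((a, b) => a - b)
  const mid = Math.floor(sorted.length / 2)
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2
}
```

Assuming the `result` object from index.ts above, a single sample could be taken with `await timeToFirstChunk(result.textStream)`, and `p50` applied to the `ttftMs` values gathered over many requests. A client-side number also includes network time to the gateway, so it will not match the figure reported here exactly.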