GPT-OSS 120B

by OpenAI · Released 2025

OpenAI's open-source 120B-parameter model, available through Cloudflare Workers AI. It is a general-purpose model with solid performance across coding, reasoning, and general tasks at a competitive price point.

Powered by OpenAI · Transformer (open-source)

  • Context Window: 128K
  • Parameters: 120B
  • Max Output: 16K
  • Category: LLM Chat
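The context window and maximum output figures listed above can be combined into a rough prompt budget. The sketch below is only illustrative: it assumes "128K" and "16K" mean 128,000 and 16,000 tokens, and that prompt and completion share the same context window, neither of which is stated on this page. The function name `max_prompt_tokens` is hypothetical.

```python
# Assumption: "128K" and "16K" are read as round decimal token counts.
CONTEXT_WINDOW_TOKENS = 128_000
MAX_OUTPUT_TOKENS = 16_000


def max_prompt_tokens(requested_output_tokens: int) -> int:
    """Return how many prompt tokens remain after reserving room for the output.

    Assumes the prompt and the completion share one context window.
    """
    if not 0 < requested_output_tokens <= MAX_OUTPUT_TOKENS:
        raise ValueError(
            f"requested output must be between 1 and {MAX_OUTPUT_TOKENS} tokens"
        )
    return CONTEXT_WINDOW_TOKENS - requested_output_tokens


# Reserving the full 16K output leaves this much room for the prompt.
print(max_prompt_tokens(MAX_OUTPUT_TOKENS))
```

Counting tokens requires the model's tokenizer, so in practice the prompt length passed to a check like this would itself be an estimate.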

Overview

GPT-OSS 120B is OpenAI's first open-weight language model release since GPT-2, marking a significant shift in the company's strategy. At roughly 120 billion parameters (about 117 billion in total, of which about 5.1 billion are active per token), it is a mixture-of-experts (MoE) Transformer model that delivers strong general-purpose performance across coding, reasoning, and knowledge tasks. That makes it a compelling alternative to proprietary models for teams that need self-hosting capability or want to avoid vendor lock-in.

The model is available through Cloudflare Workers AI, enabling edge deployment with low-latency inference globally. Its 128K context window handles substantial documents and codebases, and its performance on standard benchmarks is competitive with many proprietary models at similar price points. The open-source nature means it can be fine-tuned, quantized, and deployed on custom infrastructure.

GPT-OSS 120B serves as a strong baseline for the open-source LLM ecosystem, and several derivative models (like NVIDIA's Nemotron 3 Super) have been built on top of it. For teams that need a capable, self-hostable model with the OpenAI training methodology, GPT-OSS 120B is the go-to choice.

Pricing

Metric               Price
Input /1M tokens     ₹100.0000
Output /1M tokens    ₹400.0000

1 credit = ₹1 = $0.01 USD. Prices shown are the provider's; CallMissed passes them through with a markup of about 35%.
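To make the per-million-token prices concrete, here is a minimal sketch of a per-request cost estimate. The prices and the credit conversion come from this page; the function name and the `markup` parameter are hypothetical. The page does not say whether the listed prices already include the ~35% markup, so the markup defaults to zero and is left for the caller to set.

```python
from decimal import Decimal

# Prices per 1M tokens, copied from the pricing table (in ₹, where ₹1 = 1 credit).
INPUT_PRICE_PER_MILLION = Decimal("100")
OUTPUT_PRICE_PER_MILLION = Decimal("400")
USD_PER_CREDIT = Decimal("0.01")  # from "1 credit = ₹1 = $0.01 USD"


def estimate_cost_credits(input_tokens: int, output_tokens: int,
                          markup: Decimal = Decimal("0")) -> Decimal:
    """Estimate the cost of one request in credits (equivalently, in ₹).

    `markup` is a hypothetical knob: pass Decimal("0.35") if the listed
    prices turn out to exclude the ~35% markup.
    """
    base = (input_tokens * INPUT_PRICE_PER_MILLION
            + output_tokens * OUTPUT_PRICE_PER_MILLION) / Decimal(1_000_000)
    return base * (1 + markup)


credits = estimate_cost_credits(input_tokens=2_000, output_tokens=500)
print(credits, credits * USD_PER_CREDIT)
```

At the listed prices, a request with 2,000 input tokens and 500 output tokens comes to 0.4 credits before any markup. `Decimal` is used instead of floats so that small per-request amounts add up without rounding drift.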

Key Highlights

  • Open-source 120B parameter model
  • Strong general-purpose performance
  • Available via Cloudflare Workers AI
  • Competitive pricing for its capability tier

Benchmarks

Benchmark             Score
MMLU-Pro              79.8%
HumanEval             87.3%
MATH-500              86.5%
GPQA Diamond          65.2%
SWE-bench Verified    55.1%

Technical Details

  • OpenAI's first open-weight language model since GPT-2: a mixture-of-experts Transformer (about 117B total parameters, about 5.1B active per token)
  • Available through Cloudflare Workers AI for edge deployment
  • Context window: 128K tokens
  • Open-source license allows fine-tuning and custom deployment
  • Base model for derivative works (e.g., NVIDIA Nemotron 3 Super)
  • Supports structured outputs and function calling
  • Can be quantized for deployment on smaller hardware
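The effect of quantization on hardware requirements can be approximated with simple arithmetic. The sketch below covers weights only, using the 120B parameter count from this page; it ignores the KV cache, activations, and runtime overhead, so real deployments need more memory than these figures. The function name is hypothetical.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for model weights only, in decimal GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


NUM_PARAMS = 120e9  # parameter count as listed on this page

# Compare common weight precisions: 16-bit, 8-bit, and 4-bit.
for bits in (16, 8, 4):
    print(f"{bits}-bit weights: ~{weight_memory_gb(NUM_PARAMS, bits):.0f} GB")
```

By this estimate, halving the bits per parameter halves the weight footprint: roughly 240 GB at 16 bits, 120 GB at 8 bits, and 60 GB at 4 bits.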

Strengths

  • Open-source — can be self-hosted, fine-tuned, and customized
  • Strong general-purpose performance from OpenAI's training methodology
  • Available on Cloudflare Workers AI for global edge deployment
  • Competitive pricing at $1.00 (input) / $4.00 (output) per 1M tokens

Limitations

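  • Self-hosting requires significant hardware: all of the roughly 117B parameters must be held in memory
  • Lower benchmark scores than proprietary GPT-5.4 variants
  • Mixture-of-experts routing reduces per-token compute, not the memory footprint, so hosting it needs as much memory as a dense model of the same size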
  • No native multimodal support — text only

Use Cases

  • General-purpose chat
  • Code assistance
  • Content generation
  • Knowledge Q&A

API Example

curl https://api.callmissed.com/v1/chat/completions \
  -H "Authorization: Bearer cm_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-oss-120b", "messages": [{"role": "user", "content": "Explain the difference between REST and GraphQL"}]}'

Endpoint: POST /v1/chat/completions · Model ID: gpt-oss-120b
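The same call can be sketched in Python using only the standard library. The URL, model ID, bearer-token header, and message format are taken from the example above; the `Content-Type: application/json` header is an assumption, and the helper names `build_chat_request` and `send` are hypothetical. The response schema is not described on this page, so the sketch returns the parsed JSON without interpreting it.

```python
import json
import urllib.request

API_URL = "https://api.callmissed.com/v1/chat/completions"


def build_chat_request(api_key: str, prompt: str,
                       model: str = "gpt-oss-120b") -> urllib.request.Request:
    """Build (but do not send) a POST request carrying one user message."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            # Assumption: the API expects a JSON content type.
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and return the decoded JSON body as-is."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


request = build_chat_request("cm_YOUR_KEY", "Explain the difference between REST and GraphQL")
print(request.get_method(), request.full_url)
```

Separating request construction from sending keeps the payload easy to inspect or log before any network call is made; calling `send(request)` with a real key performs the actual request.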
