GPT-OSS 120B
by OpenAI · Released 2025
OpenAI's open-source 120B-parameter model, a general-purpose LLM available through Cloudflare Workers AI. It delivers solid performance across coding, reasoning, and general tasks at a competitive price point.
Powered by OpenAI · Transformer (open-source)
| Spec | Value |
|---|---|
| Context Window | 128K |
| Parameters | 120B |
| Max Output | 16K |
| Category | LLM Chat |
Overview
GPT-OSS 120B is OpenAI's first open-weight language model release since GPT-2, marking a significant shift in the company's strategy. With roughly 120 billion total parameters, it is a mixture-of-experts (MoE) Transformer that activates only a fraction of those parameters per token, and it delivers strong general-purpose performance across coding, reasoning, and knowledge tasks. That makes it a compelling alternative to proprietary models for teams that need self-hosting capability or want to avoid vendor lock-in.
The model is available through Cloudflare Workers AI, enabling edge deployment with low-latency inference globally. Its 128K context window handles substantial documents and codebases, and its performance on standard benchmarks is competitive with many proprietary models at similar price points. The open-source nature means it can be fine-tuned, quantized, and deployed on custom infrastructure.
GPT-OSS 120B serves as a strong baseline for the open-source LLM ecosystem, and several derivative models (like NVIDIA's Nemotron 3 Super) have been built on top of it. For teams that need a capable, self-hostable model with the OpenAI training methodology, GPT-OSS 120B is the go-to choice.
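As a rough illustration of what the 128K context window and 16K max output mean in practice, the sketch below checks whether a prompt leaves room for a full-length response. It rests on assumptions not stated on this page: that "128K" and "16K" mean 128,000 and 16,000 tokens, that output tokens count against the context window, and that text averages about four characters per token. The helper names are hypothetical.

```python
CONTEXT_WINDOW = 128_000  # "128K" from the spec card; the exact token count may differ
MAX_OUTPUT = 16_000       # "16K" from the spec card


def approx_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Very rough token estimate; real counts depend on the model's tokenizer."""
    return int(len(text) / chars_per_token) + 1


def fits_in_context(prompt: str, reserved_output: int = MAX_OUTPUT) -> bool:
    """True if the estimated prompt tokens plus reserved output fit in the window."""
    return approx_tokens(prompt) + reserved_output <= CONTEXT_WINDOW


print(fits_in_context("word " * 50_000))   # 250,000 characters, roughly 62.5K tokens
print(fits_in_context("word " * 100_000))  # 500,000 characters, roughly 125K tokens
```

For real budgeting, replace the character heuristic with an actual tokenizer count.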
Pricing
| Metric | Price |
|---|---|
| Input (per 1M tokens) | ₹100.00 |
| Output (per 1M tokens) | ₹400.00 |
1 credit = ₹1 = $0.01 USD. Prices shown are provider prices; CallMissed passes them through with a ~35% markup.
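To make the pricing concrete, here is a minimal sketch that converts token counts into credits and USD using the listed rates and the stated credit conversion. The function name is hypothetical, and the sketch does not try to resolve whether the ~35% markup is already included in the listed prices.

```python
# Rates copied from the pricing table (credits per 1M tokens; 1 credit = ₹1 = $0.01).
INPUT_CREDITS_PER_M = 100.0
OUTPUT_CREDITS_PER_M = 400.0
USD_PER_CREDIT = 0.01


def estimate_cost(input_tokens: int, output_tokens: int) -> tuple[float, float]:
    """Return (credits, usd) for one request at the listed rates."""
    credits = (
        input_tokens * INPUT_CREDITS_PER_M + output_tokens * OUTPUT_CREDITS_PER_M
    ) / 1_000_000
    return credits, credits * USD_PER_CREDIT


credits, usd = estimate_cost(input_tokens=20_000, output_tokens=2_000)
print(f"{credits:.2f} credits (~${usd:.4f})")  # 2.80 credits (~$0.0280)
```

At these rates, a request with a 20K-token prompt and a 2K-token reply costs under 3 credits.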
Key Highlights
- Open-source 120B parameter model
- Strong general-purpose performance
- Available via Cloudflare Workers AI
- Competitive pricing for its capability tier
Benchmarks
| Benchmark | Score |
|---|---|
| MMLU-Pro | 79.8% |
| HumanEval | 87.3% |
| MATH-500 | 86.5% |
| GPQA Diamond | 65.2% |
| SWE-bench Verified | 55.1% |
Technical Details
- OpenAI's first open-weight release since GPT-2: a 120B-parameter mixture-of-experts (MoE) Transformer
- Available through Cloudflare Workers AI for edge deployment
- Context window: 128K tokens
- Open-source license (Apache 2.0) allows fine-tuning and custom deployment
- Base model for derivative works (e.g., NVIDIA Nemotron 3 Super)
- Supports structured outputs and function calling
- Can be quantized for deployment on smaller hardware
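The function-calling support listed above might be exercised with a request body like the one sketched here. This assumes the `/v1/chat/completions` endpoint accepts an OpenAI-style `tools` array, which this page does not confirm; the `get_weather` tool and the `build_tool_request` helper are hypothetical and exist only for illustration.

```python
import json


def build_tool_request(user_message: str) -> dict:
    """Build a chat request that offers the model one hypothetical tool."""
    return {
        "model": "gpt-oss-120b",
        "messages": [{"role": "user", "content": user_message}],
        # Assumed OpenAI-style tool schema; not confirmed by this page.
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "get_weather",  # hypothetical tool name
                    "description": "Look up the current weather for a city.",
                    "parameters": {
                        "type": "object",
                        "properties": {"city": {"type": "string"}},
                        "required": ["city"],
                    },
                },
            }
        ],
    }


print(json.dumps(build_tool_request("What's the weather in Pune?"), indent=2))
```

The printed JSON can be sent as the request body in place of the plain chat payload.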
Strengths
- Open-source — can be self-hosted, fine-tuned, and customized
- Strong general-purpose performance from OpenAI's training methodology
- Available on Cloudflare Workers AI for global edge deployment
- Competitive pricing at $1.00 input / $4.00 output per 1M tokens
Limitations
- Self-hosting still requires significant compute: the MoE design activates only part of the weights per token, but all ~120B parameters must be held in memory
- Lower benchmark scores than proprietary GPT-5.4 variants
- No native multimodal support — text only
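The self-hosting cost can be approximated with simple arithmetic on the parameter count. The sketch below estimates the weights-only memory footprint at a few precisions, assuming the 120B figure from the spec card and ignoring KV cache, activations, and runtime overhead, so real requirements will be higher. The function name is hypothetical.

```python
PARAMS = 120e9  # "120B" from the spec card


def weight_memory_gb(params: float, bits_per_param: int) -> float:
    """Weights-only footprint in GB (1 GB = 1e9 bytes); excludes KV cache and activations."""
    return params * bits_per_param / 8 / 1e9


for bits in (16, 8, 4):
    print(f"{bits}-bit: ~{weight_memory_gb(PARAMS, bits):.0f} GB")
# 16-bit: ~240 GB
# 8-bit: ~120 GB
# 4-bit: ~60 GB
```

This is why quantization matters for deployment on smaller hardware: dropping from 16-bit to 4-bit weights cuts the footprint to a quarter.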
Use Cases
API Example
curl https://api.callmissed.com/v1/chat/completions \
  -H "Authorization: Bearer cm_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-oss-120b", "messages": [{"role": "user", "content": "Explain the difference between REST and GraphQL"}]}'
Endpoint: POST /v1/chat/completions · Model ID: gpt-oss-120b
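For teams calling the API from code rather than the shell, the same request could look roughly like this in Python, using only the standard library. The URL, model ID, and key placeholder come from the curl example; the helper names are hypothetical, and the response format is not documented on this page, so `send` simply returns the decoded JSON.

```python
import json
import urllib.request

API_URL = "https://api.callmissed.com/v1/chat/completions"


def build_request(api_key: str, prompt: str) -> urllib.request.Request:
    """Build the POST request that mirrors the curl command."""
    body = json.dumps(
        {"model": "gpt-oss-120b", "messages": [{"role": "user", "content": prompt}]}
    ).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response (performs a network call)."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


req = build_request("cm_YOUR_KEY", "Explain the difference between REST and GraphQL")
print(req.get_method(), req.full_url)
# result = send(req)  # uncomment once cm_YOUR_KEY is replaced with a real key
```

Building the request separately from sending it keeps the payload easy to inspect before any credits are spent.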