What is Mistral Small 3.1?

Mistral Small 3.1 is a 24B-parameter open-source model with a 128K-token context window, vision understanding, and function calling. It outperforms GPT-4o Mini and Gemma 3 on most benchmarks while generating 150 tokens/sec, and it is available on the CallMissed free tier.

How much does Mistral Small 3.1 cost?

Mistral Small 3.1 costs $0.35/1M tokens for input and $0.56/1M tokens for output on CallMissed. 1 credit = ₹1 = $0.01 USD.

How do I use Mistral Small 3.1 via API?

Send a POST request to /v1/chat/completions with the model field set to "mistral-small-3.1" and your API key in the Authorization header. CallMissed uses the OpenAI-compatible format, so an existing OpenAI-style client only needs a different base URL and model name.

What is the context window of Mistral Small 3.1?

Mistral Small 3.1 supports a 128K token context window with up to 8K output tokens.
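
The two limits above can be turned into a quick pre-flight check before a request is sent. The following is only a sketch: it assumes "128K" means 128,000 tokens and "8K" means 8,000 tokens, and that prompt and output tokens share the same window. The function name `output_budget` is made up for illustration.

```python
CONTEXT_WINDOW = 128_000  # tokens; assumed reading of "128K"
MAX_OUTPUT = 8_000        # tokens; assumed reading of "8K"


def output_budget(prompt_tokens: int, requested_output: int) -> int:
    """Return the largest output length that fits the stated limits.

    Assumes prompt and output tokens count against the same context window.
    """
    if prompt_tokens < 0 or requested_output < 0:
        raise ValueError("token counts must be non-negative")
    if prompt_tokens >= CONTEXT_WINDOW:
        raise ValueError("prompt alone fills the context window")
    remaining = CONTEXT_WINDOW - prompt_tokens
    # The output is capped by the request, the output limit, and the space left.
    return min(requested_output, MAX_OUTPUT, remaining)


if __name__ == "__main__":
    print(output_budget(prompt_tokens=125_000, requested_output=8_000))
```

With a 125,000-token prompt only 3,000 tokens of window remain, so the request is trimmed to that even though the output limit itself is higher.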

LLM Chat · free-tier · open-source · vision

Mistral Small 3.1

By Mistral AI · Released March 2025

A 24B-parameter open-source model with 128K context, vision, and function calling. Ahead of GPT-4o Mini and Gemma 3, at 150 tokens/sec. Free on CallMissed.

Powered by Mistral AI · Dense Transformer (24B)

Context window: 128K
Parameters: 24B (dense)
Max output: 8K
Category: LLM Chat

Overview

Mistral Small 3.1 (2503) builds upon Mistral Small 3 by adding state-of-the-art vision understanding and enhancing long context capabilities up to 128K tokens without compromising text performance. With 24 billion parameters, this model achieves top-tier capabilities in both text and vision tasks while remaining efficient enough to run on a single GPU.

The model outperforms comparable models like Gemma 3 and GPT-4o Mini across a range of benchmarks, while delivering inference speeds of 150 tokens per second. It supports function calling, structured outputs, and JSON mode — making it suitable for agentic workflows and tool-use scenarios.
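
As an illustration of the function-calling support mentioned above, here is a minimal sketch of a request body that offers the model one tool. The `tools` layout follows the OpenAI chat-completions format. That the CallMissed gateway accepts it unchanged for this model is an assumption based on the page's compatibility claim, and the `get_weather` tool is hypothetical.

```python
import json


def build_tool_request(question: str) -> dict:
    """Build a chat-completions body that offers the model one tool."""
    weather_tool = {
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical tool, for illustration only
            "description": "Look up the current weather for a city.",
            "parameters": {  # JSON Schema describing the tool's arguments
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
    return {
        "model": "mistral-small-3.1",
        "messages": [{"role": "user", "content": question}],
        "tools": [weather_tool],
    }


body = build_tool_request("What is the weather in Pune?")

if __name__ == "__main__":
    print(json.dumps(body, indent=2))
```

The body only declares the tool. Running it, and returning the result to the model in a follow-up message, remains the caller's job.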

Mistral Small 3.1 is released under the Apache 2.0 license, which permits commercial use, modification, and redistribution as long as the license's notice and attribution terms are kept. On CallMissed it is served through the CallMissed gateway and is eligible for the free tier, so it costs nothing beyond the credits consumed.

Key improvements over Mistral Small 3 include multimodal vision understanding (the model can process images alongside text), extended context from 32K to 128K tokens, and improved performance on long-document comprehension tasks. The model is optimized for efficient local inference, supporting use cases such as conversational agents, function calling, long-document comprehension, and privacy-sensitive deployments.
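
To make "images alongside text" concrete, the sketch below builds a single user message that carries both a prompt and an image. It uses the OpenAI-style `image_url` content part with a base64 data URL. Whether the CallMissed gateway accepts this exact shape is an assumption, and `build_vision_message` is an illustrative helper, not part of any SDK.

```python
import base64


def build_vision_message(prompt: str, image_bytes: bytes,
                         mime: str = "image/png") -> dict:
    """Return one user message containing a text part and an image part."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    data_url = f"data:{mime};base64,{encoded}"
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": data_url}},
        ],
    }


# Placeholder bytes stand in for a real image file read from disk.
message = build_vision_message("Describe this image.", b"placeholder image bytes")
```

The resulting dictionary would go in the `messages` list of a chat-completions request, in place of a plain-string message.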

Pricing

Metric	Price
Input /1M tokens	₹35.0000
Output /1M tokens	₹56.0000

1 credit = ₹1 = $0.01 USD. Prices shown are from the provider; CallMissed passes them through with a ~35% markup.
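
The per-request cost follows directly from the two rates above. The sketch below assumes the listed prices are the amounts actually billed and that input and output tokens are metered separately. The function name is illustrative.

```python
INPUT_CREDITS_PER_MILLION = 35.0   # listed input price, in credits (₹)
OUTPUT_CREDITS_PER_MILLION = 56.0  # listed output price, in credits (₹)
USD_PER_CREDIT = 0.01              # from "1 credit = ₹1 = $0.01 USD"


def request_cost(input_tokens: int, output_tokens: int) -> tuple[float, float]:
    """Return the cost of one request as (credits, usd)."""
    credits = (
        input_tokens * INPUT_CREDITS_PER_MILLION
        + output_tokens * OUTPUT_CREDITS_PER_MILLION
    ) / 1_000_000
    return credits, credits * USD_PER_CREDIT


if __name__ == "__main__":
    credits, usd = request_cost(input_tokens=10_000, output_tokens=2_000)
    print(f"{credits:.3f} credits, about ${usd:.4f}")
```

A request with 10,000 input tokens and 2,000 output tokens works out to 0.35 + 0.112 = 0.462 credits, a little under half a US cent.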

Highlights

Free on CallMissed (free tier)
Ahead of GPT-4o Mini and Gemma 3 on most benchmarks
128K context for long documents
Vision: images alongside text
Apache 2.0 open-source license
150 tokens/sec inference

Benchmarks

Benchmark	Score	Notes
MMLU	81.0%	General knowledge
HumanEval	84.8%	Code generation
MATH	69.3%	Mathematics
GPQA	40.7%	Graduate-level science
IFEval	77.8%	Instruction following
Output Speed	150 t/s	Inference throughput
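
The output-speed figure gives a rough way to estimate how long a response takes to generate. This is a back-of-the-envelope sketch only: it assumes a steady 150 tokens/sec and ignores network latency, queueing, and the time spent processing the prompt.

```python
OUTPUT_TOKENS_PER_SEC = 150.0  # throughput figure from the benchmark table


def generation_seconds(output_tokens: int,
                       tokens_per_sec: float = OUTPUT_TOKENS_PER_SEC) -> float:
    """Estimate generation time for a response of the given length."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    if tokens_per_sec <= 0:
        raise ValueError("tokens_per_sec must be positive")
    return output_tokens / tokens_per_sec


if __name__ == "__main__":
    print(f"{generation_seconds(600):.1f} s")
```

At the stated rate a 600-token answer takes about 4 seconds of generation time. Real latency will vary with load and prompt length.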

Technical details

Architecture: Dense Transformer, 24B parameters
Context: 128,000 tokens (up from 32K in Mistral Small 3)
Vision: multimodal, text and images
Function calling and structured outputs
License: Apache 2.0 (fully open-source, commercial use permitted)
Hosted on the CallMissed gateway, free-tier eligible
Optimized for single-GPU deployment
Knowledge cutoff: early 2025

Strengths

Free on CallMissed, no paid plan needed
Open-source (Apache 2.0), can be self-hosted
Strong for its size, ahead of GPT-4o Mini
Vision + text multimodal
128K context for long documents
Fast inference at 150 tokens/sec

Limitations

Smaller than frontier models, so less capable on the hardest reasoning tasks
Vision is new and less battle-tested than dedicated vision models
No extended thinking / chain-of-thought mode

Use cases

Conversational agents
Function calling and tool use
Long-document comprehension
Image understanding
Privacy-sensitive deployments

API Example

curl https://api.callmissed.com/v1/chat/completions \
  -H "Authorization: Bearer cm_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mistral-small-3.1",
    "messages": [{"role": "user", "content": "Explain the difference between async and sync programming in Python"}],
    "temperature": 0.7
  }'

Endpoint: POST /v1/chat/completions · Model ID: mistral-small-3.1
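
For callers who prefer Python to curl, the same request can be sketched with the standard library alone. The URL, headers, model ID, and body mirror the curl example. The helper names `build_request` and `send` are illustrative, and the shape of the JSON response is not documented on this page, so `send` returns it undecoded beyond parsing.

```python
import json
import urllib.request

API_URL = "https://api.callmissed.com/v1/chat/completions"


def build_request(api_key: str, prompt: str) -> urllib.request.Request:
    """Build the same POST request as the curl example, without sending it."""
    body = {
        "model": "mistral-small-3.1",
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and parse the JSON response (needs a real API key)."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


request = build_request(
    "cm_YOUR_KEY",
    "Explain the difference between async and sync programming in Python",
)
```

Calling `send(request)` with a valid key performs the call. If the gateway follows the OpenAI response format, as its compatibility claim suggests, the reply text would be under `choices[0]["message"]["content"]`, but that is an assumption to verify against the docs.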

Try Mistral Small 3.1 now

Get 1000 free API credits on signup. No credit card required.
