
Together AI — User Guide

Cheap, fast cloud inference.

Freemium
Strengths
  • Extremely low price, more than 10 times cheaper than OpenAI
  • Supports mainstream open-source models such as Llama 3, Qwen, Mistral, and DeepSeek
  • Fast inference and low latency
  • OpenAI-compatible API, so migration cost is very low
  • Free $1 credit, enough for testing
Best for
  • Reducing API call costs for AI applications
  • Using open-source models as an alternative to OpenAI
  • Backend inference for highly concurrent AI applications
  • Testing and comparing different open-source models
  • Building cost-sensitive AI products

Quick start

Together AI's API is compatible with the OpenAI API, so existing code needs little modification.

Scenario

Calling Together AI through the OpenAI-compatible API

Prompt example
from openai import OpenAI

# Just modify base_url and api_key
client = OpenAI(
    base_url="https://api.together.xyz/v1",
    api_key="your-together-api-key"
)

response = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo",
    messages=[
        {"role": "user", "content": "Explain what RAG technology is"}
    ],
    max_tokens=1000
)
print(response.choices[0].message.content)
Output / what to expect
This calls the Llama 3.1 70B model and prints its reply. The price is about one tenth of OpenAI GPT-4o, and the answer quality is close to GPT-4 level.
Tips

To migrate existing OpenAI code to Together AI, change the base_url and api_key values and set model to a model name that Together AI serves.
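
One way to keep that migration reversible is to collect the per-provider values in a single table and choose between them at runtime. The following is a minimal sketch, not an official pattern: the PROVIDERS table, the client_settings and make_client helpers, and the environment variable names are assumptions made for illustration. Only the Together AI base URL and model name come from the example above.

```python
import os

# Hypothetical provider table. The Together AI base URL and model name are
# taken from the quick start example; everything else is an assumption.
PROVIDERS = {
    "together": {
        "base_url": "https://api.together.xyz/v1",
        "key_env": "TOGETHER_API_KEY",  # assumed environment variable name
        "model": "meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo",
    },
    "openai": {
        "base_url": None,  # no override: the SDK falls back to its default endpoint
        "key_env": "OPENAI_API_KEY",
        "model": "gpt-4o",
    },
}


def client_settings(provider):
    """Return (constructor kwargs, model name) for the chosen provider."""
    cfg = PROVIDERS[provider]
    # Raises KeyError if the key is not set, rather than sending an empty key.
    kwargs = {"api_key": os.environ[cfg["key_env"]]}
    if cfg["base_url"] is not None:
        kwargs["base_url"] = cfg["base_url"]
    return kwargs, cfg["model"]


def make_client(provider):
    """Build an OpenAI SDK client pointed at the chosen provider."""
    from openai import OpenAI  # imported here so the helpers above need no SDK

    kwargs, model = client_settings(provider)
    return OpenAI(**kwargs), model
```

With this in place, calling make_client("together") or make_client("openai") returns a client and a matching model name, and the rest of the application code stays the same.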

Scenario

Choose the right model

Prompt example
Together AI main models and prices (2025):

High-quality models:
- Llama-3.1-70B-Instruct: $0.88/million tokens
- Qwen2.5-72B-Instruct: $1.20/million tokens
- DeepSeek-V3: $0.27/million tokens

Fast, lightweight models:
- Llama-3.2-3B-Instruct: $0.06/million tokens
- Llama-3.1-8B-Instruct: $0.18/million tokens

OpenAI, for comparison:
- GPT-4o: $2.50/million input tokens
- GPT-4o-mini: $0.15/million input tokens
Output / what to expect
DeepSeek-V3 is the most cost-effective option. Its quality is close to GPT-4o at roughly one tenth of the price, which makes it suitable for production applications with a large number of calls.
Tips

For cost-sensitive applications, DeepSeek-V3 is one of the most cost-effective options available.
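
To make the price comparison concrete, the sketch below turns the listed prices into an estimated monthly bill. It is illustrative only: the monthly_cost helper and the 500-million-token workload are assumptions, and the OpenAI figures cover input tokens only, so they leave out output-token pricing.

```python
# Prices in USD per million tokens, copied from the price list above.
PRICE_PER_MILLION = {
    "Llama-3.1-70B-Instruct": 0.88,
    "Qwen2.5-72B-Instruct": 1.20,
    "DeepSeek-V3": 0.27,
    "Llama-3.2-3B-Instruct": 0.06,
    "Llama-3.1-8B-Instruct": 0.18,
    "GPT-4o (input only)": 2.50,
    "GPT-4o-mini (input only)": 0.15,
}


def monthly_cost(model, tokens_per_month):
    """Estimated cost in USD for a given monthly token volume."""
    return PRICE_PER_MILLION[model] * tokens_per_month / 1_000_000


if __name__ == "__main__":
    tokens = 500_000_000  # hypothetical workload: 500 million tokens per month
    # Print the cheapest model first.
    for model in sorted(PRICE_PER_MILLION, key=PRICE_PER_MILLION.get):
        print(f"{model:26s} ${monthly_cost(model, tokens):>10,.2f}")
```

Swapping in a measured token volume gives a quick first estimate before committing to a model.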
