Kimi K2.6 API

Kimi K2.6 API.

$0.70/M input, $3.50/M output, $0.20/M cached input (INT4). OpenAI-compatible, no contracts, no minimums.

Get Started

Point your OpenAI SDK at api.getlilac.com/v1 and request moonshotai/kimi-k2.6.

Model pricing

Pay per token. No commitments.

Shared warm endpoints with cache-read pricing for repeated long-context traffic.

Model

Context

Input

Cache

Output

MiniMax M2.7

Live now

FP8

200K

$0.30/M

$0.055/M

$1.20/M

Kimi K2.6

Live now

INT4

262K

$0.70/M

$0.20/M

$3.50/M

GLM 5.1

Live now

FP8

203K

$0.90/M

$0.27/M

$3.00/M

Gemma 4 (31B)

Live now

BF16

262K

$0.11/M

$0.35/M

OpenAI-compatibleShared warm endpointsNo contractsNo minimums

More models are coming soon and will be added as they go live.

Integration

One base URL change.

Keep the OpenAI SDK and point it at Lilac. Your existing code just works.

inference.py

from openai import OpenAI

client = OpenAI(

base_url="https://api.openai.com/v1",

api_key="sk_...",

)

response = client.chat.completions.create(

model="openai/gpt-5.4",

messages=[{"role": "user", "content": "Hello!"}],

)

# Same code. Same SDK. Fraction of the price.

Standard OpenAI client — just change the base URL.

Cache-read pricing for repeated prompt context.

Long-context model for coding, tools, and agent workflows.

Read the benchmark snapshot

Frequently asked questions

How do I call the API?

Set base_url to https://api.getlilac.com/v1 in the OpenAI SDK, model name moonshotai/kimi-k2.6.

How much does it cost?

$0.70/M input, $3.50/M output, and $0.20/M cached input on the shared endpoint.

Is my data stored or used for training?

No. Lilac is ZDR (Zero Data Retention) compliant — prompt and completion data is never stored or used for model training.

Start running inference in minutes.

No contracts, no commitments. Swap your base URL and pay less for the same output quality.

Get Started

No commitment required.

Cheap inference API Official Kimi K2.6 page Direct endpoints vs. aggregators