Gemma 4 API
$0.11/M input, $0.35/M output (BF16). OpenAI-compatible, no contracts, no minimums.
Point your OpenAI SDK at api.getlilac.com/v1 and request google/gemma-4-31b-it.
Model pricing
Pay per token. No commitments.
The lowest-priced model on Lilac. Ideal for high-volume workloads that need capable open-weight inference.
25% off all tokens above 1B/month for 3 months. That works out to $0.0825/M input and $0.2625/M output above the threshold.
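The tiered math above can be sketched in a few lines. This is an illustrative helper, not part of the Lilac API, and it assumes the 1B/month threshold is applied per token type (input and output counted separately), which the pricing copy does not specify:

```python
# Illustrative monthly-cost sketch for the volume discount.
# Assumption (not stated above): the 1B/month threshold applies
# per token type. `tiered_cost` is a hypothetical helper.

DISCOUNT = 0.25       # 25% off tokens above the threshold
THRESHOLD_M = 1_000   # 1B tokens, expressed in millions

def tiered_cost(tokens_m: float, rate_per_m: float) -> float:
    """Dollar cost for `tokens_m` million tokens at `rate_per_m` $/M."""
    base = min(tokens_m, THRESHOLD_M) * rate_per_m
    extra = max(tokens_m - THRESHOLD_M, 0) * rate_per_m * (1 - DISCOUNT)
    return base + extra

# 2B input tokens at $0.11/M: first 1B at full price ($110),
# the next 1B at the discounted $0.0825/M ($82.50), so about $192.50.
print(tiered_cost(2_000, 0.11))
```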
More models are coming soon and will be added as they go live.
Integration
One base URL change.
Keep the OpenAI SDK and point it at Lilac. Your existing code just works.
from openai import OpenAI
client = OpenAI(
base_url="https://api.getlilac.com/v1",
api_key="sk_...",
)
response = client.chat.completions.create(
model="google/gemma-4-31b-it",
messages=[{"role": "user", "content": "Hello!"}],
)
# Same code. Same SDK. Fraction of the price.
Standard OpenAI client — just change the base URL.
Cheapest model we host at $0.11/M input tokens.
Shared warm endpoints, no cold starts.
Frequently asked questions
How do I call the API?
Set base_url to https://api.getlilac.com/v1 in the OpenAI SDK and pass model="google/gemma-4-31b-it".
How much does it cost?
$0.11/M input, $0.35/M output on the shared endpoint.
Is my data stored or used for training?
No. Lilac is ZDR (Zero Data Retention) compliant — prompt and completion data is never stored or used for model training.
Start running inference in minutes.
No contracts, no commitments. Swap your base URL and pay less for the same output quality.