Blog

Updates, thinking, and technical deep-dives from the Lilac team.

May 7, 2026

We're partnering with MiniMax to bring M2.7 to Lilac

We are partnering with MiniMax to offer commercially licensed access to MiniMax M2.7 on Lilac.

May 7, 2026

How to keep frontier open weights viable

Why Lilac supports open-weight licensing, and why commercial rights can help more frontier models stay open.

May 1, 2026

Kimi K2.6 is live on Lilac

Kimi K2.6 is now available on Lilac with OpenAI-compatible chat completions, 262K context, cache-read pricing, and no commitments.

April 27, 2026

Cache read pricing is now live on Lilac

Supported Lilac models now show lower cache read rates for repeated context, making long-context and agent workloads cheaper to run.

April 8, 2026

Lilac is now self-serve — plus GLM 5.1 and Gemma 4 are live

No more waitlist. Sign up, grab an API key, and start running inference. GLM 5.1 is live at $0.90/M input, and Gemma 4 is live at $0.11/M input.

April 8, 2026

GLM 5.1 Inference Benchmark

We benchmarked our GLM 5.1 endpoint against every GLM 5.1 provider listed on OpenRouter. Our endpoint delivered competitive throughput at the lowest per-token price in the comparison.

March 25, 2026

How Idle GPUs Make Cheap Inference Possible

Lilac serves Kimi K2.6 inference on idle enterprise GPUs with OpenAI-compatible, pay-per-token shared endpoints.

March 23, 2026

GPU Inference API Pricing Compared

A direct comparison of GPU inference API pricing across major providers, and a look at how idle GPU economics enable Lilac to offer lower per-token rates.

March 16, 2026

The GPU Scarcity Paradox

The GPU shortage isn't what you think. The industry doesn't have a supply problem — it has a utilization problem masquerading as one.

March 1, 2025

Introducing Lilac: Turn Idle GPU Capacity into Revenue

Most Kubernetes clusters run GPUs at 30-50% utilization. We built a single operator to change that.