Lilac is now self-serve — plus GLM 5.1 is live and Gemma 4 is on the way
No more waitlist. Sign up, grab an API key, and start running inference. GLM 5.1 is live at $0.90/M input, and Gemma 4 is coming soon at $0.13/M input.
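If Lilac follows the OpenAI-compatible API convention common among inference providers, a first call might look like the sketch below. This is a minimal sketch under that assumption: the base URL and model identifier are illustrative placeholders, not confirmed values, so check your dashboard for the real ones.

```python
# Minimal sketch, assuming an OpenAI-compatible chat completions endpoint.
# The base URL and model ID below are hypothetical placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.lilac.example/v1",  # hypothetical endpoint
    api_key="YOUR_LILAC_API_KEY",             # from your Lilac dashboard
)

response = client.chat.completions.create(
    model="glm-5.1",  # hypothetical model identifier
    messages=[{"role": "user", "content": "Say hello in five words."}],
)
print(response.choices[0].message.content)
```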
Updates, thinking, and technical deep-dives from the Lilac team.
We benchmarked our GLM 5.1 endpoint against every GLM 5.1 provider listed on OpenRouter: competitive throughput at the lowest per-token price in the comparison.
Lilac serves Kimi K2.5 inference on idle enterprise GPUs. In our latest OpenRouter benchmark snapshot, we were the lowest-priced provider among those with comparable speed.
A direct comparison of GPU inference API pricing across major providers. How idle GPU economics enable Lilac to offer lower per-token rates.
The GPU shortage isn't what you think. The industry doesn't have a supply problem — it has a utilization problem masquerading as one.
Most Kubernetes clusters run GPUs at 30-50% utilization. We built a single operator to change that.