GLM 5.1 API


    $0.90/M input, $3.00/M output. OpenAI-compatible, no contracts, no minimums.

    Get Started

    Point your OpenAI SDK at api.getlilac.com/v1 and request z-ai/glm-5.1.

    Model pricing

    Pay per token. No commitments.

    0.58s TTFT on shared warm endpoints. Competitive with aggregator listings at a lower per-token price.

    Model       Status        Input     Output    Latency
    ---------   -----------   -------   -------   ----------
    Kimi K2.5   Live now      $0.40/M   $2.00/M   0.38s TTFT
    GLM 5.1     Live now      $0.90/M   $3.00/M   0.58s TTFT
    Gemma 4     Coming soon   $0.13/M   $0.38/M   n/a

    OpenAI-compatible · Shared warm endpoints · No contracts · No minimums

    25% off all tokens above 1B/month for 3 months. That is $0.675/M input and $2.25/M output above the threshold.
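    Under these rates, a month's bill can be sketched in a few lines. This is illustrative only, and assumes the 1B-token threshold applies separately to input and output tokens:

```python
# Sketch of tiered monthly billing for GLM 5.1 on Lilac.
# Rates come from the pricing table above; the 1B-token threshold and
# 25% discount come from the promotional terms. Illustrative only.

INPUT_RATE = 0.90 / 1_000_000   # $ per input token
OUTPUT_RATE = 3.00 / 1_000_000  # $ per output token
THRESHOLD = 1_000_000_000       # tokens/month before the discount applies
DISCOUNT = 0.25                 # 25% off tokens above the threshold

def monthly_cost(tokens: int, rate: float) -> float:
    """Monthly cost for one token class (input or output)."""
    base = min(tokens, THRESHOLD) * rate
    extra = max(tokens - THRESHOLD, 0) * rate * (1 - DISCOUNT)
    return base + extra

# Example: 1.5B input tokens and 300M output tokens in a month.
cost = monthly_cost(1_500_000_000, INPUT_RATE) + monthly_cost(300_000_000, OUTPUT_RATE)
print(f"${cost:,.2f}")  # → $2,137.50
```

    In this example only the 500M input tokens above the threshold earn the discount; the output tokens stay below 1B and are billed at the full rate.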

    More models are coming soon and will be added as they go live.

    Integration

    One base URL change.

    Keep the OpenAI SDK and point it at Lilac. Your existing code just works.

    inference.py

    from openai import OpenAI

    client = OpenAI(
        base_url="https://api.getlilac.com/v1",
        api_key="sk_...",
    )

    response = client.chat.completions.create(
        model="z-ai/glm-5.1",
        messages=[{"role": "user", "content": "Hello!"}],
    )

    # Same code. Same SDK. Fraction of the price.

    01

    Standard OpenAI client — just change the base URL.

    02

    Visible pricing. No aggregator markup.

    03

    Shared warm endpoints, no cold starts.
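    If you want to verify the latency numbers yourself, a minimal sketch using the OpenAI SDK's streaming mode can time the first token. The api_key value is a placeholder; no call is made until you supply real credentials:

```python
# Sketch: measure time-to-first-token (TTFT) against Lilac by streaming
# a chat completion and timing the first content delta.
import time
from openai import OpenAI

def measure_ttft(client: OpenAI, model: str = "z-ai/glm-5.1") -> float:
    """Return seconds from request start until the first content token arrives."""
    start = time.monotonic()
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": "Hello!"}],
        stream=True,
    )
    for chunk in stream:
        if chunk.choices and chunk.choices[0].delta.content:
            return time.monotonic() - start
    return time.monotonic() - start

# Usage (requires a real key):
# client = OpenAI(base_url="https://api.getlilac.com/v1", api_key="sk_...")
# print(f"TTFT: {measure_ttft(client):.2f}s")
```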

    Frequently asked questions

    How do I call the API?

    Set base_url to https://api.getlilac.com/v1 in the OpenAI SDK and request the model z-ai/glm-5.1.

    How much does it cost?

    $0.90/M input, $3.00/M output on the shared endpoint.
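    For a rough per-request figure, the arithmetic is straightforward (the token counts below are illustrative):

```python
# Back-of-envelope cost of one GLM 5.1 request at the listed shared-endpoint rates.
INPUT_PRICE_PER_M = 0.90   # $ per million input tokens
OUTPUT_PRICE_PER_M = 3.00  # $ per million output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in dollars for a single request."""
    return (input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# e.g. a request with 1,200 input tokens and 400 output tokens:
print(f"${request_cost(1_200, 400):.5f}")  # → $0.00228
```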

    What other models does Lilac support?

    Kimi K2.5 is also live, with Gemma 4 on the way. See the pricing page for the full catalog.

    Start running inference in minutes.

    No contracts, no commitments. Swap your base URL and pay less for the same output quality.

    Get Started

    No commitment required.