I have an ask for the AI community: can someone create a standard for interacting with LLMs?
There is a whole category of new apps and features ready to be built on top of LLMs. But getting these into the hands of users is hard. Apps will need to offer subscriptions, ask the user to add their OpenAI API key, or bundle and ship their own LLMs. It’s time to add a fourth option: create a standardized API for interacting with LLMs that cloud and local services can implement to put the user in control.
There are new LLMs and tuned variations popping up all the time, and most of them end up implementing some part of the Chat Completions endpoint of the OpenAI REST API. The recommendation is for users to use the OpenAI SDK and override the endpoint (even Google is doing this now!). But the OpenAI API is tailored to the needs of OpenAI, which means it’s set up to interact with their cloud-based LLM models that are known in advance. The community is not able to enhance or extend it.
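For example, here is what that override looks like with the official OpenAI Python SDK. The host, port, and model name are illustrative (a local Ollama instance in this case); any server that implements the Chat Completions endpoint can be targeted the same way:

```python
# Reusing the OpenAI SDK against a different server by overriding base_url.
# The URL and model name below are illustrative, not part of any standard.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # e.g. a local Ollama instance
    api_key="unused",  # many local servers accept any placeholder key
)

response = client.chat.completions.create(
    model="llama3",
    messages=[{"role": "user", "content": "Turn on the living room lights."}],
)
print(response.choices[0].message.content)
```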
It is time for the AI community to create and embrace a standard that can grow with our needs.
A standard API for LLM servers could start by adopting the Chat Completions API of OpenAI. A future version could extend it with discoverability APIs that allow an app to see which models are available and what their capabilities are, and to automatically select one that fits the app’s needs and the user’s budget. For the capabilities, let apps discover the optimal context size for the LLM, the types of input and output it supports (text/audio/video), whether it supports tool calling, and an estimate of how many tokens per second it runs at.
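To make that concrete, here is a sketch of what such a capabilities payload and selection logic could look like. Every field name here is an assumption to illustrate the idea; none of this exists in any current API:

```python
# Hypothetical capabilities payload that a discoverability endpoint could
# return. All field names are assumptions, not part of any existing standard.
hypothetical_models = [
    {
        "id": "llama3:8b",
        "context_size": 8192,  # optimal context window in tokens
        "modalities": {"input": ["text"], "output": ["text"]},
        "tool_calling": True,
        "estimated_tokens_per_second": 35,
    },
    {
        "id": "whisper-small",
        "context_size": 0,
        "modalities": {"input": ["audio"], "output": ["text"]},
        "tool_calling": False,
        "estimated_tokens_per_second": 120,
    },
]

def pick_model(models, needs_tools=True, min_context=4096):
    """Automatically select a model that fits the app's needs."""
    for m in models:
        if (not needs_tools or m["tool_calling"]) and m["context_size"] >= min_context:
            return m["id"]
    return None

print(pick_model(hypothetical_models))  # -> llama3:8b
```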
Maybe you’re thinking: that sounds like Ollama! Ollama is a server for running LLMs, but it has its own proprietary API; it is not a standard meant for other projects and models to adopt. From my conversations with Ollama, serving LLMs to non-localhost apps also doesn’t seem to be the direction they want to go (hence that option being tucked away in an environment variable).
There is also the Model Context Protocol (Home Assistant will support both client and server in next month’s release!), but that’s for LLMs to talk to things, not for apps to talk to LLMs.
This involves defining the protocol and publishing it, then getting big projects to say they use it (if they are already OpenAI compatible, it wouldn’t require any code changes). I even have a name suggestion for this API: CHat AI, or Chai.
I don’t want to publish this myself because I am merely a guest in the AI world, and as founder of Home Assistant I already have enough responsibilities as it is. Once established, we would be happy to include it in Home Assistant.
Background: I am the president of the Open Home Foundation and founder of Home Assistant. We are the world's most active open source community and integrate over 2000 different device and service APIs into a single smart home platform that keeps everything local. We have support for LLMs that have official APIs, including tool calling. This includes all major cloud LLM services and Ollama (the only local LLM with a standardized API). LLMs work really well with our voice assistant and can be used for decision making in automations.
My vision for AI is that everyone will have a box in their house to run LLMs (think NAS, but for AI). Apps and tools like Home Assistant will be able to find this box via mDNS discovery (like AirPlay or Cast). They can ask users to set it up and, via the capabilities API, find the right model for the jobs they want to do.
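As a sketch of that discovery flow: assuming such a box advertised itself under a service type like `_chai._tcp.local.` (entirely hypothetical, nothing advertises it today), an app could find it with the python-zeroconf library:

```python
# Minimal mDNS discovery sketch using the python-zeroconf library.
# The "_chai._tcp.local." service type is a hypothetical name.
import time
from zeroconf import ServiceBrowser, ServiceListener, Zeroconf

class ChaiListener(ServiceListener):
    def add_service(self, zc, type_, name):
        info = zc.get_service_info(type_, name)
        if info:
            address = info.parsed_addresses()[0]
            print(f"Found LLM box {name} at {address}:{info.port}")

    def remove_service(self, zc, type_, name):
        print(f"LLM box {name} disappeared")

    def update_service(self, zc, type_, name):
        pass

zc = Zeroconf()
browser = ServiceBrowser(zc, "_chai._tcp.local.", ChaiListener())
try:
    time.sleep(5)  # browse for a few seconds
finally:
    zc.close()
```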
– Paulus Schoutsen
Got questions or feedback? Hit me up at [email protected]
OpenAI supports overriding the endpoint, but that doesn’t mean every API that can now be targeted by the SDK supports every endpoint. You mention LiteLLM, but it doesn’t seem to implement the OpenAI models endpoint.
And even if it did: what you’re describing, an endpoint for chat completions and an endpoint for model information, is a standard.
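Both halves already exist in the OpenAI SDK today, so a server implementing both is effectively speaking a minimal version of that standard. A quick sketch (the base URL is illustrative):

```python
from openai import OpenAI

# Any server implementing both endpoints can be queried the same way;
# the base URL below is illustrative.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="unused")

# The model-information half: GET /v1/models
for model in client.models.list():
    print(model.id)
```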
OpenAI itself has also moved on from the Chat Completions API and has now standardized on the Responses API: https://openai.com/index/new-tools-for-building-agents/ (used starting with Home Assistant 2025.4).
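For reference, a minimal Responses API call looks like this (model name illustrative):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# The Responses API takes a single `input` instead of a messages list.
response = client.responses.create(
    model="gpt-4o",
    input="Turn on the living room lights.",
)
print(response.output_text)
```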