First install the dependencies:
pip install mlx-lm openai
Then start the server, which listens on http://localhost:8080 by default:
mlx_lm.server
You can now talk to it with the standard OpenAI Python client:
from openai import OpenAI

# Point the client at the local mlx_lm server; it does not check API keys,
# so any placeholder string works.
client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="not-needed",
)

response = client.chat.completions.create(
    model="mlx-community/qwen3-4b-4bit-DWQ",
    messages=[
        {
            "role": "user",
            "content": "How many letter r are in strawberry?",
        }
    ],
)

print(response.choices[0].message.content)
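The prompt above has a known answer, which makes it a handy smoke test for a newly started server. As a sketch, the ground truth the model should reproduce can be computed directly with Python's built-in str.count:

```python
# The expected answer to the prompt sent above: "strawberry" contains 3 r's.
print("strawberry".count("r"))  # → 3
```

If the model's reply disagrees, the server is still working; small quantized models often miss letter-counting questions.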