
@lmcinnes
Last active July 21, 2025 21:38
Interactive Data Map of Wikipedia
@rodighiero

I am playing with the code, but I have a problem I do not understand here:


ImportError Traceback (most recent call last)
Cell In[48], line 2
1 from toponymy import Toponymy, ToponymyClusterer, KeyphraseBuilder, ClusterLayerText
----> 2 from toponymy.llm_wrappers import AzureAI
3 from toponymy.embedding_wrappers import AzureAIEmbedder

ImportError: cannot import name 'AzureAI' from 'toponymy.llm_wrappers' (/opt/homebrew/Caskroom/miniconda/base/envs/OAPEN/lib/python3.12/site-packages/toponymy/llm_wrappers.py)

@berkidem

I think you need to have the Azure AI package installed in the active environment to be able to import its wrapper.

@rodighiero

Do you know which package I have to install in the conda environment? I tried azure-code, but I get the same error.

@lmcinnes
Author

azure-ai-inference is the package you'll need if you want to use an Azure AI Foundry model. You may have to pip install it into your conda environment.
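A minimal sketch of guarding that import, assuming the wrapper only loads when its SDK is present (the fallback message is illustrative, not part of toponymy):

```python
# Try the Azure wrapper; it needs the azure-ai-inference SDK installed in the
# same environment. We degrade gracefully instead of crashing with ImportError.
try:
    from toponymy.llm_wrappers import AzureAI  # requires azure-ai-inference
except ImportError:
    AzureAI = None  # fix with: pip install azure-ai-inference
    print("AzureAI wrapper unavailable; install azure-ai-inference first")
```

This makes the dependency explicit rather than surfacing as a bare ImportError mid-notebook.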

@rodighiero

I tried to use it, but it's a very complicated service with some limits for EU cards. Do you think it's feasible to use GPT instead?

@berkidem

berkidem commented Jun 28, 2025

There are LLM and embedding wrappers for most providers here. However, if you are planning to use the OpenAI wrappers, make sure to use the latest version of the package from the repo. I made a PR about the OpenAI wrappers a few days ago and it has been merged, but there hasn't been a release since then, so the version you would get from pip install toponymy won't work.

@lmcinnes
Author

Note that, as with Azure AI Foundry, you will need to install the relevant package to enable it within toponymy. So if you want to use OpenAI you'll need to install openai into your environment for toponymy to see it, and so on. Anthropic, Cohere, and OpenAI are all available, as well as local LLMs (assuming you have a GPU) via llama_cpp, and, in the most recent version on GitHub, vLLM.
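One way to see which wrappers your environment can support is to check for each backing SDK. The mapping below is illustrative (wrapper and module names are assumptions based on this thread, not toponymy's source):

```python
import importlib.util

# Assumed mapping of toponymy wrapper names to the SDK module each one needs;
# a wrapper is only importable when its SDK is installed in the same env.
PROVIDER_SDKS = {
    "OpenAI": "openai",
    "Anthropic": "anthropic",
    "Cohere": "cohere",
    "AzureAI": "azure.ai.inference",
    "LlamaCpp": "llama_cpp",  # local models, typically wants a GPU
}

def available_wrappers() -> list:
    """List wrapper names whose backing SDK is importable right now."""
    found = []
    for wrapper, sdk in PROVIDER_SDKS.items():
        try:
            if importlib.util.find_spec(sdk) is not None:
                found.append(wrapper)
        except ModuleNotFoundError:
            # find_spec raises for dotted names whose parent package is absent
            pass
    return found

print(available_wrappers())
```

Running this before building the Toponymy pipeline tells you which providers you still need to pip install.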

Note also that async/batch versions of the service wrappers are now available, so you may want to consider using those instead as they will be faster. Just prefix the wrapper name with Async to get that to work, so for example AsyncOpenAI etc.

@rodighiero

I integrated Toponymy with OpenAI following your suggestion and used AsyncOpenAI, which works well for embeddings and the initial clustering. However, during the topic naming step I’m hitting a BadRequestError when naming clusters with very large keyphrase sets. It seems some prompts exceed the API input limits.

What would you recommend as the best solution? Should I patch make_prompts() to truncate keyphrases per cluster (e.g. top 30–50), or is there an existing parameter or preferred way to limit the prompt size for topic naming?

The OpenAI integration otherwise works fine, with async speeding things up as expected!
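If patching turns out to be the way to go, the truncation itself is small. A hedged sketch, assuming keyphrases are ordered most-relevant-first (the function name and call site are hypothetical, not toponymy's make_prompts API):

```python
def truncate_keyphrases(cluster_keyphrases, max_per_cluster=40):
    """Keep at most max_per_cluster keyphrases for each cluster so the
    topic-naming prompt stays under the API's input limit."""
    return [phrases[:max_per_cluster] for phrases in cluster_keyphrases]

clusters = [["wiki", "encyclopedia", "article"], ["gpu", "cuda"]]
print(truncate_keyphrases(clusters, max_per_cluster=2))
```

Applied just before prompt construction, this bounds prompt size per cluster without touching the clustering itself.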

@rodighiero

It works pretty nicely now, but I couldn't recreate the interface. It seems the package has been updated and some functions no longer exist. Does anyone have thoughts on this?
