victorlin / 1_longest_tokens.py
Last active October 26, 2024 21:40 — forked from ctlllll/longest_chinese_tokens_gpt4o.py
Longest tokens per language in gpt4o
import iso639
import json
import langdetect
import tiktoken
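
# Languages treated as required, given as langdetect language codes.
REQUIRED_LANGUAGES = ["zh-cn"]

# Token count per language: a minimum for required languages,
# a maximum for optional languages.
TOKENS_PER_LANGUAGE = 20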