David Mezzetti (davidmezzetti)

Browser automation with Playwright

This example adds the Playwright MCP service to txtai agents.

Start the Playwright MCP server locally.

npx @playwright/mcp@latest --port 8931
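
A sketch of wiring the running server into an agent, following the same Agent pattern used in the snippets below; the endpoint URL and model are placeholders rather than verified values:

from txtai import Agent

# Connect the agent to the local Playwright MCP server started above
# (endpoint path is an assumption - use the URL the server prints at startup)
agent = Agent(
    tools=["http://localhost:8931/sse"],
    model="LLM path"
)

# Ask the agent to drive the browser with the Playwright tools
agent("Open https://github.com/neuml/txtai and summarize the page")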

Text extraction MCP service

Extract text using txtai, docling and Docker. The service is available via the Model Context Protocol (MCP).

/tmp/config/config.yml

# Enable MCP server
mcp: True

# Enable file uploads

Connect an agent to the MCP service:

from txtai import Agent

agent = Agent(
    tools=["http://mcp.server/path"],
    model="LLM path"
)
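
Once the agent is connected to the running MCP service, it can be prompted to call the extraction tool. A hypothetical prompt, just to show the call pattern:

# Ask the agent to use the text extraction tool exposed over MCP
agent("Extract the text from https://github.com/neuml/txtai")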

Wikipedia Embeddings MCP Server

config.yml

# Enable MCP server
mcp: True

# Load Wikipedia Embeddings index
cloud:
  provider: huggingface-hub
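
The cloud section pulls a prebuilt index from the Hugging Face Hub. The same index can also be loaded directly in Python; a sketch assuming the public neuml/txtai-wikipedia container:

from txtai import Embeddings

# Load the Wikipedia embeddings index from the Hugging Face Hub
embeddings = Embeddings()
embeddings.load(provider="huggingface-hub", container="neuml/txtai-wikipedia")

# Run a search against the loaded index
embeddings.search("history of artificial intelligence", 1)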

Resumable indexing with checkpoints

from txtai import Embeddings

# Start the indexing run - stream() is a generator that yields the data to index
embeddings = Embeddings(content=True)
embeddings.index(stream(), checkpoint="checkpoint dir")

# Elapsed time ⏳ then ⚡💥🔥
# error, power outage, random failure

# Fix the issue 🧑‍🔧⚙️ and re-run with the same checkpoint directory
# Indexing resumes from the last saved checkpoint instead of starting over
embeddings.index(stream(), checkpoint="checkpoint dir")

Embeddings with graph and SQL search

from txtai import Embeddings

# Vector embeddings index with content storage and a semantic graph
embeddings = Embeddings(content=True, graph=True)
embeddings.index(...)

# Standard Vector Search
embeddings.search("vector search query")

# Vector SQL query
embeddings.search("select id, text, score from txtai where similar('vector search query')")
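
Since graph=True builds a semantic graph alongside the index, results can also be requested as a graph at query time. A minimal sketch, assuming the graph search support in recent txtai releases; the traversal below just prints the most central matching nodes:

# Graph search - returns a graph of related results instead of a flat list
graph = embeddings.search("vector search query", graph=True)

# Walk results by centrality and print the stored text for each node
for node in list(graph.centrality().keys())[:5]:
    print(graph.attribute(node, "text"))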

🤦 This is DeepSeek

from txtai import LLM

# AWQ-quantized DeepSeek-R1 distill of Llama 8B, prompt sent as a user message
llm = LLM("casperhansen/deepseek-r1-distill-llama-8b-awq")
llm("Do you think the USA is a good or bad country?", maxlength=512, defaultrole="user")

Indexing in-memory data with column mapping

from txtai import Embeddings

# In-memory data
data = [{"name": "John", "age": 16}, {"name": "Jon", "age": 45}, {"name": "Sarah", "age": 18}]

# Vector embeddings index with content storage, using the "name" field as the text to index
embeddings = Embeddings(content=True, columns={"text": "name"})
embeddings.index(data)

# Vector similarity search
embeddings.search("jon", 1)
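
Since content storage keeps the full dictionaries, fields that aren't vectorized (like age) can still be filtered at query time. A sketch assuming txtai's SQL support over stored fields; the query is illustrative:

# SQL query combining vector similarity with a filter on the stored age field
embeddings.search("select text, age, score from txtai where similar('jon') and age >= 18")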