```fish
# Clone repo
git clone https://github.com/HKUDS/lightrag.git
cd lightrag

# Set up environment
vf new lightrag-env
pip install -r requirements.txt
pip install fastapi uvicorn PyJWT aiofiles python-multipart

# Configure
cp env.example .env
# Edit .env to set OpenAI API key

# Make sure Ollama is running and pull embedding model
ollama pull bge-m3:latest

# Create inputs directory and start server
mkdir -p inputs
lightrag-server
```
Visit http://localhost:9621 to use LightRAG.
I recently decided to try out LightRAG, a promising graph-based RAG (Retrieval Augmented Generation) system. Since I use Fish shell instead of Bash, I ran into some interesting challenges along the way.
My first step was cloning the repo and setting up a virtual environment with VirtualFish:
```fish
git clone https://github.com/HKUDS/lightrag.git
cd lightrag
vf new lightrag-env
```
VirtualFish worked perfectly, showing the active environment in my prompt as `(🐍-lightrag-env)`.
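One Fish-specific note: `vf new` activates the environment only for the current session, so in new shells you need to reactivate it yourself. The usual VirtualFish commands cover this:

```fish
# Reactivate the environment in a new Fish session
vf activate lightrag-env

# List available environments, or drop out of the current one
vf ls
vf deactivate
```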
I tried installing LightRAG with:
```fish
pip install "lightrag-hku[api]"
```
But I ran into missing dependencies. The solution was to install from `requirements.txt` and add a few extra packages:
```fish
pip install -r requirements.txt
pip install fastapi uvicorn PyJWT aiofiles python-multipart
```
Setting up the configuration was straightforward:
```fish
cp env.example .env
```
I initially used Ollama for both LLM and embeddings, but entity extraction failed. Switching to OpenAI for the LLM solved this:
```
LLM_BINDING=openai
LLM_MODEL=gpt-4o
LLM_BINDING_HOST=https://api.openai.com/v1
LLM_BINDING_API_KEY=your-api-key-here

EMBEDDING_BINDING=ollama
EMBEDDING_BINDING_HOST=http://localhost:11434
EMBEDDING_MODEL=bge-m3:latest
EMBEDDING_DIM=1024
```
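Before starting the server, it's worth checking that Ollama can actually serve embeddings for the configured model. A quick sanity check against Ollama's embeddings API (for bge-m3, the vector length should match the `EMBEDDING_DIM=1024` above):

```fish
# Confirm the model is pulled
ollama list | grep bge-m3

# Request a test embedding from Ollama's API
curl http://localhost:11434/api/embeddings \
    -d '{"model": "bge-m3:latest", "prompt": "hello world"}'
```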
When first starting the server, I hit a JWT error:
```
AttributeError: module 'jwt' has no attribute 'encode'
```
The fix was to use PyJWT instead of jwt:
```fish
pip uninstall jwt
pip install PyJWT
```
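The confusion is that two different PyPI packages both import as `jwt`, and only PyJWT exposes a module-level `encode()`. A quick way to confirm the right one is installed (the payload and secret here are just a throwaway test):

```fish
# With PyJWT installed, this prints a signed token instead of raising AttributeError
python -c "import jwt; print(jwt.encode({'sub': 'test'}, 'secret', algorithm='HS256'))"
```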
When I wanted a clean slate, I removed the data directories:
```fish
rm -rf ~/Workspace/sandbox/LightRAG/rag_storage
rm -f ~/Workspace/sandbox/LightRAG/inputs/*
```
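If you end up resetting often, a small Fish function saves retyping (the function name is arbitrary; the paths are the same ones as above and should match your install):

```fish
# Convenience helper to wipe LightRAG state; adjust the paths to your setup
function lightrag-reset
    rm -rf ~/Workspace/sandbox/LightRAG/rag_storage
    rm -f ~/Workspace/sandbox/LightRAG/inputs/*
    echo "LightRAG storage cleared"
end
```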
Finally, starting the server:
```fish
lightrag-server
```
After uploading a document through the web UI, I could see entity extraction working properly.
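You can also exercise the pipeline from the command line instead of the web UI. The endpoint and fields below are my reading of the server's API and may differ between versions; since `lightrag-server` is a FastAPI app, the interactive docs at http://localhost:9621/docs show the exact schema for your install:

```fish
# Query the running server (endpoint and fields are assumptions; verify against /docs)
curl -X POST http://localhost:9621/query \
    -H "Content-Type: application/json" \
    -d '{"query": "What are the main entities in the uploaded document?", "mode": "hybrid"}'
```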
A few lessons learned:

- When using specialized shells like Fish, standard installation instructions might need tweaking (see the example after this list).
- For entity extraction, more powerful LLMs like OpenAI's models currently outperform local models.
- Pay attention to package names: 'jwt' vs 'PyJWT' makes all the difference.
- Having VirtualFish properly set up made the Python environment management seamless.
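As a generic illustration of that first point (not specific to LightRAG), Bash-oriented instructions often need small translations in Fish:

```fish
# Bash:  export OPENAI_API_KEY=sk-...
# Fish:
set -x OPENAI_API_KEY sk-...

# Bash:  source .venv/bin/activate
# Fish (plain venv, without VirtualFish):
source .venv/bin/activate.fish
```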
LightRAG is an interesting system that combines vector search with knowledge graphs for more contextual document retrieval. Once set up properly, it offers a unique approach to RAG that's worth exploring.