@virattt
Last active October 30, 2025 02:24
rag-reranking-gpt-colbert.ipynb
@truebit

truebit commented Jan 23, 2024

Thanks for sharing, but the query_embedding variable is missing its assignment statement.

@jsancs

jsancs commented Jan 23, 2024

@truebit If I have done it right, you need to add:

# Add these lines
query = "Your query in string format..."
query_encoding = tokenizer(query, return_tensors='pt', truncation=True, max_length=512)
query_embedding = model(**query_encoding).last_hidden_state.squeeze(0)

# Get score for each document
for document in splits:
    document_encoding = tokenizer(document, return_tensors='pt', truncation=True, max_length=512)
    document_embedding = model(**document_encoding).last_hidden_state

    # Calculate MaxSim score
    score = maxsim(query_embedding.unsqueeze(0), document_embedding)
    ...

@truebit

truebit commented Jan 23, 2024

@Psancs05 thx

@virattt

virattt commented Jan 23, 2024

Author

Great catch - updated 🙏

@jsancs

jsancs commented Jan 23, 2024

@virattt Do you know the difference between using:
query_embedding = model(**query_encoding).last_hidden_state.squeeze(0)
query_embedding = model(**query_encoding).last_hidden_state.mean(dim=1)

I have tested both, and it seems that squeeze(0) returns better-quality similar documents (maybe it's just the use case I tried).

@TripleExclam

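query_embedding = model(**query_encoding).last_hidden_state.squeeze(0) is correct here, since it keeps one vector per token, which is what MaxSim needs in order to match each query token against the document's token vectors, whilst
query_embedding = model(**query_encoding).last_hidden_state.mean(dim=1) returns a single vector averaged over all tokens, so the per-token matching is lost.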
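
To make the difference concrete, here is a small dependency-free sketch using plain Python lists instead of torch tensors. Everything in it is an assumption for illustration: the helper names (cosine, maxsim, mean_pool) and the toy 2-d "embeddings" are hypothetical, and this maxsim sums the per-token maxima, which may differ from what the notebook's own maxsim does (it could average or normalize differently).

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def maxsim(query_tokens, doc_tokens):
    """Hypothetical MaxSim: for each query token, take its best cosine
    match among the document tokens, then sum over query tokens."""
    return sum(max(cosine(q, d) for d in doc_tokens) for q in query_tokens)


def mean_pool(tokens):
    """Average token vectors into one vector (the .mean(dim=1) variant)."""
    n = len(tokens)
    return [sum(column) / n for column in zip(*tokens)]


# Toy data: two query tokens pointing in different directions.
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0]]  # has an exact match for each query token
doc_b = [[1.0, 1.0]]              # one token pointing between the two

# Per-token query (the squeeze(0) variant): doc_a wins.
per_token_a = maxsim(query, doc_a)  # about 2.0
per_token_b = maxsim(query, doc_b)  # about 1.414

# Mean-pooled query: a single vector [0.5, 0.5], so doc_b now wins.
pooled_query = [mean_pool(query)]
pooled_a = maxsim(pooled_query, doc_a)  # about 0.707
pooled_b = maxsim(pooled_query, doc_b)  # about 1.0

print("per-token:", per_token_a, per_token_b)
print("pooled:   ", pooled_a, pooled_b)
```

In this toy case the ranking flips: with per-token vectors doc_a scores higher because each query token finds an exact match, while the pooled query prefers doc_b because the average vector no longer resembles either original token. That would be consistent with the observation above that squeeze(0) gave better results, although a single toy example does not prove it in general.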
