Last active
October 7, 2024 05:21
Ruby script using BERT to classify search queries as keywords, questions, or statements.
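#!/usr/bin/env bash
# Install the Optimum CLI with its exporters extra (quoted so the shell does not glob the brackets).
uv tool install "optimum[exporters]"
model_name="bert-mini-finetune-question-detection"
# Export the model to ONNX with O2 optimizations into a directory named after the model.
optimum-cli export onnx --model "shahrukhx01/${model_name}" "${model_name}" --optimize O2
# Map the numeric class ids to readable labels.
jq '. += {"id2label": {"0": "KEYWORD", "1": "QUESTION_OR_STATEMENT"}}' "${model_name}/config.json" > tmp.json && mv tmp.json "${model_name}/config.json"
# Set the tokenizer's maximum sequence length explicitly.
jq '.model_max_length = 512' "${model_name}/tokenizer_config.json" > tmp.json && mv tmp.json "${model_name}/tokenizer_config.json"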
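The two jq patches above can be double-checked before the model is loaded. A minimal sketch in Ruby, assuming the export directory contains the `config.json` and `tokenizer_config.json` files edited by the script; `export_problems` is a hypothetical helper name, not part of Optimum or Informers.

```ruby
require "json"

# Hypothetical helper: lists problems found in the exported model directory.
# An empty array means both jq patches are present.
def export_problems(model_dir)
  problems = []
  config = JSON.parse(File.read(File.join(model_dir, "config.json")))
  tokenizer = JSON.parse(File.read(File.join(model_dir, "tokenizer_config.json")))

  # Same mapping the jq command writes into config.json.
  expected = { "0" => "KEYWORD", "1" => "QUESTION_OR_STATEMENT" }
  problems << "config.json: unexpected id2label" unless config["id2label"] == expected
  unless tokenizer["model_max_length"] == 512
    problems << "tokenizer_config.json: model_max_length is not 512"
  end
  problems
end
```

Calling `export_problems("bert-mini-finetune-question-detection")` from the directory where the export ran should return an empty array if both patches were applied.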
gem "informers"
require 'informers'

# Load the locally exported ONNX model; nothing is downloaded.
classifier = Informers.pipeline(
  "text-classification", "bert-mini-finetune-question-detection",
  cache_dir: File.expand_path(__dir__),
  local_files_only: true,
  model_file_name: "../model",
  quantized: false
)

queries = [
  # Keyword queries
  "ruby on rails installation",
  "postgresql database setup",
  "rspec test examples",
  # Question queries
  "How to install Ruby on Rails?",
  "What are the best practices for using Capybara?",
  "Can you explain how to integrate Elasticsearch with Rails?",
  # Statement queries
  "I need help with AWS SDK configuration in Rails.",
  "The Kafka integration is not working as expected.",
  "Webpack is throwing errors during asset compilation."
]

# Classify all queries in one call, then print each label with its score.
results = classifier.call(queries)
queries.zip(results) do |query, result|
  puts "Query: #{query}"
  puts "Label: #{result[:label]} (#{(result[:score] * 100).round(2)}% confidence)"
  puts
end
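One way to act on the classifier output is to route each query to a different search backend. A minimal sketch, assuming result hashes carry the `:label` and `:score` keys used in the script above; the `route_for` helper, the route names, and the 0.8 threshold are hypothetical choices for illustration, not something the gist prescribes.

```ruby
# Hypothetical routing helper: picks a search strategy from one classifier result.
# Low-confidence predictions fall back to keyword search (an assumed default).
def route_for(result, threshold: 0.8)
  return :keyword_search if result[:score] < threshold

  case result[:label]
  when "QUESTION_OR_STATEMENT" then :semantic_search
  else :keyword_search
  end
end

# Stand-in results shaped like the classifier output; the scores are made up.
sample = [
  { label: "KEYWORD", score: 0.97 },
  { label: "QUESTION_OR_STATEMENT", score: 0.91 },
  { label: "QUESTION_OR_STATEMENT", score: 0.55 }
]
p sample.map { |result| route_for(result) }
# prints [:keyword_search, :semantic_search, :keyword_search]
```

Keeping the threshold as a keyword argument makes it easy to tune once real scores from the model are available.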
Usage
bin/bundle exec ruby keyword_or_nl_query.rb
Output: