Created March 12, 2025 07:04
mlflow-llm-as-a-judge-example
import mlflow

# professionalism_example_score_2 and professionalism_example_score_4 are
# assumed to be defined earlier (they are not part of this snippet).
# The "openai:/..." model URI requires OPENAI_API_KEY to be set in the environment.
professionalism = mlflow.metrics.genai.make_genai_metric(
    name="professionalism",
    definition=(
        "Professionalism refers to the use of a formal, respectful, and appropriate style of communication that is "
        "tailored to the context and audience. It often involves avoiding overly casual language, slang, or "
        "colloquialisms, and instead using clear, concise, and respectful language."
    ),
    # Adjacent string literals are concatenated, so each one ends with a space
    # to keep the score descriptions from running together.
    grading_prompt=(
        "Professionalism: If the answer is written using a professional tone, below are the details for different scores: "
        "- Score 0: Language is extremely casual, informal, and may include slang or colloquialisms. Not suitable for "
        "professional contexts. "
        "- Score 1: Language is casual but generally respectful and avoids strong informality or slang. Acceptable in "
        "some informal professional settings. "
        "- Score 2: Language is overall formal but still has casual words/phrases. Borderline for professional contexts. "
        "- Score 3: Language is balanced and avoids extreme informality or formality. Suitable for most professional contexts. "
        "- Score 4: Language is noticeably formal, respectful, and avoids casual elements. Appropriate for formal "
        "business or academic settings."
    ),
    examples=[professionalism_example_score_2, professionalism_example_score_4],
    model="openai:/gpt-4o-mini",
    parameters={"temperature": 0.0},
    aggregations=["mean", "variance"],
    greater_is_better=True,
)