@aaronvg
Last active September 11, 2024 05:50
BAML Cursor Prompt
Generate a prompt using BAML.
The BAML programming language is used to write LLM prompts with certain inputs and outputs.
The syntax looks like this:
<BAML Example>
```baml
class ExtractedResume {
name string
links string[] @description(#"
Any links to the candidate's online profiles (e.g. LinkedIn, GitHub, Email).
"#)
education Education[]
experience Experience[]
skills string[]
why_hire string[] @description(#"
3 points of why the candidate is a great hire (use fun and exciting language!).
"#)
degree Degree
}
enum Degree {
HIGH_SCHOOL
ASSOCIATES
BACHELORS
}
class Education {
school string
degree string
year int
}
class Experience {
company string
title string?
start_date string
end_date string?
description string[]
company_url string? @description(#"
Best guess of the company's website URL.
"#)
}
// Since we're extracting a resume object from some free-form text, we'll make the output the ExtractedResume, to follow the rules of not using the same input type as the output.
function ExtractResume(raw_text: string) -> ExtractedResume {
client openai/gpt-4o
prompt #"
Parse the following resume and return a structured representation of the data in the schema below.
Resume:
---
{{raw_text}}
---
{{ ctx.output_format }}
"#
}
```
</BAML Example>
<BAML Syntax>
Available primitive types
- string
- int
- float
- bool
- string[], float[], etc.
- map<string, string>, map<string, int>, etc.
- image
You can also use unions like:
int | float
and optionals by adding a "?" like:
string?
There is also a 'class':
class Name {
field_name type
field_name type
...
}
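As a small sketch of how these types combine in one class (the class name and field names here are hypothetical, chosen only for illustration):

```baml
// Hypothetical class combining the types listed above.
class ProductListing {
  title string
  // A union: the price may come back as a whole number or a decimal.
  price int | float
  // An optional: absent when the listing has no discount.
  discount_code string?
  tags string[]
  attributes map<string, string>
  in_stock bool
}
```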
You should try to use enums for any categorization or classification parts:
enum Name {
value1
value2 @description("A helpful description")
...
}
class MyClass {
category Name
}
To describe fields, you can use @description(...):
class Name {
field_name type @description(#"
Description of the field.
"#)
}
You can also alias fields to help an LLM use a better name than the one in the schema like this:
class Name {
field_name type @alias("amazingField")
}
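A field can carry both an alias and a description. In this sketch (all names are hypothetical), the LLM is shown the aliased names, while the schema keeps `intent` and `order_id`:

```baml
// Hypothetical class: aliases give the LLM friendlier names than the schema uses.
class SupportRequest {
  intent string @alias("reason_for_contact") @description(#"
    One short sentence describing why the customer reached out.
  "#)
  order_id string? @alias("order_number")
}
```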
Multiline strings go in hash-quote blocks (#" ... "#) like above. Single-line strings can go in double quotes.
BAML does not support recursive classes or anonymous classes. Declare each class at the root level, as you would with Pydantic models.
<Function>
BAML functions are like normal functions, except the input is used to build a prompt. The output is the structured data you want to extract from the input (like a class or primitive type or array of classes etc). The output type must have "Output" in the name (unless it's a primitive type like string, int, etc.)
Function names must be PascalCase.
function FunctionName(myArg1: InputType, myArg2: string) -> OutputType {
client openai/gpt-4o
// Everything inside the prompt uses Jinja. Reference each input by its argument name, e.g. {{ myArg1 }}, and use Jinja filters and functions, as well as if conditions and for loops.
prompt #"
Your prompt here. Use {{ myArg2 }} to reference an input by its argument name.
"#
}
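The function template mentions Jinja if conditions and for loops. A minimal sketch of both, assuming a hypothetical `SummarizeOrder` function with hypothetical `LineItem` and `OrderSummaryOutput` classes:

```baml
// Input and output use different classes, and the output class has "Output" in its name.
class LineItem {
  name string
  quantity int
}

class OrderSummaryOutput {
  total_items int
  summary string
}

function SummarizeOrder(items: LineItem[], customer_note: string?) -> OrderSummaryOutput {
  client openai/gpt-4o
  prompt #"
    Summarize the order below in one or two sentences.

    Items:
    {# Loop over the array argument and print one line per item #}
    {% for item in items %}
    - {{ item.quantity }} x {{ item.name }}
    {% endfor %}

    {# Only mention the note when one was provided #}
    {% if customer_note %}
    Customer note: {{ customer_note }}
    {% endif %}

    {{ ctx.output_format }}
  "#
}
```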
<BAML Prompt>
When generating a prompt, try to be specific, without ambiguities. If there are ambiguities, add a comment to clarify like this:
...
prompt #"
{# This is a comment #}
"#
...
You must also include this unique macro in the prompt like this:
...
{{ ctx.output_format }}
...
<Roles>
For the prompt, you can specify message roles like this:
prompt #"
{{ _.role("system")}}
Some general instructions about the prompt
{{ _.role("user")}}
Inject some {{ variables }} here in the prompt if you want.
{{ ctx.output_format }}
"#
You may only use "system" and "user" roles in the prompt.
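Put together, a complete function using both roles might look like this sketch (the `Sentiment` enum, `ReviewOutput` class, and `AnalyzeReview` function are hypothetical):

```baml
enum Sentiment {
  POSITIVE
  NEGATIVE
  NEUTRAL
}

class ReviewOutput {
  sentiment Sentiment
  key_points string[]
}

function AnalyzeReview(review_text: string) -> ReviewOutput {
  client openai/gpt-4o
  prompt #"
    {{ _.role("system") }}
    You analyze product reviews. Base your answer only on the review text.

    {{ ctx.output_format }}

    {{ _.role("user") }}
    {{ review_text }}
  "#
}
```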
</Roles>
<Images>
To inline an image into the prompt, reference the image argument directly, e.g. {{ myImage }}:
```baml
function ExtractImage(myImage: image) -> MyClass {
client openai/gpt-4o
prompt #"
Describe the image below:
{{ myImage }}
{{ ctx.output_format }}
"#
}
```
When extracting data from images, prefer optional fields in the output schema for values that may be missing.
</Images>
<ChainOfThought>
When writing the prompt, you can add an instruction such as "Write out your reasoning in plain English before you write the final result in the format specified".
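One possible placement is right after the output format macro, as in this sketch (the `CheckRefundEligibility` function and `EligibilityOutput` class are hypothetical):

```baml
class EligibilityOutput {
  eligible bool
  reason string
}

function CheckRefundEligibility(policy: string, refund_request: string) -> EligibilityOutput {
  client openai/gpt-4o
  prompt #"
    Decide whether the refund request below is allowed under the policy.

    Policy:
    ---
    {{ policy }}
    ---

    Refund request:
    ---
    {{ refund_request }}
    ---

    {{ ctx.output_format }}

    Write out your reasoning in plain English before you write the final result in the format specified.
  "#
}
```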
</ChainOfThought>
You do NOT need to specify the schema in the prompt. It's already injected by {{ ctx.output_format }}.
</BAML Prompt>
<Clients>
For the client field, specify the provider and model as provider/model, like this.
Use OpenAI unless otherwise specified.
```baml
function MyFunction(myArg1: string) -> MyClass {
client openai/gpt-4o
prompt #"
...
"#
}
```
</Clients>
Don't reuse an input schema as the output type of the function.
</Function>
<Tests>
You should always try to include a sample test when generating a BAML function.
Write a short description of tests you'll write using comments first before writing it.
```baml
test MyTestName {
// Usually just one function.
functions [ExtractResume]
// input arguments to the function called "ExtractResume".
args {
// Multiline strings should use #" ... "#. No need to escape newlines; just write them out.
raw_text #"
Jason Doe
Python, Rust
University of California, Berkeley, B.S.
in Computer Science, 2020
Also an expert in Tableau, SQL, and C++
"#
}
}
```
ALL of the required function arguments MUST be present.
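For a function with several arguments, each one gets its own entry under args. A sketch, assuming a hypothetical `DraftReply` function that takes a hypothetical `Customer` class and a string; the nested block for the class argument follows the same key-value style as the rest of the test:

```baml
class Customer {
  name string
  tier string
}

function DraftReply(customer: Customer, message: string) -> string {
  client openai/gpt-4o
  prompt #"
    Write a short, polite reply to {{ customer.name }} ({{ customer.tier }} tier).

    Message:
    ---
    {{ message }}
    ---

    {{ ctx.output_format }}
  "#
}

// Test: a gold-tier customer asking about a late delivery.
// Both required arguments, customer and message, are supplied.
test DraftReplyLateDelivery {
  functions [DraftReply]
  args {
    customer {
      name "Dana Smith"
      tier "gold"
    }
    message #"
      My order was due on Monday and still has not arrived.
      Can you tell me where it is?
    "#
  }
}
```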
If a function takes in an image, you can initialize it like this:
```baml
function DescribeImage(myImage: image) -> string {
...
}
test MyTestName {
functions [DescribeImage]
args {
myImage {
file "./myImage.png"
}
}
}
```
</Tests>
</BAML Syntax>
When you generate any prompts using BAML, don't add any backticks. Any other freeform text must go in comments:
```baml
// This is a comment
```
Before you answer, specify what kind of details or types would be useful in the comments at the top of the file, like "We'll use an enum to classify xyz".