Last active
September 11, 2024 05:50
BAML Cursor Prompt
| Generate a prompt using BAML. | |
| The BAML programming language is used to write LLM prompts with certain inputs and outputs. | |
| The syntax looks like this: | |
| <BAML Example> | |
| ```baml | |
| class ExtractedResume { | |
|   name string | |
|   links string[] @description(#" | |
|     Any links to the candidate's online profiles (e.g. LinkedIn, GitHub, Email). | |
|   "#) | |
|   education Education[] | |
|   experience Experience[] | |
|   skills string[] | |
|   why_hire string[] @description(#" | |
|     3 points of why the candidate is a great hire (use fun and exciting language!). | |
|   "#) | |
|   degree Degree | |
| } | |
| enum Degree { | |
|   HIGH_SCHOOL | |
|   ASSOCIATES | |
|   BACHELORS | |
| } | |
| class Education { | |
|   school string | |
|   degree string | |
|   year int | |
| } | |
| class Experience { | |
|   company string | |
|   title string? | |
|   start_date string | |
|   end_date string? | |
|   description string[] | |
|   company_url string? @description(#" | |
|     Best guess of the company's website URL. | |
|   "#) | |
| } | |
| // Since we're extracting a resume object from some free-form text, we'll make the output the ExtractedResume, to follow the rules of not using the same input type as the output. | |
| function ExtractResume(raw_text: string) -> ExtractedResume { | |
|   client openai/gpt-4o | |
|   prompt #" | |
|     Parse the following resume and return a structured representation of the data in the schema below. | |
|     Resume: | |
|     --- | |
|     {{raw_text}} | |
|     --- | |
|     {{ ctx.output_format }} | |
|   "# | |
| } | |
| ``` | |
| </BAML Example> | |
| <BAML Syntax> | |
| Available primitive types | |
| - string | |
| - int | |
| - float | |
| - bool | |
| - string[], float[], etc. | |
| - map<string, string>, map<string, int>, etc. | |
| - image | |
| You can also use unions like: | |
| int | float | |
| and optionals by adding a "?" like: | |
| string? | |
| There is also a 'class': | |
| class Name { | |
| field_name type | |
| field_name type | |
| ... | |
| } | |
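| A minimal sketch of how the types above can be combined in one class. The class and field names are hypothetical, chosen only for illustration: | |
| ```baml | |
| class ProductInfo { | |
|   name string | |
|   // union: the price may be parsed as a whole number or a decimal | |
|   price int | float | |
|   // optional: left empty when the source text does not mention a discount | |
|   discount_code string? | |
|   tags string[] | |
|   // map from attribute name to attribute value | |
|   attributes map<string, string> | |
| } | |
| ``` | |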
| You should try to use enums for any categorization or classification parts: | |
| enum Name { | |
| value1 | |
| value2 @description("A helpful description") | |
| ... | |
| } | |
| class MyClass { | |
| category Name | |
| } | |
| To describe fields, you can use @description(...): | |
| class Name { | |
| field_name type @description(#" | |
| Description of the field. | |
| "#) | |
| } | |
| You can also alias fields to help an LLM use a better name than the one in the schema like this: | |
| class Name { | |
| field_name type @alias("amazingField") | |
| } | |
| Multiline strings go in hashtag-quote like above. Single-line strings can go in double quotes. | |
| BAML classes do not support recursion, nor anonymous classes. You must declare each at root level like Pydantic. | |
| <Function> | |
| BAML functions are like normal functions, except the input is used to build a prompt. The output is the structured data you want to extract from the input (like a class or primitive type or array of classes etc). The output type must have "Output" in the name (unless it's a primitive type like string, int, etc.) | |
| Function names must be PascalCase. | |
| function FunctionName(myArg1: InputType, myArg2: string) -> OutputType { | |
|   client openai/gpt-4o | |
|   // Everything inside the prompt uses Jinja. Reference each input by its argument name, e.g. {{ myArg2 }}, and use Jinja filters and functions, as well as if conditions and for loops. | |
|   prompt #" | |
|     Your prompt here. Use {{ myArg1 }} and {{ myArg2 }} to reference the inputs. | |
|   "# | |
| } | |
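| As a hedged sketch of the Jinja control flow mentioned above, the function below loops over a list argument and adds one line only when an optional argument is set. The names SummarizeNotes, SummaryOutput, notes, and audience are hypothetical: | |
| ```baml | |
| class SummaryOutput { | |
|   summary string | |
|   action_items string[] | |
| } | |
| function SummarizeNotes(notes: string[], audience: string?) -> SummaryOutput { | |
|   client openai/gpt-4o | |
|   prompt #" | |
|     Summarize the meeting notes below. | |
|     {# Only mention the audience when the caller provided one #} | |
|     {% if audience %} | |
|     Write the summary for this audience: {{ audience }} | |
|     {% endif %} | |
|     Notes: | |
|     {% for note in notes %} | |
|     - {{ note }} | |
|     {% endfor %} | |
|     {{ ctx.output_format }} | |
|   "# | |
| } | |
| ``` | |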
| <BAML Prompt> | |
| When generating a prompt, try to be specific, without ambiguities. If there are ambiguities, add a comment to clarify like this: | |
| ... | |
| prompt #" | |
| {# This is a comment #} | |
| "# | |
| ... | |
| You must also include this unique macro in the prompt like this: | |
| ... | |
| {{ ctx.output_format }} | |
| ... | |
| <Roles> | |
| For the prompt, you can specify message roles like this: | |
| prompt #" | |
| {{ _.role("system")}} | |
| Some general instructions about the prompt | |
| {{ _.role("user")}} | |
| Inject some {{ variables }} here in the prompt if you want. | |
| {{ ctx.output_format }} | |
| "# | |
| You may only use "system" and "user" roles in the prompt. | |
| </Roles> | |
| <Images> | |
| To inline an image into the prompt, reference the image argument directly, e.g. {{ myImage }}: | |
| ```baml | |
| function ExtractImage(myImage: image) -> MyClass { | |
|   client openai/gpt-4o | |
|   prompt #" | |
|     Describe the image below: | |
|     {{ myImage }} | |
|     {{ ctx.output_format }} | |
|   "# | |
| } | |
| ``` | |
| If you are extracting data from images, prefer optional fields in the output schema if needed. | |
| </Images> | |
| <ChainOfThought> | |
| When writing the prompt, you can add an instruction such as "write out your reasoning in plain English before you write the final result in the format specified". | |
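| A minimal sketch of where that instruction can sit in a prompt. The function, class, enum, and argument names are hypothetical: | |
| ```baml | |
| enum Sentiment { | |
|   POSITIVE | |
|   NEGATIVE | |
|   NEUTRAL | |
| } | |
| class SentimentOutput { | |
|   sentiment Sentiment | |
| } | |
| function ClassifySentiment(review: string) -> SentimentOutput { | |
|   client openai/gpt-4o | |
|   prompt #" | |
|     Classify the sentiment of the review below. | |
|     Review: | |
|     --- | |
|     {{ review }} | |
|     --- | |
|     {{ ctx.output_format }} | |
|     Before answering, write out your reasoning in plain English, then write the final result in the format specified. | |
|   "# | |
| } | |
| ``` | |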
| </ChainOfThought> | |
| You do NOT need to specify the schema in the prompt. It's already injected by {{ ctx.output_format }}. | |
| </BAML Prompt> | |
| <Clients> | |
| For the client field you can specify the provider/model like this. | |
| Aim to use OpenAI unless otherwise specified. | |
| ```baml | |
| function MyFunction(myArg1: string) -> MyClass { | |
|   client openai/gpt-4o | |
|   prompt #" | |
|     ... | |
|   "# | |
| } | |
| ``` | |
| </Clients> | |
| Don't reuse an input schema as the output type of the function. | |
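| One way to read this rule, sketched with hypothetical names: the input class and the output class are declared separately, even when they share a field. | |
| ```baml | |
| enum Priority { | |
|   LOW | |
|   MEDIUM | |
|   HIGH | |
| } | |
| // Input schema: what the caller passes in. | |
| class SupportTicket { | |
|   subject string | |
|   body string | |
| } | |
| // Output schema: a separate class, not SupportTicket itself. | |
| class TriagedTicketOutput { | |
|   subject string | |
|   priority Priority | |
| } | |
| function TriageTicket(ticket: SupportTicket) -> TriagedTicketOutput { | |
|   client openai/gpt-4o | |
|   prompt #" | |
|     Assign a priority to the support ticket below. | |
|     Subject: {{ ticket.subject }} | |
|     Body: {{ ticket.body }} | |
|     {{ ctx.output_format }} | |
|   "# | |
| } | |
| ``` | |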
| </Function> | |
| <Tests> | |
| You should always try to include a sample test when generating a BAML function. | |
| Write a short description of the tests you'll write, using comments, before writing them. | |
| ```baml | |
| test MyTestName { | |
|   // usually just one function. | |
|   functions [ExtractResume] | |
|   // input arguments to the function called "ExtractResume". | |
|   args { | |
|     // multiline strings should use #". No need to escape new lines. Just write them out. | |
|     raw_text #" | |
|       Jason Doe | |
|       Python, Rust | |
|       University of California, Berkeley, B.S. | |
|       in Computer Science, 2020 | |
|       Also an expert in Tableau, SQL, and C++ | |
|     "# | |
|   } | |
| } | |
| ``` | |
| ALL of the required function arguments MUST be present. | |
| If a function takes in an image, you can initialize it like this: | |
| ```baml | |
| function DescribeImage(myImage: image) -> string { | |
|   ... | |
| } | |
| test MyTestName { | |
|   // must match the name of the function under test | |
|   functions [DescribeImage] | |
|   args { | |
|     myImage { | |
|       file "./myImage.png" | |
|     } | |
|   } | |
| } | |
| ``` | |
| </Tests> | |
| </BAML Syntax> | |
| When you generate any prompts using BAML, don't add any backticks. Any other freeform text must go in comments: | |
| ```baml | |
| // This is a comment | |
| ``` | |
| Before you answer, specify what kind of details or types would be useful in the comments at the top of the file, like "We'll use an enum to classify xyz". |