aaronvg · September 11, 2024 05:50
diff --git a/baml-prompt.txt b/baml-prompt.txt
 Generate a prompt using BAML.

 The BAML programming language is used to write LLM prompts with certain inputs and outputs.

 The syntax looks like this:

 <BAML Example>
 ```baml
 class ExtractedResume {
  name string
  links string[] @description(#"
    Any links to the candidate's online profiles (e.g. LinkedIn, GitHub, Email).
  "#)
  education Education[]
  experience Experience[]
  skills string[]
  why_hire string[] @description(#"
    3 points of why the candidate is a great hire (use fun and exciting language!).
  "#)
  degree Degree
 }

 enum Degree {
  HIGH_SCHOOL
  ASSOCIATES
  BACHELORS
 }

 class Education {
  school string
  degree string
  year int
 }

 class Experience {
  company string
  title string?
  start_date string
  end_date string?
  description string[]
  company_url string? @description(#"
    Best guess of the company's website URL.
  "#)
 }

 // Since we're extracting a resume object from free-form text, the output type is ExtractedResume. This follows the rule of not using the same type for the input and the output.
 function ExtractResume(raw_text: string) -> ExtractedResume {
  client openai/gpt-4o
  prompt #"
    Parse the following resume and return a structured representation of the data in the schema below.

    Resume:
    ---
    {{raw_text}}
    ---

    {{ ctx.output_format }}
  "#
 }
 ```
 </BAML Example>

 <BAML Syntax>
 Available primitive types
 - string
 - int
 - float
 - bool
 - string[], float[], etc.
 - map<string, string>, map<string, int>, etc.
 - image

 You can also use unions like:
 int | float

 and optionals by adding a "?" like:
 string?
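
 As a rough sketch of how these types can be combined in one class (the class name Measurement and its fields are hypothetical, for illustration only):
 ```baml
 // Hypothetical class, not part of any real schema.
 class Measurement {
  // A union: the value may be an int or a float.
  value int | float
  // An optional: the unit may be missing.
  unit string?
  // An array of strings.
  tags string[]
  // A map from string keys to string values.
  metadata map<string, string>
 }
 ```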


 There is also a 'class':
 class Name {
  field_name type
  field_name type
  ...
 }

 You should try to use enums for any categorization or classification parts:
 enum Name {
  value1
  value2 @description("A helpful description")
  ...
 }

 class MyClass {
  category Name
 }


 To describe fields, you can use @description(...):
 class Name {
  field_name type @description(#"
    Description of the field.
  "#)
 }

 You can also alias fields to help an LLM use a better name than the one in the schema like this:
 class Name {
  field_name type @alias("amazingField")
 }

 Multiline strings go between #" and "# like above. Single-line strings can go in double quotes.

 BAML classes do not support recursion, nor anonymous classes. You must declare each at root level like Pydantic.

 <Function>
 BAML functions are like normal functions, except the input is used to build a prompt. The output is the structured data you want to extract from the input (like a class or primitive type or array of classes etc). The output type must have "Output" in the name (unless it's a primitive type like string, int, etc.)

 Function names must be PascalCase.

 function FunctionName(myArg1: InputType, myArg2: string) -> OutputType {
  client openai/gpt-4o
  // Everything inside the prompt uses Jinja. Reference each input by its argument name, e.g. {{ myArg1 }}, and use Jinja filters and functions, as well as if conditions and for loops.
  prompt #"
    Your prompt here. Use {{ myArg1 }} and {{ myArg2 }} to reference the inputs.
  "#
 }
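
 The Jinja if conditions and for loops mentioned in the comment above might look like the sketch below. The Message class, the SummarizeThread function, and the argument names are hypothetical and only illustrate the syntax:
 ```baml
 // Hypothetical input class, for illustration only.
 class Message {
  sender string
  text string
 }

 function SummarizeThread(messages: Message[], tone: string?) -> string {
  client openai/gpt-4o
  prompt #"
    Summarize the conversation below in two sentences.
    {# Only mention the tone when the optional argument was provided. #}
    {% if tone %}
    Write the summary in a {{ tone }} tone.
    {% endif %}

    {# Render one line per message. #}
    {% for m in messages %}
    {{ m.sender }}: {{ m.text }}
    {% endfor %}

    {{ ctx.output_format }}
  "#
 }
 ```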

 <BAML Prompt>
 When generating a prompt, try to be specific, without ambiguities. If there are ambiguities, add a comment to clarify like this:
 ...
 prompt #"
  {# This is a comment #}
 "#
 ...

 You must also include this unique macro in the prompt like this:
 ...
 {{ ctx.output_format }}
 ...


 <Roles>
 For the prompt, you can specify message roles like this:

 prompt #"
  {{ _.role("system")}}
  Some general instructions about the prompt

  {{ _.role("user")}}
  Inject some {{ variables }} here in the prompt if you want.

  {{ ctx.output_format }}
 "#

 You may only use "system" and "user" roles in the prompt.
 </Roles>

 <Images>
 To inline an image into the prompt, reference the image argument directly with {{ myImage }}:
 ```baml
 function ExtractImage(myImage: image) -> MyClass {
  client openai/gpt-4o
  prompt #"
    Describe the image below:

    {{ myImage }}

    {{ ctx.output_format }}
  "#
 }
 ```

 If you are extracting data from images, prefer optional fields in the output schema if needed.
 </Images>

 <ChainOfThought>
 When writing the prompt you can write "write out your reasoning in plain English before you write the final result in the format specified".
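
 A minimal sketch of where that sentence could go, assuming a hypothetical sentiment classifier (the enum SentimentOutput and the function ClassifySentiment are made up for illustration):
 ```baml
 // Hypothetical enum and function, for illustration only.
 enum SentimentOutput {
  POSITIVE
  NEGATIVE
  NEUTRAL
 }

 function ClassifySentiment(review: string) -> SentimentOutput {
  client openai/gpt-4o
  prompt #"
    Classify the sentiment of the review below.

    Review:
    ---
    {{ review }}
    ---

    {{ ctx.output_format }}

    Write out your reasoning in plain English before you write the final result in the format specified.
  "#
 }
 ```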
 </ChainOfThought>

 You do NOT need to specify the schema in the prompt. It's already injected by {{ ctx.output_format }}.

 </BAML Prompt>

 <Clients>
 For the client field you can specify the provider/model like this.
 Aim to use OpenAI unless otherwise specified.
 ```baml
 function MyFunction(myArg1: string) -> MyClass {
  client openai/gpt-4o
  prompt #"
    ...
  "#
 }
 ```

 </Clients>

 Don't reuse an input schema as the output type of the function.
 </Function>

 <Tests>
 You should always try to include a sample test when generating a BAML function.
 Before writing a test, describe it briefly in comments.
 ```baml
 test MyTestName {
  // Usually just one function.
  functions [ExtractResume]
  // input arguments to the function called "ExtractResume".
  args {
    // multiline strings should use #". No need to escape new lines. Just write them out.
    raw_text #"
    Jason Doe
    Python, Rust
    University of California, Berkeley, B.S.
    in Computer Science, 2020
    Also an expert in Tableau, SQL, and C++
    "#
  }
 }
 ```

 ALL of the required function arguments MUST be present.

 If a function takes in an image, you can initialize it like this:
 ```baml

 function DescribeImage(myImage: image) -> string {
  ...
 }

 test MyTestName {
  functions [DescribeImage]
  args {
    myImage { 
      file "./myImage.png"
    }
  }
 }
 ```

 </Tests>

 </BAML Syntax>

 When you generate any prompts using BAML, don't add any backticks. Any other freeform text must go in comments:
  
 ```baml
 // This is a comment
 ```

 Before you answer, specify what kind of details or types would be useful in the comments at the top of the file, like "We'll use an enum to classify xyz".