Víctor Gallego vicgalle

Are OpenAI training models in a way that encourages security risks?

Today's topic is structured outputs: how to produce them, their interplay with chain-of-thought, and a potential security risk this opens up.

Structured Outputs

When using an LLM programmatically as part of a larger system or process, it is useful to have the model produce output in a structured format that is easy to parse. JSON makes a lot of sense in this regard, and commercial LLMs are trained to produce JSON outputs according to your specification. So, for example, instead of asking the model to produce a list of 10 items (left), which may be tricky to parse, I could ask it to return the answer as a JSON list of 10 strings (right).
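To make the parsing difference concrete, here is a minimal sketch in Python. The two response strings are hypothetical and hard-coded for illustration (no model or API is called), and the helper names `parse_free_text` and `parse_json_list` are my own, not part of any library.

```python
import json
import re

# Hypothetical model responses, hard-coded for illustration (no API call is made).
free_text_response = """Sure! Here are three fruits:
1. Apple
2. Banana
3) Cherry"""
json_response = '["Apple", "Banana", "Cherry"]'


def parse_free_text(text):
    """Best-effort parse of a numbered list; depends on guessing the numbering style."""
    items = []
    for line in text.splitlines():
        # Accept "1. item" or "1) item"; any other style is silently skipped.
        match = re.match(r"\s*\d+[.)]\s+(.*)", line)
        if match:
            items.append(match.group(1).strip())
    return items


def parse_json_list(text):
    """Parse a JSON list of strings and reject anything else."""
    data = json.loads(text)
    if not isinstance(data, list) or not all(isinstance(x, str) for x in data):
        raise ValueError("expected a JSON list of strings")
    return data


print(parse_free_text(free_text_response))
print(parse_json_list(json_response))
```

The free-text parser has to guess at the numbering style and ignore any preamble, while the JSON path is a single `json.loads` call plus a shape check that fails loudly when the response is not what was asked for.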

Are multi-LLM-agent systems a thing? Yes they are. But.

Yoav Goldberg, Nov 24, 2024

This piece started with a pair of twitter and bluesky posts:

let's talk about "agents" (in the LLM sense). there's a lot of buzz around "multi-agent" systems where agents collaborate but... i don't really get how it differs from a thinking of a single agent with multiple modes of operation. what are the benefits of modeling as multi-agent?
— (((ل()(ل() 'yoav))))👾 (@yoavgo) November 23, 2024

Introduction

Markov Jr. is an open-source C# application that creates procedural content, primarily by applying Markov rewrite rules to a 2D or 3D grid. A rewrite rule has an input pattern and an output pattern: the input specifies what to look for in the existing grid, and the output specifies what to replace it with.

For example, given a 2D grid, this would replace any white dot with a white cross:

***/*W*/*** :: *W*/WWW/*W*

The left-hand side is the rule input, and the right-hand side is the output. The / character delimits rows, and a space delimits Z-layers (in 3D grids). The input rule above translates to the 2D pattern:
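The delimiter convention just described can be sketched in a few lines of Python. This is only an illustration of the notation used in this post, not Markov Jr.'s own parser: the function names `parse_pattern` and `parse_rule` are hypothetical, and the `::` separator is assumed to be the one shown in the example rule above.

```python
def parse_pattern(pattern):
    """Split a pattern into Z-layers (space-separated), each a list of rows (/-separated)."""
    return [layer.split("/") for layer in pattern.split(" ")]


def parse_rule(rule):
    """Split a rule written as 'input :: output' into two parsed patterns."""
    lhs, rhs = (side.strip() for side in rule.split("::"))
    return parse_pattern(lhs), parse_pattern(rhs)


rule_in, rule_out = parse_rule("***/*W*/*** :: *W*/WWW/*W*")

# A 2D rule has a single Z-layer; print its rows one per line.
for row in rule_in[0]:
    print(row)
```

For the rule above, `rule_in` holds one layer with three rows of three cells each. The sketch only splits the text; it does not address row orientation or how individual symbols are interpreted.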

	import argparse
	import random
	import sys

	from transformers import AutoModelForCausalLM, AutoTokenizer, DynamicCache
	import torch

	parser = argparse.ArgumentParser()
	parser.add_argument("question", type=str)
	parser.add_argument(

	<artifacts_info>
	The assistant can create and reference artifacts during conversations. Artifacts are for substantial, self-contained content that users might modify or reuse, displayed in a separate UI window for clarity.

	# Good artifacts are...
	- Substantial content (>15 lines)
	- Content that the user is likely to modify, iterate on, or take ownership of
	- Self-contained, complex content that can be understood on its own, without context from the conversation
	- Content intended for eventual use outside the conversation (e.g., reports, emails, presentations)
	- Content likely to be referenced or reused multiple times