Brad Hilton bradhilton

  • Ender Research Corp
@bradhilton
bradhilton / easy.md
Last active March 3, 2025 18:23
Temporal Clue Example Puzzles

Easy Difficulty Puzzle

On a dark winter night, wealthy and enigmatic Mr. John Q. Boddy hosted a small, but lavish, dinner party for some of his closest associates. However, the night ended in tragedy when Mr. Boddy was found dead in one of the rooms of Tudor Mansion in the early hours of the morning. The following persons of interest have been identified as suspects:

  • Professor Plum
  • Colonel Mustard

And the following weapons were found on the premises:

  • Poison
bradhilton / base.md
Last active March 3, 2025 17:52
Responses

User:

On a dark winter night, wealthy and enigmatic Mr. John Q. Boddy hosted a small, but lavish, dinner party for some of his closest associates. However, the night ended in tragedy when Mr. Boddy was found dead in one of the rooms of Tudor Mansion in the early hours of the morning. The following persons of interest have been identified as suspects:

  • Mr. Green
  • Colonel Mustard

And the following weapons were found on the premises:

  • Rope
  • Lead Pipe

Reinforcement Learning Report

On Monday, February 10th, we set out to train a SOTA model with reinforcement learning in two weeks' time. Now, a little over two weeks later, we unfortunately have not achieved that goal. However, we may have trained a SOTA open-source model on a niche but challenging logical task. Here I hope to document some of the novel challenges and solutions I found in the process.

Getting Started

I started by building some modest infrastructure to make benchmarking, collecting distillation data, and generating samples easier while keeping it flexible. I also wanted to observe and monitor inference progress easily, so I pulled in and enhanced utility code I had previously written that consumes a chat completion stream, lets me monitor chunks, and builds up a standard OpenAI ChatCompletion object. Using the on_chunk callback, we can both observe token consumption in real time and stream the results to log files for local inspection. Th
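
The stream-consumption pattern described above can be sketched roughly as follows. This is a minimal illustration and not the author's utility: `Chunk`, `Completion`, `consume_stream`, and `fake_stream` are hypothetical stand-ins, so the sketch runs without the OpenAI client. Only the `on_chunk` callback name comes from the text, and its two-argument signature here is an assumption.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, Callable, Optional


@dataclass
class Chunk:
    """Hypothetical stand-in for one streamed chat completion chunk."""
    content: Optional[str] = None
    finish_reason: Optional[str] = None


@dataclass
class Completion:
    """Hypothetical stand-in for the assembled completion object."""
    content: str = ""
    finish_reason: Optional[str] = None


async def consume_stream(
    stream: AsyncIterator[Chunk],
    on_chunk: Optional[Callable[[Chunk, Completion], None]] = None,
) -> Completion:
    """Fold streamed chunks into one completion, notifying on_chunk each step."""
    completion = Completion()
    async for chunk in stream:
        if chunk.content:
            completion.content += chunk.content
        if chunk.finish_reason is not None:
            completion.finish_reason = chunk.finish_reason
        if on_chunk is not None:
            # The callback sees both the new chunk and the state so far,
            # which is enough to count tokens or append to a log file.
            on_chunk(chunk, completion)
    return completion


async def fake_stream() -> AsyncIterator[Chunk]:
    """Illustrative stream; a real one would come from the API client."""
    for piece in ["Colonel ", "Mustard"]:
        yield Chunk(content=piece)
    yield Chunk(finish_reason="stop")


def run_demo():
    seen = []  # snapshots of the accumulated text after each chunk
    result = asyncio.run(
        consume_stream(fake_stream(), on_chunk=lambda c, acc: seen.append(acc.content))
    )
    return result, seen
```

Passing the accumulated state to the callback alongside the raw chunk is one possible design choice; it lets a logger write the partial response without keeping its own buffer.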

bradhilton / consume_chat_completion_stream.py
Last active February 11, 2025 00:02
AsyncStream[ChatCompletionChunk] -> ChatCompletion
from openai import AsyncStream
from openai.types.chat.chat_completion import ChatCompletion, Choice, ChoiceLogprobs
from openai.types.chat.chat_completion_chunk import ChatCompletionChunk
from openai.types.chat.chat_completion_message import (
    ChatCompletionMessage,
    FunctionCall,
)
from openai.types.chat.chat_completion_message_tool_call import (
    ChatCompletionMessageToolCall,
    Function,
)

Atreides RL

Atreides RL is a reinforcement fine-tuning service that allows companies and developers to easily train custom LLMs.

bradhilton / temporal-clue.md
Last active December 23, 2024 19:33
Temporal Clue Puzzle

Murder Mystery Puzzle: The Tragedy at Tudor Mansion

On a dark winter night, wealthy and enigmatic Mr. John Q. Boddy hosted a small, but lavish, dinner party for some of his closest associates. However, the night ended in tragedy when Mr. Boddy was found dead in one of the rooms of Tudor Mansion in the early hours of the morning.

Suspects

The following persons of interest have been identified as suspects:

  • Madame Rose
  • Monsieur Brunette
  • Mrs. White
class CpSolver:
    def __init__(self, game: Clue, max_solve_time_per_turn: float) -> None:
        self.model = cp_model.CpModel()
        self.vars = np.array(
            [
                [
                    self.model.new_bool_var(f"Player {player + 1} has '{card}'")
                    for player in range(game.num_players)
                ]
bradhilton / gpt.py
Last active November 30, 2023 22:58
GPT magic functions
import codecs
from IPython import get_ipython # type: ignore
from IPython.core.magic import register_line_cell_magic
from IPython.display import clear_output, display, Markdown, update_display # type: ignore
from openai import OpenAI
from openai.types.chat import ChatCompletionMessageParam
from openai.types.chat.completion_create_params import Function
import os
import re
import requests
// Retrieves and recasts the body of an enum
func body<T, U>(of value: inout T) -> U {
    return withUnsafePointer(to: &value) {
        $0.withMemoryRebound(to: U.self, capacity: 1) {
            $0.pointee
        }
    }
}
struct Todo {
    let title: String
    let completed: Bool
}
let storage = PostgreSqlTable<Todo>()
let app = CRUDApp<Todo>(storage: storage)()
let router = RESTAppRouter(app: app)
let server = HTTPSServer(router: router)
server.start()