david l euler davideuler

Code Bash command prefix detection

This defines risk levels for actions that the ${K4} agent may take. This classification system is part of a broader safety framework and is used to determine when additional user confirmation or oversight may be needed.

Command prefix extraction examples

Examples:

cat foo.txt => cat
cd src => cd

Which GGUF is right for me? (Opinionated)

Good question! I am collecting human data on how quantization affects outputs. See here for more information: ggml-org/llama.cpp#5962

In the meantime, use the largest that fully fits in your GPU. If you can comfortably fit Q4_K_S, try using a model with more parameters.
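
The rule of thumb above can be sketched as a small selection function. Everything here is an assumption for illustration: the function name, the file sizes, and the 1 GB of headroom reserved for context and buffers are made up, so substitute the real file sizes of the model you are considering.

```python
def pick_largest_fitting(quant_sizes_gb, vram_gb, headroom_gb=1.0):
    """Return the name of the largest quant that fits in VRAM, or None.

    quant_sizes_gb: mapping of quant name -> file size in GB (hypothetical data).
    headroom_gb: memory reserved for context and buffers (an assumed value).
    """
    budget = vram_gb - headroom_gb
    fitting = {name: size for name, size in quant_sizes_gb.items() if size <= budget}
    if not fitting:
        return None
    # Among the quants that fit, prefer the largest file.
    return max(fitting, key=fitting.get)


# Made-up sizes, for illustration only.
example_sizes = {"Q4_K_S": 4.1, "Q5_K_M": 5.1, "Q8_0": 7.7}
print(pick_largest_fitting(example_sizes, vram_gb=8.0))
```

If even the smallest quant does not fit, the function returns `None`, which is the cue to pick a model with fewer parameters instead.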

llama.cpp feature matrix

See the wiki upstream: https://github.com/ggerganov/llama.cpp/wiki/Feature-matrix

	import os
	import json
	import aiohttp
	import datetime
	from contextlib import asynccontextmanager
	from typing import AsyncIterator
	from mcp.server.fastmcp import FastMCP, Context
	from mcp.server import Server
	from swarm import Swarm, Agent
	from duckduckgo_search import DDGS

	You are Manus, an AI agent created by the Manus team.

	You excel at the following tasks:
	1. Information gathering, fact-checking, and documentation
	2. Data processing, analysis, and visualization
	3. Writing multi-chapter articles and in-depth research reports
	4. Creating websites, applications, and tools
	5. Using programming to solve various problems beyond development
	6. Various tasks that can be accomplished using computers and the internet

	# train_grpo.py
	#
	# See https://github.com/willccbb/verifiers for ongoing developments
	#
	"""
	citation:

	@misc{brown2025grpodemo,
	title={Granular Format Rewards for Eliciting Mathematical Reasoning Capabilities in Small Language Models},
	author={Brown, William},

	from diffusers import FluxPipeline, AutoencoderKL
	from diffusers.image_processor import VaeImageProcessor
	from transformers import T5EncoderModel, T5TokenizerFast, CLIPTokenizer, CLIPTextModel
	import torch
	import gc


	def flush():
	    gc.collect()
	    torch.cuda.empty_cache()

	from collections import defaultdict
	import random
	from huggingface_hub import hf_hub_download
	from datasets import Dataset
	import numpy as np
	import pandas as pd
	from transformers import AutoTokenizer
	from rich.console import Console
	from rich.table import Table
	from trl import DPOTrainer

	from transformers import AutoModelForCausalLM, AutoTokenizer, StaticCache
	import torch
	from typing import Optional
	device = "cuda"

	# Copied from the gpt-fast repo
	def multinomial_sample_one_no_sync(probs_sort): # Does multinomial sampling without a cuda synchronization
	    q = torch.empty_like(probs_sort).exponential_(1)
	    return torch.argmax(probs_sort / q, dim=-1, keepdim=True).to(dtype=torch.int)
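
The sampler above relies on a standard fact: if each probability is divided by an independent Exponential(1) draw, the argmax lands on index i with probability p_i. Below is a standard-library sketch that checks this empirically. The function name, the probabilities, and the sample count are illustrative choices, not part of the original snippet.

```python
import random
from collections import Counter


def sample_via_exponential_race(probs, rng):
    """Pick an index with probability probs[i] using the argmax(p / q) trick."""
    # q_i ~ Exponential(1), drawn independently for each entry.
    ratios = [p / rng.expovariate(1.0) for p in probs]
    # The index of the largest ratio is distributed as Categorical(probs).
    return max(range(len(probs)), key=ratios.__getitem__)


rng = random.Random(0)
probs = [0.1, 0.2, 0.7]
n = 20000
counts = Counter(sample_via_exponential_race(probs, rng) for _ in range(n))
freqs = [counts[i] / n for i in range(len(probs))]
print(freqs)  # empirical frequencies, expected to be close to probs
```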

	#!/bin/bash

	### steps ####
	# verify the system has a cuda-capable gpu
	# download and install the nvidia cuda toolkit and cudnn
	# setup environmental variables
	# verify the installation
	###

	### to verify your gpu is cuda-enabled, check