- Use your application extensively to build intuition about failure modes
- Define 3-4 dimensions based on observed or anticipated failures
- Create structured tuples covering your priority failure scenarios
- Generate natural language queries from each tuple using a separate LLM call
- Scale to more examples across your most important failure hypotheses (we suggest at least ~100)
- Test and iterate on the most critical failure modes first, and generate more until you reach theoretical saturation
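The tuple-then-query steps above can be sketched as follows. This is a minimal illustration, not a prescribed implementation: the dimension names and values (`feature`, `persona`, `scenario`) are hypothetical placeholders, and `tuple_to_prompt` only builds the prompt text that would be handed to the separate LLM call, which is left out here.

```python
from itertools import product

# Hypothetical dimensions; replace with ones drawn from your own observed failures.
dimensions = {
    "feature": ["search", "checkout"],
    "persona": ["new user", "power user"],
    "scenario": ["ambiguous request", "missing information"],
}


def build_tuples(dims):
    """Return every combination of dimension values as a dict (one structured tuple each)."""
    keys = list(dims)
    return [dict(zip(keys, combo)) for combo in product(*(dims[k] for k in keys))]


def tuple_to_prompt(t):
    """Build the prompt for the separate LLM call that turns a tuple into a natural query."""
    fields = "; ".join(f"{k}: {v}" for k, v in t.items())
    return f"Write one realistic user query for this situation ({fields}). Return only the query."


tuples = build_tuples(dimensions)
prompts = [tuple_to_prompt(t) for t in tuples]
```

In practice you would filter `tuples` down to the priority failure scenarios before generating queries, rather than sending the full cross product.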
# Project Policy
This policy provides a single, authoritative, and machine-readable source of truth for AI coding agents and humans, ensuring that all work is governed by clear, unambiguous rules and workflows. It aims to eliminate ambiguity, reduce supervision needs, and facilitate automation while maintaining accountability and compliance with best practices.
# 1. Introduction
> Rationale: Sets the context, actors, and compliance requirements for the policy, ensuring all participants understand their roles and responsibilities.
## 1.1 Actors
from datasets import load_dataset
from sentence_transformers import (
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from pylate import losses, models, utils


def main():
    # As ReasonIR does not re-upload the BRIGHT data, we need to load it from the original source
You are Manus, an AI agent created by the Manus team.
You excel at the following tasks:
1. Information gathering, fact-checking, and documentation
2. Data processing, analysis, and visualization
3. Writing multi-chapter articles and in-depth research reports
4. Creating websites, applications, and tools
5. Using programming to solve various problems beyond development
6. Various tasks that can be accomplished using computers and the internet
// Claude Code is a Beta product per Anthropic's Commercial Terms of Service.
// By using Claude Code, you agree that all code acceptance or rejection decisions you make,
// and the associated conversations in context, constitute Feedback under Anthropic's Commercial Terms,
// and may be used to improve Anthropic's products, including training models.
// You are responsible for reviewing any code suggestions before use.
// (c) Anthropic PBC. All rights reserved. Use is subject to Anthropic's Commercial Terms of Service (https://www.anthropic.com/legal/commercial-terms).
// Version: 0.2.9
# the "verifiers" repository is a clean implementation of templated GRPO reinforcement learning training environments
# this is a generic set of "install from scratch" commands complete with a deepspeed z3 config that i have been using when i spin up nodes
# it will run on the gsm8k example w/ default batch size & generation size (8), and the 8th GPU is used for vllm generations
# qwen 14b full finetuning will run on this configuration too without LoRA or CUDA OOM, at least for the gsm8k task's context sizes + generation lengths
# hyperparameters are controlled by `verifiers/utils/config_utils.py`; i have been preferring extreme grad clipping (between 0.001 and 0.01) and low beta (under 0.01)
# NOTE FEB 27: examples have moved into `verifiers/examples` not `/examples`
cd /root
mkdir boom
- Every atomic object has a timeline (TL) of writes:
  - A write is either a store or a read-modify-write (RMW): it read latest write & pushed new one.
  - A write is either tagged Relaxed, Release, or SeqCst.
  - A read observes some write on the timeline:
    - On the same thread, future reads can't go backwards on the timeline.
  - A read is either tagged Relaxed, Acquire, or SeqCst.
  - RMWs can also be tagged Acquire (or AcqRel). If so, the Acquire refers to the "read" portion of "RMW".
- Each thread has its own view of the world:
  - Shared write timelines but each thread could be reading at different points.
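The timeline picture above can be made concrete with a toy simulation. This is only a sketch of the mental model for a single atomic object with Relaxed-style reads: the class `AtomicTimeline` and its methods are hypothetical names, it uses a random choice to stand in for "some write the thread is allowed to observe", and it does not model Release/Acquire synchronization between different objects.

```python
import random


class AtomicTimeline:
    """Toy model of one atomic object: a shared timeline of writes plus per-thread read positions."""

    def __init__(self, initial):
        self.writes = [initial]  # the shared timeline, oldest write first
        self.seen = {}           # thread name -> index of the newest write it has observed

    def store(self, thread, value):
        # A plain store pushes a new write; the writer has now seen its own write.
        self.writes.append(value)
        self.seen[thread] = len(self.writes) - 1

    def rmw(self, thread, update):
        # A read-modify-write always reads the latest write, then pushes a new one.
        old = self.writes[-1]
        self.writes.append(update(old))
        self.seen[thread] = len(self.writes) - 1
        return old

    def load(self, thread, rng):
        # A read may observe any write at or after the thread's current position,
        # so reads on one thread never go backwards on the timeline.
        start = self.seen.get(thread, 0)
        index = rng.randint(start, len(self.writes) - 1)
        self.seen[thread] = index
        return self.writes[index]


rng = random.Random(0)
counter = AtomicTimeline(0)
for _ in range(5):
    counter.rmw("writer", lambda v: v + 1)  # timeline becomes [0, 1, 2, 3, 4, 5]

# The reader may lag behind the writer, but its observed values never decrease.
reader_values = [counter.load("reader", rng) for _ in range(10)]
```

Because every RMW reads the end of the timeline, the five increments can never be lost, while the reader is free to see stale values as long as it only moves forward.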
This is just a quick write-up, mostly for myself, on how to create a Python PyApp package for an air-gapped machine. This means that all dependencies, etc., will be included.
- Download the CPython version that should be used. A list of default versions is in `build.rs` of PyApp
  - Linux default: 3.12.3, linux, x86_64, gnu, v3
  - Windows default: 3.12.3, x86_64, msvc
- Unpack the distribution
Please analyze the following GitHub issue data, which is provided as a JSON object:
{
  "title": "🐛 BUG: WebSocket typing doesn't work in apps that also pull in DOM types",
  "body": "Which Cloudflare product(s) does this pertain to?",
}
Provide a response with the following structure:
<json>
# THIS LINUX SETUP SCRIPT HAS MORPHED INTO A WHOLE PROJECT: HTTPS://OMAKUB.ORG
# PLEASE CHECKOUT THAT PROJECT INSTEAD OF THIS OUTDATED SETUP SCRIPT.
#
#
# Libraries and infrastructure
sudo apt update -y
sudo apt install -y \
  docker.io docker-buildx \
  build-essential pkg-config autoconf bison rustc cargo clang \