Yudhiesh Ravindranath yudhiesh

@willccbb
willccbb / grpo_demo.py
Last active May 21, 2025 14:12
GRPO Llama-1B
# train_grpo.py
#
# See https://github.com/willccbb/verifiers for ongoing developments
#
"""
citation:
@misc{brown2025grpodemo,
title={Granular Format Rewards for Eliciting Mathematical Reasoning Capabilities in Small Language Models},
author={Brown, William},
@bewestphal
bewestphal / _description.md
Last active August 25, 2024 04:03
Ray Serve Unit Testing Examples

Ray Serve Unit Test Examples

1. test_deployment_class.py

Demonstrates how to test a Ray Serve deployment class directly, without spinning up a Ray cluster.

The key attribute, func_or_class, gives access to the underlying class. Normally the @serve.deployment decorator replaces the class with a Ray Serve deployment object, so the original class cannot be instantiated directly.

2. test_deployment_requests.py

import random
from metaflow import FlowSpec, step, S3, Flow, Parameter, profile, kubernetes, conda, conda_base
# change columns according to your schema (or remove column list to load all)
COLUMNS = ['VendorID', 'tpep_pickup_datetime', 'tpep_dropoff_datetime']
# group parquet files as 1GB batches
def shard_data(src, batch_size=1_000_000_000):
    with S3() as s3:
        objs = s3.list_recursive([src])
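The preview above stops right after the listing call, so how the files are actually grouped is not shown. As a minimal sketch only, assuming a greedy, in-order grouping over (key, size) pairs (for example, pairs built from each listed object), the "1GB batches" idea could look like this. The helper name `batch_by_size` and the sample sizes are hypothetical.

```python
from typing import Iterable, Iterator, List, Tuple


def batch_by_size(
    objects: Iterable[Tuple[str, int]], batch_size: int = 1_000_000_000
) -> Iterator[List[str]]:
    """Greedily group keys so each batch stays within batch_size bytes.

    Hypothetical helper, not taken from the gist. A single object larger
    than batch_size ends up in a batch of its own.
    """
    batch: List[str] = []
    total = 0
    for key, size in objects:
        # Close the current batch if this object would push it over the limit.
        if batch and total + size > batch_size:
            yield batch
            batch, total = [], 0
        batch.append(key)
        total += size
    if batch:
        yield batch


# Made-up sizes, with a small limit so the grouping is easy to follow.
sizes = [("a.parquet", 600), ("b.parquet", 500), ("c.parquet", 300), ("d.parquet", 2000)]
batches = list(batch_by_size(sizes, batch_size=1000))
```

Each resulting batch could then be handed to one parallel task, which is presumably why the original groups the files in the first place.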
@tuulos
tuulos / s3dir.py
Created March 10, 2023 06:43
Sync full directories to/from S3
import os
from metaflow import S3
def put_dir(local_root, s3root):
    root = os.path.abspath(local_root)
    objs = []
    for p, _, files in os.walk(root):
        for f in files:
            path = os.path.join(p, f)
            key = os.path.relpath(path, start=root)
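The rest of put_dir is cut off in this preview. The key derivation it starts with can be tried without any S3 access. The sketch below assumes the loop collects (key, path) pairs for a later upload step; the function name `local_keys` and the temporary files are hypothetical.

```python
import os
import tempfile


def local_keys(local_root):
    """Return sorted (key, path) pairs for every file under local_root.

    Hypothetical helper: the key is the file path relative to the root,
    mirroring the os.path.relpath call in the preview above.
    """
    root = os.path.abspath(local_root)
    pairs = []
    for p, _, files in os.walk(root):
        for f in files:
            path = os.path.join(p, f)
            key = os.path.relpath(path, start=root)
            pairs.append((key, path))
    return sorted(pairs)


# Build a tiny directory tree and derive the keys for it.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "sub"))
    for name in ("a.txt", os.path.join("sub", "b.txt")):
        with open(os.path.join(tmp, name), "w") as fh:
            fh.write("x")
    keys = [key for key, _ in local_keys(tmp)]
```

Note that os.path.relpath uses the platform's path separator, so on Windows the keys would contain backslashes and would likely need normalizing before being used as S3 keys.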
@miohtama
miohtama / example.py
Created March 26, 2021 10:51
Solidity and Ethereum int256 for Python, SQLAlchemy and SQL Databases, efficiently as 32 bytes blobs
class LiquidityChanged(TransactionEvent):
    """A sampled liquidity at any moment."""
    __tablename__ = "liquidity"
    delta0 = sa.Column(Int257, nullable=False, index=False)
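The definition of the custom column type is not part of this preview. As a rough illustration of the "32 bytes blob" idea in the description, a signed 256-bit integer can be packed with Python's built-in int.to_bytes. This is a sketch assuming big-endian two's complement; the gist's own encoding may differ, and both function names are hypothetical.

```python
def int256_to_bytes(value: int) -> bytes:
    """Pack a signed 256-bit integer into exactly 32 bytes.

    Assumes big-endian two's complement. Raises OverflowError if the
    value does not fit in 256 bits.
    """
    return value.to_bytes(32, byteorder="big", signed=True)


def bytes_to_int256(blob: bytes) -> int:
    """Inverse of int256_to_bytes."""
    if len(blob) != 32:
        raise ValueError("expected exactly 32 bytes")
    return int.from_bytes(blob, byteorder="big", signed=True)


# Round-trip the boundary values of the int256 range.
samples = [0, 1, -1, 2**255 - 1, -(2**255)]
round_trip = [bytes_to_int256(int256_to_bytes(v)) for v in samples]
```

One caveat with this particular encoding: byte-wise comparison of the blobs does not match numeric order once negative values are involved, so sorting or range queries in SQL would need extra care.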
@kklemon
kklemon / iterable_dataset_dist.py
Last active February 24, 2025 06:16
PyTorch IterableDataset implementation with multiprocessing and distributed training support
import torch
import torch.distributed as dist
import torch.multiprocessing as mp
from torch.utils.data import IterableDataset, DataLoader
class DistributedIterableDataset(IterableDataset):
    """
    Example implementation of an IterableDataset that handles both multiprocessing (num_workers > 0)
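The class body is truncated here, so the gist's actual sharding logic is not visible. A common way to handle both cases is to give every (rank, worker) pair its own interleaved slice of the stream. The sketch below shows only that index arithmetic in plain Python, as an assumption rather than the gist's method. In a real dataset, `rank` and `world_size` would typically come from torch.distributed, and `worker_id` and `num_workers` from torch.utils.data.get_worker_info().

```python
from itertools import islice


def shard(items, rank, world_size, worker_id, num_workers):
    """Return the interleaved slice of items for one (rank, worker) pair.

    Hypothetical helper: every process/worker combination gets a unique
    shard id and takes every total_shards-th item starting at that id.
    """
    total_shards = world_size * num_workers
    shard_id = rank * num_workers + worker_id
    return islice(items, shard_id, None, total_shards)


# Two processes with two DataLoader workers each, over ten items.
data = range(10)
parts = [
    list(shard(data, rank, 2, worker_id, 2))
    for rank in range(2)
    for worker_id in range(2)
]
```

With this scheme no item is seen twice, but shards can differ in length by one item, which matters for distributed training loops that expect every rank to run the same number of steps.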
@den-crane
den-crane / convert not replicated MV to replicated.sql
Last active May 20, 2025 06:08
convert not replicated MV to replicated
CREATE TABLE test12345 (A Int64) ENGINE = MergeTree ORDER BY A;
CREATE MATERIALIZED VIEW test12345mv Engine=MergeTree order by A
AS SELECT A FROM test12345;
insert into test12345 select number from numbers(10);
#### maintenance / load stop
rename table test12345 to test12345_old;
@jefftriplett
jefftriplett / python-django-postgres-ci.yml
Last active March 27, 2024 04:27
This is a good starting point for getting Python, Django, Postgres running as a service, pytest, black, and pip caching rolling with GitHub Actions.
name: CI
on: [push]
jobs:
  test:
    runs-on: ubuntu-latest
    services:
@ddelange
ddelange / airflow_slack_notifications.md
Last active November 16, 2023 16:57
Airflow Slack notifications

Airflow Slack notifications

Installation

Make sure slackclient v1.3.1 is installed (for apache-airflow 1.10).

pip install -U "apache-airflow[slack,...]"
@mayankcpdixit
mayankcpdixit / install-kafka-mac.md
Last active April 19, 2022 02:25
Install Kafka in local (mac)

Install Kafka on your local Mac

Run the following commands:

brew install kafka
sudo mkdir -p /usr/local/var/run/zookeeper/data
sudo chmod 777 /usr/local/var/run/zookeeper/data
zkServer start

mkdir -p /usr/local/var/lib/kafka-logs