Rishub C R (Craftsman), CraftsMan-Labs
Group Relative Policy Optimization (GRPO): A Comprehensive Guide

Group Relative Policy Optimization (GRPO) is a reinforcement learning algorithm designed to improve large language models (LLMs) on reasoning tasks. Rather than training a separate value network, it samples a group of responses per prompt and scores each response relative to the others in its group. This guide explains the GRPO process with detailed diagrams and step-by-step explanations.


Main GRPO Workflow

The core GRPO process is depicted as a circular workflow with five key stages:
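The group-relative advantage at the heart of that loop can be sketched in a few lines; the function name is illustrative, not from any specific library:

```python
from statistics import mean, pstdev

def group_relative_advantages(rewards):
    """Score each sampled response relative to its own group:
    A_i = (r_i - mean(r)) / std(r). No learned value network needed."""
    mu, sigma = mean(rewards), pstdev(rewards)
    sigma = sigma or 1e-8  # guard against an all-identical group
    return [(r - mu) / sigma for r in rewards]
```

Because the advantages are centered on the group mean, above-average responses get positive weight and below-average ones negative weight, by construction summing to zero per group.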

import os
import base64
import json
import re
from typing import List, Dict, Any, Optional, Union, Type, TypeVar
from pydantic import BaseModel, Field
from pathlib import Path

# Type variable for Pydantic models
T = TypeVar("T", bound=BaseModel)
Multi-Agent Collaborative Debate System Using LiteLLM

This system leverages multiple language models—each with its own persona—to collaboratively debate and solve complex problems. It uses a sparse communication topology and iterative summarization to ensure efficient, coherent, and robust problem-solving.

Core System Architecture

%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#5D8AA8', 'primaryTextColor': '#fff', 'primaryBorderColor': '#7C0000', 'lineColor': '#F8B229', 'secondaryColor': '#006100', 'tertiaryColor': '#202020' }}}%%
flowchart TB
    subgraph "Multi-Agent Collaborative Debate System"
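Although the diagram above is truncated, the sparse communication topology it describes can be sketched in plain Python. A ring topology and string joining are stand-ins here; the real system would route each message through a per-persona litellm call:

```python
def ring_neighbors(i, n):
    """Sparse ring topology: agent i exchanges messages only with its
    two ring neighbours instead of all n - 1 peers."""
    return [(i - 1) % n, (i + 1) % n]

def debate_round(views):
    """One debate round: each agent sees its own view plus its
    neighbours' views and merges them. String joining stands in for
    the LLM summarization step (hypothetical)."""
    n = len(views)
    return [" | ".join([views[i]] + [views[j] for j in ring_neighbors(i, n)])
            for i in range(n)]
```

Keeping the topology sparse bounds the number of cross-agent messages per round to O(n) rather than O(n²), which is what makes iterative summarization tractable as the agent count grows.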

Building a RAG Preprocessor with Enhanced Context Summaries

This comprehensive guide details how to create a Python notebook for preprocessing documents in Retrieval Augmented Generation (RAG) systems with improved context handling. The notebook focuses on enhancing chunk quality by adding summaries of surrounding content and implementing strategic overlap between chunks.

Understanding RAG Preprocessing Challenges

Retrieval Augmented Generation has emerged as a powerful technique for combining the knowledge retrieval capabilities of vector databases with the generative abilities of large language models. However, standard chunking approaches often fail to maintain semantic continuity between document segments, leading to context loss and diminished response quality[6]. Creating intelligent chunks with appropriate context is essential for effective RAG systems.
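A minimal sketch of the strategic-overlap chunking described above (sizes are in characters for simplicity; a real notebook would more likely chunk by tokens):

```python
def chunk_with_overlap(text, size=200, overlap=50):
    """Split text into fixed-size chunks where each chunk repeats the
    last `overlap` characters of its predecessor, so a retrieval hit
    near a boundary still carries its surrounding context."""
    assert 0 <= overlap < size
    step = size - overlap
    return [text[start:start + size]
            for start in range(0, max(len(text) - overlap, 1), step)]
```

The summaries-of-surrounding-content step would then prepend an LLM-generated digest of the neighbouring chunks to each chunk before embedding.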

When documents are chunked without considering their broader context, the retrieval process can surface relevant text fragments that lack the

Building a Self-Questioning Reasoning Framework with DSPy, LiteLLM, and Pydantic

This comprehensive guide presents a detailed implementation of a self-questioning reasoning framework based on the arXiv paper 2502.05078v1, which emphasizes stream-of-consciousness thinking patterns. The implementation leverages DSPy for structured prompting and optimization, LiteLLM for language model integration, and Pydantic for robust data modeling. The resulting Jupyter notebook provides a flexible system that emulates human-like exploratory reasoning with complete traceability of thought processes.
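As a hedged sketch of how the "complete traceability of thought processes" might be modeled with Pydantic, consider the following; the class and field names are hypothetical, not taken from the paper:

```python
from typing import List, Optional
from pydantic import BaseModel, Field

class ThoughtStep(BaseModel):
    question: str                              # the self-posed question
    reasoning: str                             # stream-of-consciousness answer
    confidence: float = Field(ge=0.0, le=1.0)  # validated to [0, 1]

class ReasoningTrace(BaseModel):
    problem: str
    steps: List[ThoughtStep] = Field(default_factory=list)
    answer: Optional[str] = None               # filled in once reasoning converges
```

Each self-questioning iteration appends a `ThoughtStep`, so the full exploratory path remains inspectable after the answer is produced.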

Understanding the Building Blocks

DSPy: A New Paradigm for LLM Programming

DSPy represents a significant advancement in how developers interact with large language models. Unlike traditional prompting frameworks, DSPy introduces a declarative programming model that separates the specification of what a language model should do from how it accomplishes that task[2]. This separation allows developers to define high-le


FinSphere-Inspired Conversational Stock Analysis Notebook

Introduction
Current financial LLMs often lack depth in stock analysis and struggle to incorporate up-to-date data. The FinSphere research proposes a solution by combining real-time data access, quantitative tools, and an LLM to produce professional-grade stock analysis ([2501.12399] FinSphere: A Conversational Stock Analysis Agent Equipped with Quantitative Tools based on Real-Time Database). This notebook demonstrates a simplified FinSphere-inspired pipeline for conversational stock analysis, integrating several components:

  • Pydantic for defining and validating structured request/response data models.
  • Pinecone as a vector database to store and retrieve relevant stock data (e.g. technical indicators, fundamentals, recent news).
  • LiteLLM for interacting with a language model to generate insights from the
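A sketch of the Pydantic request/response models mentioned in the first bullet; all class and field names here are hypothetical, not from the FinSphere paper:

```python
from typing import List
from pydantic import BaseModel, Field

class StockAnalysisRequest(BaseModel):
    ticker: str
    question: str
    top_k: int = Field(default=5, ge=1)  # vector-store hits to retrieve

class StockAnalysisResponse(BaseModel):
    ticker: str
    summary: str
    sources: List[str] = Field(default_factory=list)  # ids of retrieved context
```

Validating both ends of the pipeline this way keeps malformed user input and malformed LLM output from propagating into the analysis.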
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch_geometric.nn import GCNConv, global_mean_pool
from torch_geometric.data import Data
from rdkit import Chem
import numpy as np

# 1. Molecular graph construction
def molecule_to_graph(smiles):
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"Invalid SMILES: {smiles}")
    # Node features: atomic number of each atom
    x = torch.tensor([[a.GetAtomicNum()] for a in mol.GetAtoms()], dtype=torch.float)
    # Each undirected bond becomes two directed edges
    edges = [(b.GetBeginAtomIdx(), b.GetEndAtomIdx()) for b in mol.GetBonds()]
    edge_index = torch.tensor(edges + [(j, i) for i, j in edges], dtype=torch.long).t().contiguous()
    return Data(x=x, edge_index=edge_index)
CraftsMan-Labs / 1-quantum-agent-manager.md
Created February 18, 2025, forked from ruvnet/1-quantum-agent-manager.md
Quantum Agent Manager is a quantum-inspired task scheduling system

Quantum Agent Manager

Introduction

What if you could instantly see all the best solutions to a complex reasoning problem all at once? That’s the problem I’m trying to solve with Quantum Task Manager. Traditional AI approaches like reinforcement learning struggle with interconnected decision-making because they evaluate actions sequentially, step by step. But quantum computing can consider all possibilities simultaneously, making it an ideal tool for agent-based task allocation.

Using Azure Quantum, this system leverages pure mathematical optimization and quantum principles to find the best way to distribute tasks among autonomous agents. Most people don’t fully understand how quantum computing works, but in simple terms, it can represent and evaluate every possible task assignment at the same time, using superposition and interference to amplify the best solutions and discard bad ones. This makes it fundamentally different from other scheduling or learning-based approaches.
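To make the contrast concrete, the classical version of this search evaluates every task-to-agent assignment one tuple at a time and keeps the cheapest; a quantum optimizer explores the same space via superposition and interference. The cost matrix below is hypothetical:

```python
from itertools import product

def best_assignment(cost):
    """Exhaustively score every task-to-agent assignment and keep the
    cheapest. cost[t][a] is the (hypothetical) cost of giving task t
    to agent a; the search space grows as n_agents ** n_tasks, which
    is exactly why sequential evaluation breaks down at scale."""
    n_tasks, n_agents = len(cost), len(cost[0])
    return min(product(range(n_agents), repeat=n_tasks),
               key=lambda a: sum(cost[t][a[t]] for t in range(n_tasks)))
```

The exponential blow-up of this brute-force loop is the bottleneck the quantum formulation is meant to sidestep.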

What makes