Anchen mzbac

Install packages:

pip install open-webui mlx-lm
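
A quick way to confirm the install step worked is to ask Python's packaging metadata which distributions are present. This is a minimal sketch using only the standard library; the helper name `missing_packages` is made up for illustration and is not part of Open WebUI or MLX LM.

```python
from importlib import metadata


def missing_packages(names):
    """Return the distribution names from `names` that are not installed."""
    missing = []
    for name in names:
        try:
            # Raises PackageNotFoundError when the distribution is absent.
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


# The two distributions named in the install command above.
print(missing_packages(["open-webui", "mlx-lm"]))
```

An empty list means both distributions were found in the active environment.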

Start Open WebUI server:

Test Time Scaling with MLX LM and R1-based LLMs

Install MLX LM:

pip install mlx-lm

And run:

On every machine in the cluster, install openmpi and mlx-lm:

conda install conda-forge::openmpi
pip install -U mlx-lm

Next, download the pipeline-parallel run script to the same path on every machine:

	import FoundationModels
	import Playgrounds
	import Foundation

	let session = LanguageModelSession()
	let start = Date()
	let response = try await session.respond(to: "What is Apple Neural Engine and how to use it?")
	let responseText = response.content // The generated text returned by the session.
	print(responseText)
	let end = Date()

	# train_grpo.py
	#
	# See https://github.com/willccbb/verifiers for ongoing developments
	#
	"""
	citation:

	@misc{brown2025grpodemo,
	title={Granular Format Rewards for Eliciting Mathematical Reasoning Capabilities in Small Language Models},
	author={Brown, William},

	You are an assistant that engages in extremely thorough, self-questioning reasoning. Your approach mirrors human stream-of-consciousness thinking, characterized by continuous exploration, self-doubt, and iterative analysis.

	## Core Principles

	1. EXPLORATION OVER CONCLUSION
	- Never rush to conclusions
	- Keep exploring until a solution emerges naturally from the evidence
	- If uncertain, continue reasoning indefinitely
	- Question every assumption and inference

	# Required packages:
	# pip install SpeechRecognition mlx-whisper pyaudio
	# Note: This script requires Apple Silicon Mac for MLX Whisper

	import speech_recognition as sr
	import numpy as np
	import mlx_whisper

	r = sr.Recognizer()
	mic = sr.Microphone(sample_rate=16000)
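
The microphone above is opened at 16 kHz, and speech models generally expect floating-point samples in the range [-1, 1) rather than raw 16-bit PCM bytes. The following is a dependency-free sketch of that conversion, assuming little-endian signed 16-bit input. The helper name `pcm16_to_floats` is hypothetical and is not part of SpeechRecognition or MLX Whisper.

```python
import array
import struct
import sys


def pcm16_to_floats(raw):
    """Convert little-endian signed 16-bit PCM bytes to floats in [-1.0, 1.0)."""
    samples = array.array("h")
    # frombytes raises ValueError if len(raw) is not a multiple of 2.
    samples.frombytes(raw)
    if sys.byteorder == "big":
        # array uses native byte order, so swap on big-endian hosts.
        samples.byteswap()
    # 32768 is 2**15, the magnitude of the most negative int16 value.
    return [s / 32768.0 for s in samples]


# Three hand-made samples: silence, half scale, and full negative scale.
raw = struct.pack("<3h", 0, 16384, -32768)
print(pcm16_to_floats(raw))  # [0.0, 0.5, -1.0]
```

In a NumPy-based script the same scaling is usually done on an array instead of a list; the list version here only keeps the example free of third-party imports.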

	"""
	A minimal, fast example generating text with Llama 3.1 in MLX.

	To run, install the requirements:

	pip install -U mlx transformers fire

	Then generate text with:

	python l3min.py "How tall is K2?"

	from transformers import AutoModelForCausalLM, AutoTokenizer
	from peft import PeftModel
	import torch

	import os
	import argparse

	def get_args():
	    parser = argparse.ArgumentParser()
	    parser.add_argument("--base_model_name_or_path", type=str)
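
The parser above is cut off after its first argument. As a rough sketch of how such a function is typically finished and exercised, the version below returns the parsed namespace and accepts an optional argument list so it can be called without touching `sys.argv`. The `--output_dir` flag, its default, and the model name in the usage line are assumptions added for illustration; they do not come from the original script.

```python
import argparse


def get_args(argv=None):
    """Parse command-line options; `argv=None` falls back to sys.argv[1:]."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--base_model_name_or_path", type=str)
    # Hypothetical flag, not in the original: where a result would be saved.
    parser.add_argument("--output_dir", type=str, default="merged_model")
    return parser.parse_args(argv)


# Example call with an explicit argument list (placeholder model name).
args = get_args(["--base_model_name_or_path", "my-org/my-base-model"])
print(args.base_model_name_or_path, args.output_dir)
```

Passing `argv` explicitly keeps the function easy to check in isolation, while calling `get_args()` with no arguments behaves like a normal command-line entry point.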

	Docker Image : pytorch/pytorch
	Image Runtype : jupyter_direc ssh_direc ssh_proxy
	Environment : [["JUPYTER_DIR", "/"], ["-p 41654:41654", "1"]]

	pip install torch bitsandbytes sentencepiece "protobuf<=3.20.2" git+https://github.com/huggingface/transformers flask python-dotenv Flask-HTTPAuth accelerate

	!mv /opt/conda/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda116.so /opt/conda/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so