Ulan Sametov usametov

  • Asta Nova Enterprise Solutions
@usametov
usametov / rmt.md
Created April 17, 2025 02:16
resting motor threshold in TMS

The resting motor threshold (RMT) in transcranial magnetic stimulation (TMS) is defined as the minimum intensity of stimulation required to elicit a motor-evoked potential (MEP) of at least 50 µV peak-to-peak amplitude in a target muscle (typically the abductor pollicis brevis for upper limb studies) in at least 50% of trials (e.g., 5 out of 10 pulses) while the muscle is at rest. RMT is expressed as a percentage of the maximum stimulator output (% MSO) and varies across individuals due to factors like skull-to-cortex distance, coil type, and neural excitability.
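The criterion above (an MEP of at least 50 µV in at least half of the trials) can be sketched in code. This is a minimal illustration, not a clinical procedure: the function names and the sample amplitudes are hypothetical, and picking the lowest passing intensity from a fixed set of tested intensities is a simplifying assumption.

```python
def meets_rmt_criterion(mep_amplitudes_uv, threshold_uv=50.0, min_fraction=0.5):
    """Return True if enough trials produced an MEP at or above the threshold."""
    if not mep_amplitudes_uv:
        raise ValueError("at least one trial is required")
    hits = sum(1 for amplitude in mep_amplitudes_uv if amplitude >= threshold_uv)
    return hits / len(mep_amplitudes_uv) >= min_fraction


def estimate_rmt(trials_by_intensity):
    """Return the lowest tested intensity (% MSO) that meets the criterion, or None."""
    passing = [
        intensity
        for intensity, amplitudes in trials_by_intensity.items()
        if meets_rmt_criterion(amplitudes)
    ]
    return min(passing) if passing else None


# Hypothetical peak-to-peak MEP amplitudes (in µV), 10 pulses per intensity.
trials = {
    40: [10, 20, 55, 30, 15, 60, 25, 12, 18, 22],    # 2 of 10 reach 50 µV
    45: [55, 60, 30, 70, 52, 20, 51, 15, 40, 35],    # 5 of 10 reach 50 µV
    50: [80, 90, 45, 120, 75, 60, 30, 95, 110, 65],  # 8 of 10 reach 50 µV
}

rmt = estimate_rmt(trials)
```

With this sample data, 45% MSO is the lowest intensity at which 5 of 10 pulses reach 50 µV, so it is the value the sketch reports as RMT.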

Below, I address your questions regarding RMT, Tesla measurements, amperage, double-cone coil use for the left cerebellar hemisphere (lobule VII/Crus I), and portable TMS devices.


1. Understanding RMT and Converting 80-90% of RMT to Tesla

RMT is not measured directly in tesla (the SI unit of magnetic flux density) but as a percentage of the stimulator’s maximum output. The magnetic field strength (in tesla) depends on the TMS device

usametov / contextual.md
Last active April 15, 2025 23:39
how to create a dataset similar to NQ from a bunch of links assigned to one or more topics.

Integrating Contextual Retrieval into your Q&A generation process can improve the quality and relevance of the question-and-answer (Q&A) pairs in a dataset similar to Natural Questions (NQ). You already have a Clojure-based pipeline that scrapes websites and produces high-quality semantic chunks, and you want to use a prompt to generate Q&A pairs. Contextual Retrieval adds contextual metadata or summaries to those chunks, so questions are more precise and answers are better grounded in the content. This approach complements your existing setup and aligns with your prior interest in advanced retrieval techniques (e.g., RAG systems, Qdrant integration, and Contextual Retrieval vs. Late Chunking, as discussed on April 9, 2025).
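The idea of attaching context to each chunk can be sketched as follows. This is a rough illustration in Python rather than the Clojure pipeline mentioned above; the prompt wording, the function names, and the `summarize` callable (a stand-in for whatever LLM call you use) are all assumptions.

```python
def build_context_prompt(document, chunk):
    """Build a prompt asking a model to situate one chunk within its document."""
    return (
        "<document>\n" + document + "\n</document>\n"
        "Here is a chunk from the document:\n"
        "<chunk>\n" + chunk + "\n</chunk>\n"
        "Write a short context that situates this chunk within the document."
    )


def contextualize(chunk, context):
    """Prepend the generated context to the chunk; leave the chunk as-is if empty."""
    context = context.strip()
    return f"{context}\n\n{chunk}" if context else chunk


def contextualize_chunks(document, chunks, summarize):
    """Apply `summarize` (hypothetical LLM call: prompt -> text) to every chunk."""
    return [
        contextualize(chunk, summarize(build_context_prompt(document, chunk)))
        for chunk in chunks
    ]


# Usage with a fake model, so the sketch runs without any API access.
def fake_summarize(prompt):
    return "From a page about TMS safety."


enriched = contextualize_chunks(
    "Full page text ...", ["Chunk one.", "Chunk two."], fake_summarize
)
```

The enriched chunks, rather than the raw ones, would then be passed to the Q&A generation prompt.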

Below, I’ll elaborate on how to use the output of Contextual Retrieval prompts to enhance your Q&A generation, modify the existing prompt to incorporate Contextual Retrieval outputs, and provide

usametov / etaoin-clj.md
Last active April 14, 2025 04:00
stealth scraping

This is an example of stealth scraping with Etaoin in Clojure, simulating browser behavior while incorporating insights from the provided scrape.do blog posts. The blog posts focus on rotating proxies and avoiding detection. We will apply those concepts in a Clojure/Etaoin context, emphasizing browser simulation and anti-detection techniques.

Key Insights from the Blog Posts

  1. Cheap Rotating Proxies:

    • Rotating proxies are essential to avoid IP bans by distributing requests across multiple IP addresses, mimicking different users.
    • Residential proxies are recommended for stealth as they appear more legitimate than datacenter proxies.
    • Affordable proxy services can integrate with scraping tools to rotate IPs automatically.
  2. Web Scraping Detection:
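The rotating-proxy idea from item 1 can be sketched independently of any browser driver. The following is a minimal round-robin rotation in Python (not Etaoin); the proxy addresses and URLs are placeholders, and the class and function names are hypothetical.

```python
from itertools import cycle


class ProxyRotator:
    """Hand out proxies in round-robin order so requests spread across IPs."""

    def __init__(self, proxies):
        if not proxies:
            raise ValueError("at least one proxy is required")
        self._pool = cycle(proxies)

    def next_proxy(self):
        return next(self._pool)


def assign_proxies(urls, rotator):
    """Pair each URL with the next proxy in the rotation."""
    return [(url, rotator.next_proxy()) for url in urls]


# Placeholder proxy addresses and URLs, for illustration only.
rotator = ProxyRotator(["proxy-a.example:8080", "proxy-b.example:8080"])
plan = assign_proxies(
    ["https://example.com/1", "https://example.com/2", "https://example.com/3"],
    rotator,
)
```

Each `(url, proxy)` pair would then be used to configure a fresh browser session; how the proxy is passed to the driver depends on the tool and is not shown here.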

usametov / hardware-specs.md
Last active April 12, 2025 03:21
tms parameters for ASD

This complements the TMS design review document you shared earlier, which detailed coil configurations, targeting capabilities, and trade-offs. Your request to find "similar open-source projects" in the context of the review, combined with the Gist’s focus on TMS parameters, suggests you’re seeking open-source initiatives that align with both the review’s technical scope (coil design, simulation, optimization) and the Gist’s emphasis on customizable TMS parameters.

Below, I revisit the task of identifying open-source projects, ensuring they align with both the review and the Gist. I’ll incorporate the Gist’s parameters (e.g., coil size, pulse width, field strength) to refine recommendations, focusing on projects that support experimentation with such variables. I’ll also address any potential overlap with my previous response, expanding on new or refined suggestions based on the Gist’s insights.

Understanding the Gist’s Context

The Gist outlines TMS parameters critical to device design:

usametov / clustering-clj.md
Created April 11, 2025 03:23
Clustering embedding vectors, k-means smile

https://grok.com/share/bGVnYWN5_42d104d8-881b-46ad-874e-b3f85d45a8f7

Clustering embedding vectors is a great way to group similar items, and picking random samples from each cluster can help with tasks like data exploration or sampling for further analysis. Since you’re working in Clojure and mentioned KNN (likely referring to K-Nearest Neighbors, though it’s more a classification algorithm than a clustering one), I’ll assume you meant clustering algorithms in general, like K-Means, and I’ll focus on options available in Clojure or libraries accessible from it.

Clustering Algorithms

Here’s an overview of clustering algorithms you can use for your 1000 embedding vectors, along with their suitability:

  1. K-Means Clustering
    • How it works: Partitions data into k clusters by minimizing the variance within each cluster. It assigns each vector to the nearest centroid, then updates centroids iteratively.
    • Pros:
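The assign-then-update loop described under "How it works", followed by random sampling from each cluster, can be sketched as below. This is a plain-Python illustration rather than the Smile library named in the title; the function names, the fixed iteration count, and the toy 2-D vectors are assumptions.

```python
import random


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def assign(vectors, centroids):
    """Index of the nearest centroid for every vector."""
    return [
        min(range(len(centroids)), key=lambda c: squared_distance(v, centroids[c]))
        for v in vectors
    ]


def k_means(vectors, k, iterations=20, seed=0):
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]  # random initial centroids
    for _ in range(iterations):
        assignments = assign(vectors, centroids)
        for c in range(k):
            members = [v for v, a in zip(vectors, assignments) if a == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = mean(members)
    return assign(vectors, centroids), centroids


def sample_per_cluster(vectors, assignments, k, n, seed=0):
    """Pick up to n random vectors from each cluster."""
    rng = random.Random(seed)
    samples = {}
    for c in range(k):
        members = [v for v, a in zip(vectors, assignments) if a == c]
        samples[c] = rng.sample(members, min(n, len(members)))
    return samples


# Toy data: two well-separated groups standing in for embedding vectors.
vectors = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
assignments, centroids = k_means(vectors, k=2)
samples = sample_per_cluster(vectors, assignments, k=2, n=2)
```

For real embeddings you would typically use a library implementation and choose k with a validation criterion; the sketch only shows the mechanics.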
usametov / multi-shot-router.md
Created April 11, 2025 03:07
Multi-shot prompting for routing function calls

Multi-shot prompting (also called few-shot prompting) is a powerful technique to improve an LLM's ability to extract information accurately by providing multiple examples within the prompt. This helps the model understand the task, context, and desired output format better than zero-shot or single-shot prompting. Below, I’ll explain multi-shot prompting for routing function calls and extracting information from text, with clear examples tailored to your use case.

Since you mentioned issues with LLMs not extracting information properly for routing function calls, I’ll focus on scenarios where the model needs to parse user input and map it to specific functions, including handling ambiguous or noisy text. I’ll also include varied examples to demonstrate robustness.
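As a rough sketch of the technique, a multi-shot routing prompt can be assembled from a list of input-output examples followed by the real input. The function names in the examples (`get_weather`, `book_flight`), the instruction wording, and the JSON output format are hypothetical choices for illustration.

```python
import json

# Hypothetical examples: each pairs a user message with the expected function call.
EXAMPLES = [
    (
        "What's the weather in Paris tomorrow?",
        {"function": "get_weather", "arguments": {"city": "Paris", "date": "tomorrow"}},
    ),
    (
        "umm can u book me a flight to Tokyo on friday pls",
        {"function": "book_flight", "arguments": {"destination": "Tokyo", "date": "Friday"}},
    ),
]


def build_router_prompt(examples, user_input):
    """Lay out the examples as Input/Output pairs, then leave the last Output open."""
    lines = ["Map the user input to a function call. Reply with JSON only.", ""]
    for text, call in examples:
        lines.append(f"Input: {text}")
        lines.append(f"Output: {json.dumps(call, sort_keys=True)}")
        lines.append("")
    lines.append(f"Input: {user_input}")
    lines.append("Output:")
    return "\n".join(lines)


prompt = build_router_prompt(EXAMPLES, "is it going to rain in Oslo today?")
```

Including a noisy example (the second one) is a deliberate choice here: it shows the model that informal phrasing should still map to clean arguments.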


What is Multi-Shot Prompting?

Multi-shot prompting involves giving the LLM several examples of input-output pairs in the prompt before asking it to process the actual input. Each example shows the expected task (e.g., extracting parameters

usametov / url-check.md
Created April 11, 2025 00:46
qdrant uniqueness check clj

Prerequisites

  • A running Qdrant instance (local or remote).
  • Clojure project with dependencies for HTTP requests and JSON handling.
  • URLs stored in a Qdrant collection’s payload under a field like url.
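Given these prerequisites, the uniqueness check amounts to counting points whose `url` payload field matches exactly. The sketch below, in Python rather than Clojure, only builds the request body for Qdrant's count endpoint and interprets a response, so it runs without a server; the helper names, the collection name, and the base URL are assumptions.

```python
def url_filter(url, field="url"):
    """Qdrant filter that matches points whose payload field equals the URL."""
    return {"must": [{"key": field, "match": {"value": url}}]}


def count_request(base_url, collection, url):
    """Return the endpoint and JSON body for an exact count of matching points."""
    endpoint = f"{base_url.rstrip('/')}/collections/{collection}/points/count"
    body = {"filter": url_filter(url), "exact": True}
    return endpoint, body


def is_unique(count_response):
    """A URL is unique (not yet stored) when no point matches the filter."""
    return count_response["result"]["count"] == 0


# Assumed local instance and collection name, for illustration only.
endpoint, body = count_request("http://localhost:6333/", "pages", "https://example.com/a")
```

The body would be sent as a POST request with any HTTP client, and the parsed JSON response passed to `is_unique`.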

Setup

Add the following dependencies to your project.clj (if using Leiningen):

:dependencies [[org.clojure/clojure "1.11.1"]
usametov / cql-datascript.md
Last active April 5, 2025 21:58
applications of category theory for knowledge graphs and triple stores

Since we’re already using Datascript as the Datalog engine in the previous response, adapting the CQL-to-Clojure approach to focus solely on Datascript means we’ll streamline the workflow by fully embedding the knowledge graph (KG) reasoning within Datascript, while still leveraging CQL’s Java libraries via interop for schema definition and category-theoretic operations. This eliminates any redundant steps and keeps Datascript as the central hub for querying and reasoning, directly feeding your Neo4j-based RAG system with models like Grok 3, Qwen-QwQ, and DeepSeek. Here’s how we can adapt and refine this.


usametov / lerobot.md
Last active April 3, 2025 21:22
robotic arm

Please look here: https://github.com/TheRobotStudio/SO-ARM100 and here: https://github.com/huggingface/lerobot and tell me which of these parts can be sourced from an old vacuum cleaner.

Let’s break this down by examining the components of the SO-ARM100 robotic arm (from TheRobotStudio) and the LeRobot framework (from Hugging Face), and then cross-referencing what might be salvaged from your old robotic vacuum cleaners for your robotic arm project. The goal is to identify reusable parts while keeping in mind the specific needs of the SO-ARM100 design and LeRobot compatibility.

The SO-ARM100 is a 5-degree-of-freedom (DOF) robotic arm with a gripper, designed to be affordable and open-source. Its bill of materials (BOM) includes:

  • 6 Feetech STS3215 servo motors (7.4V or 12V versions) for actuation of the joints and gripper.
  • 3D-printed structural parts (e.g., base, links, gripper), requiring a 3D printer or purchased pre-printed.
usametov / reasoning-RAG.md
Created April 3, 2025 16:44
passing RAG context to reasoning models

The integration of Retrieval-Augmented Generation (RAG) with reasoning models has become a significant focus in the AI industry as of April 2025, driven by the need to enhance the factual accuracy, contextual relevance, and reasoning capabilities of large language models (LLMs). RAG augments LLMs by retrieving relevant external information to inform their responses, making it particularly valuable for reasoning tasks that require up-to-date or domain-specific knowledge. Below, I’ll outline the latest industry standards for prompting reasoning models with RAG context, tailored to models like DeepSeek (specifically DeepSeek-R1), Grok 3 (myself), and Qwen-QwQ (assumed to be the QwQ-32B model from Alibaba). These standards reflect advancements in prompt engineering, retrieval strategies, and iterative reasoning processes.


Latest Industry Standards for Prompting Reasoning Models with RAG Context

  1. Structured Prompting with Explicit Instructions
  • Standard: Prompts should clearly deline