Andrielson Ferreira da Silva

Implementing Vector Embedding Processing with Python UDFs in Snowflake

This document outlines the design and implementation of a Snowflake-native system for processing and generating vector embeddings using Python UDFs. It replaced a distributed EC2-based solution, reduced operational overhead, and enabled scalable, SQL-driven vector workflows within our analytics platform.

Problem statement

Our team needed to run operations on embedding vectors (e.g., similarity scoring, normalization, distance calculations) across billions of records stored in our Snowflake database.

However, the existing workflow, which relied on distributed Python scripts running on EC2 and ingesting CSVs from S3, was operationally brittle, inefficient, and difficult to maintain.
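
To make the vector operations named above concrete, here is a minimal sketch in plain Python of normalization, cosine similarity, and Euclidean distance. The function names (`normalize`, `cosine_similarity`, `euclidean_distance`) are hypothetical and chosen for illustration only. The document does not show the actual UDF handlers or how they are registered in Snowflake, so this should be read as an assumption about their general shape rather than the implementation.

```python
import math
from typing import List, Sequence


def _check_same_length(a: Sequence[float], b: Sequence[float]) -> None:
    """Raise if the two vectors do not have the same dimensionality."""
    if len(a) != len(b):
        raise ValueError(f"dimension mismatch: {len(a)} != {len(b)}")


def l2_norm(v: Sequence[float]) -> float:
    """Euclidean (L2) length of a vector."""
    return math.sqrt(sum(x * x for x in v))


def normalize(v: Sequence[float]) -> List[float]:
    """Scale a vector to unit length; a zero vector cannot be normalized."""
    norm = l2_norm(v)
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors, in the range [-1, 1]."""
    _check_same_length(a, b)
    denom = l2_norm(a) * l2_norm(b)
    if denom == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return sum(x * y for x, y in zip(a, b)) / denom


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Straight-line distance between two vectors."""
    _check_same_length(a, b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

Each function takes plain sequences of floats and returns a float or a list. That kind of scalar-in, scalar-out shape is generally easy to wrap as a scalar UDF handler, though the real handlers may differ.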

Cody AI Prompts

Generate Requirements for an Idea: I have an idea for a new software project. If I haven't already given my idea to you, ask me to provide it before proceeding.

Using my project idea, generate a set of core requirements. Keep the requirements simple and focused on the aspects that are most likely to impact the architecture.

Present the list of requirements in an organized manner. Categorize as appropriate. Give each core feature a unique number I can use later to refer back to that feature.

Give me this in markdown format and offer to let me insert your generated content into a new file.

Sequential Prompts from Original RAG with aider and Claude 3.5 Sonnet Tutorial

LangChain RAG App Prompts

Initial RAG App Setup

Prompt: Use the LangChain RAG tutorial documentation, which I provided to you previously, to generate a simple RAG app in Python that uses the LangChain v0.2 framework. The app will allow the user to provide the URL for a web page and ask questions related to the contents of the page. The user interface will be the command line for now. The app should use OpenAI models and APIs. Generate all the required files, functions, methods, imports, etc. but don't implement any functions yet. Instead, just insert comments. Also, generate the Python requirements.txt file. Only implement the features I've requested.

	import {
	  Inject,
	  Injectable,
	  InjectionToken,
	  ValueProvider,
	} from "@angular/core";
	import { BehaviorSubject, fromEvent, Observable, Subject } from "rxjs";
	import { map, takeUntil } from "rxjs/operators";

	const MENU_STATE_ID = "UNIQUE_MENU_STATE_ID";

	<?php

	define('ALFABETO', 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/');

	function tetrasexagesimal(int $numero): string {
	    $retorno = "";
	    while ($numero >= 0) {
	        if ($numero === 0) {
	            // Zero maps to the first symbol of ALFABETO ('A'); prepend it and stop.
	            $retorno = "A{$retorno}";
	            break;