Skip to content

Instantly share code, notes, and snippets.

View jasbir-singh's full-sized avatar

jasbir-singh

View GitHub Profile
@jasbir-singh
jasbir-singh / README.openai-structured-output-demo.md
Created October 15, 2024 12:29 — forked from dannguyen/README.openai-structured-output-demo.md
A basic test of OpenAI's Structured Output feature against financial disclosure reports and a newspaper's police blotter. Code examples use the Python SDK and pydantic for the schema definition.

Extracting financial disclosure reports and police blotter narratives using OpenAI's Structured Output

tl;dr this demo shows how to call OpenAI's gpt-4o-mini model, provide it with URL of a screenshot of a document, and extract data that follows a schema you define. The results are pretty solid even with little effort in defining the data — and no effort doing data prep. OpenAI's API could be a cost-efficient tool for large scale data gathering projects involving public documents.

OpenAI announced Structured Outputs for its API, a feature that allows users to specify the fields and schema of extracted data, and guarantees that the JSON output will follow that specification.

For example, given a Congressional financial disclosure report, with assets defined in a table like this:

@jasbir-singh
jasbir-singh / README.openai-structured-output-demo.md
Created October 15, 2024 12:29 — forked from dannguyen/README.openai-structured-output-demo.md
A basic test of OpenAI's Structured Output feature against financial disclosure reports and a newspaper's police blotter. Code examples use the Python SDK and pydantic for the schema definition.

Extracting financial disclosure reports and police blotter narratives using OpenAI's Structured Output

tl;dr this demo shows how to call OpenAI's gpt-4o-mini model, provide it with URL of a screenshot of a document, and extract data that follows a schema you define. The results are pretty solid even with little effort in defining the data — and no effort doing data prep. OpenAI's API could be a cost-efficient tool for large scale data gathering projects involving public documents.

OpenAI announced Structured Outputs for its API, a feature that allows users to specify the fields and schema of extracted data, and guarantees that the JSON output will follow that specification.

For example, given a Congressional financial disclosure report, with assets defined in a table like this:

@jasbir-singh
jasbir-singh / clocktable-by-tag.el
Created December 18, 2023 20:59 — forked from nonducor/clocktable-by-tag.el
Emacs org-mode dynamic block similar to clocktable, but grouped by tag. See: https://stackoverflow.com/questions/70568361/org-mode-review-clocked-time-by-multiple-tags
(require 'org-clock)
(defun clocktable-by-tag/shift-cell (n)
(let ((str ""))
(dotimes (i n)
(setq str (concat str "| ")))
str))
(defun clocktable-by-tag/insert-tag (files params)
(let ((tag (plist-get params :tags))

Rails naming conventions

General Ruby conventions

Class names are CamelCase.

Methods and variables are snake_case.

Methods with a ? suffix will return a boolean.

#!/bin/bash
# This script will migrate schema and data from a SQLite3 database to PostgreSQL.
# Schema translation based on http://stackoverflow.com/a/4581921/1303625.
# Some column types are not handled (e.g blobs).
#
# See also:
# - http://stackoverflow.com/questions/4581727/convert-sqlite-sql-dump-file-to-postgresql
# - https://gist.github.com/bittner/7368128
@jasbir-singh
jasbir-singh / service-objects.md
Created April 19, 2023 15:02 — forked from blaix/service-objects.md
Martin Fowler on Service Objects via the Ruby Rogues Parley mailing list

On Tue, Mar 12, 2013 at 1:26 PM, Martin Fowler [email protected] wrote:

The term pops up in some different places, so it's hard to know what it means without some context. In PoEAA I use the pattern Service Layer to represent a domain-oriented layer of behaviors that provide an API for the domain layer. This may or may not sit on top of a Domain Model. In DDD Eric Evans uses the term Service Object to refer to objects that represent processes (as opposed to Entities and Values). DDD Service Objects are often useful to factor out behavior that would otherwise bloat Entities, it's also a useful step to patterns like Strategy and Command.

It sounds like the DDD sense is the sense I'm encountering most often. I really need to read that book.

The conceptual problem I run into in a lot of codebases is that rather than representing a process, the "service objects" represent "a thing that does the process". Which sounds like a nitpicky difference, but it seems to have a real impact on how people us