Andreas Motl amotl

  • $PYTHONPATH
@amotl
amotl / cratedb_climate_data_import.py
Last active April 10, 2026 23:10
Import a CSV file with special features into CrateDB
"""
### About
Task: Import a CSV file with special features into CrateDB.
Source: https://guided-path.s3.us-east-1.amazonaws.com/demo_climate_data_export.csv
The program applies two transformations before the data is ready for importing into
CrateDB. For the import procedure, it uses a performance path with pandas/SQLAlchemy.
- Convert coordinates in JSON list format to WKT POINT format.
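The coordinate conversion named in the description could look roughly like the following. This is a minimal sketch, not the gist's actual code: the function name `json_point_to_wkt` is hypothetical, and it assumes each value is a JSON list of exactly two numbers, already in the order WKT expects (x/longitude first, then y/latitude).

```python
import json


def json_point_to_wkt(value: str) -> str:
    """Convert a JSON list such as "[13.4, 52.5]" into "POINT (13.4 52.5)".

    Assumption: the list already holds [x, y], i.e. [longitude, latitude].
    The order is preserved as-is, not swapped.
    """
    coords = json.loads(value)
    # Reject anything that is not a two-element list of plain numbers.
    if (
        not isinstance(coords, list)
        or len(coords) != 2
        or not all(isinstance(c, (int, float)) and not isinstance(c, bool) for c in coords)
    ):
        raise ValueError(f"Expected a JSON list of two numbers, got: {value!r}")
    x, y = coords
    return f"POINT ({x} {y})"


print(json_point_to_wkt("[13.4, 52.5]"))
```

With pandas, a function like this could be applied to the coordinate column (for example via `Series.map`) before handing the frame to the import step.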
amotl / cratedb_percentile_with_args.py
Last active March 31, 2026 13:30
Recipe to reproduce `UnsupportedFunctionException[Invalid arguments in: percentile(doc.t03.timestamp_ms, $1) with (bigint, undefined)`
import sqlalchemy as sa
def workload():
"""
Install
uv pip install sqlalchemy-cratedb psycopg2-binary
Invoke
python cratedb_percentile_with_args.py
amotl / cratedb_groupby_function.py
Created March 31, 2026 07:33
Recipe to reproduce `SQLParseException['floor((timestamp_ms / $1))' must appear in the GROUP BY clause or be used in an aggregation function.`
import sqlalchemy as sa
def workload():
"""
Install
uv pip install sqlalchemy-cratedb psycopg2-binary
Invoke
python cratedb_groupby_function.py
amotl / cratedb_functionname_nested.py
Created March 31, 2026 07:20
Recipe to reproduce `UnsupportedFeatureException[Function FunctionName{schema='null', name='count'}(text) is not a scalar function.]`
import sqlalchemy as sa
def workload():
"""
Install
uv pip install sqlalchemy-cratedb psycopg2-binary
Invoke
python cratedb_functionname_nested.py
amotl / demo.env
Created July 17, 2025 10:48
Evaluate `source-bash` of xonsh
TESTDRIVE=foobar
amotl / cratedb-orjson.py
Created July 9, 2025 23:49
Probe JSON serialization with `crate-python`
"""
## About
`crate-python` uses `orjson` for JSON serialization.
## Errors
- TypeError: Type is not JSON serializable: numpy.ndarray
- TypeError: Type is not JSON serializable: recarray
amotl / cratedb-cloud-mongodb-cdc.md
Created May 16, 2025 18:02
How do I optimally synchronize data between MongoDB and CrateDB?

To keep MongoDB and CrateDB synchronized, use a Change Data Capture (CDC) integration, which CrateDB Cloud offers as a managed feature. It keeps your MongoDB data continuously synchronized with a table in CrateDB. Here’s a concise guide:


1. Use CrateDB Cloud’s MongoDB CDC Integration

CrateDB Cloud’s MongoDB CDC integration (a preview feature, see docs) continuously imports and syncs data from MongoDB (e.g., MongoDB Atlas) using Change Streams.
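To illustrate what Change Streams deliver, here is a minimal sketch of how a consumer might apply change events to a target table. It is not how CrateDB Cloud implements its integration: the function `apply_change_event` and the in-memory `table` dict are hypothetical stand-ins, and only the event field names (`operationType`, `documentKey`, `fullDocument`, `updateDescription`) follow MongoDB's change event format. Nested (dotted) field paths in updates are deliberately not handled.

```python
from typing import Any


def apply_change_event(table: dict[Any, dict], event: dict) -> None:
    """Apply one MongoDB-style change event to an in-memory table keyed by _id."""
    op = event["operationType"]
    key = event["documentKey"]["_id"]
    if op in ("insert", "replace"):
        # The full document is the new state of the row.
        table[key] = dict(event["fullDocument"])
    elif op == "update":
        # Updates carry only the delta: changed fields and removed fields.
        row = table.setdefault(key, {"_id": key})
        description = event["updateDescription"]
        row.update(description.get("updatedFields", {}))
        for field in description.get("removedFields", []):
            row.pop(field, None)
    elif op == "delete":
        table.pop(key, None)


# Hypothetical sample events, shaped like change stream output.
events = [
    {"operationType": "insert", "documentKey": {"_id": 1},
     "fullDocument": {"_id": 1, "city": "Berlin", "temp": 21.5}},
    {"operationType": "update", "documentKey": {"_id": 1},
     "updateDescription": {"updatedFields": {"temp": 23.0}, "removedFields": ["city"]}},
    {"operationType": "insert", "documentKey": {"_id": 2},
     "fullDocument": {"_id": 2, "city": "Vienna"}},
    {"operationType": "delete", "documentKey": {"_id": 2}},
]

table: dict[Any, dict] = {}
for event in events:
    apply_change_event(table, event)
print(table)
```

Because every event is keyed by `_id`, replaying the same event twice leaves the table unchanged, which is the property that lets a CDC pipeline resume after an interruption without duplicating rows.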

Key Features:

amotl / compose.yml
Created May 6, 2025 18:21
Miniature rig for evaluating cratedb-cockpit on a non-root URL
services:
  nginx:
    image: nginx:1.27
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
    ports:
      - "8080:80"
    restart: unless-stopped
amotl / minigeocode.py
Created May 4, 2025 19:00
Miniature geocoder using a dedicated instance of Nominatim.
#!/usr/bin/env python3
# /// script
# requires-python = ">=3.9"
# dependencies = [
#   "click",
#   "geopandas",
#   "geopy",
# ]
# ///
"""
amotl / uv_run_stuck_mcp.py
Last active April 3, 2025 19:02
Problem with `uv run` not running to completion.
#!/usr/bin/env python3
"""
Prerequisite:
docker run --rm --name=cratedb \
  --publish=4200:4200 --publish=5432:5432 \
  --env=CRATE_HEAP_SIZE=2g crate/crate:nightly \
  -Cdiscovery.type=single-node
Variants: