GCSV

GCSV in brief

GCSV is a way of recording common genealogical or historiographical data in a simple form that can be read by both humans and software. GCSV is a line-based data format that resembles everyday writing but is structured so that software can reuse or analyze it without ambiguity.

GCSV is based on the CSV (Comma-Separated Values) file format, which represents tabular data as values separated by commas. CSV is an extremely simple and widespread file format, recognized by a great many programs, such as spreadsheets. As a result, data written by hand in GCSV format can be used immediately by a great many tools.
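As a rough illustration of the point about tool support, the sketch below parses a small hand-written CSV text with Python's standard `csv` module. The column names and records are hypothetical and are not taken from the GCSV format itself; only the general CSV mechanics are shown.

```python
import csv
import io

# Hypothetical hand-written records; these column names are illustrative
# only and are NOT part of any GCSV specification.
SAMPLE = """date,event,person,place
1789-07-14,birth,"Martin, Jeanne",Paris
1812-03-02,marriage,"Durand, Paul",Lyon
"""


def read_records(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


records = read_records(SAMPLE)
for record in records:
    print(record["date"], record["event"], record["person"], record["place"])
```

Quoting a value, as in `"Martin, Jeanne"`, is what lets a field contain a comma without being split into two columns.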

GCSV is so simple that many people with minimal experience in genealogy or historiography should be able to understand this example:

@CharlesNepote
CharlesNepote / nginx_logs2parquet.py
Created December 23, 2024 22:34
Convert nginx logs to parquet file
import argparse
import re
import pandas as pd
import pyarrow as pa
import pyarrow.parquet as pq
from datetime import datetime
#import dateutil
#from dateutil import parser
CharlesNepote / markdown-flavors.md
Created May 2, 2024 13:18 — forked from vimtaai/markdown-flavors.md
Comparison of features in various Markdown flavors

Comparison of syntax extensions in Markdown flavors

I created a crude comparison of the syntax of the various common Markdown extensions to have a better view on what are the most common extensions and what is the most widely accepted syntax for them. The list of Markdown flavors that I looked at was based on the list found on CommonMark's GitHub Wiki.

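| Flavor | Superscript | Subscript | Deletion* / Strikethrough | Insertion* | Highlight* | Footnote | Task list | Table | Abbr | Deflist | Smart typo | TOC | Math | Math Block | Mermaid |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|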
GFM
CharlesNepote / csv2datasette
Last active December 5, 2024 09:06
csv2datasette
#!/usr/bin/env bash
# sudo ln -s "$(pwd)/csv2datasette" /usr/bin/csv2datasette
# csv2datasette is meant to explore CSV data. It is not meant to create a sustainable DB.
# csv2datasette is a bash script which opens CSV files directly in Datasette. It offers
# a number of options for reading and exploring CSV files, such as --stats, inspired by WTFCsv.
#
# `--stats` option includes, for each column: the column name, the number of unique values,
# the number of filled rows, the number of missing values, the minimum value, the maximum value,
# the average, the sum, the shortest string, the longest string, the number of numeric values,
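The per-column statistics listed in the comment above can be sketched in a few lines of Python. This is not how csv2datasette computes them; it is a minimal standard-library approximation, and the function names, the sample data, and the rule that an empty string counts as a missing value are all assumptions made for illustration.

```python
import csv
import io


def _to_number(value):
    """Return value as a float, or None if it cannot be parsed as a number."""
    try:
        return float(value)
    except ValueError:
        return None


def column_stats(text):
    """Compute simple per-column statistics for CSV text with a header row."""
    rows = list(csv.DictReader(io.StringIO(text)))
    stats = {}
    for name in (rows[0].keys() if rows else []):
        values = [row[name] for row in rows]
        # Assumption: empty strings (and short rows) count as missing values.
        filled = [v for v in values if v not in (None, "")]
        numbers = [n for n in map(_to_number, filled) if n is not None]
        stats[name] = {
            "unique": len(set(filled)),
            "filled": len(filled),
            "missing": len(values) - len(filled),
            "min": min(numbers) if numbers else None,
            "max": max(numbers) if numbers else None,
            "average": sum(numbers) / len(numbers) if numbers else None,
            "sum": sum(numbers) if numbers else None,
            "shortest": min(filled, key=len) if filled else None,
            "longest": max(filled, key=len) if filled else None,
            "numeric": len(numbers),
        }
    return stats


# Hypothetical sample data, for illustration only.
SAMPLE = """name,price
apple,1.5
banana,
cherry,3
"""

stats = column_stats(SAMPLE)
for column, values in stats.items():
    print(column, values)
```

Minimum, maximum, average, and sum are computed over the numeric values only, so a text column reports `None` for them rather than raising an error.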
(function() {
  'use strict';
  console.log("DS: start");
  // Part I. -- Add "CSV without limit" link
  // document.getElementsByClassName("export-links")?
  if (document.getElementsByClassName('show-hide-sql')[0] && document.getElementById('sql-editor') !== null) {
    displayLinkCSVWithoutLimit();
  }
CharlesNepote / count.js
Last active December 21, 2022 11:35
Datasette: button to count rows
// Add "count" button aside the "Custom SQL query returning XX rows" title
// Clicking on the button counts and displays total number of rows
// Probably does not play well with complex queries (e.g. sub-queries)
// Test if we're located in a query's page
const hide = document.getElementsByClassName('show-hide-sql')[0] || false;
if (hide) {
  var sql_request = document.getElementById('sql-editor').value;
  // Remove request's comments
# get total requests by status code
awk '{print $9}' /var/log/nginx/access.log | sort | uniq -c | sort -rn
# get top requesters by IP
awk '{print $1}' /var/log/nginx/access.log | sort | uniq -c | sort -rn | head | awk -v OFS='\t' '{"host " $2 | getline ip; print $0, ip}'
# get top requesters by user agent
awk -F'"' '{print $6}' /var/log/nginx/access.log | sort | uniq -c | sort -rn | head
# get top requests by URL
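For comparison, the first awk pipeline above (total requests by status code) can be approximated in Python. This is a sketch under assumptions: it relies on the status code being the ninth whitespace-separated field, as in the awk command, and the function name and sample log lines are hypothetical.

```python
from collections import Counter


def count_status_codes(lines):
    """Count the 9th whitespace-separated field of each line, like awk's $9.

    Unlike the awk pipeline, lines with fewer than nine fields are skipped
    instead of being counted under an empty key.
    """
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) >= 9:
            counts[fields[8]] += 1
    # most_common() sorts by descending count, like `sort -rn` on `uniq -c`.
    return counts.most_common()


# Hypothetical log lines in the common nginx "combined" layout.
SAMPLE_LINES = [
    '203.0.113.5 - - [23/Dec/2024:22:34:00 +0000] "GET /index.html HTTP/1.1" 200 612 "-" "curl/8.0"',
    '203.0.113.9 - - [23/Dec/2024:22:34:01 +0000] "GET /missing HTTP/1.1" 404 153 "-" "curl/8.0"',
    '203.0.113.5 - - [23/Dec/2024:22:34:02 +0000] "GET /about.html HTTP/1.1" 200 845 "-" "curl/8.0"',
]

result = count_status_codes(SAMPLE_LINES)
for status, total in result:
    print(total, status)
```

The function accepts any iterable of lines, so an open file object for `/var/log/nginx/access.log` could be passed in place of the sample list.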
CREATE TABLE IF NOT EXISTS "all" (
[code] TEXT,
[url] TEXT,
[creator] TEXT,
[created_t] INTEGER,
[created_datetime] TEXT,
[last_modified_t] INTEGER,
[last_modified_datetime] TEXT,
[product_name] TEXT,
[abbreviated_product_name] TEXT,
CharlesNepote / export.txt
Last active April 14, 2022 17:54
Open Food Facts import
code
creator
created_t
last_modified_t
product_name
abbreviated_product_name
generic_name
quantity
packaging
packaging_text
CharlesNepote / standard.sh
Last active February 16, 2022 17:57 — forked from hfossli/standard.sh
Standard bash script format
#!/bin/bash
CLEAR='\033[0m'
RED='\033[0;31m'
function usage() {
  if [ -n "$1" ]; then
    echo -e "${RED}👉 $1${CLEAR}\n";
  fi
  echo "Usage: $0 [-n number-of-people] [-s section-id] [-c cache-file]"