Nobu C. Shirai nobucshirai

@nobucshirai
nobucshirai / author_year_renamer.py
Last active February 23, 2025 09:17
Academic Paper Renamer: Extracts text, scans for DOI/arXiv IDs, retrieves metadata from Crossref or arXiv, and renames the file systematically.
#!/usr/bin/env python3
"""
Rename a PDF based on the DOI/arXiv ID extracted from its content and author information retrieved via the Crossref or arXiv APIs.
"""
import argparse
import os
import sys
import re
import requests
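
The identifier scan described above might look roughly like the following sketch. The regular expressions, the helper names `find_identifier` and `build_filename`, and the `Author_Year.pdf` naming scheme are all assumptions made for illustration; the gist's actual patterns and output format are not shown in this preview. The network lookup against Crossref or arXiv is left out.

```python
import re

# Hypothetical patterns; the gist's own regular expressions are not shown here.
DOI_RE = re.compile(r"\b(10\.\d{4,9}/[^\s\"<>]+)")
ARXIV_RE = re.compile(r"arXiv:(\d{4}\.\d{4,5})(v\d+)?", re.IGNORECASE)


def find_identifier(text):
    """Return ("doi", id) or ("arxiv", id) for the first match, else None."""
    m = DOI_RE.search(text)
    if m:
        # Strip punctuation that often trails a DOI at the end of a sentence.
        return ("doi", m.group(1).rstrip(".,;)"))
    m = ARXIV_RE.search(text)
    if m:
        # Drop the version suffix (e.g. "v2") and keep the bare identifier.
        return ("arxiv", m.group(1))
    return None


def build_filename(family_name, year, ext=".pdf"):
    """Build an assumed "Author_Year.pdf" name, dropping unsafe characters."""
    safe = re.sub(r"[^\w\-]", "", family_name)
    return f"{safe}_{year}{ext}"
```

Checking for a DOI before an arXiv ID is a design choice of this sketch, not something the description specifies.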
@nobucshirai
nobucshirai / img2pdf.py
Last active February 18, 2025 02:22
Image to PDF Converter: Easily combine multiple images into a single PDF file. Just provide image paths and an optional output name. Perfect for quick document assembly tasks.
#!/usr/bin/env python3
"""
Merge image files into a single PDF, optionally annotating images with their filenames using ImageMagick.
The grid layout is controlled by the number of rows and columns.
Use the --with-text flag to enable filename annotation (default is without text).
You can adjust the annotation font size by using the --font-scale option.
"""
import argparse
import os
@nobucshirai
nobucshirai / timestamp_renamer.py
Created February 13, 2025 02:46
Timestamp-based Renaming: A script that renames files by adding a timestamp derived from their last modification time.
#!/usr/bin/env python3
import os
import re
import argparse
from datetime import datetime
def timestamp_renamer(files, custom_prefix=None):
    for file_path in files:
        # Extract the file extension and convert it to lowercase
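
Since the preview stops before the renaming logic, here is a minimal sketch of how a timestamp-based rename could work. The `YYYYMMDD_HHMMSS_` format, the helper names `timestamped_name` and `rename_with_timestamp`, and the assumption that `custom_prefix` replaces the original file stem are all hypothetical; the gist may behave differently.

```python
import os
from datetime import datetime


def timestamped_name(file_path, mtime, custom_prefix=None):
    """Return the new path for file_path given its modification time.

    Assumed format: <YYYYMMDD_HHMMSS>_<stem or custom_prefix><lowercased ext>.
    """
    directory, filename = os.path.split(file_path)
    stem, ext = os.path.splitext(filename)
    label = custom_prefix if custom_prefix is not None else stem
    new_name = f"{mtime.strftime('%Y%m%d_%H%M%S')}_{label}{ext.lower()}"
    return os.path.join(directory, new_name)


def rename_with_timestamp(file_path, custom_prefix=None):
    """Rename file_path using its last modification time (local time)."""
    mtime = datetime.fromtimestamp(os.path.getmtime(file_path))
    target = timestamped_name(file_path, mtime, custom_prefix)
    if os.path.exists(target):
        # Refuse to overwrite an existing file.
        raise FileExistsError(target)
    os.rename(file_path, target)
    return target
```

Keeping the name computation separate from the filesystem call makes the naming rule easy to check without touching any files.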
@nobucshirai
nobucshirai / リハーサルモードの注意点.md
Created February 12, 2025 20:59
Points to watch out for in PowerPoint settings, as listed by ChatGPT 4o on request
title: Notes on Rehearsal Mode
create_time: 2025-02-12 21:50:52 -0800
update_time: 2025-02-12 21:54:58 -0800
conversation_id: 67ad09ac-7f98-800f-abeb-1bc1993db4b1

Notes on Rehearsal Mode

Creation Time: 2025-02-13 05:50:52

@nobucshirai
nobucshirai / annotated_pages_extractor.py
Created February 12, 2025 03:38
PDF Annotation Extractor – This script processes PDF files, detecting and extracting only the pages that contain annotations. Useful for reviewing highlighted or commented content.
#!/usr/bin/env python3
"""
Extract annotated pages from PDF files.
This script reads one or more PDF files, checks each page for annotations,
and writes a new PDF containing only those pages that contain annotations.
If an output filename is not provided, the script uses the input file's basename
but adds "_extracted" before the ".pdf" extension.
Before overwriting an existing file, the user is prompted for confirmation.
@nobucshirai
nobucshirai / clipboard_util.py
Created February 11, 2025 02:56
Clipboard Utility for Text Files – A Python script that reads one or more text files, formats them with headers and footers, and optionally copies the output to the clipboard (macOS only).
#!/usr/bin/env python3
"""
This script reads one or more specified text files and outputs their content with a header and footer.
If the header/footer symbol "~~~" is found in a file, alternative symbols are used.
It also provides an option to save the combined output to the clipboard (macOS only).
"""
import argparse
import subprocess
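
The header/footer behaviour described in the docstring could be sketched as follows. Only the `~~~` symbol comes from the docstring; the fallback symbols, the header layout, and the function names `pick_fence`, `format_file`, and `copy_to_clipboard` are assumptions for illustration. `pbcopy` is the standard macOS clipboard command.

```python
import subprocess

# "~~~" is the symbol named in the docstring; the fallbacks are assumptions.
FENCES = ["~~~", "===", "+++"]


def pick_fence(content):
    """Return the first candidate symbol that does not occur in content."""
    for fence in FENCES:
        if fence not in content:
            return fence
    raise ValueError("every candidate symbol occurs in the content")


def format_file(name, content):
    """Wrap content in a header line (symbol + file name) and a footer line."""
    fence = pick_fence(content)
    body = content if content.endswith("\n") else content + "\n"
    return f"{fence} {name}\n{body}{fence}\n"


def copy_to_clipboard(text):
    """Send text to the macOS clipboard via pbcopy (macOS only)."""
    subprocess.run(["pbcopy"], input=text.encode("utf-8"), check=True)
```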
@nobucshirai
nobucshirai / xlsx_to_csv.py
Last active February 18, 2025 07:21
A Python script to convert .xlsx files to .csv with UTF-8 encoding, supporting optional output file specification and overwrite protection.
#!/usr/bin/env python3
"""
Convert an Excel (.xlsx) file to a CSV (UTF-8) file.
Usage:
    python3 xlsx_to_csv.py [--method {pandas,soffice}] input.xlsx [output.csv]
Options:
    -h, --help    Show this help message and exit.
@nobucshirai
nobucshirai / ocr.sh
Last active February 1, 2025 01:22 — forked from rok-git/ocr.sh
A shell script that performs OCR on images and PDFs using the macOS built-in OCR engine
#!/bin/bash
SCRIPTNAME=$(basename "$0")
function realpath () {
    f=$@;
    if [ -d "$f" ]; then
        base="";
        dir="$f";
    else
        base="/$(basename "$f")";
@nobucshirai
nobucshirai / 生成AI活用部会名称案_2024_1029.md
Created January 7, 2025 06:57
A record of how candidate names for the generative AI utilization working group were considered. The name adopted was "Generative AI Utilization Testing Initiative (GAUTI)" (生成AI活用検証イニシアティブ).
title: Name Proposals for the Generative AI Utilization Working Group
create_time: 2024-10-29 11:32:41 -0700
update_time: 2024-10-29 11:42:09 -0700
conversation_id: 6720abb9-02ac-800f-bcd6-f8d06477dc84

Name Proposals for the Generative AI Utilization Working Group

Creation Time: 2024-10-29 18:32:41

@nobucshirai
nobucshirai / toggle_dir.py
Last active June 3, 2024 00:28
`toggle_dir.py` is a simple Python script that toggles between two parallel directory trees, `src/` and `tests/`. It checks the current working directory: if it is inside a `src/` directory, it switches to the corresponding `tests/` directory, and vice versa. Typical usage: cd `toggle_dir.py`
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import os
import sys
def switch_directory():
    current_path = os.getcwd()
    base_dir = os.path.abspath(current_path)
    if "src" in base_dir.split(os.sep):
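
The preview ends before the swap itself, so the following is only a sketch of how the `src/` to `tests/` toggle could be computed. The function name `toggle_path` and the choice to swap the last matching path component are assumptions; the gist may pick a different component or handle missing directories differently.

```python
import os


def toggle_path(path):
    """Swap the last "src" or "tests" component of path for its counterpart.

    Returns path unchanged when neither component is present.
    """
    swap = {"src": "tests", "tests": "src"}
    parts = path.split(os.sep)
    # Walk from the deepest component upwards (an assumption of this sketch).
    for i in range(len(parts) - 1, -1, -1):
        if parts[i] in swap:
            parts[i] = swap[parts[i]]
            return os.sep.join(parts)
    return path


if __name__ == "__main__":
    # Print the target so a shell can use it, e.g. via command substitution.
    print(toggle_path(os.getcwd()))
```

Because a child process cannot change its parent shell's working directory, printing the path and letting the shell run `cd` on it is one plausible way to wire this up.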