@arynyklas
Last active February 20, 2025 17:40
GitHub Actions workflow to track updates of files in the data folder

Check files updates workflow

This workflow periodically runs a Python script that checks for file updates, then commits any resulting changes under data/ to a separate data branch.

Triggers

  • Scheduled: Every 30 minutes
  • Manual: via workflow_dispatch
  • Push: on the main branch
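
To make the schedule concrete, here is a minimal sketch of how the minute field of the cron expression `*/30 * * * *` expands. The helper `expand_minute_field` is a hypothetical name used only for illustration; it handles just the three simple forms shown and is not a full cron parser.

```python
def expand_minute_field(field: str) -> list[int]:
    """Expand a cron minute field ("*", "*/N", or a single number) into minutes."""
    if field == "*":
        # Every minute of the hour.
        return list(range(60))
    if field.startswith("*/"):
        # Step syntax: every N-th minute, starting at minute 0.
        step = int(field[2:])
        return list(range(0, 60, step))
    # A single literal minute.
    return [int(field)]


print(expand_minute_field("*/30"))  # [0, 30]
```

So the workflow is queued at minute 0 and minute 30 of every hour. GitHub evaluates schedules in UTC, and scheduled runs may start later than the nominal time when the service is busy.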

Steps

  1. Checkout Repository:
    Uses actions/checkout@v4 with a personal access token (PAT_TOKEN) and fetch-depth: 0 to fetch the full history.

  2. Set Up Python:
    Uses actions/setup-python@v4 to install Python 3.10.

  3. Branch Management:
    Switches to the data branch (creating it if it does not exist) and copies files from the main branch, skipping every path listed in .checkout_exclude (by default, .github).

  4. Environment Setup:
    Creates a virtual environment and installs dependencies from requirements.txt.

  5. Run Script:
    Runs the update-check script with python -m src inside the virtual environment, then removes the top-level __pycache__ directory.

  6. Commit Changes:
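    Configures Git, removes every top-level entry not listed in .keep (by default .git, .github, data, and .keep), and stages the result. If anything under data/ changed, commits and pushes to the data branch; otherwise exits without committing.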
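
The script itself is not part of this gist, so the following is only a sketch of one way such a script could update files under data/. The function name `write_if_changed` and the use of SHA-256 digests are assumptions made for illustration, not the author's implementation.

```python
import hashlib
from pathlib import Path


def write_if_changed(path: Path, new_content: bytes) -> bool:
    """Write new_content to path only when it differs from what is on disk.

    Returns True if the file was created or rewritten, False if it was
    already up to date.
    """
    if path.exists():
        old_digest = hashlib.sha256(path.read_bytes()).hexdigest()
        new_digest = hashlib.sha256(new_content).hexdigest()
        if old_digest == new_digest:
            return False
    # Create data/ (or any other parent directory) on first run.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(new_content)
    return True
```

A script built this way could log which files actually changed. The workflow does not depend on that return value: Git compares file contents itself, so the commit step fires only when something under data/ really differs.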

Customization

  • Ensure PAT_TOKEN is set in your repository secrets; the token must be allowed to push to the repository.
  • Update requirements.txt as needed.
  • Modify branch names if desired.
name: Check files updates

on:
  schedule:
    - cron: "*/30 * * * *" # Runs every 30 minutes
  workflow_dispatch:
  push:
    branches:
      - main

jobs:
  check-updates:
    runs-on: self-hosted
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
        with:
          token: ${{ secrets.PAT_TOKEN }}
          fetch-depth: 0

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: "3.10"

      - name: Switch to data branch
        run: |
          git checkout data || git checkout -b data

      - name: Copy files from main branch
        run: |
          if [ ! -f .checkout_exclude ]; then
            echo ".github" > .checkout_exclude
          fi
          # debug
          echo "Files to checkout exclude:"
          cat .checkout_exclude
          GIT_CHECKOUT_EXCLUDES=$(awk '{printf " :(exclude)%s", $0}' .checkout_exclude)
          git checkout main -- . $GIT_CHECKOUT_EXCLUDES

      - name: Create virtual environment
        run: python -m venv .venv

      - name: Install dependencies
        run: |
          source .venv/bin/activate
          pip install -r requirements.txt

      - name: Run script
        run: |
          source .venv/bin/activate
          python -m src
          rm -rf __pycache__

      - name: Commit changes if any
        run: |
          git config --global user.email "[email protected]"
          git config --global user.name "GitHub Action"
          if [ ! -f .keep ]; then
            echo ".git" > .keep
            echo ".github" >> .keep
            echo "data" >> .keep
            echo ".keep" >> .keep
          fi
          # debug
          echo "Files to keep:"
          cat .keep
          echo "All files:"
          ls -la
          EXCLUDES=$(awk '{printf "! -name \"%s\" ", $0}' .keep)
          echo "Excludes: $EXCLUDES"
          eval "find . -maxdepth 1 -mindepth 1 $EXCLUDES -exec rm -rf {} +"
          # debug
          echo "All files after removal:"
          ls -la
          git add -A
          if git diff-index --quiet HEAD -- data && git diff --cached --quiet HEAD -- data; then
            echo "No changes in data/ folder; exiting."
            exit 0
          else
            echo "Changes detected in data/ folder; continuing."
          fi
          git remote set-url origin https://${{ secrets.PAT_TOKEN }}@github.com/${{ github.repository }}.git
          git commit -m "Update tracked files"
          git push origin data
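
The two awk one-liners in the workflow turn plain lists of names into command arguments: .checkout_exclude becomes Git pathspecs of the form `:(exclude)<name>`, and .keep becomes a set of names that the find/rm cleanup must not delete. The sketch below mirrors that logic in Python to make the intent easier to follow. The function names `pathspec_excludes` and `prune_top_level` are hypothetical, and unlike `find -name`, this version matches exact names only and does not interpret glob patterns.

```python
import shutil
import tempfile
from pathlib import Path


def pathspec_excludes(patterns):
    """Build one ':(exclude)<pattern>' pathspec per non-empty line."""
    return [f":(exclude){p}" for p in patterns if p]


def prune_top_level(root: Path, keep: set[str]) -> list[str]:
    """Delete every top-level entry of root whose name is not in keep.

    Returns the names that were removed, in sorted order.
    """
    removed = []
    for entry in sorted(root.iterdir()):
        if entry.name in keep:
            continue
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()
        removed.append(entry.name)
    return removed


if __name__ == "__main__":
    print(pathspec_excludes([".github"]))  # [':(exclude).github']

    # Demonstrate the cleanup on a throwaway directory, never on a real checkout.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "data").mkdir()
        (root / "src").mkdir()
        (root / "requirements.txt").write_text("# placeholder\n")
        (root / ".keep").write_text(".git\n.github\ndata\n.keep\n")
        keep = set((root / ".keep").read_text().split())
        print(prune_top_level(root, keep))  # ['requirements.txt', 'src']
```

With the default .keep, the data branch ends up containing only .github, data, and .keep (plus the .git directory), which is why the source code and requirements.txt copied from main never get committed there.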