CHANG-NING TSAI crazyguitar

@crazyguitar
crazyguitar / m256.cc
Created October 5, 2021 20:18 — forked from kaityo256/m256.cc
//----------------------------------------------------------------------
#include <stdio.h>
#include <emmintrin.h>
#include <immintrin.h>
//----------------------------------------------------------------------
// Print the four double lanes of an AVX register, lowest lane first.
void
printm256(__m256d r){
  double *a = (double*)(&r);
  printf("%f %f %f %f\n",a[0],a[1],a[2],a[3]);
}
@ih2502mk
ih2502mk / list.md
Last active April 27, 2025 14:15
Quantopian Lectures Saved
@mcarilli
mcarilli / nsight.sh
Last active April 23, 2025 01:26
Favorite nsight systems profiling commands for Pytorch scripts
# This isn't supposed to run as a bash script; I named it with ".sh" for syntax highlighting.
# https://developer.nvidia.com/nsight-systems
# https://docs.nvidia.com/nsight-systems/profiling/index.html
# My preferred nsys (command line executable used to create profiles) commands
#
# In your script, write
# torch.cuda.nvtx.range_push("region name")
# ...
@crazyguitar
crazyguitar / mapread.c
Created January 17, 2020 16:43 — forked from marcetcheverry/mapread.c
mmap and read/write string to file
#include <stdlib.h>
#include <stdio.h>
#include <stdint.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/mman.h>
#include <unistd.h>
int main(int argc, const char *argv[])
{
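
For comparison, the same idea (map a file into memory, write a string through the mapping, then read it back) can be sketched with Python's standard `mmap` module. This is a minimal illustration rather than the gist's C code: the helper names `mmap_write` and `mmap_read` are hypothetical, and the sketch assumes a non-empty string because a zero-length file cannot be mapped.

```python
import mmap


def mmap_write(path, text):
    """Write text to path through a memory mapping (hypothetical helper)."""
    data = text.encode("utf-8")
    if not data:
        raise ValueError("cannot map a zero-length file")
    with open(path, "w+b") as f:
        # The file must already have its final size before it is mapped.
        f.truncate(len(data))
        with mmap.mmap(f.fileno(), len(data)) as m:
            m[:] = data   # copy the bytes into the mapped region
            m.flush()     # push the changes back to the file


def mmap_read(path):
    """Read the whole file back through a read-only mapping (hypothetical helper)."""
    with open(path, "rb") as f:
        # A length of 0 means "map the entire file".
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return m[:].decode("utf-8")
```

Calling `mmap_write("demo.txt", "hello mmap")` followed by `mmap_read("demo.txt")` should round-trip the string; the file name here is only an example.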
@mcarilli
mcarilli / commands.md
Last active June 11, 2024 20:13
Single- and multiprocess profiling workflow with nvprof and NVVP (Nsight Systems coming soon...)

Ordinary launch commands (no profiling):

Single-process:

python main_amp.py -a resnet50 --b 224 --deterministic --workers 4 --opt-level O1 ./bare_metal_train_val/

Multi-process:

python -m torch.distributed.launch --nproc_per_node=2 main_amp.py -a resnet50 --b 224 --deterministic --workers 4 --opt-level O1 ./bare_metal_train_val/
@dideler
dideler / bot.rb
Last active April 24, 2025 04:41
Sending a notification message to Telegram using its HTTP API via cURL
# Use this script to test that your Telegram bot works.
#
# Install the dependency
#
# $ gem install telegram_bot
#
# Run the bot
#
# $ ruby bot.rb
#
@mbinna
mbinna / effective_modern_cmake.md
Last active April 25, 2025 22:01
Effective Modern CMake

Getting Started

For a brief user-level introduction to CMake, watch C++ Weekly, Episode 78, Intro to CMake by Jason Turner. LLVM’s CMake Primer provides a good high-level introduction to the CMake syntax. Go read it now.

After that, watch Mathieu Ropert’s CppCon 2017 talk Using Modern CMake Patterns to Enforce a Good Modular Design (slides). It provides a thorough explanation of what modern CMake is and why it is so much better than “old school” CMake. The modular design ideas in this talk are based on the book [Large-Scale C++ Software Design](https://www.amazon.de/Large-Scale-Soft

@mdonkers
mdonkers / server.py
Last active April 4, 2025 13:11
Simple Python 3 HTTP server for logging all GET and POST requests
#!/usr/bin/env python3
"""
License: MIT License
Copyright (c) 2023 Miel Donkers
Very simple HTTP server in python for logging requests
Usage::
    ./server.py [<port>]
"""
from http.server import BaseHTTPRequestHandler, HTTPServer
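
The preview above stops at the import, so the following is only a guess at how a request-logging handler could be built on `BaseHTTPRequestHandler`. The names `describe_request`, `LoggingHandler` and `make_server`, the default port and the reply text are all hypothetical and chosen for illustration.

```python
import logging
from http.server import BaseHTTPRequestHandler, HTTPServer


def describe_request(method, path, headers, body=b""):
    """Build the text that gets logged for one request (hypothetical helper)."""
    lines = [f"{method} {path}"]
    lines += [f"{name}: {value}" for name, value in headers]
    if body:
        lines.append("")
        lines.append(body.decode("utf-8", errors="replace"))
    return "\n".join(lines)


class LoggingHandler(BaseHTTPRequestHandler):
    """Logs every GET and POST request, then answers with a short text body."""

    def _log_and_reply(self, body=b""):
        logging.info(describe_request(self.command, self.path, self.headers.items(), body))
        reply = f"{self.command} request for {self.path}".encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(reply)))
        self.end_headers()
        self.wfile.write(reply)

    def do_GET(self):
        self._log_and_reply()

    def do_POST(self):
        # Content-Length says how many body bytes to read from the socket.
        length = int(self.headers.get("Content-Length", 0))
        self._log_and_reply(self.rfile.read(length))


def make_server(port=8080):
    """Create (but do not start) a server bound to all interfaces on the given port."""
    return HTTPServer(("", port), LoggingHandler)
```

To try it, one would call `logging.basicConfig(level=logging.INFO)` and then `make_server().serve_forever()`. Keeping the formatting in a plain function makes the logged text easy to check without opening a socket.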
@kbarbary
kbarbary / simd-vmv.c
Created October 8, 2016 18:29
Vector-matrix-vector multiplication with SIMD (AVX) intrinsics
// Doing the operation:
//
//                         | a a a a | | y |
// x * A * y = [ x x x x ] | a a a a | | y |
//                         | a a a a | | y |
//                         | a a a a | | y |
//
// with SIMD intrinsics (specifically AVX).
//
// adapted from https://gist.github.com/rygorous/4172889
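
As a scalar reference against which a SIMD version could be checked, the product x * A * y can be written out directly. The sketch below is plain Python, not the gist's AVX code; the function name `vmv` and the list-of-rows matrix layout are assumptions made for illustration.

```python
def vmv(x, A, y):
    """Return x * A * y for a vector x, a matrix A (list of rows) and a vector y."""
    if len(A) != len(x) or any(len(row) != len(y) for row in A):
        raise ValueError("shape mismatch between x, A and y")
    total = 0.0
    for x_i, row in zip(x, A):
        # Dot product of one matrix row with y, weighted by the matching x entry.
        total += x_i * sum(a_ij * y_j for a_ij, y_j in zip(row, y))
    return total
```

A vectorized implementation would typically compute several of the row dot products at once; this version only fixes the expected result for small test inputs.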
@slavafomin
slavafomin / nodejs-custom-es6-errors.md
Last active November 14, 2024 11:23
Custom ES6 errors in Node.js

Here's how you could create custom error classes in Node.js using the latest ES6 / ES2015 syntax.

I've tried to make it as lean and unobtrusive as possible.

Defining our own base class for errors

errors/AppError.js