🐈‍⬛
meow

Christopher Akiki cakiki
@fxkamd
fxkamd / bert-tiny-amd.md
Created October 1, 2024 19:06
Solutions to problems with BERT training with tinygrad on AMD GPUs

Thank you to tiny corp for pointing out some problems running BERT training with Tinygrad on AMD GPUs in this Tweet. We had a few engineers at AMD take a look at the problem and they were quickly able to reproduce it.

What they found was an issue related to CWSR (compute wave save restore), which is a mechanism that allows our driver and firmware to preempt and reschedule long-running compute waves on our GPUs. The GFXv11 GPU line requires a workaround to set COMPUTE_PGM_RSRC1.PRIV=1 when dispatching a compute kernel. Normally this is handled by the AQL DISPATCH packet. However, since the Tinygrad implementation leverages a custom runtime, it requires this workaround in its PM4-based dispatch. This patch is specific to GFXv11 GPUs. Other GPUs do not require it and should not use this workaround. The following KFDTest patch can be used as a reference: https://github.com/ROCm/ROCT-Thunk-Interface/commit/507637ed5b82197eecbf483cdc1234939766549a
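As a rough illustration of the workaround described above, the sketch below sets the PRIV bit in a COMPUTE_PGM_RSRC1 value only for GFXv11 targets. The function name and the `gfx_major` parameter are hypothetical, and the bit position (bit 20) is an assumption based on the LLVM AMDGPU kernel descriptor layout; verify it against the referenced KFDTest patch before relying on it.

```python
# Assumed position of the PRIV field in COMPUTE_PGM_RSRC1 (bit 20).
# This is an assumption, not taken from the text above.
COMPUTE_PGM_RSRC1_PRIV_BIT = 20


def apply_cwsr_workaround(rsrc1: int, gfx_major: int) -> int:
    """Return rsrc1 with PRIV=1 on GFXv11, unchanged on other GPUs.

    Hypothetical helper: a custom PM4 dispatch path would call this
    before writing COMPUTE_PGM_RSRC1.
    """
    if gfx_major == 11:
        return rsrc1 | (1 << COMPUTE_PGM_RSRC1_PRIV_BIT)
    # Other GPU generations should not use the workaround.
    return rsrc1


if __name__ == "__main__":
    print(hex(apply_cwsr_workaround(0x0, gfx_major=11)))
    print(hex(apply_cwsr_workaround(0x0, gfx_major=10)))
```

Gating on the GPU generation keeps the change from leaking to hardware that does not need it, which matches the note that other GPUs should not use this workaround.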

While inv

@severo
severo / set_gated.py
Created August 12, 2022 16:00
A function to set the gated parameter on a HF repository
from huggingface_hub.hf_api import (  # type: ignore
    REPO_TYPES,
    REPO_TYPES_URL_PREFIXES,
    HfApi,
    _raise_for_status,
)


def update_repo_settings(
    hf_api: HfApi,
    repo_id: str,
@stefan-it
stefan-it / tpu_vm_cheatsheet.md
Last active December 5, 2024 23:33
TPU VM Cheatsheet

TPU VM Cheatsheet

This TPU VM cheatsheet uses and was tested with the following library versions:

| Library      | Version |
|--------------|---------|
| JAX          | 0.3.25  |
| FLAX         | 0.6.4   |
| Datasets     | 2.10.1  |
| Transformers | 4.27.1  |
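
To check whether a given environment matches the tested versions listed above, a small sketch like the following may help. It assumes the libraries are installed under the PyPI distribution names `jax`, `flax`, `datasets`, and `transformers`; the helper name `compare_versions` is hypothetical.

```python
from importlib import metadata

# Versions this cheatsheet was tested with, keyed by assumed PyPI names.
TESTED_VERSIONS = {
    "jax": "0.3.25",
    "flax": "0.6.4",
    "datasets": "2.10.1",
    "transformers": "4.27.1",
}


def compare_versions(tested):
    """Map each package to (tested version, installed version or None)."""
    report = {}
    for name, expected in tested.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed here
        report[name] = (expected, installed)
    return report


if __name__ == "__main__":
    for name, (expected, installed) in compare_versions(TESTED_VERSIONS).items():
        status = "ok" if installed == expected else "differs"
        print(f"{name}: tested {expected}, installed {installed} ({status})")
```

A mismatch does not necessarily mean the commands will fail; it only flags that the environment differs from the one the cheatsheet was tested on.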
@lmcinnes
lmcinnes / doc_embeddings_with_vectorizers.ipynb
Last active November 9, 2023 04:31
Document Embeddings with the Vectorizers Library
@Garfounkel
Garfounkel / gpu_tfidf_demo.ipynb
Last active April 24, 2021 21:59
notebooks/gpu_tfidf_demo.ipynb
@cbuntain
cbuntain / agreement.py
Created March 19, 2020 18:38
Example of using NLTK's agreement package to calculate agreement scores for an annotation task
#
# Author: Cody Buntain
# Date: 19 March 2020
#
# Description:
# This code is an example of using the agreement package
#. in NLTK to calculate a number of agreement metrics on
#. a set of annotations. Currently, this code will work
#. with two annotators and multiple labels.
#. You can use Fleiss's Kappa or Krippendorff's Alpha if you
@nstarke
nstarke / resize-ghidra-gui.md
Last active April 19, 2025 04:57
Resize Ghidra GUI for High DPI screens

Resize Ghidra for High DPI screens

If you run Ghidra on a high DPI screen, you will probably find the GUI scaled down so small as to be almost unusable.

There is a setting that you can adjust to scale the Ghidra GUI:

In $GHIDRA_ROOT/support there is a file named launch.properties, which contains the following configuration key:

VMARGS_LINUX=-Dsun.java2d.uiScale=1
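
If you would rather script the change than edit the file by hand, the sketch below rewrites the uiScale value in the text of a launch.properties file. The helper name `set_ui_scale`, the sample file contents, and the scale value of 2 are assumptions for illustration; pick whatever scale suits your display.

```python
import re

# Matches the uiScale key shown above, whatever its current value is.
UI_SCALE_LINE = re.compile(
    r"^(VMARGS_LINUX=-Dsun\.java2d\.uiScale=)\S*$", re.MULTILINE
)


def set_ui_scale(properties_text: str, scale: int) -> str:
    """Return the properties text with the Linux uiScale set to `scale`."""
    return UI_SCALE_LINE.sub(lambda m: m.group(1) + str(scale), properties_text)


if __name__ == "__main__":
    # Made-up sample contents; a real launch.properties has many more keys.
    sample = (
        "VMARGS_LINUX=-Dsome.other.option=false\n"
        "VMARGS_LINUX=-Dsun.java2d.uiScale=1\n"
    )
    print(set_ui_scale(sample, 2))
```

To apply it to a real install, read $GHIDRA_ROOT/support/launch.properties, pass its contents through `set_ui_scale`, and write the result back, keeping a backup of the original file.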

What the BookCorpus?

So in the midst of this era of "contextualized" language models, all named after Sesame Street characters and robots that transform into automobiles, there is this "Toronto Book Corpus" that points to this kinda recently influential paper:

Yukun Zhu, Ryan Kiros, Rich Zemel, Ruslan Salakhutdinov, Raquel Urtasun, Antonio Torralba, and Sanja Fidler. 2015. "Aligning books and movies: Towards story-like visual explanations by watching movies and reading books." In Proceedings of the IEEE international conference on computer vision, pp. 19-27.

Why do I even care? There are no translations there.

Some might know my personal pet peeve about collecting translation datasets, but this BookCorpus has no translations, so why do I even care about it?

@koreyou
koreyou / bm25.py
Created November 1, 2019 05:26
Implementation of Okapi BM25 with sklearn's TfidfVectorizer
""" Implementation of Okapi BM25 with sklearn's TfidfVectorizer
Distributed as CC-0 (https://creativecommons.org/publicdomain/zero/1.0/)
"""
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from scipy import sparse
class BM25(object):