Believe in your potential. May the Force be with us.

MinWoo(Daniel) Park dsdanielpark
@alielfilali01
alielfilali01 / Eval-Arabic-LLMs-using-lighteval.ipynb
Last active May 19, 2024 22:15
Copy of Test-lighteval.ipynb
@gamingflexer
gamingflexer / main.py
Created July 19, 2023 09:02
Anthropic's tokenizer for Claude
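from transformers import PreTrainedTokenizerFast

# Load the tokenizer definition from a local JSON file (the path is specific to the author's machine)
fast_tokenizer = PreTrainedTokenizerFast(tokenizer_file="/home/ubuntu/LLM/module/claude-v1-tokenization.json")
text = "Hello, this is a test input."
tokens = fast_tokenizer.tokenize(text)  # split the text into subword token strings
print(tokens)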
@rain-1
rain-1 / LLM.md
Last active April 8, 2025 13:49
LLM Introduction: Learn Language Models

Purpose

Bootstrap knowledge of LLMs ASAP, with a bias toward GPT.

Avoid being a link dump. Try to provide only valuable, well-tuned information.

Prelude

Neural network links before starting with transformers.

@JoaoLages
JoaoLages / RLHF.md
Last active April 3, 2025 13:14
Reinforcement Learning from Human Feedback (RLHF) - a simplified explanation

Maybe you've heard about this technique but you haven't completely understood it, especially the PPO part. This explanation might help.

We will focus on text-to-text language models 📝, such as GPT-3, BLOOM, and T5. Models like BERT, which are encoder-only, are not addressed.

Reinforcement Learning from Human Feedback (RLHF) has been successfully applied in ChatGPT, hence its major increase in popularity. 📈

RLHF is especially useful in two scenarios 🌟:

  • You can’t create a good loss function
    • Example: how do you calculate a metric to measure if the model’s output was funny?
  • You want to train with production data, but you can’t easily label your production data
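To make the first scenario concrete: when no metric for "funny" can be written by hand, a common workaround is to collect human comparisons between two outputs and fit a reward score to them. The sketch below is not taken from the gist preview; it assumes the standard pairwise (Bradley-Terry style) preference loss, and the function name `preference_loss` is hypothetical.

```python
import math


def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Pairwise preference loss: -log(sigmoid(reward_chosen - reward_rejected)).

    Assumption: a human preferred the "chosen" output over the "rejected" one,
    and each reward is a scalar score produced by some reward model.
    """
    margin = reward_chosen - reward_rejected
    # Numerically stable softplus(-margin), which equals -log(sigmoid(margin)).
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


if __name__ == "__main__":
    # The loss is small when the preferred output scores higher,
    # and large when the scores contradict the human judgment.
    print(preference_loss(2.0, 0.0))
    print(preference_loss(0.0, 2.0))
```

Minimizing this loss over many human comparisons pushes the reward score to agree with human judgments, which is how a hard-to-define quality can stand in for a hand-written loss function.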
@haven-jeon
haven-jeon / naver_review_classifications_gluon_bert.ipynb
Last active February 25, 2023 08:36
BERT with Naver Sentiment Movie Corpus
@russss
russss / deskew.py
Created September 10, 2018 12:05
Automatic scanned image rotation/deskew with OpenCV
import cv2
import numpy as np

def deskew(im, max_skew=10):
    # im is expected to be a BGR image, so take only the first two shape entries
    height, width = im.shape[:2]
    # Create a grayscale image and denoise it
    im_gs = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
    im_gs = cv2.fastNlMeansDenoising(im_gs, h=3)
@meetkabeershah
meetkabeershah / default nginx configuration file
Last active January 11, 2025 12:19
The default nginx configuration file inside /etc/nginx/sites-available/default
# Author: Zameer Ansari
# You should look at the following URLs in order to grasp a solid understanding
# of Nginx configuration files in order to fully unleash the power of Nginx.
# http://wiki.nginx.org/Pitfalls
# http://wiki.nginx.org/QuickStart
# http://wiki.nginx.org/Configuration
#
# Generally, you will want to move this file somewhere, and start with a clean
# file but keep this around for reference. Or just disable in sites-enabled.
#
@f0k
f0k / cuda_check.py
Last active December 5, 2024 13:35
Simple python script to obtain CUDA device information
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
Outputs some information on CUDA-enabled devices on your computer,
including current memory usage.
It's a port of https://gist.github.com/f0k/0d6431e3faa60bffc788f8b4daa029b1
from C to Python with ctypes, so it can run without compiling anything. Note
that this is a direct translation with no attempt to make the code Pythonic.
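As a rough illustration of the ctypes approach this description refers to, the sketch below loads the CUDA driver library at run time and asks it for the device count. It is not the gist's own code: the helper name `cuda_device_count` and the list of library file names are assumptions, while `cuInit` and `cuDeviceGetCount` are CUDA driver API functions.

```python
import ctypes

CUDA_SUCCESS = 0  # return code the CUDA driver API uses for success


def cuda_device_count():
    """Return the number of CUDA devices, or None if the driver is unavailable."""
    # Assumed candidate names for the driver library on Linux, macOS, and Windows.
    for name in ("libcuda.so", "libcuda.so.1", "libcuda.dylib", "nvcuda.dll"):
        try:
            cuda = ctypes.CDLL(name)
            break
        except OSError:
            continue
    else:
        return None  # no driver library could be loaded

    # The driver must be initialised before any other call.
    if cuda.cuInit(0) != CUDA_SUCCESS:
        return None

    count = ctypes.c_int()
    if cuda.cuDeviceGetCount(ctypes.byref(count)) != CUDA_SUCCESS:
        return None
    return count.value


if __name__ == "__main__":
    print(cuda_device_count())
```

Because the library is loaded dynamically, the script needs no compiler or CUDA toolkit headers, and on a machine without an NVIDIA driver it returns None instead of failing.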
@ihoneymon
ihoneymon / how-to-write-by-markdown.md
Last active April 19, 2025 04:37
How to Use Markdown

[Common] How to write Markdown

Although it is in English, I recommend reading the
"Markdown Guide (https://www.markdownguide.org/)", which explains Markdown usage in more detail. ^^

Also, if you feel that Markdown alone is not expressive enough, using HTML tags is a good option too.

1. About Markdown