Tian Cao tonycao

@younesbelkada
younesbelkada / finetune_llama_v2.py
Last active April 7, 2025 18:27
Fine tune Llama v2 models on Guanaco Dataset
# coding=utf-8
# Copyright 2023 The HuggingFace Inc. team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
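The file preview above stops inside the license header. As a rough sketch of the kind of QLoRA-style setup the gist's title describes (this is not the gist's code; the base checkpoint, dataset name, and hyperparameters below are illustrative assumptions):

# Minimal sketch: LoRA fine-tuning of a 4-bit-quantized Llama 2 checkpoint on a
# Guanaco-style dataset with the Hugging Face transformers/peft/datasets stack.
# All names and hyperparameters are placeholders, not the gist's exact values.
import torch
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_name = "meta-llama/Llama-2-7b-hf"                      # assumed base checkpoint
dataset = load_dataset("timdettmers/openassistant-guanaco", split="train")

tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token                    # Llama has no pad token

# Load the base model in 4-bit so it fits on a single consumer GPU.
bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)
model = AutoModelForCausalLM.from_pretrained(
    model_name, quantization_config=bnb_config, device_map="auto")
model = prepare_model_for_kbit_training(model)

# Train small LoRA adapters instead of the full weight matrices.
peft_config = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM")
model = get_peft_model(model, peft_config)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="llama2-guanaco", per_device_train_batch_size=4,
                           gradient_accumulation_steps=4, learning_rate=2e-4,
                           max_steps=500, fp16=True, logging_steps=10),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()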
@rain-1
rain-1 / llama-home.md
Last active April 24, 2025 06:41
How to run Llama 13B with a 6GB graphics card

This worked on 14/May/23. The instructions will probably require updating in the future.

LLaMA is a text-prediction model similar to GPT-2 and to the version of GPT-3 that has not been fine-tuned yet. It is also possible to run fine-tuned versions (like Alpaca or Vicuna) with this, I think; those versions are more focused on answering questions.

Note: I have been told that this does not support multiple GPUs. It can only use a single GPU.

It is now possible to run LLaMA 13B with a 6GB graphics card (e.g. an RTX 2060), thanks to the amazing work on llama.cpp. The latest change is CUDA/cuBLAS support, which lets you pick an arbitrary number of the transformer layers to run on the GPU. This is perfect for low VRAM.

  • Clone llama.cpp from git; I am on commit 08737ef720f0510c7ec2aa84d7f70c691073c35d.
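The preview cuts off after this first step. As a loose illustration only: the core of the trick is building llama.cpp with cuBLAS and passing a layer count via --n-gpu-layers. The binary name, model path, prompt, and layer count below are assumptions for the sketch, not the gist's exact instructions.

# Illustrative only: call a locally built llama.cpp "main" binary from Python and
# offload part of the model to the GPU. Assumes the binary was built with cuBLAS
# support (e.g. `LLAMA_CUBLAS=1 make`) and that a quantized 13B model exists locally.
import subprocess

cmd = [
    "./main",                                   # llama.cpp inference binary
    "-m", "models/13B/ggml-model-q4_0.bin",     # quantized model file (assumed path)
    "-p", "Building a website can be done in 10 simple steps:",
    "--n-gpu-layers", "18",                     # layers kept on the 6GB GPU; tune to fit
    "-n", "128",                                # number of tokens to generate
]
subprocess.run(cmd, check=True)

How many layers fit depends on the quantization and the card; the point of the cuBLAS change is that this number can be anything from zero up to the full model.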
@kvn219
kvn219 / Spatial_Transformer_Example_Part1.ipynb
Last active December 31, 2020 05:04
Spatial Transformer Networks with Tensorflow: Part I
@arasharchor
arasharchor / DeepLearningFaces.md
Created January 15, 2017 14:51 — forked from jdsgomes/DeepLearningFaces.md
Deep Learning for Face Recognition

Deep Learning for Face Recognition (May 2016)

Popular architectures

  • FaceNet (Google)
    • They use a triplet loss with the goal of keeping the L2 intra-class distances low and inter-class distances high (see the sketch after this list)
  • DeepID (Hong Kong University)
    • They use verification and identification signals to train the network. After each convolutional layer there is an identity layer connected to the supervisory signals in order to train each layer closely (on top of normal backprop)
  • DeepFace (Facebook)
    • Convs followed by locally connected, followed by fully connected
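A minimal sketch of the triplet loss idea summarized above (plain NumPy; the margin value and the 128-dimensional random embeddings are illustrative, not FaceNet's training code):

# FaceNet-style triplet loss: pull an anchor and a positive (same identity)
# together while pushing a negative (different identity) at least `margin` away.
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Mean over the batch of max(0, ||a - p||^2 - ||a - n||^2 + margin)."""
    pos_dist = np.sum((anchor - positive) ** 2, axis=1)   # intra-class squared L2
    neg_dist = np.sum((anchor - negative) ** 2, axis=1)   # inter-class squared L2
    return np.mean(np.maximum(0.0, pos_dist - neg_dist + margin))

# Toy usage with random embeddings standing in for network outputs.
rng = np.random.default_rng(0)
anchor, positive, negative = (rng.normal(size=(4, 128)) for _ in range(3))
print(triplet_loss(anchor, positive, negative))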
@shagunsodhani
shagunsodhani / KeyValueMemNN.md
Last active April 30, 2023 04:13
Summary of paper "Key-Value Memory Networks for Directly Reading Documents"

Key-Value Memory Networks for Directly Reading Documents

Introduction

  • Knowledge Bases (KBs) are effective tools for Question Answering (QA) but are often too restrictive (due to fixed schema) and too sparse (due to limitations of Information Extraction (IE) systems).
  • The paper proposes Key-Value Memory Networks, a neural network architecture based on Memory Networks that can leverage both KBs and raw data for QA (a minimal sketch of a single memory read follows this summary).
  • The paper also introduces MOVIEQA, a new QA dataset that can be answered by a perfect KB, by Wikipedia pages, and by an imperfect KB obtained using IE techniques, thereby allowing a comparison between systems using any of the three sources.
  • Link to the paper.
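A minimal sketch of the key-value read described above, with the paper's feature maps and learned projections collapsed into plain vectors for illustration (one hop only):

# One key-value memory "hop": address slots by comparing the query with the keys,
# then read out the weighted sum of the corresponding values.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def kv_memory_read(query, keys, values):
    """query: (d,); keys, values: (num_slots, d) -> read-out vector of shape (d,)."""
    addressing = softmax(keys @ query)   # p_i proportional to exp(q . k_i)
    return addressing @ values           # o = sum_i p_i * v_i

# Toy usage: 5 memory slots with 8-dimensional embeddings.
rng = np.random.default_rng(0)
q = rng.normal(size=8)
K = rng.normal(size=(5, 8))
V = rng.normal(size=(5, 8))
print(kv_memory_read(q, K, V).shape)     # (8,); used to update the query before the next hop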

Related Work

@jdsgomes
jdsgomes / DeepLearningFaces.md
Last active January 6, 2020 07:01
Deep Learning for Face Recognition

Deep Learning for Face Recognition (May 2016)

Popular architectures

  • FaceNet (Google)
    • They use a triplet loss with the goal of keeping the L2 intra-class distances low and inter-class distances high
  • DeepID (Hong Kong University)
    • They use verification and identification signals to train the network. After each convolutional layer there is an identity layer connected to the supervisory signals in order to train each layer closely (on top of normal backprop)
  • DeepFace (Facebook)
    • Convs followed by locally connected, followed by fully connected
@gnachman
gnachman / iterm.scpt
Last active April 8, 2023 23:42
Replace /Applications/Docker/Docker Quickstart Terminal.app/Contents/Resources/Scripts/iterm.scpt with this.
set itermRunning to (application "iTerm" is running)
set scriptPath to quoted form of POSIX path of ((path to me as text) & "::" & "start.sh")
-- look up the user's login shell via Directory Services
set user_shell to do shell script "dscl /Search -read /Users/$USER UserShell | awk '{print $2}'"
tell application "iTerm"
    activate
    if not (exists window 1) or (itermRunning = false) then
        reopen
    end if
@vasanthk
vasanthk / System Design.md
Last active June 16, 2025 20:31
System Design Cheatsheet

System Design Cheatsheet

Picking the right architecture = Picking the right battles + Managing trade-offs

Basic Steps

  1. Clarify and agree on the scope of the system
  • Use cases (description of sequences of events that, taken together, lead to a system doing something useful)
    • Who is going to use it?
    • How are they going to use it?

KC60 Keyboard end-user tips, tricks, programming notes, etc.

Leimi's note: I removed lots of stuff from the original gist by scottjl (thanks to him, by the way!), as in the end some of it wasn't useful for me. If you are interested, go check the gist revisions.

The KC60 is kinda like a premade GH60 that was first sold on Massdrop during summer 2015.
It runs on TMK firmware, or at least something based on it (not sure that is the real source for the keyboard, but it seems to be), which means it's heavily programmable.
There is a GUI tool (the source of this tool seems to be here) and a command-line tool to ease the process of programming the board.
**Go check this great article on Key

@MattDMo
MattDMo / ipy_repl.py
Last active April 3, 2022 23:16
SublimeREPL ipy_repl.py for running IPython/Jupyter in Sublime Text
# from https://gist.github.com/MattDMo/6cb1dfbe8a124e1ca5af
import os
import json
import socket
import threading
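# SUBLIMEREPL_ACTIVATE_THIS, if set, points at a virtualenv's activate_this.py
# so the REPL can run inside that environment.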
activate_this = os.environ.get("SUBLIMEREPL_ACTIVATE_THIS", None)
# turn off pager