Andreas Klintberg klintan

@tamuhey
tamuhey / tokenizations_post.md
Last active July 27, 2024 14:46
How to calculate the alignment between BERT and spaCy tokens effectively and robustly

site: https://tamuhey.github.io/tokenizations/

Natural Language Processing (NLP) has made great progress in recent years thanks to neural networks, which allow us to solve various tasks with end-to-end architectures. However, many NLP systems still require language-specific pre- and post-processing, especially around tokenization. In this article, I describe an algorithm that simplifies one such process: calculating the correspondence between tokens produced by different tokenizers (e.g. BERT vs. spaCy). I also introduce Python and Rust libraries that implement this algorithm. Here are the library and the demo site links:
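
To make the idea of a token correspondence concrete, here is a minimal sketch in Python. It is not the implementation used by the libraries mentioned above: it swaps in the standard library's `difflib.SequenceMatcher` as the character-level matcher, and the names `normalize`, `flatten`, and `align_tokens` are hypothetical helpers introduced only for this illustration. It also assumes that stripping a leading `##` and lowercasing is enough normalization, which will not hold for every tokenizer.

```python
from difflib import SequenceMatcher


def normalize(token):
    """Strip the BERT-style '##' continuation prefix and lowercase (assumed normalization)."""
    if token.startswith("##"):
        token = token[2:]
    return token.lower()


def flatten(tokens):
    """Concatenate normalized tokens and record which token each character came from."""
    chars, owners = [], []
    for index, token in enumerate(tokens):
        for ch in normalize(token):
            chars.append(ch)
            owners.append(index)
    return "".join(chars), owners


def align_tokens(tokens_a, tokens_b):
    """Return (a2b, b2a): for each token, the sorted indices of overlapping tokens on the other side."""
    text_a, owners_a = flatten(tokens_a)
    text_b, owners_b = flatten(tokens_b)
    a2b = [set() for _ in tokens_a]
    b2a = [set() for _ in tokens_b]
    matcher = SequenceMatcher(None, text_a, text_b, autojunk=False)
    # Every matched character links the token that owns it in A to its owner in B.
    for block in matcher.get_matching_blocks():
        for offset in range(block.size):
            i = owners_a[block.a + offset]
            j = owners_b[block.b + offset]
            a2b[i].add(j)
            b2a[j].add(i)
    return [sorted(s) for s in a2b], [sorted(s) for s in b2a]


spacy_like = ["I", "like", "tokenization", "."]
bert_like = ["i", "like", "token", "##ization", "."]
a2b, b2a = align_tokens(spacy_like, bert_like)
print(a2b)  # [[0], [1], [2, 3], [4]]
print(b2a)  # [[0], [1], [2], [2], [3]]
```

Working at the character level means the two tokenizers never need to agree on token boundaries; tokens whose characters find no match on the other side simply get an empty list.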

@lucasw
lucasw / create_cloud_xyzrgb.py
Created April 25, 2018 00:08
Create PointCloud2 with python with rgb
#!/usr/bin/env python
# PointCloud2 color cube
# https://answers.ros.org/question/289576/understanding-the-bytes-in-a-pcl2-message/
import rospy
import struct
from sensor_msgs import point_cloud2
from sensor_msgs.msg import PointCloud2, PointField
from std_msgs.msg import Header
import nltk
from nltk.tokenize.treebank import TreebankWordTokenizer
class TreebankSpanTokenizer(TreebankWordTokenizer):
    def __init__(self):
        self._word_tokenizer = TreebankWordTokenizer()
    def span_tokenize(self, text):
@dirko
dirko / keras_bidirectional_tagger.py
Created August 11, 2016 05:32
Keras bidirectional LSTM NER tagger
# Keras==1.0.6
from keras.models import Sequential
import numpy as np
from keras.layers.recurrent import LSTM
from keras.layers.core import TimeDistributedDense, Activation
from keras.preprocessing.sequence import pad_sequences
from keras.layers.embeddings import Embedding
from sklearn.cross_validation import train_test_split
from keras.layers import Merge
from keras.backend import tf
@martinapugliese
martinapugliese / boto_dynamodb_methods.py
Last active June 30, 2024 10:00
Some wrapper methods to deal with DynamoDB databases in Python, using boto3.
# Copyright (C) 2016 Martina Pugliese
from boto3 import resource
from boto3.dynamodb.conditions import Key
# The boto3 dynamoDB resource
dynamodb_resource = resource('dynamodb')
def get_table_metadata(table_name):
@DaniSancas
DaniSancas / neo4j_cypher_cheatsheet.md
Created June 14, 2016 23:52
Neo4j's Cypher queries cheatsheet

Neo4j Tutorial

Fundamentals

Store any kind of data using the following graph concepts:

  • Node: Graph data records
  • Relationship: Connect nodes (has direction and a type)
  • Property: Stores data as key-value pairs in nodes and relationships
  • Label: Groups nodes (optional); relationships are grouped by their type instead
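
The four concepts above can be sketched as plain Python data structures. This is only an illustrative model, not Neo4j's API or storage format: the class names `Node` and `Relationship`, their fields, and the example data (`Alice`, `Bob`, `KNOWS`) are all assumptions made for this sketch, and labels are modelled on nodes only.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A graph data record with optional labels and key-value properties."""
    labels: set = field(default_factory=set)
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """A directed, typed connection from one node to another, with its own properties."""
    start: Node
    end: Node
    type: str
    properties: dict = field(default_factory=dict)


# Hypothetical example data: two Person nodes joined by a KNOWS relationship.
alice = Node(labels={"Person"}, properties={"name": "Alice"})
bob = Node(labels={"Person"}, properties={"name": "Bob"})
knows = Relationship(start=alice, end=bob, type="KNOWS", properties={"since": 2016})

print(knows.start.properties["name"], knows.type, knows.end.properties["name"])
```

Direction is captured by the `start` and `end` fields: swapping them would describe a different relationship.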
@baraldilorenzo
baraldilorenzo / readme.md
Last active January 14, 2025 11:07
VGG-16 pre-trained model for Keras

VGG16 model for Keras

This is the Keras model of the 16-layer network used by the VGG team in the ILSVRC-2014 competition.

It has been obtained by directly converting the Caffe model provided by the authors.

Details about the network architecture can be found in the following arXiv paper:

Very Deep Convolutional Networks for Large-Scale Image Recognition

K. Simonyan, A. Zisserman

@alfard
alfard / Tree_Cart_clean.py
Last active December 5, 2017 16:11
Cart algorithm
import numpy as np
import random
class Node:
    def __init__(self,t,L,R,D,S,V,M,X):
        self.t=t
        self.L=L
        self.R=R
        self.D=D
@0asa
0asa / sklearn-pyspark.py
Created January 27, 2015 11:12
Run a Scikit-Learn algorithm on top of Spark with PySpark
from pyspark import SparkConf, SparkContext
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
import pandas as pd
import numpy as np
conf = (SparkConf()
        .setMaster("local[*]")
        .setAppName("My app")
        .set("spark.executor.memory", "1g"))

Experimental Generation of Interpersonal Closeness

Instructions to Subjects Included With Task Slips Packet

This is a study of interpersonal closeness, and your task, which we think will be quite enjoyable, is simply to get close to your partner. We believe that the best way for you to get close to your partner is for you to share with them and for them to share with you. Of course, when we advise you about getting close to your partner, we are giving advice regarding your behavior in this demonstration only; we are not advising you about your behavior outside of this demonstration.

In order to help you get close, we've arranged for the two of you to engage in a kind of sharing game. Your sharing time will be about one hour, after which we ask you to fill out a questionnaire concerning your experience of getting close to your partner.

You have been given three sets of slips. Each slip has a question or a task written on it. As soon as you both finish reading these instructions, you should