@Chrislu30604
Created December 24, 2018 10:07
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Defaultdict & Counter"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Defaultdict"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### What is defaultdict?\n",
"- defaultdict is a subclass of the built-in dict class.\n",
"- It behaves like **dict**, except that looking up a missing key does not fail (a plain dict raises KeyError if the key doesn't exist).\n",
"- Instead, it generates a default value for the missing key.\n",
"\n",
"### Defaultdict features\n",
"1. default_factory: the first argument passed when initializing defaultdict is the factory, a callable that takes no arguments.\n",
"2. `__missing__(key)`: if the key doesn't exist, it calls default_factory to create a default value, stores it under that key, and returns it."
]
},
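{
"cell_type": "markdown",
"metadata": {},
"source": [
"The two features above can be sketched as follows. This is a minimal illustration using only the standard library, and the variable names are made up for the example. A plain dict raises KeyError on a missing key, while defaultdict calls its default_factory, stores the result under that key, and returns it. Note that `get()` does not trigger the factory."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from collections import defaultdict\n",
"\n",
"plain = {}\n",
"try:\n",
"    plain['x']\n",
"    raised = False\n",
"except KeyError:\n",
"    raised = True  # a plain dict raises KeyError for a missing key\n",
"\n",
"dd = defaultdict(int)\n",
"value = dd['x']         # __missing__ calls int(), stores 0 under 'x', returns it\n",
"fallback = dd.get('y')  # get() bypasses __missing__, so 'y' is not created\n",
"print(raised, value, fallback, dict(dd))"
]
},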
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Common way to deal with dict"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'a': 3, 'b': 1, 'c': 2, 'd': 1, 'e': 1}\n"
]
}
],
"source": [
"alphabeticList = ['a', 'a', 'a', 'b', 'c', 'c', 'd', 'e']\n",
"countList = {}  # maps each element to the number of times it appears\n",
"for element in alphabeticList:\n",
"    if element not in countList:\n",
"        countList[element] = 1\n",
"    else:\n",
"        countList[element] += 1\n",
"\n",
"print(countList)"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'apple': [2, 6], 'banana': [4], 'cake': [4]}\n"
]
}
],
"source": [
"keyValuePair = [('apple', 2), ('banana', 4), ('cake', 4), ('apple', 6)]\n",
"\n",
"countValue = {}  # maps each key to the list of values seen for it\n",
"for key, value in keyValuePair:\n",
"    if key not in countValue:\n",
"        countValue[key] = [value]\n",
"    else:\n",
"        countValue[key].append(value)\n",
"print(countValue)"
]
},
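{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a side note, a plain dict can shorten the two patterns above with its own `get()` and `setdefault()` methods. The following is only a sketch of that alternative, with illustrative variable names; the rest of this notebook uses defaultdict instead."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"letters = ['a', 'a', 'a', 'b', 'c', 'c', 'd', 'e']\n",
"counts = {}\n",
"for letter in letters:\n",
"    # get() returns 0 when the key is missing, so no membership test is needed\n",
"    counts[letter] = counts.get(letter, 0) + 1\n",
"\n",
"pairs = [('apple', 2), ('banana', 4), ('cake', 4), ('apple', 6)]\n",
"groups = {}\n",
"for key, value in pairs:\n",
"    # setdefault() inserts an empty list on first sight of the key, then returns it\n",
"    groups.setdefault(key, []).append(value)\n",
"\n",
"print(counts)\n",
"print(groups)"
]
},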
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Use Defaultdict to generate default value"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[('i', 4), ('m', 1), ('p', 2), ('s', 4)]"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from collections import defaultdict\n",
"s = 'mississippi'\n",
"d = defaultdict(int)  # missing keys start at int() == 0\n",
"for k in s:\n",
"    d[k] += 1\n",
"\n",
"sorted(d.items())"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[]\n",
"[1, 2, 3]\n"
]
}
],
"source": [
"better_dict = defaultdict(list)\n",
"check_default = better_dict['a']\n",
"print(check_default)\n",
"\n",
"better_dict['b'].append(1) \n",
"better_dict['b'].append(2)\n",
"better_dict['b'].append(3)\n",
"print(better_dict['b'])"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"defaultdict(<class 'list'>, {'even': [2, 8], 'odd': [1, 3, 7], 'float': [2.4]})\n"
]
}
],
"source": [
"from collections import defaultdict\n",
"\n",
"multi_dict = defaultdict(list)\n",
"key_values = [('even', 2), ('odd', 1), ('even', 8), ('odd', 3), ('float', 2.4), ('odd', 7)]\n",
"\n",
"for key, value in key_values:\n",
"    multi_dict[key].append(value)\n",
"\n",
"print(multi_dict)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Other default factories"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'John ran to <missing>'"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"def constant_factory(value):\n",
"    # return a zero-argument callable that always produces the same value\n",
"    return lambda: value\n",
"\n",
"d = defaultdict(constant_factory('<missing>'))\n",
"d.update(name='John', action='ran')\n",
"'%(name)s %(action)s to %(object)s' % d"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[('blue', {2, 4}), ('red', {1, 3})]"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"s = [('red', 1), ('blue', 2), ('red', 3), ('blue', 4), ('red', 1), ('blue', 4)]\n",
"d = defaultdict(set)  # a set per key, so duplicate values are dropped\n",
"for k, v in s:\n",
"    d[k].add(v)\n",
"\n",
"sorted(d.items())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Counter"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### What is Counter?\n",
"- A Counter is a dict subclass for counting hashable objects.\n",
"- It is a collection where elements are stored as dictionary keys and their counts are stored as dictionary values."
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Counter({'red': 2, 'blue': 3, 'green': 1})"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from collections import Counter\n",
"\n",
"cnt = Counter()\n",
"for word in ['red', 'blue', 'red', 'green', 'blue', 'blue']:\n",
"    cnt[word] += 1\n",
"cnt"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Ways to initialize a Counter"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [],
"source": [
"c = Counter() # a new, empty counter\n",
"c = Counter('gallahad') # a new counter from an iterable\n",
"c = Counter({'red': 4, 'blue': 2}) # a new counter from a mapping\n",
"c = Counter(cats=4, dogs=8) # a new counter from keyword args"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If an element doesn't exist, Counter returns a count of zero instead of raising KeyError."
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0"
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"c = Counter(['eggs', 'ham'])\n",
"c['bacon']"
]
},
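{
"cell_type": "markdown",
"metadata": {},
"source": [
"One difference from defaultdict is worth a quick sketch: looking up a missing element in a Counter returns zero without storing the key, whereas a lookup on `defaultdict(int)` inserts it. The variable names below are illustrative only."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from collections import Counter, defaultdict\n",
"\n",
"c = Counter(['eggs', 'ham'])\n",
"missing_count = c['bacon']       # returns 0 for the missing element\n",
"counter_has_key = 'bacon' in c   # the lookup did not add 'bacon'\n",
"\n",
"dd = defaultdict(int)\n",
"dd['bacon']                      # the lookup inserts the default value\n",
"default_has_key = 'bacon' in dd\n",
"\n",
"print(missing_count, counter_has_key, default_has_key)"
]
},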
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Common methods"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### elements()"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"['a', 'a', 'a', 'a', 'b', 'b']"
]
},
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"c = Counter(a=4, b=2, c=0, d=-2)\n",
"sorted(c.elements())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### most_common([n])"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[('a', 5), ('b', 2), ('r', 2)]"
]
},
"execution_count": 24,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"Counter('abracadabra').most_common(3)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### subtract([iterable-or-mapping])"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Counter({'a': 3, 'b': 0, 'c': -3, 'd': -6})"
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"c = Counter(a=4, b=2, c=0, d=-2)\n",
"d = Counter(a=1, b=2, c=3, d=4)\n",
"c.subtract(d)\n",
"c"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Other operations"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Counter({'a': 3, 'b': 2})"
]
},
"execution_count": 26,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"c = Counter(a=3, b=1)\n",
"d = Counter(a=1, b=2)\n",
"print(c + d)  # add two counters together: c[x] + d[x]\n",
"# Counter({'a': 4, 'b': 3})\n",
"print(c - d)  # subtract (keeping only positive counts)\n",
"# Counter({'a': 2})\n",
"print(c & d)  # intersection: min(c[x], d[x])\n",
"# Counter({'a': 1, 'b': 1})\n",
"c | d         # union: max(c[x], d[x])"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.5"
}
},
"nbformat": 4,
"nbformat_minor": 2
}