Created
December 24, 2018 10:07
{ | |
"cells": [ | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"# Defaultdict & Counter" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Defaultdict" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### What is defaultdict?\n", | |
"- `defaultdict` is a subclass of the built-in `dict` class.\n", | |
"- It behaves like **dict**, except that looking up a missing key does not raise an error. (A plain dict raises `KeyError` if the key doesn't exist.)\n", | |
"- Instead, it generates a default value for the missing key.\n", | |
"\n", | |
"### Defaultdict features\n", | |
"1. `default_factory`: the first argument passed when creating a defaultdict. It is a callable that produces the default value.\n", | |
"2. `__missing__(key)`: called when the key doesn't exist. It calls `default_factory` to create a default value, stores it under the key, and returns it." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### The common way to handle missing keys with a plain dict" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 4, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"{'a': 3, 'b': 1, 'c': 2, 'd': 1, 'e': 1}\n" | |
] | |
} | |
], | |
"source": [ | |
"alphabeticList = ['a', 'a', 'a', 'b', 'c', 'c', 'd', 'e']\n", | |
"countList = {}\n", | |
"for element in alphabeticList:\n", | |
"    # A plain dict needs each key initialized before it can be incremented.\n", | |
"    if element not in countList:\n", | |
"        countList[element] = 1\n", | |
"    else:\n", | |
"        countList[element] += 1\n", | |
"\n", | |
"print(countList)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 5, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"{'apple': [2, 6], 'banana': [4], 'cake': [4]}\n" | |
] | |
} | |
], | |
"source": [ | |
"keyValuePair = [('apple', 2), ('banana', 4), ('cake', 4), ('apple', 6)]\n", | |
"\n", | |
"countValue = {}\n", | |
"for key, value in keyValuePair:\n", | |
"    # Create the list on first sight of a key, then append afterwards.\n", | |
"    if key not in countValue:\n", | |
"        countValue[key] = [value]\n", | |
"    else:\n", | |
"        countValue[key].append(value)\n", | |
"print(countValue)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### Use defaultdict to generate default values" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 13, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"[('i', 4), ('m', 1), ('p', 2), ('s', 4)]" | |
] | |
}, | |
"execution_count": 13, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"from collections import defaultdict\n", | |
"s = 'mississippi'\n", | |
"d = defaultdict(int)\n", | |
"for k in s:\n", | |
"    d[k] += 1  # a missing key starts at int() == 0\n", | |
"\n", | |
"sorted(d.items())" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 11, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"[]\n", | |
"[1, 2, 3]\n" | |
] | |
} | |
], | |
"source": [ | |
"better_dict = defaultdict(list)\n", | |
"check_default = better_dict['a']\n", | |
"print(check_default)\n", | |
"\n", | |
"better_dict['b'].append(1) \n", | |
"better_dict['b'].append(2)\n", | |
"better_dict['b'].append(3)\n", | |
"print(better_dict['b'])" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 12, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"defaultdict(<class 'list'>, {'even': [2, 8], 'odd': [1, 3, 7], 'float': [2.4]})\n" | |
] | |
} | |
], | |
"source": [ | |
"from collections import defaultdict\n", | |
"\n", | |
"multi_dict = defaultdict(list)\n", | |
"key_values = [('even', 2), ('odd', 1), ('even', 8), ('odd', 3), ('float', 2.4), ('odd', 7)]\n", | |
"\n", | |
"for key, value in key_values:\n", | |
"    multi_dict[key].append(value)\n", | |
"\n", | |
"print(multi_dict)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### Other default factories" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 17, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"'John ran to <missing>'" | |
] | |
}, | |
"execution_count": 17, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"def constant_factory(value):\n", | |
"    return lambda: value\n", | |
"d = defaultdict(constant_factory('<missing>'))\n", | |
"d.update(name='John', action='ran')\n", | |
"'%(name)s %(action)s to %(object)s' % d" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 19, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"[('blue', {2, 4}), ('red', {1, 3})]" | |
] | |
}, | |
"execution_count": 19, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"s = [('red', 1), ('blue', 2), ('red', 3), ('blue', 4), ('red', 1), ('blue', 4)]\n", | |
"d = defaultdict(set)\n", | |
"for k, v in s:\n", | |
"    d[k].add(v)  # duplicates collapse because the values are sets\n", | |
"\n", | |
"sorted(d.items())" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Counter" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### What is Counter?\n", | |
"- A `Counter` is a dict subclass for counting hashable objects.\n", | |
"- It is a collection where elements are stored as dictionary keys and their counts are stored as dictionary values." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 20, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"Counter({'red': 2, 'blue': 3, 'green': 1})" | |
] | |
}, | |
"execution_count": 20, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"from collections import Counter\n", | |
"\n", | |
"cnt = Counter()\n", | |
"for word in ['red', 'blue', 'red', 'green', 'blue', 'blue']:\n", | |
"    cnt[word] += 1\n", | |
"cnt" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### Ways to initialize a Counter" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 21, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"c = Counter() # a new, empty counter\n", | |
"c = Counter('gallahad') # a new counter from an iterable\n", | |
"c = Counter({'red': 4, 'blue': 2}) # a new counter from a mapping\n", | |
"c = Counter(cats=4, dogs=8) # a new counter from keyword args" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Looking up an element that doesn't exist returns zero instead of raising `KeyError`." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 22, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"0" | |
] | |
}, | |
"execution_count": 22, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"c = Counter(['eggs', 'ham'])\n", | |
"c['bacon']" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### Common methods" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"#### elements()" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 23, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"['a', 'a', 'a', 'a', 'b', 'b']" | |
] | |
}, | |
"execution_count": 23, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"c = Counter(a=4, b=2, c=0, d=-2)\n", | |
"sorted(c.elements())" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"#### most_common([n])" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 24, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"[('a', 5), ('b', 2), ('r', 2)]" | |
] | |
}, | |
"execution_count": 24, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"Counter('abracadabra').most_common(3)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"#### subtract([iterable-or-mapping])" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 25, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"Counter({'a': 3, 'b': 0, 'c': -3, 'd': -6})" | |
] | |
}, | |
"execution_count": 25, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"c = Counter(a=4, b=2, c=0, d=-2)\n", | |
"d = Counter(a=1, b=2, c=3, d=4)\n", | |
"c.subtract(d)\n", | |
"c" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### Arithmetic and set operations" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 26, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"Counter({'a': 4, 'b': 3})\n", | |
"Counter({'a': 2})\n", | |
"Counter({'a': 1, 'b': 1})\n", | |
"Counter({'a': 3, 'b': 2})\n" | |
] | |
} | |
], | |
"source": [ | |
"c = Counter(a=3, b=1)\n", | |
"d = Counter(a=1, b=2)\n", | |
"print(c + d)  # add two counters together: c[x] + d[x]\n", | |
"print(c - d)  # subtract (keeping only positive counts)\n", | |
"print(c & d)  # intersection: min(c[x], d[x])\n", | |
"print(c | d)  # union: max(c[x], d[x])" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": {}, | |
"outputs": [], | |
"source": [] | |
} | |
], | |
"metadata": { | |
"kernelspec": { | |
"display_name": "Python 3", | |
"language": "python", | |
"name": "python3" | |
}, | |
"language_info": { | |
"codemirror_mode": { | |
"name": "ipython", | |
"version": 3 | |
}, | |
"file_extension": ".py", | |
"mimetype": "text/x-python", | |
"name": "python", | |
"nbconvert_exporter": "python", | |
"pygments_lexer": "ipython3", | |
"version": "3.6.5" | |
} | |
}, | |
"nbformat": 4, | |
"nbformat_minor": 2 | |
} |