@muellerzr
Created June 2, 2021 19:45
Source Code Tracking
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"name": "Source Code Tracking",
"provenance": [],
"collapsed_sections": [],
"include_colab_link": true
},
"kernelspec": {
"display_name": "Python 3",
"name": "python3"
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/muellerzr/3302ee373a303da54efaf492faab31a2/source-code-tracking.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "_HtUNfd0OEWi"
},
"source": [
"# Source Code Tracking with fastai and Other Libraries\n",
"\n",
"## By Zachary Mueller"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "jGP269WlOIAY"
},
"source": [
"I've written three quick functions to help track down and consolidate source code, whether you're looking at the fastai library or at other libraries."
]
},
{
"cell_type": "code",
"metadata": {
"id": "lIYdn1woOS1n"
},
"source": [
"!pip install fastai -U >> /dev/null"
],
"execution_count": 1,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "I3w8Wdryz4cE"
},
"source": [
"from fastai.vision.all import *\n",
"import inspect"
],
"execution_count": 20,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "UypLYyUrOPYJ"
},
"source": [
"First, let's patch an arbitrary function onto `TensorBase`:"
]
},
{
"cell_type": "code",
"metadata": {
"id": "IkpmtXVB0EOJ"
},
"source": [
"@patch\n",
"def myfunc(o:TensorBase): print(\"Hello!\")"
],
"execution_count": 21,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "LZgDe2D_OUAv"
},
"source": [
"Next, let's look at why these functions are needed. My patched function does not show up in `TensorBase??`, so there's no direct way to see its source code alongside the rest of the class."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "38l6M5ynQPyy"
},
"source": [
"## `trace_class`\n",
"\n",
"\n",
"`trace_class` takes a class and a list of library names. It collects the source code of every attribute of that class that was defined in one of those libraries (or in `__main__`), and returns the attribute names, their source code, and the modules they come from."
]
},
{
"cell_type": "code",
"metadata": {
"id": "Qv3ddR8P3z2p"
},
"source": [
"libname = 'fastai'"
],
"execution_count": 22,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "8laaUPlc4_Py"
},
"source": [
"def trace_class(obj, libnames):\n",
"    \"Get the names, source code, and module names of attributes of `obj` defined in any of `libnames` (or in `__main__`)\"\n",
"    funcs, code, fnames = [], [], []\n",
"    for nm in dir(obj):\n",
"        # Module the attribute was defined in, e.g. 'fastai.torch_core'\n",
"        mod = str(nested_attr(obj, f'{nm}.__module__'))\n",
"        # Match once per attribute so overlapping library names can't add duplicates\n",
"        if '__main__' in mod or any(lib in mod for lib in libnames):\n",
"            funcs.append(nm)\n",
"            code.append(inspect.getsource(getattr(obj, nm)))\n",
"            fnames.append(mod)\n",
"    return funcs, code, fnames"
],
"execution_count": 23,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "yLpVxUyKOkgh"
},
"source": [
"funcs, code, fnames = trace_class(TensorBase, ['fastai'])"
],
"execution_count": 25,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "ukUOCvv3OsvR"
},
"source": [
"Note: Passing in a list of library names helps us filter out unneeded library code/files (such as items originating from `torch` when we just care about `fastai`)"
]
},
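{
"cell_type": "markdown",
"metadata": {},
"source": [
"The filtering idea behind `trace_class` can be tried without fastai. The following is a minimal sketch using only built-in Python, not the code above: `Base`, `shout`, and `attrs_from` are hypothetical names made up for illustration, and assigning a function onto the class only roughly stands in for `@patch`.\n",
"\n",
"```python\n",
"class Base:\n",
"    def greet(self): return \"hi\"\n",
"\n",
"def shout(self): return \"HI\"\n",
"Base.shout = shout  # attached after the class was defined, similar in spirit to @patch\n",
"\n",
"def attrs_from(cls, libnames):\n",
"    \"Hypothetical helper: names of attributes of `cls` whose `__module__` mentions one of `libnames`\"\n",
"    found = []\n",
"    for name in dir(cls):\n",
"        # Many built-in attributes have no `__module__`, so default to None\n",
"        mod = getattr(getattr(cls, name), \"__module__\", None)\n",
"        if isinstance(mod, str) and any(lib in mod for lib in libnames):\n",
"            found.append(name)\n",
"    return found\n",
"\n",
"# Filter on the module the class itself lives in\n",
"names = attrs_from(Base, [Base.__module__])\n",
"```\n",
"\n",
"Here `names` should contain both `greet` and `shout`, while inherited built-ins such as `__repr__` are skipped because they were not defined in the same module."
]
},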
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "rt-RhpZPOqzr",
"outputId": "c84a66e2-9fa5-4519-ee41-5e65433c2bbf"
},
"source": [
"funcs"
],
"execution_count": 27,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"['__array_eq__',\n",
" '__new__',\n",
" '__reduce_ex__',\n",
" '__repr__',\n",
" '__torch_function__',\n",
" '_before_cast',\n",
" 'as_subclass',\n",
" 'interp_1d',\n",
" 'myfunc',\n",
" 'new',\n",
" 'new_ones',\n",
" 'new_tensor',\n",
" 'pca',\n",
" 'register_func',\n",
" 'requires_grad_',\n",
" 'set_meta']"
]
},
"metadata": {
"tags": []
},
"execution_count": 27
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "3YHilx5PO9R8"
},
"source": [
"Next we can print out all the source code:"
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "z5NYBaNgO8QG",
"outputId": "a5939e66-b2f4-48cb-8656-ee8f84ab5a18"
},
"source": [
"for c in code: print(c)"
],
"execution_count": 29,
"outputs": [
{
"output_type": "stream",
"text": [
"@patch\n",
"def __array_eq__(self:Tensor,b):\n",
"    return torch.equal(self,b) if self.dim() else self==b\n",
"\n",
"    def __new__(cls, x, **kwargs):\n",
"        res = cast(tensor(x), cls)\n",
"        for k,v in kwargs.items(): setattr(res, k, v)\n",
"        return res\n",
"\n",
"    def __reduce_ex__(self,proto):\n",
"        torch.utils.hooks.warn_if_has_hooks(self)\n",
"        args = (type(self), self.storage(), self.storage_offset(), tuple(self.size()), self.stride())\n",
"        if self.is_quantized: args = args + (self.q_scale(), self.q_zero_point())\n",
"        f = _fa_rebuild_qtensor if self.is_quantized else _fa_rebuild_tensor\n",
"        return (f, args + (self.requires_grad, OrderedDict()))\n",
"\n",
"    def __repr__(self): return re.sub('tensor', self.__class__.__name__, super().__repr__())\n",
"\n",
"    def __torch_function__(self, func, types, args=(), kwargs=None):\n",
"        if self.debug and func.__name__ not in ('__str__','__repr__'): print(func, types, args, kwargs)\n",
"        convert=False\n",
"        if _torch_handled(args, self._opt, func): convert,types = type(self),(torch.Tensor,)\n",
"        res = super().__torch_function__(func, types, args=args, kwargs=kwargs)\n",
"        if convert: res = convert(res)\n",
"        if isinstance(res, TensorBase): res.set_meta(self, as_copy=True)\n",
"        return res\n",
"\n",
"    @classmethod\n",
"    def _before_cast(cls, x): return tensor(x)\n",
"\n",
"@patch\n",
"def as_subclass(self:Tensor, typ):\n",
"    \"Cast to `typ` and include `__dict__` and meta\"\n",
"    return retain_meta(self, torch.as_subclass(self, typ))\n",
"\n",
"@patch\n",
"def interp_1d(x:Tensor, xp, fp):\n",
"    \"Same as `np.interp`\"\n",
"    slopes = (fp[1:]-fp[:-1])/(xp[1:]-xp[:-1])\n",
"    incx = fp[:-1] - (slopes*xp[:-1])\n",
"    locs = (x[:,None]>=xp[None,:]).long().sum(1)-1\n",
"    locs = locs.clamp(0,len(slopes)-1)\n",
"    return slopes[locs]*x + incx[locs]\n",
"\n",
"@patch\n",
"def myfunc(o:TensorBase): print(\"Hello!\")\n",
"\n",
"    def new(self, x=None):\n",
"        cls = type(self)\n",
"        res = self.as_subclass(Tensor).new() if x is None else self.as_subclass(Tensor).new(x)\n",
"        return res.as_subclass(cls)\n",
"\n",
"    def new_ones(self, data, dtype=None, device=None, requires_grad=False):\n",
"        cls = type(self)\n",
"        return self.as_subclass(Tensor).new_ones(data, dtype=dtype, device=device, requires_grad=requires_grad).as_subclass(cls)\n",
"\n",
"    def new_tensor(self, size, dtype=None, device=None, requires_grad=False):\n",
"        cls = type(self)\n",
"        return self.as_subclass(Tensor).new_tensor(size, dtype=dtype, device=device, requires_grad=requires_grad).as_subclass(cls)\n",
"\n",
"@patch\n",
"def pca(x:Tensor, k=2):\n",
"    \"Compute PCA of `x` with `k` dimensions.\"\n",
"    x = x-torch.mean(x,0)\n",
"    U,S,V = torch.svd(x.t())\n",
"    return torch.mm(x,U[:,:k])\n",
"\n",
"    @classmethod\n",
"    def register_func(cls, func, *oks): cls._opt[func].append(oks)\n",
"\n",
"    def requires_grad_(self, requires_grad=True):\n",
"        # Workaround https://github.com/pytorch/pytorch/issues/50219\n",
"        self.requires_grad = requires_grad\n",
"        return self\n",
"\n",
"@patch\n",
"def set_meta(self:Tensor, x, as_copy=False):\n",
"    \"Set all metadata in `__dict__`\"\n",
"    if not hasattr(x,'__dict__'): return\n",
"    # XXX: change to `deepcopy` once PyTorch 1.7.1 is out, and check nb 23 segmentation fit works\n",
"    self.__dict__ = copy(x.__dict__) if as_copy else x.__dict__\n",
"\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "k5MB8czAOsYP"
},
"source": [
"And we can see that any and all `@patch`'d functions show up!\n",
"\n",
"Along with this, we have a list of the modules these functions stem from:"
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "vuyztFMxPGcE",
"outputId": "928f1931-3ac0-4281-ed9e-dce221de918c"
},
"source": [
"fnames"
],
"execution_count": 30,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"['fastai.torch_core',\n",
" 'fastai.torch_core',\n",
" 'fastai.torch_core',\n",
" 'fastai.torch_core',\n",
" 'fastai.torch_core',\n",
" 'fastai.torch_core',\n",
" 'fastai.torch_core',\n",
" 'fastai.torch_core',\n",
" '__main__',\n",
" 'fastai.torch_core',\n",
" 'fastai.torch_core',\n",
" 'fastai.torch_core',\n",
" 'fastai.torch_core',\n",
" 'fastai.torch_core',\n",
" 'fastai.torch_core',\n",
" 'fastai.torch_core']"
]
},
"metadata": {
"tags": []
},
"execution_count": 30
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "zVluL2-DPHs2"
},
"source": [
"This includes the function we `@patch`ed ourselves, which is listed under `__main__`.\n",
"\n",
"This also works for regular classes that don't stem from fastai:"
]
},
{
"cell_type": "code",
"metadata": {
"id": "kr3iqAKmPLL3"
},
"source": [
"funcs, code, fnames = trace_class(pd.DataFrame, ['pandas'])"
],
"execution_count": 31,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "y5srytsjPSW2",
"outputId": "1d36515e-a101-46f5-854f-f68646d486d0"
},
"source": [
"funcs[:5]"
],
"execution_count": 33,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"['__abs__', '__add__', '__and__', '__array__', '__array_wrap__']"
]
},
"metadata": {
"tags": []
},
"execution_count": 33
}
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "Q6lCP6jxPUAT",
"outputId": "4d0bc58e-2edf-4888-f34f-1b139a55336b"
},
"source": [
"for c in code[-5:]: print(c)"
],
"execution_count": 36,
"outputs": [
{
"output_type": "stream",
"text": [
" def update(\n",
" self, other, join=\"left\", overwrite=True, filter_func=None, errors=\"ignore\"\n",
" ) -> None:\n",
" \"\"\"\n",
" Modify in place using non-NA values from another DataFrame.\n",
"\n",
" Aligns on indices. There is no return value.\n",
"\n",
" Parameters\n",
" ----------\n",
" other : DataFrame, or object coercible into a DataFrame\n",
" Should have at least one matching index/column label\n",
" with the original DataFrame. If a Series is passed,\n",
" its name attribute must be set, and that will be\n",
" used as the column name to align with the original DataFrame.\n",
" join : {'left'}, default 'left'\n",
" Only left join is implemented, keeping the index and columns of the\n",
" original object.\n",
" overwrite : bool, default True\n",
" How to handle non-NA values for overlapping keys:\n",
"\n",
" * True: overwrite original DataFrame's values\n",
" with values from `other`.\n",
" * False: only update values that are NA in\n",
" the original DataFrame.\n",
"\n",
" filter_func : callable(1d-array) -> bool 1d-array, optional\n",
" Can choose to replace values other than NA. Return True for values\n",
" that should be updated.\n",
" errors : {'raise', 'ignore'}, default 'ignore'\n",
" If 'raise', will raise a ValueError if the DataFrame and `other`\n",
" both contain non-NA data in the same place.\n",
"\n",
" .. versionchanged:: 0.24.0\n",
" Changed from `raise_conflict=False|True`\n",
" to `errors='ignore'|'raise'`.\n",
"\n",
" Returns\n",
" -------\n",
" None : method directly changes calling object\n",
"\n",
" Raises\n",
" ------\n",
" ValueError\n",
" * When `errors='raise'` and there's overlapping non-NA data.\n",
" * When `errors` is not either `'ignore'` or `'raise'`\n",
" NotImplementedError\n",
" * If `join != 'left'`\n",
"\n",
" See Also\n",
" --------\n",
" dict.update : Similar method for dictionaries.\n",
" DataFrame.merge : For column(s)-on-columns(s) operations.\n",
"\n",
" Examples\n",
" --------\n",
" >>> df = pd.DataFrame({'A': [1, 2, 3],\n",
" ... 'B': [400, 500, 600]})\n",
" >>> new_df = pd.DataFrame({'B': [4, 5, 6],\n",
" ... 'C': [7, 8, 9]})\n",
" >>> df.update(new_df)\n",
" >>> df\n",
" A B\n",
" 0 1 4\n",
" 1 2 5\n",
" 2 3 6\n",
"\n",
" The DataFrame's length does not increase as a result of the update,\n",
" only values at matching index/column labels are updated.\n",
"\n",
" >>> df = pd.DataFrame({'A': ['a', 'b', 'c'],\n",
" ... 'B': ['x', 'y', 'z']})\n",
" >>> new_df = pd.DataFrame({'B': ['d', 'e', 'f', 'g', 'h', 'i']})\n",
" >>> df.update(new_df)\n",
" >>> df\n",
" A B\n",
" 0 a d\n",
" 1 b e\n",
" 2 c f\n",
"\n",
" For Series, it's name attribute must be set.\n",
"\n",
" >>> df = pd.DataFrame({'A': ['a', 'b', 'c'],\n",
" ... 'B': ['x', 'y', 'z']})\n",
" >>> new_column = pd.Series(['d', 'e'], name='B', index=[0, 2])\n",
" >>> df.update(new_column)\n",
" >>> df\n",
" A B\n",
" 0 a d\n",
" 1 b y\n",
" 2 c e\n",
" >>> df = pd.DataFrame({'A': ['a', 'b', 'c'],\n",
" ... 'B': ['x', 'y', 'z']})\n",
" >>> new_df = pd.DataFrame({'B': ['d', 'e']}, index=[1, 2])\n",
" >>> df.update(new_df)\n",
" >>> df\n",
" A B\n",
" 0 a x\n",
" 1 b d\n",
" 2 c e\n",
"\n",
" If `other` contains NaNs the corresponding values are not updated\n",
" in the original dataframe.\n",
"\n",
" >>> df = pd.DataFrame({'A': [1, 2, 3],\n",
" ... 'B': [400, 500, 600]})\n",
" >>> new_df = pd.DataFrame({'B': [4, np.nan, 6]})\n",
" >>> df.update(new_df)\n",
" >>> df\n",
" A B\n",
" 0 1 4.0\n",
" 1 2 500.0\n",
" 2 3 6.0\n",
" \"\"\"\n",
" import pandas.core.computation.expressions as expressions\n",
"\n",
" # TODO: Support other joins\n",
" if join != \"left\": # pragma: no cover\n",
" raise NotImplementedError(\"Only left join is supported\")\n",
" if errors not in [\"ignore\", \"raise\"]:\n",
" raise ValueError(\"The parameter errors must be either 'ignore' or 'raise'\")\n",
"\n",
" if not isinstance(other, DataFrame):\n",
" other = DataFrame(other)\n",
"\n",
" other = other.reindex_like(self)\n",
"\n",
" for col in self.columns:\n",
" this = self[col]._values\n",
" that = other[col]._values\n",
" if filter_func is not None:\n",
" with np.errstate(all=\"ignore\"):\n",
" mask = ~filter_func(this) | isna(that)\n",
" else:\n",
" if errors == \"raise\":\n",
" mask_this = notna(that)\n",
" mask_that = notna(this)\n",
" if any(mask_this & mask_that):\n",
" raise ValueError(\"Data overlaps.\")\n",
"\n",
" if overwrite:\n",
" mask = isna(that)\n",
" else:\n",
" mask = notna(this)\n",
"\n",
" # don't overwrite columns unnecessarily\n",
" if mask.all():\n",
" continue\n",
"\n",
" self[col] = expressions.where(mask, this, that)\n",
"\n",
" def value_counts(\n",
" self,\n",
" subset: Optional[Sequence[Label]] = None,\n",
" normalize: bool = False,\n",
" sort: bool = True,\n",
" ascending: bool = False,\n",
" ):\n",
" \"\"\"\n",
" Return a Series containing counts of unique rows in the DataFrame.\n",
"\n",
" .. versionadded:: 1.1.0\n",
"\n",
" Parameters\n",
" ----------\n",
" subset : list-like, optional\n",
" Columns to use when counting unique combinations.\n",
" normalize : bool, default False\n",
" Return proportions rather than frequencies.\n",
" sort : bool, default True\n",
" Sort by frequencies.\n",
" ascending : bool, default False\n",
" Sort in ascending order.\n",
"\n",
" Returns\n",
" -------\n",
" Series\n",
"\n",
" See Also\n",
" --------\n",
" Series.value_counts: Equivalent method on Series.\n",
"\n",
" Notes\n",
" -----\n",
" The returned Series will have a MultiIndex with one level per input\n",
" column. By default, rows that contain any NA values are omitted from\n",
" the result. By default, the resulting Series will be in descending\n",
" order so that the first element is the most frequently-occurring row.\n",
"\n",
" Examples\n",
" --------\n",
" >>> df = pd.DataFrame({'num_legs': [2, 4, 4, 6],\n",
" ... 'num_wings': [2, 0, 0, 0]},\n",
" ... index=['falcon', 'dog', 'cat', 'ant'])\n",
" >>> df\n",
" num_legs num_wings\n",
" falcon 2 2\n",
" dog 4 0\n",
" cat 4 0\n",
" ant 6 0\n",
"\n",
" >>> df.value_counts()\n",
" num_legs num_wings\n",
" 4 0 2\n",
" 6 0 1\n",
" 2 2 1\n",
" dtype: int64\n",
"\n",
" >>> df.value_counts(sort=False)\n",
" num_legs num_wings\n",
" 2 2 1\n",
" 4 0 2\n",
" 6 0 1\n",
" dtype: int64\n",
"\n",
" >>> df.value_counts(ascending=True)\n",
" num_legs num_wings\n",
" 2 2 1\n",
" 6 0 1\n",
" 4 0 2\n",
" dtype: int64\n",
"\n",
" >>> df.value_counts(normalize=True)\n",
" num_legs num_wings\n",
" 4 0 0.50\n",
" 6 0 0.25\n",
" 2 2 0.25\n",
" dtype: float64\n",
" \"\"\"\n",
" if subset is None:\n",
" subset = self.columns.tolist()\n",
"\n",
" counts = self.groupby(subset).grouper.size()\n",
"\n",
" if sort:\n",
" counts = counts.sort_values(ascending=ascending)\n",
" if normalize:\n",
" counts /= counts.sum()\n",
"\n",
" # Force MultiIndex for single column\n",
" if len(subset) == 1:\n",
" counts.index = MultiIndex.from_arrays(\n",
" [counts.index], names=[counts.index.name]\n",
" )\n",
"\n",
" return counts\n",
"\n",
" @Substitution(desc=desc, name1=name1, name2=name2, axis_descr=axis_descr)\n",
" @Appender(_num_ddof_doc)\n",
" def stat_func(\n",
" self, axis=None, skipna=None, level=None, ddof=1, numeric_only=None, **kwargs\n",
" ):\n",
" nv.validate_stat_ddof_func(tuple(), kwargs, fname=name)\n",
" if skipna is None:\n",
" skipna = True\n",
" if axis is None:\n",
" axis = self._stat_axis_number\n",
" if level is not None:\n",
" return self._agg_by_level(\n",
" name, axis=axis, level=level, skipna=skipna, ddof=ddof\n",
" )\n",
" return self._reduce(\n",
" func, name, axis=axis, numeric_only=numeric_only, skipna=skipna, ddof=ddof\n",
" )\n",
"\n",
" @doc(\n",
" klass=_shared_doc_kwargs[\"klass\"],\n",
" cond=\"True\",\n",
" cond_rev=\"False\",\n",
" name=\"where\",\n",
" name_other=\"mask\",\n",
" )\n",
" def where(\n",
" self,\n",
" cond,\n",
" other=np.nan,\n",
" inplace=False,\n",
" axis=None,\n",
" level=None,\n",
" errors=\"raise\",\n",
" try_cast=False,\n",
" ):\n",
" \"\"\"\n",
" Replace values where the condition is {cond_rev}.\n",
"\n",
" Parameters\n",
" ----------\n",
" cond : bool {klass}, array-like, or callable\n",
" Where `cond` is {cond}, keep the original value. Where\n",
" {cond_rev}, replace with corresponding value from `other`.\n",
" If `cond` is callable, it is computed on the {klass} and\n",
" should return boolean {klass} or array. The callable must\n",
" not change input {klass} (though pandas doesn't check it).\n",
" other : scalar, {klass}, or callable\n",
" Entries where `cond` is {cond_rev} are replaced with\n",
" corresponding value from `other`.\n",
" If other is callable, it is computed on the {klass} and\n",
" should return scalar or {klass}. The callable must not\n",
" change input {klass} (though pandas doesn't check it).\n",
" inplace : bool, default False\n",
" Whether to perform the operation in place on the data.\n",
" axis : int, default None\n",
" Alignment axis if needed.\n",
" level : int, default None\n",
" Alignment level if needed.\n",
" errors : str, {{'raise', 'ignore'}}, default 'raise'\n",
" Note that currently this parameter won't affect\n",
" the results and will always coerce to a suitable dtype.\n",
"\n",
" - 'raise' : allow exceptions to be raised.\n",
" - 'ignore' : suppress exceptions. On error return original object.\n",
"\n",
" try_cast : bool, default False\n",
" Try to cast the result back to the input type (if possible).\n",
"\n",
" Returns\n",
" -------\n",
" Same type as caller\n",
"\n",
" See Also\n",
" --------\n",
" :func:`DataFrame.{name_other}` : Return an object of same shape as\n",
" self.\n",
"\n",
" Notes\n",
" -----\n",
" The {name} method is an application of the if-then idiom. For each\n",
" element in the calling DataFrame, if ``cond`` is ``{cond}`` the\n",
" element is used; otherwise the corresponding element from the DataFrame\n",
" ``other`` is used.\n",
"\n",
" The signature for :func:`DataFrame.where` differs from\n",
" :func:`numpy.where`. Roughly ``df1.where(m, df2)`` is equivalent to\n",
" ``np.where(m, df1, df2)``.\n",
"\n",
" For further details and examples see the ``{name}`` documentation in\n",
" :ref:`indexing <indexing.where_mask>`.\n",
"\n",
" Examples\n",
" --------\n",
" >>> s = pd.Series(range(5))\n",
" >>> s.where(s > 0)\n",
" 0 NaN\n",
" 1 1.0\n",
" 2 2.0\n",
" 3 3.0\n",
" 4 4.0\n",
" dtype: float64\n",
"\n",
" >>> s.mask(s > 0)\n",
" 0 0.0\n",
" 1 NaN\n",
" 2 NaN\n",
" 3 NaN\n",
" 4 NaN\n",
" dtype: float64\n",
"\n",
" >>> s.where(s > 1, 10)\n",
" 0 10\n",
" 1 10\n",
" 2 2\n",
" 3 3\n",
" 4 4\n",
" dtype: int64\n",
"\n",
" >>> df = pd.DataFrame(np.arange(10).reshape(-1, 2), columns=['A', 'B'])\n",
" >>> df\n",
" A B\n",
" 0 0 1\n",
" 1 2 3\n",
" 2 4 5\n",
" 3 6 7\n",
" 4 8 9\n",
" >>> m = df % 3 == 0\n",
" >>> df.where(m, -df)\n",
" A B\n",
" 0 0 -1\n",
" 1 -2 3\n",
" 2 -4 -5\n",
" 3 6 -7\n",
" 4 -8 9\n",
" >>> df.where(m, -df) == np.where(m, df, -df)\n",
" A B\n",
" 0 True True\n",
" 1 True True\n",
" 2 True True\n",
" 3 True True\n",
" 4 True True\n",
" >>> df.where(m, -df) == df.mask(~m, -df)\n",
" A B\n",
" 0 True True\n",
" 1 True True\n",
" 2 True True\n",
" 3 True True\n",
" 4 True True\n",
" \"\"\"\n",
" other = com.apply_if_callable(other, self)\n",
" return self._where(\n",
" cond, other, inplace, axis, level, errors=errors, try_cast=try_cast\n",
" )\n",
"\n",
" def xs(self, key, axis=0, level=None, drop_level: bool_t = True):\n",
" \"\"\"\n",
" Return cross-section from the Series/DataFrame.\n",
"\n",
" This method takes a `key` argument to select data at a particular\n",
" level of a MultiIndex.\n",
"\n",
" Parameters\n",
" ----------\n",
" key : label or tuple of label\n",
" Label contained in the index, or partially in a MultiIndex.\n",
" axis : {0 or 'index', 1 or 'columns'}, default 0\n",
" Axis to retrieve cross-section on.\n",
" level : object, defaults to first n levels (n=1 or len(key))\n",
" In case of a key partially contained in a MultiIndex, indicate\n",
" which levels are used. Levels can be referred by label or position.\n",
" drop_level : bool, default True\n",
" If False, returns object with same levels as self.\n",
"\n",
" Returns\n",
" -------\n",
" Series or DataFrame\n",
" Cross-section from the original Series or DataFrame\n",
" corresponding to the selected index levels.\n",
"\n",
" See Also\n",
" --------\n",
" DataFrame.loc : Access a group of rows and columns\n",
" by label(s) or a boolean array.\n",
" DataFrame.iloc : Purely integer-location based indexing\n",
" for selection by position.\n",
"\n",
" Notes\n",
" -----\n",
" `xs` can not be used to set values.\n",
"\n",
" MultiIndex Slicers is a generic way to get/set values on\n",
" any level or levels.\n",
" It is a superset of `xs` functionality, see\n",
" :ref:`MultiIndex Slicers <advanced.mi_slicers>`.\n",
"\n",
" Examples\n",
" --------\n",
" >>> d = {'num_legs': [4, 4, 2, 2],\n",
" ... 'num_wings': [0, 0, 2, 2],\n",
" ... 'class': ['mammal', 'mammal', 'mammal', 'bird'],\n",
" ... 'animal': ['cat', 'dog', 'bat', 'penguin'],\n",
" ... 'locomotion': ['walks', 'walks', 'flies', 'walks']}\n",
" >>> df = pd.DataFrame(data=d)\n",
" >>> df = df.set_index(['class', 'animal', 'locomotion'])\n",
" >>> df\n",
" num_legs num_wings\n",
" class animal locomotion\n",
" mammal cat walks 4 0\n",
" dog walks 4 0\n",
" bat flies 2 2\n",
" bird penguin walks 2 2\n",
"\n",
" Get values at specified index\n",
"\n",
" >>> df.xs('mammal')\n",
" num_legs num_wings\n",
" animal locomotion\n",
" cat walks 4 0\n",
" dog walks 4 0\n",
" bat flies 2 2\n",
"\n",
" Get values at several indexes\n",
"\n",
" >>> df.xs(('mammal', 'dog'))\n",
" num_legs num_wings\n",
" locomotion\n",
" walks 4 0\n",
"\n",
" Get values at specified index and level\n",
"\n",
" >>> df.xs('cat', level=1)\n",
" num_legs num_wings\n",
" class locomotion\n",
" mammal walks 4 0\n",
"\n",
" Get values at several indexes and levels\n",
"\n",
" >>> df.xs(('bird', 'walks'),\n",
" ... level=[0, 'locomotion'])\n",
" num_legs num_wings\n",
" animal\n",
" penguin 2 2\n",
"\n",
" Get values at specified column and axis\n",
"\n",
" >>> df.xs('num_wings', axis=1)\n",
" class animal locomotion\n",
" mammal cat walks 0\n",
" dog walks 0\n",
" bat flies 2\n",
" bird penguin walks 2\n",
" Name: num_wings, dtype: int64\n",
" \"\"\"\n",
" axis = self._get_axis_number(axis)\n",
" labels = self._get_axis(axis)\n",
" if level is not None:\n",
" if not isinstance(labels, MultiIndex):\n",
" raise TypeError(\"Index must be a MultiIndex\")\n",
" loc, new_ax = labels.get_loc_level(key, level=level, drop_level=drop_level)\n",
"\n",
" # create the tuple of the indexer\n",
" _indexer = [slice(None)] * self.ndim\n",
" _indexer[axis] = loc\n",
" indexer = tuple(_indexer)\n",
"\n",
" result = self.iloc[indexer]\n",
" setattr(result, result._get_axis_name(axis), new_ax)\n",
" return result\n",
"\n",
" if axis == 1:\n",
" return self[key]\n",
"\n",
" self._consolidate_inplace()\n",
"\n",
" index = self.index\n",
" if isinstance(index, MultiIndex):\n",
" loc, new_index = self.index.get_loc_level(key, drop_level=drop_level)\n",
" else:\n",
" loc = self.index.get_loc(key)\n",
"\n",
" if isinstance(loc, np.ndarray):\n",
" if loc.dtype == np.bool_:\n",
" (inds,) = loc.nonzero()\n",
" return self._take_with_is_copy(inds, axis=axis)\n",
" else:\n",
" return self._take_with_is_copy(loc, axis=axis)\n",
"\n",
" if not is_scalar(loc):\n",
" new_index = self.index[loc]\n",
"\n",
" if is_scalar(loc):\n",
" # In this case loc should be an integer\n",
" if self.ndim == 1:\n",
" # if we encounter an array-like and we only have 1 dim\n",
" # that means that their are list/ndarrays inside the Series!\n",
" # so just return them (GH 6394)\n",
" return self._values[loc]\n",
"\n",
" new_values = self._mgr.fast_xs(loc)\n",
"\n",
" result = self._constructor_sliced(\n",
" new_values,\n",
" index=self.columns,\n",
" name=self.index[loc],\n",
" dtype=new_values.dtype,\n",
" )\n",
"\n",
" else:\n",
" result = self.iloc[loc]\n",
" result.index = new_index\n",
"\n",
" # this could be a view\n",
" # but only in a single-dtyped view sliceable case\n",
" result._set_is_copy(self, copy=not result._is_view)\n",
" return result\n",
"\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "684Kor76PwiO"
},
"source": [
"## `trace_func`\n",
"\n",
"Similarly we have `trace_func`, which can trace functions:"
]
},
{
"cell_type": "code",
"metadata": {
"id": "YAZQ2OaYa_tK"
},
"source": [
"def trace_func(func) -> (str, str, str):\n",
"    \"\"\"\n",
"    Traces some function and returns its name, its source code, and the file it is defined in\n",
"\n",
"    Example usage:\n",
"\n",
"    ```\n",
"    from fastai.vision.all import *\n",
"\n",
"    name, source, fname = trace_func(PILImage.create)\n",
"    ```\n",
"    \"\"\"\n",
"    # Bound methods (e.g. classmethods) wrap the real function in `__func__`; plain functions don't have it\n",
"    func = getattr(func, '__func__', func)\n",
"    return func.__name__, inspect.getsource(func), inspect.getsourcefile(func)"
],
"execution_count": 37,
"outputs": []
},
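{
"cell_type": "markdown",
"metadata": {},
"source": [
"A note on `__func__`: accessing a classmethod through its class gives a bound method, which keeps the underlying function in `__func__`, while a plain function has no such attribute. The following is a minimal standard-library sketch of that difference, not fastai code; `Greeter`, `plain`, and `describe` are hypothetical names used only for illustration.\n",
"\n",
"```python\n",
"import inspect\n",
"\n",
"class Greeter:\n",
"    @classmethod\n",
"    def create(cls): return cls()\n",
"\n",
"def plain(): return 1\n",
"\n",
"def describe(func):\n",
"    \"Hypothetical helper: name and source file of `func`, for bound methods and plain functions alike\"\n",
"    # Bound methods keep the real function in `__func__`; otherwise use `func` itself\n",
"    func = getattr(func, \"__func__\", func)\n",
"    return func.__name__, inspect.getsourcefile(func)\n",
"\n",
"name, fname = describe(Greeter.create)\n",
"plain_name, _ = describe(plain)\n",
"```\n",
"\n",
"Here `name` is `'create'` and `plain_name` is `'plain'`, whereas reading `plain.__func__` directly would raise an `AttributeError`."
]
},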
{
"cell_type": "code",
"metadata": {
"id": "duH4jIQCQGER"
},
"source": [
"name, source, fname = trace_func(PILImage.create)"
],
"execution_count": 38,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 35
},
"id": "VBoOBqnVQIW3",
"outputId": "4d41a95a-ebf5-4127-c481-6099686e1895"
},
"source": [
"name"
],
"execution_count": 40,
"outputs": [
{
"output_type": "execute_result",
"data": {
"application/vnd.google.colaboratory.intrinsic+json": {
"type": "string"
},
"text/plain": [
"'create'"
]
},
"metadata": {
"tags": []
},
"execution_count": 40
}
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "fdYY2AYbQKVX",
"outputId": "4be12269-c82f-47ba-9788-4c37ac4a4201"
},
"source": [
"print(source)"
],
"execution_count": 41,
"outputs": [
{
"output_type": "stream",
"text": [
"    @classmethod\n",
"    def create(cls, fn:(Path,str,Tensor,ndarray,bytes), **kwargs)->None:\n",
"        \"Open an `Image` from path `fn`\"\n",
"        if isinstance(fn,TensorImage): fn = fn.permute(1,2,0).type(torch.uint8)\n",
"        if isinstance(fn, TensorMask): fn = fn.type(torch.uint8)\n",
"        if isinstance(fn,Tensor): fn = fn.numpy()\n",
"        if isinstance(fn,ndarray): return cls(Image.fromarray(fn))\n",
"        if isinstance(fn,bytes): fn = io.BytesIO(fn)\n",
"        return cls(load_image(fn, **merge(cls._open_args, kwargs)))\n",
"\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 35
},
"id": "gdVCO6o5QLX2",
"outputId": "dbd231ba-7c7d-469e-c408-c804808d510e"
},
"source": [
"fname"
],
"execution_count": 42,
"outputs": [
{
"output_type": "execute_result",
"data": {
"application/vnd.google.colaboratory.intrinsic+json": {
"type": "string"
},
"text/plain": [
"'/usr/local/lib/python3.7/dist-packages/fastai/vision/core.py'"
]
},
"metadata": {
"tags": []
},
"execution_count": 42
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "VnG_uq1RQSgo"
},
"source": [
"## `trace_dispatch`\n",
"\n",
"fastai's `typedispatch` decorator makes it confusing to read the source code and track behavior, as the decorated implementations are never declared in `__all__` and do not show up when using `??`.\n",
"\n",
"`trace_dispatch` fixes this by pulling the source of *every* implementation registered on a type-dispatched function. Unlike `trace_class`, `trace_dispatch` doesn't need to be told which library to look at:"
]
},
{
"cell_type": "code",
"metadata": {
"id": "Jr5X6Wcac9sT"
},
"source": [
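"def trace_dispatch(func):\n",
"    \"Get the source code and source files of every implementation registered on the type-dispatched `func`\"\n",
"    code, fnames = [], []\n",
"    # `funcs.d` maps each first-argument type to a nested table keyed by the second-argument type\n",
"    type_dict = nested_attr(func, 'funcs.d')\n",
"    for inner in type_dict.values():\n",
"        for impl in inner.d.values():\n",
"            code.append(inspect.getsource(impl))\n",
"            fnames.append(inspect.getsourcefile(impl))\n",
"    return code, fnames"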
],
"execution_count": 43,
"outputs": []
},
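{
"cell_type": "markdown",
"metadata": {},
"source": [
"The general pattern here, a table that maps types to implementations, also exists in the standard library. The sketch below uses `functools.singledispatch`, which is a different mechanism from `typedispatch` (it dispatches on the first argument only) and is shown purely as an analogy; `show` and `impls` are hypothetical names.\n",
"\n",
"```python\n",
"from functools import singledispatch\n",
"\n",
"@singledispatch\n",
"def show(x): return f\"object: {x}\"\n",
"\n",
"@show.register\n",
"def _(x: int): return f\"int: {x}\"\n",
"\n",
"@show.register\n",
"def _(x: str): return f\"str: {x}\"\n",
"\n",
"# `registry` maps each registered type to its implementation\n",
"impls = {typ.__name__: fn for typ, fn in show.registry.items()}\n",
"```\n",
"\n",
"Each value in `impls` is an ordinary function, so it could be handed to `inspect.getsource` in the same way `trace_dispatch` does with the entries it finds."
]
},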
{
"cell_type": "code",
"metadata": {
"id": "t8AtNwiNQkIZ"
},
"source": [
"code, fnames = trace_dispatch(show_results)"
],
"execution_count": 45,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "0y_S3ktKQlNx",
"outputId": "8e69b3b5-299f-4865-fb8d-cd9eb2bf19a9"
},
"source": [
"for c in code: print(c)"
],
"execution_count": 46,
"outputs": [
{
"output_type": "stream",
"text": [
"@typedispatch\n",
"def show_results(x:TensorImage, y:TensorCategory, samples, outs, ctxs=None, max_n=10, nrows=None, ncols=None, figsize=None, **kwargs):\n",
"    if ctxs is None: ctxs = get_grid(min(len(samples), max_n), nrows=nrows, ncols=ncols, add_vert=1, figsize=figsize)\n",
"    for i in range(2):\n",
"        ctxs = [b.show(ctx=c, **kwargs) for b,c,_ in zip(samples.itemgot(i),ctxs,range(max_n))]\n",
"    ctxs = [r.show(ctx=c, color='green' if b==r else 'red', **kwargs)\n",
"            for b,r,c,_ in zip(samples.itemgot(1),outs.itemgot(0),ctxs,range(max_n))]\n",
"    return ctxs\n",
"\n",
"@typedispatch\n",
"def show_results(x:TensorImage, y:(TensorMask, TensorPoint, TensorBBox), samples, outs, ctxs=None, max_n=6,\n",
"                 nrows=None, ncols=1, figsize=None, **kwargs):\n",
"    if ctxs is None: ctxs = get_grid(min(len(samples), max_n), nrows=nrows, ncols=ncols, add_vert=1, figsize=figsize, double=True,\n",
"                                     title='Target/Prediction')\n",
"    for i in range(2):\n",
"        ctxs[::2] = [b.show(ctx=c, **kwargs) for b,c,_ in zip(samples.itemgot(i),ctxs[::2],range(2*max_n))]\n",
"    for o in [samples,outs]:\n",
"        ctxs[1::2] = [b.show(ctx=c, **kwargs) for b,c,_ in zip(o.itemgot(0),ctxs[1::2],range(2*max_n))]\n",
"    return ctxs\n",
"\n",
"@typedispatch\n",
"def show_results(x:TensorImage, y:(TensorMask, TensorPoint, TensorBBox), samples, outs, ctxs=None, max_n=6,\n",
"                 nrows=None, ncols=1, figsize=None, **kwargs):\n",
"    if ctxs is None: ctxs = get_grid(min(len(samples), max_n), nrows=nrows, ncols=ncols, add_vert=1, figsize=figsize, double=True,\n",
"                                     title='Target/Prediction')\n",
"    for i in range(2):\n",
"        ctxs[::2] = [b.show(ctx=c, **kwargs) for b,c,_ in zip(samples.itemgot(i),ctxs[::2],range(2*max_n))]\n",
"    for o in [samples,outs]:\n",
"        ctxs[1::2] = [b.show(ctx=c, **kwargs) for b,c,_ in zip(o.itemgot(0),ctxs[1::2],range(2*max_n))]\n",
"    return ctxs\n",
"\n",
"@typedispatch\n",
"def show_results(x:TensorImage, y:(TensorMask, TensorPoint, TensorBBox), samples, outs, ctxs=None, max_n=6,\n",
"                 nrows=None, ncols=1, figsize=None, **kwargs):\n",
"    if ctxs is None: ctxs = get_grid(min(len(samples), max_n), nrows=nrows, ncols=ncols, add_vert=1, figsize=figsize, double=True,\n",
"                                     title='Target/Prediction')\n",
"    for i in range(2):\n",
"        ctxs[::2] = [b.show(ctx=c, **kwargs) for b,c,_ in zip(samples.itemgot(i),ctxs[::2],range(2*max_n))]\n",
"    for o in [samples,outs]:\n",
"        ctxs[1::2] = [b.show(ctx=c, **kwargs) for b,c,_ in zip(o.itemgot(0),ctxs[1::2],range(2*max_n))]\n",
"    return ctxs\n",
"\n",
"@typedispatch\n",
"def show_results(x:TensorImage, y:TensorImage, samples, outs, ctxs=None, max_n=10, figsize=None, **kwargs):\n",
"    if ctxs is None: ctxs = get_grid(3*min(len(samples), max_n), ncols=3, figsize=figsize, title='Input/Target/Prediction')\n",
"    for i in range(2):\n",
"        ctxs[i::3] = [b.show(ctx=c, **kwargs) for b,c,_ in zip(samples.itemgot(i),ctxs[i::3],range(max_n))]\n",
"    ctxs[2::3] = [b.show(ctx=c, **kwargs) for b,c,_ in zip(outs.itemgot(0),ctxs[2::3],range(max_n))]\n",
"    return ctxs\n",
"\n",
"@typedispatch\n",
"def show_results(x:TensorImage, y, samples, outs, ctxs=None, max_n=10, nrows=None, ncols=None, figsize=None, **kwargs):\n",
"    if ctxs is None: ctxs = get_grid(min(len(samples), max_n), nrows=nrows, ncols=ncols, add_vert=1, figsize=figsize)\n",
"    ctxs = show_results[object](x, y, samples, outs, ctxs=ctxs, max_n=max_n, **kwargs)\n",
"    return ctxs\n",
"\n",
"@typedispatch\n",
"def show_results(x, y, samples, outs, ctxs=None, max_n=9, **kwargs):\n",
"    if ctxs is None: ctxs = Inf.nones\n",
"    for i in range(len(samples[0])):\n",
"        ctxs = [b.show(ctx=c, **kwargs) for b,c,_ in zip(samples.itemgot(i),ctxs,range(max_n))]\n",
"    for i in range(len(outs[0])):\n",
"        ctxs = [b.show(ctx=c, **kwargs) for b,c,_ in zip(outs.itemgot(i),ctxs,range(max_n))]\n",
"    return ctxs\n",
"\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "z-fXQEM7QoCl",
"outputId": "4e526331-7f27-43db-a396-9dd14b04dd3d"
},
"source": [
"fnames"
],
"execution_count": 47,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"['/usr/local/lib/python3.7/dist-packages/fastai/vision/learner.py',\n",
" '/usr/local/lib/python3.7/dist-packages/fastai/vision/learner.py',\n",
" '/usr/local/lib/python3.7/dist-packages/fastai/vision/learner.py',\n",
" '/usr/local/lib/python3.7/dist-packages/fastai/vision/learner.py',\n",
" '/usr/local/lib/python3.7/dist-packages/fastai/vision/learner.py',\n",
" '/usr/local/lib/python3.7/dist-packages/fastai/vision/learner.py',\n",
" '/usr/local/lib/python3.7/dist-packages/fastai/data/core.py']"
]
},
"metadata": {
"tags": []
},
"execution_count": 47
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "e6AotzQpQoT-"
},
"source": [
""
],
"execution_count": null,
"outputs": []
}
]
}