@devops-school
Created December 1, 2024 11:43
PyTorch Lab - 5 - Pytorch AutogradIntro
{
"cells": [
{
"cell_type": "code",
"execution_count": 60,
"metadata": {},
"outputs": [],
"source": [
"import torch"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### We dont need to specify requires_grad = False, since by default it flags it as False"
]
},
{
"cell_type": "code",
"execution_count": 61,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"tensor([[1., 2., 3.],\n",
" [4., 5., 6.]])"
]
},
"execution_count": 61,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"tensor1 = torch.Tensor([[1, 2, 3], \n",
" [4, 5, 6]])\n",
"tensor1"
]
},
{
"cell_type": "code",
"execution_count": 62,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"tensor([[ 7., 8., 9.],\n",
" [10., 11., 12.]])"
]
},
"execution_count": 62,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"tensor2 = torch.Tensor([[7, 8, 9], \n",
" [10, 11, 12]])\n",
"\n",
"tensor2"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### The requires_grad property defines whether to track operations on this tensor\n",
"By default, it is set to False"
]
},
{
"cell_type": "code",
"execution_count": 63,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"False"
]
},
"execution_count": 63,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"tensor1.requires_grad"
]
},
{
"cell_type": "code",
"execution_count": 64,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"False"
]
},
"execution_count": 64,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"tensor2.requires_grad"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### The requires\\_grad\\_() function sets requires_grad to True"
]
},
{
"cell_type": "code",
"execution_count": 65,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"tensor([[1., 2., 3.],\n",
" [4., 5., 6.]], requires_grad=True)"
]
},
"execution_count": 65,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"tensor1.requires_grad_()"
]
},
{
"cell_type": "code",
"execution_count": 66,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 66,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"tensor1.requires_grad"
]
},
{
"cell_type": "code",
"execution_count": 67,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"False"
]
},
"execution_count": 67,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"tensor2.requires_grad"
]
},
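{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### You can also enable gradient tracking at creation time\n",
"A minimal sketch (t is a hypothetical tensor): torch.tensor() accepts a requires_grad argument directly"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# requires_grad can be set when the tensor is created\n",
"t = torch.tensor([[1.0, 2.0]], requires_grad=True)\n",
"print(t.requires_grad)"
]
},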
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### The .grad property stores all the gradients for the tensor\n",
"However, there are no gradients yet"
]
},
{
"cell_type": "code",
"execution_count": 68,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"None\n"
]
}
],
"source": [
"print(tensor1.grad)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### The .grad_fn property contains the gradient function\n",
"This has not been set either"
]
},
{
"cell_type": "code",
"execution_count": 69,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"None\n"
]
}
],
"source": [
"print(tensor1.grad_fn)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Create a new output tensor from our original tensor"
]
},
{
"cell_type": "code",
"execution_count": 70,
"metadata": {},
"outputs": [],
"source": [
"output_tensor = tensor1 * tensor2"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### The requires_grad property has been derived from the original tensor"
]
},
{
"cell_type": "code",
"execution_count": 71,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 71,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"output_tensor.requires_grad"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### There are still no gradients"
]
},
{
"cell_type": "code",
"execution_count": 72,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"None\n"
]
}
],
"source": [
"print(output_tensor.grad)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### But there is a gradient function\n",
"This is from the multiplication operation performed on the original tensor "
]
},
{
"cell_type": "code",
"execution_count": 73,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<MulBackward0 object at 0x113472ac8>\n"
]
}
],
"source": [
"print(output_tensor.grad_fn)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### The original tensor still does not have a gradient function"
]
},
{
"cell_type": "code",
"execution_count": 74,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"None\n"
]
}
],
"source": [
"print(tensor1.grad_fn)"
]
},
{
"cell_type": "code",
"execution_count": 75,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"None\n"
]
}
],
"source": [
"print(tensor2.grad_fn)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Changing the operation for the output changes the gradient function\n",
"The gradient function only contains the last operation. Here, even though there is a multiplication as well as a mean, only the mean calculation is recorded as the gradient function"
]
},
{
"cell_type": "code",
"execution_count": 76,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<MeanBackward1 object at 0x113472c50>\n"
]
}
],
"source": [
"output_tensor = (tensor1 * tensor2).mean()\n",
"print(output_tensor.grad_fn)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### In spite of setting a gradient function for the output, the gradients for the input tensor is still empty"
]
},
{
"cell_type": "code",
"execution_count": 77,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"None\n"
]
}
],
"source": [
"print(tensor1.grad)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### To calculate the gradients, we need to explicitly perform a backward propagation"
]
},
{
"cell_type": "code",
"execution_count": 78,
"metadata": {},
"outputs": [],
"source": [
"output_tensor.backward()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### The gradients are now available for the input tensor\n",
"\n",
"Future calls to backward will accumulate gradients into this vector"
]
},
{
"cell_type": "code",
"execution_count": 80,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"tensor([[1.1667, 1.3333, 1.5000],\n",
" [1.6667, 1.8333, 2.0000]])\n"
]
}
],
"source": [
"print(tensor1.grad)"
]
},
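{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Since gradients accumulate, a second backward pass adds to .grad\n",
"A minimal sketch (not executed above): tensor1.grad.zero_() resets the accumulated gradients in place"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Gradients accumulate across backward() calls,\n",
"# so .grad grows after a second identical pass\n",
"output_tensor = (tensor1 * tensor2).mean()\n",
"output_tensor.backward()\n",
"print(tensor1.grad)\n",
"\n",
"# Reset the accumulated gradients in place\n",
"tensor1.grad.zero_()\n",
"print(tensor1.grad)"
]
},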
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### The gradient vector is the same shape as the original vector"
]
},
{
"cell_type": "code",
"execution_count": 82,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(torch.Size([2, 3]), torch.Size([2, 3]))"
]
},
"execution_count": 82,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"tensor1.grad.shape, tensor1.shape"
]
},
{
"cell_type": "code",
"execution_count": 83,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"None\n"
]
}
],
"source": [
"print(tensor2.grad)"
]
},
{
"cell_type": "code",
"execution_count": 84,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"None\n"
]
}
],
"source": [
"print(output_tensor.grad)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### The requires_grad property propagates to other tensors\n",
"Here the new_tensor is created from the original tensor and gets the original's value of requires_grad"
]
},
{
"cell_type": "code",
"execution_count": 85,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"True\n"
]
}
],
"source": [
"new_tensor = tensor1 * 3\n",
"print(new_tensor.requires_grad)"
]
},
{
"cell_type": "code",
"execution_count": 86,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"tensor([[ 3., 6., 9.],\n",
" [12., 15., 18.]], grad_fn=<MulBackward0>)"
]
},
"execution_count": 86,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"new_tensor"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Turning off gradient calculations for tensors\n",
"You can also stops autograd from tracking history on newly created tensors with requires_grad=True by wrapping the code block in <br />\n",
"<b>with torch.no_grad():</b>"
]
},
{
"cell_type": "code",
"execution_count": 87,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"new_tensor = tensor([[ 3., 6., 9.],\n",
" [12., 15., 18.]])\n",
"requires_grad for tensor = True\n",
"requires_grad for tensor = False\n",
"requires_grad for new_tensor = False\n"
]
}
],
"source": [
"with torch.no_grad():\n",
" \n",
" new_tensor = tensor1 * 3\n",
" \n",
" print('new_tensor = ', new_tensor)\n",
" \n",
" print('requires_grad for tensor = ', tensor1.requires_grad)\n",
" \n",
" print('requires_grad for tensor = ', tensor2.requires_grad)\n",
" \n",
" print('requires_grad for new_tensor = ', new_tensor.requires_grad)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Can turn off gradient calculations performed within a function"
]
},
{
"cell_type": "code",
"execution_count": 88,
"metadata": {},
"outputs": [],
"source": [
"def calculate(t):\n",
" return t * 2"
]
},
{
"cell_type": "code",
"execution_count": 89,
"metadata": {},
"outputs": [],
"source": [
"@torch.no_grad()\n",
"def calculate_with_no_grad(t):\n",
" return t * 2"
]
},
{
"cell_type": "code",
"execution_count": 90,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"tensor([[ 2., 4., 6.],\n",
" [ 8., 10., 12.]], grad_fn=<MulBackward0>)"
]
},
"execution_count": 90,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"result_tensor = calculate(tensor1)\n",
"\n",
"result_tensor"
]
},
{
"cell_type": "code",
"execution_count": 91,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 91,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"result_tensor.requires_grad"
]
},
{
"cell_type": "code",
"execution_count": 92,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"tensor([[ 2., 4., 6.],\n",
" [ 8., 10., 12.]])"
]
},
"execution_count": 92,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"result_tensor_no_grad = calculate_with_no_grad(tensor1)\n",
"\n",
"result_tensor_no_grad"
]
},
{
"cell_type": "code",
"execution_count": 93,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"False"
]
},
"execution_count": 93,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"result_tensor_no_grad.requires_grad"
]
},
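{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### You can query whether gradient tracking is currently enabled\n",
"A minimal sketch using torch.is_grad_enabled(), which reflects the surrounding context"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# True at the top level, False inside a no_grad() block\n",
"print(torch.is_grad_enabled())\n",
"\n",
"with torch.no_grad():\n",
"    print(torch.is_grad_enabled())"
]
},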
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Can explicitly enabled gradients within a no_grad() context\n",
"\n",
"There is an equivalent @torch.enable_grad() as well"
]
},
{
"cell_type": "code",
"execution_count": 96,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"new_tensor_no_grad = tensor([[ 3., 6., 9.],\n",
" [12., 15., 18.]])\n",
"new_tensor_grad = tensor([[ 3., 6., 9.],\n",
" [12., 15., 18.]], grad_fn=<MulBackward0>)\n"
]
}
],
"source": [
"with torch.no_grad():\n",
" \n",
" new_tensor_no_grad = tensor1 * 3\n",
" \n",
" print('new_tensor_no_grad = ', new_tensor_no_grad)\n",
" \n",
" with torch.enable_grad():\n",
" \n",
" new_tensor_grad = tensor1 * 3\n",
" \n",
" print('new_tensor_grad = ', new_tensor_grad)"
]
},
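{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### The @torch.enable_grad() decorator form, as a sketch\n",
"calculate_with_grad is a hypothetical helper; even when called inside a no_grad() block, its operations are tracked"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"@torch.enable_grad()\n",
"def calculate_with_grad(t):\n",
"    return t * 2\n",
"\n",
"# Called under no_grad(), the decorated function still tracks gradients\n",
"with torch.no_grad():\n",
"    result = calculate_with_grad(tensor1)\n",
"\n",
"print(result.requires_grad)"
]
},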
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Result tensors get requires_grad properties from input tensors"
]
},
{
"cell_type": "code",
"execution_count": 98,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"tensor([[1., 2.],\n",
" [3., 4.]], requires_grad=True)"
]
},
"execution_count": 98,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"tensor_one = torch.tensor([[1.0, 2.0], \n",
" [3.0, 4.0]], requires_grad=True) \n",
"tensor_one"
]
},
{
"cell_type": "code",
"execution_count": 99,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"tensor([[5., 6.],\n",
" [7., 8.]])"
]
},
"execution_count": 99,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"tensor_two = torch.Tensor([[5, 6], \n",
" [7, 8]])\n",
"tensor_two"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### enable the gradients for two tensors"
]
},
{
"cell_type": "code",
"execution_count": 100,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 100,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"tensor_one.requires_grad"
]
},
{
"cell_type": "code",
"execution_count": 101,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"tensor([[5., 6.],\n",
" [7., 8.]], requires_grad=True)"
]
},
"execution_count": 101,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"tensor_two.requires_grad_()"
]
},
{
"cell_type": "code",
"execution_count": 102,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"tensor(9., grad_fn=<MeanBackward1>)"
]
},
"execution_count": 102,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"final_tensor = (tensor_one + tensor_two).mean()\n",
"final_tensor"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### final tensor has gradients enabled as it derives from the tensors its made up of"
]
},
{
"cell_type": "code",
"execution_count": 103,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 103,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"final_tensor.requires_grad"
]
},
{
"cell_type": "code",
"execution_count": 104,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"None\n"
]
}
],
"source": [
"print(tensor_one.grad)"
]
},
{
"cell_type": "code",
"execution_count": 105,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"None\n"
]
}
],
"source": [
"print(tensor_two.grad)"
]
},
{
"cell_type": "code",
"execution_count": 106,
"metadata": {},
"outputs": [],
"source": [
"final_tensor.backward()"
]
},
{
"cell_type": "code",
"execution_count": 107,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"tensor([[0.2500, 0.2500],\n",
" [0.2500, 0.2500]])\n"
]
}
],
"source": [
"print(tensor_one.grad)"
]
},
{
"cell_type": "code",
"execution_count": 108,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"tensor([[0.2500, 0.2500],\n",
" [0.2500, 0.2500]])\n"
]
}
],
"source": [
"print(tensor_two.grad)"
]
},
{
"cell_type": "code",
"execution_count": 109,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"None\n"
]
}
],
"source": [
"print(final_tensor.grad)"
]
},
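{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### final_tensor.grad is None because it is a non-leaf tensor\n",
"Autograd populates .grad only on leaf tensors by default; a minimal sketch using retain_grad() to keep the gradient on an intermediate result"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# retain_grad() asks autograd to keep .grad on a non-leaf tensor\n",
"retained = (tensor_one + tensor_two).mean()\n",
"retained.retain_grad()\n",
"retained.backward()\n",
"print(retained.grad)"
]
},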
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Detach tensors from the computation graph"
]
},
{
"cell_type": "code",
"execution_count": 111,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"tensor([[1., 2.],\n",
" [3., 4.]])"
]
},
"execution_count": 111,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"detached_tensor = tensor_one.detach()\n",
"\n",
"detached_tensor"
]
},
{
"cell_type": "code",
"execution_count": 112,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"tensor([[1., 2.],\n",
" [3., 4.]], requires_grad=True)"
]
},
"execution_count": 112,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"tensor_one"
]
},
{
"cell_type": "code",
"execution_count": 113,
"metadata": {},
"outputs": [],
"source": [
"mean_tensor = (tensor_one + detached_tensor).mean()\n",
"\n",
"mean_tensor.backward()"
]
},
{
"cell_type": "code",
"execution_count": 114,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"tensor([[0.5000, 0.5000],\n",
" [0.5000, 0.5000]])"
]
},
"execution_count": 114,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"tensor_one.grad"
]
},
{
"cell_type": "code",
"execution_count": 116,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"None\n"
]
}
],
"source": [
"print(detached_tensor.grad)"
]
},
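{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### A detached tensor shares storage with the original\n",
"A minimal sketch: detach() does not copy the data, so both tensors point at the same memory"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Same underlying storage, but no gradient tracking on the detached view\n",
"print(detached_tensor.data_ptr() == tensor_one.data_ptr())\n",
"print(detached_tensor.requires_grad)"
]
},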
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.1"
}
},
"nbformat": 4,
"nbformat_minor": 2
}