Skip to content

Instantly share code, notes, and snippets.

@Pymmdrza
Last active April 11, 2025 16:32
Show Gist options
  • Save Pymmdrza/2fce3d4fccd1230c1d1b22e1bf91a9ae to your computer and use it in GitHub Desktop.
Save Pymmdrza/2fce3d4fccd1230c1d1b22e1bf91a9ae to your computer and use it in GitHub Desktop.
Python Professional Write a Clean Code

Clean Code in Python

Table of Contents

  1. Introduction
  2. Variables
  3. Functions
  4. Classes
  5. Don't repeat yourself (DRY)

Introduction

Software engineering principles, from the book Clean Code by Robert C. Martin, adapted for Python. This is not a style guide. It's a guide to producing readable, reusable, and refactorable software in Python. Not every principle herein has to be strictly followed, and even fewer will be universally agreed upon. These are guidelines and not gospel, but they are codified over many years of collective experience by the authors of Clean Code.

Variables

Use meaningful and pronounceable variable names

Bad:

import datetime


ymdstr = datetime.date.today().strftime("%y-%m-%d")

Moreover, there is no need to use the type str for the name.

Good:

import datetime


current_date: str = datetime.date.today().strftime("%y-%m-%d")

Use the same vocabulary for the same type of variable

Bad: We use multiple names for the same entity here:

def get_user_info(): pass
def get_client_data(): pass
def get_customer_record(): pass

Good: If the entity is the same, you should be consistent in referencing it in your functions:

def get_user_info(): pass
def get_user_data(): pass
def get_user_record(): pass

Even better

Python is (also) an Object-Oriented Programming language. If it makes sense, bundle functions together with the specific implementation of the entity in your code, as instance attributes, property methods, or methods:

from typing import Union, Dict


class Record:
    pass


class User:
    info : str

    @property
    def data(self) -> Dict[str, str]:
        return {}

    def get_record(self) -> Union[Record, None]:
        return Record()

Use searchable names

We will read more code than we will ever write. It's important that the code we do write is readable and searchable. By not naming variables that end up being meaningful for understanding our program, we hurt our readers. Make your names searchable.

Bad:

import time


# What is the number 86400 for again?
time.sleep(86400)

Good:

import time


# Declare them in the global namespace for the module.
SECONDS_IN_A_DAY = 60 * 60 * 24
time.sleep(SECONDS_IN_A_DAY)

Use explanatory variables

Bad:

import re


address = "One Infinite Loop, Cupertino 95014"
city_zip_code_regex = r"^[^,\\]+[,\\\s]+(.+?)\s*(\d{5})?$"

matches = re.match(city_zip_code_regex, address)
if matches:
    print(f"{matches[1]}: {matches[2]}")

Not bad:

Better, but still heavily dependent on the regex.

import re


address = "One Infinite Loop, Cupertino 95014"
city_zip_code_regex = r"^[^,\\]+[,\\\s]+(.+?)\s*(\d{5})?$"
matches = re.match(city_zip_code_regex, address)

if matches:
    city, zip_code = matches.groups()
    print(f"{city}: {zip_code}")

Good:

Reduce the dependency on the regex by naming the subgroups.

import re


address = "One Infinite Loop, Cupertino 95014"
city_zip_code_regex = r"^[^,\\]+[,\\\s]+(?P<city>.+?)\s*(?P<zip_code>\d{5})?$"

matches = re.match(city_zip_code_regex, address)
if matches:
    print(f"{matches['city']}, {matches['zip_code']}")

Avoid Mental Mapping

Don't force your code readers to translate variable names. Explicit is better than implicit.

Bad:

seq = ("Austin", "New York", "San Francisco")

for item in seq:
    #do_stuff()
    #do_some_other_stuff()

    # Wait, what's `item` again?
    print(item)

Good:

locations = ("Austin", "New York", "San Francisco")

for location in locations:
    #do_stuff()
    #do_some_other_stuff()
    # ...
    print(location)

Don't add unneeded context

If the class/object name tells you something, don't repeat that in your variable name.

Bad:

class Car:
    car_make: str
    car_model: str
    car_color: str

Good:

class Car:
    make: str
    model: str
    color: str

Use default arguments instead of short-circuiting or conditionals.

Bad

Why write:

import hashlib


def create_micro_brewery(name):
    name = "Hipster Brew Co." if name is None else name
    slug = hashlib.sha1(name.encode()).hexdigest()
    # etc.

...when you can use a default argument? This also makes it crystal clear to the reader that you expect a string for your argument.

Good:

import hashlib


def create_micro_brewery(name: str = "Hipster Brew Co."):
    slug = hashlib.sha1(name.encode()).hexdigest()
    # etc.

Functions

Function arguments (2 or fewer ideally)

Limiting the number of function parameters is incredibly important because it makes testing your function easier. Having more than three leads to a combinatorial explosion where you have to test tons of different cases with each separate argument.

Zero arguments is the ideal case. One or two arguments is okay, and three should be avoided. Anything more should be consolidated. Usually, if you have more than two arguments then your function is trying to do too much. In cases where it's not, most of the time a higher-level object will suffice as an argument.

Bad:

def create_menu(title, body, button_text, cancellable):
    pass

Java-esque:

class Menu:
    def __init__(self, config: dict):
        self.title = config["title"]
        self.body = config["body"]
        # ...

menu = Menu(
    {
        "title": "My Menu",
        "body": "Something about my menu",
        "button_text": "OK",
        "cancellable": False
    }
)

This is fine too

class MenuConfig:
    """A configuration for the Menu.

    Attributes:
        title: The title of the Menu.
        body: The body of the Menu.
        button_text: The text for the button label.
        cancellable: Can it be cancelled?
    """
    title: str
    body: str
    button_text: str
    cancellable: bool = False


def create_menu(config: MenuConfig) -> None:
    title = config.title
    body = config.body
    # ...


config = MenuConfig()
config.title = "My delicious menu"
config.body = "A description of the various items on the menu"
config.button_text = "Order now!"
# The instance attribute overrides the default class attribute.
config.cancellable = True

create_menu(config)

Fancy

from typing import NamedTuple


class MenuConfig(NamedTuple):
    """A configuration for the Menu.

    Attributes:
        title: The title of the Menu.
        body: The body of the Menu.
        button_text: The text for the button label.
        cancellable: Can it be cancelled?
    """
    title: str
    body: str
    button_text: str
    cancellable: bool = False


def create_menu(config: MenuConfig):
    title, body, button_text, cancellable = config
    # ...


create_menu(
    MenuConfig(
        title="My delicious menu",
        body="A description of the various items on the menu",
        button_text="Order now!"
    )
)

Fancier

from dataclasses import astuple, dataclass


@dataclass
class MenuConfig:
    """A configuration for the Menu.

    Attributes:
        title: The title of the Menu.
        body: The body of the Menu.
        button_text: The text for the button label.
        cancellable: Can it be cancelled?
    """
    title: str
    body: str
    button_text: str
    cancellable: bool = False

def create_menu(config: MenuConfig):
    title, body, button_text, cancellable = astuple(config)
    # ...


create_menu(
    MenuConfig(
        title="My delicious menu",
        body="A description of the various items on the menu",
        button_text="Order now!"
    )
)

Even fancier, Python 3.8+ only

from typing import TypedDict


class MenuConfig(TypedDict):
    """A configuration for the Menu.

    Attributes:
        title: The title of the Menu.
        body: The body of the Menu.
        button_text: The text for the button label.
        cancellable: Can it be cancelled?
    """
    title: str
    body: str
    button_text: str
    cancellable: bool


def create_menu(config: MenuConfig):
    title = config["title"]
    # ...


create_menu(
    # You need to supply all the parameters
    MenuConfig(
        title="My delicious menu",
        body="A description of the various items on the menu",
        button_text="Order now!",
        cancellable=True
    )
)

Functions should do one thing

This is by far the most important rule in software engineering. When functions do more than one thing, they are harder to compose, test, and reason about. When you can isolate a function to just one action, they can be refactored easily and your code will read much cleaner. If you take away anything from this guide other than this, you'll be ahead of many developers.

Bad:

from typing import List


class Client:
    active: bool


def email(client: Client) -> None:
    pass


def email_clients(clients: List[Client]) -> None:
    """Filter active clients and send them an email.
    """
    for client in clients:
        if client.active:
            email(client)

Good:

from typing import List


class Client:
    active: bool


def email(client: Client) -> None:
    pass


def get_active_clients(clients: List[Client]) -> List[Client]:
    """Filter active clients.
    """
    return [client for client in clients if client.active]


def email_clients(clients: List[Client]) -> None:
    """Send an email to a given list of clients.
    """
    for client in get_active_clients(clients):
        email(client)

Do you spot an opportunity to use generators now?

Even better

from typing import Generator, Iterator


class Client:
    active: bool


def email(client: Client):
    pass


def active_clients(clients: Iterator[Client]) -> Generator[Client, None, None]:
    """Only active clients"""
    return (client for client in clients if client.active)


def email_client(clients: Iterator[Client]) -> None:
    """Send an email to a given list of clients.
    """
    for client in active_clients(clients):
        email(client)

Function names should say what they do

Bad:

class Email:
    def handle(self) -> None:
        pass

message = Email()
# What is this supposed to do again?
message.handle()

Good:

class Email:
    def send(self) -> None:
        """Send this message"""

message = Email()
message.send()

Functions should only be one level of abstraction

When you have more than one level of abstraction your function is usually doing too much. Splitting up functions leads to reusability and easier testing.

Bad:

# type: ignore

def parse_better_js_alternative(code: str) -> None:
    regexes = [
        # ...
    ]

    statements = code.split('\n')
    tokens = []
    for regex in regexes:
        for statement in statements:
            pass

    ast = []
    for token in tokens:
        pass

    for node in ast:
        pass

Good:

from typing import Tuple, List, Dict


REGEXES: Tuple = (
   # ...
)


def parse_better_js_alternative(code: str) -> None:
    tokens: List = tokenize(code)
    syntax_tree: List = parse(tokens)

    for node in syntax_tree:
        pass


def tokenize(code: str) -> List:
    statements = code.split()
    tokens: List[Dict] = []
    for regex in REGEXES:
        for statement in statements:
            pass

    return tokens


def parse(tokens: List) -> List:
    syntax_tree: List[Dict] = []
    for token in tokens:
        pass

    return syntax_tree

Don't use flags as function parameters

Flags tell your user that this function does more than one thing. Functions should do one thing. Split out your functions if they are following different code paths based on a boolean.

Bad:

from tempfile import gettempdir
from pathlib import Path


def create_file(name: str, temp: bool) -> None:
    if temp:
        (Path(gettempdir()) / name).touch()
    else:
        Path(name).touch()

Good:

from tempfile import gettempdir
from pathlib import Path


def create_file(name: str) -> None:
    Path(name).touch()


def create_temp_file(name: str) -> None:
    (Path(gettempdir()) / name).touch()

Avoid Side Effects

A function produces a side effect if it does anything other than take a value in and return another value or values. A side effect could be writing to a file, modifying some global variable, or accidentally wiring all your money to a stranger.

Now, you do need to have side effects in a program on occasion. Like the previous example, you might need to write to a file. What you want to do is to centralize where you are doing this. Don't have several functions and classes that write to a particular file. Have one service that does it. One and only one.

The main point is to avoid common pitfalls like sharing state between objects without any structure, using mutable data types that can be written to by anything, or not centralizing where your side effects occur. If you can do this, you will be happier than the vast majority of other programmers.

Bad:

# type: ignore

# This is a module-level name.
# It's good practice to define these as immutable values, such as a string.
# However...
fullname = "Ryan McDermott"

def split_into_first_and_last_name() -> None:
    # The use of the global keyword here is changing the meaning of the
    # the following line. This function is now mutating the module-level
    # state and introducing a side-effect!
    global fullname
    fullname = fullname.split()

split_into_first_and_last_name()

# MyPy will spot the problem, complaining about 'Incompatible types in
# assignment: (expression has type "List[str]", variable has type "str")'
print(fullname)  # ["Ryan", "McDermott"]

# OK. It worked the first time, but what will happen if we call the
# function again?

Good:

from typing import List, AnyStr


def split_into_first_and_last_name(name: AnyStr) -> List[AnyStr]:
    return name.split()

fullname = "Ryan McDermott"
name, surname = split_into_first_and_last_name(fullname)

print(name, surname)  # => Ryan McDermott

Also Good :

from dataclasses import dataclass


@dataclass
class Person:
    name: str

    @property
    def name_as_first_and_last(self) -> list:
        return self.name.split()


# The reason why we create instances of classes is to manage state!
person = Person("Ryan McDermott")
print(person.name)  # => "Ryan McDermott"
print(person.name_as_first_and_last)  # => ["Ryan", "McDermott"]

Classes

Single Responsibility Principle (SRP)

As stated by Robert C. Martin:

A class should have one, and only one, reason to change.

These "reasons to change" are inherently the responsibilities handled by a class or function.

In the following example, we create an HTML element representing a comment with a version number.

Bad :

from importlib import metadata


class VersionCommentElement:
   """An element that renders an HTML comment with the program's version number
   """

   def get_version(self) -> str:
       """Get the package version"""
       return metadata.version("pip")

   def render(self) -> None:
       print(f'<!-- Version: {self.get_version()} -->')


VersionCommentElement().render()

This class has two responsibilities:

  • Retrieving the version of a Python package.
  • Rendering it into an HTML comment element.

Any changes to one of these responsibilities risk affecting the other. We can refactor the class to segregate the responsibilities.

Good :

from importlib import metadata


def get_version(pkg_name:str) -> str:
    """Retrieve the version of a given package"""
    return metadata.version(pkg_name)


class VersionCommentElement:
   """An element that renders an HTML comment with the program's version number
   """

   def __init__(self, version: str):
       self.version = version

   def render(self) -> None:
       print(f'<!-- Version: {self.version} -->')


VersionCommentElement(get_version("pip")).render()

The result is that the class now only handles the rendering. It receives the version during instantiation, and the text is retrieved by a separate get_version() function. Changing the class won't affect the other, and vice-versa, as long as the contract between them doesn't change (i.e., the function returns a string, and the class's __init__ method accepts a string).

Also, the get_version() function is now reusable elsewhere.

Open/Closed Principle (OCP)

“Entities should be open for extension, but closed for modification.” Uncle Bob.

Objects should be open for extension but closed for modification. It should be possible to enhance the functionality provided by an object (e.g., a class) without altering its internal contract. An object achieves this capability when it is written in a way that allows for easy extension.

In the next example, we attempt to implement a simple web framework that handles HTTP requests and returns a response. The View class has a .get() method that will be called when the HTTP server receives a GET request.

The View class is intentionally simple and returns text/plain responses. We also want HTML responses based on template files, so we subclass it with TemplateView.

Bad:

from dataclasses import dataclass


@dataclass
class Response:
    """An HTTP response"""

    status: int
    content_type: str
    body: str


class View:
    """A simple view that returns plain text responses"""

    def get(self, request) -> Response:
        """Handle a GET request and return a message in the response"""
        return Response(
            status=200,
            content_type='text/plain',
            body="Welcome to my web site"
        )


class TemplateView(View):
    """A view that returns HTML responses based on a template file."""

    def get(self, request) -> Response:
        """Handle a GET request and return an HTML document in the response"""
        with open("index.html") as fd:
            return Response(
                status=200,
                content_type='text/html',
                body=fd.read()
            )

The TemplateView class modifies the internal behavior of its parent to add more advanced functionality. By doing so, TemplateView relies on its parent's .get() implementation not changing, which must now be frozen in time. For example, we cannot introduce some additional checks across all classes derived from View because this behavior has been overridden in at least one subclass, and we would need to update it too.

Let's refactor our classes to fix this issue and allow the View class to be cleanly extended (not modified).

Good :

from dataclasses import dataclass


@dataclass
class Response:
    """An HTTP response"""

    status: int
    content_type: str
    body: str


class View:
    """A simple view that returns plain text responses"""

    content_type = "text/plain"

    def render_body(self) -> str:
        """Render the message body of the response"""
        return "Welcome to my web site"

    def get(self, request) -> Response:
        """Handle a GET request and return a message in the response"""
        return Response(
            status=200,
            content_type=self.content_type,
            body=self.render_body()
        )


class TemplateView(View):
    """A view that returns HTML responses based on a template file."""

    content_type = "text/html"
    template_file = "index.html"

    def render_body(self) -> str:
        """Render the message body as HTML"""
        with open(self.template_file) as fd:
            return fd.read()

Note that we had to override render_body() to change the source of the body, but this method has a single, well-defined responsibility that explicitly invites subclasses to override it. It's designed to allow extension by subclasses. Another good way to harness the power of inheritance and object composition is to use Mixins.

Mixins are simple, root classes that are used exclusively with other related classes. They are "mixed" into the target class using multiple inheritance to alter the target's behavior.

A few rules:

  • Mixins must inherit from object.
  • Mixins always come before the target class. For example:

class Foo(MixinA, MixinB, TargetClass): ...

Also good :

from dataclasses import dataclass, field
from typing import Protocol


@dataclass
class Response:
    """An HTTP response"""

    status: int
    content_type: str
    body: str
    headers: dict = field(default_factory=dict)


class View:
    """A simple view that returns plain text responses"""

    content_type = "text/plain"

    def render_body(self) -> str:
        """Render the message body of the response"""
        return "Welcome to my web site"

    def get(self, request) -> Response:
        """Handle a GET request and return a message in the response"""
        return Response(
            status=200,
            content_type=self.content_type,
            body=self.render_body()
        )


class TemplateRenderMixin:
    """A mixin class for views that render HTML documents using a template file

    Not to be used by itself!
    """
    template_file: str = ""

    def render_body(self) -> str:
        """Render the message body as HTML"""
        if not self.template_file:
            raise ValueError("The path to a template file must be given.")

        with open(self.template_file) as fd:
            return fd.read()


class ContentLengthMixin:
    """A mixin class for views that injects a Content-Length header in the
    response

    Not to be used by itself!
    """

    def get(self, request) -> Response:
        """Introspect and amend the response to inject the new header"""
        response = super().get(request) # type: ignore
        response.headers['Content-Length'] = len(response.body)
        return response


class TemplateView(TemplateRenderMixin, ContentLengthMixin, View):
    """A view that returns HTML responses based on a template file."""

    content_type = "text/html"
    template_file = "index.html"

As you can see, mixins facilitate object composition by packaging similar functionality into a reusable class with a single responsibility and allowing them to be segregated. The popular Django framework makes extensive use of mixins to build its class-based views.

Liskov Substitution Principle (LSP)

“ Functions that use pointers or references to base classes must be able to use objects of derived classes without knowing it. “ Uncle Bob.

This principle is named after Barbara Liskov, who collaborated with computer scientist Jeannette Wing on the paper "A behavioral notion of subtyping (1994)". A central tenet of the paper is that "a subtype must preserve the behavior of its supertype's methods and also keep all of its supertype's invariant and history properties."

This means that a function accepting a supertype should also accept its subtypes without any modification.

Can you spot the problem in the following code?

Bad :

from dataclasses import dataclass


@dataclass
class Response:
    """An HTTP response"""

    status: int
    content_type: str
    body: str


class View:
    """A simple view that returns plain text responses"""

    content_type = "text/plain"

    def render_body(self) -> str:
        """Render the message body of the response"""
        return "Welcome to my web site"

    def get(self, request) -> Response:
        """Handle a GET request and return a message in the response"""
        return Response(
            status=200,
            content_type=self.content_type,
            body=self.render_body()
        )


class TemplateView(View):
    """A view that returns HTML responses based on a template file."""

    content_type = "text/html"

    def get(self, request, template_file: str) -> Response: # type: ignore
        """Render the message body as HTML"""
        with open(template_file) as fd:
            return Response(
                status=200,
                content_type=self.content_type,
                body=fd.read()
            )


def render(view: View, request) -> Response:
    """Render a View"""
    return view.get(request)

The render() function is expected to work well with View and its subclass TemplateView. However, the subclass breaks compatibility by changing the signature of the .get() method. The function will raise a TypeError when used with TemplateView.

If we want the render() function to work with any subtype of View, we must ensure that we do not break its public protocol. But how do we know what constitutes the protocol of a class? Type hinters like mypy will raise warnings like this if they spot such errors:

error: Signature of "get" incompatible with supertype "View"
<string>:36: note:      Superclass:
<string>:36: note:          def get(self, request: Any) -> Response
<string>:36: note:      Subclass:
<string>:36: note:          def get(self, request: Any, template_file: str) -> Response

Interface Segregation Principle (ISP)

“Keep interfaces small so that users don’t end up depending on things they don’t need.” Uncle Bob.

Several popular object-oriented programming languages like Go and Java have the concept of interfaces. Interfaces define the public methods and attributes of an object without implementing them. They are useful when we don't want to couple a function's signature to a specific object. Python does not have interfaces. It has Abstract Base Classes instead, which are slightly different but can serve the same purpose.

Good

from abc import ABCMeta, abstractmethod


# Define the Abstract Class for a generic Greeter object
class Greeter(metaclass=ABCMeta):
    """An object that can perform a greeting action."""

    @staticmethod
    @abstractmethod
    def greet(name: str) -> None:
        """Display a greeting for the user with the given name"""


class FriendlyActor(Greeter):
    """An actor that greets the user with a friendly salutation"""

    @staticmethod
    def greet(name: str) -> None:
        """Greet a person by name"""
        print(f"Hello {name}!")


def welcome_user(user_name: str, actor: Greeter):
    """Welcome a user with a given name using the provided actor"""
    actor.greet(user_name)


welcome_user("Barbara", FriendlyActor())

Now consider this scenario: we have a number of PDF documents that we authored, and we want to serve them to visitors of our website. We use a Python web framework and might be tempted to design a class that handles these documents, so we design a comprehensive abstract base class for our document.

Bad

import abc


class Persistable(metaclass=abc.ABCMeta):
    """Serialize a file to data and back"""

    @property
    @abc.abstractmethod
    def data(self) -> bytes:
        """The raw data of the file"""

    @classmethod
    @abc.abstractmethod
    def load(cls, name: str):
        """Load the file from disk"""

    @abc.abstractmethod
    def save(self) -> None:
        """Save the file to disk"""


# We just want to serve the documents, so our concrete PDF document
# implementation just needs to implement the `.load()` method and have
# a public attribute named `data`.

class PDFDocument(Persistable):
    """A PDF document"""

    @property
    def data(self) -> bytes:
        """The raw bytes of the PDF document"""
        ... # Code goes here - omitted for brevity

    @classmethod
    def load(cls, name: str):
        """Load the file from the local filesystem"""
        ... # Code goes here - omitted for brevity


def view(request):
    """A web view that handles a GET request for a document"""
    requested_name = request.qs['name'] # We want to validate this!
    return PDFDocument.load(requested_name).data

But we can't! An error is raised if we don't implement the .save() method:

Can't instantiate abstract class PDFDocument with abstract method save.

This is annoying. We don't really need to implement .save() here. We could write a method that does nothing or raises NotImplementedError, but that's just extra code we need to maintain.

Meanwhile, if we remove .save() from the abstract class, we'll need to reimplement it later when we want users to be able to submit their own documents, putting us back at square one.

The problem is that we wrote an interface that includes functionality we don't currently need or use.

The solution is to break the interface into smaller pieces that handle separate concerns.

Good

import abc


class DataCarrier(metaclass=abc.ABCMeta):
    """Carries a data payload"""
    @property
    def data(self):
        ...

class Loadable(DataCarrier):
    """Can load data from storage by name"""
    @classmethod
    @abc.abstractmethod
    def load(cls, name: str):
        ...

class Saveable(DataCarrier):
    """Can save data to storage"""
    @abc.abstractmethod
    def save(self) -> None:
        ...


class PDFDocument(Loadable):
    """A PDF document"""

    @property
    def data(self) -> bytes:
        """The raw bytes of the PDF document"""
        ... # Code goes here - omitted for brevity

    @classmethod
    def load(cls, name: str):
        """Load the file from the local filesystem"""
        ... # Code goes here - omitted for brevity


def view(request):
    """A web view that handles a GET request for a document"""
    requested_name = request.qs['name'] # We want to validate this!
    return PDFDocument.load(requested_name).data

Dependency Inversion Principle (DIP)

“Depend on abstractions, not on concretions.”, Uncle Bob.

Imagine we wanted to create a web view that returns an HTTP response containing rows from a CSV file we generated. We want to use the CSV writer provided by Python's standard library.

Bad

import csv
from io import StringIO


class StreamingHttpResponse:
    """A streaming HTTP response"""
    ... # implementation code goes here


def some_view(request):
   rows = (
       ['First row', 'Foo', 'Bar', 'Baz'],
       ['Second row', 'A', 'B', 'C', '"Testing"', "Here's a quote"]
   )
   # Define a generator to stream data directly to the client
   def stream():
       buffer_ = StringIO()
       writer = csv.writer(buffer_, delimiter=';', quotechar='"')
       for row in rows:
           writer.writerow(row)
           buffer_.seek(0)
           data = buffer_.read()
           buffer_.seek(0)
           buffer_.truncate()
           yield data

   # Create the streaming response  object with the appropriate CSV header.
   response = StreamingHttpResponse(stream(), content_type='text/csv')
   response['Content-Disposition'] = 'attachment; filename="somefilename.csv"'

   return response

Our first implementation works around the CSV writer's interface by manipulating a StringIO object (which is file-like) and performing several low-level operations to extract rows from the writer.

A better approach is to leverage the fact that the writer requires an object with a .write() method to do its work. Why not pass it a dummy object that immediately returns the newly generated row, so the StreamingHttpResponse class can immediately stream it back to the client?

Good

import csv


class Echo:
   """An object that implements just the write method of the file-like
   interface.
   """
   def write(self, value):
       """Write the value by returning it, instead of storing in a buffer."""
       return value

def some_streaming_csv_view(request):
   """A view that streams a large CSV file."""
   rows = (
       ['First row', 'Foo', 'Bar', 'Baz'],
       ['Second row', 'A', 'B', 'C', '"Testing"', "Here's a quote"]
   )
   writer = csv.writer(Echo(), delimiter=';', quotechar='"')
   return StreamingHttpResponse(
       (writer.writerow(row) for row in rows),
       content_type="text/csv",
       headers={'Content-Disposition': 'attachment; filename="somefilename.csv"'},
   )

Much better, and it works like magic! The reason why this implementation is better than the previous one should be obvious: less code and more efficiency to achieve the same result. We decided to leverage the fact that the writer class depends on the .write() abstraction of the object it receives, without caring about the low-level details of how the method works. This example is adapted from a submission to the Django documentation by this author.

Don't repeat yourself (DRY)

Try to understand the DRY principle.

Do your best to avoid writing the same code over and over again. Repetition is bad because it means if you need to make a change, you have to make it in more than one place.

Imagine you run a restaurant and you keep track of your inventory: all your tomatoes, onions, garlic, spices, etc. If you have multiple lists that you keep this track in, then you have to update all of those lists whenever you serve a dish with tomatoes in it. If you only have one list, there's only one place to update!

Often you repeat yourself because you have two or more things that are slightly different, but share a lot in common, but their differences force you to have two or more separate functions that do much of the same things. Removing code duplication means creating an abstraction that can handle this set of different things with just one function/module/class.

Getting the abstraction right is critical, that's why you should follow the SOLID principles discussed in the Classes section. Bad abstractions can be worse than code duplication, so be careful! With that said, if you can make a good abstraction, do it! Don't repeat yourself, otherwise, you'll find yourself updating multiple places anytime you want to change one thing.

Bad:

from typing import List, Dict
from dataclasses import dataclass

@dataclass
class Developer:
    def __init__(self, experience: float, github_link: str) -> None:
        self._experience = experience
        self._github_link = github_link

    @property
    def experience(self) -> float:
        return self._experience

    @property
    def github_link(self) -> str:
        return self._github_link

@dataclass
class Manager:
    def __init__(self, experience: float, github_link: str) -> None:
        self._experience = experience
        self._github_link = github_link

    @property
    def experience(self) -> float:
        return self._experience

    @property
    def github_link(self) -> str:
        return self._github_link


def get_developer_list(developers: List[Developer]) -> List[Dict]:
    developers_list = []
    for developer in developers:
        developers_list.append({
        'experience' : developer.experience,
        'github_link' : developer.github_link
            })
    return developers_list

def get_manager_list(managers: List[Manager]) -> List[Dict]:
    managers_list = []
    for manager in managers:
        managers_list.append({
        'experience' : manager.experience,
        'github_link' : manager.github_link
            })
    return managers_list

## create list objects of developers
company_developers = [
    Developer(experience=2.5, github_link='https://github.com/1'),
    Developer(experience=1.5, github_link='https://github.com/2')
]
company_developers_list = get_developer_list(developers=company_developers)

## create list objects of managers
company_managers = [
    Manager(experience=4.5, github_link='https://github.com/3'),
    Manager(experience=5.7, github_link='https://github.com/4')
]
company_managers_list = get_manager_list(managers=company_managers)

Good:

from typing import List, Dict
from dataclasses import dataclass

@dataclass
class Employee:
    def __init__(self, experience: float, github_link: str) -> None:
        self._experience = experience
        self._github_link = github_link

    @property
    def experience(self) -> float:
        return self._experience

    @property
    def github_link(self) -> str:
        return self._github_link



def get_employee_list(employees: List[Employee]) -> List[Dict]:
    employees_list = []
    for employee in employees:
        employees_list.append({
        'experience' : employee.experience,
        'github_link' : employee.github_link
            })
    return employees_list

## create list objects of developers
company_developers = [
    Employee(experience=2.5, github_link='https://github.com/1'),
    Employee(experience=1.5, github_link='https://github.com/2')
]
company_developers_list = get_employee_list(employees=company_developers)

## create list objects of managers
company_managers = [
    Employee(experience=4.5, github_link='https://github.com/3'),
    Employee(experience=5.7, github_link='https://github.com/4')
]
company_managers_list = get_employee_list(employees=company_managers)

Use meaningful and pronounceable variable names

Good:

import datetime


current_date: str = datetime.date.today().strftime("%y-%m-%d")

Use the same vocabulary for the same type of variable

Good:

def get_user_info(): pass
def get_user_data(): pass
def get_user_record(): pass

Even better

from typing import Union, Dict


class Record:
    pass


class User:
    info : str

    @property
    def data(self) -> Dict[str, str]:
        return {}

    def get_record(self) -> Union[Record, None]:
        return Record()

Use searchable names

Good:

import time


# Declare them in the global namespace for the module.
SECONDS_IN_A_DAY = 60 * 60 * 24
time.sleep(SECONDS_IN_A_DAY)

Use explanatory variables

Good:

import re


address = "One Infinite Loop, Cupertino 95014"
city_zip_code_regex = r"^[^,\\]+[,\\\s]+(?P<city>.+?)\s*(?P<zip_code>\d{5})?$"

matches = re.match(city_zip_code_regex, address)
if matches:
    print(f"{matches['city']}, {matches['zip_code']}")

Avoid Mental Mapping

Good:

locations = ("Austin", "New York", "San Francisco")

for location in locations:
    #do_stuff()
    #do_some_other_stuff()
    # ...
    print(location)

Don't add unneeded context

Good:

class Car:
    make: str
    model: str
    color: str

Use default arguments instead of short-circuiting or conditionals.

Good:

import hashlib


def create_micro_brewery(name: str = "Hipster Brew Co."):
    slug = hashlib.sha1(name.encode()).hexdigest()
    # etc.

Function arguments (2 or fewer ideally)

This is fine too

class MenuConfig:
    """A configuration for the Menu.

    Attributes:
        title: The title of the Menu.
        body: The body of the Menu.
        button_text: The text for the button label.
        cancellable: Can it be cancelled?
    """
    title: str
    body: str
    button_text: str
    cancellable: bool = False


def create_menu(config: MenuConfig) -> None:
    title = config.title
    body = config.body
    # ...


config = MenuConfig()
config.title = "My delicious menu"
config.body = "A description of the various items on the menu"
config.button_text = "Order now!"
# The instance attribute overrides the default class attribute.
config.cancellable = True

create_menu(config)

Fancy

from typing import NamedTuple


class MenuConfig(NamedTuple):
    """A configuration for the Menu.

    Attributes:
        title: The title of the Menu.
        body: The body of the Menu.
        button_text: The text for the button label.
        cancellable: Can it be cancelled?
    """
    title: str
    body: str
    button_text: str
    cancellable: bool = False


def create_menu(config: MenuConfig):
    title, body, button_text, cancellable = config
    # ...


create_menu(
    MenuConfig(
        title="My delicious menu",
        body="A description of the various items on the menu",
        button_text="Order now!"
    )
)

Fancier

from dataclasses import astuple, dataclass


@dataclass
class MenuConfig:
    """A configuration for the Menu.

    Attributes:
        title: The title of the Menu.
        body: The body of the Menu.
        button_text: The text for the button label.
        cancellable: Can it be cancelled?
    """
    title: str
    body: str
    button_text: str
    cancellable: bool = False

def create_menu(config: MenuConfig):
    title, body, button_text, cancellable = astuple(config)
    # ...


create_menu(
    MenuConfig(
        title="My delicious menu",
        body="A description of the various items on the menu",
        button_text="Order now!"
    )
)

Even fancier, Python 3.8+ only

from typing import TypedDict


class MenuConfig(TypedDict):
    """A configuration for the Menu.

    Attributes:
        title: The title of the Menu.
        body: The body of the Menu.
        button_text: The text for the button label.
        cancellable: Can it be cancelled?
    """
    title: str
    body: str
    button_text: str
    cancellable: bool


def create_menu(config: MenuConfig):
    title = config["title"]
    # ...


create_menu(
    # You need to supply all the parameters
    MenuConfig(
        title="My delicious menu",
        body="A description of the various items on the menu",
        button_text="Order now!",
        cancellable=True
    )
)

Functions should do one thing

Good:

from typing import List


class Client:
    active: bool


def email(client: Client) -> None:
    pass


def get_active_clients(clients: List[Client]) -> List[Client]:
    """Filter active clients.
    """
    return [client for client in clients if client.active]


def email_clients(clients: List[Client]) -> None:
    """Send an email to a given list of clients.
    """
    for client in get_active_clients(clients):
        email(client)

Even better

from typing import Generator, Iterator


class Client:
    active: bool


def email(client: Client):
    pass


def active_clients(clients: Iterator[Client]) -> Generator[Client, None, None]:
    """Only active clients"""
    return (client for client in clients if client.active)


def email_client(clients: Iterator[Client]) -> None:
    """Send an email to a given list of clients.
    """
    for client in active_clients(clients):
        email(client)

Function names should say what they do

Good:

class Email:
    def send(self) -> None:
        """Send this message"""

message = Email()
message.send()

Functions should only be one level of abstraction

Good:

from typing import Tuple, List, Dict


REGEXES: Tuple = (
   # ...
)


def parse_better_js_alternative(code: str) -> None:
    tokens: List = tokenize(code)
    syntax_tree: List = parse(tokens)

    for node in syntax_tree:
        pass


def tokenize(code: str) -> List:
    statements = code.split()
    tokens: List[Dict] = []
    for regex in REGEXES:
        for statement in statements:
            pass

    return tokens


def parse(tokens: List) -> List:
    syntax_tree: List[Dict] = []
    for token in tokens:
        pass

    return syntax_tree

Don't use flags as function parameters

Good:

from tempfile import gettempdir
from pathlib import Path


def create_file(name: str) -> None:
    Path(name).touch()


def create_temp_file(name: str) -> None:
    (Path(gettempdir()) / name).touch()

Avoid Side Effects

Good:

from typing import List, AnyStr


def split_into_first_and_last_name(name: AnyStr) -> List[AnyStr]:
    return name.split()

fullname = "Ryan McDermott"
name, surname = split_into_first_and_last_name(fullname)

print(name, surname)  # => Ryan McDermott

Also Good :

from dataclasses import dataclass


@dataclass
class Person:
    name: str

    @property
    def name_as_first_and_last(self) -> list:
        return self.name.split()


# The reason why we create instances of classes is to manage state!
person = Person("Ryan McDermott")
print(person.name)  # => "Ryan McDermott"
print(person.name_as_first_and_last)  # => ["Ryan", "McDermott"]

Single Responsibility Principle (SRP)

Good :

from importlib import metadata


def get_version(pkg_name:str) -> str:
    """Retrieve the version of a given package"""
    return metadata.version(pkg_name)


class VersionCommentElement:
   """An element that renders an HTML comment with the program's version number
   """

   def __init__(self, version: str):
       self.version = version

   def render(self) -> None:
       print(f'<!-- Version: {self.version} -->')


VersionCommentElement(get_version("pip")).render()

Open/Closed Principle (OCP)

Good :

from dataclasses import dataclass


@dataclass
class Response:
    """An HTTP response"""

    status: int
    content_type: str
    body: str


class View:
    """A simple view that returns plain text responses"""

    content_type = "text/plain"

    def render_body(self) -> str:
        """Render the message body of the response"""
        return "Welcome to my web site"

    def get(self, request) -> Response:
        """Handle a GET request and return a message in the response"""
        return Response(
            status=200,
            content_type=self.content_type,
            body=self.render_body()
        )


class TemplateView(View):
    """A view that returns HTML responses based on a template file."""

    content_type = "text/html"
    template_file = "index.html"

    def render_body(self) -> str:
        """Render the message body as HTML"""
        with open(self.template_file) as fd:
            return fd.read()

Also good :

from dataclasses import dataclass, field
from typing import Protocol


@dataclass
class Response:
    """An HTTP response"""

    status: int
    content_type: str
    body: str
    headers: dict = field(default_factory=dict)


class View:
    """A simple view that returns plain text responses"""

    content_type = "text/plain"

    def render_body(self) -> str:
        """Render the message body of the response"""
        return "Welcome to my web site"

    def get(self, request) -> Response:
        """Handle a GET request and return a message in the response"""
        return Response(
            status=200,
            content_type=self.content_type,
            body=self.render_body()
        )


class TemplateRenderMixin:
    """A mixin class for views that render HTML documents using a template file

    Not to be used by itself!
    """
    template_file: str = ""

    def render_body(self) -> str:
        """Render the message body as HTML"""
        if not self.template_file:
            raise ValueError("The path to a template file must be given.")

        with open(self.template_file) as fd:
            return fd.read()


class ContentLengthMixin:
    """A mixin class for views that injects a Content-Length header in the
    response

    Not to be used by itself!
    """

    def get(self, request) -> Response:
        """Introspect and amend the response to inject the new header"""
        response = super().get(request) # type: ignore
        response.headers['Content-Length'] = len(response.body)
        return response


class TemplateView(TemplateRenderMixin, ContentLengthMixin, View):
    """A view that returns HTML responses based on a template file."""

    content_type = "text/html"
    template_file = "index.html"

Interface Segregation Principle (ISP)

Good

from abc import ABCMeta, abstractmethod


# Define the Abstract Class for a generic Greeter object
class Greeter(metaclass=ABCMeta):
    """An object that can perform a greeting action."""

    @staticmethod
    @abstractmethod
    def greet(name: str) -> None:
        """Display a greeting for the user with the given name"""


class FriendlyActor(Greeter):
    """An actor that greets the user with a friendly salutation"""

    @staticmethod
    def greet(name: str) -> None:
        """Greet a person by name"""
        print(f"Hello {name}!")


def welcome_user(user_name: str, actor: Greeter):
    """Welcome a user with a given name using the provided actor"""
    actor.greet(user_name)


welcome_user("Barbara", FriendlyActor())

Good

import abc


class DataCarrier(metaclass=abc.ABCMeta):
    """Carries a data payload"""
    @property
    def data(self):
        ...

class Loadable(DataCarrier):
    """Can load data from storage by name"""
    @classmethod
    @abc.abstractmethod
    def load(cls, name: str):
        ...

class Saveable(DataCarrier):
    """Can save data to storage"""
    @abc.abstractmethod
    def save(self) -> None:
        ...


class PDFDocument(Loadable):
    """A PDF document"""

    @property
    def data(self) -> bytes:
        """The raw bytes of the PDF document"""
        ... # Code goes here - omitted for brevity

    @classmethod
    def load(cls, name: str):
        """Load the file from the local filesystem"""
        ... # Code goes here - omitted for brevity


def view(request):
    """A web view that handles a GET request for a document"""
    requested_name = request.qs['name'] # We want to validate this!
    return PDFDocument.load(requested_name).data

Dependency Inversion Principle (DIP)

Good

import csv


class Echo:
   """An object that implements just the write method of the file-like
   interface.
   """
   def write(self, value):
       """Write the value by returning it, instead of storing in a buffer."""
       return value

def some_streaming_csv_view(request):
   """A view that streams a large CSV file."""
   rows = (
       ['First row', 'Foo', 'Bar', 'Baz'],
       ['Second row', 'A', 'B', 'C', '"Testing"', "Here's a quote"]
   )
   writer = csv.writer(Echo(), delimiter=';', quotechar='"')
   return StreamingHttpResponse(
       (writer.writerow(row) for row in rows),
       content_type="text/csv",
       headers={'Content-Disposition': 'attachment; filename="somefilename.csv"'},
   )

Don't repeat yourself (DRY)

Good:

from typing import List, Dict
from dataclasses import dataclass

@dataclass
class Employee:
    def __init__(self, experience: float, github_link: str) -> None:
        self._experience = experience
        self._github_link = github_link

    @property
    def experience(self) -> float:
        return self._experience

    @property
    def github_link(self) -> str:
        return self._github_link



def get_employee_list(employees: List[Employee]) -> List[Dict]:
    employees_list = []
    for employee in employees:
        employees_list.append({
        'experience' : employee.experience,
        'github_link' : employee.github_link
            })
    return employees_list

## create list objects of developers
company_developers = [
    Employee(experience=2.5, github_link='https://github.com/1'),
    Employee(experience=1.5, github_link='https://github.com/2')
]
company_developers_list = get_employee_list(employees=company_developers)

## create list objects of managers
company_managers = [
    Employee(experience=4.5, github_link='https://github.com/3'),
    Employee(experience=5.7, github_link='https://github.com/4')
]
company_managers_list = get_employee_list(employees=company_managers)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment