@creyesp
Created November 19, 2024 12:54
def unify_victoria_secret(df):
    """
    Normalize every brand related to Victoria's Secret so that its
    `brand_name` is `victoria's secret` instead of the spelling it
    currently has. Returns a modified copy of `df`.
    """
    df = df.copy()
    new_string = "victoria's secret"
    # Exact spellings to rewrite; any other spelling is left unchanged.
    variants = ["Victorias-Secret", "Victoria's Secret", "Victoria's Secret Pink"]
    df.loc[df["brand_name"].isin(variants), "brand_name"] = new_string

    return df
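
The brand unification above relies on `isin`, which matches whole strings exactly. A minimal sketch on made-up data (the sample brand names, including `victoria secret`, are assumptions for illustration and not from the original dataset) shows which rows change and which do not:

```python
import pandas as pd

# Hypothetical sample rows; only the three spellings listed in the function are rewritten.
brands = pd.DataFrame(
    {
        "brand_name": [
            "Victorias-Secret",
            "Victoria's Secret Pink",
            "Calvin Klein",
            "victoria secret",  # not in the list, so isin() does not match it
        ]
    }
)

variants = ["Victorias-Secret", "Victoria's Secret", "Victoria's Secret Pink"]
mask = brands["brand_name"].isin(variants)  # exact, case-sensitive membership test
brands.loc[mask, "brand_name"] = "victoria's secret"

result = brands["brand_name"].tolist()
print(result)
```

Because the match is exact, a spelling that is missing from the list stays as it was, so new variants would have to be added to the list by hand.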


def clean_price(df):
    """
    Convert the `price` column into a column of floats.
    If a product lists more than one price, keep the lowest one.
    Rows with no numeric price become NaN.
    """

    df = df.copy()
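    # extractall yields one row per number found, indexed by (original row, match).
    parse_price = df["price"].str.extractall(r"(\d+(?:\.\d+)?)")[0].astype(float)
    # Lowest price per original row; rows without any number align to NaN.
    df["price"] = parse_price.groupby(level=0).min()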
    return df
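
To make the intermediate result of `extractall` easier to picture, here is a small sketch of the same extract-then-minimum idea on a standalone Series. The sample price strings are assumptions chosen for illustration, and the explicit `reindex` is added here only so that the row with no number shows up as NaN in the result:

```python
import pandas as pd

# Hypothetical price strings: one price, a price range, and no price at all.
prices = pd.Series(["$12.50", "10 - 25.99", "no price"])

# One row per number found, with a MultiIndex of (original row, match number).
matches = prices.str.extractall(r"(\d+(?:\.\d+)?)")[0].astype(float)

# Group by the original row label (index level 0) and keep the smallest number.
# Rows with no match are missing from `matches`, so reindex restores them as NaN.
lowest = matches.groupby(level=0).min().reindex(prices.index)

print(matches)
print(lowest)
```

One caveat of this pattern: a thousands separator splits a number into two matches, so a string such as `1,299.00` would be read as `1` and `299.00`, and the minimum would be `1.0`. If the data contains such values, the separators would need to be stripped before extraction.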