@rodigu
Last active February 17, 2025 12:15
composite id column in dataframe

say we have a dataframe with many columns.

we want a new id column, which will be a concatenation of two or more columns in the dataframe.

this is useful when a table has no "natural" id. for example, a sales table with a client_id column and a purchase_datetime column has no single column that identifies a row, but the two together usually do.

this function concatenates the chosen columns into a single string column:

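import pandas as pd

def concatenated_column(df: pd.DataFrame, id_keys: list[str], separator: str) -> pd.Series:
  # cast the first column to str too: the .str accessor fails on numeric or datetime columns
  return df[id_keys[0]].astype(str).str.cat(df[id_keys[1:]].astype(str), sep=separator)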

it can be used as such:

df['concatenated_id'] = concatenated_column(df=df, id_keys=['client_id', 'purchase_datetime'], separator='+')
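
a minimal end-to-end sketch of the call above, on a small made-up sales table. the data and the amount column are hypothetical, and the function here casts every key column to str (including the first one) so that a numeric client_id works. the final check is an extra step, not part of the original note: a composite id is only useful as a key if it is unique per row.

```python
import pandas as pd


def concatenated_column(df: pd.DataFrame, id_keys: list[str], separator: str) -> pd.Series:
    # cast every key column to str so numeric and datetime columns can be joined
    return df[id_keys[0]].astype(str).str.cat(df[id_keys[1:]].astype(str), sep=separator)


# hypothetical sales data, made up for illustration
df = pd.DataFrame(
    {
        "client_id": [101, 101, 202],
        "purchase_datetime": pd.to_datetime(
            ["2024-01-05 10:30:00", "2024-01-06 09:00:00", "2024-01-05 10:30:00"]
        ),
        "amount": [19.9, 5.0, 42.0],
    }
)

df["concatenated_id"] = concatenated_column(
    df=df, id_keys=["client_id", "purchase_datetime"], separator="+"
)

# the composite id only works as a key if no two rows share it
is_unique = df["concatenated_id"].is_unique
print(df["concatenated_id"].tolist())
print("unique:", is_unique)
```

if the uniqueness check fails, adding another column to id_keys is one way to disambiguate rows. pick a separator that does not occur in the column values, otherwise different rows can collapse into the same id.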

reference

How to concatenate multiple column values into a single column in Pandas dataframe
