Skip to content

Instantly share code, notes, and snippets.

@liorkesos
Last active September 8, 2024 09:43
Show Gist options
  • Save liorkesos/52035c7308dafb220bee4244107a9a08 to your computer and use it in GitHub Desktop.
Save liorkesos/52035c7308dafb220bee4244107a9a08 to your computer and use it in GitHub Desktop.
GraphBuilder Contribution segmentation
Display the source blob
Display the rendered blob
Raw
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"provenance": [],
"authorship_tag": "ABX9TyMQSJo0o0FAJKAugLWUBaTR",
"include_colab_link": true
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
},
"language_info": {
"name": "python"
},
"widgets": {
"application/vnd.jupyter.widget-state+json": {
"77102d5b05a7485981c20ddc2eb00606": {
"model_module": "@jupyter-widgets/controls",
"model_name": "IntSliderModel",
"model_module_version": "1.5.0",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "IntSliderModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "IntSliderView",
"continuous_update": false,
"description": "Slider:",
"description_tooltip": null,
"disabled": false,
"layout": "IPY_MODEL_fcbc7a1d98bf45f2bf52350bc771d2b2",
"max": 100,
"min": 0,
"orientation": "horizontal",
"readout": true,
"readout_format": "d",
"step": 1,
"style": "IPY_MODEL_7663355f7b6c485294554156ca81759d",
"value": 16
}
},
"fcbc7a1d98bf45f2bf52350bc771d2b2": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"model_module_version": "1.2.0",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"7663355f7b6c485294554156ca81759d": {
"model_module": "@jupyter-widgets/controls",
"model_name": "SliderStyleModel",
"model_module_version": "1.5.0",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "SliderStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"description_width": "",
"handle_color": null
}
}
}
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/liorkesos/52035c7308dafb220bee4244107a9a08/untitled0.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "0M3YoPr738h-"
},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"source": [
"# GraphBuilder Contribution Segmentation Algorithm\n",
"\n",
"## Introduction\n",
"\n",
"This notebook presents the mathematical formulation of the GraphBuilder Contribution Segmentation algorithm. This algorithm processes definition files (like Terraform, Pulumi, or Backstage) to create a weighted graph describing and visualizing the application and its subsystems.\n",
"\n",
"## Input and Output\n",
"\n",
"### Input\n",
"The input is a singular definition files:\n",
"\n",
"D\n",
"\n",
"### Output\n",
"The output is a weighted graph:\n",
"\n",
"$G = (V, E, W)$\n",
"\n",
"Where:\n",
"- $V$ is the set of vertices (subsystems or components)\n",
"- $E \\subseteq V \\times V$ is the set of edges (dependencies)\n",
"- $W: E \\rightarrow \\mathbb{R}^+$ is a weight function assigning a positive real number to each edge\n",
"\n",
"## Algorithm Steps\n",
"\n",
"### 1. Parsing and Tokenization\n",
"\n",
"For each $d_i \\in D$:\n",
"- Parse $d_i$ into a set of tokens $T_i = \\{t_1, t_2, ..., t_m\\}$\n",
"- Define a function $f: T_i \\rightarrow C$ that maps tokens to components\n",
"\n",
"### 2. Component Extraction\n",
"\n",
"$C = \\bigcup_{i=1}^n f(T_i)$\n",
"\n",
"This represents the set of all unique components.\n",
"\n",
"### 3. Dependency Identification\n",
"\n",
"Define a binary relation $R \\subseteq C \\times C$ where $(c_i, c_j) \\in R$ if $c_i$ depends on $c_j$\n",
"\n",
"### 4. Graph Construction\n",
"\n",
"- Set $V = C$\n",
"- For each $(c_i, c_j) \\in R$, add an edge $e_{ij} = (c_i, c_j)$ to $E$\n",
"\n",
"### 5. Weight Assignment\n",
"\n",
"Define a weight function $W: E \\rightarrow \\mathbb{R}^+$ as:\n",
"\n",
"$W(e_{ij}) = \\alpha \\cdot f_{dep}(c_i, c_j) + \\beta \\cdot f_{com}(c_i, c_j) + \\gamma \\cdot f_{imp}(c_i, c_j)$\n",
"\n",
"Where:\n",
"- $f_{dep}(c_i, c_j)$ quantifies the strength of dependency\n",
"- $f_{com}(c_i, c_j)$ measures the communication frequency\n",
"- $f_{imp}(c_i, c_j)$ assesses the impact of the dependency\n",
"- $\\alpha, \\beta, \\gamma$ are hyperparameters such that $\\alpha + \\beta + \\gamma = 1$\n",
"\n",
"### 6. Subsystem Clustering\n",
"\n",
"Apply a graph clustering algorithm $\\mathcal{A}$ to $G$:\n",
"\n",
"$S = \\mathcal{A}(G)$, where $S = \\{S_1, S_2, ..., S_k\\}$ is a set of subsystems\n",
"\n",
"### 7. Contribution Segmentation\n",
"\n",
"For each subsystem $S_i \\in S$, compute its contribution score:\n",
"\n",
"$CS(S_i) = \\frac{\\sum_{v \\in S_i} \\sum_{u \\in V} W((v,u))}{\\sum_{e \\in E} W(e)}$\n",
"\n",
"## Complexity Analysis\n",
"\n",
"- Time Complexity: $O(|D| \\cdot L + |V|^2 + |E| \\log |E|)$, where $L$ is the average length of a definition file\n",
"- Space Complexity: $O(|V| + |E|)$\n",
"\n",
"## Notes\n",
"\n",
"- The choice of clustering algorithm $\\mathcal{A}$ (e.g., Louvain, Spectral Clustering) affects the final subsystem structure\n",
"- The weight function $W$ can be adjusted based on domain-specific knowledge or empirical studies\n",
"\n",
"## Next Steps\n",
"\n",
"[You can add implementation details, code snippets, or further explanations here]"
],
"metadata": {
"id": "XifFjqEU4KSP"
}
},
{
"cell_type": "code",
"source": [
"# prompt: a slider using jupyter widgets\n",
"\n",
"import ipywidgets as widgets\n",
"\n",
"slider = widgets.IntSlider(\n",
" value=50,\n",
" min=0,\n",
" max=100,\n",
" step=1,\n",
" description='Slider:',\n",
" disabled=False,\n",
" continuous_update=False,\n",
" orientation='horizontal',\n",
" readout=True,\n",
" readout_format='d'\n",
")\n",
"\n",
"display(slider)\n"
],
"metadata": {
"id": "EAm1BWJgi07d",
"outputId": "ec4c0b71-ac09-4c66-dbe4-b431799bb949",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 49,
"referenced_widgets": [
"77102d5b05a7485981c20ddc2eb00606",
"fcbc7a1d98bf45f2bf52350bc771d2b2",
"7663355f7b6c485294554156ca81759d"
]
}
},
"execution_count": null,
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": [
"IntSlider(value=50, continuous_update=False, description='Slider:')"
],
"application/vnd.jupyter.widget-view+json": {
"version_major": 2,
"version_minor": 0,
"model_id": "77102d5b05a7485981c20ddc2eb00606"
}
},
"metadata": {}
}
]
},
{
"cell_type": "code",
"source": [],
"metadata": {
"id": "ZWd7H6GP9PpQ"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [],
"metadata": {
"id": "cBiueTo89QQ3"
}
},
{
"cell_type": "code",
"source": [],
"metadata": {
"id": "amL1suJf9QpA"
},
"execution_count": null,
"outputs": []
}
]
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment