Created
September 14, 2022 21:15
Infer data schema and train using ML.NET AutoML
{ | |
"cells": [ | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Install NuGet Packages" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": { | |
"dotnet_interactive": { | |
"language": "csharp" | |
}, | |
"vscode": { | |
"languageId": "dotnet-interactive.csharp" | |
} | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/html": [ | |
"<div><div><strong>Restore sources</strong><ul><li><span>https://pkgs.dev.azure.com/dnceng/public/_packaging/MachineLearning/nuget/v3/index.json</span></li></ul></div><div></div><div><strong>Installed Packages</strong><ul><li><span>Microsoft.Data.Analysis, 0.20.0-preview.22424.1</span></li><li><span>Microsoft.ML.AutoML, 0.20.0-preview.22424.1</span></li><li><span>Plotly.NET.CSharp, 0.0.1</span></li><li><span>Plotly.NET.Interactive, 3.0.2</span></li></ul></div></div>" | |
] | |
}, | |
"metadata": {}, | |
"output_type": "display_data" | |
}, | |
{ | |
"data": { | |
"text/markdown": [ | |
"Loading extensions from `Plotly.NET.Interactive.dll`" | |
] | |
}, | |
"metadata": {}, | |
"output_type": "display_data" | |
}, | |
{ | |
"data": { | |
"text/markdown": [ | |
"Loading extensions from `Microsoft.Data.Analysis.Interactive.dll`" | |
] | |
}, | |
"metadata": {}, | |
"output_type": "display_data" | |
} | |
], | |
"source": [ | |
"#i \"nuget:https://pkgs.dev.azure.com/dnceng/public/_packaging/MachineLearning/nuget/v3/index.json\"\n", | |
"#r \"nuget: Plotly.NET.Interactive, 3.0.2\"\n", | |
"#r \"nuget: Plotly.NET.CSharp, 0.0.1\"\n", | |
"#r \"nuget: Microsoft.ML.AutoML, 0.20.0-preview.22424.1\"\n", | |
"#r \"nuget: Microsoft.Data.Analysis, 0.20.0-preview.22424.1\"" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Import namespaces"
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": { | |
"dotnet_interactive": { | |
"language": "csharp" | |
}, | |
"vscode": { | |
"languageId": "dotnet-interactive.csharp" | |
} | |
}, | |
"outputs": [], | |
"source": [ | |
"using System;\n", | |
"using System.IO;\n", | |
"using Microsoft.Data.Analysis;\n", | |
"using Microsoft.ML;\n", | |
"using Microsoft.ML.AutoML;\n", | |
"using Microsoft.ML.Data;" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Define training data path" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": { | |
"dotnet_interactive": { | |
"language": "csharp" | |
}, | |
"vscode": { | |
"languageId": "dotnet-interactive.csharp" | |
} | |
}, | |
"outputs": [], | |
"source": [ | |
"var trainDataPath = @\"C:\\Datasets\\taxi-fare-train.csv\";"
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Initialize MLContext" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": { | |
"dotnet_interactive": { | |
"language": "csharp" | |
}, | |
"vscode": { | |
"languageId": "dotnet-interactive.csharp" | |
} | |
}, | |
"outputs": [], | |
"source": [ | |
"var ctx = new MLContext();" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Infer training data schema" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": { | |
"dotnet_interactive": { | |
"language": "csharp" | |
}, | |
"vscode": { | |
"languageId": "dotnet-interactive.csharp" | |
} | |
}, | |
"outputs": [], | |
"source": [ | |
"var columnInferenceResults = ctx.Auto().InferColumns(trainDataPath, \"fare_amount\", groupColumns: false);" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Inspect inference results\n", | |
"\n", | |
"### Label Column (Column to predict)\n", | |
"\n", | |
"fare_amount\n", | |
"\n", | |
"### Features\n", | |
"\n", | |
"- Numeric Columns\n",
"  - rate_code\n",
"  - passenger_count\n",
"  - trip_time_in_secs\n",
"  - trip_distance\n",
"- Categorical Columns\n",
"  - vendor_id\n",
"  - payment_type"
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": { | |
"dotnet_interactive": { | |
"language": "csharp" | |
}, | |
"vscode": { | |
"languageId": "dotnet-interactive.csharp" | |
} | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/html": [ | |
"<table><thead><tr><th>LabelColumnName</th><th>UserIdColumnName</th><th>GroupIdColumnName</th><th>ItemIdColumnName</th><th>ExampleWeightColumnName</th><th>SamplingKeyColumnName</th><th>CategoricalColumnNames</th><th>NumericColumnNames</th><th>TextColumnNames</th><th>IgnoredColumnNames</th><th>ImagePathColumnNames</th></tr></thead><tbody><tr><td><div class=\"dni-plaintext\">fare_amount</div></td><td><div class=\"dni-plaintext\">&lt;null&gt;</div></td><td><div class=\"dni-plaintext\">&lt;null&gt;</div></td><td><div class=\"dni-plaintext\">&lt;null&gt;</div></td><td><div class=\"dni-plaintext\">&lt;null&gt;</div></td><td><div class=\"dni-plaintext\">&lt;null&gt;</div></td><td><div class=\"dni-plaintext\">[ vendor_id, payment_type ]</div></td><td><div class=\"dni-plaintext\">[ rate_code, passenger_count, trip_time_in_secs, trip_distance ]</div></td><td><div class=\"dni-plaintext\">[ ]</div></td><td><div class=\"dni-plaintext\">[ ]</div></td><td><div class=\"dni-plaintext\">[ ]</div></td></tr></tbody></table>"
] | |
}, | |
"metadata": {}, | |
"output_type": "display_data" | |
} | |
], | |
"source": [ | |
"columnInferenceResults.ColumnInformation" | |
] | |
}, | |
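{
"cell_type": "markdown",
"metadata": {},
"source": [
"The loader settings that `InferColumns` produced can also be listed in code rather than read from the rendered table. The cell below is a minimal sketch, assuming `TextLoaderOptions.Columns` is populated for this dataset; the variable name `options` is illustrative and not part of the original notebook."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"dotnet_interactive": {
"language": "csharp"
},
"vscode": {
"languageId": "dotnet-interactive.csharp"
}
},
"outputs": [],
"source": [
"// Illustrative only: print the inferred header setting and each column's name and data kind.\n",
"var options = columnInferenceResults.TextLoaderOptions;\n",
"Console.WriteLine($\"HasHeader: {options.HasHeader}\");\n",
"foreach (var column in options.Columns)\n",
"{\n",
"    Console.WriteLine($\"{column.Name}: {column.DataKind}\");\n",
"}"
]
},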
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"dotnet_interactive": { | |
"language": "csharp" | |
} | |
}, | |
"source": [ | |
"## Load data into IDataView" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": { | |
"dotnet_interactive": { | |
"language": "csharp" | |
}, | |
"vscode": { | |
"languageId": "dotnet-interactive.csharp" | |
} | |
}, | |
"outputs": [], | |
"source": [ | |
"var textLoader = ctx.Data.CreateTextLoader(columnInferenceResults.TextLoaderOptions);\n", | |
"var idv = textLoader.Load(trainDataPath);" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Inspect IDataView Schema" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": { | |
"dotnet_interactive": { | |
"language": "csharp" | |
}, | |
"vscode": { | |
"languageId": "dotnet-interactive.csharp" | |
} | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/html": [ | |
"<table><thead><tr><th><i>index</i></th><th>Name</th><th>Index</th><th>IsHidden</th><th>Type</th><th>Annotations</th></tr></thead><tbody><tr><td>0</td><td>vendor_id</td><td><div class=\"dni-plaintext\">0</div></td><td><div class=\"dni-plaintext\">False</div></td><td><table><thead><tr><th>RawType</th></tr></thead><tbody><tr><td><div class=\"dni-plaintext\">System.ReadOnlyMemory&lt;System.Char&gt;</div></td></tr></tbody></table></td><td><table><thead><tr><th>Schema</th></tr></thead><tbody><tr><td><div class=\"dni-plaintext\">[ ]</div></td></tr></tbody></table></td></tr><tr><td>1</td><td>rate_code</td><td><div class=\"dni-plaintext\">1</div></td><td><div class=\"dni-plaintext\">False</div></td><td><table><thead><tr><th>RawType</th></tr></thead><tbody><tr><td><div class=\"dni-plaintext\">System.Single</div></td></tr></tbody></table></td><td><table><thead><tr><th>Schema</th></tr></thead><tbody><tr><td><div class=\"dni-plaintext\">[ ]</div></td></tr></tbody></table></td></tr><tr><td>2</td><td>passenger_count</td><td><div class=\"dni-plaintext\">2</div></td><td><div class=\"dni-plaintext\">False</div></td><td><table><thead><tr><th>RawType</th></tr></thead><tbody><tr><td><div class=\"dni-plaintext\">System.Single</div></td></tr></tbody></table></td><td><table><thead><tr><th>Schema</th></tr></thead><tbody><tr><td><div class=\"dni-plaintext\">[ ]</div></td></tr></tbody></table></td></tr><tr><td>3</td><td>trip_time_in_secs</td><td><div class=\"dni-plaintext\">3</div></td><td><div class=\"dni-plaintext\">False</div></td><td><table><thead><tr><th>RawType</th></tr></thead><tbody><tr><td><div class=\"dni-plaintext\">System.Single</div></td></tr></tbody></table></td><td><table><thead><tr><th>Schema</th></tr></thead><tbody><tr><td><div class=\"dni-plaintext\">[ ]</div></td></tr></tbody></table></td></tr><tr><td>4</td><td>trip_distance</td><td><div class=\"dni-plaintext\">4</div></td><td><div class=\"dni-plaintext\">False</div></td><td><table><thead><tr><th>RawType</th></tr></thead><tbody><tr><td><div class=\"dni-plaintext\">System.Single</div></td></tr></tbody></table></td><td><table><thead><tr><th>Schema</th></tr></thead><tbody><tr><td><div class=\"dni-plaintext\">[ ]</div></td></tr></tbody></table></td></tr><tr><td>5</td><td>payment_type</td><td><div class=\"dni-plaintext\">5</div></td><td><div class=\"dni-plaintext\">False</div></td><td><table><thead><tr><th>RawType</th></tr></thead><tbody><tr><td><div class=\"dni-plaintext\">System.ReadOnlyMemory&lt;System.Char&gt;</div></td></tr></tbody></table></td><td><table><thead><tr><th>Schema</th></tr></thead><tbody><tr><td><div class=\"dni-plaintext\">[ ]</div></td></tr></tbody></table></td></tr><tr><td>6</td><td>fare_amount</td><td><div class=\"dni-plaintext\">6</div></td><td><div class=\"dni-plaintext\">False</div></td><td><table><thead><tr><th>RawType</th></tr></thead><tbody><tr><td><div class=\"dni-plaintext\">System.Single</div></td></tr></tbody></table></td><td><table><thead><tr><th>Schema</th></tr></thead><tbody><tr><td><div class=\"dni-plaintext\">[ ]</div></td></tr></tbody></table></td></tr></tbody></table>"
] | |
}, | |
"metadata": {}, | |
"output_type": "display_data" | |
} | |
], | |
"source": [ | |
"idv.Schema" | |
] | |
}, | |
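{
"cell_type": "markdown",
"metadata": {},
"source": [
"The schema shows column names and types but no values. As a rough sketch, the first few rows can be materialized with the `Preview` debugging helper to confirm that the inferred loader parsed the file as expected. The row count of 5 and the variable names are arbitrary choices, not part of the original notebook."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"dotnet_interactive": {
"language": "csharp"
},
"vscode": {
"languageId": "dotnet-interactive.csharp"
}
},
"outputs": [],
"source": [
"using System.Linq;\n",
"\n",
"// Illustrative only: Preview is intended for debugging, since it reads rows eagerly.\n",
"var preview = idv.Preview(maxRows: 5);\n",
"foreach (var row in preview.RowView)\n",
"{\n",
"    // Each row exposes its values as column-name/value pairs.\n",
"    Console.WriteLine(string.Join(\", \", row.Values.Select(kv => $\"{kv.Key}={kv.Value}\")));\n",
"}"
]
},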
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Split data into train and validation sets\n",
"\n",
"- 80% Train\n",
"- 20% Validation"
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": { | |
"dotnet_interactive": { | |
"language": "csharp" | |
}, | |
"vscode": { | |
"languageId": "dotnet-interactive.csharp" | |
} | |
}, | |
"outputs": [], | |
"source": [ | |
"var trainTestSplit = ctx.Data.TrainTestSplit(idv,testFraction:0.2);\n", | |
"var trainSet = trainTestSplit.TrainSet;\n", | |
"var validation = trainTestSplit.TestSet;" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Define training pipeline\n", | |
"\n", | |
"- `Featurizer`: Applies transformations to data to prepare it for training based on the schema information provided by the column inference results. The resulting output is a feature vector called *Features*.\n", | |
"- `Regression`: Estimator that will automatically explore various regression algorithms and settings to find the best model for the given dataset. " | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": { | |
"dotnet_interactive": { | |
"language": "csharp" | |
}, | |
"vscode": { | |
"languageId": "dotnet-interactive.csharp" | |
} | |
}, | |
"outputs": [], | |
"source": [ | |
"var pipeline = \n", | |
" ctx.Auto().Featurizer(trainSet,columnInferenceResults.ColumnInformation,outputColumnName:\"Features\")\n", | |
" .Append(ctx.Auto().Regression(labelColumnName:columnInferenceResults.ColumnInformation.LabelColumnName, useLgbm:false));" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Configure Experiment" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": { | |
"dotnet_interactive": { | |
"language": "csharp" | |
}, | |
"vscode": { | |
"languageId": "dotnet-interactive.csharp" | |
} | |
}, | |
"outputs": [], | |
"source": [ | |
"var experiment = ctx.Auto().CreateExperiment();\n", | |
"\n", | |
"experiment\n", | |
"\t.SetPipeline(pipeline)\n", | |
"\t.SetTrainingTimeInSeconds(60)\n", | |
"\t.SetRegressionMetric(RegressionMetric.RootMeanSquaredError, labelColumn: columnInferenceResults.ColumnInformation.LabelColumnName)\n", | |
"\t.SetDataset(trainSet, validation);" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Configure logging" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": { | |
"dotnet_interactive": { | |
"language": "csharp" | |
}, | |
"vscode": { | |
"languageId": "dotnet-interactive.csharp" | |
} | |
}, | |
"outputs": [], | |
"source": [ | |
"// Configure logging\n", | |
"ctx.Log += (object? sender, LoggingEventArgs e) =>\n", | |
"{\n", | |
" if (e.Source.Contains(\"AutoMLExperiment\")) Console.WriteLine(e.RawMessage);\n", | |
"};" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Run experiment" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": { | |
"dotnet_interactive": { | |
"language": "csharp" | |
}, | |
"vscode": { | |
"languageId": "dotnet-interactive.csharp" | |
} | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"Channel started\r\n", | |
"Channel started\r\n", | |
"Update Running Trial - Id: 0 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>FastTreeRegression\r\n", | |
"Update Running Trial - Id: 0 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>FastTreeRegression\r\n", | |
"Update Completed Trial - Id: 0 - Metric: 2.775531046547196 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>FastForestRegression - Duration: 2744\r\n", | |
"Update Completed Trial - Id: 0 - Metric: 2.775531046547196 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>FastForestRegression - Duration: 2744\r\n", | |
"Update Best Trial - Id: 0 - Metric: 2.775531046547196 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>FastForestRegression\r\n", | |
"Update Best Trial - Id: 0 - Metric: 2.775531046547196 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>FastForestRegression\r\n", | |
"Update Running Trial - Id: 1 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>FastForestRegression\r\n", | |
"Update Running Trial - Id: 1 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>FastForestRegression\r\n", | |
"Update Completed Trial - Id: 1 - Metric: 5.402454859092199 - Pipeline: ReplaceMissingValues=>OneHotHashEncoding=>Concatenate=>SdcaRegression - Duration: 2892\r\n", | |
"Update Completed Trial - Id: 1 - Metric: 5.402454859092199 - Pipeline: ReplaceMissingValues=>OneHotHashEncoding=>Concatenate=>SdcaRegression - Duration: 2892\r\n", | |
"Update Running Trial - Id: 2 - Pipeline: ReplaceMissingValues=>OneHotHashEncoding=>Concatenate=>SdcaRegression\r\n", | |
"Update Running Trial - Id: 2 - Pipeline: ReplaceMissingValues=>OneHotHashEncoding=>Concatenate=>SdcaRegression\r\n", | |
"Update Completed Trial - Id: 2 - Metric: 2.775531046547196 - Pipeline: ReplaceMissingValues=>OneHotHashEncoding=>Concatenate=>FastForestRegression - Duration: 3014\r\n", | |
"Update Completed Trial - Id: 2 - Metric: 2.775531046547196 - Pipeline: ReplaceMissingValues=>OneHotHashEncoding=>Concatenate=>FastForestRegression - Duration: 3014\r\n", | |
"Update Running Trial - Id: 3 - Pipeline: ReplaceMissingValues=>OneHotHashEncoding=>Concatenate=>FastForestRegression\r\n", | |
"Update Running Trial - Id: 3 - Pipeline: ReplaceMissingValues=>OneHotHashEncoding=>Concatenate=>FastForestRegression\r\n", | |
"Update Completed Trial - Id: 3 - Metric: 10.530742315550812 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>FastTreeRegression - Duration: 5229\r\n", | |
"Update Completed Trial - Id: 3 - Metric: 10.530742315550812 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>FastTreeRegression - Duration: 5229\r\n", | |
"Update Running Trial - Id: 4 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>FastTreeRegression\r\n", | |
"Update Running Trial - Id: 4 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>FastTreeRegression\r\n", | |
"Update Completed Trial - Id: 4 - Metric: 5.3937182735638975 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>SdcaRegression - Duration: 5183\r\n", | |
"Update Completed Trial - Id: 4 - Metric: 5.3937182735638975 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>SdcaRegression - Duration: 5183\r\n", | |
"Update Running Trial - Id: 5 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>SdcaRegression\r\n", | |
"Update Running Trial - Id: 5 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>SdcaRegression\r\n", | |
"Update Completed Trial - Id: 5 - Metric: 10.530742315550812 - Pipeline: ReplaceMissingValues=>OneHotHashEncoding=>Concatenate=>FastTreeRegression - Duration: 2010\r\n", | |
"Update Completed Trial - Id: 5 - Metric: 10.530742315550812 - Pipeline: ReplaceMissingValues=>OneHotHashEncoding=>Concatenate=>FastTreeRegression - Duration: 2010\r\n", | |
"Update Running Trial - Id: 6 - Pipeline: ReplaceMissingValues=>OneHotHashEncoding=>Concatenate=>FastTreeRegression\r\n", | |
"Update Running Trial - Id: 6 - Pipeline: ReplaceMissingValues=>OneHotHashEncoding=>Concatenate=>FastTreeRegression\r\n", | |
"Update Completed Trial - Id: 6 - Metric: 14.211149762211765 - Pipeline: ReplaceMissingValues=>OneHotHashEncoding=>Concatenate=>LbfgsPoissonRegressionRegression - Duration: 5686\r\n", | |
"Update Completed Trial - Id: 6 - Metric: 14.211149762211765 - Pipeline: ReplaceMissingValues=>OneHotHashEncoding=>Concatenate=>LbfgsPoissonRegressionRegression - Duration: 5686\r\n", | |
"Update Running Trial - Id: 7 - Pipeline: ReplaceMissingValues=>OneHotHashEncoding=>Concatenate=>LbfgsPoissonRegressionRegression\r\n", | |
"Update Running Trial - Id: 7 - Pipeline: ReplaceMissingValues=>OneHotHashEncoding=>Concatenate=>LbfgsPoissonRegressionRegression\r\n", | |
"Update Completed Trial - Id: 7 - Metric: 2.775531046547196 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>FastForestRegression - Duration: 3093\r\n", | |
"Update Completed Trial - Id: 7 - Metric: 2.775531046547196 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>FastForestRegression - Duration: 3093\r\n", | |
"Update Running Trial - Id: 8 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>FastForestRegression\r\n", | |
"Update Running Trial - Id: 8 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>FastForestRegression\r\n", | |
"Update Completed Trial - Id: 8 - Metric: 14.226784641417762 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>LbfgsPoissonRegressionRegression - Duration: 7934\r\n", | |
"Update Completed Trial - Id: 8 - Metric: 14.226784641417762 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>LbfgsPoissonRegressionRegression - Duration: 7934\r\n", | |
"Update Running Trial - Id: 9 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>LbfgsPoissonRegressionRegression\r\n", | |
"Update Running Trial - Id: 9 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>LbfgsPoissonRegressionRegression\r\n", | |
"Update Completed Trial - Id: 9 - Metric: 5.040059189971107 - Pipeline: ReplaceMissingValues=>OneHotHashEncoding=>Concatenate=>SdcaRegression - Duration: 4136\r\n", | |
"Update Completed Trial - Id: 9 - Metric: 5.040059189971107 - Pipeline: ReplaceMissingValues=>OneHotHashEncoding=>Concatenate=>SdcaRegression - Duration: 4136\r\n", | |
"Update Running Trial - Id: 10 - Pipeline: ReplaceMissingValues=>OneHotHashEncoding=>Concatenate=>SdcaRegression\r\n", | |
"Update Running Trial - Id: 10 - Pipeline: ReplaceMissingValues=>OneHotHashEncoding=>Concatenate=>SdcaRegression\r\n", | |
"Update Completed Trial - Id: 10 - Metric: 2.775531046547196 - Pipeline: ReplaceMissingValues=>OneHotHashEncoding=>Concatenate=>FastForestRegression - Duration: 3233\r\n", | |
"Update Completed Trial - Id: 10 - Metric: 2.775531046547196 - Pipeline: ReplaceMissingValues=>OneHotHashEncoding=>Concatenate=>FastForestRegression - Duration: 3233\r\n", | |
"Update Running Trial - Id: 11 - Pipeline: ReplaceMissingValues=>OneHotHashEncoding=>Concatenate=>FastForestRegression\r\n", | |
"Update Running Trial - Id: 11 - Pipeline: ReplaceMissingValues=>OneHotHashEncoding=>Concatenate=>FastForestRegression\r\n", | |
"Update Completed Trial - Id: 11 - Metric: 13.89212957681258 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>FastTreeRegression - Duration: 3765\r\n", | |
"Update Completed Trial - Id: 11 - Metric: 13.89212957681258 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>FastTreeRegression - Duration: 3765\r\n", | |
"Update Running Trial - Id: 12 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>FastTreeRegression\r\n", | |
"Update Running Trial - Id: 12 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>FastTreeRegression\r\n", | |
"Update Completed Trial - Id: 12 - Metric: 4.442418125775676 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>SdcaRegression - Duration: 6165\r\n", | |
"Update Completed Trial - Id: 12 - Metric: 4.442418125775676 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>SdcaRegression - Duration: 6165\r\n", | |
"Update Running Trial - Id: 13 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>SdcaRegression\r\n", | |
"Update Running Trial - Id: 13 - Pipeline: ReplaceMissingValues=>OneHotEncoding=>Concatenate=>SdcaRegression\r\n", | |
"Update Completed Trial - Id: 13 - Metric: 14.50397627304784 - Pipeline: ReplaceMissingValues=>OneHotHashEncoding=>Concatenate=>FastTreeRegression - Duration: 3285\r\n", | |
"Update Completed Trial - Id: 13 - Metric: 14.50397627304784 - Pipeline: ReplaceMissingValues=>OneHotHashEncoding=>Concatenate=>FastTreeRegression - Duration: 3285\r\n", | |
"Update Running Trial - Id: 14 - Pipeline: ReplaceMissingValues=>OneHotHashEncoding=>Concatenate=>FastTreeRegression\r\n", | |
"Update Running Trial - Id: 14 - Pipeline: ReplaceMissingValues=>OneHotHashEncoding=>Concatenate=>FastTreeRegression\r\n" | |
] | |
} | |
], | |
"source": [ | |
"var result = await experiment.RunAsync();" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Display evaluation metric for the best model" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": { | |
"dotnet_interactive": { | |
"language": "csharp" | |
}, | |
"vscode": { | |
"languageId": "dotnet-interactive.csharp" | |
} | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"RMSE: 2.775531046547196\r\n"
]
}
],
"source": [
"Console.WriteLine($\"RMSE: {result.Metric}\");"
] | |
}, | |
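{
"cell_type": "markdown",
"metadata": {},
"source": [
"`result.Metric` holds only the metric the experiment was configured to optimize, which here is `RootMeanSquaredError`. To see other regression metrics such as R-Squared, one option is to score the validation set with the best model and evaluate it directly. The cell below is a sketch under two assumptions: that `result.Model` writes its predictions to the default `Score` column, and that evaluating on the same validation set used during the experiment is acceptable for a quick check. The variable names are illustrative."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"dotnet_interactive": {
"language": "csharp"
},
"vscode": {
"languageId": "dotnet-interactive.csharp"
}
},
"outputs": [],
"source": [
"// Illustrative only: score the validation set with the best model found by the experiment.\n",
"var labelColumnName = columnInferenceResults.ColumnInformation.LabelColumnName;\n",
"var predictions = result.Model.Transform(validation);\n",
"\n",
"// Assumes predictions are in the default \"Score\" column.\n",
"var metrics = ctx.Regression.Evaluate(predictions, labelColumnName: labelColumnName);\n",
"\n",
"Console.WriteLine($\"RMSE: {metrics.RootMeanSquaredError}\");\n",
"Console.WriteLine($\"MAE: {metrics.MeanAbsoluteError}\");\n",
"Console.WriteLine($\"R-Squared: {metrics.RSquared}\");"
]
},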
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Save the best model" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": { | |
"dotnet_interactive": { | |
"language": "csharp" | |
}, | |
"vscode": { | |
"languageId": "dotnet-interactive.csharp" | |
} | |
}, | |
"outputs": [], | |
"source": [ | |
"ctx.Model.Save(result.Model,idv.Schema,\"taxi-fare.mlnet\");" | |
] | |
} | |
], | |
"metadata": { | |
"kernelspec": { | |
"display_name": ".NET (C#)", | |
"language": "C#", | |
"name": ".net-csharp" | |
}, | |
"language_info": { | |
"name": "C#" | |
} | |
}, | |
"nbformat": 4, | |
"nbformat_minor": 2 | |
} |