https://www.youtube.com/watch?v=JMUxmLyrhSk
- Google recommendations
- legal document analysis (J.P. Morgan)
- IBM Watson medical technology
- self-driving cars
- emotion recognition in tweets
- ...
- Artificial Narrow Intelligence (ANI): handles only specific problems
- Artificial General Intelligence (AGI): can perform any intellectual task
- Artificial Super Intelligence (ASI): a time when AI will surpass humans; it doesn't exist yet
- Python
- R
- Java
- Lisp
- Prolog
- C++
Machine learning is used in AI
- increase in data generation
- improve decision making
- uncover patterns & trends in data
- solve complex problems
Machine learning (ML) is the scientific study of algorithms and statistical models that computer systems use to perform a specific task without using explicit instructions, relying on patterns and inference instead.
Algorithm: a set of rules and statistical techniques used to learn patterns from data
Model: what you get by training an ML algorithm on data
Predictor Variable: a feature (or features) of the data that can be used to predict the output
Response Variable: the output variable that needs to be predicted using the predictor variable(s)
Training Data: the data used to build the machine learning model
Testing Data: the data used to evaluate the machine learning model
Problem model
- Define objective
- Data gathering
- Preparing data
- Data exploration
- Building a model
- Model evaluation
- Predictions
- Data gathering: collect weather conditions, humidity level, temperature, pressure, etc.
- Preparing data: transform the data and clean it
- Exploratory data analysis: understand the patterns and trends in the data (e.g. when the temperature is low, rain is possible)
- Building a model: split the data into training and testing sets, and always build the model on the training data, using an algorithm such as linear regression or a decision tree
- Model evaluation and optimization: evaluate the model on the testing data and tune it
- Predictions: the final output is predicted after tuning the model and improving its accuracy
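A minimal sketch of the train/test split mentioned in the model-building step. The weather records, the 80/20 ratio and the helper name `train_test_split` are assumptions made up for illustration, not something from the video.

```python
import random

def train_test_split(rows, test_ratio=0.2, seed=0):
    """Shuffle a copy of rows and split it into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_test = int(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical records: (humidity %, temperature in C, rained? 1/0)
records = [(85, 12, 1), (40, 25, 0), (90, 10, 1), (35, 28, 0), (70, 15, 1),
           (30, 30, 0), (80, 14, 1), (45, 22, 0), (95, 9, 1), (50, 24, 0)]

train, test = train_test_split(records, test_ratio=0.2)
print(len(train), len(test))  # 8 rows to build the model, 2 rows to evaluate it
```

The model only ever sees `train`; `test` is kept aside so the evaluation is done on data the model has not seen.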
- Supervised learning: we teach or train the machine using well-labelled data
- Unsupervised learning: the data is unlabelled and the algorithm acts on it without guidance
- Reinforcement learning: an agent is put in an environment and learns how to behave in it by performing actions and observing the rewards
- regression
- classification
- clustering
Build an ML model to predict a price: linear regression algorithm, supervised learning, regression problem
KNN algorithm: supervised learning, classification problem
k-means algorithm: unsupervised learning, clustering problem
- Linear regression
- Logistic regression
- Decision Tree
- Random Forest
- Naive Bayes Classifier
- K Nearest Neighbour
- Support Vector Machines
Linear regression is a method to predict a dependent variable (Y) based on the values of independent variables (X). It can be used in cases where we want to predict a continuous quantity.
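A small sketch of linear regression with one independent variable, fitted with ordinary least squares. The house sizes and prices are invented, and `fit_line` is just a name chosen for this example.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y divided by the variance of x gives the slope.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: X = size in square metres, Y = price in thousands
sizes = [50, 60, 80, 100]
prices = [150, 180, 240, 300]

slope, intercept = fit_line(sizes, prices)
predicted_price = slope * 70 + intercept  # continuous prediction for an unseen size
print(slope, intercept, predicted_price)
```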
Logistic regression is a method used to predict a dependent variable from a set of independent variables when the dependent variable is categorical
it predicts categorical quantities rather than continuous ones
classification algorithm
the output is yes/no, 0 or 1
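A rough sketch of how a logistic regression model turns a weighted input into a 0/1 answer. The weight and bias are hand-picked assumptions (no training is shown), and the "hours studied" reading of the input is only an illustration.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(x, weight, bias):
    """Probability that the class is 1 for input x."""
    return sigmoid(weight * x + bias)

def predict_class(x, weight, bias, threshold=0.5):
    """Categorical output: 1 (yes) or 0 (no)."""
    return 1 if predict_proba(x, weight, bias) >= threshold else 0

# Assumed parameters: x = hours studied, class 1 = exam passed
WEIGHT, BIAS = 1.0, -5.0
print(predict_class(2, WEIGHT, BIAS))  # low probability, so class 0
print(predict_class(8, WEIGHT, BIAS))  # high probability, so class 1
```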
A decision tree is a supervised ML algorithm that looks like an inverted tree, in which each node represents a predictor variable, the link between nodes represents a decision, and each leaf node represents an outcome.
root node: the starting point
internal nodes: decision points
leaf/terminal nodes: the outcome
branches: connections between nodes
ID3 is one of the most commonly used algorithms for building a decision tree
choose the best attribute as the root
How to find the best attribute? It is the one that best splits the data into its different classes
this is measured with information gain and entropy
entropy measures the impurity or uncertainty present in the data
information gain (IG) indicates how much "information" a particular feature/value gives us about the final outcome
it is necessary to calculate the entropy and IG of each attribute to choose the right one
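A sketch of the entropy and information gain calculation used to pick the root attribute. The tiny weather table is invented for illustration; the formulas are the standard Shannon entropy and the entropy reduction after a split.

```python
import math
from collections import Counter

def entropy(labels):
    """Impurity of a list of class labels, in bits."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(rows, labels, attribute_index):
    """Entropy before the split minus the weighted entropy after it."""
    total = len(labels)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute_index], []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Hypothetical data: (outlook, humidity) -> did it rain?
rows = [("sunny", "high"), ("sunny", "low"), ("rainy", "high"), ("rainy", "low")]
labels = ["no", "no", "yes", "yes"]

gains = [information_gain(rows, labels, i) for i in range(2)]
best_attribute = max(range(2), key=lambda i: gains[i])
print(gains, best_attribute)  # outlook separates the classes perfectly, humidity not at all
```

The attribute with the highest gain becomes the root, and the same calculation is repeated on each branch.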
Random forest is a classification algorithm that builds multiple decision trees and combines them to get a more accurate and stable prediction
- more accuracy
- helps avoid overfitting
- bagging
each tree is trained on a bootstrap dataset (a sample of the training data drawn with replacement)
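A sketch of the two ingredients of bagging: drawing a bootstrap dataset and combining the trees' answers by majority vote. The trees themselves are left out; the list of per-tree predictions is a made-up stand-in.

```python
import random
from collections import Counter

def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement, so some rows repeat and some are left out."""
    return [rng.choice(rows) for _ in rows]

def majority_vote(predictions):
    """Combine the predictions of several trees into one answer."""
    return Counter(predictions).most_common(1)[0][0]

rng = random.Random(42)  # seeded only to make the example repeatable
rows = [("sunny", "no"), ("rainy", "yes"), ("cloudy", "yes"), ("sunny", "no")]
sample = bootstrap_sample(rows, rng)

# Hypothetical predictions from three trees trained on three bootstrap datasets
tree_predictions = ["yes", "no", "yes"]
print(sample, majority_vote(tree_predictions))
```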
Naive Bayes: a supervised classification algorithm that follows a probabilistic approach
it is based on Bayes' theorem, which is used to solve classification problems by following a probabilistic approach
it is based on the assumption that the predictor variables in a machine learning model are independent of each other
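A sketch of a Naive Bayes classifier for categorical features, built from plain counts. The weather rows are invented, and no smoothing is applied, so a feature value never seen with a class gives that class a score of zero (real implementations usually add smoothing).

```python
from collections import Counter

def train_naive_bayes(rows, labels):
    """Count how often each class and each (feature, value, class) combination occurs."""
    class_counts = Counter(labels)
    feature_counts = Counter()
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            feature_counts[(i, value, label)] += 1
    return class_counts, feature_counts

def predict(row, class_counts, feature_counts):
    """Score each class as P(class) * product of P(feature value | class)."""
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total  # prior P(class)
        for i, value in enumerate(row):
            # Independence assumption: each feature contributes its own factor.
            score *= feature_counts[(i, value, label)] / count
        scores[label] = score
    return max(scores, key=scores.get), scores

# Hypothetical data: (outlook, temperature) -> did it rain?
rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"),
        ("rainy", "cool"), ("sunny", "hot"), ("rainy", "mild")]
labels = ["no", "no", "yes", "yes", "no", "yes"]

class_counts, feature_counts = train_naive_bayes(rows, labels)
prediction, scores = predict(("rainy", "mild"), class_counts, feature_counts)
print(prediction, scores)
```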
K nearest neighbour is a supervised learning algorithm that classifies a new data point into a target class depending on the features of its neighbouring data points
it makes no assumption about how the variables are correlated
lazy algorithm: it keeps the training data and uses it directly at prediction time instead of building an explicit model first
k is the number of nearest neighbours considered
distance is measured with Euclidean distance or Manhattan distance
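A sketch of KNN classification with both distance measures. The 2-D points and the labels "a"/"b" are invented for illustration.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Straight-line distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    """Distance measured along the axes (sum of absolute differences)."""
    return sum(abs(x - y) for x, y in zip(a, b))

def knn_classify(point, data, labels, k=3, distance=euclidean):
    """Return the most common label among the k training points closest to point."""
    nearest = sorted(zip(data, labels), key=lambda pair: distance(point, pair[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

data = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
labels = ["a", "a", "a", "b", "b", "b"]

print(knn_classify((2, 2), data, labels, k=3))
print(knn_classify((7, 7), data, labels, k=3, distance=manhattan))
```

Nothing is computed ahead of time: all the work happens when a new point arrives, which is what "lazy" refers to.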
Support vector machine: a classification and regression algorithm that separates data using hyperplanes; it draws a decision boundary to classify the data and can also be used to classify non-linear data
K-means clustering: the process by which objects are classified into a predefined number of groups so that they are as dissimilar as possible from one group to another, but as similar as possible within each group.
start
  |
  ▼
number of clusters K ---> centroid ---> distance of objects to centroid ---> grouping based on min. distance
                             ▲                                                         |
                             | false                                                   |
                             +------------- centroid has converged? <------------------+
                                                     | true
                                                     ▼
                                                    end
- decide the number of clusters
- provide an initial centroid for each cluster
- the algorithm calculates the Euclidean distance of each point from each centroid and assigns the point to the nearest one
- next, the centroids are recalculated
- the points are reassigned
- and so on again
- these steps are repeated until the centroids stop changing
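The steps above can be sketched as follows. The points and the two starting centroids are invented, and `math.dist` (Python 3.8+) is used for the Euclidean distance.

```python
import math

def k_means(points, centroids, max_iterations=100):
    """Assign points to the nearest centroid, recompute centroids, repeat until they stop changing."""
    clusters = []
    for _ in range(max_iterations):
        # Grouping step: each point goes to the cluster of its closest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its cluster
        # (an empty cluster keeps its old centroid).
        new_centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:  # converged: the centroids repeated
            break
        centroids = new_centroids
    return centroids, clusters

points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
centroids, clusters = k_means(points, centroids=[(0, 0), (10, 10)])
print(centroids)
print(clusters)
```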
the elbow method can be used to choose the number of clusters K
an agent is put inside an unknown environment and, through some function, it works out where it is and what it can do.
- agent
- environment
- action
- state
the mathematical approach for mapping a solution in reinforcement learning is called a Markov decision process (MDP)
parameters:
- set of actions a
- set of states s
- reward r
- policy
- value
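A toy sketch of how these parameters fit together. Everything here is an assumption for illustration: a three-state corridor, a reward of 1 for reaching the last state, a discount factor of 0.9, and value iteration as the solving method (the notes do not name one).

```python
GAMMA = 0.9                  # assumed discount factor for future rewards
STATES = [0, 1, 2]           # set of states s (state 2 is the goal)
ACTIONS = ["left", "right"]  # set of actions a
TERMINAL = 2

def step(state, action):
    """Environment: return (next_state, reward) for a deterministic move."""
    next_state = max(state - 1, 0) if action == "left" else min(state + 1, TERMINAL)
    reward = 1.0 if next_state == TERMINAL else 0.0
    return next_state, reward

def action_value(state, action, value):
    """Reward now plus the discounted value of where the action leads."""
    next_state, reward = step(state, action)
    return reward + GAMMA * value[next_state]

def value_iteration(sweeps=50):
    """Value: the best total discounted reward obtainable from each state."""
    value = {s: 0.0 for s in STATES}
    for _ in range(sweeps):
        for s in STATES:
            if s != TERMINAL:
                value[s] = max(action_value(s, a, value) for a in ACTIONS)
    return value

def greedy_policy(value):
    """Policy: in each state, pick the action with the highest value."""
    return {s: max(ACTIONS, key=lambda a: action_value(s, a, value))
            for s in STATES if s != TERMINAL}

value = value_iteration()
policy = greedy_policy(value)
print(value, policy)
```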
AI: a technique which enables machines to mimic human actions
ML: a subset of AI techniques which uses statistical methods to enable machines to improve with experience
DL: a subset of ML which makes the computation of multi-layer neural networks feasible
Limitations of ML: it is not capable of handling high-dimensional data
feature extraction is difficult
Deep learning models are capable of focusing on the right features by themselves, requiring little guidance from the programmer. These models also partially solve the dimensionality problem.
Deep learning is a form of machine learning that uses a model of computing that's very much inspired by the structure of the brain.
Neuron:
- Dendrite: receives signals from other neurons
- Cell body: sums all the inputs
- Axon: used to transmit signals to other cells
Deep learning is a collection of statistical machine learning techniques used to learn feature hierarchies based on the concept of artificial neural networks.
An artificial neuron, or perceptron, is a linear model used for binary classification. It models a neuron which has a set of inputs, each of which is given a specific weight. The neuron computes some function on these weighted inputs and gives the output.
it is a linear, binary classifier
it computes a weighted sum of its inputs
activation function: a threshold above which the neuron is activated
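A sketch of a single perceptron: weighted sum followed by a threshold activation. The weights and bias are hand-picked so that the neuron behaves like a logical AND; they are an assumption for illustration, not learned values.

```python
def perceptron_output(inputs, weights, bias, threshold=0.0):
    """Fire (1) when the weighted sum of the inputs exceeds the threshold, else stay silent (0)."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if weighted_sum > threshold else 0

# Assumed parameters: only the input (1, 1) pushes the sum above zero
AND_WEIGHTS = [1.0, 1.0]
AND_BIAS = -1.5

for inputs in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(inputs, perceptron_output(inputs, AND_WEIGHTS, AND_BIAS))
```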
A multi-layer perceptron has the same structure as a single-layer perceptron but with one or more hidden layers, and is thus considered a deep neural network
in a fully connected feed-forward network, every neuron in a layer is connected to every neuron in the next layer
Backpropagation is a supervised learning method for multi-layer perceptrons
example: classify leads on the basis of priority
goal: to reduce the error
propagate values backwards and update the weights to reduce the error
calculate the error
calculate the rate of change of the error with respect to a change in the weights
update the weights based on that rate of change
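A sketch of those three steps on the smallest possible network: one input, one hidden neuron and one output neuron, both linear so the derivatives stay short. The starting weights, learning rate and training pair are assumptions for illustration.

```python
def forward(x, w1, w2):
    """Forward pass: input -> hidden neuron -> output neuron (no activation function)."""
    hidden = w1 * x
    output = w2 * hidden
    return hidden, output

def backprop_step(x, target, w1, w2, learning_rate=0.1):
    """One pass: compute the error, its rate of change per weight, then update the weights."""
    hidden, output = forward(x, w1, w2)
    error = 0.5 * (output - target) ** 2  # 1. calculate the error
    d_output = output - target            # dE/d(output)
    grad_w2 = d_output * hidden           # 2. dE/d(w2) via the chain rule
    d_hidden = d_output * w2              #    error signal propagated backwards
    grad_w1 = d_hidden * x                #    dE/d(w1)
    new_w1 = w1 - learning_rate * grad_w1  # 3. update the weights
    new_w2 = w2 - learning_rate * grad_w2
    return new_w1, new_w2, error

w1, w2 = 0.5, 0.5
errors = []
for _ in range(200):
    w1, w2, error = backprop_step(x=1.0, target=1.0, w1=w1, w2=w2)
    errors.append(error)

print(errors[0], errors[-1])  # the error shrinks as the weights are updated
```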
A trained feed-forward network can be exposed to any random collection of photographs, and the first photograph it is exposed to will not necessarily alter how it classifies the second one.
Recurrent networks are a type of artificial neural network designed to recognize patterns in sequences of data, such as text, genomes, handwriting, the spoken word, or numerical time series data emanating from sensors, stock markets and government agencies.
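A toy sketch of the difference: a recurrent unit carries a hidden state from one input to the next, so earlier items in the sequence change the result for later ones. The single-number state, the `tanh` activation and the weights are assumptions chosen to keep the example small.

```python
import math

def rnn_step(x, hidden, w_in, w_rec):
    """New hidden state from the current input and the previous hidden state."""
    return math.tanh(w_in * x + w_rec * hidden)

def run_sequence(sequence, w_in=0.5, w_rec=0.8):
    """Feed the sequence one item at a time and return the final hidden state."""
    hidden = 0.0
    for x in sequence:
        hidden = rnn_step(x, hidden, w_in, w_rec)
    return hidden

# Both sequences end with the same input, but the final states differ
# because the first input is remembered through the hidden state.
state_a = run_sequence([1.0, 0.0])
state_b = run_sequence([0.0, 0.0])
print(state_a, state_b)
```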
Convolutional networks: a particular neuron is connected only to a few neurons in a region