\documentclass[12pt]{article}
\usepackage{fullpage} % Package to use full page
\usepackage{parskip} % Package to tweak paragraph skipping
\usepackage{tikz} % Package for drawing
\usepackage{amsmath}
\usepackage{hyperref}
\usepackage{graphicx} % needed for \includegraphics
\usepackage[section]{placeins}
% \usepackage[scaled=2]{beramono}
\usepackage{charter}
\usepackage{color}
\usepackage{listings}
\usepackage{transparent}
\begin{document}
\title{PATTERN RECOGNITION \\ Assignment 3 \\ IITM}
\author{Dhaval Limdiwala - CS16M051 \\
Subhasis Maity - CS16M055}
\date{06/11/2016}
\maketitle
\begin{figure}[!htb]
\begin{center}
\minipage{0.9\textwidth}
\transparent{0.2}\includegraphics[width=\linewidth]{12345.jpg}
\endminipage\hfill
\end{center}
\end{figure}
\break
% ====================
% Subhasis
% ====================
\tableofcontents
\begin{center}
\section{GMM (Gaussian Mixture Model)}
\end{center}
\subsection{Introduction}
\begin{flushleft}
A Gaussian mixture model is a simple linear superposition of Gaussian components, aimed at providing a richer class of density models
than the single Gaussian. A Gaussian mixture distribution can be written as a linear superposition of Gaussians in the form
\begin{equation}
p(x) = \sum_{k=1}^{K} \pi_{k}\,\mathcal{N}(x \mid \mu_{k}, \Sigma_{k})
\end{equation}
\end{flushleft}
\subsection{Goal}
\begin{flushleft}
Identify images using a GMM (Gaussian Mixture Model).
\end{flushleft}
\subsection{Data}
\begin{flushleft}
The given files contained features extracted from different images, each as a 36x23 matrix.
\end{flushleft}
\subsection{Observation and Preprocessing of Data}
\begin{flushleft}
We created one vector of dimension 1x828 for each image by flattening its 36x23 feature matrix.
\end{flushleft}
\subsection{Experiment}
\begin{flushleft}
In the previous assignment we were given 3 classes of images for our classifier, but now we are given 8 classes. We build a mixture of 5 Gaussians for each class: the K-means algorithm is used to form 5 clusters in each class's data, and the log-likelihood is then computed for each class.
After finding the log-likelihood, the E-M algorithm is applied to maximize it. We then have the parameters of each cluster for each class: the mixture weights $\pi_k$, means $\mu_k$, and covariances $\Sigma_k$. These parameters are applied in the discriminant function to classify an input file.
We have taken \textbf{\textit{70\%}} of the given data for training and the remaining \textbf{\textit{30\%}} for testing the accuracy of the classifier.
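The E-M loop described above can be summarized in code. The following is a minimal sketch in hypothetical Python/NumPy (the random initialization stands in for the K-means step, and the regularization term is an assumption to keep the 828-dimensional covariances invertible); the actual experiment was run separately per class.
\begin{lstlisting}[language=Python]
import numpy as np

def fit_gmm(X, K=5, iters=50, reg=1e-3):
    """Fit a K-component GMM to the rows of X by E-M."""
    n, d = X.shape
    rng = np.random.default_rng(0)
    mu = X[rng.choice(n, K, replace=False)]      # stand-in for K-means init
    sigma = np.array([np.cov(X.T) + reg * np.eye(d) for _ in range(K)])
    pi = np.full(K, 1.0 / K)
    for _ in range(iters):
        # E-step: responsibilities r[i, k] ~ pi_k N(x_i | mu_k, Sigma_k)
        log_r = np.empty((n, K))
        for k in range(K):
            diff = X - mu[k]
            _, logdet = np.linalg.slogdet(sigma[k])
            maha = np.sum(diff * np.linalg.solve(sigma[k], diff.T).T, axis=1)
            log_r[:, k] = np.log(pi[k]) - 0.5 * (logdet + maha + d * np.log(2 * np.pi))
        log_r -= log_r.max(axis=1, keepdims=True)
        r = np.exp(log_r)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means and covariances
        Nk = r.sum(axis=0)
        pi = Nk / n
        mu = (r.T @ X) / Nk[:, None]
        for k in range(K):
            diff = X - mu[k]
            sigma[k] = (r[:, k, None] * diff).T @ diff / Nk[k] + reg * np.eye(d)
    return pi, mu, sigma
\end{lstlisting}
Classification then assigns a test vector to the class whose fitted mixture gives the highest log-likelihood.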
\begin{figure}[!htb]
\begin{center}
\minipage{0.9\textwidth}
\includegraphics[width=\linewidth]{5.jpg}
\caption{ROC curve}\label{fig:gmm_roc}
\endminipage\hfill
\end{center}
\end{figure}
\break
\begin{figure}[!htb]
\begin{center}
\minipage{0.9\textwidth}
\includegraphics[width=\linewidth]{4.JPG}
\caption{Confusion Matrix}\label{fig:gmm_confusion}
\endminipage\hfill
\end{center}
\end{figure}
\end{flushleft}
\subsection{Observation}
\begin{flushleft}
1. The efficiency increases as the number of mixture components increases.
\end{flushleft}
\break
% DHMM
\begin{center}
\section{DHMM (Discrete Hidden Markov Model)}
\end{center}
\subsection{Introduction}
\begin{flushleft}
A DHMM is a supervised learning model in which the observations are discrete symbols emitted by hidden states; the transition probabilities
between states and the probabilities of emitting each symbol are learned from labelled training data.
\end{flushleft}
\subsection{Goal}
\begin{flushleft}
Identify digit sequences using a DHMM (Discrete Hidden Markov Model).
\end{flushleft}
\subsection{Data}
\begin{flushleft}
The given files contained features extracted for each digit individually.
\end{flushleft}
\subsection{Observation and Preprocessing of Data}
\begin{flushleft}
Each file contained multiple feature vectors for one digit; each row had 39 components.
\end{flushleft}
\subsection{Experiment}
\begin{flushleft}
We used the MIT HMM toolkit for this experiment.
First we applied K-means clustering to quantize the feature vectors into discrete symbols.
Then we assigned multiple clusters to the different digits (classes).
After that we applied dhmm\_em() to train on the data with multiple iterations.
Next we calculated the classification accuracy on the test data by computing log probabilities with dhmm\_logprob().
For 15 states and 50 iterations of dhmm\_em() the accuracy was around \textbf{\textit{82\%}}. \break
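The log probability returned by dhmm\_logprob() is computed by the forward algorithm. A minimal sketch of that computation (hypothetical Python/NumPy; the MIT toolkit's own implementation is in MATLAB):
\begin{lstlisting}[language=Python]
import numpy as np

def dhmm_loglik(obs, prior, A, B):
    """Log P(obs | model) via the scaled forward algorithm.
    obs:   symbol indices from the K-means quantization
    prior: initial state distribution, shape (S,)
    A:     state transition matrix, shape (S, S)
    B:     emission matrix B[s, v] = P(symbol v | state s)
    """
    alpha = prior * B[:, obs[0]]        # forward probabilities at t = 0
    loglik = np.log(alpha.sum())
    alpha /= alpha.sum()                # rescale to avoid underflow
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]   # propagate one step, absorb symbol o
        loglik += np.log(alpha.sum())
        alpha /= alpha.sum()
    return loglik
\end{lstlisting}
A test sequence is assigned to the digit whose model gives the highest log probability.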
\end{flushleft}
\subsection{Observation}
\begin{flushleft}
1. The accuracy increased as the number of states and the iteration count increased.
\end{flushleft}
% DTW
\begin{center}
\section{DTW (Dynamic Time Warping)}
\end{center}
\subsection{Introduction}
\begin{flushleft}
DTW (Dynamic Time Warping) is used in time-series analysis. It can be used to measure the similarity
between two sequences of varying length.
\end{flushleft}
\subsection{Goal}
\begin{flushleft}
Identify city names from speech using DTW (Dynamic Time Warping).
\end{flushleft}
\subsection{Data}
\begin{flushleft}
We were given .wav files and .mfcc files corresponding to different cities of south India.
The .mfcc files contain the features extracted from the .wav files.
\end{flushleft}
\subsection{Observation and Preprocessing of Data}
\begin{flushleft}
It was observed that for many cities there was not enough data, i.e. not enough .mfcc files.
So we reduced the data set by ignoring the .mfcc files of cities having fewer than 20 .mfcc files.
The reduced data was then split: \textbf{\textit{25\%}} of the .mfcc files for each city were set aside for testing,
and the remaining \textbf{\textit{75\%}} of the data (.mfcc files) was used for training. \break
\textbf{Before} data reduction \break
Total number of given .mfcc files: 2737 \break
\textbf{After} data reduction \break
Total number of test .mfcc files: 216 \break
Total number of train .mfcc files: 539 \break
\textbf{Note}: When .mfcc files are not given, they can be generated from the .wav files using MATLAB.
\end{flushleft}
\subsection{Experiment}
\begin{flushleft}
We tried two approaches to classify the test cities. \break
\textbf{Approach 1}: \break
We calculated the DTW distance between each test (.mfcc) file and every train (.mfcc) file;
the city of the train (.mfcc) file giving the minimum DTW distance is taken as the classification. All the coding was done
in C++.
With this approach we got around \textbf{\textit{20\%}} efficiency.
The approach is shown pictorially below.
\begin{figure}[!htb]
\begin{center}
\minipage{0.8\textwidth}
\includegraphics[width=\linewidth]{1.png}
\caption{Approach 1}\label{fig:dtw_approach1}
\endminipage\hfill
\end{center}
\end{figure}
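The DTW distance used in both approaches has a simple dynamic-programming form. A minimal sketch (hypothetical Python/NumPy; the actual runs used our C++ and Python implementations):
\begin{lstlisting}[language=Python]
import numpy as np

def dtw_distance(a, b):
    """DTW distance between MFCC sequences a (n x d) and b (m x d)."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])  # local frame distance
            # Same table shape as LCS, with min-cost in place of max-length.
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]
\end{lstlisting}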
\break
\textbf{Approach 2}: \break
The first approach did not give reasonable efficiency, so we tried averaging the DTW distance over all train (.mfcc) files having the same city name. This time we used Python.
With this approach we got around \textbf{\textit{35\%}} efficiency.
The approach is shown pictorially below.
\begin{figure}[!htb]
\begin{center}
\minipage{0.8\textwidth}
\includegraphics[width=\linewidth]{2.png}
\caption{Approach 2}\label{fig:dtw_approach2}
\endminipage\hfill
\end{center}
\end{figure}
\end{flushleft}
\subsection{Inferences}
\begin{flushleft}
1. We need a larger data set to get reasonable classification.
2. The DTW algorithm is quite similar to LCS (Longest Common Subsequence), with a small modification.
3. Classification using DTW is extremely computation-intensive.
\end{flushleft}
\subsection{Links}
\begin{flushleft}
All the code is available at the following
\color{red}
\href{https://github.com/holasm/pr_assignment3/tree/master/dtw}{link}.
\end{flushleft}
\break
% HMM
\begin{center}
\section{HMM - HANDWRITTEN data}
\end{center}
\subsection{Introduction}
\begin{flushleft}
HMM is a statistical model where we use supervised learning to create the model.
The model contains hidden states. We attach state transition probability to each state.
The states can emit symbols with certain probabilities. \break
The state transition probabilities and
symbol emission probabilities are estimated using the Baum-Welch algorithm.
A particular sequence is classified by running the Viterbi algorithm on the constructed HMM.
\end{flushleft}
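As a concrete illustration of the decoding step, the following is a minimal Viterbi sketch for a discrete HMM (hypothetical Python/NumPy; HTK performs the equivalent computation on continuous-density models):
\begin{lstlisting}[language=Python]
import numpy as np

def viterbi(obs, prior, A, B):
    """Most likely state path for a discrete observation sequence (log domain)."""
    S, T = len(prior), len(obs)
    logA, logB = np.log(A), np.log(B)
    delta = np.log(prior) + logB[:, obs[0]]  # best log-score ending in each state
    psi = np.zeros((T, S), dtype=int)        # back-pointers
    for t in range(1, T):
        scores = delta[:, None] + logA       # scores[i, j]: from state i to j
        psi[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + logB[:, obs[t]]
    path = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(psi[t, path[-1]]))
    return path[::-1], float(delta.max())
\end{lstlisting}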
\subsection{Goal}
\begin{flushleft}
Identify character symbols using an HMM (Hidden Markov Model) with the HTK toolkit.
\end{flushleft}
\subsection{Data}
\begin{flushleft}
We were given .ldf files corresponding to different character symbols.
The .ldf files were created by extracting features from pictures of character symbols.
Each file contains x, y coordinates for different points on the symbol picture.
The train and test data were not already separated.
\end{flushleft}
\subsection{Observation and Preprocessing of Data}
\begin{flushleft}
All the given .ldf files contained data for multiple pictures of the same character symbol.
We created multiple .mfcc files from the x, y coordinates, one per character symbol picture.
Then we separated \textbf{\textit{25\%}} of the .mfcc files for each character symbol as test data.
The remaining \textbf{\textit{75\%}} of the .mfcc files were used for training.
We used the HTK toolkit for the experiment, which required the following steps:\break
\textbf{1}. To use the HTK toolkit we need to convert the .mfcc files to .htk format. For this we used a MATLAB script.\break
\textbf{2}. To use the MATLAB script we need to truncate the first line of the provided .mfcc files.\break
\textbf{3}. To use the MATLAB script we also need a filelist file with all .mfcc file paths.\break
\textbf{Note}: When .mfcc files are not given, they can be generated from the .wav files using MATLAB.
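Steps 2 and 3 are plain file manipulation. A minimal sketch (hypothetical Python; the actual conversion used a MATLAB script, and the directory layout here is an assumption):
\begin{lstlisting}[language=Python]
import glob

# Step 2: drop the header line of every provided .mfcc file.
# Step 3: collect all .mfcc paths into a filelist for the conversion script.
paths = sorted(glob.glob("data/**/*.mfcc", recursive=True))
for p in paths:
    with open(p) as f:
        lines = f.readlines()
    with open(p, "w") as f:
        f.writelines(lines[1:])   # first line held counts, not features
with open("filelist.txt", "w") as f:
    f.write("\n".join(paths))
\end{lstlisting}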
\end{flushleft}
\subsection{Experiment}
\begin{flushleft}
We used the HTK toolkit for the entire experiment.
The steps taken are as follows. \break
1. Create the grammar file \break
\begin{lstlisting}[language=bash]
$ HParse ... (i/p: grammar, o/p: wordnet)
\end{lstlisting}
2. Create {wlist} file \break
3. Create {lexicon} file \break
\begin{lstlisting}[language=bash]
$ HDMan ... (i/p: wlist, lexicon, model0, o/p: dict, dlog)
\end{lstlisting}
4. Create the {symbol.mlf} file, which contains the labels for a specific symbol. \break
5. Create the {symbol.scp} file. This is the training scp file; it contains the .mfcc paths for a specific symbol. \break
6. Create {proto} file \break
\begin{lstlisting}[language=bash]
$ HInit ... (i/p: all.scp, proto, o/p: newproto)
\end{lstlisting}
\begin{lstlisting}[language=bash]
$ HRest ... (i/p: macros, hmmdefs, o/p: newproto)
\end{lstlisting}
Repeat \$ HRest 4 or more times for each character symbol, each time using the previously generated proto.
\begin{lstlisting}[language=bash]
$ HERest ... (i/p: macros, hmmdefs, o/p: newproto)
\end{lstlisting}
(repeat 4 or more times using previously generated proto)
\break
\begin{lstlisting}[language=bash]
$ HHEd ... (i/p: macros, hmmdefs, o/p: newHmmdefs.mlf)
\end{lstlisting}
\begin{lstlisting}[language=bash]
$ HResults ... (i/p: allTest.mlf, o/p: result.mlf, confusion matrix)
\end{lstlisting}
In the above experiment we created a separate hmmdef for each character symbol.
After applying HRest we copied each finally generated proto file into the hmmdefs file.
We created test.scp by keeping all test (.htk) file paths (for all character symbols).
We also created test.mlf by keeping the labels for all test .mfcc files (for all character symbols).
In the above experiment we tried \break
5 states and 1 mixture for each symbol, with efficiency \textbf{\textit{50\%}}. \break
8 states and 1 mixture for each symbol, with efficiency \textbf{\textit{60\%}}. \break
12 states and 5 mixtures for each symbol, with efficiency \textbf{\textit{85\%}}. \break
\break
\end{flushleft}
\subsection{Inferences}
\begin{flushleft}
1. It was observed that the efficiency increases as the number of states and mixtures per symbol increases.
\end{flushleft}
\subsection{Links}
\begin{flushleft}
All the code is available at the following
\color{red}
\href{https://github.com/holasm/pr_assignment3/tree/master/hmm/handwritten}{link}.
\end{flushleft}
\break
\begin{center}
\section{HMM - TIDIGIT data}
\end{center}
\subsection{Introduction}
\begin{flushleft}
HMM is a statistical model where we use supervised learning to create the model.
The model contains hidden states. We attach state transition probability to each state.
The states can emit symbols with certain probabilities. \break
The state transition probabilities and
symbol emission probabilities are estimated using the Baum-Welch algorithm.
A particular sequence is classified by running the Viterbi algorithm on the constructed HMM.
\end{flushleft}
\subsection{Goal}
\begin{flushleft}
Identify digit sequences from speech using an HMM (Hidden Markov Model) with the HTK toolkit.
\end{flushleft}
\subsection{Data}
\begin{flushleft}
We were given .mfcc files corresponding to different digit sequences uttered by men and women.
The .mfcc files contain the features extracted from the .wav files.
The train and test data were already separated.
\end{flushleft}
\subsection{Observation and Preprocessing of Data}
\begin{flushleft}
Each given .mfcc file contained around 400 feature vectors with 38 components each.
The first line of each .mfcc file indicated the number of feature vectors and the number of components.
We used the HTK toolkit for the experiment, which required the following steps: \break
1. To use the HTK toolkit we need to convert the .mfcc files to .htk format. For this we used a MATLAB script. \break
2. To use the MATLAB script we need to truncate the first line of the provided .mfcc files. \break
3. To use the MATLAB script we also need a filelist file with all .mfcc file paths.
\end{flushleft}
\subsection{Experiment}
\begin{flushleft}
We used the HTK toolkit for the entire experiment.
The steps taken are as follows. \break
1. Create the grammar file \break
\begin{lstlisting}[language=bash]
$ HParse ... (i/p: grammar, o/p: wordnet)
\end{lstlisting}
2. Create {wlist} file \break
3. Create {lexicon} file \break
\begin{lstlisting}[language=bash]
$ HDMan ... (i/p: wlist, lexicon, model0, o/p: dict, dlog)
\end{lstlisting}
4. Create the {all.mlf} file, which contains all labels for the man and woman .mfcc files. \break
5. Create the {all.scp} file. This is the training scp file; it contains the paths of all man and woman .mfcc files. \break
6. Create the {proto} file \break
\begin{lstlisting}[language=bash]
$ HCompV ... (i/p: all.scp, proto, o/p: newproto)
\end{lstlisting}
7. Create the {hmmdefs} file containing the HMMs for all the digits. hmmdefs was created by renaming the \textasciitilde h option inside the newproto
(created using \$ HCompV) to the digit name, and making one such copy per digit in a single hmmdefs. \break
8. Create the {macros} file using the {vFloors} file. \break
\begin{lstlisting}[language=bash]
$ HERest ... (i/p: macros, hmmdefs, o/p: newproto)
\end{lstlisting}
(repeat 4 or more times using previously generated proto)
\break
\begin{lstlisting}[language=bash]
$ HHEd ... (i/p: macros, hmmdefs, o/p: newHmmdefs.mlf)
\end{lstlisting}
\begin{lstlisting}[language=bash]
$ HResults ... (i/p: allTest.mlf, o/p: result.mlf, confusion matrix)
\end{lstlisting}
In the above experiment we have tried using \break
5 states and 1 mixture for each digit with efficiency \textbf{\textit{26\%}}. \break
8 states and 1 mixture for each digit with efficiency \textbf{\textit{50\%}}. \break
12 states and 1 mixture for each digit with efficiency \textbf{\textit{70\%}}. \break
20 states and 1 mixture for each digit with efficiency \textbf{\textit{71\%}}. \break
12 states and 5 mixture for each digit with efficiency \textbf{\textit{90\%}}. \break
\end{flushleft}
\subsection{Inferences}
\begin{flushleft}
1. It was observed that the efficiency increases as the number of states and mixtures per digit increases.
\end{flushleft}
\subsection{Links}
\begin{flushleft}
All the code is available at the following
\color{red}
\href{https://github.com/holasm/pr_assignment3/tree/master/hmm/tidigits/codes}{link}.
\end{flushleft}
\break
% Parzen window
\begin{center}
\section{Parzen Window - Hypersphere as a Region}
\end{center}
\subsection{Introduction}
\begin{flushleft}
The Parzen window is a non-parametric way of classifying data. In this method we take a volume in
space around the test feature vector. With a hypersphere as the region, we take a hypersphere volume
and count the number of points from each class inside the sphere.
The class with the maximum number of points inside is the resulting class of the input feature vector.
\end{flushleft}
\subsection{Goal}
\begin{flushleft}
Identify images using a Parzen window (taking a hypersphere as the region).
\end{flushleft}
\subsection{Data}
\begin{flushleft}
The given files contained features extracted from different images, each as a 36x23 matrix.
\end{flushleft}
\subsection{Observation and Preprocessing of Data}
\begin{flushleft}
We created one vector of dimension 1x828 for each image by flattening its 36x23 feature matrix.
\end{flushleft}
\subsection{Experiment}
\begin{flushleft}
We tried different hypersphere radii h, yielding varying efficiencies.\break
For h = 25 the efficiency was \textbf{\textit{42\%}}\break
For h = 20 the efficiency was \textbf{\textit{48\%}}\break
For h = 15 the efficiency was \textbf{\textit{55\%}}\break
For h = 10 the efficiency was \textbf{\textit{50\%}}\break
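A minimal sketch of the hypersphere count-and-vote described above (hypothetical Python/NumPy; train\_X and train\_y are assumed arrays of training vectors and their labels, and h is the sphere radius):
\begin{lstlisting}[language=Python]
import numpy as np

def parzen_sphere_classify(x, train_X, train_y, h):
    """Vote among training points inside the radius-h sphere around x."""
    dists = np.linalg.norm(train_X - x, axis=1)
    inside = train_y[dists <= h]        # labels of points inside the sphere
    if inside.size == 0:
        return None                     # sphere too small: no vote possible
    classes, counts = np.unique(inside, return_counts=True)
    return classes[counts.argmax()]     # class with the most points wins
\end{lstlisting}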
\end{flushleft}
\subsection{Observation}
\begin{flushleft}
1. The efficiency depends on the radius of the hypersphere.
There is an optimum radius of the hypersphere for which the efficiency is maximum.
\end{flushleft}
\break
\begin{center}
\section{Parzen Window - Using a Gaussian Kernel}
\end{center}
\subsection{Introduction}
\begin{flushleft}
The Parzen window is a non-parametric way of classifying data. With a Gaussian kernel we compute a kernel-weighted probability from
all training points, and the test point is assigned to the class giving the highest probability.
\end{flushleft}
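A minimal sketch of a Gaussian-kernel Parzen classifier (hypothetical Python/NumPy; here each class's density at the test point is taken as the average kernel value over that class's training points, which is the usual formulation and an assumption beyond the text; h is the kernel width):
\begin{lstlisting}[language=Python]
import numpy as np

def parzen_gaussian_classify(x, train_X, train_y, h):
    """Assign x to the class with the highest Gaussian-kernel density."""
    d2 = np.sum((train_X - x) ** 2, axis=1)   # squared distances to all points
    k = np.exp(-d2 / (2.0 * h * h))           # Gaussian kernel values
    classes = np.unique(train_y)
    dens = [k[train_y == c].mean() for c in classes]
    return classes[int(np.argmax(dens))]
\end{lstlisting}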
\subsection{Goal}
\begin{flushleft}
Identify images using a Parzen window (with a Gaussian kernel).
\end{flushleft}
\subsection{Data}
\begin{flushleft}
The given files contained features extracted from different images, each as a 36x23 matrix.
\end{flushleft}
\subsection{Observation and Preprocessing of Data}
\begin{flushleft}
We created one vector of dimension 1x828 for each image by flattening its 36x23 feature matrix.
\end{flushleft}
\subsection{Experiment}
\begin{flushleft}
The efficiency was around \textbf{\textit{60\%}}.
\begin{figure}[!htb]
\begin{center}
\minipage{0.8\textwidth}
\includegraphics[width=\linewidth]{kernel.JPG}
\caption{Confusion Matrix}\label{fig:parzen_confusion}
\endminipage\hfill
\end{center}
\end{figure}
\end{flushleft}
\subsection{Observation}
\begin{flushleft}
1. The efficiency depends on the width of the Gaussian kernel.
There is an optimum kernel width for which the efficiency is maximum.
\end{flushleft}
\break
% KNN
\begin{center}
\section{KNN (K Nearest Neighbours)}
\end{center}
\subsection{Introduction}
\begin{flushleft}
The k-nearest-neighbours method is the simplest non-parametric method of all. In this method we classify
a point based on its k nearest training points.
\end{flushleft}
\subsection{Goal}
\begin{flushleft}
Identify images using KNN (K Nearest Neighbours).
\end{flushleft}
\subsection{Data}
\begin{flushleft}
The given files contained features extracted from different images, each as a 36x23 matrix.
\end{flushleft}
\subsection{Observation and Preprocessing of Data}
\begin{flushleft}
We created one vector of dimension 1x828 for each image by flattening its 36x23 feature matrix.
\end{flushleft}
\subsection{Experiment}
\begin{flushleft}
First, for one input feature vector, the distances to all training feature vectors are calculated.
Then a value of k is chosen and the k points with minimum distance are taken. We know the classes of these points, so the class containing the maximum number of them becomes the class of the input feature vector, as sketched below. \break
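A minimal sketch of this procedure (hypothetical Python/NumPy; train\_X and train\_y are assumed arrays of training vectors and labels):
\begin{lstlisting}[language=Python]
import numpy as np

def knn_classify(x, train_X, train_y, k=5):
    """Majority vote among the k nearest training vectors to x."""
    dists = np.linalg.norm(train_X - x, axis=1)  # distances to all training points
    nearest = train_y[np.argsort(dists)[:k]]     # labels of the k closest
    classes, counts = np.unique(nearest, return_counts=True)
    return classes[counts.argmax()]
\end{lstlisting}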
For k = 5 the efficiency was around \textbf{\textit{60\%}}.
\begin{figure}[!htb]
\begin{center}
\minipage{0.9\textwidth}
\includegraphics[width=\linewidth]{3.png}
\caption{ROC curve}\label{fig:knn_roc}
\endminipage\hfill
\end{center}
\end{figure}
\end{flushleft}
\subsection{Observation}
\begin{flushleft}
1. The efficiency depends on the value of k. With too high or too low a value
of k the efficiency decreases.
\end{flushleft}
\break
% LDA
\begin{center}
\section{FDA (Fisher Discriminant Analysis)}
\end{center}
\subsection{Introduction}
\begin{flushleft}
Fisher's linear discriminant analysis is used to reduce an n-dimensional point to k dimensions, where $k \leq n$. FLDA projects onto the directions that maximize the between-class scatter and minimize the within-class scatter.
\end{flushleft}
\subsection{Goal}
\begin{flushleft}
Identify images using FDA (Fisher Discriminant Analysis).
\end{flushleft}
\subsection{Data}
\begin{flushleft}
The given files contained features extracted from different images, each as a 36x23 matrix.
\end{flushleft}
\subsection{Experiment}
\begin{flushleft}
At first we reduced the dimension of every input feature vector from 23 to 7 (no. of classes - 1) and then applied a K-NN classifier, as sketched below.
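A minimal sketch of the multi-class Fisher projection (hypothetical Python/NumPy; the top $C-1$ eigenvectors of $S_w^{-1} S_b$ give the projection, matching the reduction to 7 dimensions for 8 classes; the regularization term is an assumption):
\begin{lstlisting}[language=Python]
import numpy as np

def fisher_projection(X, y, reg=1e-6):
    """Project rows of X onto the top (C-1) Fisher discriminant directions."""
    classes = np.unique(y)
    d = X.shape[1]
    mean_all = X.mean(axis=0)
    Sw = np.zeros((d, d))                  # within-class scatter
    Sb = np.zeros((d, d))                  # between-class scatter
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - mean_all)[:, None]
        Sb += Xc.shape[0] * (diff @ diff.T)
    # Eigenvectors of Sw^{-1} Sb; only C-1 eigenvalues are non-zero.
    evals, evecs = np.linalg.eig(np.linalg.solve(Sw + reg * np.eye(d), Sb))
    order = np.argsort(evals.real)[::-1][:len(classes) - 1]
    W = evecs[:, order].real               # projection matrix, d x (C-1)
    return X @ W, W
\end{lstlisting}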
\begin{figure}[!htb]
\begin{center}
\minipage{0.9\textwidth}
\includegraphics[width=\linewidth]{6.jpg}
\caption{ROC curve}\label{fig:fda_roc}
\endminipage\hfill
\end{center}
\end{figure}
\break
\begin{figure}[!htb]
\begin{center}
\minipage{0.9\textwidth}
\includegraphics[width=\linewidth]{7.JPG}
\caption{Confusion matrix}\label{fig:fda_confusion}
\endminipage\hfill
\end{center}
\end{figure}
\end{flushleft}
\subsection{Observation}
\begin{flushleft}
1. The efficiency depends on the value of k.
\end{flushleft}
\break
% SVM
\begin{center}
\section{SVM (Support Vector Machine)}
\end{center}
\subsection{Introduction}
\begin{flushleft}
SVM (Support Vector Machine) is mainly used for linear classification. It is a supervised learning model.
In SVM the training data is marked as belonging to one of two categories separated by a line (a hyperplane in higher dimensions); new
test data is classified by observing which side of it the data falls on. \break
This idea can be extended to classification among multiple classes.
\end{flushleft}
\subsection{Goal}
\begin{flushleft}
Identify images using an SVM (Support Vector Machine). We had to use the libsvm library.
\end{flushleft}
\subsection{Data}
\begin{flushleft}
The given files contained features extracted from different images, each as a 36x23 matrix.
\end{flushleft}
\subsection{Observation and Preprocessing of Data}
\begin{flushleft}
We have taken 25\% of the given data from each class and used it as test data.
The remaining 75\% was used as training data.
\end{flushleft}
\subsection{Experiment}
\begin{flushleft}
We used two library functions provided by the libsvm toolkit.\break
1. svmtrain()\break
2. svmpredict()\break
For the svmtrain() function the first argument was a matrix of feature vectors created
by vertically concatenating all training image feature matrices.
The second argument was a vector indicating the class label for each row of the first-argument matrix.
The output of svmtrain() is the train\_model. \break
For the svmpredict() function the first argument was a matrix of feature vectors created
by vertically concatenating all test image feature matrices.
The second argument was a vector indicating the class label for each row of the first-argument matrix.
The third argument was the train\_model we got as output from svmtrain().
The accuracy was around \textbf{\textit{55\%}}.
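For reference, the two calls look roughly as follows (a minimal sketch in hypothetical Python, assuming the libsvm Python bindings are installed and importable as libsvm.svmutil; the report's runs used the MATLAB interface, and the arrays here are placeholders):
\begin{lstlisting}[language=Python]
import numpy as np
from libsvm.svmutil import svm_train, svm_predict

# Placeholder data: rows are 828-dimensional image vectors, labels are 1..8.
X_train = np.random.rand(80, 828).tolist()
y_train = (np.arange(80) % 8 + 1).tolist()
X_test, y_test = X_train[:16], y_train[:16]

model = svm_train(y_train, X_train, "-q")               # train (default RBF kernel)
labels, acc, vals = svm_predict(y_test, X_test, model)  # predict, report accuracy
\end{lstlisting}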
\end{flushleft}
\break
\end{document}