Statistical learning (SL) is the third mainstream approach in machine learning research. The main goal of statistical learning theory is to provide a framework for studying the problem of inference: gaining knowledge, making predictions, making decisions, or constructing models from a set of data. Statistical learning is an essential toolset for making sense of the vast and complex data sets that have emerged over the past twenty years in fields ranging from biology to finance to marketing to astrophysics.

Basic Overview of Statistical Learning

Statistical learning refers to a vast set of tools for modeling and understanding complex datasets. It is a recently developed area of statistics that blends with parallel developments in computer science and, in particular, machine learning. The field encompasses many methods, such as the lasso and sparse regression, classification and regression trees, and boosting and support vector machines. These tools can be classified as supervised or unsupervised. Broadly speaking, supervised SL involves building a statistical model for predicting, or estimating, an output based on one or more inputs. Problems of this nature occur in fields as…

What is a Confusion Matrix?

In machine learning, and specifically in statistical classification, a confusion matrix (also known as an error matrix) is a table that is often used to describe the performance of a classification model (or “classifier”) on a set of test data for which the true values are known. A confusion matrix (Kohavi and Provost, 1998) contains information about the actual and predicted classifications produced by a classification system, and the performance of such a system is commonly evaluated using the data in the matrix, that is, by showing which patterns were classified correctly and which incorrectly. It allows the performance of an algorithm to be visualized and makes confusion between classes easy to identify, e.g. one class being commonly mislabeled as the other. Most performance measures are computed from the confusion matrix. The following table shows the confusion matrix for a two-class classifier. The entries in the confusion matrix have the following meaning in the context of our study: TN is the number of correct predictions that an instance is negative, FP is the number of incorrect predictions…
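As a rough illustration of the two-class confusion matrix described above, the sketch below tallies the four cells from a list of true labels and a list of predicted labels. The function name `confusion_counts`, the label encoding (1 for positive, 0 for negative), and the sample data are all hypothetical, and FN and TP are given their usual textbook meanings, which is assumed to match the definitions the excerpt goes on to give.

```python
from collections import Counter


def confusion_counts(actual, predicted, positive=1):
    """Return (TN, FP, FN, TP) for a two-class problem.

    Hypothetical helper for illustration; any label not equal to
    `positive` is treated as the negative class.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    counts = Counter()
    for a, p in zip(actual, predicted):
        if a == positive:
            # Actual positive: predicted positive is TP, otherwise FN.
            counts["TP" if p == positive else "FN"] += 1
        else:
            # Actual negative: predicted positive is FP, otherwise TN.
            counts["FP" if p == positive else "TN"] += 1
    return counts["TN"], counts["FP"], counts["FN"], counts["TP"]


# Made-up labels, used only to exercise the function.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]

tn, fp, fn, tp = confusion_counts(actual, predicted)

# Accuracy is one of the performance measures computed from the matrix.
accuracy = (tp + tn) / (tn + fp + fn + tp)

# Rows are actual classes, columns are predicted classes.
print("                 pred. negative  pred. positive")
print(f"actual negative  {tn:>14}  {fp:>14}")
print(f"actual positive  {fn:>14}  {tp:>14}")
print(f"accuracy = {accuracy:.2f}")
```

Placing actual classes in rows and predicted classes in columns is one common layout; some texts transpose it, so it is worth checking the convention before comparing matrices from different sources.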

Classification is a supervised learning technique in data mining. It is applied when the data patterns or samples have predefined pattern labels or class labels. Supervised learning algorithms first build a data model from the existing patterns, which are known as training samples; building the data model is known as training the algorithm. After training, the data model is used to recognize newly arriving samples or patterns that resemble the training samples. Classification is an essential and popular technique in data mining because it is used to obtain precise outcomes.

Figure 1: Classification

There are two forms of data analysis that can be used to extract models describing important classes or to predict future data trends: classification and prediction. These forms of data analysis help provide a better understanding of large data. A classification model predicts categorical class labels, whereas a prediction model predicts continuous-valued functions. For example, we can build a classification model to categorize bank loan applications as either safe or risky, or a prediction model to predict the expenditures in dollars of potential customers on computer equipment given their income and occupation….
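To make the contrast between classification and prediction concrete, here is a minimal sketch of the two examples above. It assumes a one-nearest-neighbour rule for the loan classifier and an ordinary least-squares line for the expenditure predictor; both are stand-ins chosen for brevity, not methods named in the text. Income is the only input used, and every number and function name is hypothetical.

```python
def nearest_neighbour_label(train, query):
    """Classification: return the label of the closest training sample.

    `train` is a list of (income, label) pairs; the output is categorical.
    """
    closest = min(train, key=lambda sample: abs(sample[0] - query))
    return closest[1]


def fit_line(xs, ys):
    """Prediction: least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical training samples: (annual income in $1000s, class label).
loans = [(20, "risky"), (25, "risky"), (60, "safe"), (80, "safe")]

# Hypothetical training samples for the continuous target:
# income in $1000s and dollars spent on computer equipment.
incomes = [20, 40, 60, 80]
spending = [200, 300, 400, 500]

# Classification output is a class label.
label_high = nearest_neighbour_label(loans, 65)
label_low = nearest_neighbour_label(loans, 30)

# Prediction output is a continuous value.
slope, intercept = fit_line(incomes, spending)
predicted_spend = slope * 70 + intercept

print(label_high, label_low, predicted_spend)
```

The two helpers take the same kind of input but return different kinds of output: a label drawn from a fixed set of classes in the first case, and a real number in the second.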