Sentiment Analysis and Text Mining
/ September 28, 2017

Sentiment analysis over Twitter offers organizations a fast and effective way to monitor the public's feelings towards their brand, business, directors, etc. A wide range of features and methods for training sentiment classifiers on Twitter datasets has been researched in recent years, with varying results. Sentiment analysis overview: The emergence of social media has given web users a venue for expressing and sharing their thoughts and opinions on all kinds of topics and events. Twitter, with nearly 600 million users and over 250 million messages per day, has quickly become a gold mine for organizations to monitor their reputation and brands by extracting and analyzing the sentiment of the tweets posted by the public about them, their markets, and competitors. Sentiment analysis was first introduced by Liu, B. Also known as opinion mining and subjectivity analysis, it is the process of determining the attitude or polarity of opinions or reviews written by humans to rate products or services. Sentiment analysis can be applied to any textual form of opinion, such as blogs, reviews, and microblogs. Microblogs are small text messages such as tweets, short messages that cannot exceed 140 characters. These microblogs are easier than other…
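The polarity idea above can be sketched with a minimal lexicon-based scorer. This is only an illustration, not a trained classifier: the word lists are made-up placeholders, and `split()` tokenization is deliberately naive.

```python
# Minimal lexicon-based polarity scorer for short texts such as tweets.
# The word lists below are illustrative placeholders, not a real lexicon,
# and split() keeps punctuation attached -- fine for a sketch only.
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "poor", "hate", "terrible", "sad"}

def polarity(text):
    """Return 'positive', 'negative', or 'neutral' for a short message."""
    tokens = text.lower().split()
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(polarity("I love this great product"))   # positive
print(polarity("terrible service, very sad"))  # negative
```

Real Twitter classifiers replace the hand-made lexicon with features learned from labelled tweets, but the input/output contract is the same: text in, polarity label out.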

Image Inpainting Technique in Image Processing
/ September 27, 2017

Image inpainting was historically done manually by painters to remove defects from paintings and photographs. The basic task of an inpainting algorithm is to fill a region of missing information in a signal using the surrounding information and reconstruct the signal. Image inpainting is the art of recovering missing values or data in an image; its purpose is to reconstruct the missing regions so that the result looks plausible to the human eye, based on the background information. The modification of images in a way that is non-detectable for an observer who does not know the original image is a practice as old as artistic creation itself. The need to retouch images in an unobtrusive way extended naturally from paintings to photography and film. The purposes remain the same: to revert deterioration (e.g., cracks in photographs or scratches and dust spots in film), or to add or remove elements (e.g., removal of stamped dates and red-eye from photographs, the infamous "airbrushing" of political enemies). Inpainting is the art of restoring lost parts of an image and reconstructing them from the background information, and this has to be done in an undetectable way. The term inpainting…
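The "fill from surrounding information" idea can be shown with a toy diffusion-style fill: missing pixels (marked `None`) are repeatedly replaced by the mean of their known 4-neighbours. This is a sketch of the principle only, not a production algorithm such as exemplar-based or PDE inpainting.

```python
# Toy diffusion-style inpainting: repeatedly replace each missing pixel
# (marked None) with the mean of its known 4-neighbours. A sketch of the
# "fill from surrounding information" idea, not a production method.
def inpaint(img, iterations=50):
    h, w = len(img), len(img[0])
    img = [row[:] for row in img]
    for _ in range(iterations):
        new = [row[:] for row in img]
        for y in range(h):
            for x in range(w):
                if img[y][x] is None:
                    nb = [img[y + dy][x + dx]
                          for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1))
                          if 0 <= y + dy < h and 0 <= x + dx < w
                          and img[y + dy][x + dx] is not None]
                    if nb:
                        new[y][x] = sum(nb) / len(nb)
        img = new
    return img

# A 3x3 patch whose missing centre pixel is surrounded by the value 10:
patch = [[10, 10, 10], [10, None, 10], [10, 10, 10]]
print(inpaint(patch)[1][1])  # 10.0
```

Because the fill propagates inward from the boundary of the damaged region, larger holes need more iterations; real algorithms also propagate edges and texture, not just smooth averages.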

Classification and Regression Tree (CART) Algorithm in Data mining
/ September 25, 2017

CART Algorithm overview: A CART tree is a binary decision tree constructed by repeatedly splitting a node into two child nodes, beginning with the root node that contains the whole learning sample. Classification and regression trees are machine-learning methods for constructing prediction models from data. The models are obtained by recursively partitioning the data space and fitting a simple prediction model within each partition. Decision trees are commonly used in data mining with the objective of creating a model that predicts the value of a target (or dependent) variable based on the values of several input (or independent) variables. In today's post, we discuss the CART decision tree methodology. Classification trees: the target variable is categorical and the tree is used to identify the "class" into which the target variable would likely fall. Regression trees: the target variable is continuous and the tree is used to predict its value. The CART decision tree is a binary recursive partitioning procedure capable of processing continuous and nominal attributes as targets and predictors. Data are handled in their raw form; no binning is required or recommended. Beginning in the root node, the data are split into two children, and…
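One step of the binary recursive partitioning above can be sketched as follows: for a single numeric feature, try every threshold and keep the one that minimises the weighted Gini impurity of the two children (Gini is CART's usual classification criterion; the toy data are made up).

```python
# Sketch of one CART split: pick the threshold on a numeric feature that
# minimises the weighted Gini impurity of the two child nodes.
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(values, labels):
    """Return (threshold, weighted child impurity) for the best binary split."""
    best = (None, float("inf"))
    for t in sorted(set(values)):
        left = [l for v, l in zip(values, labels) if v <= t]
        right = [l for v, l in zip(values, labels) if v > t]
        if not left or not right:
            continue
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (t, score)
    return best

values = [1, 2, 3, 10, 11, 12]
labels = ["a", "a", "a", "b", "b", "b"]
print(best_split(values, labels))  # (3, 0.0) -- both children are pure
```

A full CART implementation applies this search to every feature at every node and recurses on the two children until a stopping rule fires.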

Genetic Algorithm
/ September 21, 2017

Genetic algorithms are a part of evolutionary computing, which is a rapidly growing area of artificial intelligence. Nature has always been a great source of inspiration to all mankind. Genetic Algorithms (GAs) are search-based algorithms built on the concepts of natural selection and genetics. GAs are a subset of a much larger branch of computation known as Evolutionary Computation. Genetic algorithms are inspired by Darwin's theory of evolution: simply put, a solution to a problem solved by a genetic algorithm is evolved. Genetic algorithms are a family of computational models inspired by evolution. These algorithms encode a potential solution to a specific problem on a simple chromosome-like data structure and apply recombination operators to these structures so as to preserve critical information. Genetic algorithms are often viewed as function optimizers, although the range of problems to which they have been applied is quite broad. Definition of Genetic algorithm: Genetic algorithms are heuristic search approaches that are applicable to a wide range of optimization problems. This flexibility makes them attractive for many optimization problems in practice. Evolution is the basis of genetic algorithms. The current variety and success of species is a good reason for believing in the power of evolution. Species are…
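The encode/recombine/select loop can be sketched on the classic OneMax toy problem (maximise the number of 1-bits in a chromosome). The operator choices here (truncation selection, one-point crossover, one bit-flip mutation per child) are illustrative assumptions, not the only possible GA design.

```python
import random

# Minimal genetic algorithm maximising the number of 1-bits ("OneMax").
# Truncation selection, one-point crossover, and one bit-flip mutation
# per child -- an illustrative operator set, not a definitive design.
random.seed(0)

def fitness(bits):
    return sum(bits)

def evolve(pop_size=20, length=16, generations=60):
    pop = [[random.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[:pop_size // 2]           # truncation selection
        children = []
        while len(children) < pop_size:
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, length)   # one-point crossover
            child = a[:cut] + b[cut:]
            i = random.randrange(length)        # bit-flip mutation
            child[i] ^= 1
            children.append(child)
        pop = children
    return max(pop, key=fitness)

best = evolve()
print(fitness(best))  # close to the maximum of 16
```

Any problem whose candidate solutions can be encoded as such strings, with a fitness function to rank them, can be plugged into the same loop.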

Decision Tree C4.5 (J48)
/ September 15, 2017

Data mining is a useful tool for discovering knowledge from large data sets. Classification methods aim to identify the classes that objects belong to from some descriptive traits. They find utility in a wide range of human activities, particularly in automated decision making. Decision trees are a very effective method of supervised learning. Their aim is to partition a dataset into groups that are as homogeneous as possible in terms of the variable to be predicted. A decision tree takes as input a set of classified data and outputs a tree resembling an orientation diagram, where each end node (leaf) is a decision (a class) and each non-final (internal) node represents a test. Each leaf represents the decision of belonging to a class for the data satisfying all tests on the path from the root to the leaf. The simpler the tree, the easier it is to use; in fact, it is more interesting to obtain a tree adapted to the probabilities of the variables to be tested, and a mostly balanced tree will be a good result. If a sub-tree can only lead to a unique solution, then the whole sub-tree can be reduced to that simple conclusion; this simplifies the process…
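The test chosen at each internal node of a C4.5 (J48) tree is the attribute with the highest information gain, i.e. the largest drop in entropy after splitting. A small hand-computed sketch, with a made-up "windy" attribute and class labels:

```python
import math

# Sketch of the information-gain criterion C4.5 (J48 in Weka) uses to pick
# the test at each internal node. Attribute and class names are made up.
def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    n = len(labels)
    return -sum((labels.count(c) / n) * math.log2(labels.count(c) / n)
                for c in set(labels))

def info_gain(rows, attr, labels):
    """Entropy reduction obtained by splitting rows on attribute attr."""
    base = entropy(labels)
    remainder = 0.0
    for v in set(r[attr] for r in rows):
        subset = [l for r, l in zip(rows, labels) if r[attr] == v]
        remainder += len(subset) / len(labels) * entropy(subset)
    return base - remainder

rows = [{"windy": "yes"}, {"windy": "yes"}, {"windy": "no"}, {"windy": "no"}]
labels = ["stay", "stay", "play", "play"]
print(info_gain(rows, "windy", labels))  # 1.0 -- the split is perfect
```

C4.5 actually uses the gain ratio (gain divided by the split's own entropy) to avoid favouring many-valued attributes, but information gain is the core quantity.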

NS2 Simulator: An Introduction
/ September 14, 2017

Simulation Overview: Simulation is widely used in system modelling for applications ranging from engineering research and business analysis to manufacturing planning and biological science experimentation. Networking study, implementation, testing, and evaluation are not feasible without network simulation. It is a technique in which a program models the behaviour of a network by calculating the interactions between the different network entities (hosts, packets, etc.) using mathematical modelling. Simulators are used to develop new networking architectures and protocols, or to modify existing protocols, in an efficient environment. A network simulator saves both time and cost when implementing and testing any wired or wireless network. With the growth of communication networks and ever-increasing networking speeds, efficient network simulators play an important role in research. A network simulator is a piece of software or hardware that predicts the behaviour of a network without an actual network being present. A simulation is, more or less, a combination of art and science: expertise in computer programming and applied mathematical tools accounts for the science part, while skill in analysis and conceptual model formulation represents the art. A simulation can be thought of as a flow process…
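NS2 itself is driven by OTcl scripts, but the discrete-event principle it runs on can be sketched in a few lines of Python: events sit in a priority queue and are processed strictly in timestamp order. The two-node packet/ACK scenario and its delays are invented for illustration.

```python
import heapq

# Toy discrete-event loop illustrating the principle behind simulators such
# as NS2: events are popped from a priority queue in timestamp order.
# The two-node scenario and its delays are made up for illustration.
def simulate(events):
    """Process (time, description) events in timestamp order; return a trace."""
    heapq.heapify(events)
    log = []
    while events:
        t, what = heapq.heappop(events)
        log.append(f"{t:.1f}s: {what}")
    return log

trace = simulate([(0.5, "node1 receives packet"),
                  (0.0, "node0 sends packet"),
                  (1.0, "node0 receives ACK")])
print(trace)
# ['0.0s: node0 sends packet', '0.5s: node1 receives packet', '1.0s: node0 receives ACK']
```

In a real simulator, handling one event typically schedules future events (a reception schedules an ACK, a timer schedules a retransmission), so the queue drives the whole protocol model forward in virtual time.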

What is a Bayesian Classifier and How Does It Work?
/ September 12, 2017

Bayesian classifier: introduction. A Bayesian classifier is based on the idea that the values of features can be predicted for members of learned classes. In a data set, patterns are grouped into classes because they share common features; such classes are often called natural kinds. The idea behind a Bayesian classifier is that if the class is known, the values of similar other patterns can be predicted; if the class is not known, Bayes' rule can be used to predict the class from the attribute values. In a Bayesian classifier, the learning model is a probabilistic model of the attributes that predicts the classification labels of new, similar patterns. A latent variable is a probabilistic variable that is not observed. A Bayesian classifier is a probabilistic model in which the classification is a latent variable probabilistically related to the observed variables; classification then becomes inference in the probabilistic model. Naive Bayes is a family of probabilistic algorithms that use probability theory and Bayes' theorem to predict the category of a sample (such as a piece of news or a customer review). Being probabilistic, they calculate the probability of each category for a given sample and output the category with the highest probability. The…
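Bayes' rule for a single observed attribute can be worked through numerically. The priors and likelihoods below are illustrative numbers only, for a hypothetical spam/ham example; a full Naive Bayes classifier would multiply one such likelihood per attribute.

```python
# Hand-computed Bayes' rule for a two-class problem: how a Bayesian
# classifier turns a prior and an attribute likelihood into a posterior.
# The priors and likelihoods below are illustrative numbers only.
prior = {"spam": 0.4, "ham": 0.6}
likelihood = {"spam": 0.7, "ham": 0.1}   # P(word observed | class)

evidence = sum(prior[c] * likelihood[c] for c in prior)          # P(word)
posterior = {c: prior[c] * likelihood[c] / evidence for c in prior}

print(posterior["spam"])                  # ~0.8235
print(max(posterior, key=posterior.get))  # spam
```

The classification step is just the final line: after inference, output the class with the highest posterior probability.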

Image Segmentation
/ September 11, 2017

Overview: The division of an image into meaningful structures, image segmentation, is often an essential step in image analysis, object representation, visualization, and many other image processing tasks. Segmentation partitions an image into distinct regions, each containing pixels with similar attributes. To be meaningful and useful for image analysis and interpretation, the regions should strongly relate to depicted objects or features of interest. Meaningful segmentation is the first step from low-level image processing, which transforms a greyscale or colour image into one or more other images, to high-level image description in terms of features, objects, and scenes. The success of image analysis depends on the reliability of segmentation, but accurately partitioning an image is generally a very challenging problem. Image segmentation is the division of an image into regions or categories which correspond to different objects or parts of objects; every pixel in the image is allocated to one of these categories. A good segmentation is typically one in which pixels in the same category have similar greyscale or multivariate values and form a connected region, while neighbouring pixels in different categories have dissimilar values. The goal of image segmentation is to cluster pixels into salient…
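The simplest rule meeting the "similar values within a category" criterion is global thresholding: every pixel is assigned to object or background by its grey level alone. A minimal sketch on a made-up 3x3 patch:

```python
# Sketch of the simplest segmentation rule: global thresholding, which
# labels each pixel "object" (1) or "background" (0) by grey level alone.
def threshold(img, t):
    return [[1 if px > t else 0 for px in row] for row in img]

# Made-up 3x3 patch: a bright region in the lower-right corner.
img = [
    [ 12,  15, 200],
    [ 10, 210, 220],
    [205, 215, 230],
]
print(threshold(img, 128))  # [[0, 0, 1], [0, 1, 1], [1, 1, 1]]
```

Thresholding ignores connectivity, which is why practical pipelines follow it with connected-component labelling, or use region-growing and clustering methods that enforce spatial coherence directly.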

An Overview of Particle swarm optimization (PSO)
/ September 8, 2017

Particle swarm optimization (PSO): introduction. Particle swarm optimization (PSO) simulates the behaviour of bird flocking. Consider the following scenario: a group of birds is randomly searching for food in an area. There is only one piece of food in the area being searched. None of the birds knows where the food is, but at each iteration they know how far away it is. So what is the best strategy for finding the food? An effective one is to follow the bird nearest to the food. PSO is a population-based stochastic optimization technique developed by Dr. Eberhart and Dr. Kennedy in 1995. Definition: The theory of particle swarm optimization has been growing rapidly, and PSO has been applied to many kinds of problems. The PSO algorithm emulates the behaviour of animal societies that have no leader in their group or swarm, such as bird flocks and fish schools. Typically, a flock of animals with no leader finds food at random by following the member of the group whose position is closest to a food source (a potential solution). The flock achieves its best condition simultaneously through communication among members who already have…
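The "follow the bird nearest to the food" rule becomes the standard velocity update: each particle is pulled towards its own best position and the swarm's best. A minimal sketch on the one-dimensional sphere function f(x) = x², using common textbook coefficients (inertia w = 0.7, c1 = c2 = 1.5) as assumptions:

```python
import random

# Minimal PSO minimising f(x) = x^2 in one dimension. The coefficients
# (inertia w, cognitive c1, social c2) are common textbook values, assumed
# here; this is a sketch, not a tuned implementation.
random.seed(1)

def pso(n=15, iters=100, w=0.7, c1=1.5, c2=1.5):
    pos = [random.uniform(-10, 10) for _ in range(n)]
    vel = [0.0] * n
    pbest = pos[:]                            # each particle's best position
    gbest = min(pos, key=lambda x: x * x)     # the swarm's best position
    for _ in range(iters):
        for i in range(n):
            r1, r2 = random.random(), random.random()
            vel[i] = (w * vel[i]
                      + c1 * r1 * (pbest[i] - pos[i])    # pull to own best
                      + c2 * r2 * (gbest - pos[i]))      # pull to swarm best
            pos[i] += vel[i]
            if pos[i] ** 2 < pbest[i] ** 2:
                pbest[i] = pos[i]
            if pos[i] ** 2 < gbest ** 2:
                gbest = pos[i]
    return gbest

best = pso()
print(abs(best))  # very close to the optimum at x = 0
```

Higher-dimensional PSO is the same update applied per coordinate; only the fitness function and the vector arithmetic change.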

GridSim Overview
/ September 1, 2017

GridSim is a Java-based discrete-event simulation package that supports modeling and simulation of a wide range of heterogeneous resources, such as single- or multi-processor, shared- and distributed-memory machines like PCs, workstations, shared-memory multiprocessors (SMPs), and clusters with different capabilities and configurations. It can be used for modeling and simulation of application scheduling on various classes of parallel and distributed computing systems such as clusters, Grids, and P2P networks. The toolkit provides concurrent entities for the creation of application tasks, the mapping of tasks to resources, and their management. In Grid computing applications, where resources are spread over dispersed locations and run by different organizations with differing policies, it is important that users are given a guarantee that their tasks will be completed, and completed within the specific guidelines they may wish to request. To achieve this, agreements need to be made between the resource providers and the users themselves, protecting and outlining the requirements, policies, and rights of both parties. Simulation has been used extensively as a way to evaluate and compare scheduling strategies, as simulation experiments are configurable, repeatable, and generally fast. GridSim is a well-known Java-based grid simulator with a clear focus on Grid…
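GridSim itself is a Java toolkit, but the kind of scheduling experiment it is used for can be sketched in a few lines: map each task to whichever resource would finish it earliest, then report the makespan. The greedy policy, task lengths, and resource speeds below are illustrative assumptions only.

```python
# Toy illustration of the scheduling experiments GridSim supports
# (GridSim itself is a Java toolkit; this sketch only mimics the idea).
# Greedy mapping: give each task to the resource that finishes it earliest.
def schedule(task_lengths, resource_speeds):
    finish = [0.0] * len(resource_speeds)   # when each resource becomes free
    mapping = []
    for length in task_lengths:
        times = [finish[i] + length / s for i, s in enumerate(resource_speeds)]
        best = times.index(min(times))
        finish[best] = times[best]
        mapping.append(best)
    return mapping, max(finish)             # assignment and makespan

# Three tasks on two resources (speeds in instructions per second, made up):
mapping, makespan = schedule([100, 200, 100], [10.0, 20.0])
print(mapping, makespan)  # [1, 1, 0] 15.0
```

Because such a model is configurable and repeatable, competing scheduling strategies can be compared simply by swapping the mapping policy and re-reading the makespan.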
