What is a Confusion Matrix in Machine Learning?
/ March 30, 2018

What is a Confusion Matrix? In machine learning, and specifically in statistical classification, a confusion matrix is also known as an error matrix. A confusion matrix represents information about actual and classified cases produced by a classification system. The performance of such a system is commonly evaluated by showing which patterns it classifies correctly and incorrectly. A confusion matrix is a table that is often used to describe the performance of a classification model (or “classifier”) on a set of test data for which the true values are known. It allows the performance of an algorithm to be visualized. A confusion matrix (Kohavi and Provost, 1998) contains information about actual and predicted classifications done by a classification system. Performance of such systems is commonly evaluated using the data in the matrix. The following table shows the confusion matrix for a two-class classifier. It allows easy identification of confusion between classes, e.g. when one class is commonly mislabeled as the other. Most performance measures are computed from the confusion matrix. The entries in the confusion matrix have the following meaning in the context of our study: TN is the number of correct predictions that an instance is negative, FP is the number of incorrect predictions…
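
As a rough illustration of the entries described above, the following Python sketch counts the four cells of a two-class confusion matrix. The function name confusion_counts, the choice of 1 as the positive label, and the sample labels are assumptions made for this example and do not come from the article.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count (TN, FP, FN, TP) for a two-class problem.

    `positive` is the label treated as the positive class (assumed to be 1 here).
    """
    tn = fp = fn = tp = 0
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            tp += 1   # correctly predicted positive
        elif a == positive:
            fn += 1   # positive instance predicted as negative
        elif p == positive:
            fp += 1   # negative instance predicted as positive
        else:
            tn += 1   # correctly predicted negative
    return tn, fp, fn, tp


# Made-up labels, purely for illustration.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]

tn, fp, fn, tp = confusion_counts(actual, predicted)
accuracy = (tp + tn) / (tp + tn + fp + fn)
print(tn, fp, fn, tp, accuracy)
```

Accuracy is only one of the measures that can be derived from these four counts; the same counts could feed precision or recall in the same way.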

How to Implement ID3 Algorithm using Python
/ March 29, 2018

Introduction The ID3 decision tree algorithm is the first of a series of algorithms created by Ross Quinlan to generate decision trees. The decision tree is one of the most powerful and popular algorithms. Decision-tree algorithms fall under the category of supervised learning algorithms. They work for both continuous and categorical output variables. ID3 is a classification algorithm which, for a given set of attributes and class labels, generates the model/decision tree that categorizes a given input to a specific class label $$C_k$$ from $$[C_1, C_2, C_3, \ldots, C_k]$$. The algorithm follows a greedy approach by selecting the best attribute, the one that yields maximum information gain $$(IG)$$ or minimum entropy $$(H)$$. The algorithm then splits the data set $$(S)$$ recursively upon other unused attributes until it reaches the stop criteria (no further attributes to split). The non-terminal nodes in the decision tree represent the selected attribute upon which the split occurs and the terminal nodes represent the class labels. ID3 Characteristics ID3 does not guarantee an optimal solution; it can get stuck in local optima. It uses a greedy approach by selecting the best attribute to split the dataset on each iteration (one improvement that can be made on the algorithm can be to use backtracking during…
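
The greedy selection step described above can be sketched in Python as follows. This is a minimal illustration that assumes categorical attributes stored as dictionaries; the helper names (entropy, information_gain, best_attribute) and the toy data are hypothetical and are not taken from the article's implementation.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy H(S) of a list of class labels, in bits."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """IG(S, A) = H(S) minus the size-weighted entropy of the subsets split on A."""
    total = len(labels)
    subsets = {}
    for row, label in zip(rows, labels):
        subsets.setdefault(row[attribute], []).append(label)
    remainder = sum(len(s) / total * entropy(s) for s in subsets.values())
    return entropy(labels) - remainder


def best_attribute(rows, labels, attributes):
    """Greedy step: pick the attribute with the highest information gain."""
    return max(attributes, key=lambda a: information_gain(rows, labels, a))


# Toy data set, invented for this example.
rows = [
    {"outlook": "sunny", "windy": False},
    {"outlook": "sunny", "windy": True},
    {"outlook": "rainy", "windy": False},
    {"outlook": "rainy", "windy": True},
]
labels = ["no", "no", "yes", "yes"]

print(best_attribute(rows, labels, ["outlook", "windy"]))
```

A full ID3 implementation would call best_attribute recursively on each subset, removing the chosen attribute each time, until no attributes remain or a subset is pure.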

How to Implement ID3 Decision Tree Algorithm using JAVA
/ March 28, 2018

The development of information technology has generated a large number of databases and huge volumes of data in various areas. The research in databases and information technology has given rise to an approach to store and manipulate this precious data for further decision making. A decision tree is a powerful and popular tool for classification and prediction. Decision trees represent rules. A decision tree is a predictive model that, as its name implies, can be viewed as a tree. Specifically, each branch of the tree is a classification question and the leaves of the tree are partitions of the dataset with their classification. A decision tree is a classifier in the form of a tree structure, where each node is either: a leaf node, which indicates the value of the target attribute (class) of examples, or a decision node, which specifies some test to be carried out on a single attribute value, with one branch and sub-tree for each possible outcome of the test. The ID3 algorithm is primarily used for decision making. The ID3 (Iterative Dichotomiser 3) algorithm, invented by Ross Quinlan, is used to generate a decision tree from a dataset. There are different implementations given for decision trees. Major ones are ID3: Iterative Dichotomiser was the very first implementation…

Social Media Analytics and How Social Media Analytics Works
/ March 19, 2018

In the decade since social networking was born, we have seen the power of platforms that unite humanity. Across our professional and personal lives, social platforms have truly changed the world. Social media has been the tool to ignite revolutions and elections, deliver real-time news, connect people and interests, and of course, drive commerce. Social media plays a significant role in today's networked society. It has affected the online interaction between users, who share a lot of personal details and information online. The dynamic nature of social media data is a significant challenge for continuously and speedily evolving social media sites. Social media is growing rapidly and it offers something for everyone. Overview of Social Media Analytics Social Media Analytics is an on-demand offering that integrates, archives, analyzes and reports on the effects of online conversations occurring across professional, consumer-generated and social network media sites. As a result of the intelligence gleaned from this process, organizations can understand the effects online conversations are having on specific aspects of their business operations. Social media analytics (SMA) refers to the approach of collecting data from social media sites and blogs and evaluating that data to make business decisions. This process goes beyond the…

What is Blockchain Technology And How It Is Useful For Us
/ March 18, 2018

Blockchain is being termed the fifth disruptive innovation in computing. Blockchain technology, or distributed, secure ledger technology, has gained much attention in recent years. This article presents blockchain technology literature and its applications. A very significant advantage of blockchain technology is that it solves two of the most dreaded problems of currency-based transactions, which have long required a third party to validate the transactions. Blockchain Overview Blockchain technology is a sophisticated, interesting, and emerging technology. It provides a reliable way of confirming the party submitting a record to the blockchain, the time and date of its submission, and the contents of the record at the time of submission, eliminating the need for third-party intermediaries in certain situations. However, it is important to consider that blockchain technology does not verify or address the reliability or the accuracy of the contents, and additionally blockchain technology provides no storage for records, but instead the hashes thereof. A blockchain is an electronic ledger of digital records, events, or transactions that are cryptographically hashed, authenticated, and maintained through a “distributed” or “shared” network of participants using a group consensus protocol. Much like a checkbook is a ledger…
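
To make the idea of a ledger of hashed records more concrete, here is a minimal single-process Python sketch of a hash chain. It assumes SHA-256 and a simple dictionary layout, and it omits the distributed network and the consensus protocol entirely; the function names and block fields are hypothetical and do not describe any particular blockchain.

```python
import hashlib
import json

GENESIS_PREV_HASH = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index, timestamp, record_hash, prev_hash):
    """Hash the block's fields together so any change alters the result."""
    payload = json.dumps([index, timestamp, record_hash, prev_hash]).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_block(chain, record, timestamp):
    """Store only the hash of the record, linked to the previous block's hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_PREV_HASH
    record_hash = hashlib.sha256(record.encode("utf-8")).hexdigest()
    block = {
        "index": len(chain),
        "timestamp": timestamp,
        "record_hash": record_hash,
        "prev_hash": prev_hash,
    }
    block["hash"] = block_hash(block["index"], timestamp, record_hash, prev_hash)
    chain.append(block)
    return block


def is_valid(chain):
    """Recompute every hash and check each link to the previous block."""
    prev_hash = GENESIS_PREV_HASH
    for i, block in enumerate(chain):
        if block["index"] != i or block["prev_hash"] != prev_hash:
            return False
        expected = block_hash(block["index"], block["timestamp"],
                              block["record_hash"], block["prev_hash"])
        if block["hash"] != expected:
            return False
        prev_hash = block["hash"]
    return True


# Invented records and timestamps, for illustration only.
chain = []
append_block(chain, "Alice pays Bob 5", "2018-03-18T10:00:00")
append_block(chain, "Bob pays Carol 2", "2018-03-18T10:05:00")
print(is_valid(chain))
```

Because each block embeds the hash of the one before it, altering an earlier entry changes its hash and breaks every later link, which is what makes tampering detectable. The sketch checks integrity only; as noted above, it says nothing about whether the recorded contents are accurate.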

What is Visual Analytics
/ February 9, 2018

We are living in a world which faces a rapidly increasing amount of data to be dealt with on a daily basis. In the last decade, the steady improvement of data storage devices and of the means to create and collect data has influenced our way of dealing with information: most of the time, data is stored without filtering and refinement for later use. Virtually every branch of industry or business, and any political or personal activity, nowadays generates vast amounts of data. Making matters worse, the possibilities to collect and store data increase at a faster rate than our ability to use it for making decisions. However, in most applications, raw data has no value in itself; instead we want to extract the information contained in it. Overview Generally, large-scale organizations have large amounts of data and information to process. They need strong procedures and techniques to collect, analyze, process and visualize the data in order to get the required results and to make the right decisions to reach their long-term goals and objectives. Several software packages and tools for big data analytics and visual analytics are being used by companies in order to…

What is Data Streaming in Data Mining
/ February 5, 2018

In today’s information society, computer users are used to gathering and sharing data anytime and anywhere. This concerns applications such as social networks, banking, telecommunication, health care, research, and entertainment, among others. As a result, a huge amount of data related to all human activity is gathered for storage and processing purposes. These data sets may contain interesting and useful knowledge represented by hidden patterns, but due to the volume of the gathered data it is impossible to manually extract that knowledge. Data streaming requires some combination of bandwidth sufficiency and, for real-time human perception of the data, the ability to make sure that enough data is being continuously received without any noticeable time lag. What is it? Streaming data is data that is generated continuously by thousands of data sources, which typically send in the data records simultaneously and in small sizes (on the order of kilobytes). Streaming data includes a wide variety of data such as log files generated by customers using your mobile or web applications, e-commerce purchases, in-game player activity, information from social networks, financial trading floors, or geospatial services, and telemetry from connected devices or instrumentation in data centers. This data needs to be processed sequentially and incrementally…
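
One way to picture sequential, incremental processing is a statistic that is updated one record at a time without storing the stream. The Python sketch below keeps a running mean; the class name RunningMean and the sample values are assumptions chosen for illustration, not part of the article.

```python
class RunningMean:
    """Incrementally maintained mean of a stream, using constant memory."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, value):
        """Fold one new record into the mean without keeping earlier records."""
        self.count += 1
        self.mean += (value - self.mean) / self.count
        return self.mean


def sensor_readings():
    """Stand-in for a continuous data source (made-up values)."""
    yield from [10, 20, 30]


stats = RunningMean()
for reading in sensor_readings():
    stats.update(reading)
print(stats.count, stats.mean)
```

The same pattern extends to other statistics that can be updated from the previous state and the newest record alone, which is what keeps memory use independent of the stream's length.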

Multi Agent System in Artificial Intelligence
/ January 28, 2018

Multi-agent systems are made up of multiple interacting intelligent agents: computational entities that are to some degree autonomous and able to cooperate, compete, communicate, act flexibly, and exercise control over their behavior within the frame of their objectives. They are the enabling technology for a wide range of advanced applications relying on distributed and parallel processing of data, information, and knowledge relevant in domains ranging from industrial manufacturing to e-commerce to health care. What is a Multi-agent system? In artificial intelligence research, agent-based systems technology has been hailed as a new paradigm for conceptualizing, designing, and implementing software systems. Agents are sophisticated computer programs that act autonomously on behalf of their users, across open and distributed environments, to solve a growing number of complex problems. Increasingly, however, applications require multiple agents that can work together. A multi-agent system (MAS) is a loosely coupled network of software agents that interact to solve problems that are beyond the individual capacities or knowledge of each problem solver. A multi-agent system can be defined as follows: “A multi-agent system is a loosely coupled network of problem-solving entities (agents) that work together to find answers to problems that are beyond the individual capabilities or knowledge of each…

Introduction of Context Aware Computing
/ January 19, 2018

Context-aware computing promises a smooth interaction between humans and technology, but few studies have been conducted with regard to how autonomously an application should perform. Context-aware computing is a style of computing in which situational and environmental information about people, places and things is used to anticipate immediate needs and proactively offer enriched, situation-aware and usable content, functions and experiences. The notion of context is much more widely appreciated today. The term “context-aware computing” is commonly understood by those working in ubiquitous/pervasive computing, where it is felt that context is key in their efforts to disperse and enmesh computation into our lives. Overview of Context-aware computing Context is a powerful, and longstanding, concept in human-computer interaction. Interaction with computation is by explicit acts of communication (e.g., pointing to a menu item), and the context is implicit (e.g., default settings). Context can be used to interpret explicit acts, making communication much more efficient. Thus, by carefully embedding computing into the context of our lived activities, it can serve us with minimal effort on our part. Communication can be not only effortless, but also naturally fit in with our ongoing activities. A great deal of effort has gone into the field of…
