Introduction to Text Summarization

With the dramatic growth of the Internet, people are overwhelmed by the tremendous amount of online information and documents. This expanding availability of documents has demanded exhaustive research in the area of automatic text summarization. Every day, people rely on a wide variety of sources to stay informed, from news stories to social media posts to search results. Machine learning models that can automatically deliver accurate summaries of longer text are therefore useful for digesting such large amounts of information in compressed form.

What is Text Summarization?

Text summarization is the problem of creating a short, accurate, and fluent summary of a longer text document. Summarization can also serve as an interesting reading comprehension test for machines. To summarize well, machine learning models need to comprehend documents and distill the important information, tasks which are highly challenging for computers, especially as the length of a document increases. The World Wide Web has brought us a vast amount of online information. Because of this, every time someone searches for something on the Internet, the response is a long list of different Web pages containing a great deal of information, which is impossible for a person to…
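
As a concrete illustration of the task, the following is a minimal extractive summarization sketch in Python: it scores each sentence by the average document-wide frequency of its words and keeps the top-scoring sentences. The sentence splitting and scoring scheme are simplifying assumptions for illustration, not a description of any particular production system.

import re
from collections import Counter

def summarize(text, num_sentences=2):
    # Split into sentences on ., ! or ? (a crude heuristic for illustration).
    sentences = re.split(r'(?<=[.!?])\s+', text.strip())
    # Count word frequencies across the whole document, ignoring case.
    freq = Counter(re.findall(r'\w+', text.lower()))
    # Score each sentence by the average frequency of its words.
    def score(sentence):
        tokens = re.findall(r'\w+', sentence.lower())
        return sum(freq[t] for t in tokens) / max(len(tokens), 1)
    # Keep the highest-scoring sentences, preserving original order.
    ranked = set(sorted(sentences, key=score, reverse=True)[:num_sentences])
    return ' '.join(s for s in sentences if s in ranked)

print(summarize("Text summarization produces a short summary. "
                "A good summary keeps the important information. "
                "Long documents are hard to read in full."))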

What is Sequential Pattern Mining?

The rapid growth in the amount of stored digital data and recent developments in data mining techniques have led to an increased interest in methods for the exploration of data, creating a set of new data mining problems and solutions. Frequent Structure Mining is one of these problems. Its target is the discovery of hidden structured patterns in large databases. Sequences are the simplest form of structured patterns. In this article, sequential pattern mining is discussed.

Introduction to Sequential Pattern Mining

A sequential pattern is a set of itemsets, stored in a sequence database, that occurs sequentially in a specific order. A sequence database is a set of ordered elements or events, stored with or without a concrete notion of time. Each itemset contains a set of items that share the same transaction-time value. While association rules indicate intra-transaction relationships, sequential patterns represent correlations between transactions. Sequential pattern mining discovers which items a single customer buys, across various transactions, in a particular order. The resulting pattern found after mining is a sequence of itemsets that frequently occurs in a specific order. Sequential pattern mining is used in various areas for different purposes. It can be used…
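
As a small, self-contained sketch of the idea, the Python snippet below counts how often an ordered pair of items (a bought before b) occurs across customer sequences and keeps the pairs that meet a minimum support. The toy data and the restriction to length-two patterns are assumptions made purely for illustration.

from collections import Counter

# Each customer sequence is an ordered list of transactions (itemsets).
sequences = [
    [{'bread'}, {'milk', 'eggs'}, {'butter'}],
    [{'bread'}, {'butter'}],
    [{'milk'}, {'bread'}, {'butter'}],
]

def frequent_pairs(sequences, min_support=2):
    # Count ordered pairs (a, b) where a appears in an earlier transaction
    # than b within the same sequence; each pair counts once per sequence.
    support = Counter()
    for seq in sequences:
        seen = set()
        for i, earlier in enumerate(seq):
            for later in seq[i + 1:]:
                for a in earlier:
                    for b in later:
                        seen.add((a, b))
        support.update(seen)
    return {pair: count for pair, count in support.items() if count >= min_support}

print(frequent_pairs(sequences))   # e.g. {('bread', 'butter'): 3, ('milk', 'butter'): 2}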

Introduction to Natural Language Processing

Natural language processing (NLP) is the relationship between computers and human language. More specifically, natural language processing is the computer understanding, analysis, manipulation, and/or generation of natural language. Will a computer program ever be able to convert a piece of English text into a programmer-friendly data structure that describes the meaning of the natural language text? Unfortunately, no consensus has emerged about the form or the existence of such a data structure. Until such fundamental Artificial Intelligence problems are resolved, computer scientists must settle for the reduced objective of extracting simpler representations that describe limited aspects of the textual information.

Overview of Natural Language Processing

Natural language processing (NLP) can be defined as the automatic (or semi-automatic) processing of human language. The term ‘NLP’ is sometimes used rather more narrowly than that, often excluding information retrieval and sometimes even excluding machine translation. NLP is sometimes contrasted with ‘computational linguistics’, with NLP being thought of as more applied. Nowadays, alternative terms are often preferred, such as ‘Language Technology’ or ‘Language Engineering’. Language is often used in contrast with speech (e.g., Speech and Language Technology). But I’m going to simply refer to NLP and use the term broadly. NLP is essentially multidisciplinary: it is…
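
To make the idea of extracting simpler representations concrete, here is a minimal Python sketch that reduces a sentence to word tokens and a bag-of-words count. This is an illustrative assumption of one such limited representation, not a standard NLP pipeline.

import re
from collections import Counter

def tokenize(text):
    # Lowercase and keep only alphabetic word runs (a deliberately simple scheme).
    return re.findall(r"[a-z]+", text.lower())

def bag_of_words(text):
    # A 'simpler representation': word counts, discarding syntax and word order.
    return Counter(tokenize(text))

sentence = "Natural language processing helps computers process natural language."
print(tokenize(sentence))
print(bag_of_words(sentence))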

Outlier Detection in Data Mining

Outlier detection is a primary step in many data-mining applications. In many data analysis tasks a large number of variables are recorded or sampled, and one of the first steps towards obtaining a coherent analysis is the detection of outlying observations. Although outliers are often considered to be errors or noise, they may carry important information. Detected outliers are candidates for aberrant data that may otherwise lead to model misspecification, biased parameter estimation, and incorrect results. It is therefore important to identify them prior to modeling and analysis.

Outlier Detection Overview

Outlier detection is an algorithmic feature that allows you to detect when some members of a group are behaving strangely compared to the others. It is an important research problem in data mining that aims to find objects that are considerably dissimilar, exceptional, and inconsistent with respect to the majority of the data in an input database. Outliers are extreme values that deviate from the other observations in the data; they may indicate variability in a measurement, experimental errors, or a novelty. An outlier is an observation (or measurement) that is different with respect to the other values contained in a given dataset. Outliers can be due to several…
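
As a minimal illustration of one common approach, the Python sketch below flags values whose z-score (distance from the mean in standard deviations) exceeds a chosen threshold. The toy data and the threshold are assumptions for demonstration; real applications tune both the method and the threshold to the data.

import statistics

def zscore_outliers(values, threshold=3.0):
    # Return values lying more than `threshold` standard deviations from the mean.
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)
    return [x for x in values if abs(x - mean) > threshold * stdev]

data = [10, 12, 11, 13, 12, 11, 10, 95]        # 95 is an obvious outlier
print(zscore_outliers(data, threshold=2.0))    # -> [95]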

Iris Recognition System

The pressures on today’s system administrators to have secure systems are ever increasing. One area where security can be improved is authentication. Iris recognition, a biometric, provides one of the most secure methods of authentication and identification thanks to the unique characteristics of the iris. Iris recognition is now becoming a common authentication method in handheld consumer electronics devices such as cellphones and tablets. As a biometric parameter, the iris is far better suited than password protection because of its uniqueness for each individual.

General Overview of Iris Recognition Systems

In today’s information technology world, security for systems is becoming more and more important. The number of systems that have been compromised is ever increasing, and authentication plays a major role as a first line of defence against intruders. The three main types of authentication are something you know (such as a password), something you have (such as a card or token), and something you are (a biometric). Passwords are notorious for being weak and easily crackable due to human nature and our tendency to make passwords easy to remember or to write them down somewhere easily accessible. Cards and tokens can be presented by anyone, and although the token…
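
To give a flavour of how iris templates are typically compared, the sketch below computes the fractional Hamming distance between two binary iris codes and accepts the match if it falls below a threshold. The tiny codes and the exact threshold value are illustrative assumptions; deployed systems use much longer codes and carefully tuned thresholds.

def hamming_distance(code_a, code_b):
    # Fraction of differing bits between two equal-length binary iris codes.
    assert len(code_a) == len(code_b)
    differing = sum(a != b for a, b in zip(code_a, code_b))
    return differing / len(code_a)

enrolled = [0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 1]   # stored template (toy length)
probe    = [0, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1]   # freshly captured code

# Accept when only a small fraction of bits differ.
print("match" if hamming_distance(enrolled, probe) < 0.32 else "no match")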

Signature Recognition and Applications

A signature is a special case of handwriting which includes special characters and flourishes. Many signatures can be unreadable; they are a kind of artistic handwriting object. However, a signature can be handled as an image, and hence it can be recognized using computer vision and artificial neural network techniques. Handwritten signatures are widely utilized as a form of personal recognition. However, they have the unfortunate shortcoming of being easily abused by those who would fake the identification or intent of an individual, which might be very harmful. Therefore, the need for an automatic signature recognition system is crucial.

Signature Recognition Overview

The basic goal of handwritten signature verification is to provide an accurate method to verify a person’s identity based on the way in which he or she signs his or her name. For this reason, handwritten signatures are widely accepted, socially and legally, throughout the world. There are basically two types of systems: online and offline. Handwritten signature verification uses features conveyed by each signatory; the features considered are understood to be unique to the individual, and the manner of signing reflects behavioral biometrics. Some researchers considered common issues with the extraction of identification data from different…
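
As a highly simplified illustration of feature-based verification, the Python sketch below compares a probe signature's feature vector against a signer's enrolled references by Euclidean distance and accepts it when the nearest reference is close enough. The feature names, values, and threshold are hypothetical and chosen only to demonstrate the idea.

import math

def verify(probe, references, threshold=0.3):
    # Accept the probe if its distance to the nearest enrolled reference
    # signature is below the threshold.
    nearest = min(math.dist(probe, ref) for ref in references)
    return nearest <= threshold

# Hypothetical feature vectors: (normalized width, height ratio, stroke count / 10).
enrolled = [(0.82, 0.31, 0.4), (0.79, 0.33, 0.4), (0.85, 0.30, 0.5)]
genuine  = (0.81, 0.32, 0.4)
forgery  = (0.55, 0.48, 0.9)

print(verify(genuine, enrolled))   # True
print(verify(forgery, enrolled))   # False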

Importance of Dimensionality Reduction in Data Mining

The recent explosion of data set sizes, in the number of records as well as of attributes, has triggered the development of a number of big data platforms as well as parallel data analytics algorithms. At the same time, it has pushed for the use of data dimensionality reduction procedures. Dealing with a large number of dimensions can be painful for machine learning algorithms: high dimensionality increases the computational complexity, increases the risk of overfitting (as your algorithm has more degrees of freedom), and makes the data sparser. Hence, dimensionality reduction projects the data into a space with fewer dimensions to limit these phenomena.

What is Dimensionality Reduction?

The problem of an unwanted increase in dimensionality is closely tied to the fixation on measuring and recording data at a far more granular level than was done in the past. This is in no way to suggest that the problem is new; it has simply gained more importance lately due to the surge in data. In machine learning classification problems, there are often too many factors on the basis of which the final classification is done. These factors are basically variables called features. The higher the number of features, the harder…
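
As one concrete and commonly used example of such a projection, the sketch below implements a bare-bones principal component analysis in Python with NumPy, projecting the data onto the k directions of highest variance. The toy data and the choice of k are assumptions made for illustration.

import numpy as np

def pca(X, k):
    # Project the rows of X onto the k directions of highest variance.
    X_centered = X - X.mean(axis=0)
    cov = np.cov(X_centered, rowvar=False)
    eigenvalues, eigenvectors = np.linalg.eigh(cov)          # ascending eigenvalues
    top_k = eigenvectors[:, np.argsort(eigenvalues)[::-1][:k]]
    return X_centered @ top_k

# Six samples with four correlated features, reduced to two dimensions.
rng = np.random.default_rng(0)
base = rng.normal(size=(6, 2))
X = np.hstack([base, 2 * base + 0.01 * rng.normal(size=(6, 2))])
print(pca(X, k=2).shape)   # (6, 2)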

Rule-Based Systems

Knowledge is practical or theoretical understanding of a subject or domain, and those who possess knowledge are called experts. The human mental process is internal and too complex to be represented as an algorithm. However, most experts are capable of expressing their knowledge in the form of rules for problem solving. Rules are a popular paradigm for representing knowledge. A rule-based expert system is one whose knowledge base contains the domain knowledge coded in the form of rules.

Overview of Rule-Based Systems

Instead of representing knowledge in a relatively declarative, static way (as a bunch of things that are true), rule-based systems represent knowledge in terms of a bunch of rules that tell you what you should do or what you could conclude in different situations. A rule-based system consists of a bunch of IF-THEN rules, a bunch of facts, and some interpreter controlling the application of the rules, given the facts. Rule-based systems (also known as production systems or expert systems) are the simplest form of artificial intelligence. A rule-based system uses rules as the knowledge representation for the knowledge coded into the system. The definitions of rule-based systems depend almost entirely on expert systems, which…
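
The sketch below is a tiny forward-chaining interpreter in Python that fires IF-THEN rules against a set of facts until nothing new can be concluded. The animal-classification rules and facts are invented for illustration only.

# Rules are (set of conditions, conclusion) pairs.
rules = [
    ({'has_fur', 'gives_milk'}, 'mammal'),
    ({'mammal', 'eats_meat'}, 'carnivore'),
    ({'carnivore', 'has_stripes'}, 'tiger'),
]

def forward_chain(facts, rules):
    # Repeatedly fire any rule whose conditions are all satisfied,
    # adding its conclusion to the facts, until nothing new is derived.
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

print(forward_chain({'has_fur', 'gives_milk', 'eats_meat', 'has_stripes'}, rules))
# -> includes 'mammal', 'carnivore' and 'tiger'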

What is Pattern Recognition?

One of the most important capabilities of mankind is learning by experience, by our endeavors, and by our faults. By the time we reach the age of five, most of us are able to recognize digits and characters, whether big or small, uppercase or lowercase, rotated or tilted. We can recognize a character even when it is on mutilated paper, partially occluded, or set against a cluttered background. Looking at the history of the human search for knowledge, it is clear that humans are fascinated with recognizing patterns in nature, understanding them, and attempting to relate patterns to a set of rules. Informally, a pattern is defined by the common denominator among the multiple instances of an entity. Pattern recognition has its origins in engineering, whereas machine learning grew out of computer science. Pattern recognition is concerned with the design and development of systems that recognize patterns in data. The purpose of a pattern recognition program is to analyze a scene in the real world and to arrive at a description of the scene which is useful for the accomplishment of some task.

Introduction to Pattern Recognition

Pattern Recognition is a mature but exciting and fast developing field,…
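
As a very small illustration of a system that recognizes patterns in data, here is a nearest-neighbor classifier sketch in Python: it labels a new sample with the class of its closest training example. The two-dimensional features and class names are hypothetical and purely for demonstration.

import math

def nearest_neighbor(sample, labeled_points):
    # Classify a sample with the label of its closest training point.
    _, label = min((math.dist(sample, point), label) for point, label in labeled_points)
    return label

# Hypothetical 2-D features (e.g. stroke density, aspect ratio) for two classes.
training = [((0.1, 0.9), 'letter'), ((0.2, 0.8), 'letter'),
            ((0.9, 0.2), 'digit'),  ((0.8, 0.1), 'digit')]

print(nearest_neighbor((0.85, 0.15), training))   # -> 'digit'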

What is Fingerprint Recognition?

Fingerprint recognition is one of the most well-known and publicized biometrics. Because of their uniqueness and consistency over time, fingerprints have been used for identification for over a century, more recently becoming automated (i.e., a biometric) due to advancements in computing capabilities. Fingerprint identification is popular because of the inherent ease of acquisition, the numerous sources (ten fingers) available for collection, and their established use and collection by law enforcement and immigration.

Introduction to Fingerprint Recognition

Fingerprint recognition is one of the most popular and accurate biometric technologies, and fingerprint identification is one of the oldest methods of identification using biometric traits. A large number of archaeological artifacts and historical items show signs of human fingerprints on stones. Ancient people were aware of the individuality of fingerprints, but they were not aware of scientific methods for establishing that individuality. Fingerprints have remarkable permanence and uniqueness over time, and they offer more secure and reliable personal identification than passwords, ID cards, or keys can provide. For example, computers and mobile phones equipped with fingerprint-sensing devices for fingerprint-based protection are being deployed to replace ordinary password protection methods. Finger-scan technology is the most widely deployed biometric technology, with a…
