Apriori Algorithm: Example and Algorithm Description
/ November 2, 2017

With the rapid growth of e-commerce applications, vast quantities of data now accumulate in months rather than years. Data mining, also known as Knowledge Discovery in Databases (KDD), is used to find anomalies, correlations, patterns, and trends in order to predict outcomes. The Apriori algorithm is a classical data mining algorithm used for mining frequent itemsets and the relevant association rules. It is designed to operate on a database containing a large number of transactions, for instance, the items bought by customers in a store. It is very important for effective Market Basket Analysis, and it helps customers purchase their items more easily, which increases sales. It has also been used in healthcare for the detection of adverse drug reactions, where it produces association rules that indicate which combinations of medications and patient characteristics lead to adverse drug reactions.

Figure 1: Apriori algorithm example application

Apriori Algorithm: Overview

Apriori was one of the first algorithms developed for frequent itemset and association rule mining. Its two major steps are the join step and the prune step. The join step is used to construct new candidate itemsets. A candidate itemset is basically an itemset that could be either frequent or…
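
The join and prune steps named in the overview can be sketched as follows. This is a minimal illustration rather than the post's own implementation: the function name `generate_candidates`, its parameters, and the toy itemsets are assumptions made for the example. The prune rule used here is the standard Apriori property that every subset of a frequent itemset must itself be frequent.

```python
from itertools import combinations


def generate_candidates(frequent_prev, k):
    """Build candidate k-itemsets from the frequent (k-1)-itemsets.

    frequent_prev: a set of frozensets, each of size k-1 (hypothetical input format).
    """
    # Join step: merge pairs of frequent (k-1)-itemsets whose union has exactly k items.
    joined = {a | b for a in frequent_prev for b in frequent_prev if len(a | b) == k}
    # Prune step: discard any candidate that has an infrequent (k-1)-subset.
    return {
        c
        for c in joined
        if all(frozenset(s) in frequent_prev for s in combinations(c, k - 1))
    }


# Toy data, made up for illustration: the frequent 2-itemsets of some database.
frequent_2 = {frozenset(p) for p in [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D")]}
candidates_3 = generate_candidates(frequent_2, 3)
```

With this toy input, the join step proposes {A, B, C}, {A, B, D}, and {B, C, D}, and the prune step keeps only {A, B, C}, because {A, D} and {C, D} are not among the frequent 2-itemsets. The surviving candidates would still need a database scan to count their actual support.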

FP-Growth (FP-tree) Algorithm with Example
/ November 1, 2017

FP-Growth Algorithm: Introduction

The FP-growth algorithm is one of the fastest approaches to frequent itemset mining. It adopts a “divide and conquer” strategy to mine the patterns. The algorithm compresses the database by transforming the frequent items into a frequent-pattern tree (FP-tree), which preserves the association information available in the dataset. The database is compressed into a set of rules that represents the entire database, and each rule shows conditional decisions based on an itemset. First, the database is scanned to find the list of frequent items, sorted in descending order of frequency. Then the FP-tree is constructed as follows. Create the root node of the tree and scan the database a second time. The items in each transaction are evaluated with respect to the frequent items list and, ordered by their frequency, are used to create a branch of the tree. When a branch is added to the FP-tree, the count of each node along a common prefix is incremented. The mining of frequent patterns and trees is also known as association rule mining. How two items come together in a transaction is therefore the key concept of association rule mining. This algorithm helps to measure the frequency of association of two items in a…
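
The two database scans described above can be sketched in a few lines. This is a simplified illustration under stated assumptions: the `Node` class, the `build_fp_tree` function, the `min_support` threshold expressed as an absolute count, and the toy transactions are all hypothetical, and the header table and the mining phase of FP-growth are left out.

```python
from collections import Counter


class Node:
    """One FP-tree node: an item, its count, and its child nodes."""

    def __init__(self, item):
        self.item = item
        self.count = 0
        self.children = {}


def build_fp_tree(transactions, min_support):
    # First scan: count every item and keep only the frequent ones.
    counts = Counter(item for t in transactions for item in set(t))
    frequent = [i for i, c in counts.items() if c >= min_support]
    # Descending frequency; ties are broken by item name so the order is deterministic.
    frequent.sort(key=lambda i: (-counts[i], i))
    rank = {item: position for position, item in enumerate(frequent)}

    # Second scan: insert each transaction as a path starting at the root.
    root = Node(None)
    for t in transactions:
        ordered = sorted((i for i in set(t) if i in rank), key=rank.get)
        node = root
        for item in ordered:
            if item not in node.children:
                node.children[item] = Node(item)
            node = node.children[item]
            node.count += 1  # nodes on a shared prefix are incremented, not duplicated
    return root, frequent


# Toy transactions, made up for illustration.
transactions = [["a", "b"], ["b", "c"], ["a", "b", "c"], ["b"]]
tree, order = build_fp_tree(transactions, min_support=2)
```

Because every transaction is sorted by the same global frequency order, transactions that share their most frequent items also share a path from the root, which is where the compression comes from.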

What is Distributed Database
/ October 31, 2017

A distributed database is a database in which portions of the database are stored in multiple physical locations and processing is distributed among multiple database nodes. Distributed databases can be homogeneous or heterogeneous. In a homogeneous distributed database system, all the physical locations have the same underlying hardware and run the same operating systems and database applications. In a heterogeneous distributed database, the hardware, operating systems, or database applications may be different at each of the locations.

Distributed Database: Overview

A distributed database is a database distributed between several sites. The reasons for distributing the data may include the inherently distributed nature of the data or performance considerations. In a distributed database, the data at each site is not necessarily an independent entity, but can instead be related to the data stored at the other sites. A distributed database (DDB) is a collection of multiple, logically interrelated databases distributed over a computer network. A distributed database management system (DDBMS) is the software that manages the DDB and provides an access mechanism that makes this distribution transparent to the user. A distributed database system (DDBS) is the integration of a DDB and a DDBMS. This integration is achieved through the merging of the database and…

Association Rule Mining
/ October 30, 2017

Data mining is the discovery of hidden information found in databases and can be viewed as a step in the knowledge discovery process. Data mining functions include clustering, classification, prediction, and link analysis (associations). One of the most important data mining applications is the mining of association rules. An association rule has two parts, an antecedent (if) and a consequent (then). An antecedent is an item found in the data. A consequent is an item that is found in combination with the antecedent.

Association Rule Mining: Overview

Association rules are created by analyzing data for frequent if/then patterns and using the criteria of support and confidence to identify the most important relationships. Support indicates how frequently the items appear in the database. Confidence indicates how often the if/then statement has been found to be true among the transactions that contain the antecedent. Association rule mining has been an active research area in data mining, for which many algorithms have been developed. In data mining, association rule learning is a popular and well-accepted method for discovering interesting relations between variables in large databases. Association rules are employed today in many areas, including web usage mining, intrusion detection, and bioinformatics. In general, the association rule…
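
Support and confidence, as described above, can be computed directly from a list of transactions. The following is a small sketch, not a full rule miner; the function names and the toy baskets are assumptions made for illustration. Support is taken as the fraction of transactions containing an itemset, and confidence as the support of antecedent and consequent together divided by the support of the antecedent.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= set(t) for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate how often the consequent appears when the antecedent does.

    Assumes the antecedent occurs at least once; otherwise this divides by zero.
    """
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Toy market baskets, made up for illustration.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
s = support(baskets, {"bread", "milk"})
c = confidence(baskets, {"bread"}, {"milk"})
```

Here bread and milk appear together in 2 of 4 baskets, so the support of the rule "if bread then milk" is 0.5, and bread appears in 3 baskets, so the confidence is 2/3.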

What is Mobile Computing
/ October 26, 2017

Mobile computing is a technology that allows the transmission of data, voice, and video via a computer or any other wireless-enabled device without it having to be connected to a fixed physical link. Mobile computing (or ubiquitous computing, as it is sometimes called) is the use of computers in a non-static environment. This use may range from using notebook-type computers away from one’s office or home to using handheld, palmtop-type, PDA-like devices to perform both simple and complex computing tasks.

Mobile Computing: General

The mobile device has become an essential part of human life. Apart from making and receiving calls, a user can access many functions on a mobile device. Users want everything on their mobile device for ease of work, and some people use tablets instead of a laptop or desktop. Despite the increasing usage of mobile computing, exploiting its full potential is difficult due to inherent problems such as resource scarcity, frequent disconnections, and mobility. Mobile cloud computing can address these problems by executing mobile applications on resource providers external to the mobile device. Mobile phones are set to become the universal interface to online services and cloud computing applications. However, using them for this purpose today is limited to two…

What is Phishing in Web Security
/ October 25, 2017

Phishing is one of the luring techniques used by phishing artists with the intention of exploiting the personal details of unsuspecting users. It is a form of identity theft that occurs when a malicious website impersonates a legitimate one in order to acquire sensitive information such as passwords, account details, or credit card numbers. Although there are several anti-phishing tools and techniques for detecting potential phishing attempts in emails and phishing content on websites, phishers come up with new and hybrid techniques to circumvent them. This section provides a detailed study of online phishing and its deployment techniques.

Phishing: General Description

Nowadays, attacks have become a major issue in networks. Attackers intrude into the network infrastructure and collect the information needed to make the network vulnerable. Security is needed to protect data from these attacks. An attack may be either active or passive. One type of passive attack is phishing. Phishing is a continual threat, and it is especially widespread on social media such as Facebook and Twitter. Phishing emails contain a link to an infected website and direct the user to that website, where they are asked to enter the…

What is Intrusion Detection System (IDS)
/ October 24, 2017

The Internet is a global public network. With the growth of the Internet and its potential, there has been a subsequent change in the business models of organizations across the world. More and more people are getting connected to the Internet every day to take advantage of the new business model popularly known as e-business. Internetwork connectivity has therefore become a very critical aspect of today’s e-business. “Intrusion is an unauthorized access to the system with the intent of doing theft of information or harms the system. The act of detecting intrusions, monitoring the incidents occurring in the computer system, the suspicious or unusual activities, taking place in the system, which can be the possible attack, is known as Intrusion Detection System (IDS).” If a computer is left unattended, any person can attempt to access and misuse the system. The problem is, however, far greater if the computer is connected to a network, particularly the Internet. Any user from around the world can reach the computer remotely (to some capacity) and may attempt to access private or confidential information, or launch some form of attack to bring the system to a halt or stop it from functioning effectively.

Overview

The Intrusion Detection System (IDS) in a…

Wireless Network Routing Protocol
/ October 23, 2017

Because of the severe energy constraints of large numbers of densely deployed sensor nodes, a wireless sensor network requires a suite of network protocols to implement various network control and management functions such as synchronization, node localization, and network security. A routing protocol is a set of rules used by routers to determine the most appropriate paths along which they should forward packets towards their intended destinations.

Routing Protocol Overview

A routing protocol is considered adaptive if certain system parameters can be controlled in order to adapt to current network conditions and available energy levels. Routing in wireless sensor networks differs from conventional routing in fixed networks in various ways: there is no infrastructure, wireless links are unreliable, sensor nodes may fail, and routing protocols have to meet strict energy-saving requirements. Part of the job of the routing protocol is to specify how routers report changes and share information with the other routers in the network in order to update their routing tables, thereby allowing networks to adjust dynamically to changing conditions (e.g., changes in network topology and traffic patterns). Routing is the act of moving information from a source to a destination in an internetwork. During this process, at least one…

Self Organizing Map (SOM)
/ October 23, 2017

What is Self Organizing Map (SOM)

A neural network is called a mapping network if it is able to compute some functional relationship between its input and its output. For example, if the input to a network is the value of an angle and the output is the cosine of that angle, the network performs the mapping θ → cos(θ). For such a simple function, we do not need a neural network. However, we might want to perform a complicated mapping where we do not know how to describe the functional relationship in advance, but we do know examples of the correct mapping. In this situation, the ability of a neural network to discover its own algorithm is extremely useful.

Self Organizing Map: Overview

The SOM algorithm is based on unsupervised, competitive learning. It provides a topology-preserving mapping from the high-dimensional space to map units. Map units, or neurons, usually form a two-dimensional lattice, and thus the mapping is a mapping from a high-dimensional space onto a plane. The property of topology preservation means that the mapping preserves the relative distance between the points. Points that are near each other in the input space are mapped to nearby map…
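
One competitive-learning step on a two-dimensional lattice can be sketched as below. The excerpt does not give the update rule, so this uses a commonly taught form as an assumption: the unit whose weight vector is closest to the input wins, and every unit moves toward the input by an amount that decays with its lattice distance from the winner (a Gaussian neighbourhood). The function names, the `lr` and `sigma` parameters, and the two-unit toy map are hypothetical.

```python
import math


def best_matching_unit(weights, x):
    """Return the lattice position whose weight vector is closest to input x."""
    return min(
        weights,
        key=lambda pos: sum((w - xi) ** 2 for w, xi in zip(weights[pos], x)),
    )


def train_step(weights, x, lr=0.5, sigma=1.0):
    """Apply one SOM update for input x, modifying `weights` in place."""
    bmu = best_matching_unit(weights, x)
    for pos, w in list(weights.items()):
        # Squared distance on the lattice (not in input space) to the winning unit.
        d2 = (pos[0] - bmu[0]) ** 2 + (pos[1] - bmu[1]) ** 2
        h = math.exp(-d2 / (2 * sigma ** 2))  # neighbourhood strength, 1.0 at the winner
        weights[pos] = [wi + lr * h * (xi - wi) for wi, xi in zip(w, x)]
    return bmu


# Toy 1x2 map with 2-dimensional weights, made up for illustration.
som = {(0, 0): [0.0, 0.0], (0, 1): [1.0, 1.0]}
winner = train_step(som, [0.9, 0.9])
```

Pulling the winner's lattice neighbours toward the same input is what makes nearby inputs end up on nearby map units. In practice `lr` and `sigma` are usually reduced over the course of training, which this single step does not show.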

Privacy Preserving Data Mining
/ October 22, 2017

Because privacy is a matter of individual perception, an infallible and universal solution to this dichotomy is infeasible. The common notion of privacy generally limits the information leaked by a distributed computation to the information that can be learned from the designated output of the computation. The current state-of-the-art paradigm for privacy-preserving data mining is differential privacy, which allows untrusted parties to access private data through aggregate queries.

Privacy Preserving Data Mining: Overview

The technology that converts clear text into a non-human-readable form is called data anonymization. In recent years, data anonymization techniques for the privacy-preserving publishing of micro-data have received a lot of attention. Micro-data contains information about an individual entity, such as a person, a household, or an organization. In each record, the attributes can be categorized as i) identifiers, which can uniquely identify an individual, such as name or Social Security number; ii) sensitive attributes (SAs), such as disease and salary; and iii) quasi-identifiers (QIs), such as zip code, age, and sex, which may be available from public databases and whose values, when taken together, can potentially identify an individual. Data anonymization enables the transfer…
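
The point that quasi-identifiers can identify an individual "when taken together" can be illustrated with a short check. This sketch is an assumption-laden example, not an anonymization method: the function name `uniquely_identifiable`, the record layout, and the toy records are hypothetical. It only flags records whose combination of QI values is shared by no other record, even though explicit identifiers such as names have already been removed.

```python
from collections import Counter


def uniquely_identifiable(records, quasi_identifiers):
    """Return the records whose quasi-identifier combination occurs exactly once."""

    def qi_key(record):
        return tuple(record[q] for q in quasi_identifiers)

    group_sizes = Counter(qi_key(r) for r in records)
    return [r for r in records if group_sizes[qi_key(r)] == 1]


# Toy micro-data, made up for illustration; names are already dropped.
records = [
    {"zip": "13053", "age": 28, "sex": "F", "disease": "flu"},
    {"zip": "13053", "age": 28, "sex": "F", "disease": "asthma"},
    {"zip": "14850", "age": 51, "sex": "M", "disease": "diabetes"},
]
exposed = uniquely_identifiable(records, ["zip", "age", "sex"])
```

In this toy table the third record is the only one with its zip code, age, and sex combination, so anyone who knows those three values from a public source could link that person to the sensitive attribute. The first two records share a combination and are harder to tell apart.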