ID3 Decision Tree in Data Mining

ID3 Decision Tree Overview

Developed by Ross Quinlan, ID3 is a straightforward decision tree learning algorithm. Its main idea is to construct the decision tree by a top-down, greedy search through the given sets, testing every attribute at each decision node. To select the attribute that is most useful for classifying a given set of data, a metric called information gain is introduced [1]. To find the best way to classify a learning set, one needs to minimize the number of questions asked (i.e. minimize the depth of the tree). Hence, a function is needed that can determine which questions offer the most balanced splitting. The information gain metric is one such function.

Entropy

To define information gain precisely, we first need to discuss entropy. Let's assume, without loss of generality, that the resulting decision tree classifies instances into two categories, which we'll call \( P_{positive} \) and \( N_{negative} \). Given a set S containing these positive and negative targets, the entropy of S relative to this Boolean classification is:

\( Entropy(S) = -P_{positive} \log_2 P_{positive} - N_{negative} \log_2 N_{negative} \)

\( P_{positive} \): proportion of positive examples in S
\( N_{negative} \)…
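The entropy and information gain ideas described above can be illustrated with a minimal Python sketch. It is not Quinlan's original implementation. The function names (`entropy`, `information_gain`, `best_attribute`) and the representation of each example as a dictionary of attribute values are assumptions made for illustration only. Information gain is computed here in its usual form: the entropy of the whole set minus the weighted entropy of the subsets produced by splitting on one attribute.

```python
import math
from collections import Counter


def entropy(labels):
    """Entropy, in bits, of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    # Sum of p * log2(1 / p) over the classes that actually occur.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by splitting the examples on `attribute`.

    `rows` is a list of dicts (hypothetical representation), one per example.
    """
    total = len(labels)
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attribute], []).append(label)
    # Weighted average entropy of the subsets after the split.
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    return entropy(labels) - remainder


def best_attribute(rows, labels, attributes):
    """Greedy choice made at one node: the attribute with the highest gain."""
    return max(attributes, key=lambda a: information_gain(rows, labels, a))


# Toy data invented for this example: "a" separates the classes, "b" does not.
rows = [
    {"a": 0, "b": 0},
    {"a": 0, "b": 1},
    {"a": 1, "b": 0},
    {"a": 1, "b": 1},
]
labels = ["positive", "positive", "negative", "negative"]

chosen = best_attribute(rows, labels, ["a", "b"])
```

On this toy set, the attribute that separates the two classes cleanly is the one the greedy step picks, which is the choice a full ID3 implementation would then repeat recursively on each resulting subset.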
