TY - BOOK
AU - Rokach, Lior
AU - Maimon, Oded
TI - Data mining with decision trees: theory and applications
SN - 9789814590075 (hb)
AV - QA76.9.D343 R654 2014
U1 - 006.312 22
PY - 2015///
CY - [Hackensack] New Jersey
PB - World Scientific
KW - Data mining
KW - Decision trees
KW - Machine learning
KW - Decision support systems
N1 - computer bookfair2015; Includes bibliographical references and index; About the Authors; Preface for the Second Edition; Preface for the First Edition; Contents; 1. Introduction to Decision Trees; 1.1 Data Science; 1.2 Data Mining; 1.3 The Four-Layer Model; 1.4 Knowledge Discovery in Databases (KDD); 1.5 Taxonomy of Data Mining Methods; 1.6 Supervised Methods; 1.6.1 Overview; 1.7 Classification Trees; 1.8 Characteristics of Classification Trees; 1.8.1 Tree Size; 1.8.2 The Hierarchical Nature of Decision Trees; 1.9 Relation to Rule Induction; 2. Training Decision Trees; 2.1 What is Learning?; 2.2 Preparing the Training Set; 2.3 Training the Decision Tree; 3. A Generic Algorithm for Top-Down Induction of Decision Trees; 3.1 Training Set; 3.2 Definition of the Classification Problem; 3.3 Induction Algorithms; 3.4 Probability Estimation in Decision Trees; 3.4.1 Laplace Correction; 3.4.2 No Match; 3.5 Algorithmic Framework for Decision Trees; 3.6 Stopping Criteria; 4. Evaluation of Classification Trees; 4.1 Overview; 4.2 Generalization Error; 4.2.1 Theoretical Estimation of Generalization Error; 4.2.2 Empirical Estimation of Generalization Error; 4.2.3 Alternatives to the Accuracy Measure; 4.2.4 The F-Measure; 4.2.5 Confusion Matrix; 4.2.6 Classifier Evaluation under Limited Resources; 4.2.6.1 ROC Curves; 4.2.6.2 Hit-Rate Curve; 4.2.6.3 Qrecall (Quota Recall); 4.2.6.4 Lift Curve; 4.2.6.5 Pearson Correlation Coefficient; 4.2.6.6 Area Under Curve (AUC); 4.2.6.7 Average Hit-Rate; 4.2.6.8 Average Qrecall; 4.2.6.9 Potential Extract Measure (PEM); 4.2.7 Which Decision Tree Classifier is Better?; 4.2.7.1 McNemar's Test; 4.2.7.2 A Test for the Difference of Two Proportions; 4.2.7.3 The Resampled Paired t Test; 4.2.7.4 The k-fold Cross-validated Paired t Test; 4.3 Computational Complexity; 4.4 Comprehensibility; 4.5 Scalability to Large Datasets; 4.6 Robustness; 4.7 Stability; 4.8 Interestingness Measures; 4.9 Overfitting and Underfitting; 4.10 "No Free Lunch" Theorem; 5. Splitting Criteria; 5.1 Univariate Splitting Criteria; 5.1.1 Overview; 5.1.2 Impurity-based Criteria; 5.1.3 Information Gain; 5.1.4 Gini Index; 5.1.5 Likelihood Ratio Chi-squared Statistics; 5.1.6 DKM Criterion; 5.1.7 Normalized Impurity-based Criteria; 5.1.8 Gain Ratio; 5.1.9 Distance Measure; 5.1.10 Binary Criteria; 5.1.11 Twoing Criterion; 5.1.12 Orthogonal Criterion; 5.1.13 Kolmogorov-Smirnov Criterion; 5.1.14 AUC Splitting Criteria; 5.1.15 Other Univariate Splitting Criteria; 5.1.16 Comparison of Univariate Splitting Criteria; 5.2 Handling Missing Values; 6. Pruning Trees; 6.1 Stopping Criteria; 6.2 Heuristic Pruning; 6.2.1 Overview; 6.2.2 Cost Complexity Pruning; 6.2.3 Reduced Error Pruning; 6.2.4 Minimum Error Pruning (MEP); 6.2.5 Pessimistic Pruning; 6.2.6 Error-Based Pruning (EBP); 6.2.7 Minimum Description Length (MDL) Pruning; 6.2.8 Other Pruning Methods; 6.2.9 Comparison of Pruning Methods; 6.3 Optimal Pruning; 7. Popular Decision Trees Induction Algorithms; 7.1 Overview; 7.2 ID3
N2 - Decision trees have become one of the most powerful and popular approaches in knowledge discovery and data mining, the science of exploring large and complex bodies of data in order to discover useful patterns.
Decision tree learning continues to evolve: existing methods are constantly being improved and new methods are introduced. This second edition is dedicated entirely to the field of decision trees in data mining, covering all aspects of this important technique as well as the improved and new methods developed after the publication of the first edition. In this new ed
UR - http://repository.fue.edu.eg/xmlui/handle/123456789/3632
ER -