Data mining for business intelligence : concepts, techniques, and applications in Microsoft Office Excel with XLMiner / Galit Shmueli, Nitin R. Patel, Peter C. Bruce

Material type: Text
Publisher: Hoboken, N.J. : Wiley-Interscience, [2007]
Manufacturer: Hoboken, N.J. : Wiley-Interscience, 2018
Description: xviii, 279 pages : illustrations ; 26 cm
Content type:
  • text
Media type:
  • unmediated
Carrier type:
  • volume
DDC classification:
  • 005.54 22 S.G.D
LOC classification:
  • HF5548.2 .S44843 2007

Title page verso: First Printing 2007

Includes bibliographical references (pages 271-272) and index

PART I PRELIMINARIES

CHAPTER 1 Introduction 3

1.1 What Is Business Analytics? 3

1.2 What Is Data Mining? 5

1.3 Data Mining and Related Terms 5

1.4 Big Data 6

1.5 Data Science 7

1.6 Why Are There So Many Different Methods? 8

1.7 Terminology and Notation 9

1.8 Road Maps to This Book 11

Order of Topics 11

CHAPTER 2 Overview of the Data Mining Process 15

2.1 Introduction 15

2.2 Core Ideas in Data Mining 16

2.3 The Steps in Data Mining 19

2.4 Preliminary Steps 21

2.5 Predictive Power and Overfitting 33

2.6 Building a Predictive Model 38

2.7 Using R for Data Mining on a Local Machine 43

2.8 Automating Data Mining Solutions 43

PART II DATA EXPLORATION AND DIMENSION REDUCTION

CHAPTER 3 Data Visualization 55

3.1 Uses of Data Visualization 55

3.2 Data Examples 57

3.3 Basic Charts: Bar Charts, Line Graphs, and Scatter Plots 59

3.4 Multidimensional Visualization 67

3.5 Specialized Visualizations 80

3.6 Summary: Major Visualizations and Operations, by Data Mining Goal 86

CHAPTER 4 Dimension Reduction 91

4.1 Introduction 91

4.2 Curse of Dimensionality 92

4.3 Practical Considerations 92

4.4 Data Summaries 94

4.5 Correlation Analysis 97

4.6 Reducing the Number of Categories in Categorical Variables 99

4.7 Converting a Categorical Variable to a Numerical Variable 99

4.8 Principal Components Analysis 101

4.9 Dimension Reduction Using Regression Models 111

4.10 Dimension Reduction Using Classification and Regression Trees 111

PART III PERFORMANCE EVALUATION

CHAPTER 5 Evaluating Predictive Performance 117

5.1 Introduction 117

5.2 Evaluating Predictive Performance 118

5.3 Judging Classifier Performance 122

5.4 Judging Ranking Performance 136

5.5 Oversampling 140

PART IV PREDICTION AND CLASSIFICATION METHODS

CHAPTER 6 Multiple Linear Regression 153

6.1 Introduction 153

6.2 Explanatory vs. Predictive Modeling 154

6.3 Estimating the Regression Equation and Prediction 156

6.4 Variable Selection in Linear Regression 161

CHAPTER 7 k-Nearest Neighbors (kNN) 173

7.1 The k-NN Classifier (Categorical Outcome) 173

7.2 k-NN for a Numerical Outcome 180

7.3 Advantages and Shortcomings of k-NN Algorithms 182

CHAPTER 8 The Naive Bayes Classifier 187

8.1 Introduction 187

8.2 Applying the Full (Exact) Bayesian Classifier 189

8.3 Advantages and Shortcomings of the Naive Bayes Classifier 199

CHAPTER 9 Classification and Regression Trees 205

9.1 Introduction 205

9.2 Classification Trees 207

9.3 Evaluating the Performance of a Classification Tree 215

9.4 Avoiding Overfitting 216

9.5 Classification Rules from Trees 226

9.6 Classification Trees for More Than Two Classes 227

9.7 Regression Trees 227

9.8 Improving Prediction: Random Forests and Boosted Trees 229

9.9 Advantages and Weaknesses of a Tree 232

CHAPTER 10 Logistic Regression 237

10.1 Introduction 237

10.2 The Logistic Regression Model 239

10.3 Example: Acceptance of Personal Loan 240

10.4 Evaluating Classification Performance 247

10.5 Example of Complete Analysis: Predicting Delayed Flights 250

10.6 Appendix: Logistic Regression for Profiling 259

Appendix A: Why Linear Regression Is Problematic for a Categorical Outcome 259

Appendix B: Evaluating Explanatory Power 261

Appendix C: Logistic Regression for More Than Two Classes 264

CHAPTER 11 Neural Nets 271

11.1 Introduction 271

11.2 Concept and Structure of a Neural Network 272

11.3 Fitting a Network to Data 273

11.4 Required User Input 285

11.5 Exploring the Relationship between Predictors and Outcome 287

11.6 Advantages and Weaknesses of Neural Networks 288

CHAPTER 12 Discriminant Analysis 293

12.1 Introduction 293

12.2 Distance of a Record from a Class 296

12.3 Fisher’s Linear Classification Functions 297

12.4 Classification Performance of Discriminant Analysis 300

12.5 Prior Probabilities 302

12.6 Unequal Misclassification Costs 302

12.7 Classifying More Than Two Classes 303

12.8 Advantages and Weaknesses 306

CHAPTER 13 Combining Methods: Ensembles and Uplift Modeling 311

13.1 Ensembles 311

13.2 Uplift (Persuasion) Modeling 317

13.3 Summary 324

PART V MINING RELATIONSHIPS AMONG RECORDS

CHAPTER 14 Association Rules and Collaborative Filtering 329

14.1 Association Rules 329

14.2 Collaborative Filtering 342

14.3 Summary 351

CHAPTER 15 Cluster Analysis 357

15.1 Introduction 357

15.2 Measuring Distance between Two Records 361

15.3 Measuring Distance between Two Clusters 366

15.4 Hierarchical (Agglomerative) Clustering 368

15.5 Non-Hierarchical Clustering: The k-Means Algorithm 376

PART VI FORECASTING TIME SERIES

CHAPTER 16 Handling Time Series 387

16.1 Introduction 387

16.2 Descriptive vs. Predictive Modeling 389

16.3 Popular Forecasting Methods in Business 389

16.4 Time Series Components 390

16.5 Data-Partitioning and Performance Evaluation 395

CHAPTER 17 Regression-Based Forecasting 401

17.1 A Model with Trend 401

17.2 A Model with Seasonality 407

17.3 A Model with Trend and Seasonality 411

17.4 Autocorrelation and ARIMA Models 412

CHAPTER 18 Smoothing Methods 433

18.1 Introduction 433

18.2 Moving Average 434

18.3 Simple Exponential Smoothing 439

18.4 Advanced Exponential Smoothing 442

PART VII DATA ANALYTICS

CHAPTER 19 Social Network Analytics 455

19.1 Introduction 455

19.2 Directed vs. Undirected Networks 457

19.3 Visualizing and Analyzing Networks 458

19.4 Social Data Metrics and Taxonomy 462

19.5 Using Network Metrics in Prediction and Classification 467

19.6 Collecting Social Network Data with R 471

19.7 Advantages and Disadvantages 474

CHAPTER 20 Text Mining 479

20.1 Introduction 479

20.2 The Tabular Representation of Text: Term-Document Matrix and “Bag-of-Words” 480

20.3 Bag-of-Words vs. Meaning Extraction at Document Level 481

20.4 Preprocessing the Text 482

20.5 Implementing Data Mining Methods 489

20.6 Example: Online Discussions on Autos and Electronics 490

20.7 Summary 494

PART VIII CASES

CHAPTER 21 Cases 499

21.1 Charles Book Club 499

21.2 German Credit 505

21.3 Tayko Software Cataloger 510

21.4 Political Persuasion 513

21.5 Taxi Cancellations 517

21.6 Segmenting Consumers of Bath Soap 518

21.7 Direct-Mail Fundraising 521

21.8 Catalog Cross-Selling 524

21.9 Predicting Bankruptcy 525

21.10 Time Series Case: Forecasting Public Transportation Demand 528

Index 535
