DATA WAREHOUSING AND MINING (MCA2104)


 

UNIT I


Introduction to Data Mining, Types of Data, Data Quality, Data Processing, Measures of Similarity and Dissimilarity, Exploring Data: Data Set, Summary Statistics, Visualization, Data Warehouse, OLAP and Multidimensional Data Analysis


UNIT II


Classification: Basic Concepts, Decision Trees and Model Evaluation: General approach for solving a classification problem, Decision Tree induction, Model overfitting: due to presence of noise, due to lack of representative samples, Evaluating the performance of a classifier. Nearest Neighbor classifier, Bayesian Classifier, Support Vector Machines: Linear SVM, Separable and Non-Separable case.


UNIT III


Association Analysis: Problem Definition, Frequent Itemset generation, Rule generation, Compact representation of frequent itemsets, FP-Growth Algorithm. Handling Categorical and Continuous attributes, Concept hierarchy, Sequential patterns, Subgraph patterns

 

UNIT IV


Clustering: Overview, K-means, Agglomerative Hierarchical clustering, DBSCAN, Cluster evaluation: Overview, Unsupervised Cluster Evaluation using cohesion and separation, using proximity matrix, Scalable Clustering algorithms

 

UNIT V


Web Data Mining: Introduction, Web terminology and characteristics, Web content mining, Web usage mining, Web structure mining, Search Engines: Characteristics, Functionality, Architecture, Ranking of Web Pages, Enterprise search



netaji gandi Sunday, February 9, 2025
