Introduction to Data Mining


Data Mining

Data mining is the computational process that transforms raw data into useful information and high-level understanding. It builds on a host of techniques from statistical analysis, information processing, and machine learning. This course covers both the practical skills and the fundamental concepts needed to discover patterns in structured and unstructured datasets, and to process data for clustering, information extraction, information indexing, outlier detection, and knowledge discovery. Lab sessions are interleaved with lectures to provide practical experience with data mining on real-world problems.

This course contains three parts: introduction to data mining and statistics, data mining techniques, and data mining applications.

  • Introduction to data mining
  • Introduction to statistics
  • Data mining pipelines
  • Data transformations
  • Association pattern mining
  • Identifying groups within the data 
  • Identifying items from predefined groups
  • Detecting outliers
  • Rules, trees, and linear models
  • Nonlinear models
  • Probabilistic methods
  • Application: text mining
  • Application: web mining
  • Application: image and speech

All topics are supported by lab materials. Programming language:

  • Python

An introduction to Python programming is also provided in the course lab sessions.


