Wednesday, February 07, 2007

Data Mining 001

Definition:
Data mining is the process of analyzing data to find hidden patterns using automatic or semi-automatic means.

Three Parts to Data Mining:
1. Create the model - similar to "create table"
Each column is declared with a content type:
discrete = distinct categories
continuous = numeric values

2. Train the model -
similar to "insert into table"
a. processing the model ==> similar to processing a cube
b. training the model ==> truth table

3. Predict with the model -
similar to "select from table"
a. closing the analysis loop
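
A rough sketch of the three steps above, in plain Python rather than SQL Server's own mining language. Everything here is an assumption made up for illustration: the class name TinyMiningModel, the columns age_group and buys_bike, and the sample cases. It only handles discrete columns and simply counts which target value shows up most often for each input value, so it is a toy, not how SQL Server 2005 actually trains a model.

```python
from collections import Counter, defaultdict


class TinyMiningModel:
    """Hypothetical toy model: predicts one discrete column from another."""

    def __init__(self, input_column, predict_column):
        # Step 1, like "create table": declare the structure only, no data yet.
        self.input_column = input_column
        self.predict_column = predict_column
        self.counts = defaultdict(Counter)

    def train(self, cases):
        # Step 2, like "insert into table": feed in training cases.
        # The model keeps co-occurrence counts (the patterns), not the rows.
        for case in cases:
            key = case[self.input_column]
            self.counts[key][case[self.predict_column]] += 1

    def predict(self, input_value):
        # Step 3, like "select from table": return the most frequent target
        # seen for this input, or None if the input was never seen.
        seen = self.counts.get(input_value)
        if not seen:
            return None
        return seen.most_common(1)[0][0]


# Made-up training cases, for illustration only.
cases = [
    {"age_group": "young", "buys_bike": "yes"},
    {"age_group": "young", "buys_bike": "yes"},
    {"age_group": "young", "buys_bike": "no"},
    {"age_group": "senior", "buys_bike": "no"},
]

model = TinyMiningModel("age_group", "buys_bike")  # create
model.train(cases)                                 # train
print(model.predict("young"))                      # predict
print(model.predict("senior"))
```

The point of the analogy: after training, the model holds what it learned from the cases rather than the cases themselves, and a prediction is a query against those learned patterns.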

Reference
Data Mining with SQL Server 2005 by ZhaoHui Tang and Jamie MacLennan
