Friday, March 13, 2009

Brief Information about Data Mining

Data is a collection of values. Extracting useful information from such collections with the help of data mining techniques is called data mining. Data mining is defined as ‘the process of automatically discovering useful information in large data repositories’ (Tan et al. 2006, p. 2). According to Han and Kamber (2006, p. 5), ‘data mining refers to extracting or mining knowledge from large amounts of data’.

The vast majority of work in data mining emphasizes the representation of data in different forms. The most widely used application areas are data prediction, pattern recognition, data management and advanced data analysis. There are many data mining techniques for solving these problems, among them the k-means algorithm, decision tree algorithms and neural networks.
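To make one of these techniques concrete, here is a minimal sketch of k-means on one-dimensional data, written in plain Python. The function name `kmeans_1d`, the sample values and the starting centers are all assumptions chosen for illustration; a real project would more likely use a library implementation.

```python
def kmeans_1d(values, centers, iterations=10):
    """Tiny 1-D k-means: assign each value to its nearest center,
    then move each center to the mean of its assigned values."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            # index of the center closest to this value
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # recompute each center; keep the old one if its cluster is empty
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters


# Hypothetical data with two obvious groups
values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centers, clusters = kmeans_1d(values, centers=[0.0, 5.0])
print(centers)
print(clusters)
```

With these inputs the two centers settle on the means of the low group and the high group. The result of k-means generally depends on the starting centers, so this sketch should not be read as a robust implementation.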


In the real world, immense amounts of data are generated and widely available. It is important to keep track of and store this data, and there is an imminent need to turn it into useful information and knowledge. The information and knowledge obtained can be used in various applications, such as stock market analysis, supermarket sales analysis, fraud detection and scientific discovery. Data mining can work with many types of data: numeric data, symbolic data, or a mix of both, held in data warehouses, relational databases, transactional databases, sequence databases, data streams, internet data and many other sources.

Data mining tasks can basically be classified into two categories (Han and Kamber 2006):

1. Descriptive: Descriptive mining tasks characterize the general properties of the data in the database.

2. Predictive: Predictive mining tasks make predictions based on the current data in the database.
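The difference between the two categories can be sketched in a few lines of Python. The `sales` figures and the helper names `describe` and `predict_next` are hypothetical, and the straight-line fit is just one simple way to predict; it stands in for whatever predictive technique is actually used.

```python
from statistics import mean

# Hypothetical daily sales figures
sales = [10.0, 12.0, 14.0, 16.0, 18.0]


def describe(data):
    """Descriptive task: summarize general properties of the data."""
    return {"min": min(data), "max": max(data), "mean": mean(data)}


def predict_next(data):
    """Predictive task: fit a least-squares line through the data
    (x = position in the list) and extrapolate one step ahead."""
    n = len(data)
    xs = list(range(n))
    x_bar = mean(xs)
    y_bar = mean(data)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, data)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    intercept = y_bar - slope * x_bar
    return intercept + slope * n


print(describe(sales))
print(predict_next(sales))
```

The first function only reports what is already in the data, while the second uses the current data to estimate a value that has not been observed yet.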
