Hi All,
Data Mining is seldom talked about. We are all focused on the new features in SQL Server 2008 from the perspective of programming, manageability, availability, etc. I would like to talk a bit about the Data Mining offering in SQL Server. What exactly is Data Mining? Why do you need it? How does it benefit an organization? I plan to write a series of articles on data mining, but let me start with the “WHY” part of it… I hope I am successful in writing a series :)
Data Mining is the process of identifying hidden relationships and patterns in your data. I know that sounds very bookish, so let me explain. Sometimes organizations have difficult questions that need intelligent answers. For example, your company sells products to millions of customers and wants to know why a customer purchases your product. What are the decision-making criteria before a customer decides to buy? Which factors influence that decision the most?
With millions of customers in your database, how would you answer such a question? Can a human being comprehend or analyze such a huge amount of data? Will you have the capacity to browse transaction by transaction (row by row) to fathom the attribute values in your customers/orders tables? For sure, this is not practical.
Data mining comes to the rescue. The company needs it because the data is too complex, too large, or both, and casual human observation cannot do the job. The company needs answers to these difficult questions because it wants to take action based on the discoveries, that is, on the hidden relationships. This is the actionable information the company is after. There are many scenarios where data mining can be used, such as:
• Forecasting sales
• Targeting mailings toward specific customers
• Determining which products are likely to be sold together
• Finding sequences in the order that customers add products to a shopping cart
There are certainly more examples: US agencies use Data Mining to detect fraud, financial institutions use it to identify potential defaulters, etc.
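To make the "products likely to be sold together" scenario a bit more concrete, here is a tiny Python sketch that simply counts how often each pair of products shows up in the same order. This is only a toy illustration of the idea and not how the SQL Server mining algorithms work; the order data and the `pair_counts` helper are made up for this example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical orders: each set holds the products bought in one transaction.
orders = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "cereal"},
    {"bread", "butter", "cereal"},
]

def pair_counts(transactions):
    """Count how many transactions contain each unordered pair of products."""
    counts = Counter()
    for basket in transactions:
        # Sort so that (bread, butter) and (butter, bread) are the same key.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts

counts = pair_counts(orders)
top_pair, top_count = counts.most_common(1)[0]
print(top_pair, top_count)
```

With only four orders you could spot the most frequent pair by eye. With millions of rows you cannot, and that is exactly where a data mining tool earns its keep.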
The subject I am trying to teach here is Predictive Analysis. Data Mining is also known as the KDD (Knowledge Discovery in Databases) process.
In my next post, I shall talk more about the Data Mining process and the project life cycle. Stay tuned.
Blogging this from the 3rd row of Classroom 2 of Building 40 at the MS campus in Redmond, US :)
Take care, Amit
Tuesday, July 29, 2008
Some thoughts on Data Mining in SQL Server 2008-Part I





1 comments:
Very Good article explaining from scratch.