Useful data mining books for beginners
Want to learn about data mining but don’t know where to start? Did you Google the term “data mining” and feel overwhelmed by the 15.3 million hits? Did you search for data mining books on Amazon and wonder which of the 48 books to choose? Well, all of us have been through this stage. Personally, I would recommend two books to get you started.
- “Data Preparation for Data Mining” by Dorian Pyle. This is one of the few books (perhaps the only one?) that describes techniques for data preparation. Data preparation is an essential step in the data exploration process. Without proper preparation of the data, even the best data mining tools in the world will not be able to produce an accurate model. However, most data mining books describe data preparation only briefly, so this book is an essential read for those who wish to venture into data mining. Do note that some of the techniques described in the book are seldom mentioned in the academic literature. Is that because they are so basic that authors feel no need to describe them in the Methods section of a publication? I am not sure.
- “Data Mining: Practical Machine Learning Tools and Techniques” by Ian H. Witten and Eibe Frank. The first part of this book introduces data mining techniques and how they can be used in real-life situations, without going too deeply into the mathematical details of each algorithm. It is therefore an easy read for those who are not so mathematically inclined. However, the book may not suit those who want more detail on the various machine learning algorithms and how each can be applied, because it tends to describe decision trees more than other algorithms. The second part of the book is basically a manual for the free data mining software Weka, which gives beginners a useful tool to start their data mining careers.