Sebastian Raschka
Last updated: 07/14/2014

[back to the main [pattern_classification repository](https://github.com/rasbt/pattern_classification)]

# Free Machine Learning eBooks

**Machine Learning, Neural and Statistical Classification**

[http://www1.maths.leeds.ac.uk/~charles/statlog/](http://www1.maths.leeds.ac.uk/~charles/statlog/)

- Authors: D. Michie, D.J. Spiegelhalter, C.C. Taylor (eds)
- Year: 1994
- Pages: 290

From the Back Cover: This book is based on the EC (ESPRIT) project StatLog, which compared and evaluated a range of classification techniques, with an assessment of their merits, disadvantages, and range of application. This integrated volume provides a concise introduction to each method and reviews comparative trials in large-scale commercial and industrial problems. It makes accessible to a wide range of workers the complex issue of classification as approached through machine learning, statistics, and neural networks, encouraging a cross-fertilization between these disciplines.

**Inductive Logic Programming Techniques and Applications**

[http://www-ai.ijs.si/SasoDzeroski/ILPBook/](http://www-ai.ijs.si/SasoDzeroski/ILPBook/)

- Authors: Nada Lavrac and Saso Dzeroski
- Year: 1994
- Pages: 400

From the Back Cover: This book is an introduction to inductive logic programming (ILP), a research field at the intersection of machine learning and logic programming, which aims at a formal framework as well as practical algorithms for inductively learning relational descriptions in the form of logic programs. The book extensively covers empirical inductive logic programming, one of the two major subfields of ILP, which has already shown its application potential in the following areas: knowledge acquisition, inductive program synthesis, inductive data engineering, and knowledge discovery in databases. The book provides the reader with an in-depth understanding of empirical ILP techniques and applications. It is divided into four parts. Part I is an introduction to the field of ILP. Part II describes in detail empirical ILP techniques and systems. Part III presents the techniques of handling imperfect data in ILP, whereas Part IV gives an overview of empirical ILP applications.

**Practical Artificial Intelligence Programming With Java**

[http://markwatson.com/opencontent_data/JavaAI3rd.pdf](http://markwatson.com/opencontent_data/JavaAI3rd.pdf)

- Author: Mark Watson
- Year: 2008
- Pages: 210

Author Preface: I wrote this book for both professional programmers and home hobbyists who already know how to program in Java and who want to learn practical Artificial Intelligence (AI) programming and information processing techniques. I have tried to make this an enjoyable book to work through. In the style of a “cook book,” the chapters can be studied in any order. Each chapter follows the same pattern: a motivation for learning a technique, some theory for the technique, and a Java example program that you can experiment with.

**Information Theory, Inference, and Learning Algorithms**

[http://www.inference.phy.cam.ac.uk/itila/book.html](http://www.inference.phy.cam.ac.uk/itila/book.html)

- Author: David J.C. MacKay
- Year: 2003
- Pages: 628

Author Preface: This book is aimed at senior undergraduates and graduate students in Engineering, Science, Mathematics, and Computing. It expects familiarity with calculus, probability theory, and linear algebra as taught in a first- or second-year undergraduate course on mathematics for scientists and engineers. Conventional courses on information theory cover not only the beautiful theoretical ideas of Shannon, but also practical solutions to communication problems. This book goes further, bringing in Bayesian data modelling, Monte Carlo methods, variational methods, clustering algorithms, and neural networks. Why unify information theory and machine learning? Because they are two sides of the same coin. In the 1960s, a single field, cybernetics, was populated by information theorists, computer scientists, and neuroscientists, all studying common problems. Information theory and machine learning still belong together. Brains are the ultimate compression and communication systems. And the state-of-the-art algorithms for both data compression and error-correcting codes use the same tools as machine learning.

**A Course in Machine Learning**

[http://ciml.info](http://ciml.info)

- Author: Hal Daumé III
- Year: 2012
- Pages: 189

Book Description: Machine learning is a broad and fascinating field. It has been called one of the sexiest fields to work in. It has applications in an incredibly wide variety of application areas, from medicine to advertising, from military to pedestrian. Its importance is likely to grow, as more and more areas turn to it as a way of dealing with the massive amounts of data available. The purpose of this book is to provide a gentle and pedagogically organized introduction to the field. This is in contrast to most existing machine learning texts, which tend to organize things topically, rather than pedagogically (an exception is Mitchell’s book, but unfortunately that is getting more and more outdated). This makes sense for researchers in the field, but less sense for learners. A second goal of this book is to provide a view of machine learning that focuses on ideas and models, not on math. It is not possible (or even advisable) to avoid math. But math should be there to aid understanding, not hinder it. Finally, this book attempts to have minimal dependencies, so that one can fairly easily pick and choose chapters to read. When dependencies exist, they are listed at the start of the chapter, as well as the list of dependencies at the end of this chapter. The audience of this book is anyone who knows differential calculus and discrete math, and can program reasonably well. (A little bit of linear algebra and probability will not hurt.) An undergraduate in their fourth or fifth semester should be fully capable of understanding this material. However, it should also be suitable for first year graduate students, perhaps at a slightly faster pace.

**Bayesian Reasoning and Machine Learning**

[http://www.cs.ucl.ac.uk/staff/d.barber/brml/](http://www.cs.ucl.ac.uk/staff/d.barber/brml/)

- Author: David Barber
- Year: 2014
- Pages: 648

Book Description: The book begins with the basic concepts of graphical models and inference. For the independent reader, chapters 1, 2, 3, 4, 5, 9, 10, 13, 14, 15, 16, 17, 21, and 23 would form a good introduction to probabilistic reasoning, modelling, and machine learning. The material in chapters 19, 24, 25, and 28 is more advanced, with the remaining material being of more specialised interest. Note that in each chapter the level of material is of varying difficulty, typically with the more challenging material placed towards the end of each chapter. As an introduction to the area of probabilistic modelling, a course can be constructed from the material as indicated in the chart. The material from parts I and II has been successfully used for courses on graphical models. I have also taught an introduction to probabilistic machine learning using material largely from part III, as indicated. These two courses can be taught separately, and a useful approach would be to teach the Graphical Models course first, followed by a separate Probabilistic Machine Learning course. A short course on approximate inference can be constructed from introductory material in part I and the more advanced material in part V, as indicated. The exact inference methods in part I can be covered relatively quickly, with the material in part V considered in more depth. A time-series course can be made by using primarily the material in part IV, possibly combined with material from part I for students who are unfamiliar with probabilistic modelling approaches. Some of this material, particularly in chapter 25, is more advanced and can be deferred until the end of the course, or considered for a more advanced course. The references are generally to works at a level consistent with the book material and which are for the most part readily available.

**Introduction to Machine Learning**

[http://arxiv.org/pdf/0904.3664.pdf](http://arxiv.org/pdf/0904.3664.pdf)

- Author: Amnon Shashua
- Year: 2008
- Pages: 105

Book Description: A nice introductory book that covers the most important and basic topics: Bayesian Decision Theory, Maximum Likelihood / Maximum Entropy Duality, the EM Algorithm (ML over Mixtures of Distributions), Support Vector Machines and Kernel Functions, Spectral Analysis I (PCA, LDA, CCA), Spectral Analysis II (Clustering), ...

**The Elements of Statistical Learning: Data Mining, Inference, and Prediction**

[http://statweb.stanford.edu/~tibs/ElemStatLearn/](http://statweb.stanford.edu/~tibs/ElemStatLearn/)

- Authors: Trevor Hastie, Robert Tibshirani, Jerome Friedman
- Year: 2009
- Pages: 763

Book Description: During the past decade there has been an explosion in computation and information technology. With it has come vast amounts of data in a variety of fields such as medicine, biology, finance, and marketing. The challenge of understanding these data has led to the development of new tools in the field of statistics, and spawned new areas such as data mining, machine learning, and bioinformatics. Many of these tools have common underpinnings but are often expressed with different terminology. This book describes the important ideas in these areas in a common conceptual framework. While the approach is statistical, the emphasis is on concepts rather than mathematics. Many examples are given, with a liberal use of color graphics. It should be a valuable resource for statisticians and anyone interested in data mining in science or industry. The book's coverage is broad, from supervised learning (prediction) to unsupervised learning. The many topics include neural networks, support vector machines, classification trees, and boosting (the first comprehensive treatment of this topic in any book). This major new edition features many topics not covered in the original, including graphical models, random forests, ensemble methods, least angle regression and path algorithms for the lasso, non-negative matrix factorization, and spectral clustering. There is also a chapter on methods for "wide" data (*p* bigger than *n*), including multiple testing and false discovery rates. Trevor Hastie, Robert Tibshirani, and Jerome Friedman are professors of statistics at Stanford University. They are prominent researchers in this area: Hastie and Tibshirani developed generalized additive models and wrote a popular book of that title. Hastie wrote much of the statistical modeling software in S-PLUS and invented principal curves and surfaces. Tibshirani proposed the lasso and is co-author of the very successful *An Introduction to the Bootstrap*. Friedman is the co-inventor of many data-mining tools, including CART, MARS, and projection pursuit.

**Gaussian Processes for Machine Learning**

[http://www.gaussianprocess.org/gpml/chapters/](http://www.gaussianprocess.org/gpml/chapters/)

- Authors: Carl Edward Rasmussen and Christopher K. I. Williams
- Year: 2006
- Pages: 266

Book Description: Gaussian processes (GPs) provide a principled, practical, probabilistic approach to learning in kernel machines. GPs have received increased attention in the machine-learning community over the past decade, and this book provides a long-needed systematic and unified treatment of theoretical and practical aspects of GPs in machine learning. The treatment is comprehensive and self-contained, targeted at researchers and students in machine learning and applied statistics. The book deals with the supervised-learning problem for both regression and classification, and includes detailed algorithms. A wide variety of covariance (kernel) functions are presented and their properties discussed. Model selection is discussed both from a Bayesian and a classical perspective. Many connections to other well-known techniques from machine learning and statistics are discussed, including support-vector machines, neural networks, splines, regularization networks, relevance vector machines, and others. Theoretical issues including learning curves and the PAC-Bayesian framework are treated, and several approximation methods for learning with large datasets are discussed. The book contains illustrative examples and exercises, and code and datasets are available on the Web. Appendixes provide mathematical background and a discussion of Gaussian Markov processes.

**Reinforcement Learning: An Introduction**

[http://www.cse.wustl.edu/~kilian/introductions/reinforcement_learning.pdf](http://www.cse.wustl.edu/~kilian/introductions/reinforcement_learning.pdf)

- Authors: Richard S. Sutton and Andrew G. Barto
- Year: 1998
- Pages: 322

Book Description: Reinforcement learning, one of the most active research areas in artificial intelligence, is a computational approach to learning whereby an agent tries to maximize the total amount of reward it receives when interacting with a complex, uncertain environment. In *Reinforcement Learning*, Richard Sutton and Andrew Barto provide a clear and simple account of the key ideas and algorithms of reinforcement learning. Their discussion ranges from the history of the field's intellectual foundations to the most recent developments and applications. The only necessary mathematical background is familiarity with elementary concepts of probability. The book is divided into three parts. Part I defines the reinforcement learning problem in terms of Markov decision processes. Part II provides basic solution methods: dynamic programming, Monte Carlo methods, and temporal-difference learning. Part III presents a unified view of the solution methods and incorporates artificial neural networks, eligibility traces, and planning; the two final chapters present case studies and consider the future of reinforcement learning.

**Machine Learning**

[http://www.intechopen.com/books/machine_learning](http://www.intechopen.com/books/machine_learning)

- Authors: Abdelhamid Mellouk and Abdennacer Chebira
- Year: 2009
- Pages: 450

Book Description: Machine learning can be defined in various ways; broadly, it is a scientific domain concerned with the design and development of theoretical and implementation tools that allow building systems with some human-like intelligent behavior. More specifically, machine learning addresses the ability of such systems to improve automatically through experience.

**Reinforcement Learning**

[http://www.intechopen.com/books/reinforcement_learning](http://www.intechopen.com/books/reinforcement_learning)

- Authors: Cornelius Weber, Mark Elshaw and Norbert Michael Mayer
- Year: 2008
- Pages: 434

Book Description: Brains rule the world, and brain-like computation is increasingly used in computers and electronic devices. Brain-like computation is about processing and interpreting data or directly putting forward and performing actions. Learning is a very important aspect. This book is on reinforcement learning, which involves performing actions to achieve a goal. The first 11 chapters of this book describe and extend the scope of reinforcement learning. The remaining 11 chapters show that there is already wide usage in numerous fields. Reinforcement learning can tackle control tasks that are too complex for traditional, hand-designed, non-learning controllers. As learning computers can deal with technical complexities, the task of human operators remains to specify goals on increasingly higher levels. This book shows that reinforcement learning is a very dynamic area in terms of theory and applications, and it should stimulate and encourage new research in this field.

**An Introduction to Statistical Learning with Applications in R**

[http://www-bcf.usc.edu/%7Egareth/ISL/](http://www-bcf.usc.edu/%7Egareth/ISL/)

- Authors: Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani
- Year: 2013
- Pages: 426

Book Description: This book provides an introduction to statistical learning methods. It is aimed at upper-level undergraduate students, masters students, and Ph.D. students in the non-mathematical sciences. The book also contains a number of R labs with detailed explanations on how to implement the various methods in real-life settings, and should be a valuable resource for a practicing data scientist.