Automatic searching for useful patterns in large amounts of data
Simple representation of the data
Improved performance
Instance-based learning: just memorize all the possible sets of values
ML has some overlap with data mining (DM), but DM involves more preprocessing, while ML uses processed data
With real-world data, we likely won't be able to produce a perfect model anyway, so using less data to build the model gives us a more realistic version of what to expect
Tells how many instances were correctly classified and how many were incorrectly classified. Ideally, all the nonzero values should lie on the diagonal of the matrix (sort of like an identity matrix)
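A minimal sketch of building such a matrix from actual and predicted labels. The helper name `confusion_matrix`, the row/column convention (rows = actual, columns = predicted) and the toy labels are assumptions for illustration only.

```python
from collections import Counter

def confusion_matrix(actual, predicted, labels):
    """Rows are actual classes, columns are predicted classes."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]

# Toy labels, made up for this example.
actual = ["yes", "yes", "no", "no", "yes"]
predicted = ["yes", "no", "no", "no", "yes"]

matrix = confusion_matrix(actual, predicted, ["yes", "no"])

# Correct classifications sit on the diagonal.
correct = sum(matrix[i][i] for i in range(len(matrix)))
incorrect = len(actual) - correct
```

Here the off-diagonal entries count the misclassified instances, so a perfect classifier would leave them all at zero.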
No rules; just predict the majority value.
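A minimal sketch of this majority-value baseline, assuming the training labels are available as a plain list. The function name `majority_class` and the labels are hypothetical.

```python
from collections import Counter

def majority_class(labels):
    """Return the most frequent class label; every instance gets this prediction."""
    return Counter(labels).most_common(1)[0][0]

# Toy training labels, made up for this example.
training_labels = ["yes", "no", "yes", "yes", "no"]
prediction = majority_class(training_labels)
```

Every new instance receives the same prediction, which makes this a useful baseline to compare other classifiers against.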
Choose a single attribute for classification; choose the one with the fewest errors
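A rough sketch of the single-attribute idea, assuming nominal attributes stored as dictionaries. For each attribute, every value predicts its majority class, and the attribute with the fewest training errors wins. The function name `one_r` and the toy rows are hypothetical.

```python
from collections import Counter, defaultdict

def one_r(rows, attributes, target):
    """Return (attribute, rule, errors) for the attribute with the fewest errors."""
    best = None
    for attr in attributes:
        # Count class labels for each value of this attribute.
        by_value = defaultdict(Counter)
        for row in rows:
            by_value[row[attr]][row[target]] += 1
        # Each attribute value predicts its majority class.
        rule = {v: c.most_common(1)[0][0] for v, c in by_value.items()}
        # Errors are the instances outside the majority class of their value.
        errors = sum(sum(c.values()) - max(c.values()) for c in by_value.values())
        if best is None or errors < best[2]:
            best = (attr, rule, errors)
    return best

# Toy data, made up for this example.
rows = [
    {"outlook": "sunny", "windy": "false", "play": "no"},
    {"outlook": "sunny", "windy": "true", "play": "no"},
    {"outlook": "overcast", "windy": "false", "play": "yes"},
    {"outlook": "rainy", "windy": "false", "play": "yes"},
    {"outlook": "rainy", "windy": "true", "play": "no"},
    {"outlook": "overcast", "windy": "true", "play": "yes"},
]

attr, rule, errors = one_r(rows, ["outlook", "windy"], "play")
```

On this toy data `outlook` misclassifies one instance and `windy` misclassifies two, so `outlook` is chosen.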
Divide up the numeric values into intervals
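One possible way to divide up numeric values is equal-width binning, sketched below. This is an assumption: the notes do not say which splitting scheme is meant, and other schemes (for example, splitting where the class changes) are also common. The function name and the values are hypothetical.

```python
def equal_width_bins(values, n_bins):
    """Map each numeric value to a bin index in the range 0 .. n_bins - 1."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    bins = []
    for v in values:
        idx = int((v - lo) / width) if width > 0 else 0
        # The maximum value would land in bin n_bins, so clamp it to the last bin.
        bins.append(min(idx, n_bins - 1))
    return bins

# Toy numeric values, made up for this example.
temperatures = [64, 65, 68, 70, 72, 75, 80, 85]
bins = equal_width_bins(temperatures, 3)
```

Each bin index can then be treated as a nominal value of the attribute.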
Can put a plane between data-points in different classes
if $w \cdot a > 0$ predict class A else predict class B
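A small sketch of this decision rule, assuming $w$ and $a$ are plain lists of numbers of equal length. The weights and instances are made up for illustration, and no bias term is included because the rule above does not show one.

```python
def predict(w, a):
    """Predict class A when the dot product w . a is positive, otherwise class B."""
    score = sum(wi * ai for wi, ai in zip(w, a))
    return "A" if score > 0 else "B"

# Hypothetical weight vector and instances.
w = [2.0, -1.0]
first = predict(w, [1.0, 0.5])   # score is 1.5
second = predict(w, [0.2, 1.0])  # score is -0.6
```

The set of points where $w \cdot a = 0$ is the separating plane; instances on either side of it receive different classes.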
From Wikipedia: instead of performing explicit generalization, instance-based learning compares new problem instances with instances seen in training, which have been stored in memory. Instance-based learning is a kind of lazy learning.
Instance-based learning involves storing all of the instances of the training set. Given a new instance, a, predict its class according to its nearest neighbour
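A minimal sketch of nearest-neighbour prediction, assuming numeric features and Euclidean distance (one choice of distance among several). The function name `nearest_neighbour` and the training instances are hypothetical; `math.dist` requires Python 3.8 or later.

```python
import math

def nearest_neighbour(training, a):
    """Return the class of the stored instance closest to the new instance a."""
    features, label = min(training, key=lambda inst: math.dist(inst[0], a))
    return label

# Stored training set of (features, class) pairs, made up for this example.
training = [
    ((1.0, 1.0), "A"),
    ((1.5, 2.0), "A"),
    ((5.0, 5.0), "B"),
    ((6.0, 4.5), "B"),
]

near_a = nearest_neighbour(training, (1.2, 1.4))
near_b = nearest_neighbour(training, (5.5, 5.0))
```

No model is built in advance: all the work happens at prediction time, which is why this counts as lazy learning.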
Ways to find nearest neighbour: