6. What does machine learning do?
- Machine learning is nothing but a geometry problem
- Prerequisite: regression (a neuron) - the most basic ML problem
- Line of best fit (of the data)
- ŷ = wx + b (one input feature)
- ŷ = w₁x₁ + w₂x₂ + b (two features)
- ŷ = wᵀx + b (w, x are vectors)
- convention: size of x = D
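A minimal numpy sketch of the dot-product form; the values of w, b, and x below are made up for illustration:

```python
import numpy as np

# Made-up weights, bias, and one sample with D = 3 features
w = np.array([0.5, -1.2, 2.0])   # weight vector, length D
b = 0.3                          # bias (scalar)
x = np.array([1.0, 2.0, 0.5])    # one sample, length D

y_hat = w.dot(x) + b             # the dot-product form: w^T x + b
print(y_hat)                     # 0.5 - 2.4 + 1.0 + 0.3 = -0.6
```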
- Linear regression - predicts a continuous value (regression)
- Linear regression provides the groundwork for progressing to DL
- Logistic regression - predicts a category (classification) by learning a decision boundary
- All data looks the same to the algorithm; only the semantics change
- You can plug in any dataset from your own area of interest; the algorithm doesn't change
- A neural network is nothing but a bunch of logistic regressions linked together (see the sketch below)
- decision boundary is the line/hyperplane: wᵀx + b = 0
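A minimal sketch of that idea, assuming a tiny 2-layer network with made-up sizes and random weights:

```python
import numpy as np

def sigmoid(a):
    return 1 / (1 + np.exp(-a))

# Each hidden unit is itself a logistic regression on x;
# the output unit is a logistic regression on the hidden activations
np.random.seed(0)
D, H = 3, 4                      # input size and hidden size (arbitrary)
W1 = np.random.randn(D, H)       # first-layer weights
b1 = np.zeros(H)
W2 = np.random.randn(H)          # second-layer weights
b2 = 0.0

x = np.random.randn(D)           # one sample
h = sigmoid(x.dot(W1) + b1)      # H logistic regressions in parallel
p = sigmoid(h.dot(W2) + b2)      # one more on top: p(y = 1 | x)
```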
- Making predictions
- encode categories as 0 and 1
- sigmoid always outputs a number between 0 and 1
- interpret it as a probability
- p(y = 1 | x) = σ(wᵀx + b) - binary logistic regression (a neuron)
- σ(a) = 1 / (1 + exp(-a))
- prediction = round(p(y = 1 | x))
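A sketch of this neuron in numpy, with made-up values for w, b, and x:

```python
import numpy as np

def sigmoid(a):
    # sigma(a) = 1 / (1 + exp(-a)); output is always in (0, 1)
    return 1 / (1 + np.exp(-a))

# Made-up weights and one sample
w = np.array([0.5, -1.2])
b = 0.1
x = np.array([2.0, 1.0])

p = sigmoid(w.dot(x) + b)        # p(y = 1 | x)
prediction = round(p)            # 0 or 1
```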
- dotting vs looping - prefer the vectorized dot product over an explicit loop
- numpy: a.dot(b)
- samples X of shape N x D
- N = number of samples
- D = number of features
- lowercase for 1 sample, uppercase for multiple samples
- prediction = sigmoid(X.dot(w) + b), where X is N x D and w is a vector of length D
- numpy uses broadcasting to add the scalar b
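A sketch contrasting the vectorized form with the explicit loop it replaces, on random made-up data:

```python
import numpy as np

def sigmoid(a):
    return 1 / (1 + np.exp(-a))

np.random.seed(0)
N, D = 5, 3                      # 5 samples, 3 features (arbitrary)
X = np.random.randn(N, D)        # uppercase X: multiple samples
w = np.random.randn(D)           # weight vector of length D
b = 0.5                          # scalar bias

# vectorized: X.dot(w) has shape (N,); numpy broadcasts the scalar b
p = sigmoid(X.dot(w) + b)

# the equivalent loop, shown only to illustrate what dotting replaces
p_loop = np.array([sigmoid(X[i].dot(w) + b) for i in range(N)])
assert np.allclose(p, p_loop)
```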
- we still don't know how to choose w
- Central is the cost function (loss function); ML is nothing but a probability problem
- we try to solve the maximum-likelihood problem
- binary cross-entropy ==> find the likelihood -> log it -> negate it
- J = -Σ_{i=1..N} [ tᵢ log(ŷᵢ) + (1 - tᵢ) log(1 - ŷᵢ) ]
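A minimal numpy version of this cost, on made-up targets and probabilities (the eps clipping is an added safeguard, not part of the formula):

```python
import numpy as np

def binary_cross_entropy(t, y_hat, eps=1e-12):
    # J = -sum_i [ t_i log(y_hat_i) + (1 - t_i) log(1 - y_hat_i) ]
    y_hat = np.clip(y_hat, eps, 1 - eps)   # guard against log(0)
    return -np.sum(t * np.log(y_hat) + (1 - t) * np.log(1 - y_hat))

# Tiny made-up example: targets and predicted probabilities
t = np.array([1.0, 0.0, 1.0])
y_hat = np.array([0.9, 0.2, 0.7])
J = binary_cross_entropy(t, y_hat)         # lower is better
```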
- gradient descent for optimization
- w ← w - η∇J (η = learning rate)
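A sketch of the update rule on a made-up toy problem; the gradient expressions follow from the cross-entropy above, and eta plays the role of η:

```python
import numpy as np

def sigmoid(a):
    return 1 / (1 + np.exp(-a))

# Made-up, linearly separable toy data
np.random.seed(1)
N, D = 100, 2
X = np.random.randn(N, D)
t = (X[:, 0] + X[:, 1] > 0).astype(float)

w = np.zeros(D)
b = 0.0
eta = 0.1                        # learning rate

for _ in range(1000):
    y_hat = sigmoid(X.dot(w) + b)
    # gradient of the (mean) cross-entropy J with respect to w and b
    grad_w = X.T.dot(y_hat - t) / N
    grad_b = np.mean(y_hat - t)
    w = w - eta * grad_w         # w <- w - eta * grad J
    b = b - eta * grad_b
```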
- regularization (L1 - sparsity, L2 - small weights)
- regularization ensures the weights do not go to infinity (see the L2 sketch below)
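A sketch of L2 regularization added to the same training loop, with a made-up value for the strength lam:

```python
import numpy as np

def sigmoid(a):
    return 1 / (1 + np.exp(-a))

np.random.seed(2)
N, D = 100, 2
X = np.random.randn(N, D)
t = (X[:, 0] > 0).astype(float)

w, b = np.zeros(D), 0.0
eta, lam = 0.1, 0.01             # learning rate and L2 strength (made up)

for _ in range(1000):
    y_hat = sigmoid(X.dot(w) + b)
    # the L2 penalty lam * ||w||^2 adds 2 * lam * w to the gradient,
    # pulling every weight back toward zero on each step
    grad_w = X.T.dot(y_hat - t) / N + 2 * lam * w
    b -= eta * np.mean(y_hat - t)
    w -= eta * grad_w
```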
- order of topics: prediction first, then training