Logistic Regression

Sigmoid/Logistic function

$$g(z) = {1 \over 1 + e^{-z}}$$
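
A minimal NumPy sketch of the sigmoid (the function name and use of NumPy are my own choices, not from the notes):

```python
import numpy as np

def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^{-z}); works on scalars or arrays."""
    return 1.0 / (1.0 + np.exp(-z))
```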

Hypothesis

$$h_\theta(x) = g(\theta \cdot x) = {1 \over 1 + e^{-\theta \cdot x}}$$

  • $y = 1$ if $h_\theta(x) \ge 0.5$ (i.e. $\theta \cdot x \ge 0$)
  • $y = 0$ if $h_\theta(x) < 0.5$ (i.e. $\theta \cdot x < 0$)
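
Because $h_\theta(x) \ge 0.5$ exactly when $\theta \cdot x \ge 0$, prediction never needs the sigmoid itself. A sketch, assuming `X` is an $(m, n)$ design matrix and `theta` an $(n,)$ vector:

```python
def predict(theta, X):
    """Label 1 where theta . x >= 0 (equivalently h_theta(x) >= 0.5), else 0."""
    return (X @ theta >= 0).astype(int)
```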

Logistic Regression Cost Function

$$J(\theta) = -{1 \over m} \left[ \sum_{i=1}^m y_i \log h_\theta(x_i) + (1 - y_i) \log\big(1 - h_\theta(x_i)\big) \right]$$
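
A direct translation of this cost into NumPy, assuming `y` is a 0/1 label vector of length $m$ and reusing the `sigmoid` sketch above:

```python
def cost(theta, X, y):
    """Cross-entropy cost J(theta), averaged over the m training examples."""
    m = len(y)
    h = sigmoid(X @ theta)
    return -(y @ np.log(h) + (1 - y) @ np.log(1 - h)) / m
```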

Gradient Descent

$$\min_\theta J(\theta)$$

Repeat: simultaneously update all $\theta_j$

$$\theta_j := \theta_j - \alpha {\partial \over \partial \theta_j} J(\theta)$$

which, for this cost function, works out to

$$\theta_j := \theta_j - {\alpha \over m} \sum_{i=1}^m \big(h_\theta(x_i) - y_i\big)\, x_{i,j}$$

where $x_{i,j}$ is the $j$-th feature of the $i$-th example (the ${1 \over m}$ factor carries over from $J(\theta)$).
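
A sketch of the loop; the learning rate `alpha` and iteration count are hypothetical defaults. The vectorized `X.T @ (...)` computes every partial derivative at once, so all $\theta_j$ are updated simultaneously:

```python
def gradient_descent(theta, X, y, alpha=0.1, iters=1000):
    """Batch gradient descent on the logistic regression cost."""
    m = len(y)
    for _ in range(iters):
        grad = X.T @ (sigmoid(X @ theta) - y) / m
        theta = theta - alpha * grad  # one simultaneous update of every theta_j
    return theta
```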

Regularization

$$J(\theta) = -{1 \over m} \left[ \sum_{i=1}^m y_i \log h_\theta(x_i) + (1 - y_i) \log\big(1 - h_\theta(x_i)\big) \right] + {\lambda \over 2m} \sum_{j=1}^n \theta_j^2$$

Note that the regularization sum starts at $j = 1$: by convention the bias term $\theta_0$ is not penalized.
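
A sketch building on the earlier `cost` function, assuming column 0 of `X` is the intercept column of ones so `theta[0]` is the unpenalized bias:

```python
def cost_reg(theta, X, y, lam=1.0):
    """Regularized cost: plain cross-entropy plus an L2 penalty on theta_1..theta_n."""
    m = len(y)
    penalty = lam / (2 * m) * np.sum(theta[1:] ** 2)  # skip theta_0 (bias)
    return cost(theta, X, y) + penalty
```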