• Linear Regression

    "A large, worn-out chalkboard filled with a complex web of statistical formulas, equations, and graphs in various colors of chalk. A piece of chalk hovers mid-air, as if magically drawing a new line on the board, creating a dynamic representation of data analysis. Dust particles gently float around, catching the light that streams into the room, highlighting the chalkboard's significance --v 5 --ar 5:2 --q 2 --stylize 1000"

    Introduction to Linear Regression

    Linear regression is a fundamental statistical method used to model the relationship between a dependent variable and one or more independent variables by fitting a straight line to a set of observed data points. This technique is widely used in machine learning and data analysis to make predictions and understand connections between variables.
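
    In the single-feature case, the fitted line has the form y = w * x + b, where w is the slope (weight) and b is the intercept (bias). As a minimal sketch with made-up numbers (not part of the example below), NumPy's polyfit can fit such a line directly:

    import numpy as np

    # A few made-up (x, y) points that roughly follow a straight line
    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

    # Fit a degree-1 polynomial (a straight line); polyfit returns [slope, intercept]
    w, b = np.polyfit(x, y, 1)

    # Use the fitted line to predict y for a new x
    print(f"w = {w:.2f}, b = {b:.2f}, prediction for x=6: {w * 6 + b:.2f}")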

    In this article, we’ll explore the basics of linear regression and implement a simple algorithm to predict house prices and classify whether an object is a cat. We’ll also discuss how to improve the algorithm’s performance using normalization and gradient descent.

    Linear Regression for House Price Prediction

    We’ll start by implementing a naive linear regression algorithm to predict the monthly rent of an apartment based on its features. In this example, we use the number of rooms and square meters as the independent variables and the monthly rent as the dependent variable. Here’s the Python code for the implementation:

    import numpy as np
    
    # I took the data from a local online rental website
    data = [
        # [
        #   number of rooms,
        #   square meters,
        #   price,
        # ]
        [3, 62, 798],
        [1, 35, 454],
        [2, 38, 615],
        [3, 100, 1474],
        [1, 37, 491],
        [2, 80, 921],
        [2, 82, 983],
        [2, 80, 1044],
        [3, 107, 1290],
        [2, 80, 1413],
    ]
    
    # This function represents the linear equation we use to make predictions. It takes weights w, a bias b, and an input
    # item, and calculates the prediction by multiplying the weights with the input and adding the bias.
    def model(w, b, item):
        # Calculate the prediction using the linear equation: w * item + b. Note that both `w` and `item` are vectors,
        # but that doesn't matter as we can use dot product
        return np.dot(w, item) + b
    
    # This function calculates how much the weights w and the bias b should be adjusted. It finds the difference between
    # the predicted values and the actual values, called errors, and computes the gradient (or slope) for both w and b.
    def compute_gradient(data, w, b):
        # Separate the input features (x) and target values (y) from the data
        x, y = data[:, :-1], data[:, -1]
        # Make predictions using the model function
        predictions = model(w, b, x.T)
        # Compute the errors between predictions and actual target values
        errors = predictions - y
        # Calculate the gradients for weights (w) and bias (b)
        gradient_w = np.dot(errors, x) / len(data)
        gradient_b = np.mean(errors)
        return gradient_w, gradient_b
    
    # This function updates the weights w and the bias b repeatedly, using the gradients computed in the previous function.
    # It does this for a certain number of iterations, making the adjustments smaller and smaller with a factor called the
    # learning rate alpha.
    def gradient_descent(data, w, b, alpha, iterations):
        for i in range(iterations):
            # Compute the gradients for weights (w) and bias (b)
            gradient_w, gradient_b = compute_gradient(data, w, b)
            # Update the weights (w) and bias (b) using the learning rate (alpha)
            w -= alpha * gradient_w
            b -= alpha * gradient_b
            # Print the current iteration, updated weights, and bias
            print(f"Iteration: {i}, w: {w}, b: {b}")
        return w, b
    
    # This function scales the input data to values between 0 and 1. It helps the algorithm to work better and converge
    # faster.
    def normalize_data(data):
        # Calculate the minimum and maximum values for each feature in the data
        min_vals = np.min(data, axis=0)
        max_vals = np.max(data, axis=0)
        # Normalize the data using the minimum and maximum values
        return (data - min_vals) / (max_vals - min_vals), min_vals, max_vals
    
    # Normalize the input data
    normalized_data, min_vals, max_vals = normalize_data(data)
    
    # Initialize weights and bias to zeros
    initial_w = np.zeros(normalized_data.shape[1] - 1)
    initial_b = 0
    
    # Set the learning rate (alpha) and the number of iterations for gradient descent
    alpha = 0.001
    iterations = 1000000
    
    # Perform gradient descent on the normalized data
    w, b = gradient_descent(normalized_data, initial_w, initial_b, alpha, iterations)
    print(f"Final weights: w: {w}, b: {b}")
    
    # Create an input item to make a prediction. How much would it cost for 4 rooms and 80 square meters?
    input_item = np.array([4, 80])
    
    # Normalize the input item using the same minimum and maximum values
    normalized_input_item = (
        input_item - min_vals[:-1]) / (max_vals[:-1] - min_vals[:-1])
    
    # Make a prediction using the normalized input item and previously calculated weights
    normalized_result = model(w, b, normalized_input_item)
    
    # Convert the predicted value back to the original scale
    result = normalized_result * (max_vals[-1] - min_vals[-1]) + min_vals[-1]
    print("Result: {}".format(result))

    Here's how the output looks:

    Iteration: 999997, w: [0.03570123 0.87858858], b: 0.03666023229940503
    Iteration: 999998, w: [0.03570123 0.87858858], b: 0.03666023229940503
    Iteration: 999999, w: [0.03570123 0.87858858], b: 0.03666023229940503
    Final weights: w: [0.03570123 0.87858858], b: 0.03666023229940503
    Result: 1106.1165418422659

    The model predicts that we would need about $1,106 to rent an apartment with 4 rooms and 80 square meters :)
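
    As an optional sanity check (assuming you have scikit-learn installed; it isn't used anywhere else in this article), you can fit scikit-learn's LinearRegression on the same raw data and compare its prediction with ours:

    import numpy as np
    from sklearn.linear_model import LinearRegression

    # Reuse the `data` list from the snippet above
    raw = np.array(data)
    X, y = raw[:, :-1], raw[:, -1]  # features: rooms, square meters; target: rent

    # Ordinary least squares fit (closed form, no gradient descent needed)
    reg = LinearRegression().fit(X, y)
    print(reg.predict([[4, 80]]))  # should land in the same ballpark as our result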

    Classification: Is it a cat?

    Linear regression can also be used for classification problems by applying a suitable activation function to its output (this combination is essentially logistic regression). In this example, we’ll use the sigmoid function to classify whether an object is a cat based on its features. The code implementation is as follows:

    import numpy as np
    
    data = [
        # [
        #   number of legs,
        #   weight in g,
        #   has fur,
        #   has tail,
        #   is a cat (label),
        # ]
        [2, 80000, 0, 0, 0],  # a fellow human
        [2, 48000, 0, 0, 0],  # a nice lady
        [2, 62000, 0, 0, 0],  # another nice lady
        [2, 102000, 0, 0, 0],  # fair bodybuilder
        [6, 101, 1, 0, 0],  # that's a creepy big spider
        [6, 87, 1, 0, 0],  # a bit less creepy spider
        [6, 24, 1, 0, 0],  # a tiny spider, cute one
        [4, 7100, 1, 1, 1],  # a cat named "Fluffy"
        [4, 4000, 1, 1, 1],  # a cat named "Snowball"
        [4, 3000, 1, 1, 1],  # a cat named "Mr. Tinkles"
    ]
    
    # Sigmoid function to map linear combination to a value between 0 and 1
    def sigmoid(x):
        return 1 / (1 + np.exp(-x))
    
    # This function represents the linear equation we use to make predictions. It takes weights w, a bias b, and an input
    # item, and calculates the prediction by multiplying the weights with the input and adding the bias.
    def model(w, b, item):
        linear_combination = np.dot(w, item) + b
        return sigmoid(linear_combination)
    
    # This function calculates how much the weights w and the bias b should be adjusted. It finds the difference between
    # the predicted values and the actual values, called errors, and computes the gradient (or slope) for both w and b.
    def compute_gradient(data, w, b):
        # Separate the input features (x) and target values (y) from the data
        x, y = data[:, :-1], data[:, -1]
        # Make predictions using the model function
        predictions = model(w, b, x.T)
        # Compute the errors between predictions and actual target values
        errors = predictions - y
        # Calculate the gradients for weights (w) and bias (b)
        gradient_w = np.dot(errors, x) / len(data)
        gradient_b = np.mean(errors)
        return gradient_w, gradient_b
    
    # This function updates the weights w and the bias b repeatedly, using the gradients computed in the previous function.
    # It does this for a certain number of iterations, making the adjustments smaller and smaller with a factor called the
    # learning rate alpha.
    def gradient_descent(data, w, b, alpha, iterations):
        for i in range(iterations):
            # Compute the gradients for weights (w) and bias (b)
            gradient_w, gradient_b = compute_gradient(data, w, b)
            # Update the weights (w) and bias (b) using the learning rate (alpha)
            w -= alpha * gradient_w
            b -= alpha * gradient_b
            # Print the current iteration, updated weights, and bias
            print(f"Iteration: {i}, w: {w}, b: {b}")
        return w, b
    
    # This function scales the input data to values between 0 and 1. It helps the algorithm to work better and converge
    # faster.
    def normalize_data(data):
        # Calculate the minimum and maximum values for each feature in the data
        min_vals = np.min(data, axis=0)
        max_vals = np.max(data, axis=0)
        # Normalize the data using the minimum and maximum values
        return (data - min_vals) / (max_vals - min_vals), min_vals, max_vals
    
    
    # Normalize the input data
    normalized_data, min_vals, max_vals = normalize_data(data)
    
    # Initialize weights and bias to zeros
    initial_w = np.zeros(normalized_data.shape[1] - 1)
    initial_b = 0
    
    # Set the learning rate (alpha) and the number of iterations for gradient descent
    alpha = 0.001
    iterations = 1000000
    
    # Perform gradient descent on the normalized data
    w, b = gradient_descent(normalized_data, initial_w,
                            initial_b, alpha, iterations)
    print(f"Final weights: w: {w}, b: {b}")
    
    # Create an input item to make a prediction. Is an object with 4 legs, weighing 80 g, with no fur and no tail a cat?
    input_item = np.array([4, 80, 0, 0])
    
    # Normalize the input item using the same minimum and maximum values
    normalized_input_item = (
        input_item - min_vals[:-1]) / (max_vals[:-1] - min_vals[:-1])
    
    # Make a prediction using the normalized input item and previously calculated weights
    normalized_result = model(w, b, normalized_input_item)
    
    # Convert the predicted value back to the original scale (a no-op here, since the labels already range from 0 to 1)
    result = normalized_result * (max_vals[-1] - min_vals[-1]) + min_vals[-1]
    print("Result: {}".format(result))

    Here's how the output looks for two different inputs:

    # input [4, 80, 0, 0]
    Final weights: w: [-2.96788065 -3.20259811  1.58331402  9.10238936], b: -3.9189177494239376
    Result: 0.004475655395234176

    # input [4, 10000, 1, 1]
    Final weights: w: [-2.96788065 -3.20259811  1.58331402  9.10238936], b: -3.9189177494239376
    Result: 0.9931016099695166

    The output shows that our model is confident that an object with 4 legs, weighing 80 g, with no fur and no tail is not a cat (a score close to 0), whereas an object with 4 legs, weighing 10 kg, with fur and a tail is very likely a cat (a score close to 1).
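
    Since the sigmoid squashes the output into the (0, 1) range, you can read it as a confidence score. To turn it into a hard yes/no answer, you would typically apply a threshold; here is a tiny sketch of that final step (the 0.5 threshold is an arbitrary but common choice):

    # Turn the sigmoid score into a hard class label using a 0.5 threshold
    def classify(score, threshold=0.5):
        return "cat" if score >= threshold else "not a cat"

    print(classify(0.004475655395234176))  # -> not a cat
    print(classify(0.9931016099695166))    # -> cat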

    Conclusion

    Linear regression is a powerful and versatile method for understanding relationships between variables and making predictions. In this article, we demonstrated how to implement a simple linear regression algorithm for both regression and classification problems. By applying techniques like normalization and gradient descent, we can improve the performance of our algorithm and make more accurate predictions.

The one in the picture is the pinnacle of evolution; the other one is me: an inspired developer, geek culture lover, sport and coffee addict.