Machine Learning Notes

Machine Learning Course Notes


  1. Learn the NumPy Python library: it is very important for manipulating data in arrays (you can compute the average, get the shape (dimensions) of an array, generate random numbers, and find the max and min values and their indexes)
  2. input_shape=[1]: means the input to the layer is a one-dimensional array holding a single value
  3. Feature: The input to the Machine Learning Model
  4. Labels: The output the Model learns to predict
  5. units=1: Specifies the number of neurons in the layer, e.g.:
    import tensorflow as tf
    Layer1 = tf.keras.layers.Dense(units=1, input_shape=[1])
    # create a model
    model = tf.keras.Sequential([Layer1])
    # then compile the model with loss and optimizer functions
  6. To calculate the loss, you have to compile the model first and then call model.fit(). On each training pass the loss is calculated from the model's predictions and handed to the optimizer.
    -During training, the optimizer function is used to calculate an adjustment to the model's internal variables (Weights and Biases)
    -The goal is to adjust the internal variables until the model (which is really a math function) mirrors the actual equation, or the desired output, of the data.
  7. Loss Function: (built in) mean squared error
  8. Optimizer Function: (built in) normally "Adam". One part of the Optimizer to know is the Learning Rate (0.1 in the example below). This is the step size taken when adjusting values in the Model: if the value is too small, the Optimizer takes many iterations to train the Model; if it is too large, the accuracy goes down.
    -Finding a good value often involves some trial and error, but the range is usually between 0.001 (which is the default) and 0.1, e.g.:
    model.compile(loss='mean_squared_error', optimizer=tf.keras.optimizers.Adam(0.1))  # 0.1 is the Learning Rate, which can be tweaked when training your model
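
    The learning-rate trade-off in item 8 can be sketched in plain Python, without TensorFlow. This is only an illustration under assumptions: it uses ordinary gradient descent rather than Adam, a made-up one-weight model y = w * x, and invented data; the function name train and the three rates are hypothetical choices, not values from the course.

```python
# Hypothetical toy data: the "true" relationship is y = 2 * x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]


def train(learning_rate, steps=20):
    """Plain gradient descent on mean squared error for the model y = w * x."""
    w = 0.0  # start from an arbitrary weight
    for _ in range(steps):
        # Gradient of the MSE with respect to w: mean of 2 * (w*x - y) * x
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        # The learning rate scales how far each step moves the weight
        w -= learning_rate * grad
    return w


for lr in (0.001, 0.05, 0.2):
    print(f"learning rate {lr}: w after 20 steps = {train(lr):.4f}")
```

    With this data, the smallest rate is still far from the correct weight of 2 after 20 steps, the middle rate lands very close to 2, and the largest rate overshoots on every step and moves further away instead of converging.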

    Train the Model
  9. Training takes place when you call the model.fit() function
    -During training, the Model takes in the set of data you want to train it with (the Features) and runs its calculations on that data using the Internal Variables (the Weights and Biases in the Dense Layer)
    -Apart from that first dataset, the Model also takes in the dataset you want it to predict (the Labels). The first dataset tells the Model, "Hey, when you see this kind of data, I want you to predict this second set of data." The Model automatically learns the pattern and figures out the best formula to arrive at the values in the second dataset; it does this by evaluating the Loss value and fine-tuning with the Optimizer function.
  10. Weights: Since the weights are initially (on the first iteration) set randomly, the output of the first pass will not be close to the correct one.
  11. Output: The difference between the actual output (the value the model predicted) and the desired output (the real, correct value in the second dataset that is fed into the model) is calculated using the Loss Function, and the Optimizer Function then uses that loss to adjust the weights and biases.
  12. Fit Function: The cycle of calculate, compare, and adjust is controlled by the Fit Function of a defined Model
  13. Fit Function Arguments: The first argument is the inputs, the second argument is the desired output values (it's like training: you tell the Model that when it sees this value, you want it to predict that value)
  14. Epochs: The epochs argument specifies how many times this cycle should run
  15. Verbose: This argument controls how much output the function should produce (you can set it to False if you want)
    e.g.:
    history = model.fit(inputArray, AnotherInputArray, epochs=200, verbose=False)
    print('Finished training the model')

    -After the training is done, the model is trained, and the object returned by model.fit() (history here) holds the training record. You can plot it on a chart using Matplotlib, which is useful if you want to see the Loss calculated after a certain number of epochs.
  16. Loss: A higher loss means that the predicted value (calculated from the first input set of data) is not anywhere close to the desired output value (the data present in the second set). The Optimizer uses this difference to adjust the weights and biases.
  17. Matplotlib: A library used to show data on a chart, e.g.:
    import matplotlib.pyplot as plt
    plt.xlabel('Epoch Number')
    plt.ylabel('Loss Magnitude')
    plt.plot(history.history['loss'])
    plt.show()
    # We are only plotting the loss values recorded in the history object
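
    The calculate, compare, adjust cycle that model.fit() runs, and the per-epoch loss it records, can be sketched without TensorFlow. This is a rough illustration under assumptions: it uses plain gradient descent instead of Adam, a single weight and bias, and made-up data following y = 3x + 1; fit and loss_history are hypothetical names, not part of the Keras API.

```python
def fit(inputs, targets, epochs=200, learning_rate=0.05):
    """Toy stand-in for model.fit(): one weight, one bias, MSE loss."""
    w, b = 0.0, 0.0      # internal variables (Keras would start from random weights)
    loss_history = []    # one loss value per epoch
    n = len(inputs)
    for _ in range(epochs):
        # Calculate: forward pass with the current weight and bias
        preds = [w * x + b for x in inputs]
        # Compare: mean squared error between predictions and desired outputs
        errors = [p - t for p, t in zip(preds, targets)]
        loss_history.append(sum(e * e for e in errors) / n)
        # Adjust: step against the gradient of the loss
        grad_w = sum(2 * e * x for e, x in zip(errors, inputs)) / n
        grad_b = sum(2 * e for e in errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b, loss_history


inputs = [0.0, 1.0, 2.0, 3.0]
targets = [1.0, 4.0, 7.0, 10.0]  # hypothetical data following y = 3x + 1
w, b, loss_history = fit(inputs, targets)
print(f"w = {w:.3f}, b = {b:.3f}")
print(f"first loss: {loss_history[0]:.3f}, last loss: {loss_history[-1]:.6f}")
```

    The list of losses plays the same role as the loss values plotted with Matplotlib above: one number per epoch, shrinking as the weight and bias approach 3 and 1.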

    Predict
  18. Predict Function: At this point, let's pretend you have a model that has been trained to learn the relationship between two different datasets. You can now use the Predict Function to predict the output for an input the model has not seen. (Training on input/output pairs like this is called Supervised Learning, because the training data includes the correct answers. Testing the model with data you didn't include in the training data shows whether it predicts correctly.) e.g.:
    print(model.predict(np.array([100.0])))  # assumes import numpy as np; 100.0 stands for any input you didn't include in the training dataset. It's kind of like testing your student to see if they understood what you were saying.
  19. Dense Layer: This is the layer that holds the Weights and Biases, the internal variables that training adjusts. To peek at the internal variables of a Dense Layer you can do this:
    Layer1 = tf.keras.layers.Dense(units=4, input_shape=[1])  # input_shape=[1] tells the model that each input is a one-dimensional array with a single value
    Layer2 = tf.keras.layers.Dense(units=1)  # output layer: one neuron for the single predicted value
    model = tf.keras.Sequential([Layer1, Layer2])  # create a model to train
    model.compile(loss='mean_squared_error', optimizer=tf.keras.optimizers.Adam(0.1))  # this is important: you have to compile the model before it can train
    model.fit(Array_Containing_Inputs, Array_Containing_Desired_Outputs, epochs=300, verbose=False)  # training happens here
    print('Model Finished Training')  # to let you know that training is complete
    print('These are the layer variables: {}'.format(Layer1.get_weights()))
    print(model.predict(np.array([InputYouWantAPredictionFor])))  # assumes import numpy as np
    # In this case maybe you want to predict the label for an input of 100
    Note: Some Machine Learning libraries (Microsoft's ML library, for example) do not allow multiple label prediction, which means you can't ask them to predict several labels at once. Just keep that in mind.
  20. Note: This model learns to figure out the best formula to arrive at the most accurate value it can. The training that happens in the model.fit() function is really the fine-tuning of the Internal Variables (Weights and Biases) to arrive at the best answer.
  21. Gradient Descent: In training (model.fit()), an optimization method called Gradient Descent is used to adjust the Internal Variables, which are numeric values (floats; right now Machine Learning works best with float/numeric datasets). It iteratively adjusts the parameters, nudging them in the correct direction a bit at a time, until they reach the best values.
    -Here, "best values" means that nudging them any further would make the Model perform worse.
  22. Loss Function: The function that measures how good or bad a Model performs during each iteration is called the "Loss Function", and the goal of each nudge is to minimize the Loss Function.
    -The Loss Function is specified when the Model is compiled
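
    What a Dense layer computes with its Weights and Biases can be sketched in a few lines of plain Python. This is a minimal illustration, not the Keras implementation: dense_forward is a hypothetical helper, the weight values are invented, and no activation function is applied. The layout mirrors what get_weights() returns for a Dense layer: a kernel of shape (input size, units) followed by a bias of shape (units,).

```python
def dense_forward(x, kernel, bias):
    """Compute a Dense layer's output for one example (no activation).

    x:      list of input features, length = input size
    kernel: list of rows, shape (input size, units)
    bias:   list of length units
    """
    units = len(bias)
    return [
        # Each unit takes a weighted sum of every input, plus its own bias
        sum(x[i] * kernel[i][j] for i in range(len(x))) + bias[j]
        for j in range(units)
    ]


# Hypothetical values for a layer with one input feature and units=4
kernel = [[0.5, -1.0, 2.0, 0.0]]  # shape (1, 4): one row per input feature
bias = [0.1, 0.2, 0.3, 0.4]       # shape (4,): one bias per unit

outputs = dense_forward([10.0], kernel, bias)
print(outputs)  # four values, one per unit
```

    Training never changes this formula; Gradient Descent only nudges the numbers stored in kernel and bias.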

    SUMMARY
  23. Feature: The input to the Model
  24. Example: An input/output pair used for training
  25. Label: The output of the Model
  26. Model: The representation of the Neural Network
  27. Dense and Fully Connected (FC): Each node in one Layer is connected to each node in the previous layer
  28. Weights and Biases: The Internal Variables of the Model
  29. Loss: The discrepancy between the desired output and actual output
  30. MSE: Mean Squared Error, a type of loss function that counts a small number of large discrepancies as worse than a large number of small ones
  31. Gradient Descent: An algorithm that changes the Internal Variables a bit at a time to gradually reduce the loss function
  32. Optimizer: A specific implementation of the Gradient Descent algorithm (there are many, but Adam, short for Adaptive Moment Estimation, is widely used and is considered a best-practice optimizer)
  33. Learning Rate: The "Step Size" for loss improvement during Gradient Descent
  34. Batch: The set of examples used during training of the Neural Network
  35. Epoch: A full pass over the entire training dataset
  36. Forward Pass: The computation of the output values from the input
  37. Backward Pass (Backpropagation): The calculation of Internal Variable adjustments according to the Optimizer algorithm, starting from the output layer and working back through each layer to the input.
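
    The MSE entry above is easy to check with a small worked example. The sketch below is plain Python with invented numbers; mean_squared_error is a hypothetical helper written for this note, not the Keras function. Both prediction sets are off by the same total amount (4), but MSE scores the single large miss as four times worse.

```python
def mean_squared_error(desired, actual):
    """Average of the squared differences between desired and actual outputs."""
    return sum((d - a) ** 2 for d, a in zip(desired, actual)) / len(desired)


desired = [0.0, 0.0, 0.0, 0.0]
many_small = [1.0, 1.0, 1.0, 1.0]  # four errors of size 1
one_large = [4.0, 0.0, 0.0, 0.0]   # one error of size 4

print(mean_squared_error(desired, many_small))  # 1.0
print(mean_squared_error(desired, one_large))   # 4.0
```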

    References: Keras documentation (guides)