Machine Learning Course Udemy Notes

Podcast link https://www.superdatascience.com/podcast/sds-002-machine-learning-recommender-systems-and-the-future-of-data-with-hadelin-de-ponteves
Podcast link https://www.superdatascience.com/podcast/sds-041-inspiring-journey-totally-different-background-data-science

Steps in Machine Learning Programming
1. Import the necessary libraries.
2. Import the CSV file needed for data processing.
3. Load the data and separate it into X (the features) and y (the column you want to predict).
4. Clean the data by handling missing values: either remove the affected rows or fill them in, for example with the column mean (scikit-learn's SimpleImputer can do this).
5. Encode categorical data. Text columns must be converted into numbers because machine-learning models only work with numbers; in this case, use one-hot encoding from the scikit-learn library.
- This encodes each text value as a vector. With four distinct values, for example, the first is encoded as 1000, the second as 0100, the third as 0010, and so on.
6. Once the features are one-hot encoded, encode the labels, which are typically "Yes" or "No" values. Import the LabelEncoder class from scikit-learn's preprocessing module (sklearn.preprocessing) and use it to turn the labels into numbers (1 and 0).
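
Steps 1 to 6 could look roughly like the sketch below. It is only an illustration: the small inline table (Country, Age, Salary, Purchased) is made up and stands in for the real CSV file, and filling missing values with the column mean is one option assumed here, not the only way to clean the data.

```python
import io

import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import LabelEncoder, OneHotEncoder

# Hypothetical stand-in for a CSV file on disk; with a real file you would
# call pd.read_csv("your_file.csv") instead.
csv_text = """Country,Age,Salary,Purchased
France,44,72000,No
Spain,27,48000,Yes
Germany,30,54000,No
Spain,38,61000,No
Germany,40,,Yes
France,35,58000,Yes
"""
dataset = pd.read_csv(io.StringIO(csv_text))

# Step 3: X holds the features, y holds the column we want to predict.
X = dataset.drop(columns="Purchased")
y = dataset["Purchased"]

# Step 4: fill missing numeric values with the mean of their column.
numeric_cols = ["Age", "Salary"]
X[numeric_cols] = SimpleImputer(strategy="mean").fit_transform(X[numeric_cols])

# Step 5: one-hot encode the text column and pass the numeric columns through.
# sparse_threshold=0 asks for a plain dense NumPy array as the result.
encoder = ColumnTransformer(
    transformers=[("country", OneHotEncoder(), ["Country"])],
    remainder="passthrough",
    sparse_threshold=0,
)
X_encoded = encoder.fit_transform(X)

# Step 6: turn the "Yes"/"No" labels into 1 and 0.
y_encoded = LabelEncoder().fit_transform(y)

print(X_encoded)
print(y_encoded)
```

The categories are sorted alphabetically, so in this sketch France becomes 1 0 0, Germany 0 1 0 and Spain 0 0 1; LabelEncoder likewise maps "No" to 0 and "Yes" to 1.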

7. Once step 6 is complete, split your data into a training set and a test set. The split does not have to be in half; giving the training set the larger share (for example 80/20) is common.
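
A minimal sketch of step 7, using scikit-learn's train_test_split. The toy arrays are invented for illustration, and test_size=0.2 and random_state=0 are assumptions; use test_size=0.5 if you really want an even split.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical toy data: 10 samples with 2 features each, and binary labels.
X = np.arange(20).reshape(10, 2)
y = np.array([0, 1, 0, 1, 0, 1, 0, 1, 0, 1])

# Hold out 20% of the rows for testing; random_state makes the shuffle repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

print(X_train.shape, X_test.shape)
```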

Question: Do we have to apply feature scaling before or after splitting the data in machine learning?
Answer: Apply feature scaling after splitting the data. The test set has to behave like brand-new data that the model has never seen. If the scaler were fitted before the split, its mean and standard deviation would be computed from the test rows as well, and that leaked information would make the evaluation of the model less accurate.
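
To make the answer concrete, here is a small sketch with made-up numbers. The data is split by hand so the arithmetic is easy to follow; the variable names are only illustrative.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical single feature; the last row plays the role of the test set.
X = np.array([[1.0], [2.0], [3.0], [4.0], [100.0]])
X_train, X_test = X[:4], X[4:]

# Correct order: split first, then fit the scaler on the training rows only.
scaler = StandardScaler().fit(X_train)

# What scaling before the split amounts to: the test row shapes the statistics.
leaky_scaler = StandardScaler().fit(X)

print(scaler.mean_)        # mean of the training rows only
print(leaky_scaler.mean_)  # mean pulled upwards by the test row
```

The two means differ (2.5 versus 22.0), which shows how the test row would influence the preprocessing if scaling came first.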

8. Once step 7 is complete, apply feature scaling to your dataset so that features with large values do not dominate features with small values.
- Choose either standardization or normalization.
- Many machine-learning algorithms rely on the Euclidean distance between two points P1 and P2, so a feature with a much larger range would dominate that distance.
- You do not have to apply feature scaling to the dummy variables (the 1s and 0s produced by one-hot encoding). They are already in a small range, and scaling them would make them lose their meaning. The purpose of standardization is to bring the features into the same range.
- Apply feature scaling only to the feature values that are not 1s and 0s (for example 45.8808) so that they end up in a range comparable to the dummy variables; after standardization most values fall roughly between -2 and 2.
- Standardization only needs the mean and the standard deviation of each column you want to scale.
- The fit function computes those statistics (the mean and the standard deviation), and the transform function applies them to the dataset.
- Don't forget to scale the test set with the same scaler that was fitted on the training set (call transform only, not fit), because both sets have to be preprocessed in exactly the same way.
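
The scaling step might be sketched as follows. The arrays are invented and assume the layout of three one-hot dummy columns followed by two numeric columns; StandardScaler is used here as the standardization option.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical, already-encoded data: columns 0-2 are one-hot dummies,
# columns 3-4 are numeric features (say, age and salary).
X_train = np.array([
    [1.0, 0.0, 0.0, 44.0, 72000.0],
    [0.0, 0.0, 1.0, 27.0, 48000.0],
    [0.0, 1.0, 0.0, 30.0, 54000.0],
    [0.0, 0.0, 1.0, 38.0, 61000.0],
])
X_test = np.array([
    [0.0, 1.0, 0.0, 40.0, 58600.0],
    [1.0, 0.0, 0.0, 35.0, 58000.0],
])

scaler = StandardScaler()

# fit_transform: compute mean and standard deviation on the training rows,
# then apply z = (x - mean) / std. The dummy columns are left untouched.
X_train[:, 3:] = scaler.fit_transform(X_train[:, 3:])

# transform only: reuse the training statistics on the test rows.
X_test[:, 3:] = scaler.transform(X_test[:, 3:])

print(X_train)
print(X_test)
```

Calling fit_transform on the test set instead would compute new statistics from the test rows, so the two sets would no longer be on the same scale.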
