Modeling Process
Modeling is a multi-stage methodology for creating trained and tested Machine Learning models.
The Modeling Process is essentially a scientific experiment that includes:
Development of a Hypothesis - e.g., data collected about a specific previous consumer behavior can be used to predict future behavior
Design of the Experiment - e.g., model algorithm selection
Execution of the Experiment - e.g., model training and testing
Evaluation and Explanation of Results - e.g., is the hypothesis true or false, what is the accuracy
The steps in the modeling process, which can be highly iterative between steps, generally include:
Type Identification
Platform Selection
Data Collection
Model Algorithm Selection
Model Hyperparameter Settings
Model Training
Model Testing
Model Evaluation
Model Deployment
Type Identification
The type of Machine Learning needed can have a significant influence on the steps that follow.
Type identification can be driven by:
Applications Needed
Areas of Interest
Educational Needs
Research and Development
Major type categories include:
Computer Vision (e.g., object recognition, facial recognition, handwriting recognition)
Natural Language Processing (e.g., speech to text, translation, understanding)
Pattern Recognition (e.g., event prediction, medical diagnosis)
Platform Selection
ML Platforms are generally of two types:
Open Source (e.g., TensorFlow, Keras, Scikit-Learn, Theano, Caffe, Torch)
Data Collection
Data Collection is the process of finding, organizing, cleaning, and storing data in a form that can be fed into model training and prediction processing.
Data Collection can involve:
Databases (e.g., Columnar, Document, Relational)
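The cleaning step described above can be sketched in a few lines of Python. This is only a minimal illustration: the CSV text, the column names (customer_id, age, purchases), and the clean_rows helper are hypothetical and do not come from any particular data source. It drops rows with missing or non-numeric fields and removes verbatim duplicates so the remaining records can be fed into training.

```python
import csv
import io

# Hypothetical raw export; column names and values are illustrative only.
RAW_CSV = """customer_id,age,purchases
1,34,5
2,,3
3,41,not_available
4,29,7
4,29,7
"""


def clean_rows(raw_text):
    """Return deduplicated rows whose fields all parse as integers."""
    cleaned = []
    seen = set()
    for row in csv.DictReader(io.StringIO(raw_text)):
        try:
            record = (
                int(row["customer_id"]),
                int(row["age"]),
                int(row["purchases"]),
            )
        except (ValueError, TypeError):
            # Skip rows with missing or non-numeric fields.
            continue
        if record in seen:
            # Skip verbatim duplicates.
            continue
        seen.add(record)
        cleaned.append(record)
    return cleaned


rows = clean_rows(RAW_CSV)
print(rows)
```

In practice the cleaned records would be written to whichever database or file store the project uses; that step is omitted here.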
Model Algorithm Selection
Algorithm Options
Model algorithms to select from include:
Selection Methodologies
Methods of selecting an algorithm include:
Identifying Project Key Criteria - these often include the model application, the need for model explainability and interpretability, and training data availability
Starting with Artificial Neural Networks - ANN algorithms are very powerful and have recently benefited from improvements in accuracy and ease of use
Reviewing Model Categories - a categorization of models and their variations can provide insights useful for algorithm selection
Researching the Latest Advancements - Machine Learning is a very dynamic field; internet searches related to the type of ML being pursued can be valuable; use the Application page of this site to see a Google search for specific areas of interest
Experimenting with Various Options - running tests using various algorithms can provide insights into their effectiveness for the type of use envisioned
Comparing Models - use a method such as a spreadsheet to compare various models
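A spreadsheet-style comparison can also be expressed as a small weighted-scoring script. The sketch below is an assumption about how such a comparison might be set up: the candidate names (model_a, model_b, model_c), the 1 to 5 ratings, and the weights are placeholders, and the criteria are taken from the key criteria listed above.

```python
# Hypothetical 1-5 ratings per criterion (higher is better); replace these
# placeholders with your own assessments.
CANDIDATES = {
    "model_a": {"application_fit": 3, "explainability": 5, "data_availability": 4},
    "model_b": {"application_fit": 5, "explainability": 2, "data_availability": 2},
    "model_c": {"application_fit": 3, "explainability": 4, "data_availability": 5},
}

# Hypothetical weights expressing how much each criterion matters.
WEIGHTS = {"application_fit": 0.5, "explainability": 0.3, "data_availability": 0.2}


def weighted_score(ratings, weights):
    """Sum each rating multiplied by the weight of its criterion."""
    return sum(ratings[name] * weight for name, weight in weights.items())


def rank_models(candidates, weights):
    """Return (score, name) pairs sorted from best to worst."""
    scored = [
        (weighted_score(ratings, weights), name)
        for name, ratings in candidates.items()
    ]
    return sorted(scored, reverse=True)


ranking = rank_models(CANDIDATES, WEIGHTS)
for score, name in ranking:
    print(f"{name}: {score:.2f}")
```

Changing the weights is a quick way to see how sensitive the ranking is to the project's priorities.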
Model Hyperparameter Settings
Hyperparameters control aspects of model instantiation and training and can include factors, depending on the model algorithm being used, such as:
activation_function: which Activation Function is used in Activation Nodes
batch_size: the number of inputs to include in each processing iteration; its effect is linked to the learning rate
hidden_network_layers: the number of nodes in each hidden network layer; hidden layers are those between the input and output layers
learning_rate: the step size, or the schedule for adjusting it, used when updating weights during Weight Optimization
maximum_number_of_iterations: the maximum number of times data is processed through the neural network
number_of_data_features: the number of data features used for model training and inference processing
number_of_informative_data_features: the number of data features correlated to the training outputs; this simulates real world model training where the correlation of data features may not be known
number_of_model_classes: the number of output classes the neural network is being trained to predict
number_of_training_and_test_samples: the number of data samples processed through model training and testing
print_training_progress: whether to print the loss after each training iteration; loss is a measure of the difference between calculated outputs and expected outputs
tolerance_for_optimization: a numeric value used for ending the model training iteration cycles
weight_optimization_algorithm: the algorithm used for Weight Optimization, such as Stochastic Gradient Descent
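One way to keep these settings together is a small configuration object. The sketch below reuses the hyperparameter names listed above; the Hyperparameters class, its default values, and the validation rules are illustrative assumptions, not settings taken from any specific library. The learning_rate field is assumed here to be a fixed step size.

```python
from dataclasses import dataclass


@dataclass
class Hyperparameters:
    """Illustrative container; every default below is an assumed placeholder."""

    activation_function: str = "relu"
    batch_size: int = 32
    hidden_network_layers: tuple = (16, 8)  # nodes in each hidden layer
    learning_rate: float = 0.01  # assumed to be a fixed step size
    maximum_number_of_iterations: int = 200
    number_of_data_features: int = 20
    number_of_informative_data_features: int = 5
    number_of_model_classes: int = 2
    number_of_training_and_test_samples: int = 1000
    print_training_progress: bool = False
    tolerance_for_optimization: float = 1e-4
    weight_optimization_algorithm: str = "sgd"

    def validate(self):
        """Raise ValueError if the settings contradict one another."""
        if self.number_of_informative_data_features > self.number_of_data_features:
            raise ValueError("informative features cannot exceed total features")
        if self.number_of_model_classes < 2:
            raise ValueError("at least two output classes are required")
        if self.batch_size < 1 or self.maximum_number_of_iterations < 1:
            raise ValueError("batch size and iteration limit must be positive")
        if any(nodes < 1 for nodes in self.hidden_network_layers):
            raise ValueError("each hidden layer needs at least one node")


settings = Hyperparameters(hidden_network_layers=(32, 16), learning_rate=0.05)
settings.validate()
print(settings)
```

Validating the settings before training starts catches contradictory combinations early, which helps when the modeling steps are repeated many times.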
Model Training
Data is iteratively processed through the neural network while adjusting the weights and biases applied to data array links to produce increasingly accurate output results. The diagram below illustrates an Artificial Neural Network; the concepts apply to other model algorithms as well.
Data Inputs - data is fed into the training process
Iteration - data is iteratively passed through the neural network
Forward Propagation - data is passed from node to node
Outputs - output results are fed into loss calculations
Loss Calculation - the difference between output results and desired results is calculated
Weight Optimization - the amount of change to data flow weights is calculated
Backpropagation - the loss gradient is propagated backward through the network to modify the weights and biases applied to data array links
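The training steps listed above can be sketched with a deliberately small example. The code below assumes a single sigmoid node rather than a multi-layer network, so the backward pass reduces to one gradient calculation. The dataset (the logical OR of two inputs), the function names, and the default learning rate, iteration limit, and tolerance are all illustrative choices.

```python
import math

# Tiny illustrative dataset (logical OR of two binary inputs); not from the source.
INPUTS = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
TARGETS = [0.0, 1.0, 1.0, 1.0]


def sigmoid(value):
    return 1.0 / (1.0 + math.exp(-value))


def forward(weights, bias, sample):
    """Forward propagation through a single sigmoid node."""
    return sigmoid(sum(w * x for w, x in zip(weights, sample)) + bias)


def train(learning_rate=0.5, max_iterations=2000, tolerance=1e-6):
    weights, bias = [0.0, 0.0], 0.0
    previous_loss = float("inf")
    loss = previous_loss
    for _ in range(max_iterations):
        # Forward propagation: compute an output for every input sample.
        outputs = [forward(weights, bias, sample) for sample in INPUTS]
        # Loss calculation: mean binary cross-entropy against the targets.
        loss = -sum(
            t * math.log(o) + (1.0 - t) * math.log(1.0 - o)
            for o, t in zip(outputs, TARGETS)
        ) / len(TARGETS)
        # Stop once the loss improves by less than the tolerance.
        if previous_loss - loss < tolerance:
            break
        previous_loss = loss
        # Gradient of the loss with respect to each weight and the bias.
        errors = [o - t for o, t in zip(outputs, TARGETS)]
        weight_gradients = [
            sum(e * sample[i] for e, sample in zip(errors, INPUTS)) / len(INPUTS)
            for i in range(len(weights))
        ]
        bias_gradient = sum(errors) / len(errors)
        # Update step: move against the gradient, scaled by the learning rate.
        weights = [w - learning_rate * g for w, g in zip(weights, weight_gradients)]
        bias -= learning_rate * bias_gradient
    return weights, bias, loss


weights, bias, final_loss = train()
predictions = [round(forward(weights, bias, sample)) for sample in INPUTS]
print(predictions, final_loss)
```

A real network repeats the gradient calculation layer by layer, but the loop structure (forward pass, loss, gradients, update, stopping check) stays the same.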
Typically the training process is performed iteratively while monitoring for factors such as best accuracy results as illustrated below:
Model Testing
Data is passed forward through the neural network to produce a result and an associated confidence level that the result is correct. The diagram below illustrates an Artificial Neural Network; the concepts apply to other model algorithms as well.
Data Inputs - data is fed into the testing process
Forward Propagation - data is passed from node to node
Outputs - output results are fed into confidence level calculations
Confidence Level - a number from 0 to 1 indicating the probability that the output result is correct
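The testing flow above can be sketched as a single forward pass followed by a softmax, which is one common way to turn raw outputs into a number between 0 and 1. The weights, biases, and function names below are hypothetical stand-ins for an already-trained model, and a softmax value is only an approximation of a true probability of correctness.

```python
import math


def softmax(scores):
    """Convert raw output scores into values that sum to 1."""
    highest = max(scores)  # subtract the max for numerical stability
    exponentials = [math.exp(score - highest) for score in scores]
    total = sum(exponentials)
    return [value / total for value in exponentials]


def predict(weights, biases, sample):
    """Forward pass through one output layer; returns (class index, confidence)."""
    scores = [
        sum(w * x for w, x in zip(row, sample)) + b
        for row, b in zip(weights, biases)
    ]
    probabilities = softmax(scores)
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return best, probabilities[best]


# Hypothetical already-trained parameters for a 2-feature, 3-class model.
WEIGHTS = [[2.0, -1.0], [-1.0, 2.0], [0.5, 0.5]]
BIASES = [0.0, 0.0, -0.5]

label, confidence = predict(WEIGHTS, BIASES, [1.0, 0.0])
print(label, confidence)
```

No weights are modified during testing; the same parameters are reused for every test sample.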
Model Evaluation
Model Evaluation involves applying Probability and Statistics using measurements such as:
Depending on the results of model evaluation, previous modeling steps may need to be adjusted and repeated.
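As an illustration of evaluation measurements, the sketch below computes accuracy, precision, and recall from expected and predicted labels. These three metrics are assumed here as common examples rather than a definitive list, and the label lists are made-up sample data.

```python
def confusion_counts(expected, predicted, positive=1):
    """Count true/false positives and negatives for one positive class."""
    tp = fp = fn = tn = 0
    for actual, guess in zip(expected, predicted):
        if guess == positive and actual == positive:
            tp += 1
        elif guess == positive:
            fp += 1
        elif actual == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def evaluate(expected, predicted, positive=1):
    """Return accuracy, precision, and recall, using 0.0 when undefined."""
    tp, fp, fn, tn = confusion_counts(expected, predicted, positive)
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Made-up labels for illustration only.
EXPECTED = [1, 0, 1, 1, 0, 0, 1, 0]
PREDICTED = [1, 0, 0, 1, 0, 1, 1, 0]

metrics = evaluate(EXPECTED, PREDICTED)
print(metrics)
```

Comparing these values across training runs gives a concrete basis for deciding which earlier modeling steps to revisit.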
To reduce overfitting, consider using:
Fewer Variables
Reduced Model Training Time
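One way to reduce training time is to stop once results on held-out data stop improving, often called early stopping. The sketch below assumes a loss value is recorded on held-out data after each training iteration; the function name, the patience setting, and the loss history are hypothetical.

```python
def early_stopping_iteration(validation_losses, patience=3):
    """Return (best iteration, best loss), stopping after `patience`
    consecutive iterations without improvement."""
    best_loss = float("inf")
    best_iteration = 0
    for iteration, loss in enumerate(validation_losses):
        if loss < best_loss:
            best_loss, best_iteration = loss, iteration
        elif iteration - best_iteration >= patience:
            # No improvement for `patience` iterations: stop training here.
            break
    return best_iteration, best_loss


# Hypothetical held-out loss history: improves, then starts to rise.
LOSSES = [0.90, 0.70, 0.55, 0.50, 0.52, 0.56, 0.61, 0.40]

stop_at, lowest = early_stopping_iteration(LOSSES)
print(stop_at, lowest)
```

A rising held-out loss while training loss keeps falling is a typical sign of overfitting, which is why the weights from the best iteration are usually the ones kept. Note that stopping early means a later improvement, such as the last value in this history, is never seen.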
Model Deployment
Model software deployment typically involves: