Key Terms
• Model Evaluation
The process of assessing the reliability of an AI model using a test dataset that was not used during training.
• True Positive (TP)
The model correctly predicts a positive reality, such as correctly forecasting a flood.
• True Negative (TN)
The model correctly predicts a negative reality, such as correctly forecasting the absence of a flood.
• False Positive (FP)
The model predicts a positive outcome when the reality is negative, in this case falsely forecasting a flood that does not occur.
• False Negative (FN)
The model predicts a negative outcome when the reality is positive, mistakenly predicting no flood when there is one.
• Confusion Matrix
A matrix summarizing model predictions, providing a clear overview of TP, TN, FP, and FN values.
• Accuracy
The ratio of correct predictions (TP + TN) to the total number of predictions, providing an overall performance measure.
• Precision
The ratio of TP to the sum of TP and FP, emphasizing the proportion of true positives among predicted positives.
• Recall
The ratio of TP to the sum of TP and FN, focusing on the model's ability to capture all positive instances.
• F1 Score
The harmonic mean of Precision and Recall, offering a balanced measure that considers both false positives and false negatives.
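The four metrics above can be computed directly from the confusion-matrix counts. The sketch below uses made-up TP/TN/FP/FN values for the flood-forecasting example (the numbers are illustrative, not from the text):

```python
# Hypothetical confusion-matrix counts for a flood-forecasting model
# (illustrative numbers, not taken from the book).
TP, TN, FP, FN = 40, 50, 5, 5

accuracy = (TP + TN) / (TP + TN + FP + FN)            # correct predictions / all predictions
precision = TP / (TP + FP)                            # true positives among predicted positives
recall = TP / (TP + FN)                               # true positives among actual positives
f1 = 2 * precision * recall / (precision + recall)    # harmonic mean of precision and recall

print(f"Accuracy:  {accuracy:.2f}")   # 0.90
print(f"Precision: {precision:.2f}")  # 0.89
print(f"Recall:    {recall:.2f}")     # 0.89
print(f"F1 score:  {f1:.2f}")         # 0.89
```

Note that with FP = FN, precision and recall coincide, so the F1 score equals both; in general F1 sits between the two, pulled toward the smaller value.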
In a Nutshell
• Evaluation refers to the process of understanding the reliability of an AI model by feeding it a test dataset that was never used for training.
• Prediction and Reality are the two terms used to determine the efficiency of an AI model.
• A confusion matrix is an N×N matrix used for evaluating the performance of the model on the basis of two parameters: prediction and reality.
• Model Accuracy can be defined as the ratio of the number of correct predictions to the total number of predictions.
• The term ‘Precision’ can be defined as the ratio of True Positives to the sum of True Positives and False Positives.
• The best (perfect) value for an F1 score is 1, and the worst value is 0.
• Model efficiency is evaluated by comparing prediction (the model’s output) against reality (the actual condition).
• Accuracy alone is not sufficient; multiple evaluation metrics are needed for a comprehensive assessment.
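The points above can be tied together in a minimal sketch: given paired Prediction and Reality labels, count the four confusion-matrix cells and derive accuracy. The sample labels are invented for illustration:

```python
# Build a 2x2 confusion matrix from paired Prediction/Reality labels.
# The sample data below is made up for illustration.
reality    = ["flood", "no flood", "flood", "no flood", "no flood", "flood"]
prediction = ["flood", "no flood", "no flood", "flood", "no flood", "flood"]

pairs = list(zip(prediction, reality))
tp = sum(p == "flood" and r == "flood" for p, r in pairs)        # predicted flood, flood occurred
tn = sum(p == "no flood" and r == "no flood" for p, r in pairs)  # predicted no flood, none occurred
fp = sum(p == "flood" and r == "no flood" for p, r in pairs)     # false alarm
fn = sum(p == "no flood" and r == "flood" for p, r in pairs)     # missed flood

# Rows = reality, columns = prediction
print("                 Pred: flood   Pred: no flood")
print(f"Real: flood           {tp}              {fn}")
print(f"Real: no flood        {fp}              {tn}")
print(f"Accuracy = {(tp + tn) / len(reality):.2f}")  # 0.67
```

Here accuracy looks reasonable at 0.67, yet the model missed one real flood (FN = 1), which is exactly why precision, recall, and F1 are needed alongside accuracy.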
179