Classification

To keep things simple for beginners, the library ships with default parameters that make it easier to reach acceptable results. The basic requirement is to provide a dataset, the input features (column names or indices), the output labels (column names or indices), and optionally a model name.

1. PROBLEM DESCRIPTION

In this example, we look at a synthetic machine-maintenance dataset in which each machine has a unique ID. The dataset contains the model IDs, the torque maintained by the machine, air temperature, process temperature, rotational speed, and tool wear. The goal is to predict when a machine fails, for example once it reaches a specific tool wear time, temperature, or torque value.

2. MODEL DESCRIPTION & DEVELOPMENT

The following sections describe the TwinAPI.SimuLearn.ClassificationML module with each subsection explaining the commands:

2.1. Setting up the model

2.2. Preprocessing the data

2.3. Training the model

2.4. Predicting test data

2.5. Saving the Model

2.1. Setting up the model

To access the information in the dataset, we create an instance of the TwinAPI.SimuLearn.ClassificationML class and call it ‘model_classification’. We then choose x_features and y_labels, which are the inputs and the output respectively. We can also get information on the dataset by setting plot to either ‘basic’ or ‘detailed’, each of which produces a different set of plots of the dataset.

from TwinAPI.SimuLearn import ClassificationML


model_classification = ClassificationML()
model_classification.setup(data="datasets/predictive_maintenance.csv", x_features=['4:8'], y_labels=['9'])
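As a rough illustration of what a range specification such as '4:8' selects, the sketch below expands it into column positions. It assumes the range is 1-based and inclusive, which is a guess based on the five-value sample prediction in the source code section. The helper expand_columns and the header list (modeled on the AI4I 2020 column order) are hypothetical and not part of TwinAPI.

```python
def expand_columns(specs):
    """Expand specs such as ['4:8'] or ['9'] into a flat list of column positions."""
    columns = []
    for spec in specs:
        if ":" in spec:
            start, stop = (int(part) for part in spec.split(":"))
            columns.extend(range(start, stop + 1))  # assumption: end is inclusive
        else:
            columns.append(int(spec))
    return columns


# Assumed column order, for illustration only.
header = [
    "UDI", "Product ID", "Type", "Air temperature", "Process temperature",
    "Rotational speed", "Torque", "Tool wear", "Machine failure",
]

# Assumption: positions are 1-based, so subtract 1 to index the Python list.
x_columns = [header[i - 1] for i in expand_columns(["4:8"])]
y_columns = [header[i - 1] for i in expand_columns(["9"])]
print(x_columns)
print(y_columns)
```

Under these assumptions, the five selected inputs are the two temperatures, rotational speed, torque, and tool wear, and the label is the failure flag.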

2.2. Preprocessing the data

By default, the preprocess method sets normalize to ‘True’ and normalize_method to ‘standardizer’. A basic user does not need to change these settings.

model_classification.preprocess()
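To give a feel for what a ‘standardizer’ typically does, here is a minimal sketch of z-score standardization using only the Python standard library. It assumes that ‘standardizer’ means rescaling each feature to zero mean and unit variance; the library's actual implementation may differ, and the torque values are illustrative.

```python
from statistics import mean, pstdev


def standardize(column):
    """Rescale a list of numbers to zero mean and unit (population) standard deviation."""
    mu = mean(column)
    sigma = pstdev(column)
    if sigma == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(value - mu) / sigma for value in column]


torque = [39.5, 42.8, 46.3, 68.3]  # illustrative values only
scaled = standardize(torque)
print([round(value, 3) for value in scaled])
```

Standardizing puts features with very different ranges, such as rotational speed and torque, on a comparable scale.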

2.3. Training the model

Here we can choose from the many machine learning classification algorithms available in the TwinAPI.SimuLearn library. We choose ‘Random Forest Classifier’ as our classification algorithm.

model_classification.train(user_model="Random Forest Classifier")
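For readers curious about the idea behind a random forest, the toy sketch below trains many one-split "stumps" on bootstrap samples, each on a randomly chosen feature, and predicts by majority vote. This is a deliberate simplification (stumps instead of full decision trees) and not how TwinAPI trains its model; all function names and the data rows are made up for illustration.

```python
import random
from collections import Counter


def fit_stump(rows, labels, feature):
    """Pick the threshold on one feature that misclassifies the fewest rows."""
    best = None
    for threshold in sorted({row[feature] for row in rows}):
        for above in (0, 1):
            # The stump predicts `above` when the value exceeds the threshold,
            # and the other class otherwise.
            errors = sum(
                (above if row[feature] > threshold else 1 - above) != label
                for row, label in zip(rows, labels)
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, above)
    return feature, best[1], best[2]


def fit_forest(rows, labels, n_trees=25, seed=0):
    """Train stumps on bootstrap samples, each on one randomly chosen feature."""
    rng = random.Random(seed)
    forest = []
    while len(forest) < n_trees:
        sample = [rng.randrange(len(rows)) for _ in rows]  # draw with replacement
        if len({labels[i] for i in sample}) < 2:
            continue  # redraw samples that contain only one class
        feature = rng.randrange(len(rows[0]))
        forest.append(
            fit_stump([rows[i] for i in sample], [labels[i] for i in sample], feature)
        )
    return forest


def predict(forest, row):
    """Return the class that receives the most votes across all stumps."""
    votes = Counter(
        above if row[feature] > threshold else 1 - above
        for feature, threshold, above in forest
    )
    return votes.most_common(1)[0][0]


# Made-up rows of [torque, tool wear]; label 1 means the machine failed.
rows = [[30, 10], [35, 20], [40, 15], [38, 25],
        [60, 200], [65, 210], [70, 190], [62, 220]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

forest = fit_forest(rows, labels)
print(predict(forest, [68, 205]))
print(predict(forest, [25, 5]))
```

Averaging many weak, slightly different learners is what makes the ensemble more robust than any single split.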

2.4. Predicting test data

We can assess our model’s accuracy by providing prediction data that the model did not see during training. We pass a list of inputs in the same format as the training features.

model_classification.predict(prediction_set=[298.2, 308.6, 1433, 39.5, 7])
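The idea of scoring a model on data it has not seen can be sketched independently of the library. The example below holds out a fraction of the rows and computes accuracy on them. The helpers holdout_split and accuracy are hypothetical, the rows are made up, and a fixed threshold rule stands in for a trained model.

```python
import random


def holdout_split(rows, labels, test_fraction=0.25, seed=0):
    """Shuffle row indices and set aside a fraction of them for testing."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_test = max(1, int(len(rows) * test_fraction))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    train = ([rows[i] for i in train_idx], [labels[i] for i in train_idx])
    test = ([rows[i] for i in test_idx], [labels[i] for i in test_idx])
    return train, test


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up rows of [torque, tool wear]; label 1 means the machine failed.
rows = [[39.5, 7], [42.8, 3], [46.3, 9], [68.3, 24],
        [35.0, 12], [61.2, 210], [40.1, 50], [66.0, 190]]
labels = [0, 0, 0, 1, 0, 1, 0, 1]
(train_x, train_y), (test_x, test_y) = holdout_split(rows, labels)

# Stand-in for a trained model: a fixed rule, used only to show the scoring step.
predictions = [1 if torque > 60 else 0 for torque, _ in test_x]
print(accuracy(test_y, predictions))
```

In practice the model would be trained on train_x and train_y only, and the held-out rows would be used once, for the final score.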

Note

After defining and saving our model, we can later load the trained model to make future predictions by passing load_model and model_name to the predict method.

from TwinAPI.SimuLearn.MLibrary import ClassificationML
model_classification = ClassificationML()
prediction = model_classification.predict(prediction_set=[1], load_model=True, model_name='trained_model')

2.5. Saving the Model

If we are satisfied with the results, we can save our model using the savemodel method, which creates a JSON file in the backend.

model_classification.savemodel('my_classification_model')
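The general pattern of persisting a model as JSON and restoring it later can be sketched with the standard library. The file layout, the parameter dictionary, and the helpers save_model_json and load_model_json are assumptions made for illustration; the format TwinAPI actually writes is not documented here.

```python
import json
import tempfile
from pathlib import Path


def save_model_json(params, model_name, directory):
    """Write the parameters to <directory>/<model_name>.json and return the path."""
    path = Path(directory) / f"{model_name}.json"
    path.write_text(json.dumps(params, indent=2))
    return path


def load_model_json(model_name, directory):
    """Read the parameters back from <directory>/<model_name>.json."""
    path = Path(directory) / f"{model_name}.json"
    return json.loads(path.read_text())


# Hypothetical parameters of a very simple threshold rule.
params = {"algorithm": "threshold rule", "feature": "Torque", "threshold": 60.0}

with tempfile.TemporaryDirectory() as tmp:
    saved_path = save_model_json(params, "my_classification_model", tmp)
    restored = load_model_json("my_classification_model", tmp)

print(saved_path.name)
print(restored == params)
```

Storing the model under a name, rather than a full path, mirrors how the tutorial refers to saved models by model_name.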

3. SUMMARY

In this tutorial, we learned how to set up and solve a classification problem.

4. SOURCE CODE

from TwinAPI.SimuLearn import ClassificationML


# initiate the class
model_classification = ClassificationML()

# setting up the model
model_classification.setup(data="datasets/predictive_maintenance.csv",
                        x_features=['4:8'],
                        y_labels=['9'],
                        verbose=1,
                        plot='basic')

# preprocess the data
model_classification.preprocess()

# train the model
model_classification.train(user_model="Random Forest Classifier")

# sample prediction
prediction = [296.3, 307.2, 1319, 68.3, 24]
model_classification.predict(prediction_set=prediction)
print('Expected value: 1')

# save the model
model_classification.savemodel(model_name='my_classification_model')

Keywords: Classification, Predictive, Machine, Faults

Dataset Reference: [https://archive.ics.uci.edu/ml/datasets/AI4I+2020+Predictive+Maintenance+Dataset] Dua, D. and Graff, C. (2019). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science.