AutoML
To reduce complexity for beginners, the library ships with default parameters that make it easier to achieve acceptable results. At a minimum, you must provide a dataset, the input features (columns or indices), the output label (columns or indices), and, optionally, a model name.
1. PROBLEM DESCRIPTION
In this example, we look at a synthetic machine-maintenance dataset in which each machine has a unique ID. The dataset contains the model IDs, the torque maintained by the machine, air temperature, process temperature, rotational speed, and tool wear. The goal is to predict when a machine fails, for example once it reaches a specific tool-wear time, a certain temperature, or a certain torque value.
2. MODEL DESCRIPTION & DEVELOPMENT
The following sections describe the TwinAPI.SimuLearn.AutoML module, with each subsection explaining the relevant commands:
2.1. Setting up the model
To access the information in the dataset, we create an object of the TwinAPI.SimuLearn.AutoML class and call it ‘model_auto’.
We then choose x_features and y_labels, which are the inputs and the output respectively. We can also get information on the dataset by setting plot to either ‘basic’ or ‘detailed’, each of which produces different plots of the dataset.
from TwinAPI.SimuLearn import AutoML

# create the AutoML object
model_auto = AutoML()
# load the dataset and select the input and output columns
model_auto.setup(data="datasets/predictive_maintenance.csv",
                 x_features=['3:8'],
                 y_labels=['10'])
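The index strings passed to x_features and y_labels are not explained further in this tutorial. As a rough illustration only, the sketch below shows one plausible reading, assuming the indices are 1-based, that a range such as '3:8' is inclusive, and that the CSV keeps the column order of the UCI AI4I 2020 dataset. The helper parse_column_spec is hypothetical and not part of TwinAPI; the library's actual parsing may differ.

```python
# Hypothetical helper, not part of TwinAPI: one plausible reading of
# index strings such as '3:8' (assumed 1-based, inclusive range).
def parse_column_spec(spec):
    """Return 0-based column positions for a spec like '3:8' or '10'."""
    if ":" in spec:
        start, stop = (int(part) for part in spec.split(":"))
        return list(range(start - 1, stop))
    return [int(spec) - 1]


# Column order assumed from the UCI AI4I 2020 dataset header.
header = [
    "UDI", "Product ID", "Type", "Air temperature [K]",
    "Process temperature [K]", "Rotational speed [rpm]", "Torque [Nm]",
    "Tool wear [min]", "Machine failure", "TWF", "HDF", "PWF", "OSF", "RNF",
]

x_columns = [header[i] for i in parse_column_spec("3:8")]
y_columns = [header[i] for i in parse_column_spec("10")]

print(x_columns)
print(y_columns)
```

Under these assumptions, '3:8' selects the six columns from Type through Tool wear, and '10' selects TWF (tool wear failure), which is consistent with the six-value prediction rows used in this tutorial.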
2.3. Training the model
In the TwinAPI.SimuLearn.AutoML module, the user can select either ML use case, regression or classification. The user can also view a leaderboard to assess how other models rank against the selected model.
If the user does not know which kind of problem the dataset poses, omitting the user_mlcase parameter results in automated selection of the best machine learning use case.
model_auto.train(scoring_method='auto', leader_board=True)
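How AutoML chooses the use case when user_mlcase is omitted is not documented here. The sketch below shows one common heuristic for the idea, under the assumption that a target with non-numeric values, or with only a few distinct integer values, is treated as classification. The function infer_ml_case and its max_classes threshold are hypothetical and not part of TwinAPI.

```python
# Hypothetical heuristic, not TwinAPI's actual selection logic.
def infer_ml_case(targets, max_classes=20):
    """Guess 'classification' or 'regression' from the target values."""
    values = list(targets)
    # Any non-numeric label (str, bool, ...) implies discrete classes.
    if any(isinstance(v, bool) or not isinstance(v, (int, float)) for v in values):
        return "classification"
    # A small number of distinct integer-valued targets looks like class codes.
    all_integral = all(float(v).is_integer() for v in values)
    if all_integral and len(set(values)) <= max_classes:
        return "classification"
    return "regression"


print(infer_ml_case([0, 0, 1, 0, 1]))            # binary failure flag
print(infer_ml_case([37.7, 39.5, 41.2, 36.0]))   # continuous values
```

With this heuristic, a 0/1 failure flag such as the tool wear failure label is treated as classification, while a continuous column is treated as regression.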
2.4. Predicting test data
We can assess our model’s accuracy by providing prediction data that the model did not see during training. The input is a list of values in the same order and format as the training features.
prediction = ['L', 298, 308.5, 1429, 37.7, 220]
model_auto.predict(prediction_set=prediction)
print('Expected value; Tool Wear Failure')
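Because a prediction row has to follow the same column order as the training features, it can help to check the row before calling predict. The sketch below is one way to do that in plain Python. The function check_prediction_row is hypothetical and not part of TwinAPI, and the feature names are assumed from the UCI AI4I 2020 dataset header.

```python
# Hypothetical pre-check, not part of TwinAPI.
# Feature names and types assumed from the UCI AI4I 2020 dataset header.
FEATURES = [
    ("Type", str),
    ("Air temperature [K]", (int, float)),
    ("Process temperature [K]", (int, float)),
    ("Rotational speed [rpm]", (int, float)),
    ("Torque [Nm]", (int, float)),
    ("Tool wear [min]", (int, float)),
]


def check_prediction_row(row, features=FEATURES):
    """Raise ValueError if the row does not match the expected features."""
    if len(row) != len(features):
        raise ValueError(f"expected {len(features)} values, got {len(row)}")
    for value, (name, expected_type) in zip(row, features):
        if not isinstance(value, expected_type):
            raise ValueError(f"{name!r}: unexpected value {value!r}")
    # Pair each value with its feature name for easier inspection.
    return dict(zip((name for name, _ in features), row))


named_row = check_prediction_row(['L', 298, 308.5, 1429, 37.7, 220])
print(named_row)
```

A row with a missing value or with the values in the wrong order (for example a number in the Type position) raises a ValueError instead of being passed on silently.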
Note
After defining and saving our model, we can later load the trained model to make future predictions by providing load_model and model_name.
from TwinAPI.SimuLearn.MLibrary import AutoML
model_auto = AutoML()
# load the saved model by name and predict on a new row
prediction = model_auto.predict(load_model=True, model_name='my_auto_model',
                                prediction_set=['L', 298.2, 308.6, 1433, 39.5, 7])
2.5. Saving the Model
If we are satisfied with the results, we can save our model using the savemodel method. It creates a ‘json’ file in the backend.
model_auto.savemodel('my_auto_model')
3. SUMMARY
In this tutorial, we learned how to set up and solve a classification problem using AutoML.
4. SOURCE CODE
from TwinAPI.SimuLearn import AutoML

# initiate the class
model_auto = AutoML()

# setting up the model
model_auto.setup(data="datasets/predictive_maintenance.csv",
                 x_features=['3:8'],
                 y_labels=['10'])

# train the model
model_auto.train(scoring_method='auto', leader_board=True)

# sample prediction
prediction = ['L', 298, 308.5, 1429, 37.7, 220]
model_auto.predict(prediction_set=prediction)
print('Expected value; Tool Wear Failure')

# save the model
model_auto.savemodel('my_auto_model')
Keywords: AutoML, Predictive, Machine, Faults
Dataset Reference: [https://archive.ics.uci.edu/ml/datasets/AI4I+2020+Predictive+Maintenance+Dataset] Dua, D. and Graff, C. (2019). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science.