Orange - Part V: Parameter optimization of machine learning/statistical methods
- Place the File widget (Data) on the canvas and configure it to load a training set from a file.
- Place the Select Attributes widget (Data) on the canvas and connect the output port from the File widget to its input port.
- Specify the attributes and class for the training set.
- Click on the Apply button.
- Place the K Nearest Neighbours widget (Classify) on the canvas and connect the output port from the Select Attributes widget to its input port.
- Configure it by setting Number of neighbours to 3.
- Click on the Apply button.
- Place the Test Learners widget (Evaluate) on the canvas.
- Connect the output port from the K Nearest Neighbours widget to its Learner input port.
- Connect the output port from the Select Attributes widget to its Data input port.
- Configure it by choosing Cross-validation and setting the Number of folds to 10.
The above procedure shows how Orange can be used to train a model and assess its performance. However, it cannot automatically determine the optimum parameter value (e.g. the number of neighbours to consider, k, in the above procedure) for a machine learning/statistical method. You have to find it manually: set a parameter value, run the evaluation, and record the overall error rate, then repeat with each of the other parameter values you are interested in. The parameter value that gives the lowest overall error rate is the optimum parameter value of the machine learning/statistical method for the training set.
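The manual search described above amounts to a loop over candidate parameter values. The following is a minimal sketch in plain Python, not Orange's scripting API: the toy data set, the candidate values of k, and the helper names `make_toy_data`, `knn_predict` and `cv_error_rate` are all assumptions made for illustration. It evaluates each k with 10-fold cross-validation and keeps the one with the lowest overall error rate.

```python
import math
import random
from collections import Counter


def make_toy_data(n_per_class=30, seed=0):
    """Two noisy 2-D clusters (hypothetical stand-in for a real training set)."""
    rng = random.Random(seed)
    data = []
    for label, (cx, cy) in enumerate([(0.0, 0.0), (2.0, 2.0)]):
        for _ in range(n_per_class):
            data.append(((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)), label))
    return data


def knn_predict(train, point, k):
    """Return the majority class among the k training examples nearest to point."""
    nearest = sorted(train, key=lambda ex: math.dist(ex[0], point))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def cv_error_rate(data, k, folds=10, seed=0):
    """Overall error rate of k-NN under cross-validation with the given number of folds."""
    shuffled = data[:]
    # A fixed seed gives every k the same folds, so the comparison is fair.
    random.Random(seed).shuffle(shuffled)
    errors = 0
    for i in range(folds):
        test = shuffled[i::folds]  # every folds-th example, starting at i
        train = [ex for j, ex in enumerate(shuffled) if j % folds != i]
        errors += sum(knn_predict(train, x, k) != y for x, y in test)
    return errors / len(shuffled)


data = make_toy_data()
candidates = [1, 3, 5, 7, 9]  # assumed values of k to try

# "Set a value, run, record the error rate", repeated for every candidate.
error_rates = {k: cv_error_rate(data, k) for k in candidates}
best_k = min(error_rates, key=error_rates.get)

for k in candidates:
    print(f"k={k}: error rate {error_rates[k]:.3f}")
print("lowest error rate at k =", best_k)
```

The value reported as best is only the best among the candidates tried, and only for this data set, which matches the manual procedure: it cannot say anything about parameter values that were never evaluated.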