TANAGRA - Part V: Parameter optimization of machine learning/statistical methods

  1. Create a new diagram and configure it to load a training set from a file. This puts a Dataset operator on the diagram.
  2. Add the Define status operator (Feature selection) to the diagram under the Dataset operator and configure it to set the correct attributes as Input and Target.
  3. Add the K-NN operator (Spv learning) to the diagram under the Define status operator and configure it.
  4. Add the Cross validation operator (Spv learning assessment) to the diagram under the K-NN operator and configure it.
  5. Execute.

The above procedure shows how TANAGRA can be used to train a model and assess its performance. However, TANAGRA cannot automatically determine the optimum value of a parameter (e.g. the number of neighbours to consider (k) in the above procedure) for a machine learning/statistical method. You have to search for it manually: set a parameter value, execute, and record the overall error rate; then set another parameter value, execute again, record the overall error rate, and so on, until you have evaluated every parameter value you are interested in. The parameter value that gives the lowest overall error rate is then the optimum parameter value of the machine learning/statistical method for the training set.
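The manual search described above amounts to a simple loop over candidate parameter values. The following is only a minimal Python sketch of that loop, not something TANAGRA provides: the k-NN classifier, the fold assignment, the synthetic two-class data and the function names (knn_predict, cv_error_rate, best_k) are all assumptions made for illustration, and TANAGRA's own K-NN and Cross validation operators may differ in their details.

```python
import random
from collections import Counter


def knn_predict(train, query, k):
    """Majority vote among the k training rows nearest to `query`.

    `train` is a list of (features, label) pairs. Squared Euclidean
    distance is used because it gives the same ranking as the distance itself.
    """
    nearest = sorted(
        train,
        key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], query)),
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def cv_error_rate(data, k, folds=5):
    """Overall error rate of k-NN estimated by `folds`-fold cross-validation."""
    errors = 0
    for i in range(folds):
        # Rows whose index is congruent to i modulo `folds` form the test fold.
        test = data[i::folds]
        train = [row for j, row in enumerate(data) if j % folds != i]
        errors += sum(knn_predict(train, x, k) != y for x, y in test)
    return errors / len(data)


def best_k(data, candidates, folds=5):
    """Evaluate every candidate k and return the one with the lowest error rate."""
    rates = {k: cv_error_rate(data, k, folds) for k in candidates}
    return min(rates, key=rates.get), rates


# Hypothetical training set: two Gaussian clusters labelled "A" and "B".
rng = random.Random(0)
data = [
    ((rng.gauss(center, 1.0), rng.gauss(center, 1.0)), label)
    for label, center in (("A", 0.0), ("B", 3.0))
    for _ in range(50)
]
rng.shuffle(data)

k_opt, rates = best_k(data, candidates=[1, 3, 5, 7, 9])
for k, rate in rates.items():
    print(f"k={k}: overall error rate = {rate:.3f}")
print("lowest overall error rate at k =", k_opt)
```

Each pass through the loop corresponds to one round of "set a value, execute, record the overall error rate" in TANAGRA. If several values tie for the lowest error rate, this sketch simply keeps the first one in the candidate list.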
