Orange - Part II: Partitioning a dataset into training and testing sets

  1. Put a File widget (Data) on the canvas and configure it to load a dataset from a file.
  2. Put a Data Sampler widget (Data) on the canvas and connect the output port of the File widget to its input port.
    • Configure it by choosing Random sampling and setting the Sample size to 80%.
    • Click on the Sample Data button.
  3. Put a Save widget (Data) on the canvas. Connect the Examples output port of the Data Sampler widget to the input port of the Save widget and configure it to save the training set to a file. Then click on the Save current data button.
  4. Put a second Save widget (Data) on the canvas. Connect the Remaining Examples output port of the Data Sampler widget to the input port of this Save widget and configure it to save the testing set to a file. Then click on the Save current data button.
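
For readers who prefer a script, the same random 80/20 partition can be sketched in plain Python. This is only an illustration of the idea using the standard library, not Orange's own scripting API; the function name random_split and its parameters are hypothetical.

```python
import random


def random_split(rows, train_fraction=0.8, seed=None):
    """Randomly partition rows into a training list and a testing list.

    Hypothetical helper for illustration; it mirrors the Data Sampler
    settings used above (Random sampling, Sample size 80%).
    """
    rng = random.Random(seed)          # a seed makes the split repeatable
    indices = list(range(len(rows)))
    rng.shuffle(indices)               # random order of row indices
    n_train = round(train_fraction * len(rows))
    train = [rows[i] for i in indices[:n_train]]   # like "Examples"
    test = [rows[i] for i in indices[n_train:]]    # like "Remaining Examples"
    return train, test


if __name__ == "__main__":
    data = list(range(10))             # stand-in for the rows of a dataset
    train, test = random_split(data, train_fraction=0.8, seed=42)
    print(len(train), len(test))
```

Every row ends up in exactly one of the two lists, which matches the way the Examples and Remaining Examples ports divide the data.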

As the above procedure shows, it is very easy to partition a dataset randomly into a training set and a testing set. However, Orange does not seem to contain other algorithms, such as the Kennard and Stone algorithm, for partitioning datasets.
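
For comparison, here is a minimal sketch of the Kennard and Stone selection as it is commonly described: start with the two samples that are farthest apart, then repeatedly add the sample whose distance to its nearest already-selected sample is largest. It is written in plain Python and is not part of Orange; the function name kennard_stone, the use of Euclidean distance, and the representation of samples as tuples of numbers are assumptions made for the example.

```python
import math


def kennard_stone(points, n_train):
    """Return (train_indices, test_indices) chosen by a Kennard-Stone style rule.

    Hypothetical helper for illustration. points is a list of equal-length
    numeric tuples; Euclidean distance is assumed.
    """
    n = len(points)
    if not 2 <= n_train <= n:
        raise ValueError("n_train must be between 2 and the number of points")

    # Full pairwise distance matrix (fine for small datasets).
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Step 1: seed the training set with the two most distant points.
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: dist[pair[0]][pair[1]],
    )
    selected = [first, second]
    remaining = [k for k in range(n) if k not in (first, second)]

    # Distance from each remaining point to its nearest selected point.
    nearest = {k: min(dist[k][first], dist[k][second]) for k in remaining}

    # Step 2: repeatedly add the point farthest from the selected set.
    while len(selected) < n_train:
        chosen = max(remaining, key=lambda k: nearest[k])
        selected.append(chosen)
        remaining.remove(chosen)
        del nearest[chosen]
        for k in remaining:
            nearest[k] = min(nearest[k], dist[k][chosen])

    return selected, remaining


if __name__ == "__main__":
    samples = [(0.0,), (1.0,), (2.0,), (10.0,)]
    train_idx, test_idx = kennard_stone(samples, n_train=3)
    print(train_idx, test_idx)
```

Unlike random sampling, this selection is deterministic and spreads the training set over the whole range of the data, so the samples left for testing tend to lie inside the region covered by the training samples.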
