March 17, 2008

KNIME - Part II: Partitioning of dataset into training and testing sets

Posted in: Data mining, Review

  1. Place a File Reader node (IO->Read) on the workbench and configure it to load a dataset from a file.
  2. Place a Partitioning node (Data Manipulation->Row) on the workbench and connect the output port of the File Reader node to its input port.
    • Configure it by choosing Relative and setting the value to 80%.
    • Ensure the Draw randomly box is checked.
  3. Place two CSV Writer nodes (IO->Write) on the workbench. Connect the first output port of the Partitioning node to the input port of the first CSV Writer and configure it to save the first set (the training set) to a file. Connect the second output port of the Partitioning node to the input port of the second CSV Writer and configure it to save the second set (the testing set) to a file.
  4. Execute all nodes.
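
For readers who want to see what the workflow above amounts to outside KNIME, here is a minimal Python sketch of an 80% random split followed by writing two CSV files. It is not KNIME's implementation; the function names (partition_rows, split_csv), the seed parameter, and the assumption that the input file has a single header row are all mine, chosen for illustration.

```python
import csv
import random


def partition_rows(rows, train_fraction=0.8, seed=None):
    """Randomly split rows into (training, testing) lists.

    Hypothetical helper: mimics a relative, randomly drawn partition.
    Rows keep their original relative order inside each set.
    """
    if not 0.0 <= train_fraction <= 1.0:
        raise ValueError("train_fraction must be between 0 and 1")
    rng = random.Random(seed)  # a fixed seed makes the split repeatable
    n_train = round(len(rows) * train_fraction)
    # Draw the indices of the training rows without replacement.
    train_idx = set(rng.sample(range(len(rows)), n_train))
    train = [row for i, row in enumerate(rows) if i in train_idx]
    test = [row for i, row in enumerate(rows) if i not in train_idx]
    return train, test


def split_csv(in_path, train_path, test_path, train_fraction=0.8, seed=None):
    """Read a CSV file with one header row and write two partitions.

    Assumption: the first line of the input file is a header, which is
    copied to both output files.
    """
    with open(in_path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)
        rows = list(reader)
    train, test = partition_rows(rows, train_fraction, seed)
    for path, part in ((train_path, train), (test_path, test)):
        with open(path, "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(header)
            writer.writerows(part)
```

Calling split_csv("data.csv", "train.csv", "test.csv", seed=42) would play the role of the File Reader, Partitioning, and two CSV Writer nodes together; the file names here are placeholders.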

As the procedure above shows, randomly partitioning a dataset into a training set and a testing set is very easy. However, KNIME does not seem to contain other algorithms for partitioning datasets, such as the Kennard and Stone algorithm.
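
To illustrate how the Kennard and Stone approach differs from a random draw, here is a small Python sketch of the algorithm as it is commonly described: start with the two samples that are farthest apart, then repeatedly add the sample whose distance to its nearest already-selected sample is largest. The function name kennard_stone, the use of Euclidean distance, and the tie-breaking rule (lowest index wins) are assumptions of this sketch, not something taken from KNIME.

```python
import math


def kennard_stone(points, n_select):
    """Return the indices of n_select points chosen by Kennard-Stone.

    points is a sequence of equal-length numeric tuples. Distances are
    Euclidean (an assumption of this sketch). Requires Python 3.8+ for
    math.dist.
    """
    n = len(points)
    if not 2 <= n_select <= n:
        raise ValueError("n_select must be between 2 and len(points)")

    # Step 1: find the pair of points that are farthest apart.
    best = (-1.0, 0, 1)
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(points[i], points[j])
            if d > best[0]:
                best = (d, i, j)
    _, a, b = best
    selected = [a, b]
    remaining = set(range(n)) - {a, b}

    # min_dist[k] holds the distance from point k to its nearest selected point.
    min_dist = [
        min(math.dist(p, points[a]), math.dist(p, points[b])) for p in points
    ]

    # Step 2: repeatedly add the point farthest from the selected set.
    while len(selected) < n_select:
        # Ties are broken in favour of the lowest index.
        nxt = max(remaining, key=lambda k: (min_dist[k], -k))
        selected.append(nxt)
        remaining.remove(nxt)
        for k in remaining:
            min_dist[k] = min(min_dist[k], math.dist(points[k], points[nxt]))
    return selected
```

The selected indices would form the training set and the remaining rows the testing set. Unlike the random draw, this selection is deterministic and tends to spread the training samples across the whole range of the data.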

