KNIME - Konstanz Information Miner (version 1.3.3)
According to its official website, “KNIME is a modular data exploration platform that enables the user to visually create data flows (often referred to as pipelines), selectively execute some or all analysis steps, and later investigate the results through interactive views on data and models”. KNIME is distributed under a non-profit open source license which “allows KNIME to be downloaded, distributed, and used freely as long as the software or its use is not distributed per profit”.
If you install the current version of KNIME with all its optional plugins, you get a total of 189 nodes, distributed as follows:
- IO: 11
- Database: 2
- Data manipulation: 36
- Data views: 21
- Statistics: 4
- Machines: 28
- Chemistry: 22
- Meta: 7
- Misc: 3
- Weka: 47
- Python: 3
- R: 4
- Reporting: 2
However, since I am interested in using it for QSAR experiments, I will only examine the nodes that are relevant to that task. Basically, KNIME can only read data from three sources: ARFF files (the Weka file format), delimited text files (which include csv files), and databases. There are no nodes for reading SVMlight, LIBSVM, or Microsoft Excel files. The lack of support for Microsoft Excel files is no big deal, since you can easily convert them to csv format from within Microsoft Excel. However, the lack of support for SVMlight and LIBSVM files will inconvenience users who already work with these two popular support vector machine packages.
At first sight, KNIME does not seem to have any descriptor selection capability. I will explore this in more detail when I start testing properly.
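As a possible workaround for the missing SVMlight/LIBSVM reader, the sparse files can be converted to csv before loading them into KNIME. The following is only a minimal sketch in Python, not anything KNIME provides. It assumes the usual `label index:value ...` line layout with 1-based indices, treats missing indices as 0, and skips tokens such as `qid:` that do not have a numeric index. The function names and the `d1, d2, ...` column names are my own choices for illustration.

```python
import csv
import sys


def parse_libsvm_line(line):
    """Parse one 'label index:value ...' line into (label, {index: value}).

    Returns None for blank lines or lines that contain only a comment.
    """
    line = line.split("#", 1)[0].strip()  # drop a trailing '# ...' comment
    if not line:
        return None
    tokens = line.split()
    label = tokens[0]
    features = {}
    for token in tokens[1:]:
        index, value = token.split(":", 1)
        if not index.isdigit():
            continue  # assumption: ignore non-numeric keys such as 'qid'
        features[int(index)] = float(value)
    return label, features


def libsvm_to_csv(lines, out, n_features=None):
    """Write sparse LIBSVM/SVMlight-style lines to 'out' as dense csv rows."""
    parsed = (parse_libsvm_line(line) for line in lines)
    rows = [row for row in parsed if row is not None]
    if n_features is None:
        # Use the largest index seen anywhere in the file.
        n_features = max((max(f) for _, f in rows if f), default=0)
    columns = range(1, n_features + 1)
    writer = csv.writer(out)
    writer.writerow(["label"] + [f"d{i}" for i in columns])
    for label, features in rows:
        writer.writerow([label] + [features.get(i, 0.0) for i in columns])


if __name__ == "__main__":
    # Tiny made-up sample; a real run would pass an open file instead.
    sample = ["1 1:0.5 3:2", "-1 2:1.5 # second compound"]
    libsvm_to_csv(sample, sys.stdout)
```

The resulting file has a header row and one dense row per compound, so it should be readable by any node that accepts delimited text.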
Currently, KNIME contains 3 algorithms for developing regression models and 8 algorithms for constructing classification models. I did not count the algorithms under the Weka branch, because those nodes are just wrappers around algorithms that are present in Weka and cannot load or save developed models.
KNIME contains a Cross validation meta-node. Although the website states that KNIME also has boosting and bagging nodes, they were not present in the downloadable version.
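For readers unfamiliar with what a cross-validation step does, here is a rough sketch of k-fold splitting in Python. It illustrates the general technique only. I have not checked how KNIME's meta-node partitions the data, and the function name and the `seed` parameter are my own.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation.

    Every sample lands in exactly one test fold; the other k - 1 folds
    form the training set for that round.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle reproducibly
    folds = [indices[i::k] for i in range(k)]  # k near-equal folds
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


if __name__ == "__main__":
    for train, test in k_fold_indices(n_samples=10, k=5):
        print(len(train), "training rows,", len(test), "test rows")
```

In a QSAR setting, a model would be trained on each `train` subset and evaluated on the matching `test` subset, and the k scores averaged.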
Overall, my first impression of KNIME is that it has a very good graphical user interface and seems easy to use. However, it may not contain sufficient tools for a full QSAR experiment.