<?xml version="1.0" encoding="UTF-8"?><!-- generator="wordpress/2.3.2" -->
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	>
<channel>
	<title>Comments on: TANAGRA - Part III: Descriptor scaling</title>
	<link>http://voyagememoirs.com/pharmine/2008/04/24/tanagra-part-iii-descriptor-scaling/</link>
	<description>Data mining in Pharmacy</description>
	<pubDate>Sun, 20 May 2012 13:56:53 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.3.2</generator>
		<item>
		<title>By: Sue Mann</title>
		<link>http://voyagememoirs.com/pharmine/2008/04/24/tanagra-part-iii-descriptor-scaling/#comment-1201</link>
		<dc:creator>Sue Mann</dc:creator>
		<pubDate>Thu, 18 Mar 2010 06:08:12 +0000</pubDate>
		<guid>http://voyagememoirs.com/pharmine/2008/04/24/tanagra-part-iii-descriptor-scaling/#comment-1201</guid>
		<description>Hi
I am doing a PhD on medical scoring systems, using Tanagra. After randomising the data, I need to train on one set, keep the rules that were developed on that first set of randomised data, and then apply them to the testing set. Can you please advise me whether this is feasible within Tanagra?

I hope you can help me. I am using Tanagra 1.433.
Thank you.</description>
		<content:encoded><![CDATA[<p>Hi<br />
I am doing a PhD on medical scoring systems, using Tanagra. After randomising the data, I need to train on one set, keep the rules that were developed on that first set of randomised data, and then apply them to the testing set. Can you please advise me whether this is feasible within Tanagra?</p>
<p>I hope you can help me. I am using Tanagra 1.433.<br />
Thank you.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Yap Chun Wei</title>
		<link>http://voyagememoirs.com/pharmine/2008/04/24/tanagra-part-iii-descriptor-scaling/#comment-52</link>
		<dc:creator>Yap Chun Wei</dc:creator>
		<pubDate>Fri, 25 Apr 2008 13:50:13 +0000</pubDate>
		<guid>http://voyagememoirs.com/pharmine/2008/04/24/tanagra-part-iii-descriptor-scaling/#comment-52</guid>
		<description>Thank you for your explanation.</description>
		<content:encoded><![CDATA[<p>Thank you for your explanation.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Profnick</title>
		<link>http://voyagememoirs.com/pharmine/2008/04/24/tanagra-part-iii-descriptor-scaling/#comment-47</link>
		<dc:creator>Profnick</dc:creator>
		<pubDate>Thu, 24 Apr 2008 22:09:58 +0000</pubDate>
		<guid>http://voyagememoirs.com/pharmine/2008/04/24/tanagra-part-iii-descriptor-scaling/#comment-47</guid>
		<description>"However, it seems that there is no option to save the parameters used to scale the descriptors in the training set and then apply them on a testing set."

This is because in Tanagra the best way of dealing with training/test sets is to keep them together in one dataset and select and reselect rows as necessary. So you would carry out your train/test split as you describe, but simply deselect the test set when building the model, then reselect it when doing the validation. Indeed, if you use the "test" option for validation (as opposed to cross-validation or leave-one-out, LOO), the default is to use the unselected data points anyway. Since there is no way of introducing a second data file in TANAGRA, the selection/reselection options are the only way of introducing new data.</description>
		<content:encoded><![CDATA[<p>&#8220;However, it seems that there is no option to save the parameters used to scale the descriptors in the training set and then apply them on a testing set.&#8221;</p>
<p>This is because in Tanagra the best way of dealing with training/test sets is to keep them together in one dataset and select and reselect rows as necessary. So you would carry out your train/test split as you describe, but simply deselect the test set when building the model, then reselect it when doing the validation. Indeed, if you use the &#8220;test&#8221; option for validation (as opposed to cross-validation or leave-one-out, LOO), the default is to use the unselected data points anyway. Since there is no way of introducing a second data file in TANAGRA, the selection/reselection options are the only way of introducing new data.</p>
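<p>For readers who want to see the idea outside Tanagra, here is a minimal sketch in plain Python of keeping one dataset with a selection flag, deriving the scaling parameters from the selected (training) rows only, and reusing them on the unselected (test) rows. It is not Tanagra code and makes no claim about how Tanagra computes its scaling internally; the names fit_scaling, apply_scaling, descriptor and selected, and the data values, are made up for illustration.</p>

```python
from statistics import mean, pstdev


def fit_scaling(values):
    """Return (mean, std) estimated from the given training values only."""
    mu = mean(values)
    sigma = pstdev(values)
    # Guard against a constant descriptor, which would give a zero divisor.
    if sigma == 0:
        sigma = 1.0
    return mu, sigma


def apply_scaling(values, mu, sigma):
    """Standardise values with parameters that were fitted elsewhere."""
    return [(v - mu) / sigma for v in values]


# Hypothetical descriptor column; selected=True marks training rows,
# selected=False marks the held-out test rows in the same dataset.
descriptor = [1.0, 2.0, 3.0, 4.0, 10.0, 12.0]
selected = [True, True, True, True, False, False]

train = [v for v, s in zip(descriptor, selected) if s]
test = [v for v, s in zip(descriptor, selected) if not s]

# Parameters come from the training rows only ...
mu, sigma = fit_scaling(train)

# ... and are then applied unchanged to both subsets.
train_scaled = apply_scaling(train, mu, sigma)
test_scaled = apply_scaling(test, mu, sigma)
```

<p>The point of the design is that the test rows never influence mu or sigma, so the validation step sees data scaled exactly as a genuinely new sample would be.</p>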
]]></content:encoded>
	</item>
</channel>
</rss>

