<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>From a Logical Point of View &#187; data mining</title>
	<atom:link href="http://www.jiangtanghu.com/blog/category/data-mining/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.jiangtanghu.com/blog</link>
	<description>Hello World by A SAS programmer</description>
	<lastBuildDate>Wed, 25 Jan 2012 21:55:31 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Map and Reduce in MapReduce: a SAS Illustration</title>
		<link>http://www.jiangtanghu.com/blog/2011/10/04/map-and-reduce-in-mapreduce-a-sas-illustration/</link>
		<comments>http://www.jiangtanghu.com/blog/2011/10/04/map-and-reduce-in-mapreduce-a-sas-illustration/#comments</comments>
		<pubDate>Tue, 04 Oct 2011 13:31:18 +0000</pubDate>
		<dc:creator>Jiangtang Hu</dc:creator>
				<category><![CDATA[data mining]]></category>
		<category><![CDATA[Database]]></category>
		<category><![CDATA[SAS]]></category>
		<category><![CDATA[Array]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[Lisp]]></category>
		<category><![CDATA[List]]></category>
		<category><![CDATA[MapReduce]]></category>
		<category><![CDATA[Python]]></category>

		<guid isPermaLink="false">http://www.jiangtanghu.com/blog/2011/10/04/map-and-reduce-in-mapreduce-a-sas-illustration/</guid>
		<description><![CDATA[In my last post, I mentioned Hadoop, the open source implementation of Google’s MapReduce for parallelized processing of big data. During this long National Holiday, I read the original Google paper, MapReduce: Simplified Data Processing on Large Clusters by Jeffrey Dean and Sanjay Ghemawat, and learned that the terms “map” and “reduce” were basically borrowed [...]]]></description>
			<content:encoded><![CDATA[<p>In my <a href="http://www.jiangtanghu.com/blog/2011/09/14/analytical-valley/" target="_blank">last post</a>, I mentioned <a href="http://hadoop.apache.org/">Hadoop</a>, the open source implementation of Google’s <a href="http://en.wikipedia.org/wiki/Mapreduce" target="_blank">MapReduce</a> for parallelized processing of big data. During this long National Holiday, I read the original Google paper, <em><a href="http://static.googleusercontent.com/external_content/untrusted_dlcp/labs.google.com/en//papers/mapreduce-osdi04.pdf" target="_blank">MapReduce: Simplified Data Processing on Large Clusters</a></em> by Jeffrey Dean and Sanjay Ghemawat, and learned that the terms “map” and “reduce” were basically borrowed from Lisp, an old functional language in which I have never even written a “hello world”. For Python users, the idea of map and reduce is also very straightforward, because the workhorse data structure in Python is the list, a sequence of values whose elements you can imagine as the nodes (clusters, chunk servers, …) of a distributed system. </p>
<p>MapReduce is a programming framework and is really language independent, so SAS users can also get the basic idea from their daily programming practice. Here is a simple illustration using a data step array (not an array in Proc FCMP or a matrix in IML). A data step array in SAS is fundamentally not a data structure but a convenient way of processing a group of variables, yet it can also be used to perform list operations as in Python and other languages with rich data structures (an editable version can be found <a href="http://jiangtanghu.com/docs/en/MapReduce.sas" target="_blank">here</a>):</p>
<p><a href="http://www.jiangtanghu.com/blog/wp-content/uploads/2011/10/MapReduce.png"><img style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; margin-left: 0px; border-left-width: 0px; margin-right: 0px" title="MapReduce" border="0" alt="MapReduce" align="left" src="http://www.jiangtanghu.com/blog/wp-content/uploads/2011/10/MapReduce_thumb.png" width="506" height="535" /></a></p>
<p>Following the code above, the programming task is to capitalize the string “Hadoop” (Line 2). The “master” method simply capitalizes the whole string in one bundle (Line 8): a single master machine processes all the data.</p>
<p>Then we introduce the idea of “big data”: the string is too large for one master machine, so the “master” method fails. Now we distribute the task to thousands of low-cost machines (workers, slaves, chunk servers, . . . in this case, the one-dimensional array of size 6, see Line 11), and each machine does part of the job (each array element capitalizes only a single letter in sequence, see Lines 12-14). Such a distributing operation is called “<font color="#ff0000">map</font>”. In a MapReduce system, a master machine is still needed to assign the map and reduce tasks.</p>
<p>How about “<font color="#ff0000">reduce</font>”?&#160; A “reduce” operation is also called a “fold”. In Line 17, for example, it is the operation that combines all the separate values into a single value: it combines the results from multiple worker machines.</p>
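<p>For readers who prefer Python, the same idea can be sketched with a plain list. This is only an illustrative sketch, not the SAS code shown in the image above: the variable names (<code>workers</code>, <code>mapped</code>, <code>reduced</code>) are assumptions chosen for this example, and a list element stands in for a worker machine.</p>

```python
from functools import reduce

text = "Hadoop"

# "Master" method: one machine capitalizes the whole string at once.
master_result = text.upper()

# Split the job: each list element plays the role of one worker machine
# (hypothetical names, chosen only for this illustration).
workers = list(text)

# Map: every worker capitalizes its own single letter.
mapped = list(map(str.upper, workers))

# Reduce (fold): combine the partial results into a single value.
reduced = reduce(lambda left, right: left + right, mapped, "")

print(mapped)
print(reduced)
```

<p>Both routes should give the same capitalized string; the difference is only that the map step could, in principle, run on many machines at once.</p>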
]]></content:encoded>
			<wfw:commentRss>http://www.jiangtanghu.com/blog/2011/10/04/map-and-reduce-in-mapreduce-a-sas-illustration/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Feature Selection: Collections for Self Study</title>
		<link>http://www.jiangtanghu.com/blog/2011/01/15/feature-selection-collections-for-self-study/</link>
		<comments>http://www.jiangtanghu.com/blog/2011/01/15/feature-selection-collections-for-self-study/#comments</comments>
		<pubDate>Sat, 15 Jan 2011 02:58:44 +0000</pubDate>
		<dc:creator>Jiangtang Hu</dc:creator>
				<category><![CDATA[data mining]]></category>
		<category><![CDATA[feature selection]]></category>
		<category><![CDATA[fselector]]></category>
		<category><![CDATA[R]]></category>

		<guid isPermaLink="false">http://www.jiangtanghu.com/blog/2011/01/15/feature-selection-collections-for-self-study/</guid>
		<description><![CDATA[Recently I started learning the algorithms and applications of feature selection. The term “feature”, widely used in the machine learning and data mining literature, simply means “variable”. In some practices, for example, a neural network model uses a decision tree as input; the tree performs the function of variable selection. Arizona State University is [...]]]></description>
			<content:encoded><![CDATA[<p>Recently I started learning the algorithms and applications of <a href="http://en.wikipedia.org/wiki/Feature_selection" target="_blank">feature selection</a>. The term “feature”, widely used in the machine learning and data mining literature, simply means “variable”. In some practices, for example, a neural network model uses a decision tree as input; the tree performs the function of variable selection.</p>
<p>Arizona State University maintains a feature selection repository that so far includes the original documentation, Matlab packages and user guides for the following popular algorithms:</p>
<blockquote><p>BLogReg<br />
CFS<br />
Chi Square<br />
FCBF<br />
Fisher Score<br />
Gini Index<br />
Information Gain<br />
Kruskal-Wallis<br />
mRMR<br />
Relief-F<br />
SBMLR<br />
T-test<br />
SPEC<br />
<em><span style="color: #ff0000;">see</span></em> <a href="http://featureselection.asu.edu/software.php" target="_blank">http://featureselection.asu.edu/software.php</a></p></blockquote>
<p>An R package, <a href="http://cran.r-project.org/web/packages/FSelector/index.html" target="_blank">FSelector</a>, is also useful for step-by-step study. This package covers:</p>
<blockquote><p><strong>Filters:<br />
</strong>*cfs<br />
*chi-squared<br />
*consistency<br />
*correlation<br />
&#8211;linear.correlation<br />
&#8211;rank.correlation<br />
*entropy.based<br />
&#8211;information.gain<br />
&#8211;gain.ratio<br />
&#8211;symmetrical.uncertainty<br />
*OneR<br />
*random.forest.importance<br />
*relief-F</p>
<p><strong>Wrappers:</strong><br />
*best.first.search<br />
*exhaustive.search<br />
*greedy.search<br />
&#8211;backward.search<br />
&#8211;forward.search<br />
*hill.climbing.search</p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://www.jiangtanghu.com/blog/2011/01/15/feature-selection-collections-for-self-study/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Decision Trees in SAS Enterprise Miner and SPSS Clementine</title>
		<link>http://www.jiangtanghu.com/blog/2011/01/04/decision-trees-in-sas-enterprise-miner-and-spss-clementine/</link>
		<comments>http://www.jiangtanghu.com/blog/2011/01/04/decision-trees-in-sas-enterprise-miner-and-spss-clementine/#comments</comments>
		<pubDate>Tue, 04 Jan 2011 12:44:50 +0000</pubDate>
		<dc:creator>Jiangtang Hu</dc:creator>
				<category><![CDATA[data mining]]></category>
		<category><![CDATA[Industry Review]]></category>
		<category><![CDATA[SAS]]></category>
		<category><![CDATA[C5.0]]></category>
		<category><![CDATA[CART]]></category>
		<category><![CDATA[CHAID]]></category>
		<category><![CDATA[decision tree]]></category>
		<category><![CDATA[IBM]]></category>
		<category><![CDATA[QUEST]]></category>
		<category><![CDATA[SPSS]]></category>
		<category><![CDATA[SPSS Clementine]]></category>

		<guid isPermaLink="false">http://www.jiangtanghu.com/blog/2011/01/04/decision-trees-in-sas-enterprise-miner-and-spss-clementine/</guid>
		<description><![CDATA[Decision trees are included in SAS Enterprise Miner (EM). Its counterpart is SPSS Clementine, which, to be precise, should be called IBM SPSS Modeler since IBM’s acquisition of SPSS. Recently I read a paper comparing SAS EM, SPSS Clementine and IBM Intelligent Miner on their decision tree and clustering technology: Decision Tree Induction &#38; [...]]]></description>
			<content:encoded><![CDATA[<p>Decision trees are included in SAS Enterprise Miner (EM). Its counterpart is SPSS Clementine, which, to be precise, should be called IBM SPSS Modeler since IBM’s acquisition of SPSS.</p>
<p>Recently I read a paper comparing SAS EM, SPSS Clementine and IBM Intelligent Miner on their decision tree and clustering technology:</p>
<blockquote><p><em><a href="http://www.gimi.us/CLUTE_INSTITUTE/ORLANDO_2010/Article%2520452.pdf" target="_blank">Decision Tree Induction &amp; Clustering Techniques in SAS Enterprise Miner, SPSS Clementine, and IBM Intelligent Miner – A Comparative Analysis</a></em> by Abdullah M. Al Ghoson, Virginia Commonwealth University </p>
</blockquote>
<p>The conclusion is not that surprising: SAS EM does better in performance, functionality and auxiliary task support but worse in usability.</p>
<p><a href="http://www.jiangtanghu.com/blog/wp-content/uploads/2011/01/SAS_VS_SPSS.png"><img style="border-right-width: 0px; display: block; float: none; border-top-width: 0px; border-bottom-width: 0px; margin-left: auto; border-left-width: 0px; margin-right: auto" title="SAS_VS_SPSS" border="0" alt="SAS_VS_SPSS" src="http://www.jiangtanghu.com/blog/wp-content/uploads/2011/01/SAS_VS_SPSS_thumb.png" width="503" height="476" /></a> </p>
<p>Here are a few comments on the decision tree implementations in SAS EM and SPSS Clementine, based on my own experience. Some advice for beginners is also included.</p>
<p>SPSS Clementine has four nodes, each supporting one of four tree algorithms: <font color="#ff0000">C5.0</font>, Classification And Regression Trees (<font color="#ff0000">CART</font>),&#160; Quick, Unbiased, Efficient Statistical Tree (<font color="#ff0000">QUEST</font>) and Chi-squared Automatic Interaction Detector (<font color="#ff0000">CHAID</font>),&#160; which are the most famous and popular members of the decision tree family.</p>
<p><a href="http://www.jiangtanghu.com/blog/wp-content/uploads/2011/01/SPSS_4_trees.png"><img style="border-bottom: 0px; border-left: 0px; display: block; float: none; margin-left: auto; border-top: 0px; margin-right: auto; border-right: 0px" title="SPSS_4_trees" border="0" alt="SPSS_4_trees" src="http://www.jiangtanghu.com/blog/wp-content/uploads/2011/01/SPSS_4_trees_thumb.png" width="412" height="105" /></a> Note that CART(R) is a registered trademark of California Statistical Software, Inc., and is licensed exclusively to Salford Systems, San Diego, California, so SPSS Clementine uses the name C&amp;R Tree instead.</p>
<p>In SAS EM, there is only one decision tree node:</p>
<p><a href="http://www.jiangtanghu.com/blog/wp-content/uploads/2011/01/SAS_tree.png"><img style="border-bottom: 0px; border-left: 0px; display: block; float: none; margin-left: auto; border-top: 0px; margin-right: auto; border-right: 0px" title="SAS_tree" border="0" alt="SAS_tree" src="http://www.jiangtanghu.com/blog/wp-content/uploads/2011/01/SAS_tree_thumb.png" width="155" height="74" /></a> The algorithms behind this node are called the SAS tree algorithms, which incorporate and extend the four mentioned above. By changing the settings in the decision tree node, you can get the tree you want. </p>
<p>Obviously, the SAS tree algorithms are superior to the separate ones in SPSS Clementine in extensibility and flexibility. On the other hand, the complexity increases. A new user of SAS EM may wonder which tree he/she is training. An SPSS Clementine user just picks a node and says: OK, I am now training a CART or a CHAID. He/she can then communicate with others more smoothly.</p>
<p>Regardless of the industry application, I think this is the educational benefit of SPSS Clementine. Since almost every data mining book introduces decision trees algorithm by algorithm (such as ID3/C4.5/C5.0, CART, QUEST, CHAID, . . .), beginners using SPSS Clementine as an instructional tool can get a clear idea of the algorithms one by one. Once he/she fully understands the differences among the tree algorithms, he/she will train trees in SAS EM more comfortably.</p>
<p>What’s more, SPSS Clementine supplies rich supporting documentation for beginners and self-learners, such as the Tutorial, User Guide, Algorithms Guide and Node Reference. The official documentation of SAS EM 5.x and 6.x is relatively poor. Yes, there is good SAS Help and Documentation for SAS EM 4.3, including <em>Getting Started with Enterprise Miner</em>. But EM 4.3 is a traditional AF application, while EM 5.x and above are Java clients incorporated in the SAS analysis platform (they are totally different!). For EM 5.x and above, only installation guides and a plain reference are available.</p>
<p>SAS Institute may have its own marketing strategies. Although no rich references are available, the Institute DOES offer <a href="https://support.sas.com/edu/prodcourses.html?code=MINER&amp;ctry=US" target="_blank">rich training programs</a> in data mining and Enterprise Miner applications. Well, the big-budget purchasers of SAS EM can also afford the training.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.jiangtanghu.com/blog/2011/01/04/decision-trees-in-sas-enterprise-miner-and-spss-clementine/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Run data mining codes following William Potts</title>
		<link>http://www.jiangtanghu.com/blog/2009/03/20/run-data-mining-codes-following-william-potts-2/</link>
		<comments>http://www.jiangtanghu.com/blog/2009/03/20/run-data-mining-codes-following-william-potts-2/#comments</comments>
		<pubDate>Fri, 20 Mar 2009 03:41:00 +0000</pubDate>
		<dc:creator>Jiangtang Hu</dc:creator>
				<category><![CDATA[data mining]]></category>
		<category><![CDATA[SAS]]></category>
		<category><![CDATA[SAS Enterprise Miner]]></category>
		<category><![CDATA[William Potts]]></category>

		<guid isPermaLink="false">http://www.jiangtanghu.com/blog/2009/03/20/run-data-mining-codes-following-william-potts-2/</guid>
		<description><![CDATA[FYI: SAS Enterprise Miner and SAS Text Miner Procedures: Reference for SAS 9.1.3, see: &#160; http://support.sas.com/documentation/onlinedoc/miner/emtmsas913/listing.html &#160; This entry DOES exist on the SAS Support website, but it can&#8217;t be found through any search engine or the documentation tree view. You are advised to download these files immediately, because SAS hyperlinks tend to go dead quickly. ^-^ &#160; P.S. SAS Institute provides [...]]]></description>
			<content:encoded><![CDATA[<div><font face=Arial>FYI: <em>SAS Enterprise Miner and SAS Text Miner  Procedures: Reference for SAS 9.1.3</em>, see:</font></div>
<div><font face=Arial></font>&nbsp;</div>
<div><font face=Arial><a  href="http://support.sas.com/documentation/onlinedoc/miner/emtmsas913/listing.html">http://support.sas.com/documentation/onlinedoc/miner/emtmsas913/listing.html</a></font></div>
<div><font face=Arial></font>&nbsp;</div>
<div><font face=Arial>This entry DOES exist on the SAS Support website, but it can&#8217;t be found through any search engine or the documentation tree view. </font><font  face=Arial>You are advised to download these files immediately, because SAS hyperlinks tend to go dead quickly. ^-^ </font></div>
<div><font face=Arial></font>&nbsp;</div>
<div><font face=Arial>P.S. SAS Institute provides no support for the use of Enterprise Miner and Text Miner procedures when they are invoked directly, outside of the Enterprise Miner graphical user interface.</font></div>
]]></content:encoded>
			<wfw:commentRss>http://www.jiangtanghu.com/blog/2009/03/20/run-data-mining-codes-following-william-potts-2/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Free Machine Learning Courses (Stanford) in YouTube</title>
		<link>http://www.jiangtanghu.com/blog/2009/03/02/free-machine-learning-courses-stanford-in-youtube-2/</link>
		<comments>http://www.jiangtanghu.com/blog/2009/03/02/free-machine-learning-courses-stanford-in-youtube-2/#comments</comments>
		<pubDate>Mon, 02 Mar 2009 02:28:00 +0000</pubDate>
		<dc:creator>Jiangtang Hu</dc:creator>
				<category><![CDATA[data mining]]></category>
		<category><![CDATA[machine learning]]></category>
		<category><![CDATA[stanford]]></category>
		<category><![CDATA[youtube]]></category>

		<guid isPermaLink="false">http://www.jiangtanghu.com/blog/2009/03/02/free-machine-learning-courses-stanford-in-youtube-2/</guid>
		<description><![CDATA[FYI: &#160; http://www.youtube.com/view_play_list?p=A89DCFA6ADACE599]]></description>
			<content:encoded><![CDATA[<div><font face=Arial>FYI:</font></div>
<div><font face=Arial></font>&nbsp;</div>
<div><font face=Arial><a  href="http://www.youtube.com/view_play_list?p=A89DCFA6ADACE599">http://www.youtube.com/view_play_list?p=A89DCFA6ADACE599</a></font></div>
]]></content:encoded>
			<wfw:commentRss>http://www.jiangtanghu.com/blog/2009/03/02/free-machine-learning-courses-stanford-in-youtube-2/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>SAS User Books and Data Mining Software Comparison: Quick Links</title>
		<link>http://www.jiangtanghu.com/blog/2009/02/17/sas-user-books-and-data-mining-software-comparision-quick-links-2/</link>
		<comments>http://www.jiangtanghu.com/blog/2009/02/17/sas-user-books-and-data-mining-software-comparision-quick-links-2/#comments</comments>
		<pubDate>Tue, 17 Feb 2009 06:55:00 +0000</pubDate>
		<dc:creator>Jiangtang Hu</dc:creator>
				<category><![CDATA[data mining]]></category>
		<category><![CDATA[SAS]]></category>

		<guid isPermaLink="false">http://www.jiangtanghu.com/blog/2009/02/17/sas-user-books-and-data-mining-software-comparision-quick-links-2/</guid>
		<description><![CDATA[SAS Books Catalog (Jan. 2009) Data Mining Software 2009: Successful Analyses at Affordable Prices (Nov. 2008, by mayato)]]></description>
			<content:encoded><![CDATA[<ol>
<li><span style="font-family: Arial; font-size: x-small;"><a href="http://support.sas.com/publishing/pdfs/jan09.pdf">SAS Books Catalog (Jan. 2009)</a></span></li>
<li><span style="font-family: Arial; font-size: x-small;"><em><a href="http://www.mayato.com/index.php?option=com_content&amp;task=view&amp;id=105&amp;Itemid=96&amp;lang=en">Data Mining Software 2009: Successful Analyses at Affordable Prices</a></em> (Nov. 2008, by mayato)</span></li>
</ol>
]]></content:encoded>
			<wfw:commentRss>http://www.jiangtanghu.com/blog/2009/02/17/sas-user-books-and-data-mining-software-comparision-quick-links-2/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Industry Review: SAS and Teradata Partnership</title>
		<link>http://www.jiangtanghu.com/blog/2008/12/19/industry-review-sas-and-teradata-partnership-2/</link>
		<comments>http://www.jiangtanghu.com/blog/2008/12/19/industry-review-sas-and-teradata-partnership-2/#comments</comments>
		<pubDate>Fri, 19 Dec 2008 02:44:00 +0000</pubDate>
		<dc:creator>Jiangtang Hu</dc:creator>
				<category><![CDATA[data mining]]></category>
		<category><![CDATA[Industry Review]]></category>
		<category><![CDATA[Anti-Money Laundering]]></category>
		<category><![CDATA[Business Analytics]]></category>
		<category><![CDATA[Business Intelligence]]></category>
		<category><![CDATA[Credit Risk]]></category>
		<category><![CDATA[Data Warehouse]]></category>
		<category><![CDATA[Enterprise Intelligence and Optimization Services]]></category>
		<category><![CDATA[IBM]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[SAP]]></category>
		<category><![CDATA[SAS]]></category>
		<category><![CDATA[Teradata]]></category>

		<guid isPermaLink="false">http://www.jiangtanghu.com/blog/2008/12/19/industry-review-sas-and-teradata-partnership-2/</guid>
		<description><![CDATA[SAS and Teradata Partnership: Press Leading Companies See Value in SAS and Teradata Partnership SAS and Teradata Unveil Advantage Program to Bring Powerful In-Database Solutions and Services to Customers SAS and Teradata Enter into Strategic Partnership In the BI industry, the pure players such as SAS, Teradata and MicroStrategy need to demonstrate their indispensable value against [...]]]></description>
			<content:encoded><![CDATA[<p class="MsoNormal" style="margin: 0in 0in 0pt;"><span style="font-family: Arial; font-size: 100%;">SAS and Teradata  Partnership: Press</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt;"><span style="font-family: Arial; font-size: 100%;"> </span></p>
<ol>
<li><span style="font-family: Arial; font-size: 100%;"><a href="http://www.sas.com/news/preleases/SASandTeradataAdvantageProgram.html">Leading Companies See Value in SAS and Teradata  Partnership</a></span></li>
<li><span style="font-family: Arial; font-size: 100%;"><a href="http://www.sas.com/news/preleases/TeradataAdvantageProgram.html">SAS and Teradata Unveil Advantage Program to Bring Powerful In-Database  Solutions and Services to Customers</a></span></li>
<li><span style="font-family: Arial; font-size: 100%;"><a href="http://www.teradata.com/t/page/173874/index.html">SAS and  Teradata Enter into Strategic Partnership</a></span><span style="font-size: 100%;"><span style="font-family: Arial; font-size: 10;"><br />
</span></span></li>
</ol>
<p class="MsoNormal" style="margin: 0in 0in 0pt;"><span style="font-family: Arial; font-size: 100%;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt;"><span style="font-size: 100%;"><span style="font-family: Arial; font-size: 10;"><br />
</span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt;"><span style="font-size: 100%;"><span style="font-family: Arial; font-size: 10;">In the BI industry, the pure players such as SAS, Teradata and MicroStrategy need to demonstrate their indispensable value against the megavendors: IBM (which acquired Cognos), SAP (which acquired Business Objects), Oracle (which acquired Hyperion) and Microsoft. Teradata is solely focused on the enterprise data warehouse. SAS, dominant in business analytics (e.g. advanced statistics and data mining), will act as a check and balance on the BI industry thanks to its privately held structure. The SAS and Teradata Advantage Program partnership covers wide business lines, such as Analytics, AML (Anti-Money Laundering), Credit Risk, and Enterprise Intelligence and Optimization Services. I think it&#8217;s an effective way to learn from each other in mutual emulation and</span><span style="font-family: Times New Roman;"> </span><span style="font-family: Arial; font-size: 10;">counterbalance the </span><span class="wbtrmn"><span style="font-family: Times New Roman;">concentrated</span></span><span style="font-family: Arial; font-size: 10;"> market.</span></span></p>
]]></content:encoded>
			<wfw:commentRss>http://www.jiangtanghu.com/blog/2008/12/19/industry-review-sas-and-teradata-partnership-2/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Data Mining in Stock Market</title>
		<link>http://www.jiangtanghu.com/blog/2008/12/13/data-mining-in-stock-market-2/</link>
		<comments>http://www.jiangtanghu.com/blog/2008/12/13/data-mining-in-stock-market-2/#comments</comments>
		<pubDate>Sat, 13 Dec 2008 15:55:00 +0000</pubDate>
		<dc:creator>Jiangtang Hu</dc:creator>
				<category><![CDATA[data mining]]></category>
		<category><![CDATA[Quantitative finance]]></category>
		<category><![CDATA[decision tree]]></category>
		<category><![CDATA[parameter tuning]]></category>
		<category><![CDATA[predictive models]]></category>
		<category><![CDATA[stock market]]></category>

		<guid isPermaLink="false">http://www.jiangtanghu.com/blog/2008/12/13/data-mining-in-stock-market-2/</guid>
		<description><![CDATA[Data mining in the stock market? Is it crazy? Or is it just a hopeless attempt? Every mentor in mathematics and finance teaches us that the stock market is too chaotic and sentimental for mathematical models. Most gifted rocket scientists concentrate on the study of interest rates and fixed income securities. [...]]]></description>
			<content:encoded><![CDATA[<p>Data mining in the stock market? Is it crazy? Or is it just a hopeless attempt? Every mentor in mathematics and finance teaches us that the stock market is too chaotic and <strong>sentimental</strong> for mathematical models. Most gifted rocket scientists concentrate on the study of interest rates and fixed income securities. It sounds profitable to use mathematical and statistical models to predict stock prices, but there are few success stories.</p>
<p>I know I may be holding on to some academic dogma, so I am interested in following any effort to forecast stock prices using data mining techniques. Some links from a popular data mining blog, <a href="http://dataminingresearch.blogspot.com/"><em>Data Mining Research</em></a>, are listed below:</p>
<ul>
<li><a href="http://dataminingresearch.blogspot.com/2008/09/stock-prediction-using-decision-tree.html">Stock Prediction using Decision Tree</a></li>
<li><a href="http://dataminingresearch.blogspot.com/2008/12/stock-picking-using-data-mining_12.html">Stock Picking using Data Mining: Parameter Tuning</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.jiangtanghu.com/blog/2008/12/13/data-mining-in-stock-market-2/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

