Data Mining Enhancements in the November CTP of SQL Server
From Raman Iyer’s great website on SQL Server Data Mining, here are some of the new data mining improvements in the SQL Server 2008 November CTP. The November Community Technology Preview of SQL Server 2008 has just been released, and it includes major data mining improvements:
· The Microsoft_Time_Series algorithm has been enhanced to include ARIMA in addition to the existing ARTxp method, and a blending algorithm is now used to deliver more accurate and stable predictions, both short and long term, from a hybrid model. In addition, a new prediction mode allows you to add new data to time series models.
· Built-in support for holdout has been added. You can easily partition your data into training and test sets that are stored in the mining structure and are available to query after processing.
· You can now build mining models on filtered subsets of a mining structure’s data (e.g. just male customers), which means that you no longer have to create multiple mining structures and re-read the source data for such variations over a dataset.
· Drillthrough functionality has been extended to make all mining structure columns available, not just columns included in the model. This allows you to build more compact models without sacrificing the ability to produce actionable output reports like targeted mailing lists.
· The much-requested cross-validation feature has been added, allowing users to quickly validate their modeling approach by automatically building temporary models and evaluating accuracy measures across K folds. The feature is available through a new cross-validation tab under Accuracy Charts in Business Intelligence Development Studio, in addition to being accessible programmatically via a stored procedure call.
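For readers new to the idea, here is a rough sketch of what K-fold cross-validation does conceptually. It is plain Python, not the SQL Server implementation or its stored procedure, and every name in it (k_fold_indices, cross_validate, train_majority, predict_majority) is made up for illustration. The data is split into K disjoint folds, a temporary model is trained on all folds but one, and accuracy is measured on the fold that was held out.

```python
import random


def k_fold_indices(n_rows, k, seed=0):
    """Split row indices 0..n_rows-1 into k disjoint folds of near-equal size."""
    if not 1 < k <= n_rows:
        raise ValueError("k must be between 2 and the number of rows")
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, starting at position i.
    return [indices[i::k] for i in range(k)]


def cross_validate(rows, labels, k, train, predict):
    """Return one accuracy value per fold, each from a temporary model."""
    accuracies = []
    for test_idx in k_fold_indices(len(rows), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(rows)) if i not in held_out]
        # Train only on rows outside the held-out fold.
        model = train([rows[i] for i in train_idx],
                      [labels[i] for i in train_idx])
        correct = sum(predict(model, rows[i]) == labels[i] for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return accuracies


# Hypothetical stand-in model: always predict the most common training label.
def train_majority(rows, labels):
    return max(set(labels), key=labels.count)


def predict_majority(model, row):
    return model
```

Calling cross_validate(rows, labels, 5, train_majority, predict_majority) returns five accuracy values, one per fold. A wide spread between them suggests the modeling approach is sensitive to which rows it was trained on.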
Go ahead and download the CTP, work with it and give us your feedback. Also, check the SQL Server 2008 site for more information about upcoming webcasts and live chats with the product team.