{"id":336896,"date":"2017-01-03T01:43:33","date_gmt":"2017-01-03T09:43:33","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-event&#038;p=336896"},"modified":"2025-08-06T11:58:42","modified_gmt":"2025-08-06T18:58:42","slug":"uncertainty-99","status":"publish","type":"msr-event","link":"https:\/\/www.microsoft.com\/en-us\/research\/event\/uncertainty-99\/","title":{"rendered":"Uncertainty 99"},"content":{"rendered":"\n\n<p><strong>Venue:<\/strong>\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/doubletree3.hilton.com\/en\/hotels\/florida\/bahia-mar-fort-lauderdale-beach-a-doubletree-by-hilton-hotel-FLLBMDT\/index.html\" target=\"_blank\">Radisson Bahia Mar Hotel<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n<p>Uncertainty 99 was the Seventh International Workshop on Artificial Intelligence and Statistics and was\u00a0presented by\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.gatsby.ucl.ac.uk\/aistats\/society.html\" target=\"_blank\">The Society for Artificial Intelligence & Statistics<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>.<\/p>\n<h2>Program Committee<\/h2>\n<p>\t\t\t<div class=\"ms-grid \">\n\t\t\t<div class=\"ms-row\">\n\t\t\t\t\t<div  class=\"m-col-12-24\" >\n\t\t<ul><li>Russell Almond, Educational Testing Service<\/li><li>Chris Bishop, Microsoft Research<\/li><li>Wray Buntine, Ultimode Systems<\/li><li>Peter Cheeseman, NASA Ames<\/li><li>Max Chickering, Microsoft Research<\/li><li>Paul Cohen, University of Massachusetts<\/li><li>Greg Cooper, University of Pittsburgh<\/li><li>Philip Dawid, University College London<\/li><li>David Dowe, Monash University<\/li><li>William DuMouchel, AT&T Labs<\/li><li>Sue Dumais, Microsoft Research<\/li><li>David Edwards, Novo 
Nordisk<\/li><li>Doug Fisher, Vanderbilt University<\/li><li>Nir Friedman, Hebrew University\u2013Jerusalem<\/li><li>Dan Geiger, Technion<\/li><li>Edward George, University of Texas<\/li><li>Clark Glymour, Carnegie-Mellon University<\/li><li>Moises Goldszmidt, SRI International<\/li><li>David Hand, Open University<\/li><li>Geoff Hinton, University of Toronto<\/li><li>Tommi Jaakkola, MIT<\/li><li>Michael Jordan, UC Berkeley<\/li><\/ul><p>\t<\/div>\n\t \t<div  class=\"m-col-12-24\" >\n\t\t<\/p><ul><li>Michael Kearns, AT&T Labs<\/li><li>Daphne Koller, Stanford University<\/li><li>Steffen Lauritzen, Aalborg University<\/li><li>Hans Lenz, Free University of Berlin<\/li><li>David Lewis, AT&T Labs<\/li><li>David Madigan, University of Washington<\/li><li>Andrew Moore, Carnegie-Mellon University<\/li><li>Daryl Pregibon, AT&T Labs<\/li><li>Thomas Richardson, University of Washington<\/li><li>Alberto Roverato, Universita di Modena<\/li><li>Lawrence Saul, AT&T Labs<\/li><li>Ross Shachter, Stanford University<\/li><li>Richard Scheines, Carnegie-Mellon University<\/li><li>Sebastian Seung, MIT<\/li><li>Prakash Shenoy, University of Kansas<\/li><li>Padhraic Smyth, UC Irvine<\/li><li>David Spiegelhalter, MRC\u2013Cambridge<\/li><li>Peter Spirtes, Carnegie-Mellon University<\/li><li>Milan Studeny, Academy of Sciences of Czech Republic<\/li><li>Nanny Wermuth, Mainz University<\/li><\/ul><p>\t<\/div>\n\t<\/p>\t\t\t<\/div>\n\t\t<\/div>\n\t\t<span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n<h2>Monday, January 4<\/h2>\n<table class=\"msr-table-schedule\" style=\"border-spacing: inherit;border-collapse: collapse\">\n<thead class=\"thead\">\n<tr class=\"tr\">\n<th class=\"th\" style=\"padding: inherit;border: inherit\">Time<\/th>\n<th class=\"th\" style=\"padding: inherit;border: inherit\">Session<\/th>\n<th class=\"th\" style=\"padding: inherit;border: inherit\">Speaker<\/th>\n<\/tr>\n<\/thead>\n<tbody class=\"tbody\">\n<tr 
class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">\n<div class=\"msr-table-schedule-cell\">7:30\u20138:45<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div class=\"msr-table-schedule-cell\">\n<p>Registration\/Continental Breakfast<\/p>\n<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">8:45\u20139:00<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div class=\"msr-table-schedule-cell\">\n<p>Opening Comments<\/p>\n<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">David Heckerman and Joe Whittaker<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">\n<div class=\"msr-table-schedule-cell\">9:00\u201311:00<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div class=\"msr-table-schedule-cell\">\n<p><strong>Session I:\u00a0Model Choice<br \/>\n<\/strong><strong>Chair:<\/strong> Thomas Richardson<\/p>\n<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\">Process-oriented evaluation: The next step<\/td>\n<td style=\"padding: inherit;border: inherit\">Pedro Domingos<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\">Model choice<\/td>\n<td style=\"padding: inherit;border: inherit\">Alan Gelfand and Sujit Ghosh<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: 
inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\">A note on the comparison of polynomial selection methods<\/td>\n<td style=\"padding: inherit;border: inherit\">Murlikrishna Viswanathan<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\">Pattern discovery via entropy minimization<\/td>\n<td style=\"padding: inherit;border: inherit\">Matthew Brand<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">11:00\u201311:30<\/td>\n<td style=\"padding: inherit;border: inherit\">Break<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">11:30\u201312:30<\/td>\n<td style=\"padding: inherit;border: inherit\"><strong>Session II: <\/strong><b>Latent variables<br \/>\n<\/b><strong>Chair:<\/strong> Kathryn Laskey<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\">On the geometry of DAG models with hidden variables<\/td>\n<td style=\"padding: inherit;border: inherit\">Dan Geiger, David Heckerman, Henry King, Chris Meek<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\">Efficient structure search in the presence of latent variables<\/td>\n<td style=\"padding: inherit;border: inherit\">Thomas Richardson, Heiko Bailer, Moulinath Banerjee<\/td>\n<td style=\"padding: inherit;border: 
inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">12:30\u20131:30<\/td>\n<td style=\"padding: inherit;border: inherit\">Lunch (<em>provided<\/em>)<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">1:30\u20135:00<\/td>\n<td style=\"padding: inherit;border: inherit\">Break<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">5:00\u20136:00<\/td>\n<td style=\"padding: inherit;border: inherit\"><strong>Poster Summaries<\/strong> (2 mins\/poster)<br \/>\n<strong>Chair:<\/strong> Joe Whittaker<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">6:00\u20137:00<\/td>\n<td style=\"padding: inherit;border: inherit\">Dinner<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">7:00\u20139:30<\/td>\n<td style=\"padding: inherit;border: inherit\">Poster Sessions<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h2>Tuesday, January 5<\/h2>\n<table class=\"msr-table-schedule\" style=\"border-spacing: inherit;border-collapse: collapse\">\n<thead class=\"thead\">\n<tr class=\"tr\">\n<th class=\"th\" style=\"padding: inherit;border: inherit\">Time<\/th>\n<th class=\"th\" style=\"padding: inherit;border: inherit\">Session<\/th>\n<th class=\"th\" style=\"padding: inherit;border: 
inherit\">Speaker<\/th>\n<\/tr>\n<\/thead>\n<tbody class=\"tbody\">\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">\n<div class=\"msr-table-schedule-cell\">8:00\u20139:00<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div class=\"msr-table-schedule-cell\">\n<p>Continental Breakfast<\/p>\n<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">9:00\u201310:00<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div class=\"msr-table-schedule-cell\">\n<p><strong>Session III: Theory<br \/>\nChair:<\/strong> David Madigan<\/p>\n<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div class=\"msr-table-schedule-cell\">\n<p>Conditional products: an alternative approach to conditional independence<\/p>\n<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">Phil Dawid, Milan Studeny<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\">Hierarchical mixtures-of-experts for generalized linear models: some results on denseness and consistency<\/td>\n<td style=\"padding: inherit;border: inherit\">Wenxin Jiang, Martin Tanner<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">10:00\u201310:30<\/td>\n<td style=\"padding: inherit;border: inherit\">Coffee Break<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: 
inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">10:30\u201311:30<\/td>\n<td style=\"padding: inherit;border: inherit\"><strong>Session IV: Regression<br \/>\nChair:<\/strong> Padhraic Smyth<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\">Boosting methodology for regression problems<\/td>\n<td style=\"padding: inherit;border: inherit\">Greg Ridgeway, David Madigan, Thomas Richardson<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\">Probabilistic kernel regression models<\/td>\n<td style=\"padding: inherit;border: inherit\">Tommi Jaakkola, David Haussler<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">11:30\u201312:30<\/td>\n<td style=\"padding: inherit;border: inherit\"><strong>Session V: Computational Methods<br \/>\nChair:<\/strong> Padhraic Smyth<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\">Learning structure from data efficiently: applying bounding techniques<\/td>\n<td style=\"padding: inherit;border: inherit\">Nir Friedman, Lise Getoor<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\">Efficient mining of statistical dependencies<\/td>\n<td style=\"padding: 
inherit;border: inherit\">Tim Oates, Paul Cohen, Casey Durfee<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">12:30\u20132:00<\/td>\n<td style=\"padding: inherit;border: inherit\">Lunch (<em>provided<\/em>)<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">2:00\u20133:30<\/td>\n<td style=\"padding: inherit;border: inherit\"><strong>Session VI: Applications<\/strong><\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\">Causal mechanisms and classification trees for predicting chemical carcinogens<\/td>\n<td style=\"padding: inherit;border: inherit\">Louis A Cox<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\">Geometric modelling of a nuclear environment<\/td>\n<td style=\"padding: inherit;border: inherit\">Jan De Geeter<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\">Modeling decision tree performance with the power law<\/td>\n<td style=\"padding: inherit;border: inherit\">Lewis Frey, Doug Fisher<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">3:30\u20134:40<\/td>\n<td style=\"padding: inherit;border: inherit\">Business Meeting<\/td>\n<td style=\"padding: inherit;border: 
inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h2>Wednesday, January 6<\/h2>\n<table class=\"msr-table-schedule\" style=\"border-spacing: inherit;border-collapse: collapse\">\n<thead class=\"thead\">\n<tr class=\"tr\">\n<th class=\"th\" style=\"padding: inherit;border: inherit\">Time<\/th>\n<th class=\"th\" style=\"padding: inherit;border: inherit\">Session<\/th>\n<th class=\"th\" style=\"padding: inherit;border: inherit\">Speaker<\/th>\n<\/tr>\n<\/thead>\n<tbody class=\"tbody\">\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">\n<div class=\"msr-table-schedule-cell\">8:00\u20139:00<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div class=\"msr-table-schedule-cell\">\n<p>Continental Breakfast<\/p>\n<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">9:00\u201310:30<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div class=\"msr-table-schedule-cell\">\n<p><strong>Session VII: Inference<br \/>\nChair:<\/strong> Greg Cooper<\/p>\n<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div class=\"msr-table-schedule-cell\">\n<p>Model-independent mean field theory as a local method for approximate propagation of information<\/p>\n<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">Michael Haft, Reimar Hofmann, Volker Tresp<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\">Hierarchical IFA belief 
networks<\/td>\n<td style=\"padding: inherit;border: inherit\">Hagai Attias<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\">Stochastic local search for Bayesian networks<\/td>\n<td style=\"padding: inherit;border: inherit\">Kalev Kask, Rina Dechter<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">10:30\u201311:00<\/td>\n<td style=\"padding: inherit;border: inherit\">Coffee Break<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">11:00\u201312:00<\/td>\n<td style=\"padding: inherit;border: inherit\"><strong>Session VIII: Applications<br \/>\nChair:<\/strong> Doug Fisher<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\">An experiment in causal discovery using a pneumonia database<\/td>\n<td style=\"padding: inherit;border: inherit\">Peter Spirtes, Greg Cooper<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\">Bayesian graphical models for non-compliance in randomized trials<\/td>\n<td style=\"padding: inherit;border: inherit\">David Madigan<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">12:00\u201312:15<\/td>\n<td style=\"padding: inherit;border: inherit\">Closing Remarks<\/td>\n<td 
style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><em>Plenary Presentations lasted 25 minutes with 5 minutes for questions.<\/em><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n<h2>Tutorials | January 3<\/h2>\n<p>\t<div data-wp-context='{\"items\":[]}' data-wp-interactive=\"msr\/accordion\">\n\t\t\t\t\t<div class=\"clearfix\">\n\t\t\t\t<div\n\t\t\t\t\tclass=\"btn-group align-items-center mb-g float-sm-right\"\n\t\t\t\t\tdata-bi-aN=\"accordion-collapse-controls\"\n\t\t\t\t>\n\t\t\t\t\t<button\n\t\t\t\t\t\tclass=\"btn btn-link m-0\"\n\t\t\t\t\t\tdata-bi-cN=\"Expand all\"\n\t\t\t\t\t\tdata-wp-bind--aria-controls=\"state.ariaControls\"\n\t\t\t\t\t\tdata-wp-bind--aria-expanded=\"state.ariaExpanded\"\n\t\t\t\t\t\tdata-wp-bind--disabled=\"state.isAllExpanded\"\n\t\t\t\t\t\tdata-wp-class--inactive=\"state.isAllExpanded\"\n\t\t\t\t\t\tdata-wp-on--click=\"actions.onExpandAll\"\n\t\t\t\t\t\ttype=\"button\"\n\t\t\t\t\t>\n\t\t\t\t\t\tExpand all\t\t\t\t\t<\/button>\n\t\t\t\t\t<span aria-hidden=\"true\"> | <\/span>\n\t\t\t\t\t<button\n\t\t\t\t\t\tclass=\"btn btn-link m-0\"\n\t\t\t\t\t\tdata-bi-cN=\"Collapse all\"\n\t\t\t\t\t\tdata-wp-bind--aria-controls=\"state.ariaControls\"\n\t\t\t\t\t\tdata-wp-bind--aria-expanded=\"state.ariaExpanded\"\n\t\t\t\t\t\tdata-wp-bind--disabled=\"state.isAllCollapsed\"\n\t\t\t\t\t\tdata-wp-class--inactive=\"state.isAllCollapsed\"\n\t\t\t\t\t\tdata-wp-on--click=\"actions.onCollapseAll\"\n\t\t\t\t\t\ttype=\"button\"\n\t\t\t\t\t>\n\t\t\t\t\t\tCollapse all\t\t\t\t\t<\/button>\n\t\t\t\t<\/div>\n\t\t\t<\/div>\n\t\t\t\t<ul class=\"msr-accordion\">\n\t\t\t\t\t\t\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2186\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2186\"\n\t\t\t\tclass=\"btn 
btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2185\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tInformation Access and Retrieval\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2185\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2186\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Speaker:<\/strong>\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/sdumais\/\" target=\"_blank\">Susan Dumais<\/a>, Microsoft Research |\u00a08:30 AM \u2013\u00a010:30 AM<\/p>\n<p>The Web has made literally terabytes of information available at the click of a mouse. The challenge is in finding the right information. Information retrieval is concerned with providing access to textual data for which we have no good formal model, such as a relational model. Statistical approaches have been widely applied to this problem. This tutorial will provide an overview of: a) statistical characteristics of large text collections (e.g., size, sparsity, word distributions), b) important retrieval models (e.g., Boolean, vector space and probabilistic), and c) enhancements which use unsupervised learning to model structure in text collections, or supervised learning to incorporate user feedback. 
We will conclude with a discussion of open research issues where improved statistical models can improve performance.<\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2188\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2188\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2187\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tBayesian Statistical Analysis\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2187\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2188\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Speaker:<\/strong>\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.statslab.cam.ac.uk\/Dept\/People\/Spiegelhalter\/davids.html\" target=\"_blank\">David Spiegelhalter<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, MRC Biostatistics Unit, Institute for Public Health, Cambridge | 11:00 AM \u2013 12:00 PM and 1:30 PM \u2013\u00a02:30 PM<\/p>\n<p>The first part of the tutorial will cover the fundamentals of Bayesian inference, including probability and its subjective interpretation, evaluation of probability assessments using scoring rules, utilities and decision theory. The use of Bayes theorem for updating beliefs will be illustrated for both binomial and normal likelihoods, and the use of conjugate families of priors and predictive distributions described. The First Bayes software will be used to display conjugate Bayesian analysis. 
The second part will introduce the concept of &#8216;exchangeability&#8217;, and the consequent use of hierarchical models in which the unknown parameters of a common prior are included in the model. Conditional independence assumptions lead naturally to a graphical representation of hierarchical models. Markov chain Monte Carlo (MCMC) methods will be introduced as a means of carrying out the necessary numerical integrations, and topics covered will include the relationship of Gibbs sampling to graphical modelling, parameterisation, initial values, and choice of prior distributions. Real examples will be used throughout, and on-line analysis of an example in longitudinal modelling with measurement error on predictors will be carried out using the WinBUGS program.<\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2190\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2190\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2189\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tAdditive Logistic Regression: A Statistical View of Boosting\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2189\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2190\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Speaker:\u00a0<\/strong><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/web.stanford.edu\/~hastie\/\" target=\"_blank\">Trevor Hastie<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, Stanford 
University |\u00a03:30 PM \u2013\u00a05:00 PM<\/p>\n<p>Boosting (Freund and Schapire, 1995) is one of the most important recent developments in classification methodology. Boosting works by sequentially applying a classification algorithm to reweighted versions of the training data, and then taking a weighted majority vote of the sequence of classifiers thus produced. For many classification algorithms, this simple strategy results in dramatic improvements in performance. We show that this seemingly mysterious phenomenon can be understood in terms of well known statistical principles, namely additive modeling and maximum likelihood. For the two-class problem, boosting can be viewed as an approximation to additive modeling on the logistic scale using maximum Bernoulli likelihood as a criterion. We develop more direct approximations and show that they exhibit nearly identical results to boosting. Direct multi-class generalizations based on multinomial likelihood are derived that exhibit performance comparable to other recently proposed multi-class generalizations of boosting in most situations, and far superior in some. We suggest a minor modification to boosting that can reduce computation, often by factors of 10 to 50. Finally, we apply these insights to produce an alternative formulation of boosting decision trees. This approach, based on best-first truncated tree induction, often leads to better performance, and can provide interpretable descriptions of the aggregate decision rule. 
It is also much faster computationally, making it more suitable to large scale data mining applications.<\/p>\n<p><em>* joint work with Jerome Friedman and Rob Tibshirani<\/em><\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t\t\t\t\t<\/ul>\n\t<\/div>\n\t<\/p>\n<h2>Abstracts | January 4 &#8211; 6<\/h2>\n<p>\t<div data-wp-context='{\"items\":[]}' data-wp-interactive=\"msr\/accordion\">\n\t\t\t\t\t<div class=\"clearfix\">\n\t\t\t\t<div\n\t\t\t\t\tclass=\"btn-group align-items-center mb-g float-sm-right\"\n\t\t\t\t\tdata-bi-aN=\"accordion-collapse-controls\"\n\t\t\t\t>\n\t\t\t\t\t<button\n\t\t\t\t\t\tclass=\"btn btn-link m-0\"\n\t\t\t\t\t\tdata-bi-cN=\"Expand all\"\n\t\t\t\t\t\tdata-wp-bind--aria-controls=\"state.ariaControls\"\n\t\t\t\t\t\tdata-wp-bind--aria-expanded=\"state.ariaExpanded\"\n\t\t\t\t\t\tdata-wp-bind--disabled=\"state.isAllExpanded\"\n\t\t\t\t\t\tdata-wp-class--inactive=\"state.isAllExpanded\"\n\t\t\t\t\t\tdata-wp-on--click=\"actions.onExpandAll\"\n\t\t\t\t\t\ttype=\"button\"\n\t\t\t\t\t>\n\t\t\t\t\t\tExpand all\t\t\t\t\t<\/button>\n\t\t\t\t\t<span aria-hidden=\"true\"> | <\/span>\n\t\t\t\t\t<button\n\t\t\t\t\t\tclass=\"btn btn-link m-0\"\n\t\t\t\t\t\tdata-bi-cN=\"Collapse all\"\n\t\t\t\t\t\tdata-wp-bind--aria-controls=\"state.ariaControls\"\n\t\t\t\t\t\tdata-wp-bind--aria-expanded=\"state.ariaExpanded\"\n\t\t\t\t\t\tdata-wp-bind--disabled=\"state.isAllCollapsed\"\n\t\t\t\t\t\tdata-wp-class--inactive=\"state.isAllCollapsed\"\n\t\t\t\t\t\tdata-wp-on--click=\"actions.onCollapseAll\"\n\t\t\t\t\t\ttype=\"button\"\n\t\t\t\t\t>\n\t\t\t\t\t\tCollapse all\t\t\t\t\t<\/button>\n\t\t\t\t<\/div>\n\t\t\t<\/div>\n\t\t\t\t<ul class=\"msr-accordion\">\n\t\t\t\t\t\t\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2192\"}' data-wp-init=\"callbacks.init\">\n\t\t<div 
class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2192\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2191\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tHierarchical IFA Belief Networks\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2191\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2192\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Author:<\/strong>\u00a0Hagai Attias,\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.gatsby.ucl.ac.uk\/\" target=\"_blank\">Gatsby Computational Neuroscience Unit<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>,\u00a0University College London<\/p>\n<p><strong>Abstract:<\/strong>\u00a0We introduce a new real-valued belief network, which is a multilayer generalization of independent factor analysis (IFA). At each level, this network extracts real-valued latent variables that are non-linear functions of the input data with a highly adaptive functional form, resulting in a hierarchical distributed representation of these data. The network is based on a probabilistic generative model, constructed by cascading single-layer IFA models. Whereas exact maximum-likelihood learning for this model is intractable, we present and demonstrate an algorithm that maximizes a lower bound on the likelihood. 
This algorithm is developed by formulating a variational approach to hierarchical IFA networks.<\/p>\n<p><strong>Availability:<\/strong>\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/01\/hifan.pdf\" target=\"_blank\">PDF<\/a><\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2194\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2194\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2193\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tPattern discovery via entropy minimization\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2193\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2194\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Author:<\/strong>\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.merl.com\/people\/brand\" target=\"_blank\">Matthew Brand<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>,\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.merl.com\/\" target=\"_blank\">Mitsubishi Electric Research Labs<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n<p><strong>Abstract:<\/strong>\u00a0We propose a framework for learning hidden-variable models by optimizing entropies, in which entropy minimization, posterior maximization, and free energy minimization are all equivalent. 
Solutions for the maximum <em>a posteriori<\/em> (MAP) estimator yield powerful learning algorithms that combine all the charms of expectation-maximization and deterministic annealing. Contained as special cases are the methods of maximum entropy, maximum likelihood, and a new method, maximum structure. We focus on the maximum structure case, in which entropy minimization maximizes the amount of evidence supporting each parameter while minimizing uncertainty in the sufficient statistics and cross-entropy between the model and the data. In iterative estimation, the MAP estimator gradually extinguishes excess parameters, sculpting a model structure that reflects hidden structures in the data. These models are highly resistant to over-fitting and have the particular virtue of being easy to interpret, often yielding insights into the hidden causes that generate the data.<\/p>\n<p><strong>Availability:<\/strong>\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/01\/pattern-discovery-entropy-minimization.pdf\" target=\"_blank\">PDF<\/a><\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2196\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2196\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2195\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tCausal Mechanisms and Classification Trees for Predicting Chemical 
Carcinogens\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2195\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2196\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Author:<\/strong> Louis Anthony (&#8220;Tony&#8221;) Cox, Jr.,\u00a0Cox Associates<\/p>\n<p><strong>Abstract:<\/strong>\u00a0Classification trees, usually used as a nonlinear, nonparametric classification method, can also provide a powerful framework for comparing, assessing, and combining information from different expert systems, by treating their predictions as the independent variables in a classification tree analysis. This paper discusses the applied problem of classifying chemicals as human carcinogens. It shows how classification trees can be used to compare the information provided by ten different carcinogen classification expert systems, construct an improved &#8220;hybrid&#8221; classification system from them, and identify cost-effective combinations of assays (the inputs to the expert systems) to use in classifying chemicals in future.<\/p>\n<p><strong>Availability:<\/strong>\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/01\/CAUSAL_MECHANISMS_AND_CLASSIFICATION_TREES_FOR_PR.pdf\" target=\"_blank\">PDF<\/a><\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2198\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2198\"\n\t\t\t\tclass=\"btn 
btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2197\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tConditional Products: An Alternative Approach to Conditional Independence\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2197\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2198\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:<\/strong>\u00a0A. Philip Dawid,\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.ucl.ac.uk\/statistics\/\" target=\"_blank\">Department of Statistical Science<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, University College London; <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/staff.utia.cas.cz\/studeny\/studeny_home.html?q=user_data\/studeny\/studeny_home.html\" target=\"_blank\">Milan Studeny<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>,\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.utia.cas.cz\/\" target=\"_blank\">Institute of Information Theory and Automation<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, Academy of Sciences of Czech Republic, and Laboratory of Intelligent Systems, University of Economics Prague<\/p>\n<p><strong>Abstract:\u00a0<\/strong>We introduce a new abstract approach to the study of conditional independence, founded on a concept analogous to the factorization properties of probabilistic independence, rather than the separation properties of a graph. 
The basic ingredient is the &#8220;conditional product&#8221;, which provides a way of combining the basic objects under consideration while preserving as much independence as possible. We introduce an appropriate axiom system for conditional product, and show how, when these axioms are obeyed, they induce a derived concept of conditional independence which obeys the usual semi-graphoid axioms. The general structure is used to throw light on three specific areas: the familiar probabilistic framework (both the discrete and the general case); a set-theoretic framework related to &#8220;variation independence&#8221;; and a variety of graphical frameworks.<\/p>\n<p><strong>Availability:<\/strong>\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/01\/daw-stu-99-pdf.pdf\" target=\"_blank\">PDF<\/a><\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2200\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2200\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2199\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tGeometric Modeling of a Nuclear Environment\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2199\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2200\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:<\/strong>\u00a0Jan De Geeter and Marc Decr\u00e9ton,\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.sckcen.be\/\" 
target=\"_blank\">SCK.CEN<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> (Belgian Nuclear Research Centre);\u00a0Joris De Schutter, Herman Bruyninckx, and Hendrik Van Brussel,\u00a0Department of Mechanical Engineering,\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/www.mech.kuleuven.be\/en\/pma\" target=\"_blank\">Division PMA<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>,\u00a0Katholieke Universiteit Leuven<\/p>\n<p><strong>Abstract:\u00a0<\/strong>This paper is about the task-directed updating of an incomplete and inaccurate geometric model of a nuclear environment, using only robust radiation-resistant sensors installed on a robot that is remotely controlled by a human operator. In this problem, there are many sources of uncertainty and ambiguity. This paper proposes a probabilistic solution under Gaussian assumptions. Uncertainty is reduced with an estimator based on a Kalman filter. Ambiguity on the measurement-feature association is resolved by running a bank of those estimators in parallel, one for each plausible association. The residual errors of these estimators are used for hypothesis testing and for the calculation of a probability distribution over the remaining hypotheses. 
The best next sensing action is calculated as a Bayes decision with respect to a loss function that takes into account both the uncertainty on the current estimate, and the variance\/precision required by the task.<\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2202\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2202\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2201\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tProcess-Oriented Evaluation: The Next Step\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2201\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2202\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Author:<\/strong>\u00a0Pedro Domingos,\u00a0Artificial Intelligence Group,\u00a0Instituto Superior T\u00e9cnico<\/p>\n<p><strong>Abstract:<\/strong>\u00a0Methods to avoid overfitting fall into two broad categories: data-oriented (using separate data for validation) and representation-oriented (penalizing complexity in the model). Both have limitations that are hard to overcome. We argue that fully adequate model evaluation is only possible if the search process by which models are obtained is also taken into account. To this end, we recently proposed a method for <em>process-oriented evaluation <\/em>(POE), and successfully applied it to rule induction (Domingos, 1998). However, for the sake of simplicity this treatment made two rather artificial assumptions. 
In this paper the assumptions are removed, and a simple formula for model evaluation is obtained. Empirical trials show the new, better-founded form of POE to be as accurate as the previous one, while further reducing theory sizes.<\/p>\n<p><strong>Availability:<\/strong> <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/01\/process-oriented.pdf\" target=\"_blank\">PDF<\/a><\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2204\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2204\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2203\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tModeling Decision Tree Performance with the Power Law\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2203\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2204\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:<\/strong> Lewis J. Frey and Douglas H. Fisher, Jr.,\u00a0Computer Science Department,\u00a0Vanderbilt University<\/p>\n<p><strong>Abstract:\u00a0<\/strong>This paper discusses the use of a power law to predict decision tree performance. Power laws are fit to learning curves of decision trees trained on data sets from the UCI repository. The learning curves are generated by training C4.5 on training sets of different sizes. The power law predicts diminishing returns in terms of error rate as training set size increases. 
By characterizing the learning curve with a power law, the error rate for a given size training set can be projected. This projection can be used in estimating the amount of data needed to achieve an acceptable error rate, and the cost effectiveness of further data collection.<\/p>\n<p><strong>Availability:<\/strong>\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/01\/ModelingTree.pdf\" target=\"_blank\">PDF<\/a><\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2206\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2206\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2205\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tEfficient Learning using Constrained Sufficient Statistics\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2205\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2206\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:<\/strong>\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.cs.huji.ac.il\/~nir\/\" target=\"_blank\">Nir Friedman<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>,\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.cs.huji.ac.il\/\" target=\"_blank\">Institute of Computer Science<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>,\u00a0The Hebrew University;\u00a0<a class=\"msr-external-link glyph-append 
glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/robotics.stanford.edu\/~getoor\/\" target=\"_blank\">Lise Getoor<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>,\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www-cs.stanford.edu\/\" target=\"_blank\">Computer Science Department<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>,\u00a0Stanford University<\/p>\n<p><strong>Abstract:\u00a0<\/strong>Learning Bayesian networks is a central problem for pattern recognition, density estimation and classification. In this paper, we propose a new method for speeding up the computational process of learning Bayesian network <em>structure<\/em>. This approach uses constraints imposed by the statistics already collected from the data to guide the learning algorithm. This allows us to reduce the number of statistics collected during learning and thus speed up the learning time. We show that our method is capable of learning structure from data more efficiently than traditional approaches. Our technique is of particular importance when the size of the datasets is large or when learning from incomplete data. The basic technique that we introduce is general and can be used to improve learning performance in many settings where sufficient statistics must be computed. 
In addition, our technique may be useful for alternate search strategies such as branch and bound algorithms.<\/p>\n<p><strong>Availability:<\/strong> <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/01\/FGe1.pdf\" target=\"_blank\">PDF<\/a><\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2208\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2208\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2207\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tOn the geometry of DAG models with hidden variables\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2207\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2208\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:<\/strong>\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.cs.technion.ac.il\/~dang\/\" target=\"_blank\">Dan Geiger<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/heckerma\/\" target=\"_blank\">David Heckerman<\/a>, and Christopher Meek, Decision Theory & Adaptive Systems,\u00a0Microsoft Research;\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.math.umd.edu\/~hking\/\" target=\"_blank\">Henry King<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>,\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab 
glyph-append-xsmall\" href=\"http:\/\/www.math.umd.edu\" target=\"_blank\">Mathematics Department<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>,\u00a0University of Maryland<\/p>\n<p><strong>Abstract:\u00a0<\/strong>We prove that many graphical models with hidden variables are not curved exponential families. This result, together with the fact that some graphical models are curved and not linear, implies that the hierarchy of graphical models, as linear, curved, and stratified, is non-collapsing; each level in the hierarchy is strictly contained in the larger levels. This result is discussed in the context of model selection of graphical models.<\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2210\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2210\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2209\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tModel Choice: A minimum posterior predictive loss approach\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2209\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2210\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:<\/strong>\u00a0Sujit. 
Ghosh,\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.stat.ncsu.edu\/\" target=\"_blank\">Department of Statistics<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>,\u00a0NC State University;\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/stat.uconn.edu\/alan-gelfand\/\" target=\"_blank\">Alan E. Gelfand<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>,\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/stat.uconn.edu\/\" target=\"_blank\">Department of Statistics<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>,\u00a0University of Connecticut<\/p>\n<p><strong>Abstract:\u00a0<\/strong>Model choice is a fundamental activity in the analysis of data sets, an activity which has become increasingly important as computational advances enable the fitting of increasingly complex models. Such complexity typically arises through hierarchical structure which requires specification at each stage of probabilistic mechanisms, mean and dispersion forms, explanatory variables, etc. Nonnested hierarchical models introducing random effects may not be handled by classical methods. Bayesian approaches using predictive distributions can be used though the formal solution, which includes Bayes factors as a special case, can be criticized. It seems natural to evaluate model performance by comparing what it predicts with what has been observed. Most classical criteria utilize such comparison. We propose a predictive criterion where the goal is good prediction of a replicate of the observed data but tempered by fidelity to the observed values. We obtain this criterion by minimizing posterior loss for a given model and then, for models under consideration, selecting the one which minimizes this criterion. 
For a version of log scoring loss we can do the minimization explicitly, obtaining an expression which can be interpreted as a penalized deviance criterion. We illustrate its performance with an application to a large data set involving residential property transactions.<\/p>\n<p><strong>Availability:<\/strong> <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/01\/gelfand_biometrika_1998.pdf\" target=\"_blank\">PDF<\/a><\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2212\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2212\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2211\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tMean Field Inference in a General Probabilistic Setting\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2211\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2212\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:<\/strong>\u00a0Michael Haft,\u00a0Reimar Hofmann, and\u00a0Volker Tresp,\u00a0Siemens AG,\u00a0Corporate Technology, Information and Communications Department<\/p>\n<p><strong>Abstract:<\/strong>\u00a0We present a systematic, model-independent formulation of\u00a0 mean field theory (MFT) as an inference method in probabilistic models. &#8220;Model-independent&#8221; means that we do not assume a particular type of dependency among the variables of a domain but instead work in a general probabilistic setting. 
In a Bayesian network, for example, one may use arbitrary tables to specify conditional dependencies and thus run MFT in <em>any<\/em> Bayesian network. Furthermore, the general mean field equations derived here shed light on the essence of MFT. MFT can be interpreted as a local iteration scheme which relaxes into a consistent state (a solution of the mean field equations). Iterating the mean field equations means propagating information through the network. In general, however, there are multiple solutions to the mean field equations. We show that improved approximations can be obtained by forming a weighted mixture of the multiple mean field solutions. Simple approximate expressions for the mixture weights are given. The benefits of taking into account multiple solutions are demonstrated by using MFT for inference in a small Bayesian network representing a medical domain. It turns out that every solution of the mean field equations can be interpreted as a &#8216;disease scenario&#8217;.<\/p>\n<p><strong>Availability:<\/strong> <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/01\/Mean_Field_Inference_in_a_General_Probabilistic_Se.pdf\" target=\"_blank\">PDF<\/a><\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2214\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2214\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2213\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tProbabilistic kernel regression 
models\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2213\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2214\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:\u00a0<\/strong><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/people.csail.mit.edu\/tommi\/\" target=\"_blank\">Tommi S. Jaakkola<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>,\u00a0Department of Computer Science and Electrical Engineering,\u00a0Massachusetts Institute of Technology;\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/genomics-old.soe.ucsc.edu\/haussler\" target=\"_blank\">David Haussler<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>,\u00a0Department of Computer Science,\u00a0University of California\u2013Santa Cruz<\/p>\n<p><strong>Abstract:\u00a0<\/strong>We introduce a class of flexible conditional probability models and techniques for classification\/regression problems. Many existing methods such as generalized linear models and support vector machines are subsumed under this class. The flexibility of this class of techniques comes from the use of kernel functions as in support vector machines, and the generality from dual formulations of standard regression models.<\/p>\n<p><em><span style=\"font-family: wf_segoe-ui_bold, wf_segoe-ui_semibold, wf_segoe-ui_normal, Arial, sans-serif\">*<\/span>The work was done while T. 
Jaakkola was at UC Santa Cruz.<\/em><\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2216\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2216\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2215\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tHierarchical Mixtures-of-Experts for Generalized Linear Models: Some Results on Denseness and Consistency\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2215\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2216\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:<\/strong> Wenxin Jiang and Martin A. Tanner,\u00a0Department of Statistics,\u00a0Northwestern University<\/p>\n<p><strong>Abstract:\u00a0<\/strong>We investigate a class of hierarchical mixtures-of-experts (HME) models where exponential family regression models with generalized linear mean functions of the form $\\psi(a+x^T b)$ are mixed. Here $\\psi(\\cdot)$ is the inverse link function. Suppose the true response $y$ follows an exponential family regression model with mean function belonging to a class of smooth functions of the form $\\psi(h(x))$ where $h \\in W_{2;K_0}^\\infty$ (a Sobolev class over $[0,1]^{s}$). It is shown that the HME mean functions can approximate the true mean function, at a rate of $O(m^{-2\/s})$ in $L_p$ norm. Moreover, the HME probability density functions can approximate the true density, at a rate of $O(m^{-2\/s})$ in Hellinger distance, and at a rate of $O(m^{-4\/s})$ in Kullback-Leibler divergence. 
These rates can be achieved within the family of HME structures with a tree of binary splits, or within the family of structures with a single layer of experts. Here $s$ is the dimension of the predictor $x$. It is also shown that likelihood-based inference based on HME is consistent in recovering the truth, in the sense that as the sample size $n$ and the number of experts $m$ both increase, the mean square error of the estimated mean response goes to zero. Conditions for such results to hold are stated and discussed.<\/p>\n<p><strong>Availability:<\/strong> <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/01\/1301.7390.pdf\" target=\"_blank\">PDF<\/a><\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2218\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2218\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2217\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tStochastic local search for Bayesian networks\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2217\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2218\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:<\/strong>\u00a0Kalev Kask and Rina Dechter,\u00a0Department of Information and Computer Science,\u00a0University of California\u2013Irvine<\/p>\n<p><strong>Abstract:\u00a0<\/strong>The paper evaluates empirically the suitability\u00a0of Stochastic Local Search algorithms\u00a0(SLS) for finding most probable explanations\u00a0in Bayesian 
networks. SLS algorithms\u00a0(e.g. GSAT, WSAT) have recently\u00a0proven to be highly effective in solving\u00a0complex constraint-satisfaction and satisfiability problems which cannot be solved\u00a0by traditional search schemes. Our experiments\u00a0investigate the applicability of this\u00a0scheme to probabilistic optimization problems. Specifically, we show that algorithms\u00a0combining hill-climbing steps with stochastic\u00a0steps (guided by the network&#8217;s probability\u00a0distribution), called G+StS, outperform pure\u00a0hill-climbing search, pure stochastic simulation\u00a0search, as well as simulated annealing.\u00a0In addition, variants of G+StS that are augmented on top of alternative approximation\u00a0methods are shown to be particularly effective.<\/p>\n<p><strong>Availability:<\/strong> <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/01\/r72-new_11_98.pdf\" target=\"_blank\">PDF<\/a><\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2220\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2220\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2219\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tBayesian Graphical Models, Intention-to-Treat, and the Rubin Causal Model\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2219\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2220\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Author:<\/strong>\u00a0David Madigan,\u00a0<a 
class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.stat.washington.edu\/\" target=\"_blank\">Department of Statistics<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>,\u00a0University of Washington<\/p>\n<p><strong>Abstract:<\/strong>\u00a0In clinical trials with significant noncompliance the standard intention-to-treat analyses sometimes mislead. Rubin&#8217;s causal model provides an alternative method of analysis that can shed extra light on clinical trial data. Formulating the Rubin Causal Model as a Bayesian graphical model facilitates model communication and computation.<\/p>\n<p><strong>Availability:<\/strong> <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/01\/38981c9db80897c99c15049dcf4a0145aad5.pdf\" target=\"_blank\">PDF<\/a><\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2222\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2222\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2221\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tEfficient Mining of Statistical Dependencies\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2221\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2222\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:\u00a0<\/strong>Tim Oates,\u00a0Matthew D. Schmill,\u00a0Paul R. 
Cohen, and Casey Durfee,\u00a0Experimental Knowledge Systems Lab,\u00a0Department of Computer Science,\u00a0University of Massachusetts\u2013Amherst<\/p>\n<p><strong>Abstract:<\/strong>\u00a0The Multi-Stream Dependency Detection algorithm finds rules that capture statistical dependencies between patterns in multivariate time series of categorical data. Rule strength is measured by the G statistic, and an upper bound on the value of G for the descendants of a node allows MSDD&#8217;s search space to be pruned. However, in the worst case, the algorithm will explore exponentially many rules. This paper presents and empirically evaluates two ways of addressing this problem. The first is a set of three methods for reducing the size of MSDD&#8217;s search space based on information collected during the search process. Second, we discuss an implementation of MSDD that distributes its computations over multiple machines on a network.<\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2224\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2224\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2223\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tTractable structure search in the presence of latent variables\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2223\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2224\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:<\/strong> <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab 
glyph-append-xsmall\" href=\"http:\/\/www.stat.washington.edu\/tsr\/website\/inquiry\/home.php\" target=\"_blank\">Thomas Richardson<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, Heiko Bailer, and Moulinath Banerjees,\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.stat.washington.edu\/\" target=\"_blank\">Department of Statistics<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>,\u00a0University of Washington<\/p>\n<p><strong>Abstract:\u00a0<\/strong>The problem of learning the structure of a DAG model in the presence of latent variables presents many formidable challenges. In particular there are an infinite number of latent variable models to consider, and these models possess features which make them hard to work with. We describe a class of graphical models which can represent the conditional independence structure induced by a latent variable model over the observed margin. We give a parametrization of the set of Gaussian distributions with conditional independence structure given by a MAG model. The models are illustrated via a simple example. 
Different estimation techniques are discussed in the context of Zellner&#8217;s Seemingly Unrelated Regression (SUR) models.<\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2226\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2226\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2225\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tBoosting Methodology for Regression Problems\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2225\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2226\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:<\/strong>\u00a0Greg Ridgeway,\u00a0David Madigan, and\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.stat.washington.edu\/tsr\/website\/inquiry\/home.php\" target=\"_blank\">Thomas Richardson<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>,\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.stat.washington.edu\/\" target=\"_blank\">Department of Statistics<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>,\u00a0University of Washington<\/p>\n<p><strong>Abstract:\u00a0<\/strong>Classification problems have dominated research on boosting to date. The application of boosting to regression problems, on the other hand, has received little investigation. In this paper we develop a new boosting method for regression problems. 
We cast the regression problem as a classification problem and apply an interpretable form of the boosted na\u00efve Bayes classifier. This induces a regression model that we show to be expressible as an additive model for which we derive estimators and discuss computational issues. We compare the performance of our boosted na\u00efve Bayes regression model with other interpretable multivariate regression procedures.<\/p>\n<p><strong>Availability:<\/strong>\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/01\/boosting-methodology-regression-problems.pdf\" target=\"_blank\">PDF<\/a><\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2228\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2228\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2227\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tAn Experiment in Causal Inference Using a Pneumonia Database\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2227\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2228\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:<\/strong> Peter Spirtes,\u00a0Department of Philosophy,\u00a0Carnegie Mellon University;\u00a0Greg Cooper,\u00a0Center for Biomedical Informatics,\u00a0University of Pittsburgh<\/p>\n<p><strong>Abstract:\u00a0<\/strong>We tested a causal discovery algorithm on a database of pneumonia patients. 
The output of the causal discovery algorithm is a list of statements &#8220;A causes B&#8221;, where A and B are variables in the database, and a score indicating the degree of confidence in the statement. We compared the output of the algorithm with the opinions of physicians about whether A caused B or not. We found that the doctors&#8217; opinions were independent of the output of the algorithm. However, an examination of the results suggested a simple, well-motivated modification of the algorithm which would bring the output of the algorithm into high agreement with the physicians&#8217; opinions.<\/p>\n<p><strong>Availability:<\/strong> <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/01\/causal-discovery.pdf\" target=\"_blank\">PDF<\/a><\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2230\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2230\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2229\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tA Note on the Comparison of Polynomial Selection Methods\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2229\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2230\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:<\/strong>\u00a0Murlikrishna Viswanathan and Professor Chris Wallace,\u00a0School of Computer Science and Software Engineering,\u00a0Monash University\u2013Clayton<\/p>\n<p><strong>Abstract:\u00a0<\/strong>Minimum Message Length (MML) and 
Structural Risk Minimisation (SRM) are two computational learning principles that have achieved wide acclaim in recent years. Whereas the former is based on Bayesian learning and the latter on the classical theory of VC-dimension, they are similar in their attempt to define a trade-off between model complexity and goodness of fit to the data. A recent empirical study by Wallace compared the performance of standard model selection methods in a one-dimensional polynomial regression framework. The results from this study provided strong evidence in support of the MML- and SRM-based methods over the other standard approaches. In this paper we present a detailed empirical evaluation of three model selection methods which include an MML-based approach and two SRM-based methods. Results from our analysis and experimental evaluation suggest that the MML-based approach in general has higher predictive accuracy and also raise questions about the inductive capabilities of the Structural Risk Minimization Principle.<\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t\t\t\t\t<\/ul>\n\t<\/div>\n\t<span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n<h2>Poster Sessions | January 4<\/h2>\n<p>\t<div data-wp-context='{\"items\":[]}' data-wp-interactive=\"msr\/accordion\">\n\t\t\t\t\t<div class=\"clearfix\">\n\t\t\t\t<div\n\t\t\t\t\tclass=\"btn-group align-items-center mb-g float-sm-right\"\n\t\t\t\t\tdata-bi-aN=\"accordion-collapse-controls\"\n\t\t\t\t>\n\t\t\t\t\t<button\n\t\t\t\t\t\tclass=\"btn btn-link m-0\"\n\t\t\t\t\t\tdata-bi-cN=\"Expand 
all\"\n\t\t\t\t\t\tdata-wp-bind--aria-controls=\"state.ariaControls\"\n\t\t\t\t\t\tdata-wp-bind--aria-expanded=\"state.ariaExpanded\"\n\t\t\t\t\t\tdata-wp-bind--disabled=\"state.isAllExpanded\"\n\t\t\t\t\t\tdata-wp-class--inactive=\"state.isAllExpanded\"\n\t\t\t\t\t\tdata-wp-on--click=\"actions.onExpandAll\"\n\t\t\t\t\t\ttype=\"button\"\n\t\t\t\t\t>\n\t\t\t\t\t\tExpand all\t\t\t\t\t<\/button>\n\t\t\t\t\t<span aria-hidden=\"true\"> | <\/span>\n\t\t\t\t\t<button\n\t\t\t\t\t\tclass=\"btn btn-link m-0\"\n\t\t\t\t\t\tdata-bi-cN=\"Collapse all\"\n\t\t\t\t\t\tdata-wp-bind--aria-controls=\"state.ariaControls\"\n\t\t\t\t\t\tdata-wp-bind--aria-expanded=\"state.ariaExpanded\"\n\t\t\t\t\t\tdata-wp-bind--disabled=\"state.isAllCollapsed\"\n\t\t\t\t\t\tdata-wp-class--inactive=\"state.isAllCollapsed\"\n\t\t\t\t\t\tdata-wp-on--click=\"actions.onCollapseAll\"\n\t\t\t\t\t\ttype=\"button\"\n\t\t\t\t\t>\n\t\t\t\t\t\tCollapse all\t\t\t\t\t<\/button>\n\t\t\t\t<\/div>\n\t\t\t<\/div>\n\t\t\t\t<ul class=\"msr-accordion\">\n\t\t\t\t\t\t\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2232\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2232\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2231\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tTransfer of Information between System and Evidence Models\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2231\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2232\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:\u00a0<\/strong>Russell Almond,\u00a0Research Statistics Group at\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab 
glyph-append-xsmall\" href=\"http:\/\/www.ets.org\/\" target=\"_blank\">Educational Testing Service<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>;\u00a0Edward Herskovits,\u00a0Noetic Systems, Inc.;\u00a0Robert J. Mislevy,\u00a0Model Based Measurement Group at\u00a0Educational Testing Service;\u00a0Linda Stienberg,\u00a0Educational Policy Research at\u00a0Educational Testing Service<\/p>\n<p><strong>Abstract:<\/strong>\u00a0In this paper we illustrate a simple scheme for dividing a complex Bayes network into a <em>system model<\/em> and a collection of smaller <em>evidence models<\/em>. While the system model maintains a permanent record of the state of the system of interest, the evidence models are only used momentarily to absorb evidence from specific observations or findings and then discarded. This paper describes an implementation of a system model\u2014evidence model complex in which each system and evidence model has a separate Bayes net and Markov tree representation. As necessary, information is propagated between common Markov tree nodes of the evidence and system models. 
While mathematically equivalent to the full Bayes network, the system model&#8211;evidence model complex allows us to (a) separate the seldom-used evidence model portions from the core system model, thus reducing search and propagation time in the network and (b) easily replace the evidence models (this is particularly advantageous in educational examples in which new test items are often introduced to prevent overexposure of assessment tasks).<\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2234\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2234\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2233\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tA Bayesian Model for Collaborative Filtering\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2233\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2234\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:<\/strong>\u00a0Yung-Hsin Chien and Edward I. George,\u00a0Department of MSIS at\u00a0University of Texas at Austin<\/p>\n<p><strong>Abstract:<\/strong>\u00a0Consider the general setup where a set of <em>items<\/em> have been partially <em>rated<\/em> by a set of <em>judges<\/em>, in the sense that not every item has been rated by every judge. For this setup, we propose a Bayesian approach for the problem of predicting the missing ratings from the observed ratings. 
This approach incorporates similarity by assuming the set of judges can be partitioned into groups which share the same ratings probability distribution. This leads to a predictive distribution of missing ratings based on the posterior distribution of the groupings and associated ratings probabilities. Markov chain Monte Carlo methods and a hybrid search algorithm are then used to obtain predictions of the missing ratings.<\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2236\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2236\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2235\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tParameter learning from incomplete data for Bayesian networks\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2235\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2236\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Author:<\/strong>\u00a0RG Cowell<\/p>\n<p><strong>Abstract:<\/strong> In a companion paper (Cowell 1999), I described a method of using maximum entropy to estimate the joint probability distribution for a set of discrete variables from missing data. 
Here I extend the method of that paper to incorporate prior information for application to parameter learning in Bayesian networks.<\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2238\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2238\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2237\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tOn the Application of The Bootstrap for Computing Confidence Measures on Features of Induced Bayesian Networks\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2237\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2238\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:<\/strong>\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.cs.huji.ac.il\/~nir\/\" target=\"_blank\">Nir Friedman<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>,\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.cs.huji.ac.il\/\" target=\"_blank\">Institute of Computer Science<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> at\u00a0Hebrew University;\u00a0Moises Goldszmidt, SRI International; Abraham Wyner,\u00a0Department of Statistics at\u00a0Wharton School<\/p>\n<p><strong>Abstract:\u00a0<\/strong>In the context of learning Bayesian networks from data, very little work has been published on methods for assessing the <em>quality<\/em> of an induced model. 
This issue, however, has received a great deal of attention in the statistics literature. In this paper, we take a well-known method from statistics, Efron&#8217;s Bootstrap, and examine its applicability for assessing a confidence measure on features of the learned network structure. We also compare this method to assessments based on a practical realization of the Bayesian methodology.<\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2240\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2240\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2239\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tRelaxing the Local Independence Assumption for Quantitative Learning in Acyclic Directed Graphical Models through Hierarchical Partition Models\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2239\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2240\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:<\/strong>\u00a0Daniela Golinelli and David Madigan,\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.stat.washington.edu\/\" target=\"_blank\">Department of Statistics<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> at\u00a0University of Washington and\u00a0Guido Consonni,\u00a0Dip. 
di Economia e Metodi Quantitativi at\u00a0Universita&#8217; di Pavia<\/p>\n<p><strong>Abstract:\u00a0<\/strong>The simplest method proposed by Spiegelhalter and Lauritzen (1990) to perform <em>quantitative learning<\/em> in ADG presents a potential weakness: the <em>local independence assumption<\/em>. We propose to alleviate this problem through the use of Hierarchical Partition Models. Our approach is compared with the previous one from an interpretative and predictive point of view.<\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2242\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2242\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2241\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tThe exploration of new methods for learning in binary Boltzmann machines\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2241\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2242\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:\u00a0<\/strong>Keith Humphreys and D.M. Titterington,\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.gla.ac.uk\/subjects\/statistics\/\" target=\"_blank\">Department of Statistics<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> at\u00a0University of Glasgow<\/p>\n<p><strong>Abstract:\u00a0<\/strong>Exact inference for Boltzmann machines is computationally expensive. 
One approach to improving tractability is to approximate the gradient algorithm. We describe a new way of doing this which is based on Bahadur&#8217;s representation of the multivariate binary distribution (Bahadur, 1961). We compare the approach, for networks with no unobserved variable, to the &#8220;mean field&#8221; approximation of Peterson and Anderson (1987) and the approach of Kappen and Rodriguez (1998), which is based on the linear response theorem. We also investigate the use of the pairwise association cluster method (Tanaka and Morita, 1995).<\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2244\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2244\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2243\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tStatistical Challenges to inductive inference in linked data\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2243\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2244\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Author:<\/strong>\u00a0David Jensen<\/p>\n<p><strong>Abstract:<\/strong> Many data sets can be represented naturally as collections of linked objects. For example, document collections can be represented as documents (nodes) connected by citations and hypertext references (links). 
Similarly, organizations can be represented as people (nodes) connected by reporting relationships, social relationships, and communication patterns (links).<\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2246\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2246\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2245\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tMixture Model Clustering with the Multimix Program\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2245\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2246\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:\u00a0<\/strong>Murray A. Jorgensen and Lynette A. Hunt,\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.stats.waikato.ac.nz\/\" target=\"_blank\">Department of Statistics<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> at\u00a0University of Waikato<\/p>\n<p><strong>Abstract:\u00a0<\/strong>Hunt (1996) has implemented the finite mixture model approach to clustering in a program called Multimix. The program is designed to cluster multivariate data with categorical and continuous variables and possibly containing missing values. The model fitted simultaneously generalises the Latent Class model and the mixture of multivariate normals model. Like either of these models, Multimix can be used to form clusters by the Bayes allocation rule. 
This is the intended use of the program, although the parameter estimates can be used to give a succinct description of the clusters.<\/p>\n<p>Use of the EM algorithm, with its view of the observed data as being notionally augmented by missing information to form the &#8216;complete data&#8217;, gives a broad framework for estimation which is able to handle two types of missing information: unknown cluster assignment and missing data. Using the methodology of Little and Rubin (1987) in this way, Multimix is able to handle missing data in a less ad hoc way than many clustering algorithms. The program runs in acceptable time with large data matrices (say hundreds of observations on tens of variables). Use of the missing-data facility increases execution time somewhat. In this presentation we describe the approach taken to the design of Multimix and how some of the statistical problems were dealt with. As examples of the use of the program we cluster a large medical dataset and a version of Fisher&#8217;s Iris data in which a third of the values are randomly made &#8216;missing&#8217;.<\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2248\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2248\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2247\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tLearning Augmented Bayesian Classifiers: A Comparison of Distribution-based and Classification-based 
Approaches\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2247\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2248\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:\u00a0<\/strong>Eamonn Keogh and Michael J. Pazzani,\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.ics.uci.edu\/\" target=\"_blank\">Information and Computer Science<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> at\u00a0University of California, Irvine<\/p>\n<p><strong>Abstract:\u00a0<\/strong>The naive Bayes classifier is built on the assumption of conditional independence between the attributes given the class. The algorithm has been shown to be surprisingly robust to obvious violations of this condition, but it is natural to ask if it is possible to further improve the accuracy by relaxing this assumption. We examine an approach where naive Bayes is augmented by the addition of correlation arcs between attributes. We explore two methods for finding the set of augmenting arcs, a greedy hill-climbing search, and a novel, more computationally efficient algorithm that we call SuperParent. 
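For reference, the conditional-independence assumption underlying naive Bayes can be sketched as follows; this is a minimal illustration, and the toy weather data and function names are assumptions of this sketch, not taken from the paper.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(rows, labels):
    """Estimate P(class) and P(attribute value | class) with add-one smoothing."""
    class_counts = Counter(labels)
    value_counts = defaultdict(int)       # (attr_index, value, class) -> count
    values_per_attr = defaultdict(set)    # attr_index -> set of seen values
    for row, cls in zip(rows, labels):
        for i, v in enumerate(row):
            value_counts[(i, v, cls)] += 1
            values_per_attr[i].add(v)
    return class_counts, value_counts, values_per_attr

def predict(model, row):
    class_counts, value_counts, values_per_attr = model
    total = sum(class_counts.values())
    best_cls, best_logp = None, -math.inf
    for cls, n_cls in class_counts.items():
        logp = math.log(n_cls / total)
        for i, v in enumerate(row):
            k = len(values_per_attr[i])
            # independence assumption: per-attribute likelihoods simply multiply
            logp += math.log((value_counts[(i, v, cls)] + 1) / (n_cls + k))
        if logp > best_logp:
            best_cls, best_logp = cls, logp
    return best_cls

# Illustrative toy data (not from the paper).
rows = [("sunny", "hot"), ("sunny", "mild"), ("rain", "mild"), ("rain", "hot")]
labels = ["no", "no", "yes", "yes"]
model = train_naive_bayes(rows, labels)
print(predict(model, ("rain", "mild")))  # -> yes
```

Augmenting approaches such as those compared in the paper relax exactly the per-attribute independence expressed by the inner loop above.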
We compare these methods to TAN, a state-of-the-art distribution-based approach to finding the augmenting arcs.<\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2250\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2250\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2249\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tExploring the robustness of Bayesian and information-theoretic methods for predictive inference\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2249\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2250\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:<\/strong>\u00a0Petri Kontkanen, Petri Myllym\u00e4ki, Tomi Silander, Henry Tirri, and Kimmo Valtonen<\/p>\n<p><strong>Abstract:<\/strong>\u00a0Given a set of sample data, we study three alternative methods for determining the predictive distribution of an unseen data vector. In particular, we are interested in the behavior of the predictive accuracy of these three methods as a function of the degree of the domain assumption violations. We explore this question empirically by using artificially generated data sets, where the assumptions can be violated in various ways. Our empirical results suggest that if the model assumptions are only mildly violated, marginalization over the model parameters may not be necessary in practice. 
This is due to the fact that in this case the computationally much simpler predictive distribution based on a single, maximum posterior probability model shows performance similar to that of the computationally more demanding marginal likelihood approach. The results also give support to Rissanen&#8217;s theoretical results about the usefulness of using Jeffreys&#8217; prior distribution for the model parameters.<\/p>\n<p><strong>Availability:<\/strong> <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2016\/12\/10.1.1.51.1018.pdf\" target=\"_blank\">PDF<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2252\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2252\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2251\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tStructure optimization of density estimation models applied to regression problems with dynamic noise\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2251\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2252\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:\u00a0<\/strong>Martin Kreutz,\u00a0Bernhard Sendhoff, and Werner von Seelen,\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/www.ini.rub.de\/\" target=\"_blank\">Institut f\u00fcr Neuroinformatik<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> 
at\u00a0Ruhr-Universit\u00e4t Bochum;\u00a0Anja M. Reimetz and Claus Weihs,\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.statistik.uni-dortmund.de\/fakultaet.html\" target=\"_blank\">Fachbereich Statistik<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> at\u00a0Universit\u00e4t Dortmund<\/p>\n<p><strong>Abstract:\u00a0<\/strong>In this paper we deal with the problem of model selection for time series forecasting with dynamical noise and missing data. We employ an evolutionary algorithm for the optimization of a mixture of densities model in order to estimate, via a log-likelihood based quality measure, the joint probability density of the data. We apply our method to the prediction of both artificial time series, generated from the Mackey-Glass equation, and time series from a real world system consisting of physiological data of apnea patients.<\/p>\n<p><strong>Related work from the authors:<\/strong><\/p>\n<ul>\n<li>Martin Kreutz, Anja M. Reimetz, Bernhard Sendhoff, Claus Weihs and Werner von Seelen. Optimisation of Density Estimation Models with Evolutionary Algorithms. In A.E. Eiben, Th. B\u00e4ck, M. Schoenauer and H.P. 
Schwefel, editors, <em>Parallel Problem Solving from Nature &#8211; PPSN V<\/em>, pages 998-1007, Lecture Notes in Computer Science 1498, Springer, 1998.<\/li>\n<\/ul>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2254\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2254\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2253\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tA learning rule based method of feature extraction with application to acoustic signal classification\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2253\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2254\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Author:<\/strong>\u00a0M.J. Larkin,\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/www.brown.edu\/research\/projects\/brain-and-neural-systems\/\" target=\"_blank\">The Institute for Brain and Neural Systems<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> at Brown University<\/p>\n<p><strong>Abstract:<\/strong> We apply the Bienenstock, Cooper, and Munro (1982) theory of visual cortical plasticity to the problem of extracting features (i.e., reduction of dimensionality) from acoustic signals; in this case, labeled samples of marine mammal sounds. 
We first implemented BCM learning in a single neuron model, trained the neuron on samples of acoustic data, and then observed the response when the neuron was tested on different classes of acoustic signals. Next, a multiple neuron network was constructed, with lateral inhibition among the neurons. By training neurons to be selective to inherent features in these signals, we are able to develop networks which can then be used in the design of an automated acoustic signal classifier.<\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2256\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2256\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2255\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tLearning Extensible Multi-Entity Directed Graphical Models\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2255\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2256\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Author:<\/strong>\u00a0Kathryn Blackmond Laskey,\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/seor.gmu.edu\/\" target=\"_blank\">Department of Systems Engineering and Operations Research<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> at\u00a0George Mason University<\/p>\n<p><strong>Abstract:\u00a0<\/strong>Graphical models have become a standard tool for representing complex probability models in statistics and artificial intelligence. 
In problems arising in artificial intelligence, the belief network formalism is useful for representing uncertain relationships among variables in the domain, but it may not be possible to use a single, fixed belief network to encompass all problem instances. This is because the number of entities to be reasoned about, and their relationships to each other, vary from one problem instance to another. This paper describes a framework for representing probabilistic knowledge as fragments of belief networks and an approach to learning both structure and parameters from observations.<\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2258\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2258\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2257\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tA latent variable model for multivariate discretization\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2257\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2258\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:\u00a0<\/strong>Stefano Monti,\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.isp.pitt.edu\/\" target=\"_blank\">Intelligent Systems Program<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> at\u00a0University of Pittsburgh and\u00a0Gregory F. 
Cooper,\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.dbmi.pitt.edu\/\" target=\"_blank\">Center for Biomedical Informatics<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> at\u00a0University of Pittsburgh<\/p>\n<p><strong>Abstract:\u00a0<\/strong>We describe a new method for multivariate discretization based on the use of a latent variable model. The method is proposed as a tool to extend the scope of applicability of machine learning algorithms that handle discrete variables only. Building upon existing class-based discretization methods, we use a latent variable as a <em>proxy<\/em> class variable, which is then utilized to drive the partition of the value range of each continuous variable. We present experimental results on simulated data aimed at assessing the merits of the proposed method.<\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2260\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2260\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2259\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tTesting Regression Models With Fewer Regressors\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2259\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2260\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:\u00a0<\/strong>Judea Pearl and Peyman Meshkat<\/p>\n<p><strong>Abstract:\u00a0<\/strong>A BASIS for a model M is a minimal set of tests that, if 
satisfied, implies the satisfaction of all the assumptions behind M. This paper proposes a graphical procedure for recognizing bases of regression models. Using this procedure, it is possible to select a set of tests in which the number of regressors is small compared with standard tests, thus resulting in improved power.<\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2262\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2262\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2261\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tLearning Conditional Probabilities from Incomplete Databases &#8211; An Experimental Comparison\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2261\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2262\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:\u00a0<\/strong>Marco Ramoni and\u00a0Paola Sebastiani,\u00a0The Open University<\/p>\n<p><strong>Abstract:\u00a0<\/strong>This paper compares three methods &#8211; the EM algorithm, Gibbs sampling, and Bound and Collapse (BC) &#8211; to estimate conditional probabilities from incomplete databases in a controlled experiment. 
Results show a substantial equivalence of the estimates provided by the three methods and a dramatic gain in efficiency using BC.<\/p>\n<p><strong>Other information:\u00a0<\/strong>Further information is available from the home page of the <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/projects.kmi.open.ac.uk\/bkd\/\" target=\"_blank\">Bayesian Knowledge Discovery<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> project at <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.open.ac.uk\/\" target=\"_blank\">The Open University<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>.<\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2264\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2264\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2263\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tLocal Experts Combination through Density Decomposition\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2263\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2264\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:\u00a0<\/strong>Ahmed Rida,\u00a0Abderrahim Labbi, and Christian Pellegrini,\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/cuiwww.unige.ch\/AI-group\/home.html\" target=\"_blank\">Artificial Intelligence group<span class=\"sr-only\"> (opens 
in new tab)<\/span><\/a> at\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/cui.unige.ch\/fr\/\" target=\"_blank\">Centre Universitaire d&#8217;Informatique<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n<p><strong>Abstract:\u00a0<\/strong>In this paper we describe a <em>divide-and-combine<\/em> strategy for decomposition of a complex prediction problem into simpler local sub-problems. We first show how to perform a <em>soft<\/em> decomposition via clustering of the input data. Such a decomposition partitions the input space into several regions which may overlap. Each region is then assigned a local predictor (or expert) trained only on local data. To construct a solution to the global prediction problem, we combine the local experts using two approaches: <em>weighted averaging<\/em> where the outputs of local experts are weighted by their prior densities, and <em>nonlinear adaptive combination<\/em> where the pooling parameters are obtained through minimization of a global error. To illustrate the validity of our approach, we show simulation results for two classification tasks, <em>vowels<\/em> and <em>phonemes<\/em>, using local experts which are Multi-Layer Perceptrons (MLP) and Support Vector Machines (SVM). 
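The weighted-averaging combination mode described above can be sketched in a few lines. This is a minimal sketch under stated assumptions: Gaussian region densities, one-dimensional inputs, and two hypothetical linear experts stand in for the authors' clustered regions and MLP/SVM experts.

```python
import math

def gaussian_density(x, mean, var):
    """Density of a 1-D Gaussian, used as an (assumed) region prior density."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def combine(experts, x):
    """experts: list of (predict_fn, region_mean, region_var) tuples.

    Pools expert outputs, each weighted by its region's density at x.
    """
    weights = [gaussian_density(x, m, v) for _, m, v in experts]
    total = sum(weights)
    return sum(w / total * f(x) for w, (f, _, _) in zip(weights, experts))

# Two hypothetical local experts, each fit to one region of the input space.
experts = [
    (lambda x: 2 * x, 0.0, 1.0),   # expert for the region around 0
    (lambda x: x + 5, 10.0, 1.0),  # expert for the region around 10
]
print(combine(experts, 0.0))   # dominated by the first expert, near 0.0
print(combine(experts, 10.0))  # dominated by the second expert, near 15.0
```

Near a region's center its expert dominates the pooled prediction; in overlap zones the experts blend smoothly, which is the point of the soft decomposition.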
We compare the results obtained with the two local combination modes against those obtained with a global predictor and with a linear combination of global predictors.<\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2266\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2266\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2265\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tEntropy-Driven Inference and Inconsistency\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2265\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2266\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:\u00a0<\/strong>Wilhelm R\u00f6dder and\u00a0Longgui Xu,\u00a0FernUniversit\u00e4t Gesamthochschule in Hagen,\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.fernuni-hagen.de\/BWLOR\/index.php\" target=\"_blank\">Fachbereich Wirtschaftswissenschaft, Lehrstuhl f\u00fcr Betriebswirtschaftslehre, insb. Operations Research<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n<p><strong>Abstract:\u00a0<\/strong>Probability distributions on a set of discrete variables are a suitable means to represent knowledge about their respective mutual dependencies. When new evidence becomes available, such a distribution can be adapted to the new situation and hence subjected to a sound inference process. Knowledge acquisition and inference are here performed in the rich syntax of conditional events. 
Both acquisition and inference respect a sophisticated principle, namely that of maximum entropy and of minimum relative entropy. The freedom to formulate and derive knowledge in a language of rich syntax is comfortable but involves the danger of contradictions or inconsistencies. We develop a method for resolving such inconsistencies, which stem from the incompatibility of experts\u2019 knowledge in their respective branches. The method is applied to diagnosis in Chinese medicine. All calculations are performed in the entropy-driven expert system shell SPIRIT.<\/p>\n<p><strong>Availability:<\/strong> <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2016\/12\/roedder-1.pdf\" target=\"_blank\">PDF<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n<p><strong>References:\u00a0<\/strong><\/p>\n<p>[1] I. Csisz\u00e1r: I-Divergence Geometry of Probability Distributions and Minimisation Problems. The Annals of Probability 3, (1): 146 &#8211; 158 (1975).<br \/>[2] I. Csisz\u00e1r: Why Least Squares and Maximum Entropy? An Axiomatic Approach to Inference for Linear Inverse Problems. The Annals of Statistics 19 (4): 2032 &#8211; 2066 (1991).<br \/>[3] G. Kern-Isberner: Characterising the principle of minimum cross-entropy within a conditional-logical framework. Artificial Intelligence 98: 169-208 (1998).<br \/>[4] S. L. Lauritzen: Graphical Association Models (Draft), Technical Report IR 93-2001, Institute for Electronic Systems, Dept. of Mathematics and Computer Science, Aalborg University (1993).<br \/>[5] W. R\u00f6dder and G. Kern-Isberner: Representation and extraction of information by probabilistic logic. Information Systems 21 (8): 637 &#8211; 652 (1996).<br \/>[6] W. R\u00f6dder and C.-H. Meyer: Coherent knowledge processing at maximum entropy by SPIRIT, Proceedings 12th Conference on Uncertainty in Artificial Intelligence, E. Horvitz and F. Jensen (editors), Morgan Kaufmann, San Francisco, California: 470 &#8211; 476 (1996).<br \/>[7] C. C. 
Schnorrenberger: Lehrbuch der chinesischen Medizin f\u00fcr westliche \u00c4rzte, Hippokrates, Stuttgart (1985).<br \/>[8] C. E. Shannon: A mathematical theory of communication, Bell System Tech. J. 27, 379-423 (part I), 623 &#8211; 656 (part II) (1948).<br \/>[9] J. E. Shore and R. W. Johnson: Axiomatic Derivation of the Principle of Maximum Entropy and the Principle of Minimum Cross Entropy. IEEE Trans. Information Theory 26 (1): 26 &#8211; 37 (1980).<br \/>[10] J. Whittaker: Graphical Models in Applied Multivariate Statistics, John Wiley & Sons (1990).<span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2268\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2268\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2267\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tLearned Models for Continuous Planning\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2267\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2268\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:\u00a0<\/strong>Matthew D. Schmill,\u00a0Tim Oates, and Paul R. Cohen, Experimental Knowledge Systems Laboratory at\u00a0University of Massachusetts<\/p>\n<p><strong>Abstract:\u00a0<\/strong>We are interested in the nature of <em>activity<\/em>\u00a0\u2014 structured behavior of nontrivial duration \u2014\u00a0in intelligent agents. 
We believe that the development of activity is a continual process in which simpler activities are composed, via planning, to form more sophisticated ones in a hierarchical fashion. The success or failure of a planner depends on its models of the environment, and its ability to implement its plans in the world. We describe an approach to generating dynamical models of activity from real-world experiences and explain how they can be applied towards planning in a continuous state space.<\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2270\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2270\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2269\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tEfficient Optimization of Large k Real-time Control Algorithm\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2269\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2270\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:\u00a0<\/strong>Delphi Delco Electronics Systems,\u00a0Restraint Systems Electronics and\u00a0Daniel H. Loughlin, North Carolina State University<\/p>\n<p><strong>Abstract:\u00a0<\/strong>Resource requirements for global optimization increase dramatically with the number of real-valued decision variables (k). Efficient search strategies are needed to satisfy constraints of time, effort, and funding. 
In this paper, a conjunction of several disparate methods is used to automatically calibrate a non-linear real-time control algorithm used in the automotive industry. By combining a response surface methodology with a hybrid genetic algorithm search, air-bag deployment calibrations can be automated, producing solutions superior to conventional manual search.<\/p>\n<p>\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2272\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2272\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2271\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tModel Folding for Data Subject to Nonresponse\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2271\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2272\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t<\/p>\n<p><strong>Authors:<\/strong>\u00a0Paola Sebastiani, The Open University and\u00a0Marco Ramoni,\u00a0Knowledge Media Institute at\u00a0The Open University<\/p>\n<p><strong>Abstract:\u00a0<\/strong>In this paper we introduce a deterministic method to estimate the posterior probability of rival models from data with partially ignorable nonresponse.\u00a0 The accuracy of the method will be shown via an application to synthetic data.<\/p>\n<p><strong>Other information:\u00a0<\/strong>Further information is available from the home page of the <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/projects.kmi.open.ac.uk\/bkd\/\" 
target=\"_blank\">Bayesian Knowledge Discovery<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> project at <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.open.ac.uk\/\" target=\"_blank\">The Open University<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>.<\/p>\n<p>\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2274\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2274\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2273\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tGeometry, Moments and Bayesian Networks with Hidden Variables\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2273\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2274\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t<\/p>\n<p><strong>Authors:<\/strong>\u00a0Raffaella Settimi,\u00a0University of Warwick and\u00a0Jim Q. Smith,\u00a0University of Warwick<\/p>\n<p><strong>Abstract:\u00a0<\/strong>The purpose of this paper is to present a systematic way of analysing the geometry of the probability spaces for a particular class of Bayesian networks with hidden variables. It will be shown that the conditional independence statements implicit in such graphical models can be neatly expressed as simple polynomial relationships among central moments. 
This algebraic framework will enable us to explore and identify the structural constraints on the sample space induced by models with tree structures and therefore characterise the families of distributions consistent with such conditional independence assumptions.<\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2276\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2276\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2275\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tJoint probabilistic clustering of multivariate and sequential data\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2275\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2276\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Author:<\/strong>\u00a0Padhraic Smyth,\u00a0University of California\u2013Irvine<\/p>\n<p><strong>Abstract:\u00a0<\/strong>Consider the following problem. We have a set of individuals (a random sample from a larger population) whom we would like to cluster into groups based on observational data. For each individual we can measure characteristics which are relatively static (e.g., their height, weight, income, age, sex, etc.). Probabilistic model-based clustering in this context usually takes the form of a finite mixture model, where each component in the mixture is a multivariate probability density function (or distribution function) for a particular group. 
This approach has been found to be a useful general technique for extracting hidden structure from multivariate data (Ban eld and Raftery, 1993; Thiesson et al, 1997).<\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2278\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2278\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2277\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tAnalysis of multivariate time series via a hidden graphical model\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2277\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2278\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:<\/strong>\u00a0Elena Stanghellini, Universita&#8217; di Perugia and Joe Whittaker,\u00a0Lancaster University<\/p>\n<p><strong>Abstract:\u00a0<\/strong>We propose a chain graph with unobserved variables to model a multivariate time series. We assume that an underlying common trend linearly affects the observed time series, but we do not restrict our analysis to models where the underlying factor accounts for all the contemporary correlations of the series. The residual correlation is modelled using results of graphical models. Modelling the associations left unexplained is an alternative to augmenting the dimension of the underlying factor. It is justified when a clear interpretation of the residual associations is available. 
It is also an informative way to explore sources of deviation from standard dynamic single-factor models.<\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2280\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2280\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2279\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tVisual design support for probabilistic network application\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2279\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2280\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Author:\u00a0<\/strong>Axel Vogler,\u00a0Daimler Benz AG<\/p>\n<p><strong>Abstract:\u00a0<\/strong>Understanding inference in probabilistic networks is an important point in the design phase. Their causal structure and locally defined parameters are intuitive to human experts. The global system induced by the local parameters can lead to results not intended by the human experts. 
To support network design, an edge-coloring scheme is introduced that explains the influences between variables responsible for an inference result.<\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t\t\t\t\t<\/ul>\n\t<\/div>\n\t<span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>The Seventh International Workshop on Artificial Intelligence and Statistics was presented by The Society for Artificial Intelligence & Statistics.<\/p>\n","protected":false},"featured_media":0,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr_startdate":"1999-01-03","msr_enddate":"1999-01-06","msr_location":"Fort Lauderdale, FL, USA","msr_expirationdate":"","msr_event_recording_link":"","msr_event_link":"","msr_event_link_redirect":false,"msr_event_time":"","msr_hide_region":false,"msr_private_event":true,"msr_hide_image_in_river":0,"footnotes":""},"research-area":[13556],"msr-region":[197900],"msr-event-type":[197941],"msr-video-type":[],"msr-locale":[268875],"msr-program-audience":[],"msr-post-option":[],"msr-impact-theme":[],"class_list":["post-336896","msr-event","type-msr-event","status-publish","hentry","msr-research-area-artificial-intelligence","msr-region-north-america","msr-event-type-conferences","msr-locale-en_us"],"msr_about":"<!-- wp:msr\/event-details {\"title\":\"Uncertainty 99\",\"backgroundColor\":\"grey\"} \/-->\n\n<!-- wp:msr\/content-tabs --><!-- wp:msr\/content-tab {\"title\":\"Home\"} --><!-- wp:freeform --><p><strong>Venue:<\/strong>\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" 
href=\"http:\/\/doubletree3.hilton.com\/en\/hotels\/florida\/bahia-mar-fort-lauderdale-beach-a-doubletree-by-hilton-hotel-FLLBMDT\/index.html\" target=\"_blank\">Radisson Bahia Mar Hotel<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n<p>Uncertainty 99 was the Seventh International Workshop on Artificial Intelligence and Statistics and was\u00a0presented by\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.gatsby.ucl.ac.uk\/aistats\/society.html\" target=\"_blank\">The Society for Artificial Intelligence &amp; Statistics<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>.<\/p>\n<h2>Program Committee<\/h2>\n<p>\t\t\t<div class=\"ms-grid \">\n\t\t\t<div class=\"ms-row\">\n\t\t\t\t\t<div  class=\"m-col-12-24\" >\n\t\t<ul><li>Russell Almond, Educational Testing Service<\/li><li>Chris Bishop, Microsoft Research<\/li><li>Wray Buntine, Ultimode Systems<\/li><li>Peter Cheeseman, NASA Ames<\/li><li>Max Chickering, Microsoft Research<\/li><li>Paul Cohen, University of Massachusetts<\/li><li>Greg Cooper, University of Pittsburgh<\/li><li>Philip Dawid, University College London<\/li><li>David Dowe, Monash University<\/li><li>William DuMouchel, AT&amp;T Labs<\/li><li>Sue Dumais, Microsoft Research<\/li><li>David Edwards, Novo Nordisk<\/li><li>Doug Fisher, Vanderbilt University<\/li><li>Nir Friedman, Hebrew University\u2013Jerusalem<\/li><li>Dan Geiger, Technion<\/li><li>Edward George, University of Texas<\/li><li>Clark Glymour, Carnegie-Mellon University<\/li><li>Moises Goldszmidt, SRI International<\/li><li>David Hand, Open University<\/li><li>Geoff Hinton, University of Toronto<\/li><li>Tommi Jaakkola, MIT<\/li><li>Michael Jordan, UC Berkeley<\/li><\/ul><p>\t<\/div>\n\t \t<div  class=\"m-col-12-24\" >\n\t\t<\/p><ul><li>Michael Kearns, AT&amp;T Labs<\/li><li>Daphne Koller, Stanford 
University<\/li><li>Steffen Lauritzen, Aalborg University<\/li><li>Hans Lenz, Free University of Berlin<\/li><li>David Lewis, AT&amp;T Labs<\/li><li>David Madigan, University of Washington<\/li><li>Andrew Moore, Carnegie-Mellon University<\/li><li>Daryl Pregibon, AT&amp;T Labs<\/li><li>Thomas Richardson, University of Washington<\/li><li>Alberto Roverato, Universita di Modena<\/li><li>Lawrence Saul, AT&amp;T Labs<\/li><li>Ross Shachter, Stanford University<\/li><li>Richard Scheines, Carnegie-Mellon University<\/li><li>Sebastian Seung, MIT<\/li><li>Prakash Shenoy, University of Kansas<\/li><li>Padhraic Smyth, UC Irvine<\/li><li>David Spiegelhalter, MRC\u2013Cambridge<\/li><li>Peter Spirtes, Carnegie-Mellon University<\/li><li>Milan Studeny, Academy of Sciences of Czech Republic<\/li><li>Nanny Wermuth, Mainz University<\/li><\/ul><p>\t<\/div>\n\t<\/p>\t\t\t<\/div>\n\t\t<\/div>\n\t\t<span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n<!-- \/wp:freeform --><!-- \/wp:msr\/content-tab --><!-- wp:msr\/content-tab {\"title\":\"Program\"} --><!-- wp:freeform --><h2>Monday, January 4<\/h2>\n<table class=\"msr-table-schedule\" style=\"border-spacing: inherit;border-collapse: collapse\">\n<thead class=\"thead\">\n<tr class=\"tr\">\n<th class=\"th\" style=\"padding: inherit;border: inherit\">Time<\/th>\n<th class=\"th\" style=\"padding: inherit;border: inherit\">Session<\/th>\n<th class=\"th\" style=\"padding: inherit;border: inherit\">Speaker<\/th>\n<\/tr>\n<\/thead>\n<tbody class=\"tbody\">\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">\n<div class=\"msr-table-schedule-cell\">7:30\u20138:45<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div class=\"msr-table-schedule-cell\">\n<p>Registration\/Continental Breakfast<\/p>\n<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr 
class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">8:45\u20139:00<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div class=\"msr-table-schedule-cell\">\n<p>Opening Comments<\/p>\n<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">David Heckerman and Joe Whittaker<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">\n<div class=\"msr-table-schedule-cell\">9:00\u201311:00<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div class=\"msr-table-schedule-cell\">\n<p><strong>Session I:\u00a0Model Choice<br \/>\n<\/strong><strong>Chair:<\/strong> Thomas Richardson<\/p>\n<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\">Process-oriented evaluation: The next step<\/td>\n<td style=\"padding: inherit;border: inherit\">Pedro Domingos<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\">Model choice<\/td>\n<td style=\"padding: inherit;border: inherit\">Alan Gelfand and Sujit Ghosh<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\">A note on the comparison of polynomial selection methods<\/td>\n<td style=\"padding: inherit;border: inherit\">Murlikrishna Viswanathan<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\">Pattern discovery 
via entropy minimization<\/td>\n<td style=\"padding: inherit;border: inherit\">Matthew Brand<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">11:00\u201311:30<\/td>\n<td style=\"padding: inherit;border: inherit\">Break<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">11:30\u201312:30<\/td>\n<td style=\"padding: inherit;border: inherit\"><strong>Session II: <\/strong><b>Latent variables<br \/>\n<\/b><strong>Chair:<\/strong> Kathryn Laskey<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\">On the geometry of DAG models with hidden variables<\/td>\n<td style=\"padding: inherit;border: inherit\">Dan Geiger, David Heckerman, Henry King, Chris Meek<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\">Efficient structure search in the presence of latent variables<\/td>\n<td style=\"padding: inherit;border: inherit\">Thomas Richardson, Heiko Bailer, Moulinath Banerjee<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">12:30\u20131:30<\/td>\n<td style=\"padding: inherit;border: inherit\">Lunch (<em>provided<\/em>)<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">1:30\u20135:00<\/td>\n<td 
style=\"padding: inherit;border: inherit\">Break<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">5:00\u20136:00<\/td>\n<td style=\"padding: inherit;border: inherit\"><strong>Poster Summaries<\/strong> (2 mins\/poster)<br \/>\n<strong>Chair:<\/strong> Joe Whittaker<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">6:00\u20137:00<\/td>\n<td style=\"padding: inherit;border: inherit\">Dinner<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">7:00\u20139:30<\/td>\n<td style=\"padding: inherit;border: inherit\">Poster Sessions<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h2>Tuesday, January 5<\/h2>\n<table class=\"msr-table-schedule\" style=\"border-spacing: inherit;border-collapse: collapse\">\n<thead class=\"thead\">\n<tr class=\"tr\">\n<th class=\"th\" style=\"padding: inherit;border: inherit\">Time<\/th>\n<th class=\"th\" style=\"padding: inherit;border: inherit\">Session<\/th>\n<th class=\"th\" style=\"padding: inherit;border: inherit\">Speaker<\/th>\n<\/tr>\n<\/thead>\n<tbody class=\"tbody\">\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">\n<div class=\"msr-table-schedule-cell\">8:00\u20139:00<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div class=\"msr-table-schedule-cell\">\n<p>Continental Breakfast<\/p>\n<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: 
inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">9:00\u201310:00<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div class=\"msr-table-schedule-cell\">\n<p><strong>Session III: Theory<br \/>\nChair:<\/strong> David Madigan<\/p>\n<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div class=\"msr-table-schedule-cell\">\n<p>Conditional products: an alternative approach to conditional independence<\/p>\n<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">Phil Dawid, Milan Studeny<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\">Hierarchical mixtures-of-experts for generalized linear models: some results on denseness and consistency<\/td>\n<td style=\"padding: inherit;border: inherit\">Wenxin Jiang, Martin Tanner<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">10:00\u201310:30<\/td>\n<td style=\"padding: inherit;border: inherit\">Coffee Break<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">10:30\u201311:30<\/td>\n<td style=\"padding: inherit;border: inherit\"><strong>Session IV: Regression<br \/>\nChair:<\/strong> Padhraic Smyth<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\n<td 
style=\"padding: inherit;border: inherit\">Boosting methodology for regression problems<\/td>\n<td style=\"padding: inherit;border: inherit\">Greg Ridgeway, David Madigan, Thomas Richardson<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\">Probabilistic kernel regression models<\/td>\n<td style=\"padding: inherit;border: inherit\">Tommi Jaakkola, David Haussler<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">11:30\u201312:30<\/td>\n<td style=\"padding: inherit;border: inherit\"><strong>Session V: Computational Methods<br \/>\nChair:<\/strong> Padhraic Smyth<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\">Learning structure from data efficiently: applying bounding techniques<\/td>\n<td style=\"padding: inherit;border: inherit\">Nir Friedman, Lise Getoor<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\">Efficient mining of statistical dependencies<\/td>\n<td style=\"padding: inherit;border: inherit\">Tim Oates, Paul Cohen, Casey Durfee<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">12:30\u20132:00<\/td>\n<td style=\"padding: inherit;border: inherit\">Lunch (<em>provided<\/em>)<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" 
style=\"padding: inherit;border: inherit\">2:00\u20133:30<\/td>\n<td style=\"padding: inherit;border: inherit\"><strong>Session VI: Applications<\/strong><\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\">Causal mechanisms and classification trees for predicting chemical carcinogens<\/td>\n<td style=\"padding: inherit;border: inherit\">Louis A Cox<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\">Geometric modelling of a nuclear environment<\/td>\n<td style=\"padding: inherit;border: inherit\">Jan De Geeter<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\">Modeling decision tree performance with the power law<\/td>\n<td style=\"padding: inherit;border: inherit\">Lewis Frey, Doug Fisher<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">3:30\u20134:40<\/td>\n<td style=\"padding: inherit;border: inherit\">Business Meeting<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h2>Wednesday, January 6<\/h2>\n<table class=\"msr-table-schedule\" style=\"border-spacing: inherit;border-collapse: collapse\">\n<thead class=\"thead\">\n<tr class=\"tr\">\n<th class=\"th\" style=\"padding: inherit;border: inherit\">Time<\/th>\n<th class=\"th\" style=\"padding: inherit;border: inherit\">Session<\/th>\n<th class=\"th\" style=\"padding: inherit;border: 
Speak">
inherit\">Speaker<\/th>\n<\/tr>\n<\/thead>\n<tbody class=\"tbody\">\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">\n<div class=\"msr-table-schedule-cell\">8:00\u20139:00<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div class=\"msr-table-schedule-cell\">\n<p>Continental Breakfast<\/p>\n<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">9:00\u201310:30<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div class=\"msr-table-schedule-cell\">\n<p><strong>Session VII: Inference<br \/>\nChair:<\/strong> Greg Cooper<\/p>\n<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div class=\"msr-table-schedule-cell\">\n<p>Model-independent mean field theory as a local method for approximate propagation of information<\/p>\n<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">Michael Haft, Reimar Hofmann, Volker Tresp<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\">Hierarchical IFA belief networks<\/td>\n<td style=\"padding: inherit;border: inherit\">Hagai Attias<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\">Stochastic local search for Bayesian networks<\/td>\n<td style=\"padding: inherit;border: inherit\">Kalev Kask, Rina Dechter<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr 
class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">10:30\u201311:00<\/td>\n<td style=\"padding: inherit;border: inherit\">Coffee Break<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">11:00\u201312:00<\/td>\n<td style=\"padding: inherit;border: inherit\"><strong>Session VIII: Applications<br \/>\nChair:<\/strong> Doug Fisher<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\">An experiment in causal discovery using a pneumonia database<\/td>\n<td style=\"padding: inherit;border: inherit\">Peter Spirtes, Greg Cooper<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\">Bayesian graphical models for non-compliance in randomized trials<\/td>\n<td style=\"padding: inherit;border: inherit\">David Madigan<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<tr class=\"tr\">\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">12:00\u201312:15<\/td>\n<td style=\"padding: inherit;border: inherit\">Closing Remarks<\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<td style=\"padding: inherit;border: inherit\"><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><em>Plenary Presentations lasted 25 minutes with 5 minutes for questions.<\/em><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n<!-- \/wp:freeform --><!-- \/wp:msr\/content-tab --><!-- wp:msr\/content-tab {\"title\":\"Tutorials u0026 Abstracts\"} --><!-- wp:freeform --><h2>Tutorials | 
January 3<\/h2>\n<p>\t<div data-wp-context='{\"items\":[]}' data-wp-interactive=\"msr\/accordion\">\n\t\t\t\t\t<div class=\"clearfix\">\n\t\t\t\t<div\n\t\t\t\t\tclass=\"btn-group align-items-center mb-g float-sm-right\"\n\t\t\t\t\tdata-bi-aN=\"accordion-collapse-controls\"\n\t\t\t\t>\n\t\t\t\t\t<button\n\t\t\t\t\t\tclass=\"btn btn-link m-0\"\n\t\t\t\t\t\tdata-bi-cN=\"Expand all\"\n\t\t\t\t\t\tdata-wp-bind--aria-controls=\"state.ariaControls\"\n\t\t\t\t\t\tdata-wp-bind--aria-expanded=\"state.ariaExpanded\"\n\t\t\t\t\t\tdata-wp-bind--disabled=\"state.isAllExpanded\"\n\t\t\t\t\t\tdata-wp-class--inactive=\"state.isAllExpanded\"\n\t\t\t\t\t\tdata-wp-on--click=\"actions.onExpandAll\"\n\t\t\t\t\t\ttype=\"button\"\n\t\t\t\t\t>\n\t\t\t\t\t\tExpand all\t\t\t\t\t<\/button>\n\t\t\t\t\t<span aria-hidden=\"true\"> | <\/span>\n\t\t\t\t\t<button\n\t\t\t\t\t\tclass=\"btn btn-link m-0\"\n\t\t\t\t\t\tdata-bi-cN=\"Collapse all\"\n\t\t\t\t\t\tdata-wp-bind--aria-controls=\"state.ariaControls\"\n\t\t\t\t\t\tdata-wp-bind--aria-expanded=\"state.ariaExpanded\"\n\t\t\t\t\t\tdata-wp-bind--disabled=\"state.isAllCollapsed\"\n\t\t\t\t\t\tdata-wp-class--inactive=\"state.isAllCollapsed\"\n\t\t\t\t\t\tdata-wp-on--click=\"actions.onCollapseAll\"\n\t\t\t\t\t\ttype=\"button\"\n\t\t\t\t\t>\n\t\t\t\t\t\tCollapse all\t\t\t\t\t<\/button>\n\t\t\t\t<\/div>\n\t\t\t<\/div>\n\t\t\t\t<ul class=\"msr-accordion\">\n\t\t\t\t\t\t\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2186\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2186\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2185\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tInformation Access and 
Retrieval\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2185\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2186\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Speaker:<\/strong>\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/sdumais\/\" target=\"_blank\">Susan Dumais<\/a>, Microsoft Research |\u00a08:30 AM \u2013\u00a010:30 AM<\/p>\n<p>The Web has made literally terabytes of information available at the click of a mouse. The challenge is in finding the right information. Information retrieval is concerned with providing access to textual data for which we have no good formal model, such as a relational model. Statistical approaches have been widely applied to this problem. This tutorial will provide an overview of: a) statistical characteristics of large text collections (e.g., size, sparsity, word distributions), b) important retrieval models (e.g., Boolean, vector space and probabilistic), and c) enhancements which use unsupervised learning to model structure in text collections, or supervised learning to incorporate user feedback. 
We will conclude with a discussion of open research issues where improved statistical models can improve performance.<\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2188\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2188\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2187\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tBayesian Statistical Analysis\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2187\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2188\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Speaker:<\/strong>\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.statslab.cam.ac.uk\/Dept\/People\/Spiegelhalter\/davids.html\" target=\"_blank\">David Spiegelhalter<\/a>, MRC Biostatistics Unit, Institute for Public Health, Cambridge | 11:00 AM \u2013 12:00 PM and 1:30 PM \u2013\u00a02:30 PM<\/p>\n<p>The first part of the tutorial will cover the fundamentals of Bayesian inference, including probability and its subjective interpretation, evaluation of probability assessments using scoring rules, utilities and decision theory. The use of Bayes theorem for updating beliefs will be illustrated for both binomial and normal likelihoods, and the use of conjugate families of priors and predictive distributions described. The First Bayes software will be used to display conjugate Bayesian analysis. 
The second part will introduce the concept of `exchangeability&#8217;, and the consequent use of hierarchical models in which the unknown parameters of a common prior are included in the model. Conditional independence assumptions lead naturally to a graphical representation of hierarchical models. Markov chain Monte Carlo (MCMC) methods will be introduced as a means of carrying out the necessary numerical integrations, and topics covered will include the relationship of Gibbs sampling to graphical modelling, parameterisation, initial values, and choice of prior distributions. Real examples will be used throughout, and on-line analysis of an example in longitudinal modelling with measurement error on predictors will be carried out using the WinBUGS program.<\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2190\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2190\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2189\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tAdditive Logistic Regression: A Statistical View of Boosting\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2189\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2190\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Speaker:\u00a0<\/strong><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/web.stanford.edu\/~hastie\/\" target=\"_blank\">Trevor Hastie<\/a>, Stanford University |\u00a03:30 PM \u2013\u00a05:00 
PM<\/p>\n<p>Boosting (Freund and Schapire, 1995) is one of the most important recent developments in classification methodology. Boosting works by sequentially applying a classification algorithm to reweighted versions of the training data, and then taking a weighted majority vote of the sequence of classifiers thus produced. For many classification algorithms, this simple strategy results in dramatic improvements in performance. We show that this seemingly mysterious phenomenon can be understood in terms of well-known statistical principles, namely additive modeling and maximum likelihood. For the two-class problem, boosting can be viewed as an approximation to additive modeling on the logistic scale using maximum Bernoulli likelihood as a criterion. We develop more direct approximations and show that they exhibit nearly identical results to boosting. Direct multi-class generalizations based on multinomial likelihood are derived that exhibit performance comparable to other recently proposed multi-class generalizations of boosting in most situations, and far superior in some. We suggest a minor modification to boosting that can reduce computation, often by factors of 10 to 50. Finally, we apply these insights to produce an alternative formulation of boosting decision trees. This approach, based on best-first truncated tree induction, often leads to better performance, and can provide interpretable descriptions of the aggregate decision rule. 
It is also much faster computationally, making it more suitable for large-scale data mining applications.<\/p>\n<p><em>* joint work with Jerome Friedman and Rob Tibshirani<\/em><\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t\t\t\t\t<\/ul>\n\t<\/div>\n\t<\/p>\n<h2>Abstracts | January 4 &#8211; 6<\/h2>\n<p>\t<div data-wp-context='{\"items\":[]}' data-wp-interactive=\"msr\/accordion\">\n\t\t\t\t\t<div class=\"clearfix\">\n\t\t\t\t<div\n\t\t\t\t\tclass=\"btn-group align-items-center mb-g float-sm-right\"\n\t\t\t\t\tdata-bi-aN=\"accordion-collapse-controls\"\n\t\t\t\t>\n\t\t\t\t\t<button\n\t\t\t\t\t\tclass=\"btn btn-link m-0\"\n\t\t\t\t\t\tdata-bi-cN=\"Expand all\"\n\t\t\t\t\t\tdata-wp-bind--aria-controls=\"state.ariaControls\"\n\t\t\t\t\t\tdata-wp-bind--aria-expanded=\"state.ariaExpanded\"\n\t\t\t\t\t\tdata-wp-bind--disabled=\"state.isAllExpanded\"\n\t\t\t\t\t\tdata-wp-class--inactive=\"state.isAllExpanded\"\n\t\t\t\t\t\tdata-wp-on--click=\"actions.onExpandAll\"\n\t\t\t\t\t\ttype=\"button\"\n\t\t\t\t\t>\n\t\t\t\t\t\tExpand all\t\t\t\t\t<\/button>\n\t\t\t\t\t<span aria-hidden=\"true\"> | <\/span>\n\t\t\t\t\t<button\n\t\t\t\t\t\tclass=\"btn btn-link m-0\"\n\t\t\t\t\t\tdata-bi-cN=\"Collapse all\"\n\t\t\t\t\t\tdata-wp-bind--aria-controls=\"state.ariaControls\"\n\t\t\t\t\t\tdata-wp-bind--aria-expanded=\"state.ariaExpanded\"\n\t\t\t\t\t\tdata-wp-bind--disabled=\"state.isAllCollapsed\"\n\t\t\t\t\t\tdata-wp-class--inactive=\"state.isAllCollapsed\"\n\t\t\t\t\t\tdata-wp-on--click=\"actions.onCollapseAll\"\n\t\t\t\t\t\ttype=\"button\"\n\t\t\t\t\t>\n\t\t\t\t\t\tCollapse all\t\t\t\t\t<\/button>\n\t\t\t\t<\/div>\n\t\t\t<\/div>\n\t\t\t\t<ul class=\"msr-accordion\">\n\t\t\t\t\t\t\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2192\"}' data-wp-init=\"callbacks.init\">\n\t\t<div 
class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2192\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2191\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tHierarchical IFA Belief Networks\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2191\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2192\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Author:<\/strong>\u00a0Hagai Attias,\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.gatsby.ucl.ac.uk\/\" target=\"_blank\">Gatsby Computational Neuroscience Unit<\/a>,\u00a0University College London<\/p>\n<p><strong>Abstract:<\/strong>\u00a0We introduce a new real-valued belief network, which is a multilayer generalization of independent factor analysis (IFA). At each level, this network extracts real-valued latent variables that are non-linear functions of the input data with a highly adaptive functional form, resulting in a hierarchical distributed representation of these data. The network is based on a probabilistic generative model, constructed by cascading single-layer IFA models. Although exact maximum-likelihood learning for this model is intractable, we present and demonstrate an algorithm that maximizes a lower bound on the likelihood. 
This algorithm is developed by formulating a variational approach to hierarchical IFA networks.<\/p>\n<p><strong>Availability:<\/strong>\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/01\/hifan.pdf\" target=\"_blank\">PDF<\/a><\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2194\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2194\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2193\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tPattern discovery via entropy minimization\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2193\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2194\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Author:<\/strong>\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.merl.com\/people\/brand\" target=\"_blank\">Matthew Brand<\/a>,\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.merl.com\/\" target=\"_blank\">Mitsubishi Electric Research Labs<\/a><\/p>\n<p><strong>Abstract:<\/strong>\u00a0We propose a framework for learning hidden-variable models by optimizing entropies, in which entropy minimization, posterior maximization, and free energy minimization are all equivalent. 
Solutions for the maximum <em>a posteriori<\/em> (MAP) estimator yield powerful learning algorithms that combine all the charms of expectation-maximization and deterministic annealing. Contained as special cases are the methods of maximum entropy, maximum likelihood, and a new method, maximum structure. We focus on the maximum structure case, in which entropy minimization maximizes the amount of evidence supporting each parameter while minimizing uncertainty in the sufficient statistics and cross-entropy between the model and the data. In iterative estimation, the MAP estimator gradually extinguishes excess parameters, sculpting a model structure that reflects hidden structures in the data. These models are highly resistant to over-fitting and have the particular virtue of being easy to interpret, often yielding insights into the hidden causes that generate the data.<\/p>\n<p><strong>Availability:<\/strong>\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/01\/pattern-discovery-entropy-minimization.pdf\" target=\"_blank\">PDF<\/a><\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2196\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2196\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2195\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tCausal Mechanisms and Classification Trees for Predicting Chemical 
Carcinogens\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2195\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2196\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Author:<\/strong> Louis Anthony (&#8220;Tony&#8221;) Cox, Jr.,\u00a0Cox Associates<\/p>\n<p><strong>Abstract:<\/strong>\u00a0Classification trees, usually used as a nonlinear, nonparametric classification method, can also provide a powerful framework for comparing, assessing, and combining information from different expert systems, by treating their predictions as the independent variables in a classification tree analysis. This paper discusses the applied problem of classifying chemicals as human carcinogens. It shows how classification trees can be used to compare the information provided by ten different carcinogen classification expert systems, construct an improved &#8220;hybrid&#8221; classification system from them, and identify cost-effective combinations of assays (the inputs to the expert systems) to use in classifying chemicals in future.<\/p>\n<p><strong>Availability:<\/strong>\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/01\/CAUSAL_MECHANISMS_AND_CLASSIFICATION_TREES_FOR_PR.pdf\" target=\"_blank\">PDF<\/a><\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2198\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2198\"\n\t\t\t\tclass=\"btn 
btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2197\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tConditional Products: An Alternative Approach to Conditional Independence\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2197\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2198\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:<\/strong>\u00a0A. Philip Dawid,\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.ucl.ac.uk\/statistics\/\" target=\"_blank\">Department of Statistical Science<\/a>, University College London; <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/staff.utia.cas.cz\/studeny\/studeny_home.html?q=user_data\/studeny\/studeny_home.html\" target=\"_blank\">Milan Studeny<\/a>,\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.utia.cas.cz\/\" target=\"_blank\">Institute of Information Theory and Automation<\/a>, Academy of Sciences of Czech Republic, and Laboratory of Intelligent Systems, University of Economics Prague<\/p>\n<p><strong>Abstract:\u00a0<\/strong>We introduce a new abstract approach to the study of conditional independence, founded on a concept analogous to the factorization properties of probabilistic independence, rather than the separation properties of a graph. The basic ingredient is the &#8220;conditional product&#8221;, which provides a way of combining the basic objects under consideration while preserving as much independence as possible. 
We introduce an appropriate axiom system for conditional product, and show how, when these axioms are obeyed, they induce a derived concept of conditional independence which obeys the usual semi-graphoid axioms. The general structure is used to throw light on three specific areas: the familiar probabilistic framework (both the discrete and the general case); a set-theoretic framework related to &#8220;variation independence&#8221;; and a variety of graphical frameworks.<\/p>\n<p><strong>Availability:<\/strong>\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/01\/daw-stu-99-pdf.pdf\" target=\"_blank\">PDF<\/a><\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2200\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2200\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2199\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tGeometric Modeling of a Nuclear Environment\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2199\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2200\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:<\/strong>\u00a0Jan De Geeter and Marc Decr\u00e9ton,\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.sckcen.be\/\" target=\"_blank\">SCK.CEN<\/a> (Belgian Nuclear Research Centre);\u00a0Joris De Schutter, Herman Bruyninckx, and Hendrik Van Brussel,\u00a0Department of Mechanical Engineering,\u00a0<a 
class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/www.mech.kuleuven.be\/en\/pma\" target=\"_blank\">Division PMA<\/a>,\u00a0Katholieke Universiteit Leuven<\/p>\n<p><strong>Abstract:\u00a0<\/strong>This paper is about the task-directed updating of an incomplete and inaccurate geometric model of a nuclear environment, using only robust radiation-resistant sensors installed on a robot that is remotely controlled by a human operator. In this problem, there are many sources of uncertainty and ambiguity. This paper proposes a probabilistic solution under Gaussian assumptions. Uncertainty is reduced with an estimator based on a Kalman filter. Ambiguity on the measurement-feature association is resolved by running a bank of those estimators in parallel, one for each plausible association. The residual errors of these estimators are used for hypothesis testing and for the calculation of a probability distribution over the remaining hypotheses. 
The best next sensing action is calculated as a Bayes decision with respect to a loss function that takes into account both the uncertainty on the current estimate, and the variance\/precision required by the task.<\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2202\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2202\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2201\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tProcess-Oriented Evaluation: The Next Step\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2201\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2202\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Author:<\/strong>\u00a0Pedro Domingos,\u00a0Artificial Intelligence Group,\u00a0Instituto Superior T\u00e9cnico<\/p>\n<p><strong>Abstract:<\/strong>\u00a0Methods to avoid overfitting fall into two broad categories: data-oriented (using separate data for validation) and representation-oriented (penalizing complexity in the model). Both have limitations that are hard to overcome. We argue that fully adequate model evaluation is only possible if the search process by which models are obtained is also taken into account. To this end, we recently proposed a method for <em>process-oriented evaluation <\/em>(POE), and successfully applied it to rule induction (Domingos, 1998). However, for the sake of simplicity this treatment made two rather artificial assumptions. 
In this paper the assumptions are removed, and a simple formula for model evaluation is obtained. Empirical trials show the new, better-founded form of POE to be as accurate as the previous one, while further reducing theory sizes.<\/p>\n<p><strong>Availability:<\/strong> <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/01\/process-oriented.pdf\" target=\"_blank\">PDF<\/a><\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2204\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2204\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2203\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tModeling Decision Tree Performance with the Power Law\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2203\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2204\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:<\/strong> Lewis J. Frey and Douglas H. Fisher, Jr.,\u00a0Computer Science Department,\u00a0Vanderbilt University<\/p>\n<p><strong>Abstract:\u00a0<\/strong>This paper discusses the use of a power law to predict decision tree performance. Power laws are fit to learning curves of decision trees trained on data sets from the UCI repository. The learning curves are generated by training C4.5 on different size training sets. The power law predicts diminishing returns in terms of error rate as training set size increases. 
By characterizing the learning curve with a power law, the error rate for a given size training set can be projected. This projection can be used in estimating the amount of data needed to achieve an acceptable error rate, and the cost effectiveness of further data collection.<\/p>\n<p><strong>Availability:<\/strong>\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/01\/ModelingTree.pdf\" target=\"_blank\">PDF<\/a><\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2206\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2206\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2205\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tEfficient Learning using Constrained Sufficient Statistics\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2205\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2206\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:<\/strong>\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.cs.huji.ac.il\/~nir\/\" target=\"_blank\">Nir Friedman<\/a>,\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.cs.huji.ac.il\/\" target=\"_blank\">Institute of Computer Science<\/a>,\u00a0The Hebrew University;\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/robotics.stanford.edu\/~getoor\/\" 
target=\"_blank\">Lise Getoor<\/a>,\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www-cs.stanford.edu\/\" target=\"_blank\">Computer Science Department<\/a>,\u00a0Stanford University<\/p>\n<p><strong>Abstract:\u00a0<\/strong>Learning Bayesian networks is a central problem for pattern recognition, density estimation and classification. In this paper, we propose a new method for speeding up the computational process of learning Bayesian network <em>structure<\/em>. This approach uses constraints imposed by the statistics already collected from the data to guide the learning algorithm. This allows us to reduce the number of statistics collected during learning and thus speed up the learning time. We show that our method is capable of learning structure from data more efficiently than traditional approaches. Our technique is of particular importance when the size of the datasets is large or when learning from incomplete data. The basic technique that we introduce is general and can be used to improve learning performance in many settings where sufficient statistics must be computed. 
In addition, our technique may be useful for alternate search strategies such as branch and bound algorithms.<\/p>\n<p><strong>Availability:<\/strong> <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/01\/FGe1.pdf\" target=\"_blank\">PDF<\/a><\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2208\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2208\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2207\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tOn the geometry of DAG models with hidden variables\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2207\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2208\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:<\/strong>\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.cs.technion.ac.il\/~dang\/\" target=\"_blank\">Dan Geiger<\/a>, <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/heckerma\/\" target=\"_blank\">David Heckerman<\/a>, and Christopher Meek, Decision Theory &amp; Adaptive Systems,\u00a0Microsoft Research;\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.math.umd.edu\/~hking\/\" target=\"_blank\">Henry King<\/a>,\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.math.umd.edu\" target=\"_blank\">Mathematics 
Department<\/a>,\u00a0University of Maryland<\/p>\n<p><strong>Abstract:\u00a0<\/strong>We prove that many graphical models with hidden variables are not curved exponential families. This result, together with the fact that some graphical models are curved and not linear, implies that the hierarchy of graphical models, as linear, curved, and stratified, is non-collapsing; each level in the hierarchy is strictly contained in the larger levels. This result is discussed in the context of model selection of graphical models.<\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2210\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2210\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2209\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tModel Choice: A minimum posterior predictive loss approach\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2209\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2210\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:<\/strong>\u00a0Sujit Ghosh,\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.stat.ncsu.edu\/\" target=\"_blank\">Department of Statistics<\/a>,\u00a0NC State University;\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/stat.uconn.edu\/alan-gelfand\/\" target=\"_blank\">Alan E. 
Gelfand<\/a>,\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/stat.uconn.edu\/\" target=\"_blank\">Department of Statistics<\/a>,\u00a0University of Connecticut<\/p>\n<p><strong>Abstract:\u00a0<\/strong>Model choice is a fundamental activity in the analysis of data sets, an activity which has become increasingly important as computational advances enable the fitting of increasingly complex models. Such complexity typically arises through hierarchical structure which requires specification at each stage of probabilistic mechanisms, mean and dispersion forms, explanatory variables, etc. Nonnested hierarchical models introducing random effects may not be handled by classical methods. Bayesian approaches using predictive distributions can be used, though the <em>formal<\/em> solution, which includes Bayes factors as a special case, can be criticized. It seems natural to evaluate model performance by comparing what it predicts with what has been observed. Most classical criteria utilize such comparison. We propose a predictive criterion where the goal is good prediction of a replicate of the observed data but tempered by fidelity to the observed values. We obtain this criterion by minimizing posterior loss for a given model and then, for models under consideration, selecting the one which minimizes this criterion. For a version of log scoring loss we can do the minimization explicitly, obtaining an expression which can be interpreted as a penalized deviance criterion. 
We illustrate its performance with an application to a large data set involving residential property transactions.<\/p>\n<p><strong>Availability:<\/strong> <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/01\/gelfand_biometrika_1998.pdf\" target=\"_blank\">PDF<\/a><\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2212\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2212\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2211\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tMean Field Inference in a General Probabilistic Setting\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2211\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2212\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:<\/strong>\u00a0Michael Haft,\u00a0Reimar Hofmann, and\u00a0Volker Tresp,\u00a0Siemens AG,\u00a0Corporate Technology, Information and Communications Department<\/p>\n<p><strong>Abstract:<\/strong>\u00a0We present a systematic, model-independent formulation of mean field theory (MFT) as an inference method in probabilistic models. &#8220;Model-independent&#8221; means that we do not assume a particular type of dependency among the variables of a domain but instead work in a general probabilistic setting. In a Bayesian network, for example, one may use arbitrary tables to specify conditional dependencies and thus run MFT in <em>any<\/em> Bayesian network. 
Furthermore, the general mean field equations derived here shed light on the essence of MFT. MFT can be interpreted as a local iteration scheme which relaxes to a consistent state (a solution of the mean field equations). Iterating the mean field equations means propagating information through the network. In general, however, there are multiple solutions to the mean field equations. We show that improved approximations can be obtained by forming a weighted mixture of the multiple mean field solutions. Simple approximate expressions for the mixture weights are given. The benefits of taking into account multiple solutions are demonstrated by using MFT for inference in a small Bayesian network representing a medical domain. It turns out that every solution of the mean field equations can be interpreted as a &#8216;disease scenario&#8217;.<\/p>\n<p><strong>Availability:<\/strong> <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/01\/Mean_Field_Inference_in_a_General_Probabilistic_Se.pdf\" target=\"_blank\">PDF<\/a><\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2214\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2214\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2213\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tProbabilistic kernel regression models\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2213\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2214\"\n\t\t>\n\t\t\t<div 
class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:\u00a0<\/strong><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/people.csail.mit.edu\/tommi\/\" target=\"_blank\">Tommi S. Jaakkola<\/a>,\u00a0Department of Computer Science and Electrical Engineering,\u00a0Massachusetts Institute of Technology;\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/genomics-old.soe.ucsc.edu\/haussler\" target=\"_blank\">David Haussler<\/a>,\u00a0Department of Computer Science,\u00a0University of California\u2013Santa Cruz<\/p>\n<p><strong>Abstract:\u00a0<\/strong>We introduce a class of flexible conditional probability models and techniques for classification\/regression problems. Many existing methods such as generalized linear models and support vector machines are subsumed under this class. The flexibility of this class of techniques comes from the use of kernel functions as in support vector machines, and the generality from dual formulations of standard regression models.<\/p>\n<p><em><span style=\"font-family: wf_segoe-ui_bold, wf_segoe-ui_semibold, wf_segoe-ui_normal, Arial, sans-serif\">*<\/span>The work was done while T. 
Jaakkola was at UC Santa Cruz.<\/em><\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2216\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2216\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2215\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tHierarchical Mixtures-of-Experts for Generalized Linear Models: Some Results on Denseness and Consistency\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2215\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2216\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:<\/strong> Wenxin Jiang and Martin A. Tanner,\u00a0Department of Statistics,\u00a0Northwestern University<\/p>\n<p><strong>Abstract:\u00a0<\/strong>We investigate a class of hierarchical mixtures-of-experts (HME) models where exponential family regression models with generalized linear mean functions of the form $\\psi(a+x^T b)$ are mixed. Here $\\psi(\\cdot)$ is the inverse link function. Suppose the true response $y$ follows an exponential family regression model with mean function belonging to a class of smooth functions of the form $\\psi(h(x))$ where $h \\in W_{2;K_0}^{\\infty}$ (a Sobolev class over $[0,1]^{s}$). It is shown that the HME mean functions can approximate the true mean function, at a rate of $O(m^{-2\/s})$ in $L_p$ norm. Moreover, the HME probability density functions can approximate the true density, at a rate of $O(m^{-2\/s})$ in Hellinger distance, and at a rate of $O(m^{-4\/s})$ in Kullback-Leibler divergence.
These rates can be achieved within the family of HME structures with a tree of binary splits, or within the family of structures with a single layer of experts. Here $s$ is the dimension of the predictor $x$. It is also shown that likelihood-based inference based on HME is consistent in recovering the truth, in the sense that as the sample size $n$ and the number of experts $m$ both increase, the mean square error of the estimated mean response goes to zero. Conditions for such results to hold are stated and discussed.<\/p>\n<p><strong>Availability:<\/strong> <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/01\/1301.7390.pdf\" target=\"_blank\">PDF<\/a><\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2218\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2218\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2217\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tStochastic local search for Bayesian networks\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2217\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2218\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:<\/strong>\u00a0Kalev Kask and Rina Dechter,\u00a0Department of Information and Computer Science,\u00a0University of California\u2013Irvine<\/p>\n<p><strong>Abstract:\u00a0<\/strong>The paper evaluates empirically the suitability\u00a0of Stochastic Local Search algorithms\u00a0(SLS) for finding most probable explanations\u00a0in Bayesian 
networks. SLS algorithms (e.g. GSAT, WSAT) have recently proven to be highly effective in solving complex constraint-satisfaction and satisfiability problems which cannot be solved by traditional search schemes. Our experiments investigate the applicability of this scheme to probabilistic optimization problems. Specifically, we show that algorithms combining hill-climbing steps with stochastic steps (guided by the network&#8217;s probability distribution), called G+StS, outperform pure hill-climbing search, pure stochastic simulation search, as well as simulated annealing. In addition, variants of G+StS that are augmented on top of alternative approximation methods are shown to be particularly effective.<\/p>\n<p><strong>Availability:<\/strong> <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/01\/r72-new_11_98.pdf\" target=\"_blank\">PDF<\/a><\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2220\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2220\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2219\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tBayesian Graphical Models, Intention-to-Treat, and the Rubin Causal Model\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2219\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2220\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Author:<\/strong>\u00a0David Madigan,\u00a0<a 
class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.stat.washington.edu\/\" target=\"_blank\">Department of Statistics<\/a>,\u00a0University of Washington<\/p>\n<p><strong>Abstract:<\/strong>\u00a0In clinical trials with significant noncompliance the standard intention-to-treat analyses sometimes mislead. Rubin&#8217;s causal model provides an alternative method of analysis that can shed extra light on clinical trial data. Formulating the Rubin Causal Model as a Bayesian graphical model facilitates model communication and computation.<\/p>\n<p><strong>Availability:<\/strong> <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/01\/38981c9db80897c99c15049dcf4a0145aad5.pdf\" target=\"_blank\">PDF<\/a><\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2222\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2222\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2221\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tEfficient Mining of Statistical Dependencies\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2221\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2222\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:\u00a0<\/strong>Tim Oates,\u00a0Matthew D. Schmill,\u00a0Paul R. 
Cohen, and Casey Durfee,\u00a0Experimental Knowledge Systems Lab,\u00a0Department of Computer Science,\u00a0University of Massachusetts\u2013Amherst<\/p>\n<p><strong>Abstract:<\/strong>\u00a0The Multi-Stream Dependency Detection algorithm finds rules that capture statistical dependencies between patterns in multivariate time series of categorical data. Rule strength is measured by the G statistic, and an upper bound on the value of G for the descendants of a node allows MSDD&#8217;s search space to be pruned. However, in the worst case, the algorithm will explore exponentially many rules. This paper presents and empirically evaluates two ways of addressing this problem. The first is a set of three methods for reducing the size of MSDD&#8217;s search space based on information collected during the search process. The second is an implementation of MSDD that distributes its computations over multiple machines on a network.<\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2224\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2224\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2223\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tTractable structure search in the presence of latent variables\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2223\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2224\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:<\/strong> <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab 
glyph-append-xsmall\" href=\"http:\/\/www.stat.washington.edu\/tsr\/website\/inquiry\/home.php\" target=\"_blank\">Thomas Richardson<\/a>, Heiko Bailer, and Moulinath Banerjees,\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.stat.washington.edu\/\" target=\"_blank\">Department of Statistics<\/a>,\u00a0University of Washington<\/p>\n<p><strong>Abstract:\u00a0<\/strong>The problem of learning the structure of a DAG model in the presence of latent variables presents many formidable challenges. In particular there are an infinite number of latent variable models to consider, and these models possess features which make them hard to work with. We describe a class of graphical models which can represent the conditional independence structure induced by a latent variable model over the observed margin. We give a parametrization of the set of Gaussian distributions with conditional independence structure given by a MAG model. The models are illustrated via a simple example. 
Different estimation techniques are discussed in the context of Zellner&#8217;s Seemingly Unrelated Regression (SUR) models.<\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2226\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2226\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2225\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tBoosting Methodology for Regression Problems\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2225\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2226\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:<\/strong>\u00a0Greg Ridgeway,\u00a0David Madigan, and\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.stat.washington.edu\/tsr\/website\/inquiry\/home.php\" target=\"_blank\">Thomas Richardson<\/a>,\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.stat.washington.edu\/\" target=\"_blank\">Department of Statistics<\/a>,\u00a0University of Washington<\/p>\n<p><strong>Abstract:\u00a0<\/strong>Classification problems have dominated research on boosting to date. The application of boosting to regression problems, on the other hand, has received little investigation. In this paper we develop a new boosting method for regression problems. We cast the regression problem as a classification problem and apply an interpretable form of the boosted na\u00efve Bayes classifier. 
This induces a regression model that we show to be expressible as an additive model for which we derive estimators and discuss computational issues. We compare the performance of our boosted na\u00efve Bayes regression model with other interpretable multivariate regression procedures.<\/p>\n<p><strong>Availability:<\/strong>\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/01\/boosting-methodology-regression-problems.pdf\" target=\"_blank\">PDF<\/a><\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2228\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2228\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2227\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tAn Experiment in Causal Inference Using a Pneumonia Database\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2227\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2228\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:<\/strong> Peter Spirtes,\u00a0Department of Philosophy,\u00a0Carnegie Mellon University;\u00a0Greg Cooper,\u00a0Center for Biomedical Informatics,\u00a0University of Pittsburgh<\/p>\n<p><strong>Abstract:\u00a0<\/strong>We tested a causal discovery algorithm on a database of pneumonia patients. The output of the causal discovery algorithm is a list of statements &#8220;A causes B&#8221;, where A and B are variables in the database, and a score indicating the degree of confidence in the statement. 
We compared the output of the algorithm with the opinions of physicians about whether A caused B or not. We found that the doctors&#8217; opinions were independent of the output of the algorithm. However, an examination of the results suggested a simple, well-motivated modification of the algorithm which would bring its output into high agreement with the physicians&#8217; opinions.<\/p>\n<p><strong>Availability:<\/strong> <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/01\/causal-discovery.pdf\" target=\"_blank\">PDF<\/a><\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2230\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2230\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2229\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tA Note on the Comparison of Polynomial Selection Methods\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2229\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2230\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:<\/strong>\u00a0Murlikrishna Viswanathan and Professor Chris Wallace,\u00a0School of Computer Science and Software Engineering,\u00a0Monash University\u2013Clayton<\/p>\n<p><strong>Abstract:\u00a0<\/strong>Minimum Message Length (MML) and Structural Risk Minimisation (SRM) are two computational learning principles that have achieved wide acclaim in recent years.
Whereas the former is based on Bayesian learning and the latter on the classical theory of VC-dimension, they are similar in their attempt to define a trade-off between model complexity and goodness of fit to the data. A recent empirical study by Wallace compared the performance of standard model selection methods in a one-dimensional polynomial regression framework. The results from this study provided strong evidence in support of the MML- and SRM-based methods over the other standard approaches. In this paper we present a detailed empirical evaluation of three model selection methods which include an MML-based approach and two SRM-based methods. Results from our analysis and experimental evaluation suggest that the MML-based approach in general has higher predictive accuracy and also raise questions about the inductive capabilities of the Structural Risk Minimization Principle.<\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t\t\t\t\t<\/ul>\n\t<\/div>\n\t<span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n<!-- \/wp:freeform --><!-- \/wp:msr\/content-tab --><!-- wp:msr\/content-tab {\"title\":\"Poster Sessions\"} --><!-- wp:freeform --><h2>Poster Sessions | January 4<\/h2>\n<p>\t<div data-wp-context='{\"items\":[]}' data-wp-interactive=\"msr\/accordion\">\n\t\t\t\t\t<div class=\"clearfix\">\n\t\t\t\t<div\n\t\t\t\t\tclass=\"btn-group align-items-center mb-g float-sm-right\"\n\t\t\t\t\tdata-bi-aN=\"accordion-collapse-controls\"\n\t\t\t\t>\n\t\t\t\t\t<button\n\t\t\t\t\t\tclass=\"btn btn-link m-0\"\n\t\t\t\t\t\tdata-bi-cN=\"Expand 
all\"\n\t\t\t\t\t\tdata-wp-bind--aria-controls=\"state.ariaControls\"\n\t\t\t\t\t\tdata-wp-bind--aria-expanded=\"state.ariaExpanded\"\n\t\t\t\t\t\tdata-wp-bind--disabled=\"state.isAllExpanded\"\n\t\t\t\t\t\tdata-wp-class--inactive=\"state.isAllExpanded\"\n\t\t\t\t\t\tdata-wp-on--click=\"actions.onExpandAll\"\n\t\t\t\t\t\ttype=\"button\"\n\t\t\t\t\t>\n\t\t\t\t\t\tExpand all\t\t\t\t\t<\/button>\n\t\t\t\t\t<span aria-hidden=\"true\"> | <\/span>\n\t\t\t\t\t<button\n\t\t\t\t\t\tclass=\"btn btn-link m-0\"\n\t\t\t\t\t\tdata-bi-cN=\"Collapse all\"\n\t\t\t\t\t\tdata-wp-bind--aria-controls=\"state.ariaControls\"\n\t\t\t\t\t\tdata-wp-bind--aria-expanded=\"state.ariaExpanded\"\n\t\t\t\t\t\tdata-wp-bind--disabled=\"state.isAllCollapsed\"\n\t\t\t\t\t\tdata-wp-class--inactive=\"state.isAllCollapsed\"\n\t\t\t\t\t\tdata-wp-on--click=\"actions.onCollapseAll\"\n\t\t\t\t\t\ttype=\"button\"\n\t\t\t\t\t>\n\t\t\t\t\t\tCollapse all\t\t\t\t\t<\/button>\n\t\t\t\t<\/div>\n\t\t\t<\/div>\n\t\t\t\t<ul class=\"msr-accordion\">\n\t\t\t\t\t\t\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2232\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2232\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2231\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tTransfer of Information between System and Evidence Models\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2231\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2232\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:\u00a0<\/strong>Russell Almond,\u00a0Research Statistics Group at\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab 
glyph-append-xsmall\" href=\"http:\/\/www.ets.org\/\" target=\"_blank\">Educational Testing Service<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>;\u00a0Edward Herskovits,\u00a0Noetic Systems, Inc.;\u00a0Robert J. Mislevy,\u00a0Model Based Measurement Group at\u00a0Educational Testing Service;\u00a0Linda Stienberg,\u00a0Educational Policy Research at\u00a0Educational Testing Service<\/p>\n<p><strong>Abstract:<\/strong>\u00a0In this paper we illustrate a simple scheme for dividing a complex Bayes network into a <em>system model<\/em> and a collection of smaller <em>evidence models<\/em>. While the system model maintains a permanent record of the state of the system of interest, the evidence models are only used momentarily to absorb evidence from specific observations or findings and then discarded. This paper describes an implementation of a system model\u2014evidence model complex in which each system and evidence model has a separate Bayes net and Markov tree representation. As necessary, information is propagated between common Markov tree nodes of the evidence and system models. 
While mathematically equivalent to the full Bayes network, the system model&#8211;evidence model complex allows us to (a) separate the seldom-used evidence model portions from the core system model, thus reducing search and propagation time in the network, and (b) easily replace the evidence models (this is particularly advantageous in educational examples in which new test items are often introduced to prevent overexposure of assessment tasks).<\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2234\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2234\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2233\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tA Bayesian Model for Collaborative Filtering\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2233\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2234\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:<\/strong>\u00a0Yung-Hsin Chien and Edward I. George,\u00a0Department of MSIS at\u00a0University of Texas at Austin<\/p>\n<p><strong>Abstract:<\/strong>\u00a0Consider the general setup where a set of <em>items<\/em> have been partially <em>rated<\/em> by a set of <em>judges<\/em>, in the sense that not every item has been rated by every judge. For this setup, we propose a Bayesian approach for the problem of predicting the missing ratings from the observed ratings.
This approach incorporates similarity by assuming the set of judges can be partitioned into groups which share the same ratings probability distribution. This leads to a predictive distribution of missing ratings based on the posterior distribution of the groupings and associated ratings probabilities. Markov chain Monte Carlo methods and a hybrid search algorithm are then used to obtain predictions of the missing ratings.<\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2236\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2236\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2235\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tParameter learning from incomplete data for Bayesian networks\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2235\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2236\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Author:<\/strong>\u00a0RG Cowell<\/p>\n<p><strong>Abstract:<\/strong> In a companion paper (Cowell 1999), I described a method of using maximum entropy to estimate the joint probability distribution for a set of discrete variables from missing data. 
Here I extend the method of that paper to incorporate prior information for application to parameter learning in Bayesian networks.<\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2238\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2238\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2237\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tOn the Application of The Bootstrap for Computing Confidence Measures on Features of Induced Bayesian Networks\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2237\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2238\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:<\/strong>\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.cs.huji.ac.il\/~nir\/\" target=\"_blank\">Nir Friedman<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>,\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.cs.huji.ac.il\/\" target=\"_blank\">Institute of Computer Science<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> at\u00a0Hebrew University;\u00a0Moises Goldszmidt, SRI International; Abraham Wyner,\u00a0Department of Statistics at\u00a0Wharton School<\/p>\n<p><strong>Abstract:\u00a0<\/strong>In the context of learning Bayesian networks from data, very little work has been published on methods for assessing the <em>quality<\/em> of an induced model. 
This issue, however, has received a great deal of attention in the statistics literature. In this paper, we take a well-known method from statistics, Efron&#8217;s Bootstrap, and examine its applicability for assessing a confidence measure on features of the learned network structure. We also compare this method to assessments based on a practical realization of the Bayesian methodology.<\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2240\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2240\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2239\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tRelaxing the Local Independence Assumption for Quantitative Learning in Acyclic Directed Graphical Models through Hierarchical Partition Models\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2239\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2240\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:<\/strong>\u00a0Daniela Golinelli and David Madigan,\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.stat.washington.edu\/\" target=\"_blank\">Department of Statistics<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> at\u00a0University of Washington and\u00a0Guido Consonni,\u00a0Dip. 
di Economia e Metodi Quantitativi at\u00a0Universita&#8217; di Pavia<\/p>\n<p><strong>Abstract:\u00a0<\/strong>The simplest method proposed by Spiegelhalter and Lauritzen (1990) to perform <em>quantitative learning<\/em> in ADGs presents a potential weakness: the <em>local independence assumption<\/em>. We propose to alleviate this problem through the use of Hierarchical Partition Models. Our approach is compared with the previous one from an interpretative and predictive point of view.<\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2242\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2242\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2241\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tThe exploration of new methods for learning in binary Boltzmann machines\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2241\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2242\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:\u00a0<\/strong>Keith Humphreys and D.M. Titterington,\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.gla.ac.uk\/subjects\/statistics\/\" target=\"_blank\">Department of Statistics<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> at\u00a0University of Glasgow<\/p>\n<p><strong>Abstract:\u00a0<\/strong>Exact inference for Boltzmann machines is computationally expensive.
One approach to improving tractability is to approximate the gradient algorithm. We describe a new way of doing this which is based on Bahadur&#8217;s representation of the multivariate binary distribution (Bahadur, 1961). We compare the approach, for networks with no unobserved variable, to the &#8220;mean field&#8221; approximation of Peterson and Anderson (1987) and the approach of Kappen and Rodriguez (1998), which is based on the linear response theorem. We also investigate the use of the pairwise association cluster method (Tanaka and Morita, 1995).<\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2244\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2244\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2243\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tStatistical Challenges to inductive inference in linked data\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2243\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2244\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Author:<\/strong>\u00a0David Jensen<\/p>\n<p><strong>Abstract:<\/strong> Many data sets can be represented naturally as collections of linked objects. For example, document collections can be represented as documents (nodes) connected by citations and hypertext references (links). 
Similarly, organizations can be represented as people (nodes) connected by reporting relationships, social relationships, and communication patterns (links).<\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2246\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2246\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2245\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tMixture Model Clustering with the Multimix Program\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2245\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2246\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:\u00a0<\/strong>Murray A. Jorgensen and Lynette A. Hunt,\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.stats.waikato.ac.nz\/\" target=\"_blank\">Department of Statistics<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> at\u00a0University of Waikato<\/p>\n<p><strong>Abstract:\u00a0<\/strong>Hunt (1996) has implemented the finite mixture model approach to clustering in a program called Multimix. The program is designed to cluster multivariate data with categorical and continuous variables and possibly containing missing values. The model fitted simultaneously generalises the Latent Class model and the mixture of multivariate normals model. Like either of these models, Multimix can be used to form clusters by the Bayes allocation rule.
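The Bayes allocation rule mentioned in the abstract can be sketched in a few lines. This is a generic toy illustration, not code from the Multimix program; the univariate Gaussian components and all parameter values below are invented for the example:

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of a univariate normal distribution."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def bayes_allocate(x, weights, mus, sigmas):
    """Bayes allocation rule: assign x to the mixture component with the
    highest posterior probability, and return the posterior as well."""
    joint = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, mus, sigmas)]
    total = sum(joint)
    posterior = [p / total for p in joint]
    label = max(range(len(posterior)), key=lambda k: posterior[k])
    return label, posterior

# Two hypothetical clusters centred at 0 and 5; the point 4.2 is allocated
# to whichever cluster has the larger posterior probability.
label, posterior = bayes_allocate(4.2, [0.5, 0.5], [0.0, 5.0], [1.0, 1.0])
```

The same rule applies unchanged when the components mix categorical and continuous variables, as in Multimix; only the component densities differ.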
This is the intended use of the program, although the parameter estimates can be used to give a succinct description of the clusters.<\/p>\n<p>Use of the EM algorithm, with its view of the observed data as being notionally augmented by missing information to form the &#8216;complete data&#8217;, gives a broad framework for estimation which is able to handle two types of missing information: unknown cluster assignment and missing data. By using the methodology of Little and Rubin (1987) in this way, Multimix is able to handle missing data in a less ad hoc way than many clustering algorithms. The program runs in acceptable time with large data matrices (say hundreds of observations on tens of variables). Use of the missing-data facility increases execution time somewhat. In this presentation we describe the approach taken to the design of Multimix and how some of the statistical problems were dealt with. As examples of the use of the program we cluster a large medical dataset and a version of Fisher&#8217;s Iris data in which a third of the values are randomly made &#8216;missing&#8217;.<\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2248\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2248\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2247\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tLearning Augmented Bayesian Classifiers: A Comparison of Distribution-based and Classification-based 
Approaches\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2247\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2248\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:\u00a0<\/strong>Eamonn Keogh and Michael J. Pazzani,\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.ics.uci.edu\/\" target=\"_blank\">Information and Computer Science<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> at\u00a0University of California, Irvine<\/p>\n<p><strong>Abstract:\u00a0<\/strong>The na\u00efve Bayes classifier is built on the assumption of conditional independence between the attributes given the class. The algorithm has been shown to be surprisingly robust to obvious violations of this condition, but it is natural to ask if it is possible to further improve the accuracy by relaxing this assumption. We examine an approach where na\u00efve Bayes is augmented by the addition of correlation arcs between attributes. We explore two methods for finding the set of augmenting arcs: a greedy hill-climbing search, and a novel, more computationally efficient algorithm that we call SuperParent. 
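The conditional-independence assumption referred to above can be made concrete with a minimal sketch. This shows plain naive Bayes (no augmenting arcs); the classes, attributes, and probability tables are invented purely for illustration:

```python
def naive_bayes_posterior(priors, likelihoods, attrs):
    """P(c | attrs) under conditional independence of attributes given the
    class: score(c) = P(c) * prod_i P(a_i | c), renormalised over classes."""
    scores = {}
    for c, prior in priors.items():
        p = prior
        for i, a in enumerate(attrs):
            p *= likelihoods[c][i][a]
        scores[c] = p
    z = sum(scores.values())
    return {c: s / z for c, s in scores.items()}

# Hypothetical two-class, two-attribute example.
priors = {"spam": 0.4, "ham": 0.6}
# likelihoods[c][i][value] = P(attribute i = value | class c)
likelihoods = {
    "spam": [{0: 0.2, 1: 0.8}, {0: 0.7, 1: 0.3}],
    "ham":  [{0: 0.9, 1: 0.1}, {0: 0.4, 1: 0.6}],
}
post = naive_bayes_posterior(priors, likelihoods, [1, 0])
```

An augmenting arc would replace one of the factors P(a_i | c) with P(a_i | c, parent(a_i)), which is what the search procedures above choose.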
We compare these methods to TAN, a state-of-the-art distribution-based approach to finding the augmenting arcs.<\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2250\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2250\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2249\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tExploring the robustness of Bayesian and information-theoretic methods for predictive inference\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2249\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2250\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:<\/strong>\u00a0Petri Kontkanen, Petri Myllym\u00e4ki, Tomi Silander, Henry Tirri, and Kimmo Valtonen<\/p>\n<p><strong>Abstract:<\/strong>\u00a0Given a set of sample data, we study three alternative methods for determining the predictive distribution of an unseen data vector. In particular, we are interested in the behavior of the predictive accuracy of these three predictive methods as a function of the degree of the domain assumption violations. We explore this question empirically by using artificially generated data sets, where the assumptions can be violated in various ways. Our empirical results suggest that if the model assumptions are only mildly violated, marginalization over the model parameters may not be necessary in practice. 
This is due to the fact that in this case the computationally much simpler predictive distribution based on a single, maximum posterior probability model shows performance similar to that of the computationally more demanding marginal likelihood approach. The results also give support to Rissanen&#8217;s theoretical results about the usefulness of using Jeffreys&#8217; prior distribution for the model parameters.<\/p>\n<p><strong>Availability:<\/strong> <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2016\/12\/10.1.1.51.1018.pdf\" target=\"_blank\">PDF<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2252\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2252\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2251\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tStructure optimization of density estimation models applied to regression problems with dynamic noise\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2251\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2252\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:\u00a0<\/strong>Martin Kreutz,\u00a0Bernhard Sendhoff, and Werner von Seelen,\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/www.ini.rub.de\/\" target=\"_blank\">Institut f\u00fcr Neuroinformatik<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> 
at\u00a0Ruhr-Universit\u00e4t Bochum;\u00a0Anja M. Reimetz and Claus Weihs,\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.statistik.uni-dortmund.de\/fakultaet.html\" target=\"_blank\">Fachbereich Statistik<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> at\u00a0Universit\u00e4t Dortmund<\/p>\n<p><strong>Abstract:\u00a0<\/strong>In this paper we deal with the problem of model selection for time series forecasting with dynamical noise and missing data. We apply an evolutionary algorithm to the optimization of a mixture-of-densities model in order to estimate, via a log-likelihood-based quality measure, the joint probability density of the data. We apply our method to the prediction of both artificial time series, generated from the Mackey-Glass equation, and time series from a real world system consisting of physiological data of apnea patients.<\/p>\n<p><strong>Related work from the authors:<\/strong><\/p>\n<ul>\n<li>Martin Kreutz, Anja M. Reimetz, Bernhard Sendhoff, Claus Weihs and Werner von Seelen. Optimisation of Density Estimation Models with Evolutionary Algorithms. In A.E. Eiben, Th. B\u00e4ck, M. Schoenauer and H.P. 
Schwefel, editors, <em>Parallel Problem Solving from Nature &#8211; PPSN V<\/em>, pages 998-1007, Lecture Notes in Computer Science 1498, Springer, 1998.<\/li>\n<\/ul>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2254\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2254\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2253\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tA learning rule based method of feature extraction with application to acoustic signal classification\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2253\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2254\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Author:<\/strong>\u00a0M.J. Larkin,\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/www.brown.edu\/research\/projects\/brain-and-neural-systems\/\" target=\"_blank\">The Institute for Brain and Neural Systems<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> at Brown University<\/p>\n<p><strong>Abstract:<\/strong> We apply the Bienenstock, Cooper, and Munro (1982) theory of visual cortical plasticity to the problem of extracting features (i.e., reduction of dimensionality) from acoustic signals; in this case, labeled samples of marine mammal sounds. 
We first implemented BCM learning in a single neuron model, trained the neuron on samples of acoustic data, and then observed the response when the neuron was tested on different classes of acoustic signals. Next, a multiple neuron network was constructed, with lateral inhibition among the neurons. By training neurons to be selective to inherent features in these signals, we are able to develop networks which can then be used in the design of an automated acoustic signal classifier.<\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2256\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2256\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2255\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tLearning Extensible Multi-Entity Directed Graphical Models\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2255\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2256\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Author:<\/strong>\u00a0Kathryn Blackmond Laskey,\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/seor.gmu.edu\/\" target=\"_blank\">Department of Systems Engineering and Operations Research<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> at\u00a0George Mason University<\/p>\n<p><strong>Abstract:\u00a0<\/strong>Graphical models have become a standard tool for representing complex probability models in statistics and artificial intelligence. 
In problems arising in artificial intelligence, it is useful to use the belief network formalism to represent uncertain relationships among variables in the domain, but it may not be possible to use a single, fixed belief network to encompass all problem instances. This is because the number of entities to be reasoned about and their relationships to each other vary from problem instance to problem instance. This paper describes a framework for representing probabilistic knowledge as fragments of belief networks and an approach to learning both structure and parameters from observations.<\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2258\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2258\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2257\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tA latent variable model for multivariate discretization\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2257\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2258\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:\u00a0<\/strong>Stefano Monti,\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.isp.pitt.edu\/\" target=\"_blank\">Intelligent Systems Program<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> at\u00a0University of Pittsburgh and\u00a0Gregory F. 
Cooper,\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.dbmi.pitt.edu\/\" target=\"_blank\">Center for Biomedical Informatics<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> at\u00a0University of Pittsburgh<\/p>\n<p><strong>Abstract:\u00a0<\/strong>We describe a new method for multivariate discretization based on the use of a latent variable model. The method is proposed as a tool to extend the scope of applicability of machine learning algorithms that handle discrete variables only. Building upon existing class-based discretization methods, we use a latent variable as a <em>proxy<\/em> class variable, which is then utilized to drive the partition of the value range of each continuous variable. We present experimental results on simulated data aimed at assessing the merits of the proposed method.<\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2260\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2260\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2259\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tTesting Regression Models With Fewer Regressors\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2259\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2260\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:\u00a0<\/strong>Judea Pearl and Peyman Meshkat<\/p>\n<p><strong>Abstract:\u00a0<\/strong>A BASIS for a model M is a minimal set of tests that, if 
satisfied, implies the satisfaction of all the assumptions behind M. This paper proposes a graphical procedure for recognizing bases of regression models. Using this procedure, it is possible to select a set of tests in which the number of regressors is small, compared with standard tests, thus resulting in improved power.<\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2262\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2262\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2261\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tLearning Conditional Probabilities from Incomplete Databases - An Experimental Comparison\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2261\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2262\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:\u00a0<\/strong>Marco Ramoni and\u00a0Paola Sebastiani,\u00a0The Open University<\/p>\n<p><strong>Abstract:\u00a0<\/strong>This paper compares three methods &#8211; the EM algorithm, Gibbs sampling, and Bound and Collapse (BC) &#8211; to estimate conditional probabilities from incomplete databases in a controlled experiment. 
Results show a substantial equivalence of the estimates provided by the three methods and a dramatic gain in efficiency using BC.<\/p>\n<p><strong>Other information:\u00a0<\/strong>Further information is available from the home page of the <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/projects.kmi.open.ac.uk\/bkd\/\" target=\"_blank\">Bayesian Knowledge Discovery<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> project at <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.open.ac.uk\/\" target=\"_blank\">The Open University<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>.<\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2264\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2264\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2263\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tLocal Experts Combination through Density Decomposition\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2263\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2264\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:\u00a0<\/strong>Ahmed Rida,\u00a0Abderrahim Labbi, and Christian Pellegrini,\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/cuiwww.unige.ch\/AI-group\/home.html\" target=\"_blank\">Artificial Intelligence group<span class=\"sr-only\"> (opens 
in new tab)<\/span><\/a> at\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/cui.unige.ch\/fr\/\" target=\"_blank\">Centre Universitaire d&#8217;Informatique<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n<p><strong>Abstract:\u00a0<\/strong>In this paper we describe a <em>divide-and-combine<\/em> strategy for decomposition of a complex prediction problem into simpler local sub-problems. We firstly show how to perform a <em>soft<\/em> decomposition via clustering of input data. Such decomposition leads to a partition of the input space into several regions which may overlap. Therefore, to each region is assigned a local predictor (or expert) which is trained only on local data. To construct a solution to the global prediction problem, we combine the local experts using two approaches: <em>weighted averaging<\/em> where the outputs of local experts are weighted by their prior densities, and <em>nonlinear adaptive combination<\/em> where the pooling parameters are obtained through minimization of a global error. To illustrate the validity of our approach, we show simulation results for two classification tasks, <em>vowels<\/em> and <em>phonemes<\/em>, using local experts which are Multi-Layer Perceptrons (MLP) and Support Vector Machines (SVM). 
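The weighted-averaging combination mode described above can be sketched generically. The Gaussian region densities and the constant "experts" below are stand-ins invented for illustration, not the MLP or SVM experts of the paper:

```python
import math

def combine_local_experts(x, experts, densities):
    """Pool local experts: weight each expert's output by the (normalised)
    density of that expert's input region evaluated at x."""
    w = [d(x) for d in densities]
    z = sum(w)
    return sum((wi / z) * e(x) for wi, e in zip(w, experts))

# Two overlapping input regions centred at 0 and 10, each with its own expert.
densities = [lambda x: math.exp(-((x - 0.0) ** 2) / 8.0),
             lambda x: math.exp(-((x - 10.0) ** 2) / 8.0)]
experts = [lambda x: 1.0, lambda x: 3.0]

y0 = combine_local_experts(1.0, experts, densities)  # dominated by expert 0
y1 = combine_local_experts(9.0, experts, densities)  # dominated by expert 1
```

The nonlinear adaptive combination mentioned in the abstract would instead learn the pooling weights by minimizing a global error rather than fixing them from the region densities.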
We compare the results obtained using the two local combination modes with the results obtained using a global predictor and a linear combination of global predictors.<\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2266\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2266\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2265\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tEntropy-Driven Inference and Inconsistency\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2265\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2266\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:\u00a0<\/strong>Wilhelm R\u00f6dder and\u00a0Longgui Xu,\u00a0FernUniversit\u00e4t Gesamthochschule in Hagen,\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.fernuni-hagen.de\/BWLOR\/index.php\" target=\"_blank\">Fachbereich Wirtschaftswissenschaft, Lehrstuhl f\u00fcr Betriebswirtschaftslehre, insb. Operations Research<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n<p><strong>Abstract:\u00a0<\/strong>Probability distributions on a set of discrete variables are a suitable means to represent knowledge about their respective mutual dependencies. When new evidence becomes available, such a distribution can be adapted to the new situation and hence submitted to a sound inference process. Knowledge acquisition and inference are here performed in the rich syntax of conditional events. 
Both acquisition and inference respect a sophisticated principle, namely that of maximum entropy and minimum relative entropy. The freedom to formulate and derive knowledge in a language of rich syntax is comfortable but involves the danger of contradictions or inconsistencies. We develop a method for resolving such inconsistencies, which go back to the incompatibility of experts\u2019 knowledge in their respective branches. The method is applied to diagnosis in Chinese medicine. All calculations are performed in the entropy-driven expert system shell SPIRIT.<\/p>\n<p><strong>Availability:<\/strong> <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2016\/12\/roedder-1.pdf\" target=\"_blank\">PDF<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n<p><strong>References:\u00a0<\/strong><\/p>\n<p>[1] I. Csisz\u00e1r: I-Divergence Geometry of Probability Distributions and Minimisation Problems. The Annals of Probability 3, (1): 146 &#8211; 158 (1975). [2] I. Csisz\u00e1r: Why Least Squares and Maximum Entropy? An Axiomatic Approach to Inference for Linear Inverse Problems. The Annals of Statistics 19 (4): 2032 &#8211; 2066 (1991). [3] G. Kern-Isberner: Characterising the principle of minimum cross-entropy within a conditional-logical framework. Artificial Intelligence 98: 169-208 (1998). [4] S. L. Lauritzen: Graphical Association Models (Draft), Technical Report IR 93-2001, Institute for Electronic Systems, Dept. of Mathematics and Computer Science, Aalborg University (1993). [5] W. R\u00f6dder and G. Kern-Isberner: Representation and extraction of information by probabilistic logic. Information Systems 21 (8): 637 &#8211; 652 (1996). [6] W. R\u00f6dder and C.-H. Meyer: Coherent knowledge processing at maximum entropy by SPIRIT, Proceedings 12th Conference on Uncertainty in Artificial Intelligence, E. Horvitz and F. Jensen (editors), Morgan Kaufmann, San Francisco, California: 470 &#8211; 476 (1996). [7] C. C. 
Schnorrenberger: Lehrbuch der chinesischen Medizin f\u00fcr westliche \u00c4rzte, Hippokrates, Stuttgart (1985). [8] C. E. Shannon: A mathematical theory of communication, Bell System Tech. J. 27, 379-423 (part I), 623 &#8211; 656 (part II) (1948). [9] J. E. Shore and R. W. Johnson: Axiomatic Derivation of the Principle of Maximum Entropy and the Principle of Minimum Cross Entropy. IEEE Trans. Information Theory 26 (1): 26 &#8211; 37 (1980). [10] J. Whittaker: Graphical Models in Applied Multivariate Statistics, John Wiley &amp; Sons (1990).<span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2268\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2268\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2267\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tLearned Models for Continuous Planning\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2267\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2268\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:\u00a0<\/strong>Matthew D. Schmill,\u00a0Tim Oates, and Paul R. Cohen, Experimental Knowledge Systems Laboratory at\u00a0University of Massachusetts<\/p>\n<p><strong>Abstract:\u00a0<\/strong>We are interested in the nature of <em>activity<\/em>\u00a0\u2014 structured behavior of nontrivial duration \u2014\u00a0in intelligent agents. 
We believe that the development of activity is a continual process in which simpler activities are composed, via planning, to form more sophisticated ones in a hierarchical fashion. The success or failure of a planner depends on its models of the environment, and its ability to implement its plans in the world. We describe an approach to generating dynamical models of activity from real-world experiences and explain how they can be applied towards planning in a continuous state space.<\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2270\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2270\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2269\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tEfficient Optimization of Large k Real-time Control Algorithm\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2269\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2270\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:\u00a0<\/strong>Delphi Delco Electronics Systems,\u00a0Restraint Systems Electronics and\u00a0Daniel H. Loughlin, North Carolina State University<\/p>\n<p><strong>Abstract:\u00a0<\/strong>Resource requirements for global optimization increase dramatically with the number of real-valued decision variables (k). Efficient search strategies are needed to satisfy constraints of time, effort, and funding. 
In this paper, a conjunction of several disparate methods is used to automatically calibrate a non-linear real-time control used in the automotive industry. By combining a response surface methodology with a hybrid genetic algorithm search, air-bag deployment calibrations can be automated, producing solutions superior to conventional manual search.<\/p>\n<p>\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2272\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2272\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2271\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tModel Folding for Data Subject to Nonresponse\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2271\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2272\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t<\/p>\n<p><strong>Authors:<\/strong>\u00a0Paola Sebastiani, The Open University and\u00a0Marco Ramoni,\u00a0Knowledge Media Institute at\u00a0The Open University<\/p>\n<p><strong>Abstract:\u00a0<\/strong>In this paper we introduce a deterministic method to estimate the posterior probability of rival models from data with partially ignorable nonresponse.\u00a0 The accuracy of the method will be shown via an application to synthetic data.<\/p>\n<p><strong>Other information:\u00a0<\/strong>Further information is available from the home page of the <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/projects.kmi.open.ac.uk\/bkd\/\" 
target=\"_blank\">Bayesian Knowledge Discovery<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> project at <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.open.ac.uk\/\" target=\"_blank\">The Open University<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>.<\/p>\n<p>\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2274\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2274\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2273\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tGeometry, Moments and Bayesian Networks with Hidden Variables\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2273\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2274\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t<\/p>\n<p><strong>Authors:<\/strong>\u00a0Raffaella Settimi,\u00a0University of Warwick and\u00a0Jim Q. Smith,\u00a0University of Warwick<\/p>\n<p><strong>Abstract:\u00a0<\/strong>The purpose of this paper is to present a systematic way of analysing the geometry of the probability spaces for a particular class of Bayesian networks with hidden variables. It will be shown that the conditional independence statements implicit in such graphical models can be neatly expressed as simple polynomial relationships among central moments. 
This algebraic framework will enable us to explore and identify the structural constraints on the sample space induced by models with tree structures and therefore characterise the families of distributions consistent with such conditional independence assumptions.<\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2276\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2276\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2275\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tJoint probabilistic clustering of multivariate and sequential data\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2275\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2276\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Author:<\/strong>\u00a0Padhraic Smyth,\u00a0University of California\u2013Irvine<\/p>\n<p>Consider the following problem. We have a set of individuals (a random sample from a larger population) whom we would like to cluster into groups based on observational data. For each individual we can measure characteristics which are relatively static (e.g., their height, weight, income, age, sex, etc.). Probabilistic model-based clustering in this context usually takes the form of a finite mixture model, where each component in the mixture is a multivariate probability density function (or distribution function) for a particular group. 
This approach has been found to be a useful general technique for extracting hidden structure from multivariate data (Banfield and Raftery, 1993; Thiesson et al., 1997).<\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2278\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2278\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2277\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tAnalysis of multivariate time series via a hidden graphical model\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2277\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2278\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Authors:<\/strong>\u00a0Elena Stanghellini, Universita&#8217; di Perugia and Joe Whittaker,\u00a0Lancaster University<\/p>\n<p><strong>Abstract:\u00a0<\/strong>We propose a chain graph with unobserved variables to model a multivariate time series. We assume that an underlying common trend linearly affects the observed time series, but we do not restrict our analysis to models where the underlying factor accounts for all the contemporary correlations of the series. The residual correlation is modelled using results of graphical models. Modelling the associations left unexplained is an alternative to augmenting the dimension of the underlying factor. It is justified when a clear interpretation of the residual associations is available. 
It is also an informative way to explore sources of deviation from standard dynamic single-factor models.<\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2280\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2280\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-2279\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tVisual design support for probabilistic network application\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-2279\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2280\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><strong>Author:\u00a0<\/strong>Axel Vogler,\u00a0Daimler Benz AG<\/p>\n<p><strong>Abstract:\u00a0<\/strong>Understanding inference in probabilistic networks is an important point in the design phase. Their causal structure and locally defined parameters are intuitive to human experts. The global system induced by the local parameters can lead to results not intended by the human experts. 
To support network design, an edge-coloring scheme is introduced that explains which influences between variables are responsible for an inference result.<\/p>\n<p><span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t\t\t\t\t<\/ul>\n\t<\/div>\n\t<span id=\"label-external-link\" class=\"sr-only\" aria-hidden=\"true\">Opens in a new tab<\/span><\/p>\n<!-- \/wp:freeform --><!-- \/wp:msr\/content-tab --><!-- \/wp:msr\/content-tabs -->","tab-content":[{"id":0,"name":"Home","content":"Uncertainty 99 was the Seventh International Workshop on Artificial Intelligence and Statistics and was\u00a0presented by\u00a0<a href=\"http:\/\/www.gatsby.ucl.ac.uk\/aistats\/society.html\" target=\"_blank\">The Society for Artificial Intelligence &amp; Statistics<\/a>.\r\n<h2>Program Committee<\/h2>\r\n[row][column class=\"m-col-12-24\"]\r\n<ul>\r\n \t<li>Russell Almond, Educational Testing Service<\/li>\r\n \t<li>Chris Bishop, Microsoft Research<\/li>\r\n \t<li>Wray Buntine, Ultimode Systems<\/li>\r\n \t<li>Peter Cheeseman, NASA Ames<\/li>\r\n \t<li>Max Chickering, Microsoft Research<\/li>\r\n \t<li>Paul Cohen, University of Massachusetts<\/li>\r\n \t<li>Greg Cooper, University of Pittsburgh<\/li>\r\n \t<li>Philip Dawid, University College London<\/li>\r\n \t<li>David Dowe, Monash University<\/li>\r\n \t<li>William DuMouchel, AT&amp;T Labs<\/li>\r\n \t<li>Sue Dumais, Microsoft Research<\/li>\r\n \t<li>David Edwards, Novo Nordisk<\/li>\r\n \t<li>Doug Fisher, Vanderbilt University<\/li>\r\n \t<li>Nir Friedman, Hebrew University\u2013Jerusalem<\/li>\r\n \t<li>Dan Geiger, Technion<\/li>\r\n \t<li>Edward George, University of Texas<\/li>\r\n \t<li>Clark Glymour, Carnegie-Mellon University<\/li>\r\n \t<li>Moises Goldszmidt, SRI International<\/li>\r\n \t<li>David Hand, Open University<\/li>\r\n \t<li>Geoff Hinton, University of Toronto<\/li>\r\n \t<li>Tommi Jaakkola, MIT<\/li>\r\n \t<li>Michael Jordan, UC 
Berkeley<\/li>\r\n<\/ul>\r\n[\/column] [column class=\"m-col-12-24\"]\r\n<ul>\r\n \t<li>Michael Kearns, AT&amp;T Labs<\/li>\r\n \t<li>Daphne Koller, Stanford University<\/li>\r\n \t<li>Steffen Lauritzen, Aalborg University<\/li>\r\n \t<li>Hans Lenz, Free University of Berlin<\/li>\r\n \t<li>David Lewis, AT&amp;T Labs<\/li>\r\n \t<li>David Madigan, University of Washington<\/li>\r\n \t<li>Andrew Moore, Carnegie-Mellon University<\/li>\r\n \t<li>Daryl Pregibon, AT&amp;T Labs<\/li>\r\n \t<li>Thomas Richardson, University of Washington<\/li>\r\n \t<li>Alberto Roverato, Universita di Modena<\/li>\r\n \t<li>Lawrence Saul, AT&amp;T Labs<\/li>\r\n \t<li>Ross Shachter, Stanford University<\/li>\r\n \t<li>Richard Scheines, Carnegie-Mellon University<\/li>\r\n \t<li>Sebastian Seung, MIT<\/li>\r\n \t<li>Prakash Shenoy, University of Kansas<\/li>\r\n \t<li>Padhraic Smyth, UC Irvine<\/li>\r\n \t<li>David Spiegelhalter, MRC\u2013Cambridge<\/li>\r\n \t<li>Peter Spirtes, Carnegie-Mellon University<\/li>\r\n \t<li>Milan Studeny, Academy of Sciences of Czech Republic<\/li>\r\n \t<li>Nanny Wermuth, Mainz University<\/li>\r\n<\/ul>\r\n[\/column][\/row]"},{"id":1,"name":"Program","content":"<h2>Monday, January 4<\/h2>\r\n<table class=\"msr-table-schedule\" style=\"border-spacing: inherit;border-collapse: collapse\">\r\n<thead class=\"thead\">\r\n<tr class=\"tr\">\r\n<th class=\"th\" style=\"padding: inherit;border: inherit\">Time<\/th>\r\n<th class=\"th\" style=\"padding: inherit;border: inherit\">Session<\/th>\r\n<th class=\"th\" style=\"padding: inherit;border: inherit\">Speaker<\/th>\r\n<\/tr>\r\n<\/thead>\r\n<tbody class=\"tbody\">\r\n<tr class=\"tr\">\r\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">\r\n<div class=\"msr-table-schedule-cell\">7:30\u20138:45<\/div><\/td>\r\n<td style=\"padding: inherit;border: inherit\">\r\n<div class=\"msr-table-schedule-cell\">\r\n\r\nRegistration\/Continental Breakfast\r\n\r\n<\/div><\/td>\r\n<td style=\"padding: inherit;border: 
inherit\"><\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<\/tr>\r\n<tr class=\"tr\">\r\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">8:45\u20139:00<\/td>\r\n<td style=\"padding: inherit;border: inherit\">\r\n<div class=\"msr-table-schedule-cell\">\r\n\r\nOpening Comments\r\n\r\n<\/div><\/td>\r\n<td style=\"padding: inherit;border: inherit\">David Heckerman and Joe Whittaker<\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<\/tr>\r\n<tr class=\"tr\">\r\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">\r\n<div class=\"msr-table-schedule-cell\">9:00\u201311:00<\/div><\/td>\r\n<td style=\"padding: inherit;border: inherit\">\r\n<div class=\"msr-table-schedule-cell\">\r\n\r\n<strong>Session I:\u00a0Model Choice\r\n<\/strong><strong>Chair:<\/strong> Thomas Richardson\r\n\r\n<\/div><\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<\/tr>\r\n<tr class=\"tr\">\r\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\r\n<td style=\"padding: inherit;border: inherit\">Process-oriented evaluation: The next step<\/td>\r\n<td style=\"padding: inherit;border: inherit\">Pedro Domingos<\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<\/tr>\r\n<tr class=\"tr\">\r\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\r\n<td style=\"padding: inherit;border: inherit\">Model choice<\/td>\r\n<td style=\"padding: inherit;border: inherit\">Alan Gelfand and Sujit Ghosh<\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<\/tr>\r\n<tr class=\"tr\">\r\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\r\n<td style=\"padding: inherit;border: inherit\">A note on the comparison of polynomial selection methods<\/td>\r\n<td style=\"padding: inherit;border: inherit\">Murlikrishna Viswanathan<\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<\/tr>\r\n<tr 
class=\"tr\">\r\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\r\n<td style=\"padding: inherit;border: inherit\">Pattern discovery via entropy minimization<\/td>\r\n<td style=\"padding: inherit;border: inherit\">Matthew Brand<\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<\/tr>\r\n<tr class=\"tr\">\r\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">11:00\u201311:30<\/td>\r\n<td style=\"padding: inherit;border: inherit\">Break<\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<\/tr>\r\n<tr class=\"tr\">\r\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">11:30\u201312:30<\/td>\r\n<td style=\"padding: inherit;border: inherit\"><strong>Session II: <\/strong><b>Latent variables\r\n<\/b><strong>Chair:<\/strong> Kathryn Laskey<\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<\/tr>\r\n<tr class=\"tr\">\r\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\r\n<td style=\"padding: inherit;border: inherit\">On the geometry of DAG models with hidden variables<\/td>\r\n<td style=\"padding: inherit;border: inherit\">Dan Geiger, David Heckerman, Henry King, Chris Meek<\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<\/tr>\r\n<tr class=\"tr\">\r\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\r\n<td style=\"padding: inherit;border: inherit\">Efficient structure search in the presence of latent variables<\/td>\r\n<td style=\"padding: inherit;border: inherit\">Thomas Richardson, Heiko Bailer, Moulinath Banerjee<\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<\/tr>\r\n<tr class=\"tr\">\r\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">12:30\u20131:30<\/td>\r\n<td style=\"padding: inherit;border: inherit\">Lunch (<em>provided<\/em>)<\/td>\r\n<td style=\"padding: 
inherit;border: inherit\"><\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<\/tr>\r\n<tr class=\"tr\">\r\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">1:30\u20135:00<\/td>\r\n<td style=\"padding: inherit;border: inherit\">Break<\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<\/tr>\r\n<tr class=\"tr\">\r\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">5:00\u20136:00<\/td>\r\n<td style=\"padding: inherit;border: inherit\"><strong>Poster Summaries<\/strong> (2 mins\/poster)\r\n<strong>Chair:<\/strong> Joe Whittaker<\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<\/tr>\r\n<tr class=\"tr\">\r\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">6:00\u20137:00<\/td>\r\n<td style=\"padding: inherit;border: inherit\">Dinner<\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<\/tr>\r\n<tr class=\"tr\">\r\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">7:00\u20139:30<\/td>\r\n<td style=\"padding: inherit;border: inherit\">Poster Sessions<\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<h2>Tuesday, January 5<\/h2>\r\n<table class=\"msr-table-schedule\" style=\"border-spacing: inherit;border-collapse: collapse\">\r\n<thead class=\"thead\">\r\n<tr class=\"tr\">\r\n<th class=\"th\" style=\"padding: inherit;border: inherit\">Time<\/th>\r\n<th class=\"th\" style=\"padding: inherit;border: inherit\">Session<\/th>\r\n<th class=\"th\" style=\"padding: inherit;border: inherit\">Speaker<\/th>\r\n<\/tr>\r\n<\/thead>\r\n<tbody class=\"tbody\">\r\n<tr class=\"tr\">\r\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">\r\n<div 
class=\"msr-table-schedule-cell\">8:00\u20139:00<\/div><\/td>\r\n<td style=\"padding: inherit;border: inherit\">\r\n<div class=\"msr-table-schedule-cell\">\r\n\r\nContinental Breakfast\r\n\r\n<\/div><\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<\/tr>\r\n<tr class=\"tr\">\r\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">9:00\u201310:00<\/td>\r\n<td style=\"padding: inherit;border: inherit\">\r\n<div class=\"msr-table-schedule-cell\">\r\n\r\n<strong>Session III: Theory\r\nChair:<\/strong> David Madigan\r\n\r\n<\/div><\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<\/tr>\r\n<tr class=\"tr\">\r\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\r\n<td style=\"padding: inherit;border: inherit\">\r\n<div class=\"msr-table-schedule-cell\">\r\n\r\nConditional products: an alternative approach to conditional independence\r\n\r\n<\/div><\/td>\r\n<td style=\"padding: inherit;border: inherit\">Phil Dawid, Milan Studeny<\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<\/tr>\r\n<tr class=\"tr\">\r\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\r\n<td style=\"padding: inherit;border: inherit\">Hierarchical mixtures-of-experts for generalized linear models: some results on denseness and consistency<\/td>\r\n<td style=\"padding: inherit;border: inherit\">Wenxin Jiang, Martin Tanner<\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<\/tr>\r\n<tr class=\"tr\">\r\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">10:00\u201310:30<\/td>\r\n<td style=\"padding: inherit;border: inherit\">Coffee Break<\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<\/tr>\r\n<tr class=\"tr\">\r\n<td class=\"td-1-4\" style=\"padding: inherit;border: 
inherit\">10:30\u201311:30<\/td>\r\n<td style=\"padding: inherit;border: inherit\"><strong>Session IV: Regression\r\nChair:<\/strong> Padhraic Smyth<\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<\/tr>\r\n<tr class=\"tr\">\r\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\r\n<td style=\"padding: inherit;border: inherit\">Boosting methodology for regression problems<\/td>\r\n<td style=\"padding: inherit;border: inherit\">Greg Ridgeway, David Madigan, Thomas Richardson<\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<\/tr>\r\n<tr class=\"tr\">\r\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\r\n<td style=\"padding: inherit;border: inherit\">Probabilistic kernel regression models<\/td>\r\n<td style=\"padding: inherit;border: inherit\">Tommi Jaakkola, David Haussler<\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<\/tr>\r\n<tr class=\"tr\">\r\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">11:30\u201312:30<\/td>\r\n<td style=\"padding: inherit;border: inherit\"><strong>Session V: Computational Methods\r\nChair:<\/strong> Padhraic Smyth<\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<\/tr>\r\n<tr class=\"tr\">\r\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\r\n<td style=\"padding: inherit;border: inherit\">Learning structure from data efficiently: applying bounding techniques<\/td>\r\n<td style=\"padding: inherit;border: inherit\">Nir Friedman, Lise Getoor<\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<\/tr>\r\n<tr class=\"tr\">\r\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\r\n<td style=\"padding: inherit;border: inherit\">Efficient mining of statistical dependencies<\/td>\r\n<td style=\"padding: inherit;border: inherit\">Tim Oates, Paul Cohen, 
Casey Durfee<\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<\/tr>\r\n<tr class=\"tr\">\r\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">12:30\u20132:00<\/td>\r\n<td style=\"padding: inherit;border: inherit\">Lunch (<em>provided<\/em>)<\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<\/tr>\r\n<tr class=\"tr\">\r\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">2:00\u20133:30<\/td>\r\n<td style=\"padding: inherit;border: inherit\"><strong>Session VI: Applications<\/strong><\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<\/tr>\r\n<tr class=\"tr\">\r\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\r\n<td style=\"padding: inherit;border: inherit\">Causal mechanisms and classification trees for predicting chemical carcinogens<\/td>\r\n<td style=\"padding: inherit;border: inherit\">Louis A Cox<\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<\/tr>\r\n<tr class=\"tr\">\r\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\r\n<td style=\"padding: inherit;border: inherit\">Geometric modelling of a nuclear environment<\/td>\r\n<td style=\"padding: inherit;border: inherit\">Jan De Geeter<\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<\/tr>\r\n<tr class=\"tr\">\r\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\r\n<td style=\"padding: inherit;border: inherit\">Modeling decision tree performance with the power law<\/td>\r\n<td style=\"padding: inherit;border: inherit\">Lewis Frey, Doug Fisher<\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<\/tr>\r\n<tr class=\"tr\">\r\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">3:30\u20134:40<\/td>\r\n<td style=\"padding: inherit;border: inherit\">Business Meeting<\/td>\r\n<td style=\"padding: 
inherit;border: inherit\"><\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<h2>Wednesday, January 6<\/h2>\r\n<table class=\"msr-table-schedule\" style=\"border-spacing: inherit;border-collapse: collapse\">\r\n<thead class=\"thead\">\r\n<tr class=\"tr\">\r\n<th class=\"th\" style=\"padding: inherit;border: inherit\">Time<\/th>\r\n<th class=\"th\" style=\"padding: inherit;border: inherit\">Session<\/th>\r\n<th class=\"th\" style=\"padding: inherit;border: inherit\">Speaker<\/th>\r\n<\/tr>\r\n<\/thead>\r\n<tbody class=\"tbody\">\r\n<tr class=\"tr\">\r\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">\r\n<div class=\"msr-table-schedule-cell\">8:00\u20139:00<\/div><\/td>\r\n<td style=\"padding: inherit;border: inherit\">\r\n<div class=\"msr-table-schedule-cell\">\r\n\r\nContinental Breakfast\r\n\r\n<\/div><\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<\/tr>\r\n<tr class=\"tr\">\r\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">9:00\u201310:30<\/td>\r\n<td style=\"padding: inherit;border: inherit\">\r\n<div class=\"msr-table-schedule-cell\">\r\n\r\n<strong>Session VII: Inference\r\nChair:<\/strong> Greg Cooper\r\n\r\n<\/div><\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<\/tr>\r\n<tr class=\"tr\">\r\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\r\n<td style=\"padding: inherit;border: inherit\">\r\n<div class=\"msr-table-schedule-cell\">\r\n\r\nModel-independent mean field theory as a local method for approximate propagation of information\r\n\r\n<\/div><\/td>\r\n<td style=\"padding: inherit;border: inherit\">Michael Haft, Reimar Hofmann, Volker Tresp<\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<\/tr>\r\n<tr class=\"tr\">\r\n<td class=\"td-1-4\" style=\"padding: inherit;border: 
inherit\"><\/td>\r\n<td style=\"padding: inherit;border: inherit\">Hierarchical IFA belief networks<\/td>\r\n<td style=\"padding: inherit;border: inherit\">Hagai Attias<\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<\/tr>\r\n<tr class=\"tr\">\r\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\r\n<td style=\"padding: inherit;border: inherit\">Stochastic local search for Bayesian networks<\/td>\r\n<td style=\"padding: inherit;border: inherit\">Kalev Kask, Rina Dechter<\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<\/tr>\r\n<tr class=\"tr\">\r\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">10:30\u201311:00<\/td>\r\n<td style=\"padding: inherit;border: inherit\">Coffee Break<\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<\/tr>\r\n<tr class=\"tr\">\r\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\">11:00\u201312:00<\/td>\r\n<td style=\"padding: inherit;border: inherit\"><strong>Session VIII: Applications\r\nChair:<\/strong> Doug Fisher<\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<\/tr>\r\n<tr class=\"tr\">\r\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\r\n<td style=\"padding: inherit;border: inherit\">An experiment in causal discovery using a pneumonia database<\/td>\r\n<td style=\"padding: inherit;border: inherit\">Peter Spirtes, Greg Cooper<\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<\/tr>\r\n<tr class=\"tr\">\r\n<td class=\"td-1-4\" style=\"padding: inherit;border: inherit\"><\/td>\r\n<td style=\"padding: inherit;border: inherit\">Bayesian graphical models for non-compliance in randomized trials<\/td>\r\n<td style=\"padding: inherit;border: inherit\">David Madigan<\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<\/tr>\r\n<tr class=\"tr\">\r\n<td 
class=\"td-1-4\" style=\"padding: inherit;border: inherit\">12:00\u201312:15<\/td>\r\n<td style=\"padding: inherit;border: inherit\">Closing Remarks<\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<td style=\"padding: inherit;border: inherit\"><\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<em>Plenary Presentations lasted 25 minutes with 5 minutes for questions.<\/em>"},{"id":2,"name":"Tutorials & Abstracts","content":"<h2>Tutorials | January 3<\/h2>\r\n[accordion]\r\n\r\n[panel header=\"Information Access and Retrieval\"]\r\n\r\n<strong>Speaker:<\/strong>\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/sdumais\/\" target=\"_blank\">Susan Dumais<\/a>, Microsoft Research |\u00a08:30 AM \u2013\u00a010:30 AM\r\n\r\nThe Web has made literally terabytes of information available at the click of a mouse. The challenge is in finding the right information. Information retrieval is concerned with providing access to textual data for which we have no good formal model, such as a relational model. Statistical approaches have been widely applied to this problem. This tutorial will provide an overview of: a) statistical characteristics of large text collections (e.g., size, sparsity, word distributions), b) important retrieval models (e.g., Boolean, vector space and probabilistic), and c) enhancements which use unsupervised learning to model structure in text collections, or supervised learning to incorporate user feedback. 
We will conclude with a discussion of open research issues where improved statistical models can improve performance.\r\n\r\n[\/panel]\r\n\r\n[panel header=\"Bayesian Statistical Analysis\"]\r\n\r\n<strong>Speaker:<\/strong>\u00a0<a href=\"http:\/\/www.statslab.cam.ac.uk\/Dept\/People\/Spiegelhalter\/davids.html\" target=\"_blank\">David Spiegelhalter<\/a>, MRC Biostatistics Unit, Institute for Public Health, Cambridge | 11:00 AM \u2013 12:00 PM and 1:30 PM \u2013\u00a02:30 PM\r\n\r\nThe first part of the tutorial will cover the fundamentals of Bayesian inference, including probability and its subjective interpretation, evaluation of probability assessments using scoring rules, utilities and decision theory. The use of Bayes theorem for updating beliefs will be illustrated for both binomial and normal likelihoods, and the use of conjugate families of priors and predictive distributions described. The First Bayes software will be used to display conjugate Bayesian analysis. The second part will introduce the concept of 'exchangeability', and the consequent use of hierarchical models in which the unknown parameters of a common prior are included in the model. Conditional independence assumptions lead naturally to a graphical representation of hierarchical models. Markov chain Monte Carlo (MCMC) methods will be introduced as a means of carrying out the necessary numerical integrations, and topics covered will include the relationship of Gibbs sampling to graphical modelling, parameterisation, initial values, and choice of prior distributions. 
Real examples will be used throughout, and on-line analysis of an example in longitudinal modelling with measurement error on predictors will be carried out using the WinBUGS program.\r\n\r\n[\/panel]\r\n\r\n[panel header=\"Additive Logistic Regression: A Statistical View of Boosting\"]\r\n\r\n<strong>Speaker:\u00a0<\/strong><a href=\"http:\/\/web.stanford.edu\/~hastie\/\" target=\"_blank\">Trevor Hastie<\/a>, Stanford University |\u00a03:30 PM \u2013\u00a05:00 PM\r\n\r\nBoosting (Freund and Schapire, 1995) is one of the most important recent developments in classification methodology. Boosting works by sequentially applying a classification algorithm to reweighted versions of the training data, and then taking a weighted majority vote of the sequence of classifiers thus produced. For many classification algorithms, this simple strategy results in dramatic improvements in performance. We show that this seemingly mysterious phenomenon can be understood in terms of well known statistical principles, namely additive modeling and maximum likelihood. For the two-class problem, boosting can be viewed as an approximation to additive modeling on the logistic scale using maximum Bernoulli likelihood as a criterion. We develop more direct approximations and show that they exhibit nearly identical results to boosting. Direct multi-class generalizations based on multinomial likelihood are derived that exhibit performance comparable to other recently proposed multi-class generalizations of boosting in most situations, and far superior in some. We suggest a minor modification to boosting that can reduce computation, often by factors of 10 to 50. Finally, we apply these insights to produce an alternative formulation of boosting decision trees. This approach, based on best-first truncated tree induction, often leads to better performance, and can provide interpretable descriptions of the aggregate decision rule. 
It is also much faster computationally, making it more suitable to large scale data mining applications.\r\n\r\n<em>* joint work with Jerome Friedman and Rob Tibshirani<\/em>\r\n\r\n[\/panel]\r\n\r\n[\/accordion]\r\n<h2>Abstracts | January 4 - 6<\/h2>\r\n[accordion]\r\n\r\n[panel header=\"Hierarchical IFA Belief Networks\"]\r\n\r\n<strong>Author:<\/strong>\u00a0Hagai Attias,\u00a0<a href=\"http:\/\/www.gatsby.ucl.ac.uk\/\" target=\"_blank\">Gatsby Computational Neuroscience Unit<\/a>,\u00a0University College London\r\n\r\n<strong>Abstract:<\/strong>\u00a0We introduce a new real-valued belief network, which is a multilayer generalization of independent factor analysis (IFA). At each level, this network extracts real-valued latent variables that are non-linear functions of the input data with a highly adaptive functional form, resulting in a hierarchical distributed representation of these data. The network is based on a probabilistic generative model, constructed by cascading single-layer IFA models. Whereas exact maximum-likelihood learning for this model is intractable, we present and demonstrate an algorithm that maximizes a lower bound on the likelihood. This algorithm is developed by formulating a variational approach to hierarchical IFA networks.\r\n\r\n<strong>Availability:<\/strong>\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/01\/hifan.pdf\" target=\"_blank\">PDF<\/a>\r\n\r\n[\/panel]\r\n\r\n[panel header=\"Pattern discovery via entropy minimization\"]\r\n\r\n<strong>Author:<\/strong>\u00a0<a href=\"http:\/\/www.merl.com\/people\/brand\" target=\"_blank\">Matthew Brand<\/a>,\u00a0<a href=\"http:\/\/www.merl.com\/\" target=\"_blank\">Mitsubishi Electric Research Labs<\/a>\r\n\r\n<strong>Abstract:<\/strong>\u00a0We propose a framework for learning hidden-variable models by optimizing entropies, in which entropy minimization, posterior maximization, and free energy minimization are all equivalent. 
Solutions for the maximum <em>a posteriori<\/em> (MAP) estimator yield powerful learning algorithms that combine all the charms of expectation-maximization and deterministic annealing. Contained as special cases are the methods of maximum entropy, maximum likelihood, and a new method, maximum structure. We focus on the maximum structure case, in which entropy minimization maximizes the amount of evidence supporting each parameter while minimizing uncertainty in the sufficient statistics and cross-entropy between the model and the data. In iterative estimation, the MAP estimator gradually extinguishes excess parameters, sculpting a model structure that reflects hidden structures in the data. These models are highly resistant to over-fitting and have the particular virtue of being easy to interpret, often yielding insights into the hidden causes that generate the data.\r\n\r\n<strong>Availability:<\/strong>\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/01\/pattern-discovery-entropy-minimization.pdf\" target=\"_blank\">PDF<\/a>\r\n\r\n[\/panel]\r\n\r\n[panel header=\"Causal Mechanisms and Classification Trees for Predicting Chemical Carcinogens\"]\r\n\r\n<strong>Author:<\/strong> Louis Anthony (\"Tony\") Cox, Jr.,\u00a0Cox Associates\r\n\r\n<strong>Abstract:<\/strong>\u00a0Classification trees, usually used as a nonlinear, nonparametric classification method, can also provide a powerful framework for comparing, assessing, and combining information from different expert systems, by treating their predictions as the independent variables in a classification tree analysis. This paper discusses the applied problem of classifying chemicals as human carcinogens. 
It shows how classification trees can be used to compare the information provided by ten different carcinogen classification expert systems, construct an improved \"hybrid\" classification system from them, and identify cost-effective combinations of assays (the inputs to the expert systems) to use in classifying chemicals in future.\r\n\r\n<strong>Availability:<\/strong>\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/01\/CAUSAL_MECHANISMS_AND_CLASSIFICATION_TREES_FOR_PR.pdf\" target=\"_blank\">PDF<\/a>\r\n\r\n[\/panel]\r\n\r\n[panel header=\"Conditional Products: An Alternative Approach to Conditional Independence\"]\r\n\r\n<strong>Authors:<\/strong>\u00a0A. Philip Dawid,\u00a0<a href=\"http:\/\/www.ucl.ac.uk\/statistics\/\" target=\"_blank\">Department of Statistical Science<\/a>, University College London; <a href=\"http:\/\/staff.utia.cas.cz\/studeny\/studeny_home.html?q=user_data\/studeny\/studeny_home.html\" target=\"_blank\">Milan Studeny<\/a>,\u00a0<a href=\"http:\/\/www.utia.cas.cz\/\" target=\"_blank\">Institute of Information Theory and Automation<\/a>, Academy of Sciences of Czech Republic, and Laboratory of Intelligent Systems, University of Economics Prague\r\n\r\n<strong>Abstract:\u00a0<\/strong>We introduce a new abstract approach to the study of conditional independence, founded on a concept analogous to the factorization properties of probabilistic independence, rather than the separation properties of a graph. The basic ingredient is the \"conditional product\", which provides a way of combining the basic objects under consideration while preserving as much independence as possible. We introduce an appropriate axiom system for conditional product, and show how, when these axioms are obeyed, they induce a derived concept of conditional independence which obeys the usual semi-graphoid axioms. 
The general structure is used to throw light on three specific areas: the familiar probabilistic framework (both the discrete and the general case); a set-theoretic framework related to \"variation independence\"; and a variety of graphical frameworks.\r\n\r\n<strong>Availability:<\/strong>\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/01\/daw-stu-99-pdf.pdf\" target=\"_blank\">PDF<\/a>\r\n\r\n[\/panel]\r\n\r\n[panel header=\"Geometric Modeling of a Nuclear Environment\"]\r\n\r\n<strong>Authors:<\/strong>\u00a0Jan De Geeter and Marc Decr\u00e9ton,\u00a0<a href=\"http:\/\/www.sckcen.be\/\" target=\"_blank\">SCK.CEN<\/a> (Belgian Nuclear Research Centre);\u00a0Joris De Schutter, Herman Bruyninckx, and Hendrik Van Brussel,\u00a0Department of Mechanical Engineering,\u00a0<a href=\"https:\/\/www.mech.kuleuven.be\/en\/pma\" target=\"_blank\">Division PMA<\/a>,\u00a0Katholieke Universiteit Leuven\r\n\r\n<strong>Abstract:\u00a0<\/strong>This paper is about the task-directed updating of an incomplete and inaccurate geometric model of a nuclear environment, using only robust radiation-resistant sensors installed on a robot that is remotely controlled by a human operator. In this problem, there are many sources of uncertainty and ambiguity. This paper proposes a probabilistic solution under Gaussian assumptions. Uncertainty is reduced with an estimator based on a Kalman filter. Ambiguity on the measurement-feature association is resolved by running a bank of those estimators in parallel, one for each plausible association. The residual errors of these estimators are used for hypothesis testing and for the calculation of a probability distribution over the remaining hypotheses. 
The best next sensing action is calculated as a Bayes decision with respect to a loss function that takes into account both the uncertainty on the current estimate, and the variance\/precision required by the task.\r\n\r\n[\/panel]\r\n\r\n[panel header=\"Process-Oriented Evaluation: The Next Step\"]\r\n\r\n<strong>Author:<\/strong>\u00a0Pedro Domingos,\u00a0Artificial Intelligence Group,\u00a0Instituto Superior T\u00e9cnico\r\n\r\n<strong>Abstract:<\/strong>\u00a0Methods to avoid overfitting fall into two broad categories: data-oriented (using separate data for validation) and representation-oriented (penalizing complexity in the model). Both have limitations that are hard to overcome. We argue that fully adequate model evaluation is only possible if the search process by which models are obtained is also taken into account. To this end, we recently proposed a method for <em>process-oriented evaluation <\/em>(POE), and successfully applied it to rule induction (Domingos, 1998). However, for the sake of simplicity this treatment made two rather artificial assumptions. In this paper the assumptions are removed, and a simple formula for model evaluation is obtained. Empirical trials show the new, better-founded form of POE to be as accurate as the previous one, while further reducing theory sizes.\r\n\r\n<strong>Availability:<\/strong> <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/01\/process-oriented.pdf\" target=\"_blank\">PDF<\/a>\r\n\r\n[\/panel]\r\n\r\n[panel header=\"Modeling Decision Tree Performance with the Power Law\"]\r\n\r\n<strong>Authors:<\/strong> Lewis J. Frey and Douglas H. Fisher, Jr.,\u00a0Computer Science Department,\u00a0Vanderbilt University\r\n\r\n<strong>Abstract:\u00a0<\/strong>This paper discusses the use of a power law to predict decision tree performance. Power laws are fit to learning curves of decision trees trained on data sets from the UCI repository. 
The learning curves are generated by training C4.5 on different size training sets. The power law predicts diminishing returns in terms of error rate as training set size increases. By characterizing the learning curve with a power law, the error rate for a given size training set can be projected. This projection can be used in estimating the amount of data needed to achieve an acceptable error rate, and the cost effectiveness of further data collection.\r\n\r\n<strong>Availability:<\/strong>\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/01\/ModelingTree.pdf\" target=\"_blank\">PDF<\/a>\r\n\r\n[\/panel]\r\n\r\n[panel header=\"Efficient Learning using Constrained Sufficient Statistics\"]\r\n\r\n<strong>Authors:<\/strong>\u00a0<a href=\"http:\/\/www.cs.huji.ac.il\/~nir\/\" target=\"_blank\">Nir Friedman<\/a>,\u00a0<a href=\"http:\/\/www.cs.huji.ac.il\/\" target=\"_blank\">Institute of Computer Science<\/a>,\u00a0The Hebrew University;\u00a0<a href=\"http:\/\/robotics.stanford.edu\/~getoor\/\" target=\"_blank\">Lise Getoor<\/a>,\u00a0<a href=\"http:\/\/www-cs.stanford.edu\/\" target=\"_blank\">Computer Science Department<\/a>,\u00a0Stanford University\r\n\r\n<strong>Abstract:\u00a0<\/strong>Learning Bayesian networks is a central problem for pattern recognition, density estimation and classification. In this paper, we propose a new method for speeding up the computational process of learning Bayesian network <em>structure<\/em>. This approach uses constraints imposed by the statistics already collected from the data to guide the learning algorithm. This allows us to reduce the number of statistics collected during learning and thus speed up the learning time. We show that our method is capable of learning structure from data more efficiently than traditional approaches. Our technique is of particular importance when the size of the datasets is large or when learning from incomplete data.
The basic technique that we introduce is general and can be used to improve learning performance in many settings where sufficient statistics must be computed. In addition, our technique may be useful for alternate search strategies such as branch and bound algorithms.\r\n\r\n<strong>Availability:<\/strong> <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/01\/FGe1.pdf\" target=\"_blank\">PDF<\/a>\r\n\r\n[\/panel]\r\n\r\n[panel header=\"On the geometry of DAG models with hidden variables\"]\r\n\r\n<strong>Authors:<\/strong>\u00a0<a href=\"http:\/\/www.cs.technion.ac.il\/~dang\/\" target=\"_blank\">Dan Geiger<\/a>, <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/heckerma\/\" target=\"_blank\">David Heckerman<\/a>, and Christopher Meek, Decision Theory &amp; Adaptive Systems,\u00a0Microsoft Research;\u00a0<a href=\"http:\/\/www.math.umd.edu\/~hking\/\" target=\"_blank\">Henry King<\/a>,\u00a0<a href=\"http:\/\/www.math.umd.edu\" target=\"_blank\">Mathematics Department<\/a>,\u00a0University of Maryland\r\n\r\n<strong>Abstract:\u00a0<\/strong>We prove that many graphical models with hidden variables are not curved exponential families. This result, together with the fact that some graphical models are curved and not linear, implies that the hierarchy of graphical models, as linear, curved, and stratified, is non-collapsing; each level in the hierarchy is strictly contained in the larger levels. This result is discussed in the context of model selection of graphical models.\r\n\r\n[\/panel]\r\n\r\n[panel header=\"Model Choice: A minimum posterior predictive loss approach\"]\r\n\r\n<strong>Authors:<\/strong>\u00a0Sujit Ghosh,\u00a0<a href=\"http:\/\/www.stat.ncsu.edu\/\" target=\"_blank\">Department of Statistics<\/a>,\u00a0NC State University;\u00a0<a href=\"http:\/\/stat.uconn.edu\/alan-gelfand\/\" target=\"_blank\">Alan E.
Gelfand<\/a>,\u00a0<a href=\"http:\/\/stat.uconn.edu\/\" target=\"_blank\">Department of Statistics<\/a>,\u00a0University of Connecticut\r\n\r\n<strong>Abstract:\u00a0<\/strong>Model choice is a fundamental activity in the analysis of data sets, an activity which has become increasingly more important as computational advances enable the fitting of increasingly complex models. Such complexity typically arises through hierarchical structure which requires specification at each stage of probabilistic mechanisms, mean and dispersion forms, explanatory variables, etc. Nonnested hierarchical models introducing random effects may not be handled by classical methods. Bayesian approaches using predictive distributions can be used though the FORMAL solution, which includes Bayes factors as a special case, can be criticized. It seems natural to evaluate model performance by comparing what it predicts with what has been observed. Most classical criteria utilize such comparison. We propose a predictive criterion where the goal is good prediction of a replicate of the observed data but tempered by fidelity to the observed values. We obtain this criterion by minimizing posterior loss for a given model and then, for models under consideration, selecting the one which minimizes this criterion. For a version of log scoring loss we can do the minimization explicitly, obtaining an expression which can be interpreted as a penalized deviance criterion. 
We illustrate its performance with an application to a large data set involving residential property transactions.\r\n\r\n<strong>Availability:<\/strong> <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/01\/gelfand_biometrika_1998.pdf\" target=\"_blank\">PDF<\/a>\r\n\r\n[\/panel]\r\n\r\n[panel header=\"Mean Field Inference in a General Probabilistic Setting\"]\r\n\r\n<strong>Authors:<\/strong>\u00a0Michael Haft,\u00a0Reimar Hofmann, and\u00a0Volker Tresp,\u00a0Siemens AG,\u00a0Corporate Technology, Information and Communications Department\r\n\r\n<strong>Abstract:<\/strong>\u00a0We present a systematic, model-independent formulation of mean field theory (MFT) as an inference method in probabilistic models. \"Model-independent\" means that we do not assume a particular type of dependency among the variables of a domain but instead work in a general probabilistic setting. In a Bayesian network, for example, you may use arbitrary tables to specify conditional dependencies and thus run MFT in <em>any<\/em> Bayesian network. Furthermore, the general mean field equations derived here shed light on the essence of MFT. MFT can be interpreted as a local iteration scheme which relaxes to a consistent state (a solution of the mean field equations). Iterating the mean field equations means propagating information through the network. In general, however, there are multiple solutions to the mean field equations. We show that improved approximations can be obtained by forming a weighted mixture of the multiple mean field solutions. Simple approximate expressions for the mixture weights are given. The benefits of taking into account multiple solutions are demonstrated by using MFT for inference in a small Bayesian network representing a medical domain.
Thereby it turns out that every solution of the mean field equations can be interpreted as a 'disease scenario'.\r\n\r\n<strong>Availability:<\/strong> <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/01\/Mean_Field_Inference_in_a_General_Probabilistic_Se.pdf\" target=\"_blank\">PDF<\/a>\r\n\r\n[\/panel]\r\n\r\n[panel header=\"Probabilistic kernel regression models\"]\r\n\r\n<strong>Authors:\u00a0<\/strong><a href=\"http:\/\/people.csail.mit.edu\/tommi\/\" target=\"_blank\">Tommi S. Jaakkola<\/a>,\u00a0Department of Computer Science and Electrical Engineering,\u00a0Massachusetts Institute of Technology;\u00a0<a href=\"https:\/\/genomics-old.soe.ucsc.edu\/haussler\" target=\"_blank\">David Haussler<\/a>,\u00a0Department of Computer Science,\u00a0University of California\u2013Santa Cruz\r\n\r\n<strong>Abstract:\u00a0<\/strong>We introduce a class of flexible conditional probability models and techniques for classification\/regression problems. Many existing methods such as generalized linear models and support vector machines are subsumed under this class. The flexibility of this class of techniques comes from the use of kernel functions as in support vector machines, and the generality from dual formulations of standard regression models.\r\n\r\n<em><span style=\"font-family: wf_segoe-ui_bold, wf_segoe-ui_semibold, wf_segoe-ui_normal, Arial, sans-serif\">*<\/span>The work was done while T. Jaakkola was at UC Santa Cruz.<\/em>\r\n\r\n[\/panel]\r\n\r\n[panel header=\"Hierarchical Mixtures-of-Experts for Generalized Linear Models: Some Results on Denseness and Consistency\"]\r\n\r\n<strong>Authors:<\/strong> Wenxin Jiang and Martin A. Tanner,\u00a0Department of Statistics,\u00a0Northwestern University\r\n\r\n<strong>Abstract:\u00a0<\/strong>We investigate a class of hierarchical mixtures-of-experts (HME) models where exponential family regression models with generalized linear mean functions of the form $\\psi(a+x^T b)$ are mixed. 
Here $\\psi(\\cdot)$ is the inverse link function. Suppose the true response $y$ follows an exponential family regression model with mean function belonging to a class of smooth functions of the form $\\psi(h(x))$ where $h \\in W_{2;K_0}^\\infty$ (a Sobolev class over $[0,1]^{s}$). It is shown that the HME mean functions can approximate the true mean function, at a rate of $O(m^{-2\/s})$ in $L_p$ norm. Moreover, the HME probability density functions can approximate the true density, at a rate of $O(m^{-2\/s})$ in Hellinger distance, and at a rate of $O(m^{-4\/s})$ in Kullback-Leibler divergence. These rates can be achieved within the family of HME structures with a tree of binary splits, or within the family of structures with a single layer of experts. Here $s$ is the dimension of the predictor $x$. It is also shown that likelihood-based inference based on HME is consistent in recovering the truth, in the sense that as the sample size $n$ and the number of experts $m$ both increase, the mean square error of the estimated mean response goes to zero. Conditions for such results to hold are stated and discussed.\r\n\r\n<strong>Availability:<\/strong> <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/01\/1301.7390.pdf\" target=\"_blank\">PDF<\/a>\r\n\r\n[\/panel]\r\n\r\n[panel header=\"Stochastic local search for Bayesian networks\"]\r\n\r\n<strong>Authors:<\/strong>\u00a0Kalev Kask and Rina Dechter,\u00a0Department of Information and Computer Science,\u00a0University of California\u2013Irvine\r\n\r\n<strong>Abstract:\u00a0<\/strong>The paper evaluates empirically the suitability\u00a0of Stochastic Local Search algorithms\u00a0(SLS) for finding most probable explanations\u00a0in Bayesian networks. SLS algorithms\u00a0(e.g. GSAT, WSAT) have recently\u00a0proven to be highly effective in solving\u00a0complex constraint-satisfaction and satisfiability problems which cannot be solved\u00a0by traditional search schemes. 
Our experiments\u00a0investigate the applicability of this\u00a0scheme to probabilistic optimization problems.\r\nSpecifically, we show that algorithms\u00a0combining hill-climbing steps with stochastic\u00a0steps (guided by the network's probability\u00a0distribution) called G+StS, outperform pure\u00a0hill-climbing search, pure stochastic simulation\u00a0search, as well as simulated annealing.\u00a0In addition, variants of G+StS that are augmented on top of alternative approximation\u00a0methods are shown to be particularly effective.\r\n\r\n<strong>Availability:<\/strong> <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/01\/r72-new_11_98.pdf\" target=\"_blank\">PDF<\/a>\r\n\r\n[\/panel]\r\n\r\n[panel header=\"Bayesian Graphical Models, Intention-to-Treat, and the Rubin Causal Model\"]\r\n\r\n<strong>Author:<\/strong>\u00a0David Madigan,\u00a0<a href=\"http:\/\/www.stat.washington.edu\/\" target=\"_blank\">Department of Statistics<\/a>,\u00a0University of Washington\r\n\r\n<strong>Abstract:<\/strong>\u00a0In clinical trials with significant noncompliance the standard intention-to-treat analyses sometimes mislead. Rubin's causal model provides an alternative method of analysis that can shed extra light on clinical trial data. Formulating the Rubin Causal Model as a Bayesian graphical model facilitates model communication and computation.\r\n\r\n<strong>Availability:<\/strong> <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/01\/38981c9db80897c99c15049dcf4a0145aad5.pdf\" target=\"_blank\">PDF<\/a>\r\n\r\n[\/panel]\r\n\r\n[panel header=\"Efficient Mining of Statistical Dependencies\"]\r\n\r\n<strong>Authors:\u00a0<\/strong>Tim Oates,\u00a0Matthew D. Schmill,\u00a0Paul R. 
Cohen, and Casey Durfee,\u00a0Experimental Knowledge Systems Lab,\u00a0Department of Computer Science,\u00a0University of Massachusetts\u2013Amherst\r\n\r\n<strong>Abstract:<\/strong>\u00a0The Multi-Stream Dependency Detection algorithm finds rules that capture statistical dependencies between patterns in multivariate time series of categorical data. Rule strength is measured by the G statistic, and an upper bound on the value of G for the descendants of a node allows MSDD's search space to be pruned. However, in the worst case, the algorithm will explore exponentially many rules. This paper presents and empirically evaluates two ways of addressing this problem. The first is a set of three methods for reducing the size of MSDD's search space based on information collected during the search process. Second, we discuss an implementation of MSDD that distributes its computations over multiple machines on a network.\r\n\r\n[\/panel]\r\n\r\n[panel header=\"Tractable structure search in the presence of latent variables\"]\r\n\r\n<strong>Authors:<\/strong> <a href=\"http:\/\/www.stat.washington.edu\/tsr\/website\/inquiry\/home.php\" target=\"_blank\">Thomas Richardson<\/a>, Heiko Bailer, and Moulinath Banerjees,\u00a0<a href=\"http:\/\/www.stat.washington.edu\/\" target=\"_blank\">Department of Statistics<\/a>,\u00a0University of Washington\r\n\r\n<strong>Abstract:\u00a0<\/strong>The problem of learning the structure of a DAG model in the presence of latent variables presents many formidable challenges. In particular there are an infinite number of latent variable models to consider, and these models possess features which make them hard to work with. We describe a class of graphical models which can represent the conditional independence structure induced by a latent variable model over the observed margin. We give a parametrization of the set of Gaussian distributions with conditional independence structure given by a MAG model. 
The models are illustrated via a simple example. Different estimation techniques are discussed in the context of Zellner's Seemingly Unrelated Regression (SUR) models.\r\n\r\n[\/panel]\r\n\r\n[panel header=\"Boosting Methodology for Regression Problems\"]\r\n\r\n<strong>Authors:<\/strong>\u00a0Greg Ridgeway,\u00a0David Madigan, and\u00a0<a href=\"http:\/\/www.stat.washington.edu\/tsr\/website\/inquiry\/home.php\" target=\"_blank\">Thomas Richardson<\/a>,\u00a0<a href=\"http:\/\/www.stat.washington.edu\/\" target=\"_blank\">Department of Statistics<\/a>,\u00a0University of Washington\r\n\r\n<strong>Abstract:\u00a0<\/strong>Classification problems have dominated research on boosting to date. The application of boosting to regression problems, on the other hand, has received little investigation. In this paper we develop a new boosting method for regression problems. We cast the regression problem as a classification problem and apply an interpretable form of the boosted na\u00efve Bayes classifier. This induces a regression model that we show to be expressible as an additive model for which we derive estimators and discuss computational issues. We compare the performance of our boosted na\u00efve Bayes regression model with other interpretable multivariate regression procedures.\r\n\r\n<strong>Availability:<\/strong>\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/01\/boosting-methodology-regression-problems.pdf\" target=\"_blank\">PDF<\/a>\r\n\r\n[\/panel]\r\n\r\n[panel header=\"An Experiment in Causal Inference Using a Pneumonia Database\"]\r\n\r\n<strong>Authors:<\/strong> Peter Spirtes,\u00a0Department of Philosophy,\u00a0Carnegie Mellon University;\u00a0Greg Cooper,\u00a0Center for Biomedical Informatics,\u00a0University of Pittsburgh\r\n\r\n<strong>Abstract:\u00a0<\/strong>We tested a causal discovery algorithm on a database of pneumonia patients. 
The output of the causal discovery algorithm is a list of statements \"A causes B\", where A and B are variables in the database, and a score indicating the degree of confidence in the statement. We compared the output of the algorithm with the opinions of physicians about whether A caused B or not. We found that the doctors' opinions were independent of the output of the algorithm. However, an examination of the output suggested a simple, well-motivated modification of the algorithm which would bring the output of the algorithm into high agreement with the physicians' opinions.\r\n\r\n<strong>Availability:<\/strong> <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/01\/causal-discovery.pdf\" target=\"_blank\">PDF<\/a>\r\n\r\n[\/panel]\r\n\r\n[panel header=\"A Note on the Comparison of Polynomial Selection Methods\"]\r\n\r\n<strong>Authors:<\/strong>\u00a0Murlikrishna Viswanathan and Professor Chris Wallace,\u00a0School of Computer Science and Software Engineering,\u00a0Monash University\u2013Clayton\r\n\r\n<strong>Abstract:\u00a0<\/strong>Minimum Message Length (MML) and Structural Risk Minimisation (SRM) are two computational learning principles that have achieved wide acclaim in recent years. Whereas the former is based on Bayesian learning and the latter on the classical theory of VC-dimension, they are similar in their attempt to define a trade-off between model complexity and goodness of fit to the data. A recent empirical study by Wallace compared the performance of standard model selection methods in a one-dimensional polynomial regression framework. The results from this study provided strong evidence in support of the MML and SRM based methods over the other standard approaches. In this paper we present a detailed empirical evaluation of three model selection methods which include an MML based approach and two SRM based methods.
Results from our analysis and experimental evaluation suggest that the MML-based approach in general has higher predictive accuracy and also raise questions on the inductive capabilities of the Structural Risk Minimization Principle.\r\n\r\n[\/panel]\r\n\r\n[\/accordion]"},{"id":3,"name":"Poster Sessions","content":"<h2>Poster Sessions | January 4<\/h2>\r\n[accordion]\r\n\r\n[panel header=\"Transfer of Information between System and Evidence Models\"]\r\n\r\n<strong>Authors:\u00a0<\/strong>Russell Almond,\u00a0Research Statistics Group at\u00a0<a href=\"http:\/\/www.ets.org\/\" target=\"_blank\">Educational Testing Service<\/a>;\u00a0Edward Herskovits,\u00a0Noetic Systems, Inc.;\u00a0Robert J. Mislevy,\u00a0Model Based Measurement Group at\u00a0Educational Testing Service;\u00a0Linda Stienberg,\u00a0Educational Policy Research at\u00a0Educational Testing Service\r\n\r\n<strong>Abstract:<\/strong>\u00a0In this paper we illustrate a simple scheme for dividing a complex Bayes network into a <em>system model<\/em> and a collection of smaller <em>evidence models<\/em>. While the system model maintains a permanent record of the state of the system of interest, the evidence models are only used momentarily to absorb evidence from specific observations or findings and then discarded. This paper describes an implementation of a system model\u2014evidence model complex in which each system and evidence model has a separate Bayes net and Markov tree representation. As necessary, information is propagated between common Markov tree nodes of the evidence and system models. 
While mathematically equivalent to the full Bayes network, the system model\u2014evidence model complex allows us to (a) separate the seldom used evidence model portions from the core system model, thus reducing search and propagation time in the network and (b) easily replace the evidence models (this is particularly advantageous in educational examples in which new test items are often introduced to prevent overexposure of assessment tasks).\r\n\r\n[\/panel]\r\n\r\n[panel header=\"A Bayesian Model for Collaborative Filtering\"]\r\n\r\n<strong>Authors:<\/strong>\u00a0Yung-Hsin Chien and Edward I. George,\u00a0Department of MSIS at\u00a0University of Texas at Austin\r\n\r\n<strong>Abstract:<\/strong>\u00a0Consider the general setup where a set of <em>items<\/em> have been partially <em>rated<\/em> by a set of <em>judges<\/em>, in the sense that not every item has been rated by every judge. For this setup, we propose a Bayesian approach for the problem of predicting the missing ratings from the observed ratings. This approach incorporates similarity by assuming the set of judges can be partitioned into groups which share the same ratings probability distribution. This leads to a predictive distribution of missing ratings based on the posterior distribution of the groupings and associated ratings probabilities. Markov chain Monte Carlo methods and a hybrid search algorithm are then used to obtain predictions of the missing ratings.\r\n\r\n[\/panel]\r\n\r\n[panel header=\"Parameter learning from incomplete data for Bayesian networks\"]\r\n\r\n<strong>Author:<\/strong>\u00a0RG Cowell\r\n\r\n<strong>Abstract:<\/strong> In a companion paper (Cowell 1999), I described a method of using maximum entropy to estimate the joint probability distribution for a set of discrete variables from missing data.
Here I extend the method of that paper to incorporate prior information for application to parameter learning in Bayesian networks.\r\n\r\n[\/panel]\r\n\r\n[panel header=\"On the Application of The Bootstrap for Computing Confidence Measures on Features of Induced Bayesian Networks\"]\r\n\r\n<strong>Authors:<\/strong>\u00a0<a href=\"http:\/\/www.cs.huji.ac.il\/~nir\/\" target=\"_blank\">Nir Friedman<\/a>,\u00a0<a href=\"http:\/\/www.cs.huji.ac.il\/\" target=\"_blank\">Institute of Computer Science<\/a> at\u00a0Hebrew University;\u00a0Moises Goldszmidt, SRI International; Abraham Wyner,\u00a0Department of Statistics at\u00a0Wharton School\r\n\r\n<strong>Abstract:\u00a0<\/strong>In the context of learning Bayesian networks from data, very little work has been published on methods for assessing the <em>quality<\/em> of an induced model. This issue, however, has received a great deal of attention in the statistics literature. In this paper, we take a well-known method from statistics, Efron's Bootstrap, and examine its applicability for assessing a confidence measure on features of the learned network structure. We also compare this method to assessments based on a practical realization of the Bayesian methodology.\r\n\r\n[\/panel]\r\n\r\n[panel header=\"Relaxing the Local Independence Assumption for Quantitative Learning in Acyclic Directed Graphical Models through Hierarchical Partition Models\"]\r\n\r\n<strong>Authors:<\/strong>\u00a0Daniela Golinelli and David Madigan,\u00a0<a href=\"http:\/\/www.stat.washington.edu\/\" target=\"_blank\">Department of Statistics<\/a> at\u00a0University of Washington and\u00a0Guido Consonni,\u00a0Dip. di Economia e Metodi Quantitativi at\u00a0Universita' di Pavia\r\n\r\n<strong>Abstract:\u00a0<\/strong>The simplest method proposed by Spiegelhalter and Lauritzen (1990) to perform <em>quantitative learning<\/em> in ADG presents a potential weakness: the <em>local independence assumption<\/em>. 
We propose to alleviate this problem through the use of Hierarchical Partition Models. Our approach is compared with the previous one from an interpretative and predictive point of view.\r\n\r\n[\/panel]\r\n\r\n[panel header=\"The exploration of new methods for learning in binary Boltzmann machines\"]\r\n\r\n<strong>Authors:\u00a0<\/strong>Keith Humphreys and D.M. Titterington,\u00a0<a href=\"http:\/\/www.gla.ac.uk\/subjects\/statistics\/\" target=\"_blank\">Department of Statistics<\/a> at\u00a0University of Glasgow\r\n\r\n<strong>Abstract:\u00a0<\/strong>Exact inference for Boltzmann machines is computationally expensive. One approach to improving tractability is to approximate the gradient algorithm. We describe a new way of doing this which is based on Bahadur's representation of the multivariate binary distribution (Bahadur, 1961). We compare the approach, for networks with no unobserved variables, to the \"mean field\" approximation of Peterson and Anderson (1987) and the approach of Kappen and Rodriguez (1998), which is based on the linear response theorem. We also investigate the use of the pairwise association cluster method (Tanaka and Morita, 1995).\r\n\r\n[\/panel]\r\n\r\n[panel header=\"Statistical Challenges to inductive inference in linked data\"]\r\n\r\n<strong>Author:<\/strong>\u00a0David Jensen\r\n\r\n<strong>Abstract:<\/strong> Many data sets can be represented naturally as collections of linked objects. For example, document collections can be represented as documents (nodes) connected by citations and hypertext references (links). Similarly, organizations can be represented as people (nodes) connected by reporting relationships, social relationships, and communication patterns (links).\r\n\r\n[\/panel]\r\n\r\n[panel header=\"Mixture Model Clustering with the Multimix Program\"]\r\n\r\n<strong>Authors:\u00a0<\/strong>Murray A. Jorgensen and Lynette A.
Hunt,\u00a0<a href=\"http:\/\/www.stats.waikato.ac.nz\/\" target=\"_blank\">Department of Statistics<\/a> at\u00a0University of Waikato\r\n\r\n<strong>Abstract:\u00a0<\/strong>Hunt (1996) has implemented the finite mixture model approach to clustering in a program called Multimix. The program is designed to cluster multivariate data with categorical and continuous variables, possibly containing missing values. The model fitted simultaneously generalises the Latent Class model and the mixture of multivariate normals model. Like either of these models, Multimix can be used to form clusters by the Bayes allocation rule. This is the intended use of the program, although the parameter estimates can be used to give a succinct description of the clusters.\r\n\r\nUse of the EM algorithm, with its view of the observed data as being notionally augmented by missing information to form the 'complete data', gives a broad framework for estimation which is able to handle two types of missing information: unknown cluster assignment and missing data. By using the methodology of Little and Rubin (1987) in this way, Multimix is able to handle missing data in a less ad hoc way than many clustering algorithms. The program runs in acceptable time with large data matrices (say hundreds of observations on tens of variables). Use of the missing-data facility increases execution time somewhat. In this presentation we describe the approach taken to the design of Multimix and how some of the statistical problems were dealt with. As examples of the use of the program we cluster a large medical dataset and a version of Fisher's Iris data in which a third of the values are randomly made 'missing'.\r\n\r\n[\/panel]\r\n\r\n[panel header=\"Learning Augmented Bayesian Classifiers: A Comparison of Distribution-based and Classification-based Approaches\"]\r\n\r\n<strong>Authors:\u00a0<\/strong>Eamonn Keogh and Michael J. 
Pazzani,\u00a0<a href=\"http:\/\/www.ics.uci.edu\/\" target=\"_blank\">Information and Computer Science<\/a> at\u00a0University of California, Irvine\r\n\r\n<strong>Abstract:\u00a0<\/strong>The na\u00efve Bayes classifier is built on the assumption of conditional independence between the attributes given the class. The algorithm has been shown to be surprisingly robust to obvious violations of this condition, but it is natural to ask if it is possible to further improve the accuracy by relaxing this assumption. We examine an approach where na\u00efve Bayes is augmented by the addition of correlation arcs between attributes. We explore two methods for finding the set of augmenting arcs: a greedy hill-climbing search, and a novel, more computationally efficient algorithm that we call SuperParent. We compare these methods to TAN, a state-of-the-art distribution-based approach to finding the augmenting arcs.\r\n\r\n[\/panel]\r\n\r\n[panel header=\"Exploring the robustness of Bayesian and information-theoretic methods for predictive inference\"]\r\n\r\n<strong>Authors:<\/strong>\u00a0Petri Kontkanen, Petri Myllym\u00e4ki, Tomi Silander, Henry Tirri, and Kimmo Valtonen\r\n\r\n<strong>Abstract:<\/strong>\u00a0Given a set of sample data, we study three alternative methods for determining the predictive distribution of an unseen data vector. In particular, we are interested in the behavior of the predictive accuracy of these three predictive methods as a function of the degree of the domain assumption violations. We explore this question empirically by using artificially generated data sets, where the assumptions can be violated in various ways. Our empirical results suggest that if the model assumptions are only mildly violated, marginalization over the model parameters may not be necessary in practice. 
This is due to the fact that in this case the computationally much simpler predictive distribution based on a single, maximum posterior probability model shows performance similar to that of the computationally more demanding marginal likelihood approach. The results also give support to Rissanen's theoretical results about the usefulness of using Jeffreys' prior distribution for the model parameters.\r\n\r\n<strong>Availability:<\/strong> <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2016\/12\/10.1.1.51.1018.pdf\" target=\"_blank\">PDF<\/a>\r\n\r\n[\/panel]\r\n\r\n[panel header=\"Structure optimization of density estimation models applied to regression problems with dynamic noise\"]\r\n\r\n<strong>Authors:\u00a0<\/strong>Martin Kreutz,\u00a0Bernhard Sendhoff, and Werner von Seelen,\u00a0<a href=\"https:\/\/www.ini.rub.de\/\" target=\"_blank\">Institut f\u00fcr Neuroinformatik<\/a> at\u00a0Ruhr-Universit\u00e4t Bochum;\u00a0Anja M. Reimetz and Claus Weihs,\u00a0<a href=\"http:\/\/www.statistik.uni-dortmund.de\/fakultaet.html\" target=\"_blank\">Fachbereich Statistik<\/a> at\u00a0Universit\u00e4t Dortmund\r\n\r\n<strong>Abstract:\u00a0<\/strong>In this paper we deal with the problem of model selection for time series forecasting with dynamical noise and missing data. We employ an evolutionary algorithm to optimize a mixture-of-densities model in order to estimate, via a log-likelihood-based quality measure, the joint probability density of the data. We apply our method to the prediction of both artificial time series, generated from the Mackey-Glass equation, and time series from a real world system consisting of physiological data of apnea patients.\r\n\r\n<strong>Related work from the authors:<\/strong>\r\n<ul>\r\n \t<li>Martin Kreutz, Anja M. Reimetz, Bernhard Sendhoff, Claus Weihs and Werner von Seelen. Optimisation of Density Estimation Models with Evolutionary Algorithms. In A.E. Eiben, Th. B\u00e4ck, M. Schoenauer and H.P. 
Schwefel, editors, <em>Parallel Problem Solving from Nature - PPSN V<\/em>, pages 998-1007, Lecture Notes in Computer Science 1498, Springer, 1998.<\/li>\r\n<\/ul>\r\n[\/panel]\r\n\r\n[panel header=\"A learning rule based method of feature extraction with application to acoustic signal classification\"]\r\n\r\n<strong>Author:<\/strong>\u00a0M.J. Larkin,\u00a0<a href=\"https:\/\/www.brown.edu\/research\/projects\/brain-and-neural-systems\/\" target=\"_blank\">The Institute for Brain and Neural Systems<\/a> at Brown University\r\n\r\n<strong>Abstract:<\/strong> We apply the Bienenstock, Cooper, and Munro (1982) theory of visual cortical plasticity to the problem of extracting features (i.e., reduction of dimensionality) from acoustic signals; in this case, labeled samples of marine mammal sounds. We first implemented BCM learning in a single neuron model, trained the neuron on samples of acoustic data, and then observed the response when the neuron was tested on different classes of acoustic signals. Next, a multiple neuron network was constructed, with lateral inhibition among the neurons. By training neurons to be selective to inherent features in these signals, we are able to develop networks which can then be used in the design of an automated acoustic signal classifier.\r\n\r\n[\/panel]\r\n\r\n[panel header=\"Learning Extensible Multi-Entity Directed Graphical Models\"]\r\n\r\n<strong>Author:<\/strong>\u00a0Kathryn Blackmond Laskey,\u00a0<a href=\"https:\/\/seor.gmu.edu\/\" target=\"_blank\">Department of Systems Engineering and Operations Research<\/a> at\u00a0George Mason University\r\n\r\n<strong>Abstract:\u00a0<\/strong>Graphical models have become a standard tool for representing complex probability models in statistics and artificial intelligence. 
In problems arising in artificial intelligence, it is natural to use the belief network formalism to represent uncertain relationships among variables in the domain, but it may not be possible to use a single, fixed belief network to encompass all problem instances. This is because the number of entities to be reasoned about, and their relationships to each other, vary from problem instance to problem instance. This paper describes a framework for representing probabilistic knowledge as fragments of belief networks and an approach to learning both structure and parameters from observations.\r\n\r\n[\/panel]\r\n\r\n[panel header=\"A latent variable model for multivariate discretization\"]\r\n\r\n<strong>Authors:\u00a0<\/strong>Stefano Monti,\u00a0<a href=\"http:\/\/www.isp.pitt.edu\/\" target=\"_blank\">Intelligent Systems Program<\/a> at\u00a0University of Pittsburgh and\u00a0Gregory F. Cooper,\u00a0<a href=\"http:\/\/www.dbmi.pitt.edu\/\" target=\"_blank\">Center for Biomedical Informatics<\/a> at\u00a0University of Pittsburgh\r\n\r\n<strong>Abstract:\u00a0<\/strong>We describe a new method for multivariate discretization based on the use of a latent variable model. The method is proposed as a tool to extend the scope of applicability of machine learning algorithms that handle discrete variables only. Building upon existing class-based discretization methods, we use a latent variable as a <em>proxy<\/em> class variable, which is then utilized to drive the partition of the value range of each continuous variable. We present experimental results on simulated data aimed at assessing the merits of the proposed method.\r\n\r\n[\/panel]\r\n\r\n[panel header=\"Testing Regression Models With Fewer Regressors\"]\r\n\r\n<strong>Authors:\u00a0<\/strong>Judea Pearl and Peyman Meshkat\r\n\r\n<strong>Abstract:\u00a0<\/strong>A BASIS for a model M is a minimal set of tests that, if satisfied, implies the satisfaction of all the assumptions behind M. 
This paper proposes a graphical procedure for recognizing bases of regression models. Using this procedure, it is possible to select a set of tests in which the number of regressors is small, compared with standard tests, thus resulting in improved power.\r\n\r\n[\/panel]\r\n\r\n[panel header=\"Learning Conditional Probabilities from Incomplete Databases - An Experimental Comparison\"]\r\n\r\n<strong>Authors:\u00a0<\/strong>Marco Ramoni and\u00a0Paola Sebastiani,\u00a0The Open University\r\n\r\n<strong>Abstract:\u00a0<\/strong>This paper compares three methods - the EM algorithm, Gibbs sampling, and Bound and Collapse (BC) - to estimate conditional probabilities from incomplete databases in a controlled experiment. Results show a substantial equivalence of the estimates provided by the three methods and a dramatic gain in efficiency using BC.\r\n\r\n<strong>Other information:\u00a0<\/strong>Further information is available from the home page of the <a href=\"http:\/\/projects.kmi.open.ac.uk\/bkd\/\" target=\"_blank\">Bayesian Knowledge Discovery<\/a> project at <a href=\"http:\/\/www.open.ac.uk\/\" target=\"_blank\">The Open University<\/a>.\r\n\r\n[\/panel]\r\n\r\n[panel header=\"Local Experts Combination through Density Decomposition\"]\r\n\r\n<strong>Authors:\u00a0<\/strong>Ahmed Rida,\u00a0Abderrahim Labbi, and Christian Pellegrini,\u00a0<a href=\"http:\/\/cuiwww.unige.ch\/AI-group\/home.html\" target=\"_blank\">Artificial Intelligence group<\/a> at\u00a0<a href=\"http:\/\/cui.unige.ch\/fr\/\" target=\"_blank\">Centre Universitaire d'Informatique<\/a>\r\n\r\n<strong>Abstract:\u00a0<\/strong>In this paper we describe a <em>divide-and-combine<\/em> strategy for decomposition of a complex prediction problem into simpler local sub-problems. We first show how to perform a <em>soft<\/em> decomposition via clustering of input data. Such decomposition leads to a partition of the input space into several regions which may overlap. 
Each region is then assigned a local predictor (or expert), which is trained only on local data. To construct a solution to the global prediction problem, we combine the local experts using two approaches: <em>weighted averaging<\/em>, where the outputs of local experts are weighted by their prior densities, and <em>nonlinear adaptive combination<\/em>, where the pooling parameters are obtained through minimization of a global error. To illustrate the validity of our approach, we show simulation results for two classification tasks, <em>vowels<\/em> and <em>phonemes<\/em>, using local experts which are Multi-Layer Perceptrons (MLP) and Support Vector Machines (SVM). We compare the results obtained using the two local combination modes with the results obtained using a global predictor and a linear combination of global predictors.\r\n\r\n[\/panel]\r\n\r\n[panel header=\"Entropy-Driven Inference and Inconsistency\"]\r\n\r\n<strong>Authors:\u00a0<\/strong>Wilhelm R\u00f6dder and\u00a0Longgui Xu,\u00a0FernUniversit\u00e4t Gesamthochschule in Hagen\r\n<a href=\"http:\/\/www.fernuni-hagen.de\/BWLOR\/index.php\" target=\"_blank\">Fachbereich Wirtschaftswissenschaft, Lehrstuhl f\u00fcr Betriebswirtschaftslehre, insb. Operations Research<\/a>\r\n\r\n<strong>Abstract:\u00a0<\/strong>Probability distributions on a set of discrete variables are a suitable means to represent knowledge about their respective mutual dependencies. As new evidence arrives, such a distribution can be adapted to the new situation and hence submitted to a sound inference process. Knowledge acquisition and inference are here performed in the rich syntax of conditional events. Both acquisition and inference respect a sophisticated principle, namely that of maximum entropy and of minimum relative entropy. The freedom to formulate and derive knowledge in a language of rich syntax is comfortable but involves the danger of contradictions or inconsistencies. 
We develop a method for resolving such inconsistencies, which go back to the incompatibility of experts\u2019 knowledge in their respective fields. The method is applied to diagnosis in Chinese medicine. All calculations are performed in the entropy-driven expert system shell SPIRIT.\r\n\r\n<strong>Availability:<\/strong> <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2016\/12\/roedder-1.pdf\" target=\"_blank\">PDF<\/a>\r\n\r\n<strong>References:\u00a0<\/strong>\r\n\r\n[1] I. Csisz\u00e1r: I-Divergence Geometry of Probability Distributions and Minimisation Problems. The Annals of Probability 3 (1): 146 - 158 (1975).\r\n\r\n[2] I. Csisz\u00e1r: Why Least Squares and Maximum Entropy? An Axiomatic Approach to Inference for Linear Inverse Problems. The Annals of Statistics 19 (4): 2032 - 2066 (1991).\r\n\r\n[3] G. Kern-Isberner: Characterising the principle of minimum cross-entropy within a conditional-logical framework. Artificial Intelligence 98: 169 - 208 (1998).\r\n\r\n[4] S. L. Lauritzen: Graphical Association Models (Draft), Technical Report IR 93-2001, Institute for Electronic Systems, Dept. of Mathematics and Computer Science, Aalborg University (1993).\r\n\r\n[5] W. R\u00f6dder and G. Kern-Isberner: Representation and extraction of information by probabilistic logic. Information Systems 21 (8): 637 - 652 (1996).\r\n\r\n[6] W. R\u00f6dder and C.-H. Meyer: Coherent knowledge processing at maximum entropy by SPIRIT, Proceedings 12th Conference on Uncertainty in Artificial Intelligence, E. Horvitz and F. Jensen (editors), Morgan Kaufmann, San Francisco, California: 470 - 476 (1996).\r\n\r\n[7] C. C. Schnorrenberger: Lehrbuch der chinesischen Medizin f\u00fcr westliche \u00c4rzte, Hippokrates, Stuttgart (1985).\r\n\r\n[8] C. E. Shannon: A mathematical theory of communication, Bell System Tech. J. 27, 379 - 423 (part I), 623 - 656 (part II) (1948).\r\n\r\n[9] J. E. Shore and R. W. 
Johnson: Axiomatic Derivation of the Principle of Maximum Entropy and the Principle of Minimum Cross Entropy. IEEE Trans. Information Theory 26 (1): 26 - 37 (1980).\r\n\r\n[10] J. Whittaker: Graphical Models in Applied Multivariate Statistics, John Wiley &amp; Sons (1990).\r\n\r\n[\/panel]\r\n\r\n[panel header=\"Learned Models for Continuous Planning\"]\r\n\r\n<strong>Authors:\u00a0<\/strong>Matthew D. Schmill,\u00a0Tim Oates, and Paul R. Cohen, Experimental Knowledge Systems Laboratory at\u00a0University of Massachusetts\r\n\r\n<strong>Abstract:\u00a0<\/strong>We are interested in the nature of <em>activity<\/em>\u00a0\u2014 structured behavior of nontrivial duration \u2014\u00a0in intelligent agents. We believe that the development of activity is a continual process in which simpler activities are composed, via planning, to form more sophisticated ones in a hierarchical fashion. The success or failure of a planner depends on its models of the environment, and its ability to implement its plans in the world. We describe an approach to generating dynamical models of activity from real-world experiences and explain how they can be applied towards planning in a continuous state space.\r\n\r\n[\/panel]\r\n\r\n[panel header=\"Efficient Optimization of Large k Real-time Control Algorithm\"]\r\n\r\n<strong>Authors:\u00a0<\/strong>Delphi Delco Electronics Systems,\u00a0Restraint Systems Electronics and\u00a0Daniel H. Loughlin, North Carolina State University\r\n\r\n<strong>Abstract:\u00a0<\/strong>Resource requirements for global optimization increase dramatically with the number of real-valued decision variables (k). Efficient search strategies are needed to satisfy constraints of time, effort, and funding. In this paper, a conjunction of several disparate methods is used to automatically calibrate a non-linear real-time control algorithm used in the automotive industry. 
By combining a response surface methodology with a hybrid genetic algorithm search, air-bag deployment calibrations can be automated, producing solutions superior to conventional manual search.\r\n\r\n[\/panel]\r\n\r\n[panel header=\"Model Folding for Data Subject to Nonresponse\"]\r\n\r\n<strong>Authors:<\/strong>\u00a0Paola Sebastiani, The Open University and\u00a0Marco Ramoni,\u00a0Knowledge Media Institute at\u00a0The Open University\r\n\r\n<strong>Abstract:\u00a0<\/strong>In this paper we introduce a deterministic method to estimate the posterior probability of rival models from data with partially ignorable nonresponse. The accuracy of the method will be shown via an application to synthetic data.\r\n\r\n<strong>Other information:\u00a0<\/strong>Further information is available from the home page of the <a href=\"http:\/\/projects.kmi.open.ac.uk\/bkd\/\" target=\"_blank\">Bayesian Knowledge Discovery<\/a> project at <a href=\"http:\/\/www.open.ac.uk\/\" target=\"_blank\">The Open University<\/a>.\r\n\r\n[\/panel]\r\n\r\n[panel header=\"Geometry, Moments and Bayesian Networks with Hidden Variables\"]\r\n\r\n<strong>Authors:<\/strong>\u00a0Raffaella Settimi and\u00a0Jim Q. Smith,\u00a0University of Warwick\r\n\r\n<strong>Abstract:\u00a0<\/strong>The purpose of this paper is to present a systematic way of analysing the geometry of the probability spaces for a particular class of Bayesian networks with hidden variables. It will be shown that the conditional independence statements implicit in such graphical models can be neatly expressed as simple polynomial relationships among central moments. 
This algebraic framework will enable us to explore and identify the structural constraints on the sample space induced by models with tree structures and therefore characterise the families of distributions consistent with such conditional independence assumptions.\r\n\r\n[\/panel]\r\n\r\n[panel header=\"Joint probabilistic clustering of multivariate and sequential data\"]\r\n\r\n<strong>Author:<\/strong>\u00a0Padhraic Smyth,\u00a0University of California\u2013Irvine\r\n\r\n<strong>Abstract:<\/strong>\u00a0Consider the following problem. We have a set of individuals (a random sample from a larger population) whom we would like to cluster into groups based on observational data. For each individual we can measure characteristics which are relatively static (e.g., their height, weight, income, age, sex, etc.). Probabilistic model-based clustering in this context usually takes the form of a finite mixture model, where each component in the mixture is a multivariate probability density function (or distribution function) for a particular group. This approach has been found to be a useful general technique for extracting hidden structure from multivariate data (Banfield and Raftery, 1993; Thiesson et al., 1997).\r\n\r\n[\/panel]\r\n\r\n[panel header=\"Analysis of multivariate time series via a hidden graphical model\"]\r\n\r\n<strong>Authors:<\/strong>\u00a0Elena Stanghellini, Universita' di Perugia and Joe Whittaker,\u00a0Lancaster University\r\n\r\n<strong>Abstract:\u00a0<\/strong>We propose a chain graph with unobserved variables to model a multivariate time series. We assume that an underlying common trend linearly affects the observed time series, but we do not restrict our analysis to models where the underlying factor accounts for all the contemporary correlations of the series. The residual correlation is modelled using results from graphical models. Modelling the associations left unexplained is an alternative to augmenting the dimension of the underlying factor. 
It is justified when a clear interpretation of the residual associations is available. It is also an informative way to explore sources of deviation from standard dynamic single-factor models.\r\n\r\n[\/panel]\r\n\r\n[panel header=\"Visual design support for probabilistic network application\"]\r\n\r\n<strong>Author:\u00a0<\/strong>Axel Vogler,\u00a0Daimler Benz AG\r\n\r\n<strong>Abstract:\u00a0<\/strong>Understanding inference in probabilistic networks is an important concern in the design phase. Their causal structure and locally defined parameters are intuitive to human experts, but the global system induced by the local parameters can lead to results the experts did not intend. To support network design, an edge-coloring scheme is introduced that explains the influences between the variables responsible for an inference result.\r\n\r\n[\/panel]\r\n\r\n[\/accordion]"}],"msr_startdate":"1999-01-03","msr_enddate":"1999-01-06","msr_event_time":"","msr_location":"Fort Lauderdale, FL, USA","msr_event_link":"","msr_event_recording_link":"","msr_startdate_formatted":"January 3, 1999","msr_register_text":"Watch now","msr_cta_link":"","msr_cta_text":"","msr_cta_bi_name":"","featured_image_thumbnail":null,"event_excerpt":"The Seventh International Workshop on Artificial Intelligence and Statistics was presented by The Society for Artificial Intelligence & 
Statistics.","msr_research_lab":[],"related-researchers":[],"msr_impact_theme":[],"related-academic-programs":[],"related-groups":[],"related-projects":[],"related-opportunities":[],"related-publications":[],"related-videos":[],"related-posts":[],"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-event\/336896","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-event"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-event"}],"version-history":[{"count":1,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-event\/336896\/revisions"}],"predecessor-version":[{"id":1147199,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-event\/336896\/revisions\/1147199"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=336896"}],"wp:term":[{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=336896"},{"taxonomy":"msr-region","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-region?post=336896"},{"taxonomy":"msr-event-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-event-type?post=336896"},{"taxonomy":"msr-video-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-video-type?post=336896"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=336896"},{"taxonomy":"msr-program-audience","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-program-audience?post=336896"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-po
st-option?post=336896"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=336896"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}