{"id":169805,"date":"2001-11-05T12:17:42","date_gmt":"2001-11-05T12:17:42","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/project\/support-vector-machines\/"},"modified":"2019-08-14T14:33:07","modified_gmt":"2019-08-14T21:33:07","slug":"support-vector-machines","status":"publish","type":"msr-project","link":"https:\/\/www.microsoft.com\/en-us\/research\/project\/support-vector-machines\/","title":{"rendered":"Support Vector Machines"},"content":{"rendered":"<p class=\"asset-content\">Support vector machines are a set of algorithms that learn from data by creating models that maximize their margin of error.<!-- .asset-content --><\/p>\n<p><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/en.wikipedia.org\/wiki\/Support_Vector_Machines\" target=\"_blank\" rel=\"noopener noreferrer\">Support vector machines<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>\u00a0(SVMs) are a family of algorithms for\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/en.wikipedia.org\/wiki\/Statistical_classification\" target=\"_blank\" rel=\"noopener noreferrer\">classification<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>,\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/en.wikipedia.org\/wiki\/Regression_analysis\" target=\"_blank\" rel=\"noopener noreferrer\">regression<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>,\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/en.wikipedia.org\/wiki\/Transduction_(machine_learning)\" target=\"_blank\" rel=\"noopener noreferrer\">transduction<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/en.wikipedia.org\/wiki\/Novelty_detection\" target=\"_blank\" rel=\"noopener noreferrer\">novelty detection<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, and\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/en.wikipedia.org\/wiki\/Semi-supervised_learning\" target=\"_blank\" rel=\"noopener noreferrer\">semi-supervised learning<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>. They work by choosing a model that\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/en.wikipedia.org\/wiki\/Maximum-margin_hyperplane\" target=\"_blank\" rel=\"noopener noreferrer\">maximizes the error margin<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> of a training set.<\/p>\n<p>SVMs\u00a0were originally developed by\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/en.wikipedia.org\/wiki\/Vladimir_Vapnik\" target=\"_blank\" rel=\"noopener noreferrer\">Vladimir Vapnik<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> in 1963. Since the mid-90s, a energetic research community has grown around them. If you want to learn more about SVMs, you can read Chris Burges&#8217; <a href=\"http:\/\/\/CITESeerx.ist.psu.edu\/viewdoc\/summary?doi=10.1.1.18.1083\" target=\"_self\" rel=\"noopener noreferrer\">tutorial<\/a>.\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.support-vector.net\/nello.html\" target=\"_blank\" rel=\"noopener noreferrer\">Nello Cristianini<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> and\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.cs.ucl.ac.uk\/staff\/J.Shawe-Taylor\/\" target=\"_blank\" rel=\"noopener noreferrer\">John Shawe-Taylor<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> have written\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.support-vector.net\/\" target=\"_blank\" rel=\"noopener noreferrer\">a textbook<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> about them.\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.kyb.tuebingen.mpg.de\/~bs\" target=\"_blank\" rel=\"noopener noreferrer\">Bernhard Sch\u00f6lkopf<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> and\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/alex.smola.org\/\" target=\"_blank\" rel=\"noopener noreferrer\">Alex Smola<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> wrote <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/mitpress.mit.edu\/catalog\/item\/default.asp?ttype=2&tid=8684\" target=\"_blank\" rel=\"noopener noreferrer\">a textbook about kernel methods<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, which are a closely-related set of methods.<\/p>\n<p>Since 1998, we&#8217;ve done basic research into making SVMs be more user-friendly. Our research has resulted in:<\/p>\n<ul>\n<li>SMO: <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/sequential-minimal-optimization-a-fast-algorithm-for-training-support-vector-machines\/\">A fast algorithm for training SVMs from data<\/a>, which is easy to understand and code.<\/li>\n<li>A method for\u00a0calibrating the output of an SVM to yield probabilities.<\/li>\n<li>A simple method to convert a multi-class problem into a series of faster-to-solve two-class SVMs<\/li>\n<li>A method to apply\u00a0SVMs to find unusual items in a training set (novelty detection).<\/li>\n<li>An online approximation to SVMs.<\/li>\n<\/ul>\n<p>See the list of publications, below, for complete citations.<\/p>\n\t<div data-wp-context='{\"items\":[]}' data-wp-interactive=\"msr\/accordion\">\n\t\t\t\t\t<div class=\"clearfix\">\n\t\t\t\t<div\n\t\t\t\t\tclass=\"btn-group align-items-center mb-g float-sm-right\"\n\t\t\t\t\tdata-bi-aN=\"accordion-collapse-controls\"\n\t\t\t\t>\n\t\t\t\t\t<button\n\t\t\t\t\t\tclass=\"btn btn-link m-0\"\n\t\t\t\t\t\tdata-bi-cN=\"Expand all\"\n\t\t\t\t\t\tdata-wp-bind--aria-controls=\"state.ariaControls\"\n\t\t\t\t\t\tdata-wp-bind--aria-expanded=\"state.ariaExpanded\"\n\t\t\t\t\t\tdata-wp-bind--disabled=\"state.isAllExpanded\"\n\t\t\t\t\t\tdata-wp-class--inactive=\"state.isAllExpanded\"\n\t\t\t\t\t\tdata-wp-on--click=\"actions.onExpandAll\"\n\t\t\t\t\t\ttype=\"button\"\n\t\t\t\t\t>\n\t\t\t\t\t\tExpand all\t\t\t\t\t<\/button>\n\t\t\t\t\t<span aria-hidden=\"true\"> | <\/span>\n\t\t\t\t\t<button\n\t\t\t\t\t\tclass=\"btn btn-link m-0\"\n\t\t\t\t\t\tdata-bi-cN=\"Collapse all\"\n\t\t\t\t\t\tdata-wp-bind--aria-controls=\"state.ariaControls\"\n\t\t\t\t\t\tdata-wp-bind--aria-expanded=\"state.ariaExpanded\"\n\t\t\t\t\t\tdata-wp-bind--disabled=\"state.isAllCollapsed\"\n\t\t\t\t\t\tdata-wp-class--inactive=\"state.isAllCollapsed\"\n\t\t\t\t\t\tdata-wp-on--click=\"actions.onCollapseAll\"\n\t\t\t\t\t\ttype=\"button\"\n\t\t\t\t\t>\n\t\t\t\t\t\tCollapse all\t\t\t\t\t<\/button>\n\t\t\t\t<\/div>\n\t\t\t<\/div>\n\t\t\t\t<ul class=\"msr-accordion\">\n\t\t\t\t\t\t\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-2\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-2\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-1\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tData sets and software\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-1\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-2\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p>The real-world data sets described in the technical report (below) are available in a compressed ASCII format (zip format). Both the\u00a0adult data and the\u00a0web data are available. There is a readme.txt file in each zip archive that explains the format of the file. The testing set for the adult data, the testing set for the web data set, and the\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/yann.lecun.com\/exdb\/mnist\/\" target=\"_blank\" rel=\"noopener noreferrer\">MNIST data set<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> is also available.<\/p>\n<p>MSR currently does not have any software that implements SVMs.\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.csie.ntu.edu.tw\/~cjlin\/libsvm\/\" target=\"_blank\" rel=\"noopener noreferrer\">LIBSVM<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> is a popular package that is based on a SMO-like algorithm.<\/p>\n<p><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/sequential-minimal-optimization-a-fast-algorithm-for-training-support-vector-machines\/\">Check here<\/a>\u00a0for errata on the SMO &#8220;Fast training&#8221;\u00a0physical\u00a0paper (already corrected in the on-line version).<\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-4\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-4\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-3\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tRelated external publications\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-3\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-4\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/keerthis.com\/\" target=\"_blank\" rel=\"noopener noreferrer\">Sathiya Keerthi<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>\u00a0and colleagues have a\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/www.keerthis.com\/smoreg_ieee_shevade_00.pdf\" target=\"_blank\" rel=\"noopener noreferrer\">paper<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> that describes an improved SMO: instead of updating a single threshold, they update the bounds on permissible thresholds. They report substantial improvement in speed, especially for extreme C values.<\/p>\n<p><a href=\"http:\/\/www.microsoft.com\/presspass\/exec\/techfellow\/Flake\/default.mspx\" target=\"_self\" rel=\"noopener noreferrer\">Gary Flake<\/a>\u00a0and Steve Lawrence have an efficient SMO algorithm for <a href=\"http:\/\/\/CITESeerx.ist.psu.edu\/viewdoc\/summary?doi=10.1.1.28.5344\" target=\"_self\" rel=\"noopener noreferrer\">Support Vector Regression<\/a>.<\/p>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t<li class=\"m-0\" data-wp-context='{\"id\":\"accordion-content-6\"}' data-wp-init=\"callbacks.init\">\n\t\t<div class=\"accordion-header\">\n\t\t\t<button\n\t\t\t\taria-controls=\"accordion-content-6\"\n\t\t\t\tclass=\"btn btn-collapse\"\n\t\t\t\tdata-wp-bind--aria-expanded=\"state.isExpanded\"\n\t\t\t\tdata-wp-on--click=\"actions.onClick\"\n\t\t\t\tid=\"accordion-button-5\"\n\t\t\t\ttype=\"button\"\n\t\t\t>\n\t\t\t\tCorrections\t\t\t<\/button>\n\t\t<\/div>\n\t\t<div\n\t\t\taria-labelledby=\"accordion-button-5\"\n\t\t\tclass=\"msr-accordion__content\"\n\t\t\tdata-wp-bind--inert=\"!state.isExpanded\"\n\t\t\tdata-wp-run=\"callbacks.run\"\n\t\t\tid=\"accordion-content-6\"\n\t\t>\n\t\t\t<div class=\"msr-accordion__body\">\n\t\t\t\t<p>There are some unclear ideas and errors in the \u201cFast Training\u201d paper that I (John Platt) would like to clarify on this web page:<\/p>\n<ul>\n<li>Equations (12.9) and (12.10) in the paper are for when the decision function (1.7) subtracts b, not adds it.<\/li>\n<li>Equation (12.18) has a sign error: the plus should be a minus.<\/li>\n<li>The sentence after (12.24) should read \u201cis negative\u201d not \u201cis positive\u201d<\/li>\n<li>In the pseudo-code, after the \u201cif (eta < 0)\u201d statement, right before the \u201cif (|a2-alph2| < \u2026)\u201d statement, a2 should be checked to see if it is within 1e-8 of 0 or C. If so, then a2 should be set to the nearest bound. This prevents round-off error from mistakenly forcing a point to be a support vector. Richard Lippmann points out that this value of 1e-8 is appropriate when inputs have approximately unit variance. Therefore, it is a good idea to scale your inputs (or kernel outputs) so that they are on the order of unity.<\/li>\n<\/ul>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/li>\n\t\t\t\t\t\t<\/ul>\n\t<\/div>\n\t\n","protected":false},"excerpt":{"rendered":"<p>Support vector machines are a set of algorithms that learn from data by creating models that maximize their margin of error. Support vector machines\u00a0(SVMs) are a family of algorithms for\u00a0classification,\u00a0regression,\u00a0transduction, novelty detection, and\u00a0semi-supervised learning. They work by choosing a model that\u00a0maximizes the error margin of a training set. SVMs\u00a0were originally developed by\u00a0Vladimir Vapnik in 1963. [&hellip;]<\/p>\n","protected":false},"featured_media":0,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","footnotes":""},"research-area":[13561,13556,13555],"msr-locale":[268875],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-169805","msr-project","type-msr-project","status-publish","hentry","msr-research-area-algorithms","msr-research-area-artificial-intelligence","msr-research-area-search-information-retrieval","msr-locale-en_us","msr-archive-status-active"],"msr_project_start":"2001-11-05","related-publications":[144958,151099,151241,152322,152409],"related-downloads":[],"related-videos":[],"related-groups":[],"related-events":[],"related-opportunities":[],"related-posts":[],"related-articles":[],"tab-content":[],"related-researchers":[],"msr_research_lab":[],"msr_impact_theme":[],"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/169805","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-project"}],"version-history":[{"count":3,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/169805\/revisions"}],"predecessor-version":[{"id":603543,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/169805\/revisions\/603543"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=169805"}],"wp:term":[{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=169805"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=169805"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=169805"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=169805"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}