{"id":1031460,"date":"2024-08-16T12:50:17","date_gmt":"2024-08-16T19:50:17","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-project&#038;p=1031460"},"modified":"2025-01-08T09:02:34","modified_gmt":"2025-01-08T17:02:34","slug":"guidance-control-lm-output","status":"publish","type":"msr-project","link":"https:\/\/www.microsoft.com\/en-us\/research\/project\/guidance-control-lm-output\/","title":{"rendered":"guidance | control LM output"},"content":{"rendered":"<section class=\"mb-3 moray-highlight\">\n\t<div class=\"card-img-overlay mx-lg-0\">\n\t\t<div class=\"card-background  has-background- card-background--full-bleed\">\n\t\t\t<img loading=\"lazy\" decoding=\"async\" width=\"1920\" height=\"720\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/05\/guidance_header_1920x720.jpg\" class=\"attachment-full size-full\" alt=\"background pattern\" style=\"\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/05\/guidance_header_1920x720.jpg 1920w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/05\/guidance_header_1920x720-300x113.jpg 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/05\/guidance_header_1920x720-1024x384.jpg 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/05\/guidance_header_1920x720-768x288.jpg 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/05\/guidance_header_1920x720-1536x576.jpg 1536w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/05\/guidance_header_1920x720-1600x600.jpg 1600w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/05\/guidance_header_1920x720-240x90.jpg 240w\" sizes=\"auto, (max-width: 1920px) 100vw, 1920px\" \/>\t\t<\/div>\n\t\t<!-- Foreground -->\n\t\t<div class=\"card-foreground d-flex mt-md-n5 my-lg-5 px-g px-lg-0\">\n\t\t\t<!-- Container -->\n\t\t\t<div class=\"container 
d-flex mt-md-n5 my-lg-5 \">\n\t\t\t\t<!-- Card wrapper -->\n\t\t\t\t<div class=\"w-100 w-lg-col-5\">\n\t\t\t\t\t<!-- Card -->\n\t\t\t\t\t<div class=\"card material-md-card py-5 px-md-5\">\n\t\t\t\t\t\t<div class=\"card-body \">\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\n\n<h1 class=\"wp-block-heading\" id=\"guidance\">Guidance<\/h1>\n\n\n\n<p>Control LM outputs. Reduce latency and cost.<\/p>\n\n\t\t\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/div>\n<\/section>\n\n\n\n\n\n<h2 class=\"wp-block-heading\" id=\"get-the-lm-output-you-need-with-a-single-prompt\">Get the LM output you need with a single prompt<\/h2>\n\n\n\n<p>Guidance is a <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/github.com\/guidance-ai\/guidance\">proven open-source Python library<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> for controlling outputs of any language model (LM). In only one API call, developers express in Python the precise programmatic constraints the model must follow for structured output in JSON, Python, HTML, SQL, or whatever the use case requires. 
<\/p>\n\n\n\n<h5 class=\"wp-block-heading has-text-align-left\" id=\"the-result-100-guaranteed-output-structure-with-30-50-reduction-in-latency-and-costs\">The result: 100% guaranteed output structure\u2014with 30\u201350% reduction in latency and costs.<\/h5>\n\n\n\n<div class=\"wp-block-buttons is-layout-flex wp-block-buttons-is-layout-flex\">\n<div class=\"wp-block-button is-style-fill-github\"><a data-bi-type=\"button\" class=\"wp-block-button__link wp-element-button\" href=\"https:\/\/github.com\/guidance-ai\/guidance\" target=\"_blank\" rel=\"noreferrer noopener\">Get started with {guidance}<\/a><\/div>\n<\/div>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"steering-the-model-token-by-token\">Steering the model token by token <\/h3>\n\n\n\n<p>Guidance works with most open-source LMs that can be hosted locally. Fundamentally different from conventional prompting techniques, Guidance enforces constraints by steering the model token by token in the inference layer to deliver accurate outputs. No need for expensive retries or fine-tuning. The Guidance advantage includes:<\/p>\n\n\n\n<div style=\"height:16px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<div class=\"wp-block-columns is-layout-flex wp-container-core-columns-is-layout-9d6595d7 wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\">\n<h4 class=\"wp-block-heading\" id=\"cost-savings-1\">Cost savings<\/h4>\n\n\n\n<p>Save significantly on runtime, while accelerating inference. In contrast to prompt chaining, Guidance programs are a single API call. A Guidance program batches\u2014instead of generating\u2014any additional text that is added by the user as execution unrolls. 
<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/github.com\/guidance-ai\/guidance?tab=readme-ov-file#guidance-acceleration\" target=\"_blank\" rel=\"noopener noreferrer\">See example<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>.<\/p>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\">\n<h4 class=\"wp-block-heading\" id=\"flexibility-1\">Flexibility<\/h4>\n\n\n\n<p>Get structured LM output in any specified format. Guidance is uniquely flexible compared with alternative technologies, enabling developers to constrain outputs to JSON, Python, HTML, SQL, whatever is required. Enforce other constraints with <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/github.com\/guidance-ai\/guidance?tab=readme-ov-file#select-basic\" target=\"_blank\" rel=\"noopener noreferrer\">select<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> (i.e., a set of options), <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/github.com\/guidance-ai\/guidance?tab=readme-ov-file#regular-expressions\" target=\"_blank\" rel=\"noopener noreferrer\">regular expressions<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, and <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/github.com\/guidance-ai\/guidance?tab=readme-ov-file#context-free-grammars\" target=\"_blank\" rel=\"noopener noreferrer\">context-free grammars<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>.<\/p>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\">\n<h4 class=\"wp-block-heading\" id=\"elegant-workflow\">Elegant workflow<\/h4>\n\n\n\n<p>Write constraints in pure Python and Guidance enforces syntax for a smooth developer workflow. 
The Guidance interface and library functionality are designed to reduce developer pain. <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/github.com\/guidance-ai\/guidance?tab=readme-ov-file#call-and-deploy-tools-easily-with-automatic-interleaving-of-control-and-generation\" target=\"_blank\" rel=\"noopener noreferrer\">Call and deploy tools easily<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>. Access rich templates with f-strings and prebuilt components (e.g., substrings). <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/github.com\/guidance-ai\/guidance?tab=readme-ov-file#features\" target=\"_blank\" rel=\"noopener noreferrer\">More &#8230;<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n<\/div>\n<\/div>\n\n\n\n<div class=\"wp-block-media-text has-video  has-vertical-margin-small  has-vertical-padding-none  is-stacked-on-mobile is-style-border\"><figure class=\"wp-block-media-text__media video-wrapper\"><div class=\"yt-consent-placeholder\" role=\"region\" aria-label=\"Video playback requires cookie consent\" data-video-id=\"742R2gPvZAQ\" data-poster=\"https:\/\/img.youtube.com\/vi\/742R2gPvZAQ\/maxresdefault.jpg\"><iframe class=\"media-text__video\" data-src=\"https:\/\/www.youtube-nocookie.com\/embed\/742R2gPvZAQ?enablejsapi=1&rel=0\" frameborder=\"0\" allow=\"accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture\" allowfullscreen aria-hidden=\"true\" tabindex=\"-1\"><\/iframe><div class=\"yt-consent-placeholder__overlay\"><button class=\"yt-consent-placeholder__play\"><svg width=\"42\" height=\"42\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" aria-hidden=\"true\" focusable=\"false\"><g fill=\"none\" fill-rule=\"evenodd\"><circle fill=\"#000\" opacity=\".556\" cx=\"21\" cy=\"21\" r=\"21\"\/><path stroke=\"#FFF\" d=\"M27.5 22l-12 8.5v-17z\"\/><\/g><\/svg><span class=\"yt-consent-placeholder__label\">Video 
playback requires cookie consent<\/span><\/button><\/div><\/div><\/figure><div class=\"wp-block-media-text__content\">\n<h3 class=\"wp-block-heading\" id=\"building-lm-powered-apps\">Building LM-powered apps?<\/h3>\n\n\n\n<p><strong>Learn in this video how Guidance works to give you unprecedented control of LM outputs.<\/strong><\/p>\n\n\n\n<p>Developers love Guidance. With more than 19K GitHub stars, Guidance has a thriving community of developers and researchers.<\/p>\n\n\n\n<div class=\"wp-block-buttons is-layout-flex wp-block-buttons-is-layout-flex\">\n<div class=\"wp-block-button is-style-fill-github\"><a data-bi-type=\"button\" class=\"wp-block-button__link wp-element-button\" href=\"https:\/\/github.com\/guidance-ai\/guidance\" target=\"_blank\" rel=\"noreferrer noopener\">Get started with {guidance}<\/a><\/div>\n<\/div>\n<\/div><\/div>\n\n\n","protected":false},"excerpt":{"rendered":"<p>Guidance | Control LM outputs. Reduce latency and cost. Guidance is a proven open-source Python library for controlling outputs of any language model (LM). In only one API call, developers express in Python the precise programmatic constraints the model must follow for structured output in JSON, Python, HTML, SQL, whatever the use case requires. 
The result: 100% guaranteed output structure\u2014with 30\u201350% reduction in latency and costs.<\/p>\n","protected":false},"featured_media":1031505,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","footnotes":""},"research-area":[13556],"msr-locale":[268875],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-1031460","msr-project","type-msr-project","status-publish","has-post-thumbnail","hentry","msr-research-area-artificial-intelligence","msr-locale-en_us","msr-archive-status-active"],"msr_project_start":"","related-publications":[],"related-downloads":[],"related-videos":[],"related-groups":[],"related-events":[],"related-opportunities":[],"related-posts":[],"related-articles":[],"tab-content":[],"slides":[],"related-researchers":[{"type":"user_nicename","display_name":"Harsha Nori","user_id":41461,"people_section":"Section name 0","alias":"hanori"},{"type":"user_nicename","display_name":"Ben Zorn","user_id":35154,"people_section":"Section name 0","alias":"zorn"},{"type":"user_nicename","display_name":"Dean Carignan","user_id":40087,"people_section":"Section name 0","alias":"dcarig"},{"type":"user_nicename","display_name":"Ece Kamar","user_id":31710,"people_section":"Section name 0","alias":"eckamar"},{"type":"user_nicename","display_name":"Emre Kiciman","user_id":31739,"people_section":"Section name 0","alias":"emrek"},{"type":"user_nicename","display_name":"Eric Horvitz","user_id":32033,"people_section":"Section name 0","alias":"horvitz"},{"type":"user_nicename","display_name":"Forough Poursabzi","user_id":40264,"people_section":"Section name 0","alias":"fpoursabzi"},{"type":"guest","display_name":"Jingya Chen","user_id":767776,"people_section":"Section name 0","alias":""},{"type":"user_nicename","display_name":"Madan Musuvathi","user_id":32766,"people_section":"Section name 
0","alias":"madanm"},{"type":"user_nicename","display_name":"Mihaela Vorvoreanu","user_id":36804,"people_section":"Section name 0","alias":"mivorvor"},{"type":"guest","display_name":"Nicholas King","user_id":761683,"people_section":"Section name 0","alias":""},{"type":"user_nicename","display_name":"Paul Koch","user_id":33207,"people_section":"Section name 0","alias":"paulkoch"},{"type":"user_nicename","display_name":"Richard Edgar","user_id":43326,"people_section":"Section name 0","alias":"riedgar"},{"type":"guest","display_name":"Xavier Fernandes","user_id":761638,"people_section":"Section name 0","alias":""}],"msr_research_lab":[],"msr_impact_theme":[],"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/1031460","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-project"}],"version-history":[{"count":39,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/1031460\/revisions"}],"predecessor-version":[{"id":1116252,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/1031460\/revisions\/1116252"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media\/1031505"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=1031460"}],"wp:term":[{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=1031460"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=1031460"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=1031460"},{"t
axonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=1031460"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}