{"id":654579,"date":"2020-07-22T11:23:08","date_gmt":"2020-07-22T18:23:08","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-project&#038;p=654579"},"modified":"2022-07-22T10:31:53","modified_gmt":"2022-07-22T17:31:53","slug":"prose-framework","status":"publish","type":"msr-project","link":"https:\/\/www.microsoft.com\/en-us\/research\/project\/prose-framework\/","title":{"rendered":"PROSE Framework"},"content":{"rendered":"<h2>Microsoft Program Synthesis using Examples (PROSE)<\/h2>\n<p>Microsoft PROSE is a framework of technologies for programming by examples: automatic generation of programs from input-output examples.<\/p>\n<p>Given a domain-specific language (DSL) and input-output examples for the desired program\u2019s behavior, PROSE synthesizes a ranked set of DSL programs that are consistent with the examples.<\/p>\n<div style=\"height: 30px\"><\/div>\n<p>\t\t\t<div class=\"ms-grid \">\n\t\t\t<div class=\"ms-row\">\n\t\t\t\t\t<div  class=\"m-col-8-24\" >\n\t\t<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/www.nuget.org\/packages\/Microsoft.ProgramSynthesis\/\" target=\"_blank\" rel=\"noopener noreferrer\"><strong>Download NuGet package<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><p>NuGet package for automatically synthesizing programs from examples.<\/p><p>\t<\/div>\n\t \t<div  class=\"m-col-8-24\" >\n\t\t<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/project\/prose-framework\/#!tutorial\"><strong>Tutorial<\/strong><\/a><\/p><p>A step-by-step walkthrough of the process of building a DSL in PROSE and enabling program synthesis for it.<\/p><p>\t<\/div>\n\t \t<div  class=\"m-col-8-24\" >\n\t\t<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/github.com\/microsoft\/prose\" target=\"_blank\" rel=\"noopener noreferrer\"><strong>API and samples<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p><p>If you want to apply an existing PROSE DSL, check out its documentation and our <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/github.com\/microsoft\/prose\" target=\"_blank\" rel=\"noopener noreferrer\">samples<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>.<\/p><p>\t<\/div>\n\t<\/p>\t\t\t<\/div>\n\t\t<\/div>\n\t\t<br \/>\n\t\t\t<div class=\"ms-grid \">\n\t\t\t<div class=\"ms-row\">\n\t\t\t\t\t<div  class=\"m-col-8-24\" >\n\t\t<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/github.com\/microsoft\/prose\/tree\/master\/release-notes\" target=\"_blank\" rel=\"noopener noreferrer\"><strong>Release notes<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><p>Read the release notes for the SDK.<\/p><p>\t<\/div>\n\t \t<div  class=\"m-col-8-24\" >\n\t\t<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/github.com\/Microsoft\/prose\/issues\/new\" target=\"_blank\" rel=\"noopener noreferrer\"><strong>GitHub<\/strong><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p><p>For questions,\u00a0<a href=\"mailto:prose-contact@microsoft.com\">email us<\/a>\u00a0or\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/github.com\/Microsoft\/prose\/issues\/new\" target=\"_blank\" rel=\"noopener noreferrer\">open an issue on GitHub<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>.<\/p><p>\t<\/div>\n\t\t<div  class=\"m-col-8-24\" >\n\t\t<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/project\/prose-framework\/#!faq\"><strong>Datasets<\/strong><\/a><\/p><p>Get datasets for some DSLs in our SDK.<\/p><p>\t<\/div>\n\t<\/p>\t\t\t<\/div>\n\t\t<\/div>\n\t\t<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Microsoft PROSE SDK is a framework of technologies for programming by examples: automatic generation of programs from input-output examples.<\/p>\n","protected":false},"featured_media":674232,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","footnotes":""},"research-area":[13545,13560],"msr-locale":[268875],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-654579","msr-project","type-msr-project","status-publish","has-post-thumbnail","hentry","msr-research-area-human-language-technologies","msr-research-area-programming-languages-software-engineering","msr-locale-en_us","msr-archive-status-active"],"msr_project_start":"","related-publications":[453858,564657,641625,335339,335864],"related-downloads":[],"related-videos":[],"related-groups":[663303],"related-events":[],"related-opportunities":[],"related-posts":[],"related-articles":[],"tab-content":[{"id":0,"name":"Tutorial","content":"<h2>Tutorial<\/h2>\r\nThe core component of the PROSE SDK is its program synthesis framework for custom domain-specific languages (DSLs). It allows you to define a DSL that describes a typical space of tasks in your application domain, and automatically provides parsing, execution, and synthesis technologies for this DSL. <a href=\"https:\/\/microsoft.github.io\/prose\/documentation\/transformation-text\/intro\">Text transformation<\/a> and <a href=\"https:\/\/microsoft.github.io\/prose\/documentation\/extraction-text\/intro\">text extraction<\/a> DSLs are programming-by-example technologies that have been developed on top of the PROSE core synthesis framework.\r\n\r\nThis is a step-by-step tutorial on using the PROSE framework to <em>create new DSLs<\/em>. To use an existing DSL in your programming-by-examples application, please check out its individual documentation link on the left.\r\n<h3>DSL Structure<\/h3>\r\nA DSL consists of four main components:\r\n<ol>\r\n \t<li><strong>Syntax<\/strong> \u2013 a <em>context-free grammar<\/em> describing a space of possible programs in a DSL.<\/li>\r\n \t<li><strong>Semantics<\/strong> \u2013 an executable function for each user-defined DSL operator.<\/li>\r\n \t<li>[optional] <strong>Witness functions<\/strong> \u2013 small \u201cinverse semantics\u201d functions that enable <a href=\"https:\/\/github.com\/microsoft\/prose\/blob\/140fd35772586272d24dd31a77bd06993d483cd6\/documentation\/prose\/backpropagation.md\">deductive backpropagation<\/a>, the main synthesis technology provided in the PROSE framework.<\/li>\r\n \t<li>[optional] <strong>Features<\/strong> \u2013 computed attributes on individual programs in the DSL (for instance, a syntactic score of a program for ranking purposes).<\/li>\r\n<\/ol>\r\nBelow, we illustrate the usage of all four components on a small example DSL \u2013 a portion of FlashFill.\r\n\r\n<hr style=\"border-top: 1px solid #008272\" \/>\r\n\r\nSee sections on: <strong>Syntax<\/strong> | <a href=\"#semantics\">Semantics<\/a> | <a href=\"#witnessfunctions\">Witness functions<\/a> | <a href=\"#features\">Features<\/a>\r\n\r\n<hr style=\"border-top: 1px solid #008272\" \/>\r\n\r\n<div id=\"syntax\" style=\"height: 30px\"><\/div>\r\n<h4>Syntax<\/h4>\r\nOur example DSL describes a space of programs that extract a substring from a given string. They can do it in two possible ways \u2013 either extract a substring based on absolute position indices, or based on matches of regular expressions.\r\n\r\nHere\u2019s the first version of the grammar of our DSL:\r\n\r\n[code autolinks=\"true\" highlight=\"\"]\r\n#reference 'file:SubstringExtraction.dll';\r\nusing semantics SubstringExtraction.Semantics;\r\nlanguage SubstringExtraction;\r\n\r\n@input string x;\r\n\r\n\/\/ Extract a substring from 'x' between positions 'posPair'\r\n@start string out := Substring(x, posPair);\r\nTuple &lt;int?, int?&gt; posPair := PositionPair(pos, pos)\r\n    = Pair(pos, pos);\r\nint? pos := \/\/ A position at index 'k' (from the left if k &gt;= 0, or from the right if k &lt; 0)\r\n    AbsolutePosition(x, k)\r\n\t\/\/ A position where two regexes 'positionBoundaries' match to left and to the right,\r\n\t\/\/ respectively, and it is the 'k'-th such position\r\n\t| RegexPosition(x, positionBoundaries, k);\r\nTuple &lt;Regex, Regex&gt; positionBoundaries := RegexPair(r, r)\r\n    = Pair(r, r);\r\n\t\r\nRegex r;\r\nint k;\r\n[\/code]\r\n\r\nHere are some possible extraction programs contained in the <code class=\"language-plaintext highlighter-rouge\">SubstringExtraction<\/code> DSL:\r\n<ul>\r\n \t<li>First 5 characters:\r\n\r\n[code autolinks=\"true\" highlight=\"\"]\r\nSubstring(x, PositionPair(AbsolutePosition(x, 0), AbsolutePosition(x, 5)))\r\n[\/code]\r\n\r\n<\/li>\r\n \t<li>Last character:\r\n\r\n[code autolinks=\"true\" highlight=\"\"]\r\nSubstring(x, PositionPair(AbsolutePosition(x, -2), AbsolutePosition(x, -1)))[\/code]\r\n\r\n<\/li>\r\n \t<li>A substring from the start until (but not including) the last number<sup id=\"fnref:regex\"><a class=\"footnote\" href=\"regex\">1<\/a><\/sup>:\r\n\r\n[code autolinks=\"true\" highlight=\"\"]\r\nSubstring(x, PositionPair(AbsolutePosition(x, 0), RegexPosition(x, RegexPair(\/\/, \/\\d+\/), -1))[\/code]\r\n\r\n<\/li>\r\n \t<li>A substring between the first pair of parentheses:\r\n\r\n[code autolinks=\"true\" highlight=\"\"]\r\nSubstring(x, PositionPair(RegexPosition(x, RegexPair(\/\\(\/, \/\/), 0), RegexPosition(x, RegexPair(\/\/, \/\\)\/), 0)))[\/code]\r\n\r\nNote that this program will not extract the desired content if <em>x<\/em> contains unbalanced parentheses (for instance, <em>x<\/em> = <code class=\"language-plaintext highlighter-rouge\">\"a) Bread ($2.00)\"<\/code>). This problem can be addressed by a language extension.<\/li>\r\n<\/ul>\r\nIn general, a DSL consists of <strong>symbols<\/strong> and <strong>operators<\/strong> upon these symbols. In a context-free grammar, a DSL is represented as a set of <em>rules<\/em>, where each symbol on the left-hand side is bound to a set of possible operators on the right-hand side that represent this symbol. Every operator must be <strong>pure<\/strong> \u2013 it should not have observable side effects. In other words, PROSE DSLs are functional \u2013 they operate upon immutable states.\r\n\r\nA <strong>program<\/strong> in a DSL transforms <strong>input<\/strong> data into <strong>output<\/strong> data. One terminal symbol in a DSL should be marked as <code class=\"language-plaintext highlighter-rouge\">@input<\/code> \u2013 this is the input variable to all programs in this DSL. One nonterminal symbol in a DSL should be marked as <code class=\"language-plaintext highlighter-rouge\">@start<\/code> \u2013 this is the root symbol for all programs in the DSL.\r\n\r\nA program is represented as an <strong>abstract syntax tree (AST)<\/strong> of the DSL operators. Each node in this tree (similarly, each DSL operator) has some invocation semantics. More formally, each AST node has a method <code class=\"language-plaintext highlighter-rouge\">Invoke<\/code>. It takes as input a <strong>state<\/strong> &amp;sigma; and returns some output.\r\n\r\nA state is a mapping from DSL variables to their values. Initially, the topmost AST node invoked on a state with a single variable binding for the DSL\u2019s <em>input variable<\/em>.\r\n\r\nHere\u2019s how you can parse and execute a program in our <code class=\"language-plaintext highlighter-rouge\">SubstringExtraction<\/code> DSL:\r\n\r\n[code autolinks=\"true\" highlight=\"\"]\r\nusing Microsoft.ProgramSynthesis;\r\nusing Microsoft.ProgramSynthesis.AST;\r\nusing Microsoft.ProgramSynthesis.Compiler;\r\n\r\n\/\/ Parse the DSL grammar above, saved in a .grammar file\r\nvar grammar = DSLCompiler.ParseGrammarFromFile(\"SubstringExtraction.grammar\").Value;\r\n\/\/ Parse a program in this grammar. PROSE supports two serialization formats:\r\n\/\/ \"human-readable\" expression format, used in this tutorial, and machine-readable XML.\r\nvar ast =\r\n  ProgramNode.Parse(\"Substring(x, PositionPair(AbsolutePosition(x, 0), AbsolutePosition(x, 5)))\",\r\n                    grammar, ASTSerializationFormat.HumanReadable);\r\n\/\/ Create an input state to the program. It contains one binding: a variable 'x' (DSL input)\r\n\/\/ is bound to the string \"PROSE Rocks\".\r\nvar input = State.Create(grammar.InputSymbol, \"PROSE Rocks\");\r\n\/\/ Execute the program on the input state.\r\nvar output = (string) ast.Invoke(input);\r\nAssert(output == \"PROSE\");\r\n[\/code]\r\n\r\nA <code class=\"language-plaintext highlighter-rouge\">ParseGrammarFromFile<\/code> invocation returns a <code class=\"language-plaintext highlighter-rouge\">Result&amp;lt;Grammar&amp;gt;<\/code>, which either holds a valid parsed grammar in its <code class=\"language-plaintext highlighter-rouge\">Value<\/code> property, or a collection of errors\/warnings in its <code class=\"language-plaintext highlighter-rouge\">Diagnostics<\/code> property. You can quickly examine all diagnostics by calling <code class=\"language-plaintext highlighter-rouge\">result.TraceDiagnostics()<\/code>.\r\n\r\nAt this moment grammar parsing will fail since we haven\u2019t defined any execution semantics for the operators in our DSL, only its syntax. Let\u2019s fix that.\r\n<div><a href=\"#top\">Back to top &gt;<\/a><\/div>\r\n\r\n<hr \/>\r\n\r\n<div id=\"semantics\" style=\"height: 30px\"><\/div>\r\n<h4>Semantics<\/h4>\r\nAn <em>executable semantics<\/em> for an operator\u00a0<strong>F<\/strong> in PROSE is a .NET function that matches the signature of <strong>F<\/strong>. You need to provide it for every operator in your DSL that is not imported from the standard library of PROSE or another language. In our example, such operators are <code class=\"language-plaintext highlighter-rouge\">Substring<\/code>, <code class=\"language-plaintext highlighter-rouge\">AbsolutePosition<\/code>, and <code class=\"language-plaintext highlighter-rouge\">RegexPosition<\/code>.\r\n\r\nAll semantics functions should be defined as static methods. Static classes where PROSE searches for such functions (called <em>semantics holders<\/em>) are indicated in the grammar with a <code class=\"language-plaintext highlighter-rouge\">using semantics<\/code> declaration. A DSL may contain multiple <code class=\"language-plaintext highlighter-rouge\">using semantics<\/code> declarations, but each operator should correspond to exactly one semantics function with the same name and signature in one of semantics holders.\r\n\r\n[code autolinks=\"true\" highlight=\"\"]\r\nusing System.Collections.Generic;\r\nusing System.Linq;\r\nusing System.Text.RegularExpressions;\r\n\r\nstatic class Semantics\r\n{\r\n    static string Substring(string x, Tuple&lt;int?, int?&gt; posPair)\r\n    {\r\n        if (posPair.Item1 == null || posPair.Item2 == null)\r\n            return null;\r\n        int start = posPair.Item1.Value;\r\n        int end = posPair.Item2.Value;\r\n        if (start &lt; 0 || start &gt;= x.Length ||\r\n            end &lt; 0 || end &gt;= x.Length || end &lt; start)\r\n            return null;\r\n        return x.Substring(start, end - start);\r\n    }\r\n\r\n    static int? AbsolutePosition(string x, int k)\r\n    {\r\n        if (k &gt; x.Length || k &lt; -x.Length - 1)\r\n            return null;\r\n        return k &gt;= 0 ? k : (x.Length + k + 1);\r\n    }\r\n\r\n    static int? RegexPosition(string x, Tuple&lt;Regex, Regex&gt; regexPair, int occurrence)\r\n    {\r\n        if (regexPair.Item1 == null || regexPair.Item2 == null)\r\n            return null;\r\n        Regex left = regexPair.Item1;\r\n        Regex right = regexPair.Item2;\r\n        var rightMatches = right.Matches(x).Cast&lt;Match&gt;().ToDictionary(m =&gt; m.Index);\r\n        var matchPositions = new List&lt;int&gt;();\r\n        foreach (Match m in left.Matches(x))\r\n        {\r\n            if (rightMatches.ContainsKey(m.Index + m.Length))\r\n                matchPositions.Add(m.Index + m.Length);\r\n        }\r\n        if (occurrence &gt;= matchPositions.Count ||\r\n            occurrence &lt; -matchPositions.Count)\r\n            return null;\r\n        return occurrence &gt;= 0\r\n            ? matchPositions[occurrence]\r\n            : matchPositions[matchPositions.Count + occurrence];\r\n    }\r\n}\r\n[\/code]\r\n\r\nThis code illustrates several important points that you should keep in mind when designing a DSL:\r\n<ul>\r\n \t<li>DSL operators must be <em>total<\/em> (return a value for each possible combination of inputs) and <em>pure<\/em> (deterministic without observable side effects). An invalid input or any other exceptional situation should be handled not by throwing an exception, but by returning <code class=\"language-plaintext highlighter-rouge\">null<\/code> instead. In PROSE, <code class=\"language-plaintext highlighter-rouge\">null<\/code> is a valid value with a meaning \u201ccomputation failed\u201d.<\/li>\r\n \t<li>Semantics functions should have the same name and signature as their corresponding DSL operators. They don\u2019t have access to the current input state \u2013 if you need to access a DSL variable in scope, include it explicitly as a parameter. In our example, <code class=\"language-plaintext highlighter-rouge\">x<\/code> is a parameter for both <code class=\"language-plaintext highlighter-rouge\">AbsolutePosition<\/code> and <code class=\"language-plaintext highlighter-rouge\">RegexPosition<\/code>.<\/li>\r\n<\/ul>\r\n<blockquote><strong>Note:<\/strong> The <code class=\"language-plaintext highlighter-rouge\">dslc<\/code> grammar compiler uses reflection to find definitions of external components of a grammar, such as semantics functions. It searches over the assemblies specified with <code class=\"language-plaintext highlighter-rouge\">reference<\/code> statements in the grammar. Those assemblies must be built and present at given locations when you execute <code class=\"language-plaintext highlighter-rouge\">dslc<\/code> (in a command-line or API form). If you build your semantics functions and your grammar definition in the same solution, make sure to separate them into different projects and make the grammar project depend on the semantics project, so that the latter one is built first.\r\n\r\nIf you generate your project template using our <a href=\"https:\/\/www.npmjs.com\/package\/generator-prose\">Yeoman generator<\/a>, its projects will be pre-arranged in this fashion automatically.<\/blockquote>\r\nSyntax and semantics above constitute a minimal DSL definition. They are sufficient for our little parsing\/execution sample to work. Let\u2019s proceed now to synthesizing programs in this DSL.\r\n\r\n<hr \/>\r\n\r\n<h4>Synthesis<\/h4>\r\nPROSE comes with a default synthesis strategy which we call <a href=\"https:\/\/microsoft.github.io\/prose\/documentation\/prose\/backpropagation\">deductive backpropagation<\/a>. It also enables researchers in the field of synthesis to develop their own strategies on top of its common API. However, in this tutorial we explain how to use backpropagation for synthesis of programs in our <code class=\"language-plaintext highlighter-rouge\">SubstringExtraction<\/code> DSL.\r\n\r\nProgram synthesis starts with a specification: what do we want from a desired program? In PROSE, specifications are <em>inductive<\/em>: they specify an output of a desired program on some input state, or, more generally, some property of this output.\r\n\r\nIn this tutorial, we start with the simplest form of such a spec \u2013 <code class=\"language-plaintext highlighter-rouge\">ExampleSpec<\/code>, an input-output example. Given a spec, we invoke a learning session on it, generating a set of programs in the DSL that are consistent with the input-output examples in the spec.\r\n\r\n[code autolinks=\"true\" highlight=\"\"]\r\nusing Microsoft.ProgramSynthesis.Specifications;\r\nusing Microsoft.ProgramSynthesis.Learning;\r\n\r\nvar input = State.Create(grammar.InputSymbol, \"PROSE Rocks\");\r\nstring desiredOutput = \"PROSE\";\r\nvar spec = new ExampleSpec(new Dictionary&lt;State, object&gt; { [input] = desiredOutput });\r\nvar engine = new SynthesisEngine(grammar);\r\nProgramSet learned = engine.LearnGrammar(spec);\r\nAssert(learned?.Size &gt; 0);\r\n[\/code]\r\n\r\nAt this moment the learning result will be empty. The reason for this is that PROSE does not have information about your DSL to perform any kind of reasoning over it. For instance, terminal symbols <code class=\"language-plaintext highlighter-rouge\">k<\/code> and <code class=\"language-plaintext highlighter-rouge\">r<\/code> should be replaced with literal <code class=\"language-plaintext highlighter-rouge\">int<\/code> or <code class=\"language-plaintext highlighter-rouge\">Regex<\/code> constants, respectively, in each generated program. However, they are seemingly unbounded: any integer or regular expression in the world could possibly be present in a desired program, thus our search space is effectively infinite.\r\n\r\nWhat about the specification, then?\r\nAn input-output example that we provided drastically restricts the search space size. For instance, the input string <code class=\"language-plaintext highlighter-rouge\">\"PROSE Rocks\"<\/code> is 11 characters long, hence any absolute position extraction logic <code class=\"language-plaintext highlighter-rouge\">AbsolutePosition(x, k)<\/code> with\u00a0<strong>k<\/strong> &amp;gt; 12 or\u00a0<strong>k<\/strong> &amp;lt; -11 cannot be consistent with the spec. What we just did here was backwards reasoning over the DSL structure: we deduced a constraint on <code class=\"language-plaintext highlighter-rouge\">k<\/code> in a desired program from a constraint on the entire program.\r\n\r\nTo do that, we essentially <em>inverted the semantics of <code class=\"language-plaintext highlighter-rouge\">AbsolutePosition<\/code><\/em>, deducing its inputs (or their properties) given the output. In PROSE, such a procedure is called a <em>witness function<\/em>, and it is a surprisingly simple way to specify immensely powerful hints for the learning process.\r\n<div><a href=\"#top\">Back to top &gt;<\/a><\/div>\r\n\r\n<hr \/>\r\n\r\n<div id=\"witnessfunctions\" style=\"height: 30px\"><\/div>\r\n<h3>Witness Functions<\/h3>\r\nA witness function is defined for a <em>parameter<\/em> of a DSL operator. In its simplest form a witness function deduces a specification on that parameter given a specification on the entire operator. A witness function does not by itself constitute a learning algorithm (or even a substantial portion of it), it is simply a domain-specific property of some operator in your language \u2013 its inverse semantics.\r\n\r\nFor instance, the first witness function we\u2019ll write in this tutorial is defined for the parameter <code class=\"language-plaintext highlighter-rouge\">posPair<\/code> of the rule <code class=\"language-plaintext highlighter-rouge\">Substring(x, posPair)<\/code> of our <code class=\"language-plaintext highlighter-rouge\">SubstringExtraction<\/code> DSL. It takes as input an <code class=\"language-plaintext highlighter-rouge\">ExampleSpec<\/code> &amp;phi; on an output of <code class=\"language-plaintext highlighter-rouge\">Substring(x, posPair)<\/code>, and deduces a spec &amp;phi; on an output of <code class=\"language-plaintext highlighter-rouge\">posPair<\/code> subexpression that is necessary (or even better, necessary and sufficient) for the entire expression to satisfy &amp;phi;.\r\n\r\nConsider a program <code class=\"language-plaintext highlighter-rouge\">Substring(x, posPair)<\/code> that outputs <code class=\"language-plaintext highlighter-rouge\">\"PROSE\"<\/code> on a given input state { <code class=\"language-plaintext highlighter-rouge\">x<\/code> := <code class=\"language-plaintext highlighter-rouge\">\"PROSE Rocks\"<\/code> }. What could be a possible spec on <code class=\"language-plaintext highlighter-rouge\">posPair<\/code>? Clearly, we know it precisely for the given example: <code class=\"language-plaintext highlighter-rouge\">posPair<\/code>, whatever this program is, must evaluate to <code class=\"language-plaintext highlighter-rouge\">(0, 5)<\/code> because this is the only occurrence of the string <code class=\"language-plaintext highlighter-rouge\">\"PROSE\"<\/code> in the given input <code class=\"language-plaintext highlighter-rouge\">\"PROSE Rocks\"<\/code>.\r\n\r\nIn a more complex example, however, there is no single answer. For instance, suppose <em>x<\/em> = <code class=\"language-plaintext highlighter-rouge\">\"(555) 279-2261\"<\/code>, and the corresponding desired output in a spec is <code class=\"language-plaintext highlighter-rouge\">\"2\"<\/code>. In this case, the substring <code class=\"language-plaintext highlighter-rouge\">\"2\"<\/code> could have been extracted from 3 different places in the input string. Therefore, instead of <em>witnessing<\/em> a single output value for <code class=\"language-plaintext highlighter-rouge\">posPair<\/code> on a given input, in this case we witness a <em>disjunction<\/em> of three possible output values. A disjunction of possible outputs has its own representative spec type in PROSE \u2013 <code class=\"language-plaintext highlighter-rouge\">DisjunctiveExamplesSpec<\/code>.\r\n\r\nThe two cases above lead us to a general definition of a witness function for <code class=\"language-plaintext highlighter-rouge\">posPair<\/code>: find all occurrences of the desired output string in the input, and return a disjunction of them. In PROSE, you express it in the following way:\r\n\r\n[code autolinks=\"true\" highlight=\"\"]\r\nusing Microsoft.ProgramSynthesis;\r\nusing Microsoft.ProgramSynthesis.Specifications;\r\nusing Microsoft.ProgramSynthesis.Learning;\r\n\r\nclass WitnessFunctions : DomainLearningLogic\r\n{\r\n    public WitnessFunctions(Grammar grammar) : base(grammar) { }  \r\n  \r\n    [WitnessFunction(\"Substring\", parameterIndex: 1)]\r\n    DisjunctiveExamplesSpec WitnessPositionPair(GrammarRule rule, ExampleSpec spec)\r\n    {\r\n        var result = new Dictionary&lt;State, IEnumerable&lt;object&gt;&gt;();\r\n        foreach (var example in spec.Examples)\r\n        {\r\n            State inputState = example.Key;\r\n            \/\/ the first parameter of Substring is the variable symbol 'x'\r\n            \/\/ we extract its current bound value from the given input state\r\n            var x = (string) inputState[rule.Body[0]];\r\n            var substring = (string) example.Value;\r\n            var occurrences = new List&lt;Tuple&lt;int?, int?&gt;&gt;();\r\n            \/\/ Iterate over all occurrences of 'substring' in 'x',\r\n            \/\/ and add their position boundaries to the list of possible outputs for posPair.\r\n            for (int i = x.IndexOf(substring);\r\n                 i &gt;= 0;\r\n                 i = x.IndexOf(substring, i + 1))\r\n            {\r\n                occurrences.Add(Tuple.Create((int?) i, (int?) i + substring.Length));\r\n            }\r\n            if (occurrences.Count == 0) return null;\r\n            result[inputState] = occurrences;\r\n        }\r\n        return new DisjunctiveExamplesSpec(result);\r\n    }\r\n}\r\n[\/code]\r\n\r\nWe put this witness function in a class called, for instance, <code class=\"language-plaintext highlighter-rouge\">SubstringExtraction.WitnessFunctions<\/code>. Such a class holds <em>domain learning logic<\/em> \u2013 all hints and annotations that a DSL designer wants to provide to help the PROSE synthesis engine.\r\n<blockquote><code class=\"language-plaintext highlighter-rouge\">WitnessFunctions<\/code> inherits from <code class=\"language-plaintext highlighter-rouge\">Microsoft.ProgramSynthesis.Learning.DomainLearningLogic<\/code>, and must override a constructor to provide its base class with the grammar of the DSL it is defined for. Usually it\u2019s easiest to supply this grammar to the constructor of <code class=\"language-plaintext highlighter-rouge\">WitnessFunctions<\/code> since it is not accessible from within the <code class=\"language-plaintext highlighter-rouge\">WitnessFunction<\/code> \u2013 the grammar definition depends on domain learning logic, not the other way around.<\/blockquote>\r\nA domain learning logic is specified in the grammar file similarly to a semantics holder, with its own statement:\r\n\r\n[code autolinks=\"true\" highlight=\"\"]\r\nusing learners SubstringExtraction.WitnessFunctions;\r\n[\/code]\r\n\r\nSimilarly to semantics, a grammar may contain multiple <code class=\"language-plaintext highlighter-rouge\">using learners<\/code> statements.\r\n<h4>Writing witness functions<\/h4>\r\nSome important points on writing witness functions:\r\n<ul>\r\n \t<li>A witness function is defined for one parameter of an operator, not the entire operator.<\/li>\r\n \t<li>A witness function takes as input a spec on the output of <em>the entire operator expression<\/em>, and outputs a spec on the output of <em>one parameter program in that expression<\/em>.<\/li>\r\n \t<li>Ideally, the spec produced by a witness function is necessary and sufficient to satisfy the given outer spec. You can also write an <em>imprecise<\/em> witness function, whose produced spec is only necessary for the outer spec to hold (in other words, it is an <em>overapproximation<\/em>). Such a witness function says \u201cI cannot constrain this parameter precisely, but I can narrow down the space of possibilities. All valid parameter programs should satisfy my produced spec, but there may be some invalid ones that also satisfy it.\u201d\r\nTo mark a witness function as imprecise, add a property <code class=\"language-plaintext highlighter-rouge\">Verify = true<\/code> to its <code class=\"language-plaintext highlighter-rouge\">[WitnessFunction]<\/code> attribute.<\/li>\r\n \t<li>You don\u2019t need to define witness functions for parameters that are grammar variables in a state (such as <code class=\"language-plaintext highlighter-rouge\">x<\/code>). More generally, you don\u2019t need to define witness functions for a parameter\u00a0<strong>p<\/strong> if all DSL programs that may derive from\u00a0<strong>p<\/strong> do not include any constant literals.<\/li>\r\n \t<li>You don\u2019t need to define witness functions for operators from the standard library (with some exceptions).<\/li>\r\n \t<li>Returning <code class=\"language-plaintext highlighter-rouge\">null<\/code> from a witness function means \u201cThe given spec is inconsistent, no program can possibly satisfy it.\u201d<\/li>\r\n \t<li><strong>Hint:<\/strong> use C# <code class=\"language-plaintext highlighter-rouge\">nameof<\/code> operator with your <code class=\"language-plaintext highlighter-rouge\">Semantics<\/code> implementations to provide an operator name in the <code class=\"language-plaintext highlighter-rouge\">[WitnessFunction]<\/code> attribute.<\/li>\r\n<\/ul>\r\n<h4>Absolute positions<\/h4>\r\nCovering more DSL operators with witness functions is straightforward. The next one witnesses <code class=\"language-plaintext highlighter-rouge\">k<\/code> in the <code class=\"language-plaintext highlighter-rouge\">AbsolutePosition<\/code> operator.\r\n\r\nGiven an example of position <strong>l<\/strong> that <code class=\"language-plaintext highlighter-rouge\">AbsolutePosition(x, k)<\/code> produced, <code class=\"language-plaintext highlighter-rouge\">k<\/code> must have been one of two options: the offset of <strong>l<\/strong> from the left or from the right in <code class=\"language-plaintext highlighter-rouge\">x<\/code>. We can apply similar logic if we are given not one position <strong>l<\/strong> but a disjunction of them: the witness function just enumerates over each option, collecting all possible values of <code class=\"language-plaintext highlighter-rouge\">k<\/code>.\r\n\r\n[code autolinks=\"true\" highlight=\"\"]\r\n[WitnessFunction(\"AbsolutePosition\", 1)\r\nDisjunctiveExamplesSpec WitnessK(GrammarRule rule, DisjunctiveExamplesSpec spec)\r\n{\r\n    var result = new Dictionary&lt;State, IEnumerable&lt;object&gt;&gt;();\r\n    foreach (var example in spec.DisjunctiveExamples)\r\n    {\r\n        State inputState = example.Key;\r\n        var ks = new HashSet&lt;int?&gt;();\r\n        var x = (string) inputState[rule.Body[0]];\r\n        foreach (int? pos in example.Value)\r\n        {\r\n            ks.Add(pos);\r\n            ks.Add(pos - x.Length - 1);\r\n        }\r\n        if (ks.Count == 0) return null;\r\n        result[inputState] = ks.Cast&lt;object&gt;();\r\n    }\r\n    return new DisjunctiveExamplesSpec(result);\r\n}[\/code]\r\n\r\n<h4>Regex-based positions<\/h4>\r\nOperator <code class=\"language-plaintext highlighter-rouge\">RegexPosition<\/code> needs two witness functions: one for its <code class=\"language-plaintext highlighter-rouge\">rr<\/code> parameter and one for its <code class=\"language-plaintext highlighter-rouge\">k<\/code> parameter. For the first one, we need to learn a list of regular expressions that match to the left and to the right of given position. There are many techniques for doing that; in this tutorial, we assume that we have a predefined list of \u201ccommon\u201d regexes like <code class=\"language-plaintext highlighter-rouge\">\/[0-9]+\/<\/code>, and enumerate them exhaustively at a given position.\r\n<blockquote><strong>Note:<\/strong> to make the semantics of <code class=\"language-plaintext highlighter-rouge\">RegexPosition(x, rr, k)<\/code> and its witness functions consistent, we need to agree on what it means for a regex to \u201cmatch\u201d at a given position. If we take a standard definition of \u201cmatching\u201d, and simply test each regex at each position, we may run into problems when determining the corresponding <code class=\"language-plaintext highlighter-rouge\">k<\/code> for each regex.\r\n\r\nConsider a string <code class=\"language-plaintext highlighter-rouge\">x<\/code> = <code class=\"language-plaintext highlighter-rouge\">\"abc def\"<\/code>. We would like a program <code class=\"language-plaintext highlighter-rouge\">RegexPosition(x, RegexPair(\/\/, \/[a-z]+\/), 1)<\/code> to match before a second word in <code class=\"language-plaintext highlighter-rouge\">x<\/code> \u2013 in this case, at position \u21164. However, for that we need to assume <em>non-overlapping<\/em> semantics of regex matches, since the regex <code class=\"language-plaintext highlighter-rouge\">\/[a-z]+\/<\/code> also matches at positions \u21160, \u21161, and \u21162. In fact, there are 6 matches of this regex in <code class=\"language-plaintext highlighter-rouge\">x<\/code>, but only two \u201cwords\u201d by a \u201ccommon sense\u201d definition. Therefore, instead of testing a regex at each position, we need to first run it against the entire string, record a list of non-overlapping matches, and only then test a position for a match in that list.\r\n\r\nFor computational efficiency, ideally we should <em>cache<\/em> the run of each predefined regex against each input string in the examples before the learning session starts. That way, we avoid recomputing it in each call of <code class=\"language-plaintext highlighter-rouge\">RegexPosition<\/code> semantics and its witness functions. We avoid such caching in this tutorial for simplicity of presentation.<\/blockquote>\r\nHere\u2019s a witness function for <code class=\"language-plaintext highlighter-rouge\">rr<\/code>:\r\n\r\n[code autolinks=\"true\" highlight=\"\"]\r\nRegex[] UsefulRegexes = {\r\n    new Regex(@\"\\w+\"),  \/\/ Word\r\n    new Regex(@\"\\d+\"),  \/\/ Number\r\n    \/\/ ...\r\n};\r\n\r\n\/\/ For efficiency, this function should be invoked only once for each input string before the learning session starts\r\nstatic void BuildStringMatches(string x,\r\n                               out List&lt;Tuple&lt;Match, Regex&gt;&gt;[] leftMatches,\r\n                               out List&lt;Tuple&lt;Match, Regex&gt;&gt;[] rightMatches)\r\n{\r\n    leftMatches = new List&lt;Tuple&lt;Match, Regex&gt;&gt;[inp.Length + 1];\r\n    rightMatches = new List&lt;Tuple&lt;Match, Regex&gt;&gt;[inp.Length + 1];\r\n    for (int p = 0; p &lt;= x.Length; ++p)\r\n    {\r\n        leftMatches[p] = new List&lt;Tuple&lt;Match, Regex&gt;&gt;();\r\n        rightMatches[p] = new List&lt;Tuple&lt;Match, Regex&gt;&gt;();\r\n    }\r\n    foreach (Regex r in UsefulRegexes)\r\n    {\r\n        foreach (Match m in r.Matches(x))\r\n        {\r\n            leftMatches[m.Index + m.Length].Add(Tuple.Create(m, r));\r\n            rightMatches[m.Index].Add(Tuple.Create(m, r));\r\n        }\r\n    }\r\n}\r\n\r\n[WitnessFunction(\"RegexPosition\", 1)]\r\nDisjunctiveExamplesSpec WitnessRegexPair(GrammarRule rule, DisjunctiveExamplesSpec spec)\r\n{\r\n    var result = new Dictionary&lt;State, IEnumerable&lt;object&gt;&gt;();\r\n    foreach (var example in spec.DisjunctiveExamples)\r\n    {\r\n        State inputState = example.Key;\r\n        var x = (string) inputState[rule.Body[0]];\r\n        List&lt;Tuple&lt;Match, Regex&gt;&gt;[] leftMatches, rightMatches;\r\n        BuildStringMatches(x, out leftMatches, out rightMatches);\r\n        var regexes = new List&lt;Tuple&lt;Regex, Regex&gt;&gt;();\r\n        foreach (int? pos in example.Value)\r\n        {\r\n            regexes.AddRange(from l in leftMatches[pos.Value]\r\n                             from r in rightMatches[pos.Value]\r\n                             select Tuple.Create(l.Item2, r.Item2));\r\n        }\r\n        if (regexes.Count == 0) return null;\r\n        result[inputState] = regexes;\r\n    }\r\n    return new DisjunctiveExamplesSpec(result);\r\n}\r\n[\/code]\r\n\r\n<h4>Conditional witness functions<\/h4>\r\nThe last witness function in this tutorial witnesses a match index <code class=\"language-plaintext highlighter-rouge\">k<\/code> for each regex pair in <code class=\"language-plaintext highlighter-rouge\">RegexPosition<\/code>. To write this witness function, an outer spec on <code class=\"language-plaintext highlighter-rouge\">RegexPosition<\/code> alone is insufficient: we can only write it for each individual regex pair, but not for all possible regex pairs at once. Thus, our witness function for <code class=\"language-plaintext highlighter-rouge\">k<\/code> is <em>conditional<\/em> on <code class=\"language-plaintext highlighter-rouge\">rr<\/code>: in addition to an outer spec, it takes an additional input \u2013 a spec on its <em>prerequisite parameter<\/em> <code class=\"language-plaintext highlighter-rouge\">rr<\/code>.\r\n\r\nIn general, the prerequisite spec can be of any kind that provides your witness function any useful information. Typically, an <code class=\"language-plaintext highlighter-rouge\">ExampleSpec<\/code> (i.e., a concrete value of prerequisite \u2013 in this case <code class=\"language-plaintext highlighter-rouge\">rr<\/code>) is the most useful and common prerequisite spec. We use <code class=\"language-plaintext highlighter-rouge\">ExampleSpec<\/code> here to deduce possible indices <code class=\"language-plaintext highlighter-rouge\">k<\/code> for each regex pair in a manner similar to deducing absolute positions above.\r\n\r\n[code autolinks=\"true\" highlight=\"\"]\r\n[WitnessFunction(\"RegexPosition\", 2, DependsOnParameters = new[] { 1 })]\r\nDisjunctiveExamplesSpec WitnessKForRegexPair(GrammarRule rule, DisjunctiveExamplesSpec spec,\r\n                                             ExampleSpec rrSpec)\r\n{\r\n    var result = new Dictionary&lt;State, IEnumerable&lt;object&gt;&gt;();\r\n    foreach (var example in spec.DisjunctiveExamples)\r\n    {\r\n        State inputState = example.Key;\r\n        var x = (string) inputState[rule.Body[0]];\r\n        var regexPair = (Tuple&lt;Regex, Regex&gt;) rrSpec.Examples[inputState];\r\n        Regex left = regexPair.Item1;\r\n        Regex right = regexPair.Item2;\r\n        var rightMatches = right.Matches(x).Cast&lt;Match&gt;().ToDictionary(m =&gt; m.Index);\r\n        var matchPositions = new List&lt;int&gt;();\r\n        foreach (Match m in left.Matches(x))\r\n        {\r\n            if (rightMatches.ContainsKey(m.Index + m.Length))\r\n                matchPositions.Add(m.Index + m.Length);\r\n        }\r\n        var ks = new HashSet&lt;int?&gt;();\r\n        foreach (int? pos in example.Value)\r\n        {\r\n            int occurrence = matchPositions.BinarySearch(pos.Value);\r\n            if (occurrence &lt; 0) continue;\r\n            ks.Add(occurrence);\r\n            ks.Add(occurrence - matchPositions.Count);\r\n        }\r\n        if (ks.Count == 0) return null;\r\n        result[inputState] = ks.Cast&lt;object&gt;();\r\n    }\r\n    return new DisjunctiveExamplesSpec(result);\r\n}\r\n[\/code]\r\n\r\n<h4>Learning with witness functions<\/h4>\r\nAfter completing these four witness functions, we must make one final change in order for learning to happen \u2013 we must create an instance of <code class=\"language-plaintext highlighter-rouge\">WitnessFunctions<\/code> and communicate it to the synthesis engine at learning time. To do that, we override a standard configuration of a <code class=\"language-plaintext highlighter-rouge\">SynthesisEngine<\/code>, providing our <code class=\"language-plaintext highlighter-rouge\">WitnessFunctions<\/code> to the <code class=\"language-plaintext highlighter-rouge\">DeductiveSynthesis<\/code> strategy.\r\n\r\n[code autolinks=\"true\" highlight=\"\"]\r\nvar engine = new SynthesisEngine(grammar, new SynthesisEngine.Config\r\n{\r\n    Strategies = new ISynthesisStrategy[]\r\n    {\r\n        new DeductiveSynthesis(witnessFunctions),\r\n    }\r\n});\r\n[\/code]\r\n\r\nAfter making this change, our <code class=\"language-plaintext highlighter-rouge\">LearnGrammar<\/code> call succeeds, and we get back a set of several dozens possible consistent programs!\r\n<h4>Ranking<\/h4>\r\nExample are inherently an ambiguous form of specification. A user-provided spec of several input-output examples usually produces a huge set of DSL programs that are consistent with it (often billions of them!). To build a useful application, a synthesis-based technology has to somehow pick one \u201cmost likely\u201d program from such a set. Many disambiguation techniques exist; in this tutorial, we show the most common one \u2013 ranking.\r\n\r\nRanking assigns each program a <em>score<\/em> \u2013 an approximation to its \u201clikelihood\u201d of being a desired program. For instance, string extraction based on absolute indices is less common than extraction based on regular expressions, therefore the former should be assigned a smaller score than the latter.\r\n\r\n<hr \/>\r\n\r\n<div><a href=\"#top\">Back to top &gt;<\/a><\/div>\r\n<div id=\"features\" style=\"height: 30px\"><\/div>\r\n<h3>Features<\/h3>\r\nIn PROSE, scores are represented using <em>computed features<\/em>. A feature is a named attribute on a program AST, computed using provided <em>feature calculator<\/em> functions. A feature can be <em>complete<\/em>, which means that it must be defined with some value for each possible DSL program, or <em>incomplete<\/em> if it only exists on some DSL programs.\r\n\r\nA feature is defined in a DSL as follows:\r\n\r\n[code autolinks=\"true\" highlight=\"\"]\r\n@complete feature double Score = SubstringExtraction.ScoreCalculator;\r\n[\/code]\r\n\r\nHere <code class=\"language-plaintext highlighter-rouge\">Score<\/code> is its name, <code class=\"language-plaintext highlighter-rouge\">double<\/code> is its type, and <code class=\"language-plaintext highlighter-rouge\">ScoreCalculator<\/code> is a <em>feature class<\/em> that holds calculator functions. To implement this feature class, we inherit from <code class=\"language-plaintext highlighter-rouge\">Microsoft.ProgramSynthesis.Feature&amp;lt;T&amp;gt;<\/code>:\r\n\r\n[code autolinks=\"true\" highlight=\"\"]\r\nusing Microsoft.ProgramSynthesis;\r\nusing Microsoft.ProgramSynthesis.AST;\r\n\r\nnamespace SubstringExtraction\r\n{\r\n    class ScoreCalculator : Feature&lt;double&gt;\r\n    {\r\n        public ScoreCalculator(Grammar grammar): base(grammar, \"Score\", isComplete: true) { }\r\n            \/\/\/ \r\n\r\n            \/\/\/     Calculates the value of the feature represented by the current instance\r\n            \/\/\/     for a given \r\n      \t    \/\/\/\t    program .\r\n            \/\/\/   \r\n        protected override double GetFeatureValueForVariable(VariableNode variable) =&gt; 0;\r\n    }\r\n}[\/code]\r\n\r\nGiven a <code class=\"language-plaintext highlighter-rouge\">ProgramNode p<\/code>, you can access the value of <code class=\"language-plaintext highlighter-rouge\">Score<\/code> on this program as follows:\r\n\r\n[code autolinks=\"true\" highlight=\"\"]\r\nvar scoreFeature = new ScoreCalculator(grammar);\r\nConsole.WriteLine(p.GetFeatureValue(scoreFeature));[\/code]\r\n\r\n<blockquote><strong>Note:<\/strong> Since we defined <code class=\"language-plaintext highlighter-rouge\">Score<\/code> as a complete feature, we had to override its <code class=\"language-plaintext highlighter-rouge\">GetFeatureValueForVariable<\/code> method to provide an implementation of this feature computation on variable ASTs such as <code class=\"language-plaintext highlighter-rouge\">x<\/code>.<\/blockquote>\r\n<h4>Feature calculators<\/h4>\r\nA feature calculator is defined for a grammar rule. There are four ways to define a calculator: based on <em>recursive feature values<\/em>, based on <em>program syntax<\/em> (given a <code class=\"language-plaintext highlighter-rouge\">ProgramNode<\/code> or its children), or based on <em>literal values<\/em>.\r\n<h5>Calculation from recursive values<\/h5>\r\nThe most common feature definitions are inductive, recursively defined over the grammar. For instance, a score for <code class=\"language-plaintext highlighter-rouge\">RegexPosition(x, rr, k)<\/code> would be defined as a formula over a score for <code class=\"language-plaintext highlighter-rouge\">rr<\/code> and a score for <code class=\"language-plaintext highlighter-rouge\">k<\/code>. Such feature calculators take as input recursively computed values of the same feature on parameters of a current program AST:\r\n\r\n[code autolinks=\"true\" highlight=\"\"]\r\n[FeatureCalculator(\"RegexPosition\", Method = CalculationMethod.FromRecursiveFeatureValues)]\r\ndouble ScoreRegexPosition(double inScore, double rrScore, double kScore) =&gt; rrScore * kScore;[\/code]\r\n\r\n<h5>Calculation from program syntax<\/h5>\r\nWhen recursively computed feature values are insufficient, you can take into account the entire syntax of a program AST. There are two ways to define such a calculator: <strong>(a)<\/strong> take as input a <code class=\"language-plaintext highlighter-rouge\">ProgramNode<\/code> to score, or <strong>(b)<\/strong> take as input several individual <code class=\"language-plaintext highlighter-rouge\">ProgramNode<\/code> instances representing ASTs of its parameter programs.\r\n<blockquote><strong>Hint:<\/strong> You can specify a specific subclass of <code class=\"language-plaintext highlighter-rouge\">ProgramNode<\/code> as a parameter, if you know that your grammar structure only allows some specific AST kinds at this place.<\/blockquote>\r\n\r\n[code autolinks=\"true\" highlight=\"\"]\r\n[FeatureCalculator(\"AbsolutePosition\", Method = CalculationMethod.FromChildrenNodes)]\r\ndouble ScoreAbsolutePosition(VariableNode x, LiteralNode k)\r\n{\r\n    double score = (double) x.GetFeatureValue(this) + (double) k.GetFeatureValue(this);\r\n    int kValue = (int) k.Value;\r\n    if (Math.Abs(k) &lt;= 1)\r\n        score *= 10;\r\n    return score;\r\n}\r\n\r\n\/\/ Alternatively:\r\n[FeatureCalculator(\"AbsolutePosition\", Method = CalculationMethod.FromProgramNode)]\r\ndouble ScoreAbsolutePosition(ProgramNode p) { ... }[\/code]\r\n\r\n<h5>Calculation from literals<\/h5>\r\nAn inductively defined computed feature needs a basic case \u2013 its value on literal program ASTs. Feature calculators on terminal rules can take as input simply the value of a literal in a <code class=\"language-plaintext highlighter-rouge\">LiteralNode<\/code> currently being scored. Instead of an operator name, you provide the name of the corresponding terminal symbol in the <code class=\"language-plaintext highlighter-rouge\">[FeatureCalculator]<\/code> attribute:\r\n\r\n[code autolinks=\"true\" highlight=\"\"]\r\n[FeatureCalculator(\"k\", Method = CalculationMethod.FromLiteral)] double KScore(int k) =&gt; 1.0 \/ (1 + Math.Abs(k));[\/code]\r\n\r\n<h3>Learning top programs<\/h3>\r\nOur <code class=\"language-plaintext highlighter-rouge\">SubstringExtraction<\/code> grammar needs a <code class=\"language-plaintext highlighter-rouge\">Score<\/code> calculator for each rule (including standard library rules like <code class=\"language-plaintext highlighter-rouge\">PositionPair<\/code>). After you define all of them, you can extract\u00a0<strong>k<\/strong>\u00a0top-ranked programs out of the candidates returned by learning:\r\n\r\n[code autolinks=\"true\" highlight=\"\"]\r\nProgramSet learned = engine.LearnGrammar(spec);\r\nvar scoreFeature = new ScoreCalculator(grammar);\r\nIEnumerable&lt;ProgramNode&gt; best = learned.TopK(scoreFeature, k: 1);[\/code]\r\n\r\nThe method <code class=\"language-plaintext highlighter-rouge\">TopK<\/code> assumes that your feature has a numerical type (convertible to <code class=\"language-plaintext highlighter-rouge\">double<\/code>). It returns an descendingly ordered sequence of programs (greater score values are better). If several programs have the same score, they are all returned in the sequence, thus it may hold more programs than the requested value of <strong>k<\/strong>.\r\n\r\nInstead of ordering a learned set of programs, you can instead learn only <strong>k <\/strong>topmost-ranked programs in the first place, if you are not interested in the entire set of candidates. Such a request significantly improves learning performance because PROSE can perform aggressive filtering in the middle of the learning process.\r\n\r\n[code autolinks=\"true\" highlight=\"\"]\r\nIEnumerable&lt;ProgramNode&gt; bestLearned = engine.LearnGrammarTopK(spec, scoreFeature, k: 1);[\/code]\r\n\r\n<blockquote><strong>Important:<\/strong> in order for both <code class=\"language-plaintext highlighter-rouge\">TopK<\/code> methods to work soundly, your feature must be <em>monotonic<\/em> over the grammar structure. In other words, greater-scored subexpressions should produce greater-scored expressions.<\/blockquote>\r\nWe can now take the best program and apply it on new user-provided data. Assuming scoring functions similar to FlashFill, this program will be \u201cExtract the first word\u201d:\r\n\r\n[code autolinks=\"true\" highlight=\"\"]\r\nProgramNode p = bestLearned.First();\r\nConsole.WriteLine(p);\r\n\/*  Substring(x, PositionPair(RegexPosition(x, RegexPair(\/\/, \/\\w+\/), 0), RegexPosition(x, RegexPair(\/\\w+\/, \/\/), 0))  *\/\r\nState input = State.Create(grammar.InputSymbol, \"Program Synthesis\");\r\nConsole.WriteLine(p.Invoke(input));\r\n\/*  Program  *\/[\/code]\r\n\r\n\r\n<hr \/>\r\n\r\n<h3>Further reading<\/h3>\r\nThis concludes our basic PROSE tutorial. To learn about more PROSE features, please check out these resources:\r\n<ul>\r\n \t<li><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/project\/prose-framework\/#!usage\">PROSE usage documentation<\/a><\/li>\r\n \t<li><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/project\/prose-framework\/#!backpropagation\">Backpropagation<\/a><\/li>\r\n \t<li><a href=\"https:\/\/github.com\/Microsoft\/prose\" target=\"_blank\" rel=\"noopener\">Sample code<\/a><\/li>\r\n \t<li>Presentations<\/li>\r\n<\/ul>\r\n<div class=\"footnotes\">\r\n\r\n<strong>Footnote<\/strong>\r\n<ol>\r\n \t<li id=\"fn:regex\">PROSE uses JavaScript\/Perl syntax for regular expression literals. <a class=\"reversefootnote\" href=\"regex\">\u21a9<\/a><\/li>\r\n<\/ol>\r\n\r\n<hr \/>\r\n\r\n<\/div>\r\n<div><a href=\"#top\">Back to top &gt;<\/a><\/div>"},{"id":1,"name":"Usage","content":"<h2>Usage<\/h2>\r\n<h3>Terminology<\/h3>\r\nA <strong>domain-specific language (DSL)<\/strong> is a context-free programming language, created for a specific purpose, which can express tasks from a certain domain. A DSL consists of <strong>symbols<\/strong> and <strong>operators<\/strong> upon these symbols. A DSL is described as a context-free grammar \u2013 i.e., as a set of rules, where each symbol on the left-hand side is bound to a set of possible operators on the right-hand side that represent this symbol. Every operator of a DSL is <strong>pure<\/strong> \u2013 it does not have side effects.\r\n\r\nA <strong>program<\/strong> in a DSL transforms <strong>input<\/strong> data into <strong>output<\/strong> data. A program is represented as an <strong>abstract syntax tree (AST)<\/strong> of the DSL operators. Each node in this tree (similarly, each DSL operator) has some invocation semantics. More formally, each AST node has a method <code class=\"language-plaintext highlighter-rouge\">Invoke<\/code>. It takes as input a <strong>state<\/strong> &amp;sigma; and returns some output. A state is a mapping from DSL variables to their values. Initially, the topmost AST node invoked on a state &amp;sigma; = <strong>{v -&gt; i}<\/strong> , where\u00a0<strong>v<\/strong> is the DSL variable that represents program input, and\u00a0<strong>i<\/strong>\u00a0is the value of the input data on which the program is invoked. AST nodes can modify this state (add new variable bindings in scope) and pass it down to children nodes in recursive invocations of <code class=\"language-plaintext highlighter-rouge\">Invoke<\/code>.\r\n<h3>Architecture<\/h3>\r\nThe Microsoft.ProgramSynthesis package unites the core pieces of the meta-synthesizer. Here are the included assemblies:\r\n<ul>\r\n \t<li><strong>Microsoft.ProgramSynthesis.<\/strong> includes the core functionality for manipulating\r\nASTs, validating DSLs, maintain a DSL and a set of its operators, and core infrastructure for the synthesis algorithms.<\/li>\r\n \t<li><strong>dslc.exe<\/strong> is a DSL compiler. This is the executable that takes a language definition, parses it, validates it, potentially emits some diagnostics (warnings and\/or errors), and serializes the parsed language representation into a portable file.<\/li>\r\n \t<li><strong>Microsoft.ProgramSynthesis.Learning<\/strong> is a core library of synthesis algorithms,\r\nstandard operators and strategies.<\/li>\r\n \t<li><strong>Microsoft.ProgramSynthesis.Utils<\/strong> is a collection of utility methods that are used in different parts of the project.<\/li>\r\n<\/ul>\r\nA typical workflow of a DSL designer consists of the following steps:\r\n<ol>\r\n \t<li>Define a DSL in a separate file (usually has a *.grammar extension).<\/li>\r\n \t<li>Define the semantics and all the supporting infrastructure for the DSL in a separate .NET assembly (see Section 4 below).<\/li>\r\n \t<li>Use one of the two options below to compile the DSL definition into a <code class=\"language-plaintext highlighter-rouge\">Grammar<\/code> object:\r\n<ul>\r\n \t<li>At compile-time, invoke the DSL compiler <strong>dslc.exe<\/strong> manually on a *.grammar file with a DSL definition. Deploy the resulting serialized *.grammar.xml file with your application. At run-time, deserialize it using <code class=\"language-plaintext highlighter-rouge\">Grammar.Deserialize<\/code> method.<\/li>\r\n \t<li>Deploy the <strong>dslc.exe<\/strong> and the *.grammar file with your application. At run-time, compile your DSL definition in memory using <code class=\"language-plaintext highlighter-rouge\">DSLCompiler.Compile<\/code> method.<\/li>\r\n<\/ul>\r\n<\/li>\r\n<\/ol>\r\n<h3>Language definition<\/h3>\r\nThe main class <code class=\"language-plaintext highlighter-rouge\">Grammar<\/code> represents a context-free grammar as a set of DSL rules and a list of references to operators\u2019 semantics and\/or custom operator learners, implemented in C# in a separate assembly. The method <code class=\"language-plaintext highlighter-rouge\">DSLCompiler.Compile<\/code> loads a grammar definition from a string. Multiple examples of its usage can be found in our <a href=\"https:\/\/github.com\/microsoft\/prose\">sample repository<\/a>.\r\n\r\nHere\u2019s a typical language definition:\r\n\r\n[code]\r\nusing TestSemantics;\r\nusing semantics TestSemantics.FlashFill;\r\nusing learners TestSemantics.FlashFill.Learners;\r\n\r\nlanguage FlashFill;\r\n\r\n@start string f := ConstStr(s) | let x: string = v in SubStr(x, P, P);\r\nint P := CPos(x, k) | Pos(x, r, r, k);\r\nint k;\r\n@values[StringGen] string s;\r\nRegex r;\r\n@input string v;\r\n[\/code]\r\n\r\nNOTE: The operators <code class=\"language-plaintext highlighter-rouge\">ConstStr<\/code>, <code class=\"language-plaintext highlighter-rouge\">SubStr<\/code>, <code class=\"language-plaintext highlighter-rouge\">CPos<\/code> and <code class=\"language-plaintext highlighter-rouge\">Pos<\/code> are defined in <a href=\"#black-box-operators\">Black-box operators<\/a> below.\r\n\r\nFirst, we reference any namespaces for external typename lookup. Then, we specify the static classes for semantics and learners. There may be several <strong>using<\/strong>, <strong>using semantics<\/strong>, and <strong>using learners<\/strong> instructions. In this example, <code class=\"language-plaintext highlighter-rouge\">TestSemantics.FlashFill<\/code> is a static class defined in the assembly \u201cTestSemantics.dll\u201d (which we pass as a reference in the DSLCompiler.Compile invocation) as is <code class=\"language-plaintext highlighter-rouge\">TestSemantics.FlashFill.Learners<\/code>.\r\n\r\nNext, we specify the language name and its rules in a form of a context-free grammar (BNF). Each <strong>nonterminal<\/strong> rule has a symbol on the left-hand side and a list of operators on the right-hand side, separated by \u201c|\u201d. Each <strong>terminal<\/strong> rule has only the left-hand side. Each symbol has its type, specified on the LHS. The supported types are: any of the standard C# types, including classes from <code class=\"language-plaintext highlighter-rouge\">System.Collection.Generic<\/code>, <code class=\"language-plaintext highlighter-rouge\">System.Text.RegularExpressions<\/code>, etc., or your own custom types, defined in the same semantics assembly. In the latter case, the framework searches for the type name in the namespaces referenced in <strong>using<\/strong> instructions.\r\n\r\nExactly one nonterminal rule should be annotated as \u201c<strong>@start<\/strong>\u201d \u2013 this is a start symbol of the grammar (i.e., the topmost node of every AST). Exactly one terminal rule should be annotated as \u201c<strong>@input<\/strong>\u201d \u2013 this is the input data to the program.\r\n<h4>Terminals<\/h4>\r\nTerminal rules specify the leaf symbols that will be replaced with literal constants or variables in the AST. For example:\r\n<ul>\r\n \t<li>A terminal rule <code class=\"language-plaintext highlighter-rouge\">int <strong>k<\/strong>;<\/code> specifies a symbol\u00a0<strong>k<\/strong> that represents a literal integer constant.<\/li>\r\n \t<li>A terminal rule <code class=\"language-plaintext highlighter-rouge\">@input string v;<\/code> specifies a <em>variable<\/em> symbol\u00a0<strong>v<\/strong> that will be replaced with program input data <em>at runtime.<\/em><\/li>\r\n<\/ul>\r\nA user can specify the list of possible values that a literal symbol can be set to. This is done with a <strong>@values[<\/strong>G<strong>]<\/strong> annotation, where\u00a0<strong>G<\/strong> is a <strong>value generator<\/strong> \u2013 a reference to a user-defined static field, property, or method.\u00a0 <strong>G <\/strong>should evaluate to <code class=\"language-plaintext highlighter-rouge\">IEnumerable<\/code> (thus, it can be any standard .NET collection, such as an array or a list, or a user-defined collection type). The framework searches for the definition of <strong>G <\/strong>in the <em>learner classes,<\/em> specified by a <strong>using learners<\/strong> instruction. For our example grammar, any of the following definition can serve as a value generator for <strong>S<\/strong>:\r\n\r\n[code autolinks=\"true\" highlight=\"\"]\r\npublic static class FlashFill\r\n{\r\n    public static class Learners\r\n    {\r\n        public static string[] StringGen = {\"\", \"123\", \"Gates, B.\", \"ABRACADABRA\"};\r\n        public static string[] StringGen\r\n        {\r\n            get { return new[] {\"\", \"123\", \"Gates, B.\", \"ABRACADABRA\"}; }\r\n        }\r\n        public static string[] StringGen()\r\n        {\r\n            return new[] { \"\", \"123\", \"Gates, B.\", \"ABRACADABRA\" };\r\n        }\r\n    }\r\n}[\/code]\r\n\r\nIf no value generator is provided, the framework can either pick a standard generator for some common types (such as &lt;code class=\"language-plaintext highlighter-rouge\"&gt;byte&lt;\/code&gt;), or assume that the literal can be set to any value. This can impact the performance of some synthesis strategies or even make them inapplicable.\r\n&lt;div id=\"#black-box-operators\"&gt;\r\n&lt;\/div&gt;&lt;h4&gt;Black-box operators&lt;\/h4&gt;\r\nA\u00a0<strong>black-box<\/strong>\u00a0operator is an operator that does not refer to any standard concepts, and its invocation semantics do not modify the state &amp;sigma; that has been passed to it. In the example above\u00a0<code class=\"\u201dlanguage-plaintext\">ConstStr<\/code>,\u00a0<code class=\"\u201dlanguage-plaintext\">SubStr<\/code>,\u00a0<code class=\"\u201dlanguage-plaintext\">CPos<\/code>,\u00a0<code class=\"\u201dlanguage-plaintext\">Pos<\/code>\u00a0are black-box operators. Note that\u00a0<code class=\"\u201dlanguage-plaintext\">SubStr<\/code>\u00a0is a black-box operator, even though the\u00a0<code class=\"\u201dlanguage-plaintext\">Let<\/code>\u00a0concept surrounding it isn\u2019t.\u00a0<code class=\"\u201dlanguage-plaintext\">SubStr<\/code>\u2019s semantics don\u2019t modify &amp;sigma;, they just pass it on to parameter symbols.\u00a0<code class=\"\u201dlanguage-plaintext\">Let<\/code>\u2019s semantics change the state &amp;sigma; by extending it with a new variable binding\u00a0<b>x -&gt; v<\/b>.\r\n\r\nThe semantics of every black-box operator should be defined in the semantics assembly with the same name as a public static method:\r\n\r\n[code autolinks=\"true\" highlight=\"\"]\r\npublic static class Semantics\r\n{\r\n    public static int CPos(string s, int k)\r\n    {\r\n        return k &lt; 0 ? s.Length + 1 + k : k;\r\n    }\r\n\r\n    public static int? Pos(string s, Regex leftRegex, Regex rightRegex,\r\n                           int occurrence)\r\n    {\r\n        var boundaryRegex = new Regex(string.Format(\"(?&lt;={0}){1}\", leftRegex, rightRegex));\r\n        var matches = boundaryRegex.Matches(s);\r\n        int index = occurrence &gt; 0 ? occurrence - 1 : matches.Count + occurrence;\r\n        if (index &lt; 0 || index &gt;= matches.Count)\r\n        {\r\n            return null;\r\n        }\r\n        return matches[index].Index;\r\n    }\r\n\r\n    public static string ConstStr(string s)\r\n    {\r\n        return s;\r\n    }\r\n\r\n    public static string SubStr(string s, int left, int right)\r\n    {\r\n        if (left &lt; 0 || right &gt; s.Length || right &lt; left)\r\n        {\r\n            return null;\r\n        }\r\n        return s.Substring(left, right - left);\r\n    }\r\n\r\n    public static string Concat(string s, string t)\r\n    {\r\n        return s + t;\r\n    }\r\n}[\/code]\r\n\r\nIt is important to note that the learning process requires that semantic functions are <em>total<\/em>. If the function is inapplicable for the current choice of arguments or if for any other reason it would throw an exception, it must return <strong>null<\/strong>\u00a0instead&lt;sup id=\"fnref:1\"&gt;&lt;a class=\"footnote\" href=\"1\" data-mce-href=\"1\"&gt;1&lt;\/a&gt;&lt;\/sup&gt;. In such a case, if the return type of a function is a .NET value type, it should be made nullable.\r\n&lt;blockquote&gt;\r\nThis choice was made to enable efficient program synthesis. During learning, PROSE may repeatedly invoke partial programs on various inputs for verification purposes. If an input is invalid and a program handles it by throwing an exception, it slows down the learning by two orders of magnitude.\r\n&lt;\/blockquote&gt;\r\n\r\nNulls are automatically propagated upward like exceptions: if an argument to a semantic function is <strong>null<\/strong>, its result is automatically presumed to be <strong>null<\/strong> as well. You can override this behavior by annotating your semantics function with a &lt;code class=\"language-plaintext highlighter-rouge\"&gt;[LazySemantics]&lt;\/code&gt; attribute. In that case, your function will receive any <strong>null<\/strong> arguments as usual, and must handle it on its own.\r\n\r\nEvery grammar operator should be pure, and its <em>formal<\/em> signature should be &amp;sigma;\u00a0<strong>-&gt; T<\/strong> (where\u00a0<strong>T<\/strong> is the type of the corresponding LHS symbol). The <em>actual <\/em>signature of the semantics function that implements a black-box operator is &lt;code class=\"language-plaintext highlighter-rouge\"&gt;( <strong>T<\/strong>&lt;sub&gt;1&lt;\/sub&gt;,<strong>T<\/strong>&lt;sub&gt;2&lt;\/sub&gt;},. . .,<strong>T<\/strong>&lt;sub&gt;k&lt;\/sub&gt; ) <strong>-&gt; T<\/strong>&lt;\/code&gt;, where <strong>T<\/strong>&lt;sub&gt;j&lt;\/sub&gt; is the type of\u00a0 j&lt;sup&gt;th&lt;\/sup&gt; parameter symbol, and\u00a0<strong>T<\/strong> is the return type of the LHS symbol. Consequently, if one needs to invoke their operators on anything other than the program input data, they have to introduce additional variables using\u00a0<strong>Let<\/strong>\u00a0construct and\/or lambda functions.\r\n\r\n&lt;h4&gt;Let construct&lt;\/h4&gt;\r\n<strong>Let<\/strong> construct is a standard concept with the following syntax:\r\n\r\nlet x&lt;sub&gt;1&lt;\/sub&gt;:&amp;gamma;&lt;sub&gt;1&lt;\/sub&gt; = v&lt;sub&gt;1&lt;\/sub&gt;,. . ., x&lt;sub&gt;k&lt;\/sub&gt;:&amp;gamma;&lt;sub&gt;k&lt;\/sub&gt; = v&lt;sub&gt;k&lt;\/sub&gt; in <strong>RHS<\/strong>\r\n\r\nwhere\u00a0<strong>RHS<\/strong> is any standard rule <strong>RHS<\/strong> (a black-box operator, a grammar symbol, etc.). The RHS and any of the symbols it (indirectly) references can make use of the variables x&lt;sub&gt;1&lt;\/sub&gt;,. . .,x&lt;sub&gt;k&lt;\/sub&gt; Grammar symbols v&lt;sub&gt;1&lt;\/sub&gt;,. . .,v&lt;sub&gt;k&lt;\/sub&gt; are parameters of the rule; at runtime, each variable x&lt;sub&gt;j&lt;\/sub&gt; is bound to some value of the corresponding symbol v&lt;sub&gt;j&lt;\/sub&gt;.\r\n\r\n&lt;strong&gt;Example:&lt;\/strong&gt;\r\n\r\nThe running grammar definition FlashFill contains the following rule:\r\n\r\n[code]\r\n@start string f := let x: string = v in SubStr(x, P, P)\r\n[\/code]\r\n\r\nThe formal signature of this rule has three free parameters: <strong>v<\/strong>, <strong>P<\/strong>, <strong>P<\/strong>. At runtime, the <code class=\"language-plaintext highlighter-rouge\">Let<\/code> construct, when given a state &amp;sigma;, executes the following operations:\r\n<ul>\r\n \t<li>&amp;thetasym; :=\u00a0 [[v]] &amp;sigma;\r\n(Execute the parameter symbol <strong>v <\/strong>on the state and save the value as <code class=\"language-plaintext highlighter-rouge\">&amp;thetasym;<\/code>)<\/li>\r\n \t<li><code class=\"language-plaintext highlighter-rouge\">&amp;sigma;&lt;sup&gt;'&lt;\/sup&gt; := &amp;sigma; &amp;cup; {\u00a0<strong>x -&gt;<\/strong> &amp;thetasym;}<\/code>\u00a0(Add a new variable\u00a0<strong>x<\/strong> to the state, bind it to the value <code class=\"language-plaintext highlighter-rouge\">&amp;thetasym;<\/code>)<\/li>\r\n \t<li>return [[SubStr(x,<strong>P<\/strong>,<strong>P<\/strong>) ]] &amp;sigma;&lt;sup&gt;'&lt;\/sup&gt;\r\n(Invoke the RHS \u2013 the black-box operator <code class=\"language-plaintext highlighter-rouge\">SubStr<\/code> \u2013 on the new state)<\/li>\r\n<\/ul>\r\nThe RHS indirectly references the variable\u00a0<strong>x<\/strong> further in the grammar through the symbol <strong>P<\/strong>. Namely, the grammar further contains the following rule:\r\n\r\n[code class=\"language-dsl\"]int P := Pos(x, r, r, k)\r\n[\/code]\r\n\r\n<code class=\"language-plaintext highlighter-rouge\">Pos<\/code> is a simple black-box operator, just like <code class=\"language-plaintext highlighter-rouge\">SubStr<\/code>. When given a state &amp;sigma;&lt;sup&gt;'&lt;\/sup&gt;, it passes it on to its parameter symbols <strong>x,r,r,k<\/strong>. The latter three symbols are terminals; they will be represented as literal constants of the corresponding types in any final program (AST). The first symbol is a variable <strong>x<\/strong>; when executed on a state &amp;sigma;&lt;sup&gt;'&lt;\/sup&gt;, it will just extract the binding\u00a0<strong>x<\/strong>\u00a0<strong>-&gt;<\/strong> &amp;thetasym; from it, and return &amp;thetasym;.\r\n<h4>Standard concepts<\/h4>\r\nThere exists a range of concepts that are common for many DSLs and implement standard functionality. In particular, many list\/set processing concepts (<code class=\"language-plaintext highlighter-rouge\">Map<\/code>, <code class=\"language-plaintext highlighter-rouge\">Filter<\/code>, <code class=\"language-plaintext highlighter-rouge\">TakeWhile<\/code>, etc.) encode various forms of loops that arise in many DSLs for different purposes. The synthesizer treats many such concepts as first-class citizens and is aware of many efficient strategies for synthesizing such concepts from example specifications. Consequently, we encourage framework users to use <strong>standard concepts<\/strong> for <em>any<\/em> loop form that arises in their DSLs, instead of encoding it as a black-box operator. In our experience, most loop forms commonly used in DSLs can be expressed as combinations of standard concepts.\r\n\r\nThe current list of supported concepts can be found in the <code class=\"language-plaintext highlighter-rouge\">Microsoft.ProgramSynthesis.Rules.Concepts<\/code> namespace.\r\n\r\nMost standard loop concepts express some form of list\/set processing with lambda functions. We explain their usage here on a simple <code class=\"language-plaintext highlighter-rouge\">Filter<\/code> example.\r\n\r\nA <strong>Filter(p, s)<\/strong> concept is an operator that takes as input a predicate symbol\u00a0<strong>p<\/strong> and a set symbol <strong>s<\/strong>. It evaluates\u00a0<strong>s<\/strong> on a state &amp;sigma; to some set\u00a0<strong>S<\/strong> of objects. It also evaluates\u00a0<strong>p<\/strong> on a state &amp;sigma; to some lambda function\u00a0<strong>L<\/strong> = &amp;lambda; <strong>x:e<\/strong>. Finally, it filters the set\u00a0<strong>S<\/strong> using\u00a0<strong>L<\/strong> as a predicate, and returns the result. Essentially,\u00a0<strong>Filter<\/strong>\u00a0is equivalent to <a href=\"https:\/\/msdn.microsoft.com\/en-us\/library\/vstudio\/bb548891(v=vs.110\\).aspx\">Select in LINQ<\/a>.\r\n\r\nConsider the following grammar for various filters of an array of strings:\r\n\r\n[code class=\"language-dsl\"]using TestSemantics;\r\nusing semantics TestSemantics.Flare;\r\nusing learners TestSemantics.Flare.Learners;\r\n\r\nlanguage Flare;\r\n\r\n@input string[] v;\r\nRegex r;\r\nStringFilter f := Match(r) | FTrue();\r\n@start string[] P := Selected(f, v) = Filter(f, v);\r\n[\/code]\r\n\r\nThe input string array\u00a0<strong>v<\/strong> is filtered with a filter <strong>f<\/strong>. A filter <strong>f <\/strong>can filter elements either according to a regular expression <strong>r<\/strong>, or trivially (by returning <code class=\"language-plaintext highlighter-rouge\">true<\/code>).\r\n\r\nThe main rule of the grammar is <code class=\"language-plaintext highlighter-rouge\">string[] P := Selected(f, v)<\/code>. Here <code class=\"language-plaintext highlighter-rouge\">Selected(f, v)<\/code> is a <em>formal<\/em> operator signature: this is how it would look like if it was implemented as a black-box operator. However, the <em>actual<\/em> implementation of <code class=\"language-plaintext highlighter-rouge\">Selected(f, v)<\/code> refers to the standard <code class=\"language-plaintext highlighter-rouge\">Filter<\/code> concept instead of a black-box semantics implementation. The <em>arguments<\/em> of the <code class=\"language-plaintext highlighter-rouge\">Filter<\/code> concept are in this case the <em>parameters<\/em> of a <code class=\"language-plaintext highlighter-rouge\">Selected<\/code> operator \u2013 the symbols\u00a0<strong>f<\/strong> and <strong>v<\/strong>. At runtime, the framework interprets a <code class=\"language-plaintext highlighter-rouge\">Selected<\/code> AST node by executing the standard <code class=\"language-plaintext highlighter-rouge\">Filter<\/code> semantics with the corresponding arguments.\r\n\r\nFor a valid execution, the runtime types of the symbols\u00a0<strong>f<\/strong> and\u00a0<strong>v<\/strong> should satisfy the following contract:\r\n<ol>\r\n \t<li>The type of\u00a0<strong>v<\/strong> should implement <code class=\"language-plaintext highlighter-rouge\">IEnumerable<\/code>. It can be a standard .NET collection (array, <code class=\"language-plaintext highlighter-rouge\">List&lt;T&gt;<\/code>, etc.), or a user-defined custom type that implements <code class=\"language-plaintext highlighter-rouge\">IEnumerable<\/code>.<\/li>\r\n \t<li>The type of\u00a0<strong>f<\/strong> should have <em>functional semantics<\/em>. In other words, it should behave like a lambda function, because at runtime it will be \u201cinvoked\u201d with array elements as arguments. Moreover, for the <code class=\"language-plaintext highlighter-rouge\">Filter<\/code> concept specifically, the return type of this \u201cfunction\u201d should be assignable to <code class=\"language-plaintext highlighter-rouge\">bool<\/code>.<\/li>\r\n<\/ol>\r\n<h4>Lambda functions<\/h4>\r\nOne can capture functional semantics in an explicit lambda function in the grammar:\r\n\r\n[code class=\"language-dsl\"]using semantics TestSemantics.Flare;\r\nusing learners TestSemantics.Flare.Learners;\r\n\r\nlanguage Flare;\r\n\r\n@input string[] v;\r\nRegex r;\r\nbool f := Match(x, r) | True();\r\n@start string[] P := Selected(f, v) = Filter(\\x: string =&gt; f, v);\r\n[\/code]\r\n\r\nThe first argument of the <code class=\"language-plaintext highlighter-rouge\">Filter<\/code> concept is a\r\nlambda function &amp;lambda; <strong>x:f<\/strong>. In our syntax, it is represented as <code class=\"language-plaintext highlighter-rouge\">\\x: string =&gt; f<\/code><sup id=\"fnref:3\"><a class=\"footnote\" href=\"3\">2<\/a><\/sup>. Here x is a freshly bound variable (lambda function parameter), and\u00a0<strong>f<\/strong> is a grammar symbol that represents function body. The runtime type of\u00a0<strong>x<\/strong> should be specified explicitly after a colon.\r\n\r\nJust as with <code class=\"language-plaintext highlighter-rouge\">Let<\/code> constructs, the lambda function body symbol and all its indirect descendants can now reference the variable symbol <strong>x<\/strong>. At runtime, it will be successively bound to every element of the input string array. Since this binding introduces a new variable in the state &amp;sigma;, a lambda definition cannot be expressed as a black-box rule. Instead, it is a special rule kind with first-class treatment in the framework (again, just as <code class=\"language-plaintext highlighter-rouge\">Let<\/code>).\r\n\r\nThe corresponding semantics implementation is now much simpler:\r\n\r\n[code autolinks=\"true\" highlight=\"\"]\r\npublic static class Flare\r\n{\r\n    public static bool Match(string x, Regex r)\r\n    {\r\n        return r.IsMatch(x);\r\n    }\r\n\r\n    public static bool True()\r\n    {\r\n        return true;\r\n    }\r\n}[\/code]\r\n\r\nNote that a lambda function body is a free symbol on the RHS of the \u201c=\u201d sign. In other words, the set of free symbols on the RHS includes the direct parameters of the concept (in the example above, <strong>v<\/strong>) and the lambda function bodies (in the example above, <strong>f<\/strong>).\u00a0To make the concept rule well-defined, this set should be exactly the same as the set of free symbols on the LHS of the \u201c=\u201d sign (i.e. the set of formal parameters of the rule). However, they should only be equivalent as <em>sets<\/em>, the order of parameters does not matter. In case of multiple usages of the same symbol among parameters, the correspondence between the LHS and the RHS is resolved in a left-to-right order.\r\n\r\n<strong>Example:<\/strong>\r\n\r\nConsider the following concept rule:\r\n\r\n[code class=\"language-dsl\"]int S := F(v, P, P) = G(\\x: string =&gt; P, v, P);\r\n[\/code]\r\n\r\nHere\u00a0<strong>G<\/strong> is some standard concept, and <strong>S<\/strong>, <strong>v<\/strong>,\u00a0<strong>P<\/strong> are grammar symbols. The correspondence between formal parameters on the LHS and free symbols on the RHS is resolved as follows:\r\n<ul>\r\n \t<li>The first parameter\u00a0<strong>v<\/strong> on the LHS corresponds to the second argument\u00a0<strong>v<\/strong> on the RHS.<\/li>\r\n \t<li>The second parameter\u00a0<strong>P<\/strong> on the LHS corresponds to the body of the first argument &amp;lambda; <strong>x<\/strong>:<strong>P<\/strong>\u00a0on the RHS.<\/li>\r\n \t<li>The third parameter\u00a0<strong>P<\/strong> on the LHS corresponds to the third argument\u00a0<strong>P<\/strong> on the RHS.<\/li>\r\n<\/ul>\r\nTwo usages of the same symbol\u00a0<strong>P<\/strong> among parameters were resolved in a left-to-right order.\r\n<h3>Language usage<\/h3>\r\nDefinition and usage of custom DSLs starts with the following steps:\r\n<ol>\r\n \t<li>Define a string representation of your DSL grammar in our syntax.<\/li>\r\n \t<li>Implement your black-box operator semantics, custom types, and value generators in a separate assembly. Make sure that it is accessible at a path specified in the grammar string<sup id=\"fnref:4\"><a class=\"footnote\" href=\"4\">3<\/a><\/sup>.<\/li>\r\n \t<li>Load the grammar into a <code class=\"language-plaintext highlighter-rouge\">Grammar<\/code> object:\r\n<ul>\r\n \t<li>Option 1: programmatically using <code class=\"language-plaintext highlighter-rouge\">DSLCompiler.Compile<\/code>.<\/li>\r\n<\/ul>\r\n<\/li>\r\n<\/ol>\r\nThe example below shows how to compile the grammar programmatically (option \u201ca\u201d):\r\n\r\n[code autolinks=\"true\" highlight=\"\"]\r\n const string TestFlareGrammar = @\"\r\n         using TestSemantics;\r\n         using semantics TestSemantics.Flare;\r\n         language Flare;\r\n         @input string[] v;\r\n         Regex r;\r\n         StringFilter f := Match(r) | FTrue();\r\n         @start string[] P := Selected(f, v) = Filter(f, v);\";\r\n         \r\n         \r\nvar grammar = DSLCompiler.Compile(new CompilerOptions()\r\n    {\r\n        InputGrammarText = TestFlareGrammar,\r\n        References = CompilerReference.FromAssemblyFiles(typeof(Program).GetTypeInfo().Assembly)\r\n    }).Value;[\/code]\r\n\r\n<h4>Partial programs<\/h4>\r\nIn many applications there is need to manipulate <strong>partial programs<\/strong> \u2013 ASTs where some tree nodes are replaced with <strong>holes<\/strong>. A hole is a special type of an AST node: it\u2019s an unfilled placeholder for some instantiation of a corresponding grammar symbol. In the AST node class hierarchy it is represented as <code class=\"language-plaintext highlighter-rouge\">Microsoft.ProgramSynthesis.AST.Hole<\/code>.\r\n\r\nTo get a list of descendant holes in a program <strong>p<\/strong>, call <code class=\"language-plaintext highlighter-rouge\">p.Holes<\/code>. Note that in current architecture AST nodes do not maintain any references to their parents. Consequently, to enable practical usages, the <code class=\"language-plaintext highlighter-rouge\">Holes<\/code> property returns not the <code class=\"language-plaintext highlighter-rouge\">Hole<\/code> nodes themselves, but their <em>positions<\/em> instead. A position is represented as a tuple <strong>(P, k<\/strong>, <strong>H<\/strong>), where\u00a0<strong>P<\/strong> is a parent AST node of a hole,\u00a0<strong>k<\/strong> is the hole\u2019s index in the list of <strong>P<\/strong>\u2019s children, and\u00a0<strong>H<\/strong> is the hole itself.\r\n\r\nA string representation of a hole of symbol\u00a0<strong>S<\/strong> is <strong>\u201c?S\u201d.<\/strong> The framework supports parsing AST strings that contain holes.\r\n<h3>Synthesis<\/h3>\r\n<h4>Specifications<\/h4>\r\nProgram synthesis in the PROSE SDK is defined as a process of generating a <strong>set of programs<\/strong> &amp;tilde;<strong>P<\/strong> that start with a <strong>root symbol<\/strong> <strong>P<\/strong>, given a <strong>specification<\/strong> &amp;phi;<strong>.<\/strong> A specification is a way of defining the desired behavior of every program in &amp;tilde;<strong>P<\/strong>. Different synthesis applications have different <em>kinds<\/em> of specifications to specify the desired program behavior. Some of the examples from prior work include:\r\n<ul>\r\n \t<li>In FlashFill, &amp;phi; is an <em>example specification<\/em>: for every given input state &amp;sigma;, the desired program output should be equal to the given string &amp;phi;[&amp;sigma;].<\/li>\r\n \t<li>In Extraction.Text, &amp;phi; is a <em>subset specification<\/em>. It assumes that\r\nthe runtime type of a root symbol\u00a0<strong>P<\/strong> is some sequence <code class=\"language-plaintext highlighter-rouge\">IEnumerable&lt;T&gt;<\/code>. The specification &amp;phi; expresses the following property: for every given input state &amp;phi;, the desired program output should be a <em>superset<\/em> of the given subset &amp;phi;[&amp;sigma;]. In terms of Extraction.Text, &amp;phi;[&amp;sigma;] are the substrings highlighted by the user, and the program output includes them and the rest of the substrings that should be highlighted.<\/li>\r\n<\/ul>\r\nThe supported specification kinds are located in the namespace <code class=\"language-plaintext highlighter-rouge\">Microsoft.ProgramSynthesis.Specifications<\/code>. All of them inherit the base abstract class <code class=\"language-plaintext highlighter-rouge\">Specification<\/code>. This class has a single abstract method <code class=\"language-plaintext highlighter-rouge\">Valid:<\/code><code class=\"language-plaintext highlighter-rouge\">(State, object)<\/code> to <code class=\"language-plaintext highlighter-rouge\">bool<\/code>, which returns <code class=\"language-plaintext highlighter-rouge\">true<\/code> if and only if the program output on the given input state satisfies the constraint that the specification expresses.\r\n\r\n<code class=\"language-plaintext highlighter-rouge\">Spec<\/code> is the main abstract base class for all\r\n<em>inductive specifications<\/em>, i.e. those that constraint the program behavior on a set of provided input states &amp;phi;.<code class=\"language-plaintext highlighter-rouge\">ProvidedInputs<\/code>. Some of the main inductive specification kinds are described below:\r\n<ul>\r\n \t<li><code class=\"language-plaintext highlighter-rouge\">InductiveConstraint<\/code> specifies an arbitrary constraint <code class=\"language-plaintext highlighter-rouge\">Constraint<\/code>: <code class=\"language-plaintext highlighter-rouge\">(State, object)<\/code> to <code class=\"language-plaintext highlighter-rouge\">bool<\/code> on the set of provided input states.<\/li>\r\n \t<li><code class=\"language-plaintext highlighter-rouge\">SubsequenceSpec<\/code> specifies a subsequence of the desired output sequence for each provided input state in the dictionary <code class=\"language-plaintext highlighter-rouge\">Examples<\/code>: <code class=\"language-plaintext highlighter-rouge\">State<\/code> <strong>-&amp;gt;<\/strong>\u00a0<code class=\"language-plaintext highlighter-rouge\">IEnumerable&amp;lt;object&amp;gt;<\/code>.\r\n<ul>\r\n \t<li><code class=\"language-plaintext highlighter-rouge\">PrefixSpec<\/code> is a subclass of <code class=\"language-plaintext highlighter-rouge\">SubsequenceSpec<\/code> where the desired subsequence of a program output is also required to be its prefix.<\/li>\r\n<\/ul>\r\n<\/li>\r\n \t<li><code class=\"language-plaintext highlighter-rouge\">FunctionalOutputSpec<\/code> describes the behavior of the program whose output type is a lambda function or a functional symbol. For each input state, it specifies a set of input\/output pairs. These pairs are the behavior examples for the function <em>that is the output of a desired program on a given state<\/em>.<\/li>\r\n \t<li><code class=\"language-plaintext highlighter-rouge\">DisjunctiveExamplesSpec<\/code> specifies a set of possible desired outputs for each provided input state. On a given state &amp;sigma;, the program output is allowed to equal <em>any of<\/em> the given possible outputs &amp;phi;.<code class=\"language-plaintext highlighter-rouge\">DisjunctiveExamples<\/code>[&amp;sigma;].\r\n<ul>\r\n \t<li><code class=\"language-plaintext highlighter-rouge\">ExampleSpec<\/code> is a subclass of\r\n<code class=\"language-plaintext highlighter-rouge\">DisjunctiveExamplesSpec<\/code> where the size of\r\n&amp;phi;.<code class=\"language-plaintext highlighter-rouge\">DisjunctiveExamples<\/code>[&amp;sigma;] is constrained to 1. In other words, this is the simplest specification that specifies the single desired program output for each provided input state.<\/li>\r\n<\/ul>\r\n<\/li>\r\n<\/ul>\r\n<h4>Strategies<\/h4>\r\nThe main point of program synthesis in the framework is the <code class=\"language-plaintext highlighter-rouge\">SynthesisEngine<\/code> class. Its function is to execute different <strong>synthesis strategies<\/strong> for synthesizing parts of the resulting program. The synthesis process starts with an <code class=\"language-plaintext highlighter-rouge\">engine.LearnGrammar(&amp;phi;<\/code><code class=\"language-plaintext highlighter-rouge\">)<\/code> call to synthesize a program that starts with a start symbol of the grammar, or, more generally, <code class=\"language-plaintext highlighter-rouge\">engine.LearnSymbol(P<\/code>, <code class=\"language-plaintext highlighter-rouge\">&amp;phi;)<\/code> call to synthesize a program that starts with a given grammar symbol <strong>P<\/strong>.\r\n\r\nGiven a <strong>learning task<\/strong>\u00a0 (P,&amp;phi;) to synthesize a program that starts with a symbol\u00a0<strong>P<\/strong> and satisfies the specification &amp;phi;, the engine can assign this task to any of the available <strong>synthesis strategies<\/strong>. A synthesis strategy represents a specific <em>algorithm<\/em> that can synthesize a program set &amp;tilde;P for a particular kind of a learning task (P,&amp;phi;). In other words, a synthesis strategy is parameterized by its supported <em>specification type<\/em>, takes a learning task (P,&amp;phi;) where &amp;phi; should be an instance of this specification type, and learns a set of programs &amp;tilde;P that are consistent with &amp;phi;.\r\n\r\nA synthesis strategy is represented as a class inheriting <code class=\"language-plaintext highlighter-rouge\">Microsoft.ProgramSynthesis.Learning.SynthesisStrategy&lt;TSpec&gt;<\/code>, which specifies the following contract:\r\n\r\n[code autolinks=\"true\" highlight=\"\"]\r\npublic abstract class SynthesisStrategy&lt;TSpec&gt; : ISynthesisStrategy\r\n        where TSpec : Spec\r\n{\r\n    void Initialize(SynthesisEngine engine);\r\n    abstract Optional&lt;ProgramSet&gt; Learn(SynthesisEngine engine, LearningTask task, CancellationToken cancel);\r\n    bool CanCall(Spec spec);\r\n}[\/code]\r\n\r\nHere <code class=\"language-plaintext highlighter-rouge\">TSpec<\/code> is a supported specification type, <code class=\"language-plaintext highlighter-rouge\">Learn<\/code> is the main learning method, and <code class=\"language-plaintext highlighter-rouge\">CanCall<\/code> is the function that determines whether this synthesis strategy supports learning for a given specification <code class=\"language-plaintext highlighter-rouge\">spec<\/code> (in the default implementation, the result is <code class=\"language-plaintext highlighter-rouge\">true<\/code> if and only if <code class=\"language-plaintext highlighter-rouge\">spec<\/code> is an instance of <code class=\"language-plaintext highlighter-rouge\">TSpec<\/code>).\r\n<h4>Version space algebra<\/h4>\r\nThe return type of <code class=\"language-plaintext highlighter-rouge\">Learn<\/code> in <code class=\"language-plaintext highlighter-rouge\">SynthesisStrategy<\/code> is <code class=\"language-plaintext highlighter-rouge\">Microsoft.ProgramSynthesis.VersionSpace.ProgramSet<\/code>. This is an abstract base class for our representation for the sets of programs. A <strong>version space<\/strong>, defined by Mitchell and Lau , is a succinct representation of a program set (<em>hypothesis space<\/em> in machine learning community), consistent with a specification. A version space can be defined explicitly (as a set of programs), or composed from smaller version spaces using standard set operators. The latter property is a key to succinctness of version spaces: representing exponential sets of programs using composition operators requires only polynomial space. Such a structure defines an <em>algebra<\/em> over primitive version spaces, hence the name.\r\n\r\nAn abstract <code class=\"language-plaintext highlighter-rouge\">ProgramSet<\/code> class defines the following contract:\r\n\r\n[code autolinks=\"true\" highlight=\"\"]\r\npublic abstract class ProgramSet : ILanguage\r\n{\r\n    protected ProgramSet(Symbol symbol)\r\n    {\r\n        Symbol = symbol;\r\n    }\r\n\r\n    public Symbol Symbol { get; private set; }\r\n\r\n    public abstract IEnumerable&lt;ProgramNode&gt; RealizedPrograms { get; }\r\n\r\n    public abstract ulong Size { get; }\r\n\r\n    public virtual bool IsEmpty\r\n    {\r\n        get { return RealizedPrograms.IsEmpty(); }\r\n    }\r\n\r\n    public abstract ProgramSet Intersect(ProgramSet other);\r\n\r\n    public Dictionary&lt;object, ProgramSet&gt; ClusterOnInput(State input);\r\n\r\n    public Dictionary&lt;object[], ProgramSet&gt; ClusterOnInputs(IEnumerable&lt;State&gt; input);\r\n\r\n    public abstract XElement ToXML();\r\n}[\/code]\r\n\r\nThe property <code class=\"language-plaintext highlighter-rouge\">RealizedPrograms<\/code> calculates the set of programs stored in the version space.\r\n<h4>Direct version space<\/h4>\r\nDirect version space is a primitive version space that represents a set of programs explicitly, by storing a reference to <code class=\"language-plaintext highlighter-rouge\">IEnumerable&lt;ProgramSet&gt;<\/code>. Since <code class=\"language-plaintext highlighter-rouge\">IEnumerable&lt;T&gt;<\/code> in .NET is lazy, storing it in a version space does not by itself enumerate it into an explicit set. This allows you to implement a synthesis strategy as an <em>iterator<\/em> in C# (with <strong>yield return<\/strong> statements), which calculates the required number of consistent programs on the fly, as needed. Moreover, such an iterator can even be theoretically infinite, as long as the end-user requests only a finite number of consistent programs.\r\n<h4>Union version space<\/h4>\r\nA union of\u00a0<strong>k<\/strong> version spaces is a version space that contains those and only those programs that belong to at least one of the given\u00a0<strong>k<\/strong> spaces. Such a version space naturally arises when we want to learn a set of programs that start with a certain grammar symbol by learning each of its possible RHS rules automatically. For example:\r\n\r\n[code class=\"language-dsl\"]P := F(A, B) | G(C, D) [\/code]\r\n\r\nHere <strong>P,A,B,C,D<\/strong> are grammar symbols, and <strong>F,G<\/strong> are black-box operators. Given a learning task (P,&amp;phi;), one way to resolve it is to independently learn a set of programs &amp;tilde;F of form F(?A, ?B), and a set of programs &amp;tilde;G of form G(?C, ?D). If all the programs in &amp;tilde;F and &amp;tilde;G are consistent with &amp;phi;, then &amp;tildeF &amp;cup; &amp;tilde;G is a valid answer to the outer learning task (P,&amp;phi;).\r\n<h4>Join version space<\/h4>\r\nA join version space is defined for a single operator <code class=\"language-plaintext highlighter-rouge\">P := F( E&lt;sub&gt;1&lt;\/sub&gt;,. . .,E&lt;sub&gt;k&lt;\/sub&gt;)<\/code>. It represents a set of programs &amp;tilde;P formed by a Cartesian product of parameter version spaces <code class=\"language-plaintext highlighter-rouge\">&amp;tilde;{E&lt;sub&gt;1&lt;\/sub&gt;},. . .,&amp;tilde;{E&lt;sub&gt;k&lt;\/sub&gt;}<\/code>. In general, not every combination of the parameter programs from this Cartesian product is a valid sequence of parameters for the operator <strong>F<\/strong>, thus join version space depends on the operator logic to filter out the invalid combinations.\r\n\r\nThe table below outlines the APIs for building version spaces in the Microsoft Program Synthesis using Examples framework:\r\n<table class=\" msr-table-schedule all-borders\" style=\"border-collapse: collapse;width: 80%;border-spacing: inherit\">\r\n<thead>\r\n<tr>\r\n<th>Version space<\/th>\r\n<th>Given<\/th>\r\n<th>Builder code<\/th>\r\n<\/tr>\r\n<\/thead>\r\n<tbody>\r\n<tr>\r\n<td>Empty<\/td>\r\n<td><code class=\"language-plaintext highlighter-rouge\">Symbol s<\/code><\/td>\r\n<td><code class=\"language-plaintext highlighter-rouge\">ProgramSet.Empty(s)<\/code><\/td>\r\n<\/tr>\r\n<tr>\r\n<td>Explicit list of programs<\/td>\r\n<td><code class=\"language-plaintext highlighter-rouge\">Symbol s<\/code>\r\n<code class=\"language-plaintext highlighter-rouge\">ProgramNode p1, \u2026, pk<\/code><\/td>\r\n<td><code class=\"language-plaintext highlighter-rouge\">ProgramSet.List(s, p1, \u2026, pk)<\/code><\/td>\r\n<\/tr>\r\n<tr>\r\n<td>Lazy stream of programs<\/td>\r\n<td><code class=\"language-plaintext highlighter-rouge\">Symbol s<\/code>\r\n<code class=\"language-plaintext highlighter-rouge\">IEnumerable&lt;ProgramNode&gt; stream<\/code><\/td>\r\n<td><code class=\"language-plaintext highlighter-rouge\">new DirectProgramSet(s, stream)<\/code><\/td>\r\n<\/tr>\r\n<tr>\r\n<td>Union of version spaces for the rules<\/td>\r\n<td><code class=\"language-plaintext highlighter-rouge\">Symbol s<\/code>\r\n<code class=\"language-plaintext highlighter-rouge\">ProgramSet v1, \u2026, vk<\/code><\/td>\r\n<td><code class=\"language-plaintext highlighter-rouge\">new UnionProgramSet(s, v1, \u2026, vk)<\/code><\/td>\r\n<\/tr>\r\n<tr>\r\n<td>Join of version spaces for the parameters<\/td>\r\n<td><code class=\"language-plaintext highlighter-rouge\">Symbol s<\/code>\r\n<code class=\"language-plaintext highlighter-rouge\">GrammarRule r \/\/ r.Head == s<\/code>\r\n<code class=\"language-plaintext highlighter-rouge\">ProgramSet v1, \u2026, vk<\/code><\/td>\r\n<td><code class=\"language-plaintext highlighter-rouge\">new JoinProgramSet(r, v1, \u2026, vk)<\/code><\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<div style=\"height: 30px\"><\/div>\r\n<div>\r\n\r\n<hr \/>\r\n\r\n<h3>Keeping documentation up to date is a challenge.<\/h3>\r\n<ul>\r\n \t<li>If you find something that doesn't work for you, please <a href=\"mailto:prose-contact@microsoft.com\">contact us<\/a>.<\/li>\r\n \t<li>In the meantime, check out our <a href=\"https:\/\/github.com\/Microsoft\/prose\">sample code<\/a> to learn the API.<\/li>\r\n<\/ul>\r\n\r\n<hr \/>\r\n\r\n<\/div>\r\n<div style=\"height: 30px\"><\/div>\r\n<div class=\"footnotes\">\r\n<h3>Footnotes<\/h3>\r\n<ol>\r\n \t<li>In other words, <strong>null<\/strong> is used as a special value &amp;bot; that is typically found in a formal definition of a language. <a class=\"reversefootnote\" href=\"1\">\u21a9<\/a><\/li>\r\n \t<li>Following Haskell syntax, we start our lambda functions with the \u201c\\\u201d character, which is supposed to approximately represent the letter &amp;lambda;. <a class=\"reversefootnote\" href=\"3\">\u21a9<\/a><\/li>\r\n \t<li>The common recipe that we use in our development is to reference the semantics DLL from the main project in a Visual Studio solution. This way, the semantics DLL is automatically copied to the target subdirectory on each build next to the main executable, and you can refer to it in the grammar string by simply using its filename. Alternatively, you can specify additional library paths as extra parameters to the <code class=\"language-plaintext highlighter-rouge\">Compile<\/code> method. <a class=\"reversefootnote\" href=\"4\">\u21a9<\/a><\/li>\r\n<\/ol>\r\n<\/div>"},{"id":2,"name":"Grammar syntax","content":"<h2>Grammar syntax<\/h2>\r\nThis section describes the syntax of PROSE v1 DSL grammars.\r\n<p style=\"padding-left: 40px\"><em><strong>Note<\/strong>: this syntax is volatile and subject to breaking changes in the future. We will notify you about each breaking change in accordance with SemVer.<\/em><\/p>\r\n<p style=\"padding-left: 40px\"><em>Hereafter in the general syntax descriptions, angle brackets &lt;&gt; denote a placeholder to be filled with a concrete value, and curly braces {} with an opt subscript denote an optional component (unless specified otherwise).<\/em><\/p>\r\n\r\n<h3>Basic structure<\/h3>\r\nThe basic structure of a <code class=\"language-plaintext highlighter-rouge\">*.grammar<\/code> file is as follows:\r\n\r\n[code]&lt;Namespace usings&gt;\r\n&lt;Semantics usings&gt;\r\n&lt;Learner usings&gt;\r\n\r\nlanguage &lt;Name&gt;;\r\n&lt;Feature declarations&gt;\r\n\r\n&lt;Nonterminal symbols and rules&gt;\r\n&lt;Terminal symbols&gt;\r\n[\/code]\r\n\r\nIt first specifies some metadata about the DSL, and then describes it as a grammar. A PROSE language is represented as a <a href=\"https:\/\/en.wikipedia.org\/wiki\/Context-free_grammar\">context-free grammar<\/a> \u2013 <em>i.e.,<\/em> as a set of <em>rules<\/em>, where each <em>symbol<\/em> on the left-hand side is bound to a set of possible expansions of this symbol on the right-hand side.\r\n<h3>Usings<\/h3>\r\n<h4>Namespace usings<\/h4>\r\nThese statements are identical to the corresponding C# forms. They import a namespace into the current scope.\r\n\r\n[code class=\"language-bnf\"]using System.Text.RegularExpressions;\r\n[\/code]\r\n\r\n<h4>Semantics usings<\/h4>\r\nThese statements specify <em>semantics holders<\/em> \u2013 static classes that contain implementations of the grammar\u2019s operators. There may be more than one semantics holder, as long as their members do not conflict with each other.\r\n\r\n[code class=\"language-bnf\"]using semantics TestLanguage.Semantics.Holder;\r\n[\/code]\r\n\r\n<h4>Learner usings<\/h4>\r\nThese statements specify <em>learning logic holders<\/em> \u2013 non-static classes that inherit <code class=\"language-plaintext highlighter-rouge\">DomainLearningLogic<\/code> and contain domain-specific helper learning logic such as <a href=\"https:\/\/microsoft.github.io\/prose\/documentation\/prose\/backpropagation\/#witness-functions\">witness functions<\/a> and value generators. There may be more than one learning logic holder.\r\n\r\n[code class=\"language-bnf\"]using learners TestLanguage.Learning.LogicHolder;\r\n[\/code]\r\n\r\n<h3>Language name<\/h3>\r\nMay be any valid C# <em>full type identifier<\/em> \u2013 that is, a dot-separated string where each element is a valid C# identifier.\r\n<h3>Features<\/h3>\r\nA <a href=\"https:\/\/microsoft.github.io\/prose\/documentation\/prose\/tutorial\/#features\">feature<\/a> is a computed property on an AST in the language. Each such property has a name, type, and associated <em>feature calculator<\/em> functions that compute the value of this property on each given AST. A feature may be declared as <em>complete<\/em>, which requires it to be defined on every possible AST kind in the language. By default, a feature is not complete. <strong>Syntax:<\/strong>\r\n\r\n[code class=\"language-bnf\"]{@complete} feature &lt;Type&gt; &lt;Name&gt; = &lt;Implementation class 1&gt;, \u2026, &lt;Implementation class N&gt;;\r\n[\/code]\r\n\r\nHere <code class=\"language-plaintext highlighter-rouge\">@complete<\/code> is an optional completeness annotation, and the comma-separated identifiers on the right specify one or more classes that inherit <code class=\"language-plaintext highlighter-rouge\">Feature&lt;T&gt;<\/code> and provide implementations of the feature calculators. Notice that one feature may be implemented in multiple possible ways (<em>e.g.<\/em> the program\u2019s <code class=\"language-plaintext highlighter-rouge\">RankingScore<\/code> may be computed differently as <code class=\"language-plaintext highlighter-rouge\">LikelihoodScore<\/code> or <code class=\"language-plaintext highlighter-rouge\">PerformanceScore<\/code>, depending on the requirements), thus it is possible to specify multiple implementation classes for the same feature. A feature class does <em>not<\/em> have to be specified in the <code class=\"language-plaintext highlighter-rouge\">*.grammar<\/code> file to properly interact with the framework. As long as it inherits <code class=\"language-plaintext highlighter-rouge\">Feature&lt;T&gt;<\/code> and holds the required feature calculator implementations, its instances may be used at runtime to compute the value of the corresponding feature on the ASTs. However, specifying it in the <code class=\"language-plaintext highlighter-rouge\">*.grammar<\/code> file provides additional information to the <code class=\"language-plaintext highlighter-rouge\">dslc<\/code> grammar compiler. The compiler can then verify your feature calculator definitions and provide more detailed error messages. <strong>Example:<\/strong>\r\n\r\n[code class=\"language-bnf\"]using TestLanguage.ScoreFunctions;\r\n@complete feature double RankingScore = LikelihoodScore, PerformanceScore, ReadabilityScore;\r\n\r\n\/\/ alternatively:\r\n@complete feature double RankingScore = TestLanguage.ScoreFunctions.LikelihoodScore,\r\n                                        TestLanguage.ScoreFunctions.PerformanceScore,\r\n                                        TestLanguage.ScoreFunctions.ReadabilityScore;\r\n\r\nfeature HashSet&lt;int&gt; UsedConstants = TestLanguage.UsedConstantsCalculator;\r\n[\/code]\r\n\r\n<h3>Terminal rules<\/h3>\r\nEach <em>terminal symbol<\/em> of the grammar is associated with its own unique <em>terminal rule<\/em>. Terminal rules specify the leaf symbols that will be replaced with literal constants or variables in the AST. For example:\r\n<ul>\r\n \t<li>A terminal rule <code class=\"language-plaintext highlighter-rouge\">int k;<\/code> specifies a symbol k that represents a literal integer constant.<\/li>\r\n \t<li>A terminal rule <code class=\"language-plaintext highlighter-rouge\">@input string v;<\/code> specifies a <em>variable<\/em> symbol v that contains program input data <em>at runtime.<\/em><\/li>\r\n<\/ul>\r\n<strong>Syntax:<\/strong>\r\n\r\n[code class=\"language-bnf\"]{@values[&lt;Generator member&gt;]} {@input} &lt;Type&gt; &lt;Symbol name&gt;;\r\n[\/code]\r\n\r\n<h4>Annotations<\/h4>\r\n<h5><code class=\"language-plaintext highlighter-rouge\">@input<\/code><\/h5>\r\nDenotes the input variable passed to the DSL programs. A DSL program may depend only on a single input variable, although of an arbitrary type.\r\n<h5><code class=\"language-plaintext highlighter-rouge\">@values<\/code><\/h5>\r\nA user can specify the list of possible values that a literal symbol can be set to. This is done with a <strong>@values[<\/strong>G<strong>]<\/strong> annotation, where G is a <strong>value generator<\/strong> \u2013 a reference to a user-defined static field, property, or method. The compiler will search for G in the provided learning logic holders, and will report an error if it does not find a type-compatible member. <strong>Example:<\/strong> given the following declaration of a terminal <code class=\"language-plaintext highlighter-rouge\">s<\/code>:\r\n\r\n[code class=\"language-bnf\"]using learners TestLanguage.Learners;\r\n@values[StringGen] string s;\r\n[\/code]\r\n\r\nany of the following members in <code class=\"language-plaintext highlighter-rouge\">TestLanguage.Learners<\/code> can serve as a generator for <code class=\"language-plaintext highlighter-rouge\">s<\/code>:\r\n<div class=\"language-csharp highlighter-rouge\">\r\n<div class=\"highlight\">\r\n\r\n[code autolinks=\"true\" highlight=\"\"]\r\nnamespace TestLanguage\r\n{\r\n\tpublic class Learners : DomainLearningLogic\r\n\t{\r\n\t\t\/\/ Field\r\n\t\tpublic static string[] StringGen = {\"\", \"42\", \"foobar\"};\r\n\r\n\t\t\/\/ Property\r\n\t\tpublic static string[] StringGen =&gt; new[] {\"\", \"42\", \"foobar\"};\r\n        public static string[] StringGen {\r\n            get { return new[] {\"\", \"42\", \"foobar\"}; }\r\n        }\r\n\r\n\t\t\/\/ Method\r\n\t\tpublic static string[] StringGen() =&gt; new[] {\"\", \"42\", \"foobar\"};\r\n        public static string[] StringGen() {\r\n            return new[] {\"\", \"42\", \"foobar\"};\r\n        }\r\n\t}\r\n}[\/code]\r\n\r\n<\/div>\r\n<\/div>\r\n<h3>Nonterminal rules<\/h3>\r\nA <em>nonterminal rule<\/em> describes a possible <a href=\"https:\/\/en.wikipedia.org\/wiki\/Production_(computer_science)\">production<\/a> in a context-free grammar of the DSL. In contrast to conventional programming languages, the productions of PROSE grammars describe not the surface syntax of DSL programs, but their direct semantics as ASTs. In other words, where a conventional context-free grammar would specify something like\r\n<div class=\"language-plaintext highlighter-rouge\">\r\n<div class=\"highlight\">\r\n\r\n[code]expression := atom '+' atom | atom '-' atom ;\r\n[\/code]\r\n\r\n<\/div>\r\n<\/div>\r\nthe corresponding PROSE grammar specifies\r\n\r\n[code class=\"language-dsl\"]expression := Plus(atom, atom) | Minus(atom, atom) ;\r\n[\/code]\r\n\r\nThis snippet contains two nonterminal rules <code class=\"language-plaintext highlighter-rouge\">expression := Plus(atom, atom)<\/code> and <code class=\"language-plaintext highlighter-rouge\">expression := Minus(atom, atom)<\/code>. The functions <code class=\"language-plaintext highlighter-rouge\">Plus<\/code> and <code class=\"language-plaintext highlighter-rouge\">Minus<\/code> are <em>operators<\/em> in the grammar \u2013 domain-specific functions that may be included as steps of your DSL programs. Thusly, PROSE DSLs do not have a syntax \u2013 they directly describe a grammar of possible domain-specific program actions.\r\n<h4>Structure<\/h4>\r\nEvery nonterminal rule has a <em>head<\/em> and a <em>body<\/em>. Its head is a typed <em>nonterminal symbol<\/em> on the left-hand side of the production. Its body is a sequence of free symbols on the right-hand side, which may be nonterminal or terminal (i.e. variables or constants). There exist multiple different kinds of nonterminal rules, which differ in their semantics as well as in the roles of the symbols in their bodies. <strong>Syntax:<\/strong>\r\n\r\n[code class=\"language-bnf\"]{@start} &lt;Type&gt; &lt;Symbol name&gt; := &lt;Rule 1&gt; | ... | &lt;Rule N&gt;;\r\n[\/code]\r\n\r\n<h4>Annotations<\/h4>\r\n<h5><code class=\"language-plaintext highlighter-rouge\">@start<\/code><\/h5>\r\nAn optional annotation that specifies the <em>start symbol<\/em> of the grammar \u2013 that is, the root nonterminal symbol of all programs in this DSL."},{"id":3,"name":"Backpropagation","content":"<h2>Backpropagation<\/h2>\r\nDeductive strategy (i.e., backpropagation) is the main synthesis algorithm used by the PROSE SDK. It relies on external annotations, provided by the DSL designer for the language operators \u2013 witness functions. Our tutorial and samples show many use cases for specific witness functions.\r\n<h3>Witness Functions<\/h3>\r\nA witness function is a domain-specific deductive procedure for a parameter k of an operator F, that, given an outer spec \u03d5 on F, answer a question: \u201cWhat should a program used as this parameter satisfy in order for the entire F expression to satisfy \u03d5?\u201d In other words, witness functions backpropagate specifications from expressions to their subexpressions.\r\n\r\nThere are two kinds of witness functions: non-conditional and conditional.\r\n<h4>Non-conditional<\/h4>\r\nNon-conditional witness functions have the following signature:\r\n\r\n[code autolinks=\"true\" highlight=\"\"]\r\n[WitnessFunction(\"OperatorName\", paramIndex)]\r\nSpec Witness(GrammarRule rule, Spec outerSpec);[\/code]\r\n\r\nSince PROSE uses .NET reflection to extract information about witness functions, you should make the actual types in your signature as precise as possible.\r\nIn particular, the <code class=\"language-plaintext highlighter-rouge\">outerSpec<\/code> parameter specifies what kind of spec for\u00a0<strong>F<\/strong> your witness function can handle.\r\nTypically a different witness function is written for each spec kind: it takes a different requirement to satisfy a <code class=\"language-plaintext highlighter-rouge\">PrefixSpec<\/code> that an <code class=\"language-plaintext highlighter-rouge\">ExampleSpec<\/code>.\r\n\r\nA witness function may produce an <em>overapproximation<\/em> to the required spec instead of a necessary and sufficient spec for the parameter <strong>k<\/strong>.\r\nPROSE can still use such a witness function, but it should be marked with <code class=\"language-plaintext highlighter-rouge\">Verify = true<\/code> in its <code class=\"language-plaintext highlighter-rouge\">[WitnessFunction]<\/code> metadata attribute.\r\n\r\nIf <code class=\"language-plaintext highlighter-rouge\">outerSpec<\/code> is inconsistent (no program can possibly satisfy it), witness function should return <code class=\"language-plaintext highlighter-rouge\">null<\/code>.\r\n<code class=\"language-plaintext highlighter-rouge\">null<\/code> in PROSE is a placeholder for an \u201calways false\u201d spec.\r\n(An \u201calways true\u201d spec is called <code class=\"language-plaintext highlighter-rouge\">TopSpec<\/code>).\r\n<h4>Conditional<\/h4>\r\nConditional witness functions depend not only on an outer spec on their operator, but also possibly on some other parameters of that operator.\r\nThey have the following signature:\r\n\r\n[code autolinks=\"true\" highlight=\"\"]\r\n[WitnessFunction(\"OperatorName\", paramIndex, DependsOnParameters = new[] { prereqParam1, prereqParam2, ... }]\r\nSpec Witness(GrammarRule rule, Spec outerSpec, Spec prereqSpec1, Spec prereqSpec2, ...);[\/code]\r\n\r\nAs with non-conditional witness functions, prerequisite specs in the signature should be as precise as possible.\r\nTypically they will be <code class=\"language-plaintext highlighter-rouge\">ExampleSpec<\/code>s: deductive reasoning is easiest when you know precisely some fixed value of a prerequisite on the same input state.\r\n\r\nYou can use <code class=\"language-plaintext highlighter-rouge\">DependsOnSymbols = new[] { prereqName1, prereqName2, ... }<\/code> in the attribute, referring to parameter names instead of their indices (if they are unambiguous).\r\n<h4>ID Annotations<\/h4>\r\nIf a target grammar rule does not have a name (for instance, it is a <code class=\"language-plaintext highlighter-rouge\">let<\/code> rule or a conversion rule <code class=\"language-plaintext highlighter-rouge\">A := B<\/code>), you can use an <code class=\"language-plaintext highlighter-rouge\">@id<\/code> annotation in the grammar file to give it one, and then use this name as a reference in <code class=\"language-plaintext highlighter-rouge\">[WitnessFunction]<\/code> attributes.<code class=\"language-dsl\"><\/code><code><\/code>\r\n\r\n[code autolinks=\"true\" highlight=\"\"]\r\nstring expr := @id['LetSubstring'] let x = ChooseInput(inputs, k) in SubStr(x, posPair);\r\n\r\n[WitnessFunction(\"LetSubstring\", 0)]\r\nSpec Witness(LetRule rule, Spec outerSpec);[\/code]\r\n\r\n<code>\r\n<\/code>\r\n\r\nIn case of a <code class=\"language-plaintext highlighter-rouge\">let<\/code> rule, it has two parameters: its \u201cbinding\u201d expression (the part on the right-hand side of an equal sign) and its \u201cbody\u201d expression (the part after <strong><code class=\"language-plaintext highlighter-rouge\">in<\/code><\/strong>).\r\nPROSE provides an automatic witness function for the body parameter, so you only to write one for the binding parameter (whose index in the containing <code class=\"language-plaintext highlighter-rouge\">let<\/code> rule is <strong>o<\/strong>).\r\n<h3>Rule Learners<\/h3>\r\n<em>Rule learners<\/em> are designed for use cases when you cannot express you deductive logic in terms of witness functions on individual parameters.\r\nThey are mini-strategies: search algorithms for one grammar rule.\r\n<blockquote><strong>Note:<\/strong> Usage of rule learners is generally discouraged: if you can describe deductive reasoning as a witness function, PROSE framework can do a more aggressive optimization of its search process.<\/blockquote>\r\nA rule learner has the following signature:\r\n\r\n[code autolinks=\"true\" highlight=\"\"]\r\n[RuleLearner(\"OperatorName\")]\r\nOptional&lt;ProgramSet&gt; Learn(SynthesisEngine engine, GrammarRule rule, LearningTask&lt;Spec&gt; task, CancellationToken token);[\/code]\r\n\r\nYou can make recursive calls to <code class=\"language-plaintext highlighter-rouge\">engine.LearnSymbol<\/code> in your rule learner to solve deductive subproblems.\r\nThe final result should be constructed as a <code class=\"language-plaintext highlighter-rouge\">Optional&lt;ProgramSet&gt;<\/code> out of such subproblem results.\r\nThe learning task may have a more precise <code class=\"language-plaintext highlighter-rouge\">Spec<\/code> type, for example <code class=\"language-plaintext highlighter-rouge\">DisjunctiveExamplesSpec<\/code> or <code class=\"language-plaintext highlighter-rouge\">ExampleSpec<\/code>.\r\n<blockquote><strong>Note:<\/strong> it is a good .NET practice to check on the given <code class=\"language-plaintext highlighter-rouge\">CancellationToken<\/code> regularly and throw a <code class=\"language-plaintext highlighter-rouge\">TaskCancelledException<\/code> when you detect a cancellation request.<\/blockquote>"},{"id":4,"name":"Standard concepts","content":"<h2>Standard concepts<\/h2>\r\n<strong><em>Standard concepts<\/em><\/strong> are built-in PROSE operators. They have predefined semantics and, most of the time, witness functions for <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/project\/prose-framework\/#!backpropagation\">backpropagation<\/a>. Thus, you can use standard concepts arbitrarily in your own DSLs without reimplementing them or designing your own synthesis procedures for them.\r\n<h3>Concept rules<\/h3>\r\nThe grammar syntax for a simple <em>concept rule<\/em> looks as follows:\r\n\r\n[code class=\"language-dsl\"]Type P := CustomOperatorName(E1, ..., Ek) = ConceptName(E1, ..., Ek);\r\n[\/code]\r\n\r\nHere <code class=\"language-plaintext highlighter-rouge\">ConceptName<\/code> is the name of the concept (<em>e.g.<\/em> <code class=\"language-plaintext highlighter-rouge\">Pair<\/code>), and <code class=\"language-plaintext highlighter-rouge\">CustomOperatorName<\/code> is a DSL-specific name for <em>this particular usage<\/em> of the concept (<em>e.g.<\/em> <code class=\"language-plaintext highlighter-rouge\">RegexPair<\/code>).\r\nThe parameter symbols <code class=\"language-plaintext highlighter-rouge\">E1<\/code>, \u2026, <code class=\"language-plaintext highlighter-rouge\">Ek<\/code> are passed directly to the standard concept.\r\n<blockquote><strong>Example:<\/strong> in the <code class=\"language-plaintext highlighter-rouge\">SubstringExtraction<\/code> language in our <a href=\"https:\/\/microsoft.github.io\/prose\/documentation\/prose\/tutorial\">tutorial<\/a>, the standard concept <code class=\"language-plaintext highlighter-rouge\">Pair<\/code> is referenced as follows:\r\n\r\n[code class=\"language-dsl\"]Tuple&lt;Regex, Regex&gt; positionBoundaries := RegexPair(r, r) = Pair(r, r);\r\nRegex r;\r\n[\/code]\r\n\r\nThe two <code class=\"language-plaintext highlighter-rouge\">r<\/code> parameters denote the regexes to the left and to the right of a given position boundary. They are united into a tuple with a standard concept <code class=\"language-plaintext highlighter-rouge\">Pair<\/code>.<\/blockquote>\r\n<h3>Lambda functions<\/h3>\r\nMore complex concept rules may include a <em>lambda function<\/em> on their right-hand side.\r\nFor instance, a list-processing operator <code class=\"language-plaintext highlighter-rouge\">Filter<\/code> takes as input a predicate of type <code class=\"language-plaintext highlighter-rouge\">Func&lt;T, bool&gt;<\/code> and a sequence of type <code class=\"language-plaintext highlighter-rouge\">IEnumerable&lt;T&gt;<\/code>, and returns the filtered subsequence of elements that satisfy the predicate.\r\nHere is a complete example of referencing <code class=\"language-plaintext highlighter-rouge\">Filter<\/code> in a DSL:\r\n\r\n[code class=\"language-dsl\"]#reference 'file:TestLanguage.dll';\r\nusing semantics TestLanguage.Filtering;\r\nusing learners TestLanguage.Filtering.Learners;\r\n\r\nlanguage Filtering;\r\n\r\n@input string[] v;\r\nRegex r;\r\nbool f := Match(x, r) | True();\r\n@start string[] P := Selected(f, v) = Filter(\\x: string =&gt; f, v);\r\n[\/code]\r\n\r\nHere the <em>custom operator<\/em> <code class=\"language-plaintext highlighter-rouge\">Selected(f, v)<\/code> is implemented as a <em>concept rule<\/em> <code class=\"language-plaintext highlighter-rouge\">Filter<\/code>.\r\nThe first parameter of <code class=\"language-plaintext highlighter-rouge\">Filter<\/code> is a lambda <code class=\"language-plaintext highlighter-rouge\">\\x: string =&gt; f<\/code>, and the second one is <code class=\"language-plaintext highlighter-rouge\">v<\/code>.\r\nNotice that <em>you can either use a formal parameter directly in a concept, or pass it down into a lambda body<\/em>.\r\n<h3>List of concepts<\/h3>\r\n<table class=\" msr-table-schedule all-borders\" style=\"border-collapse: collapse;width: 80%;border-spacing: inherit\">\r\n<thead>\r\n<tr>\r\n<th style=\"padding: inherit;border: inherit\">Concept<\/th>\r\n<th style=\"padding: inherit;border: inherit\">Semantics<\/th>\r\n<th style=\"padding: inherit;border: inherit\">Specs handled by PROSE<\/th>\r\n<th style=\"padding: inherit;border: inherit\">Witness functions needed?<\/th>\r\n<\/tr>\r\n<\/thead>\r\n<tbody>\r\n<tr>\r\n<td style=\"padding: inherit;border: inherit\"><strong>Pair<\/strong>(<code class=\"language-plaintext highlighter-rouge\">x: T<\/code>, <code class=\"language-plaintext highlighter-rouge\">y: T<\/code>): <code class=\"language-plaintext highlighter-rouge\">Tuple&lt;T, T&gt;<\/code><\/td>\r\n<td style=\"padding: inherit;border: inherit\">Combine <code class=\"language-plaintext highlighter-rouge\">x<\/code> and <code class=\"language-plaintext highlighter-rouge\">y<\/code> in a tuple<\/td>\r\n<td style=\"padding: inherit;border: inherit\"><code class=\"language-plaintext highlighter-rouge\">DisjunctiveExamplesSpec<\/code><\/td>\r\n<td style=\"padding: inherit;border: inherit\">\u2014<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"padding: inherit;border: inherit\"><strong>Map<\/strong>(<code class=\"language-plaintext highlighter-rouge\">f: Func&lt;T, U&gt;<\/code>, <code class=\"language-plaintext highlighter-rouge\">seq: IEnumerable&lt;T&gt;<\/code>): <code class=\"language-plaintext highlighter-rouge\">IEnumerable&lt;U&gt;<\/code><\/td>\r\n<td style=\"padding: inherit;border: inherit\">Apply <code class=\"language-plaintext highlighter-rouge\">f<\/code> to each element of <code class=\"language-plaintext highlighter-rouge\">seq<\/code>, and return a sequence of results<\/td>\r\n<td style=\"padding: inherit;border: inherit\"><code class=\"language-plaintext highlighter-rouge\">PrefixSpec<\/code><\/td>\r\n<td style=\"padding: inherit;border: inherit\"><code class=\"language-plaintext highlighter-rouge\">seq<\/code><\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"padding: inherit;border: inherit\"><strong>Filter<\/strong>(<code class=\"language-plaintext highlighter-rouge\">f: Func&lt;T, bool&gt;<\/code>, <code class=\"language-plaintext highlighter-rouge\">seq: IEnumerable&lt;T&gt;<\/code>): <code class=\"language-plaintext highlighter-rouge\">IEnumerable&lt;T&gt;<\/code><\/td>\r\n<td style=\"padding: inherit;border: inherit\">Return only those elements of <code class=\"language-plaintext highlighter-rouge\">seq<\/code> that satisfy the predicate <code class=\"language-plaintext highlighter-rouge\">f<\/code><\/td>\r\n<td style=\"padding: inherit;border: inherit\"><code class=\"language-plaintext highlighter-rouge\">PrefixSpec<\/code>, <code class=\"language-plaintext highlighter-rouge\">SubsequenceSpec<\/code>, <code class=\"language-plaintext highlighter-rouge\">ExampleSpec<\/code><\/td>\r\n<td style=\"padding: inherit;border: inherit\">\u2014<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"padding: inherit;border: inherit\"><strong>FilterNot<\/strong>(<code class=\"language-plaintext highlighter-rouge\">f: Func&lt;T, bool&gt;<\/code>, <code class=\"language-plaintext highlighter-rouge\">seq: IEnumerable&lt;T&gt;<\/code>): <code class=\"language-plaintext highlighter-rouge\">IEnumerable&lt;T&gt;<\/code><\/td>\r\n<td style=\"padding: inherit;border: inherit\">Return only those elements of <code class=\"language-plaintext highlighter-rouge\">seq<\/code> that <strong>do not<\/strong> satisfy <code class=\"language-plaintext highlighter-rouge\">f<\/code><\/td>\r\n<td style=\"padding: inherit;border: inherit\"><code class=\"language-plaintext highlighter-rouge\">PrefixSpec<\/code>, <code class=\"language-plaintext highlighter-rouge\">SubsequenceSpec<\/code>,<\/td>\r\n<td style=\"padding: inherit;border: inherit\">\u2014<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"padding: inherit;border: inherit\"><strong>Kth<\/strong>(<code class=\"language-plaintext highlighter-rouge\">seq: IEnumerable&lt;T&gt;<\/code>, <code class=\"language-plaintext highlighter-rouge\">k: int<\/code>): <code class=\"language-plaintext highlighter-rouge\">T<\/code><\/td>\r\n<td style=\"padding: inherit;border: inherit\">Return an element of <code class=\"language-plaintext highlighter-rouge\">seq<\/code> at the specified index, from the left if k &gt;= 0 or from the right if k &lt; 0<\/td>\r\n<td style=\"padding: inherit;border: inherit\"><code class=\"language-plaintext highlighter-rouge\">DisjunctiveExamplesSpec<\/code><\/td>\r\n<td style=\"padding: inherit;border: inherit\">\u2014<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"padding: inherit;border: inherit\"><strong>TakeWhile<\/strong>(<code class=\"language-plaintext highlighter-rouge\">f: Func&lt;T, bool&gt;<\/code>, <code class=\"language-plaintext highlighter-rouge\">seq: IEnumerable&lt;T&gt;<\/code>): <code class=\"language-plaintext highlighter-rouge\">IEnumerable&lt;T&gt;<\/code><\/td>\r\n<td style=\"padding: inherit;border: inherit\">Return the longest prefix of <code class=\"language-plaintext highlighter-rouge\">seq<\/code> where all the elements satisfy <code class=\"language-plaintext highlighter-rouge\">f<\/code><\/td>\r\n<td style=\"padding: inherit;border: inherit\"><code class=\"language-plaintext highlighter-rouge\">SubsequenceSpec<\/code><\/td>\r\n<td style=\"padding: inherit;border: inherit\">\u2014<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"padding: inherit;border: inherit\"><strong>FilterInt<\/strong>(<code class=\"language-plaintext highlighter-rouge\">initIter: Tuple&lt;int, int&gt;<\/code>, <code class=\"language-plaintext highlighter-rouge\">seq: IEnumerable&lt;T&gt;<\/code>): <code class=\"language-plaintext highlighter-rouge\">IEnumerable&lt;T&gt;<\/code><\/td>\r\n<td style=\"padding: inherit;border: inherit\">Return a subsequence of <code class=\"language-plaintext highlighter-rouge\">seq<\/code> defined by the arithmetic progression starting at the index <code class=\"language-plaintext highlighter-rouge\">initIter.Item1<\/code> with the step <code class=\"language-plaintext highlighter-rouge\">initIter.Item2<\/code><\/td>\r\n<td style=\"padding: inherit;border: inherit\"><code class=\"language-plaintext highlighter-rouge\">PrefixSpec<\/code>, <code class=\"language-plaintext highlighter-rouge\">SubsequenceSpec<\/code><\/td>\r\n<td style=\"padding: inherit;border: inherit\">\u2014<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"padding: inherit;border: inherit\"><strong>First<\/strong>(<code class=\"language-plaintext highlighter-rouge\">f: Func&lt;T, bool&gt;<\/code>, <code class=\"language-plaintext highlighter-rouge\">seq: IEnumerable&lt;T&gt;<\/code>): <code class=\"language-plaintext highlighter-rouge\">T<\/code><\/td>\r\n<td style=\"padding: inherit;border: inherit\">Return the first element of <code class=\"language-plaintext highlighter-rouge\">seq<\/code> that satisfies <code class=\"language-plaintext highlighter-rouge\">f<\/code><\/td>\r\n<td style=\"padding: inherit;border: inherit\"><code class=\"language-plaintext highlighter-rouge\">ExampleSpec<\/code><\/td>\r\n<td style=\"padding: inherit;border: inherit\">\u2014<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>"},{"id":5,"name":"Interactivity","content":"<h2>Interactivity<\/h2>\r\nProducing a program using a program synthesis system involves a series of interactions between the user and the system. These interactions take the general form of the user providing information about the task followed by reviewing the synthesized program to determine what, if any, additional information they need to provide to accomplish their intended goal. More concretely, a user might provide an example for one input and manually inspect the output of the synthesized program on other inputs, looking for an input with an incorrect output to give a second example on. By seeking to capture that entire process instead of just the step where a program is learned from examples, PROSE's <code class=\"language-plaintext highlighter-rouge\">Session<\/code> API is a useful model for real scenarios.\r\n\r\nPROSE's <code class=\"language-plaintext highlighter-rouge\">Session<\/code> provides a stateful API for program synthesis to support interactive workflows. A <code class=\"language-plaintext highlighter-rouge\">Session<\/code> represents a user's efforts to synthesize a program for a single task. To begin a task, a new <code class=\"language-plaintext highlighter-rouge\">Session<\/code> object is constructed and maintained until the user is satisfied with the final synthesized program (and possibly serialized for future refinements to that program). In addition to keeping track of the inputs and constraints to be fed to the synthesizer, the <code class=\"language-plaintext highlighter-rouge\">Session<\/code> keeps track of programs which have been learned and provides APIs for helping the user select new inputs and outputs.\r\n\r\nEach DSL exposes a <code class=\"language-plaintext highlighter-rouge\">Session<\/code> subclass as the entrypoint to its learning API (e.g. <code class=\"language-plaintext highlighter-rouge\">Transformation.Text.Session<\/code>). To implement a <code class=\"language-plaintext highlighter-rouge\">Session<\/code> for your own DSL, extend <code class=\"language-plaintext highlighter-rouge\">Wrangling.NonInteractiveSession<\/code> (or the base class <code class=\"language-plaintext highlighter-rouge\">Wrangling.Session<\/code> if you want more control).\r\n\r\n[code autolinks=\"true\" highlight=\"\"]\r\nusing Microsoft.ProgramSynthesis.Transformation.Text; \/\/ construct a session\r\n\r\nvar session = new Session();[\/code]\r\n\r\n<h3>Inputs<\/h3>\r\nThe collection of all known inputs the program is expected to be run on should be provided using <code class=\"language-plaintext highlighter-rouge\">.Inputs.Add()<\/code>. If that set is large, then providing all of them may not be worthwhile (as the algorithms will only have time to consider a subset anyway). If selecting a subset of inputs to provide, they should be representative of the inputs the program will be run on. The inputs provided can accessed using <code class=\"language-plaintext highlighter-rouge\">.Inputs<\/code> and removed using <code class=\"language-plaintext highlighter-rouge\">.RemoveInputs()<\/code> or <code class=\"language-plaintext highlighter-rouge\">.RemoveAllInputs()<\/code>.\r\n\r\nThe inputs are used when learning and ranking programs (unless <code class=\"language-plaintext highlighter-rouge\">.UseInputsInLearn<\/code> is set to <code class=\"language-plaintext highlighter-rouge\">false<\/code>), as well as for <a href=\"#significant-inputs\">suggesting inputs that more information is likely needed for<\/a>.\r\n\r\n[code autolinks=\"true\" highlight=\"\"]\r\n\/\/ provide some inputs\r\nsession.Inputs.Add(new InputRow(\"Greta\", \"Hermansson\"),\r\n    new InputRow(\"Kettil\", \"Hansson\"),\r\n    new InputRow(\"Etelka\", \"bala\"));[\/code]\r\n\r\n<h3>Constraints<\/h3>\r\nConstraints are any information that describe the program to be synthesized. The most common kind of constraint is an example, but DSLs may support many kinds of constraints including negative examples, types for the output, programs the synthesized program should similar to, or any constraint the author of the synthesizer wishes to define.\r\n\r\nThe base type for constraints is <code class=\"language-plaintext highlighter-rouge\">Constraint&lt;TInput, TOutput&gt;<\/code> where <code class=\"language-plaintext highlighter-rouge\">TInput<\/code> and <code class=\"language-plaintext highlighter-rouge\">TOutput<\/code> are the input and output types of programs in the DSL being synthesized. For example, for <a href=\"https:\/\/microsoft.github.io\/prose\/documentation\/transformation-text\/intro\/\"><code class=\"language-plaintext highlighter-rouge\">Transformation.Text<\/code><\/a>, the type of constraints is <code class=\"language-plaintext highlighter-rouge\">Constraint&lt;IRow, object&gt;<\/code>.\r\n\r\nIn order to provide constraints to a <code class=\"language-plaintext highlighter-rouge\">Session<\/code>, use <code class=\"language-plaintext highlighter-rouge\">.Constraints.Add()<\/code>. The constraints provided can accessed using <code class=\"language-plaintext highlighter-rouge\">.Constraints<\/code> and removed using <code class=\"language-plaintext highlighter-rouge\">.RemoveConstraints()<\/code> or <code class=\"language-plaintext highlighter-rouge\">.RemoveAllConstraints()<\/code>.\r\n\r\n[code autolinks=\"true\" highlight=\"\"]\r\n\/\/ give an example session.Constraints.Add(new Example(new InputRow(\"Greta\", \"Hermansson\"), \"Hermansson, G.\"))[\/code]\r\n\r\n<h3>Synthesizing programs<\/h3>\r\nOnce a <code class=\"language-plaintext highlighter-rouge\">Session<\/code> has some inputs and constraints, a program can be synthesized. Programs are generated using the various <code class=\"language-plaintext highlighter-rouge\">.Learn*()<\/code> methods, which use the information in <code class=\"language-plaintext highlighter-rouge\">.Inputs<\/code> and <code class=\"language-plaintext highlighter-rouge\">.Constraints<\/code> to learn a program. They, like all <code class=\"language-plaintext highlighter-rouge\">Session<\/code> methods that do any significant amount of computation, have <code class=\"language-plaintext highlighter-rouge\">Async<\/code> variants which do the computation on a separate thread to make it easier to attach a <code class=\"language-plaintext highlighter-rouge\">Session<\/code> to a GUI.\r\n<ul>\r\n \t<li><code class=\"language-plaintext highlighter-rouge\">Learn()<\/code>\/[<code class=\"language-plaintext highlighter-rouge\">LearnAsync()<\/code> returns the single top-ranked program as a <code class=\"language-plaintext highlighter-rouge\">Program<\/code>.<\/li>\r\n \t<li><code class=\"language-plaintext highlighter-rouge\">LearnTopK()<\/code>\/[<code class=\"language-plaintext highlighter-rouge\">LearnTopKAsync()<\/code> takes an integer <code class=\"language-plaintext highlighter-rouge\">k<\/code> and returns the top-<code class=\"language-plaintext highlighter-rouge\">k<\/code> ranked programs as an <code class=\"language-plaintext highlighter-rouge\">IReadOnlyList&lt;Program&gt;<\/code>.<\/li>\r\n \t<li><code class=\"language-plaintext highlighter-rouge\">LearnAll()<\/code>\/[<code class=\"language-plaintext highlighter-rouge\">LearnAllAsync()<\/code> learns all programs consistent with the examples, giving the result compactly as a <code class=\"language-plaintext highlighter-rouge\">ProgramSet<\/code> (wrapped in an <code class=\"language-plaintext highlighter-rouge\">IProgramSetWrapper<\/code>).<\/li>\r\n<\/ul>\r\n\r\n[code autolinks=\"true\" highlight=\"\"]\r\nvar program = session.Learn();[\/code]\r\n\r\n<h3>Explanations<\/h3>\r\nIn order to decide if the synthesized program is satisfactory, the user has to comprehend what has been learned. As we assume that, in general, the user is not a programmer, simply showing the code to the user is a poor way to explain the what the program is doing. Even experienced programmers can have difficulty reading programs, especially ones in DSLs designed to be easy for a computer to synthesize programs in as opposed to being easy for a human to read.\r\n<h4>Running the program<\/h4>\r\nThe most straightforward way to explain the program is to run it. To run a <code class=\"language-plaintext highlighter-rouge\">Program<\/code>, use its <code class=\"language-plaintext highlighter-rouge\">Run()<\/code> method:\r\n\r\n[code autolinks=\"true\" highlight=\"\"]\r\nforeach(var input in session.Inputs)\r\n{\r\n   Console.Out.WriteLine(program.Run(input));\r\n}[\/code]\r\n\r\n<table class=\" msr-table-schedule all-borders\" style=\"border-collapse: collapse;width: 80%;border-spacing: inherit\">\r\n<thead>\r\n<tr>\r\n<th style=\"text-align: left\">Input1<\/th>\r\n<th style=\"text-align: left\">Input2<\/th>\r\n<th style=\"text-align: left\">Program output<\/th>\r\n<\/tr>\r\n<\/thead>\r\n<tbody>\r\n<tr>\r\n<td style=\"text-align: left\">Greta<\/td>\r\n<td style=\"text-align: left\">Hermansson<\/td>\r\n<td style=\"text-align: left\">Hermansson, G.<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"text-align: left\">Kettil<\/td>\r\n<td style=\"text-align: left\">Hansson<\/td>\r\n<td style=\"text-align: left\">Hansson, K.<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"text-align: left\">Etelka<\/td>\r\n<td style=\"text-align: left\">bala<\/td>\r\n<td style=\"text-align: left\">bala, E.<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<h4>DSL-specific<\/h4>\r\nOther explanations might be DSL-specific. For instance, <code class=\"language-plaintext highlighter-rouge\">Transformation.Text<\/code> offers a feature called \u201coutput provenance\u201d which pairs up substrings of the output with the substrings in the input they were selected from:\r\n\r\n[code autolinks=\"true\" highlight=\"\"]\r\nforeach(var input in session.Inputs)\r\n{ \r\n    Console.Out.WriteLine(program.ComputeOutputProvenance(input));\r\n}[\/code]\r\n\r\nShown with italic and bold substrings corresponding to each other between the input and the output:\r\n<table class=\" msr-table-schedule all-borders\" style=\"border-collapse: collapse;width: 80%;border-spacing: inherit\">\r\n<thead>\r\n<tr>\r\n<th style=\"text-align: left\">Input1<\/th>\r\n<th style=\"text-align: left\">Input2<\/th>\r\n<th style=\"text-align: left\">Program output<\/th>\r\n<\/tr>\r\n<\/thead>\r\n<tbody>\r\n<tr>\r\n<td style=\"text-align: left\"><strong>G<\/strong>reta<\/td>\r\n<td style=\"text-align: left\"><em>Hermansson<\/em><\/td>\r\n<td style=\"text-align: left\"><em>Hermansson<\/em>, <strong>G<\/strong>.<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"text-align: left\"><strong>K<\/strong>ettil<\/td>\r\n<td style=\"text-align: left\"><em>Hansson<\/em><\/td>\r\n<td style=\"text-align: left\"><em>Hansson<\/em>, <strong>K<\/strong>.<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"text-align: left\"><strong>E<\/strong>telka<\/td>\r\n<td style=\"text-align: left\"><em>bala<\/em><\/td>\r\n<td style=\"text-align: left\"><em>bala<\/em>, <strong>E<\/strong>.<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\nAn interactive variant of this could allow the user to select where in the input a specific part of the output should come from, although the current implementation of <code class=\"language-plaintext highlighter-rouge\">Transformation.Text<\/code> does not support such a constraint.\r\n<h4>Significant Inputs<\/h4>\r\nWhile explanations help the user understand how the program works on inputs they are looking at, if the input set is large, it is likely some problems occur only on inputs the user is not looking at. .GetSignificantInputClustersAsync() can suggest inputs that the user should take a look at. The default algorithm works for any DSL that supports learning multiple programs: it looks for inputs where the top program disagrees with other highly ranked programs. The return value is a set of clusters instead of single inputs because sets of inputs the algorithm cannot distinguish are returned together, so, for example, the UI could give preference to inputs that are currently on screen.\r\n\r\nWhen presenting a significant input to the user, possible alternative outputs can be suggested using\r\n.ComputeTopKOutputsAsync():\r\n\r\n[code autolinks=\"true\" highlight=\"\"]\r\nforeach(SignificantInput&lt;IRow&gt; sigInput in await session.GetSignificantInputsAsync())\r\n{\r\n    Console.Out.WriteLine(\"Input[Confidence=\" + sigInput.Confidence + \"]: \" + sigInput.Input);\r\n    foreach(object output in await session.ComputeTopKOutputsAsync(sigInput.Input, 5))\r\n    {\r\n        Console.Out.WriteLine(\"Possible output: \" + output);\r\n    }\r\n}[\/code]\r\n\r\nIf the significant inputs algorithm returns nothing at all, that indicates an assertion that the user has given sufficiently specific constraints to define the program to synthesize (modulo the inputs provided), which should give the user confidence that the synthesized program is correct.\r\n<h4>Conclusion<\/h4>\r\nPROSE's <code class=\"language-plaintext highlighter-rouge\">Session<\/code> API provides a mechanism for supporting an interactive synthesis task. After loading in the data to work with, the user can switch between providing constraints, generating programs interacting with their explanations, and requesting pointers to significant inputs. This rich vocabulary allows a user to interact with the program synthesis in a way such that they can have confidence that the program they generate will generalize as desired.\r\n<div><a href=\"#top\">Back to top &gt;<\/a><\/div>"},{"id":6,"name":"FAQs","content":"<h2>Frequently asked questions<\/h2>\r\n<h3>General questions<\/h3>\r\n<strong>How do I use the PROSE SDK?<\/strong>\r\n\r\nThe SDK is used through .NET APIs. See the tutorial for how to get started and the documentation links on the sidebar for more details.\r\n\r\n<strong>Do you have any datasets?<\/strong>\r\n\r\nYes, the datasets are available <a href=\"https:\/\/aka.ms\/prose-datasets\">here<\/a>.\r\n\r\n<strong>How do I install it?<\/strong>\r\n\r\nInstall the <code class=\"language-plaintext highlighter-rouge\">Microsoft.ProgramSynthesis<\/code> and (optionally) <code class=\"language-plaintext highlighter-rouge\">Microsoft.ProgramSynthesis.Compiler<\/code> NuGet packages in Visual Studio.\r\n\r\nYou can also generate a template DSL project by running\r\n\r\n[code autolinks=\"true\" highlight=\"\"]\r\nnpm install -g yo\r\nnpm install -g generator-prose\r\nyo prose[\/code]\r\n\r\n<strong>Is it cross-platform?<\/strong>\r\n\r\nYes, PROSE SDK is supported on Windows, Linux, and macOS. Currently, the package is guaranteed to work on .NET 4.5+ and .NET Core.\r\n\r\n<strong>Where can I use it?<\/strong>\r\n\r\nThe SDK is released under a <em>non-commercial license<\/em> for use in research, education, and non-commercial applications. See <a href=\"https:\/\/www.nuget.org\/packages\/Microsoft.ProgramSynthesis\/7.22.0\/License\" target=\"_blank\" rel=\"noopener\">the license<\/a> for details.\u00a0 Please note: The\u00a0<strong>samples<\/strong> for using the PROSE SDK are available under the MIT license (<a href=\"https:\/\/github.com\/microsoft\/prose\/blob\/master\/LICENSE\">as you can see in the github repo for the samples<\/a>), but the license for <strong>the SDK itself is restricted to non-commercial use only<\/strong>.\r\n\r\n<strong>Where can I find sample code?<\/strong>\r\n\r\nOur samples are located in the <a href=\"https:\/\/github.com\/microsoft\/prose\" target=\"_blank\" rel=\"noopener\">PROSE GitHub repository<\/a>.\r\n\r\n<strong>How can I contact you if I have any questions or feedback?<\/strong>\r\n\r\nIf you run into any bugs or issues, please <a href=\"https:\/\/github.com\/microsoft\/prose\/issues\" target=\"_blank\" rel=\"noopener\">open an issue<\/a> in our GitHub repository. Feel free also to <a href=\"mailto:prose-contact@microsoft.com\">email us<\/a>.\r\n<h3>Visual Studio Code on Linux<\/h3>\r\n<strong>How do I restore NuGet packages for a PROSE solution (a sample or a <code class=\"language-plaintext highlighter-rouge\">yo<\/code>-generated template)?<\/strong>\r\n\r\n[code autolinks=\"true\" highlight=\"\"]\r\nsudo apt install mono-complete nuget\r\nsudo nuget update -self\r\nnuget restore YourSolution.sln[\/code]\r\n\r\n<strong>How do I build a solution in VS Code?<\/strong>\r\n\r\nPress <kbd>Ctrl+Shift+P<\/kbd> and <a href=\"https:\/\/code.visualstudio.com\/docs\/editor\/tasks\" target=\"_blank\" rel=\"noopener\">configure a task runner<\/a>. Pick the <code class=\"language-plaintext highlighter-rouge\">msbuild<\/code> task. In the generated <code class=\"language-plaintext highlighter-rouge\">tasks.json<\/code>, replace <code class=\"language-plaintext highlighter-rouge\">msbuild<\/code> with <code class=\"language-plaintext highlighter-rouge\">xbuild<\/code>.\r\n\r\n<strong>When I try to run xbuild on my yo-generated solution, it fails with an error about <code class=\"language-plaintext highlighter-rouge\">@(AssemblyPaths -&gt; Replace('\/', '\/'))<\/code>.<\/strong>\r\n\r\nAs it turns out, <code class=\"language-plaintext highlighter-rouge\">xbuild<\/code> does not support the entirety of the <code class=\"language-plaintext highlighter-rouge\">msbuild<\/code> language. This means that you won\u2019t be able to recompile your grammar automatically on each build out of the box. We are working on fixing this. In the meantime, please regenerate your solution with <code class=\"language-plaintext highlighter-rouge\">yo prose<\/code> but <strong>answer \u201cNo\u201d to the last question<\/strong>.\r\n\r\n<strong>How do I launch a program or debug it in VS Code?<\/strong>\r\n\r\nInstall the <a href=\"https:\/\/marketplace.visualstudio.com\/items?itemName=ms-vscode.mono-debug\" target=\"_blank\" rel=\"noopener\">mono-debug<\/a> extension for VS Code and follow the instructions on its webpage."}],"related-researchers":[{"type":"user_nicename","display_name":"Sumit Gulwani","user_id":33755,"people_section":"PROSE SDK team","alias":"sumitg"},{"type":"user_nicename","display_name":"Vu Le","user_id":39174,"people_section":"PROSE SDK team","alias":"levu"},{"type":"user_nicename","display_name":"Danny Simmons","user_id":39165,"people_section":"PROSE SDK team","alias":"dsimmons"}],"msr_research_lab":[],"msr_impact_theme":[],"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/654579","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-project"}],"version-history":[{"count":27,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/654579\/revisions"}],"predecessor-version":[{"id":864183,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/654579\/revisions\/864183"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media\/674232"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=654579"}],"wp:term":[{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=654579"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=654579"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=654579"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=654579"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}