{"id":1150291,"date":"2025-10-22T10:21:05","date_gmt":"2025-10-22T17:21:05","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-project&#038;p=1150291"},"modified":"2025-11-07T05:25:17","modified_gmt":"2025-11-07T13:25:17","slug":"efficient-ai-applications-context-engineering-and-agents","status":"publish","type":"msr-project","link":"https:\/\/www.microsoft.com\/en-us\/research\/project\/efficient-ai-applications-context-engineering-and-agents\/","title":{"rendered":"Efficient AI applications: context engineering and agents"},"content":{"rendered":"<section class=\"mb-3 moray-highlight\">\n\t<div class=\"card-img-overlay mx-lg-0\">\n\t\t<div class=\"card-background  has-background- card-background--full-bleed\">\n\t\t\t<img loading=\"lazy\" decoding=\"async\" width=\"1920\" height=\"720\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/M365-Research-Page-Banner_1920x720.jpg\" class=\"attachment-full size-full\" alt=\"M365 Research banner: network of connected points\" style=\"\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/M365-Research-Page-Banner_1920x720.jpg 1920w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/M365-Research-Page-Banner_1920x720-300x113.jpg 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/M365-Research-Page-Banner_1920x720-1024x384.jpg 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/M365-Research-Page-Banner_1920x720-768x288.jpg 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/M365-Research-Page-Banner_1920x720-1536x576.jpg 1536w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/M365-Research-Page-Banner_1920x720-1600x600.jpg 1600w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2024\/06\/M365-Research-Page-Banner_1920x720-240x90.jpg 240w\" sizes=\"auto, 
(max-width: 1920px) 100vw, 1920px\" \/>\t\t<\/div>\n\t\t<!-- Foreground -->\n\t\t<div class=\"card-foreground d-flex mt-md-n5 my-lg-5 px-g px-lg-0\">\n\t\t\t<!-- Container -->\n\t\t\t<div class=\"container d-flex mt-md-n5 my-lg-5 \">\n\t\t\t\t<!-- Card wrapper -->\n\t\t\t\t<div class=\"w-100 \">\n\t\t\t\t\t<!-- Card -->\n\t\t\t\t\t<div class=\"card material-md-card py-5 px-md-5\">\n\t\t\t\t\t\t<div class=\"card-body \">\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/group\/efficient-ai\/\" class=\"icon-link icon-link--reverse mb-2\" data-bi-cN=\"Efficient AI team\">\n\t\t\t\t\t\t\t\t\t<span class=\"c-glyph glyph-chevron-left\" aria-hidden=\"true\"><\/span>\n\t\t\t\t\t\t\t\t\tEfficient AI team\t\t\t\t\t\t\t\t<\/a>\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\n\n<h1 class=\"wp-block-heading\" id=\"efficient-ai-applications-context-engineering-and-agents\">Efficient AI applications: context engineering and agents<\/h1>\n\n\t\t\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/div>\n<\/section>\n\n\n\n\n\n<p>Modern AI systems face a dual challenge: delivering high\u2011quality outputs while staying cost- and latency\u2011efficient. Every token processed and every millisecond of compute impacts scalability, user experience, and sustainability. Efficiency isn\u2019t just an optimization; it\u2019s a design principle that makes AI applications feasible and scalable.<\/p>\n\n\n\n<p><strong>Efficient AI applications start with the right context<\/strong>. Identifying relevant information, reducing redundancy, and maintaining long-term memory are key to effective performance. Our research combines structured and unstructured context pruning, hybrid retrieval, and intelligent compression to minimize unnecessary tokens without losing utility. 
These techniques improve quality\u2011per\u2011dollar for complex workflows and enable more predictable latency over long sessions.<\/p>\n\n\n\n<div style=\"height:20px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"2027\" height=\"726\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/10\/Screenshot-2025-10-06-115905.png\" alt=\"Example: LLMLingua2 and TACO-RL prompt compression algorithms for efficient context engineering\" class=\"wp-image-1151273\" style=\"width:928px;height:auto\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/10\/Screenshot-2025-10-06-115905.png 2027w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/10\/Screenshot-2025-10-06-115905-300x107.png 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/10\/Screenshot-2025-10-06-115905-1024x367.png 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/10\/Screenshot-2025-10-06-115905-768x275.png 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/10\/Screenshot-2025-10-06-115905-1536x550.png 1536w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/10\/Screenshot-2025-10-06-115905-240x86.png 240w\" sizes=\"auto, (max-width: 2027px) 100vw, 2027px\" \/><figcaption class=\"wp-element-caption\">Example: LLMLingua2 and TACO-RL prompt compression algorithms for efficient context engineering<\/figcaption><\/figure>\n\n\n\n<div style=\"height:20px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p><strong>Beyond context, we build efficient agents <\/strong>that make smart decisions about tools, compute, and memory. 
These agents plan, route, and execute tasks with minimal overhead, leveraging long\u2011horizon memory and selective model pathways to reduce redundant steps and optimize resource use.<\/p>\n\n\n\n<p>From context engineering to efficient agentic workflows, our goal is simple: AI applications that do more with fewer resources &#8212; fast, reliable, and ready for real\u2011world scale.<\/p>\n\n\n\n<div style=\"height:25px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n","protected":false},"excerpt":{"rendered":"<p>Modern AI systems face a dual challenge: delivering high\u2011quality outputs while staying cost- and latency\u2011efficient. Every token processed and every millisecond of compute impacts scalability, user experience, and sustainability. Efficiency isn\u2019t just an optimization; it\u2019s a design principle that makes AI applications feasible and scalable. Efficient AI applications start with the right context. Identifying relevant [&hellip;]<\/p>\n","protected":false},"featured_media":1045266,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":true,"_classifai_error":"","footnotes":""},"research-area":[13556],"msr-locale":[268875],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-1150291","msr-project","type-msr-project","status-publish","has-post-thumbnail","hentry","msr-research-area-artificial-intelligence","msr-locale-en_us","msr-archive-status-active"],"msr_project_start":"","related-publications":[1016619,1031946,1140724,1146313,1146318,1151270],"related-downloads":[],"related-videos":[],"related-groups":[],"related-events":[],"related-opportunities":[],"related-posts":[],"related-articles":[],"tab-content":[],"slides":[],"related-researchers":[{"type":"user_nicename","display_name":"Molly Xia","user_id":41943,"people_section":"Related people","alias":"mollyxia"},{"type":"user_nicename","display_name":"Camille 
Couturier","user_id":40111,"people_section":"Related people","alias":"cacoutur"},{"type":"user_nicename","display_name":"Dongge Han","user_id":43392,"people_section":"Related people","alias":"donggehan"},{"type":"user_nicename","display_name":"Victor Ruehle","user_id":41027,"people_section":"Related people","alias":"virueh"},{"type":"user_nicename","display_name":"Renee St. Amant","user_id":43080,"people_section":"Related people","alias":"reneestamant"},{"type":"user_nicename","display_name":"Daniel Eduardo Madrigal Diaz","user_id":40480,"people_section":"Related people","alias":"danielmad"},{"type":"user_nicename","display_name":"Samuel Kessler","user_id":43566,"people_section":"Related people","alias":"t-skessler"},{"type":"user_nicename","display_name":"Ankur Mallick","user_id":42441,"people_section":"Related people","alias":"ankurmallick"},{"type":"user_nicename","display_name":"Mirian Hipolito Garcia","user_id":40483,"people_section":"Related people","alias":"mirianh"},{"type":"user_nicename","display_name":"Spyridon (Spyros) Mastorakis","user_id":43994,"people_section":"Related people","alias":"smastorakis"},{"type":"user_nicename","display_name":"Helia Hashemi","user_id":44000,"people_section":"Related people","alias":"heliahashemi"},{"type":"user_nicename","display_name":"Jue Zhang","user_id":41212,"people_section":"Related 
people","alias":"juezhang"}],"msr_research_lab":[],"msr_impact_theme":[],"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/1150291","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-project"}],"version-history":[{"count":38,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/1150291\/revisions"}],"predecessor-version":[{"id":1155060,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/1150291\/revisions\/1155060"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media\/1045266"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=1150291"}],"wp:term":[{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=1150291"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=1150291"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=1150291"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=1150291"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}