{"id":978456,"date":"2023-10-23T02:35:26","date_gmt":"2023-10-23T09:35:26","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-research-item&#038;p=978456"},"modified":"2024-11-03T19:36:37","modified_gmt":"2024-11-04T03:36:37","slug":"oppertune","status":"publish","type":"msr-research-item","link":"https:\/\/www.microsoft.com\/en-us\/research\/publication\/oppertune\/","title":{"rendered":"OPPerTune: Post-Deployment Configuration Tuning of Services Made Easy"},"content":{"rendered":"<p>Real-world application deployments have hundreds of interdependent configuration parameters, many of which significantly influence performance and efficiency. With today\u2019s complex and dynamic services, operators need to continuously monitor and set the right configuration values (configuration tuning) well after a service is widely deployed. This is challenging since experimenting with different configurations post-deployment may reduce application performance or cause disruptions. While state-of-the-art ML approaches do help to automate configuration tuning, they do not fully address the multiple challenges in end-to-end configuration tuning of deployed applications.<\/p>\n<p>This paper presents OPPerTune, a service that enables configuration tuning of applications in deployment at Microsoft. OPPerTune reduces application interruptions while maximizing the performance of deployed applications as and when the workload or the underlying infrastructure changes. It automates three essential processes that facilitate post-deployment configuration tuning: (a) determining which configurations to tune, (b) automatically managing the scope at which to tune the configurations, and (c) using a novel reinforcement learning algorithm to simultaneously and quickly tune numerical and categorical configurations, thereby keeping the overhead of configuration tuning low. We deploy OPPerTune on two enterprise applications in Microsoft Azure\u2019s clusters. Our experiments show that OPPerTune reduces the end-to-end P95 latency of microservice applications by more than 50% over expert configuration choices made ahead of deployment. The code and datasets used are made available at <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/aka.ms\/OPPerTune\">https:\/\/aka.ms\/OPPerTune<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>.<\/p>\n<p>&nbsp;<\/p>\n<p><iframe loading=\"lazy\" title=\"NSDI '24 - OPPerTune: Post-Deployment Configuration Tuning of Services Made Easy\" width=\"500\" height=\"281\" src=\"https:\/\/www.youtube-nocookie.com\/embed\/c9snEu5Wv-o?feature=oembed&rel=0\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" allowfullscreen><\/iframe><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Real-world application deployments have hundreds of interdependent configuration parameters, many of which significantly influence performance and efficiency. With today\u2019s complex and dynamic services, operators need to continuously monitor and set the right configuration values (configuration tuning) well after a service is widely deployed. This is challenging since experimenting with different configurations post-deployment may reduce application [&hellip;]<\/p>\n","protected":false},"featured_media":0,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr-author-ordering":null,"msr_publishername":"USENIX","msr_publisher_other":"","msr_booktitle":"","msr_chapter":"","msr_edition":"","msr_editors":"","msr_how_published":"","msr_isbn":"","msr_issue":"","msr_journal":"","msr_number":"","msr_organization":"USENIX","msr_pages_string":"","msr_page_range_start":"","msr_page_range_end":"","msr_series":"","msr_volume":"","msr_copyright":"","msr_conference_name":"NSDI 2024","msr_doi":"","msr_arxiv_id":"","msr_s2_paper_id":"","msr_mag_id":"","msr_pubmed_id":"","msr_other_authors":"","msr_other_contributors":"","msr_speaker":"","msr_award":"","msr_affiliation":"","msr_institution":"","msr_host":"","msr_version":"","msr_duration":"","msr_original_fields_of_study":"","msr_release_tracker_id":"","msr_s2_match_type":"","msr_citation_count_updated":"","msr_published_date":"2024-4-1","msr_highlight_text":"","msr_notes":"","msr_longbiography":"","msr_publicationurl":"","msr_external_url":"","msr_secondary_video_url":"","msr_conference_url":"https:\/\/www.usenix.org\/conference\/nsdi24","msr_journal_url":"","msr_s2_pdf_url":"","msr_year":0,"msr_citation_count":0,"msr_influential_citations":0,"msr_reference_count":0,"msr_s2_match_confidence":0,"msr_microsoftintellectualproperty":true,"msr_s2_open_access":false,"msr_s2_author_ids":[],"msr_pub_ids":[],"msr_hide_image_in_river":null,"footnotes":""},"msr-research-highlight":[],"research-area":[13561,13547],"msr-publication-type":[193716],"msr-publisher":[],"msr-focus-area":[],"msr-locale":[268875],"msr-post-option":[],"msr-field-of-study":[],"msr-conference":[263941],"msr-journal":[],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-978456","msr-research-item","type-msr-research-item","status-publish","hentry","msr-research-area-algorithms","msr-research-area-systems-and-networking","msr-locale-en_us"],"msr_publishername":"USENIX","msr_edition":"","msr_affiliation":"","msr_published_date":"2024-4-1","msr_host":"","msr_duration":"","msr_version":"","msr_speaker":"","msr_other_contributors":"","msr_booktitle":"","msr_pages_string":"","msr_chapter":"","msr_isbn":"","msr_journal":"","msr_volume":"","msr_number":"","msr_editors":"","msr_series":"","msr_issue":"","msr_organization":"USENIX","msr_how_published":"","msr_notes":"","msr_highlight_text":"","msr_release_tracker_id":"","msr_original_fields_of_study":"","msr_download_urls":"","msr_external_url":"","msr_secondary_video_url":"","msr_longbiography":"","msr_microsoftintellectualproperty":1,"msr_main_download":"","msr_publicationurl":"","msr_doi":"","msr_publication_uploader":[{"type":"file","viewUrl":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/10\/OPPerTune.pdf","id":"982179","title":"oppertune-2","label_id":"243109","label":0},{"type":"url","viewUrl":"false","id":"false","title":"https:\/\/www.usenix.org\/conference\/nsdi24\/presentation\/somashekar","label_id":"243109","label":0}],"msr_related_uploader":[{"type":"url","viewUrl":"false","id":"false","title":"https:\/\/aka.ms\/OPPerTune","label_id":"243118","label":0},{"type":"url","viewUrl":"false","id":"false","title":"https:\/\/github.com\/microsoft\/OPPerTune","label_id":"264520","label":0}],"msr_citation_count":0,"msr_citation_count_updated":"","msr_s2_paper_id":"","msr_influential_citations":0,"msr_reference_count":0,"msr_arxiv_id":"","msr_s2_author_ids":[],"msr_s2_open_access":false,"msr_s2_pdf_url":null,"msr_attachments":[{"id":982179,"url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/11\/OPPerTune.pdf"},{"id":982173,"url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/11\/OPPerTune-Post-Deployment-Configuration-Tuning-of-Services-Made-Easy.pdf"},{"id":978471,"url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/10\/OPPerTune-Post-Deployment-Configuration-Tuning-of-Services-Made-Easy.pdf"}],"msr-author-ordering":[{"type":"user_nicename","value":"Gagan Somashekar","user_id":43416,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Gagan Somashekar"},{"type":"edited_text","value":"Karan Tandon","user_id":42375,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Karan Tandon"},{"type":"edited_text","value":"Anush Kini","user_id":42951,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Anush Kini"},{"type":"text","value":"Chieh-Chun Chang","user_id":0,"rest_url":false},{"type":"text","value":"Petr Husak","user_id":0,"rest_url":false},{"type":"text","value":"Ranjita Bhagwan","user_id":0,"rest_url":false},{"type":"edited_text","value":"Mayukh Das","user_id":41140,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Mayukh Das"},{"type":"text","value":"Anshul Gandhi","user_id":0,"rest_url":false},{"type":"edited_text","value":"Nagarajan Natarajan","user_id":37311,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Nagarajan Natarajan"}],"msr_impact_theme":[],"msr_research_lab":[199562],"msr_event":[],"msr_group":[793670,811276],"msr_project":[887322,971832],"publication":[],"video":[],"msr-tool":[1000452],"msr_publication_type":"inproceedings","related_content":{"projects":[{"ID":887322,"post_title":"Network Brain","post_name":"netbrain","post_type":"msr-project","post_date":"2022-10-17 05:39:45","post_modified":"2023-11-02 01:11:16","post_status":"publish","permalink":"https:\/\/www.microsoft.com\/en-us\/research\/project\/netbrain\/","post_excerpt":"Holistic optimization of large-scale networked services The growth of large-scale networked services has brought to the fore myriad challenges: performance, reliability, efficiency, cost, and more. Traditionally, work on addressing and balancing these has been done in silos. For instance, an application could make choices to optimize its own performance or cost, while treating the rest of the workload and infrastructure as outside its purview. However, such an approach breaks down when an application or service&hellip;","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/887322"}]}},{"ID":971832,"post_title":"OPPerTune","post_name":"oppertune","post_type":"msr-project","post_date":"2023-10-02 08:22:51","post_modified":"2024-07-14 20:45:23","post_status":"publish","permalink":"https:\/\/www.microsoft.com\/en-us\/research\/project\/oppertune\/","post_excerpt":"Post-Deployment Configuration Tuning of Services Made Easy Real-world application deployments have hundreds of inter-dependent configuration parameters, many of which significantly influence performance and efficiency. With today's complex and dynamic services, operators need to continuously monitor and set the right configuration values (configuration tuning) well after a service is widely deployed. This is challenging since experimenting with different configurations post-deployment may reduce application performance or cause disruptions. While state-of-the-art ML approaches do help to automate configuration&hellip;","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/971832"}]}}]},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/978456","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-research-item"}],"version-history":[{"count":8,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/978456\/revisions"}],"predecessor-version":[{"id":1057251,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/978456\/revisions\/1057251"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=978456"}],"wp:term":[{"taxonomy":"msr-research-highlight","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-highlight?post=978456"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=978456"},{"taxonomy":"msr-publication-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-publication-type?post=978456"},{"taxonomy":"msr-publisher","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-publisher?post=978456"},{"taxonomy":"msr-focus-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-focus-area?post=978456"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=978456"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=978456"},{"taxonomy":"msr-field-of-study","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-field-of-study?post=978456"},{"taxonomy":"msr-conference","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-conference?post=978456"},{"taxonomy":"msr-journal","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-journal?post=978456"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=978456"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=978456"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}