On the left, a diagram with three layers, each of which contains a half-transparent image processed from the same image. The processed images are partitioned into several grids, and each grid contains 4 x 4 image patches. From bottom to top, the layers contain 4 x 4, 2 x 2, and 1 x 1 grids, respectively. The layers are labeled “4x,” “8x,” and “16x,” respectively, from bottom to top. An arrow joining the three layers points upward to the words “Segmentation” and “Detection” and an ellipsis. Another arrow points from the top layer to the word “classification.” On the right, a bar chart with a blue bar labeled “Swin V1” and an orange bar labeled “Swin V2.” The orange bar is much taller and labeled “3 billion (1,536 x 1,536 resolution)”; the blue bar is labeled “197 million.” An arrow labeled “15x” points upward from the blue bar, indicating the orange bar is 15 times higher than the blue one.
Microsoft Research Blog

Swin Transformer supports 3-billion-parameter vision models that can train with higher-resolution images for greater task applicability 

June 21, 2022 | Han Hu and Baining Guo

Early last year, our research team from the Visual Computing Group introduced Swin Transformer, a Transformer-based general-purpose computer vision architecture that for the first time beat convolutional neural networks on the important vision benchmark of COCO object detection…

In the news | Fortune

A.I. that guesses your emotions could be misused and shouldn’t be available to everyone, Microsoft decides 

June 21, 2022

In the news | New York Times

Microsoft Plans to Eliminate Face Analysis Tools in Push for ‘Responsible A.I.’ 

June 21, 2022

In the news | Microsoft Innovation

Tech Minutes: Swin Transformer 

June 21, 2022

Learn how Swin Transformer surpasses previously dominant CNN (convolutional neural network) architectures in computer vision. Presented by Han Hu, Principal Researcher and Research Manager from Microsoft Research Asia.

Articles

ACL 2022 highlights: From unified-modal encoder-decoder to neural machine translation 

June 20, 2022

As a top international academic conference in the field of natural language processing, ACL attracts paper submissions and conference participation from a large number of scholars every year. This year's ACL conference was held from May 22nd to May 27th.…

Articles

Xia Yan: A “ferryman” between scientific research and technology application 

June 19, 2022

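Editor’s note: The process of scientific research and technological innovation is always full of uncertainty. Researchers cannot calculate the innovation cycle in advance, nor can they foresee where each flash of inspiration will ultimately lead. Research engineers play an especially important role in turning a research result into a product so that more people can experience the convenience of cutting-edge technology. They need a thorough grasp of end users’ needs and a deep understanding of the depth and breadth of a technology’s applications, and they must connect every stage of the process so that algorithms and models are effectively linked to product applications. Yet this is easier said than done, and the hardships of this process and the sense of accomplishment when a technology ships are something only those who have personally...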

Articles

Ada Workshop: From “her power” to “her action,” jointly promoting diversity in computing 

June 17, 2022

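By Wang Jingwen, Academic Collaboration Manager, Microsoft Research Asia. If you search the internet for “computer science major,” the counterpart to “what advantages do boys have in studying computer science” is “can girls study computer science.” This reflects a common perception that “computing” is a better fit for men. But are women really unsuited to computing? The women researchers and engineers around me have already given me the answer: they are technically solid and hardworking, and they keep pushing the whole field forward. At the same time, many studies also show that to promote innovation and sustainable development in computing, improving gender...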

Figure 6. GLIPv2 can perform a wide range of tasks.
Articles

Object Detection in the Wild via Grounded Language Image Pre-training 

June 17, 2022

Visual recognition systems are typically trained to predict a fixed set of predetermined object categories in a specific domain, which limits their usability in real-world applications. How to build a model that generalizes to various concepts and domains with minimal…

Awards | PLDI'22

PLDI’22 Distinguished Paper Award 

June 16, 2022

Ryan Beckett (Microsoft), along with coauthors Michael Greenberg (Stevens Institute of Tech) and Eric Campbell (Cornell) received the Distinguished Paper Award at PLDI'22 for their publication Kleene Algebra Modulo Theories: A Framework for Concrete KATs.