By Judith Bishop, Director of Computer Science, Microsoft Research
A thriving research and development ecosystem relies on connecting innovators with the tools for innovation. When we launched the Microsoft Open Source Challenge earlier this year, that’s just what we had in mind: put open source code and data into the hands of students. Give them free rein, just a few rules, and a little incentive, and then watch them build with, and on top of, Microsoft’s open source research software and tools.
We’re excited to announce that Akond Rahman, a second-year student in the computer science doctoral program at North Carolina State University, has won the Grand Prize in the first Microsoft Open Source Challenge. In his winning submission, Akond uses Sent2Vec, the predictors and trained model files for DSSM (deep structured semantic model, also called deep semantic similarity model), to quantify the semantic similarity of software projects.
The idea for the project, “Quantifying Semantic Similarity of Software Projects Using Deep Semantic Similarity Model,” had been percolating in the software developer’s mind. “Small teams of engineers working in large corporations [and institutions] are constantly having to start from scratch; they can’t get anything useful out of the software repositories. If I could use a deep learning neural network like DSSM to do the semantic search and arrange and score the tokens, teams would be able to find and reuse code that other teams had already created.” When a colleague at North Carolina State University brought the Open Source Challenge to Akond’s attention, he quickly found the tools and got to work.
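To make the idea of scoring semantic similarity concrete, here is a minimal sketch of how projects could be ranked once each one has been mapped to a vector. It is not Akond’s implementation and does not call Sent2Vec: the project names and the three-dimensional vectors are hypothetical stand-ins for the embeddings a DSSM-style model would produce, and cosine similarity is used here because it is the standard way DSSM compares semantic vectors.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # A zero vector has no direction, so treat it as unrelated.
        return 0.0
    return dot / (norm_u * norm_v)


def rank_projects(query_vec, project_vecs):
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [
        (name, cosine_similarity(query_vec, vec))
        for name, vec in project_vecs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical embeddings; a real system would obtain these from a trained model.
projects = {
    "image-resizer": [0.9, 0.1, 0.0],
    "thumbnail-service": [0.8, 0.2, 0.1],
    "payroll-export": [0.0, 0.1, 0.9],
}
query = [1.0, 0.0, 0.0]  # hypothetical embedding of a team's search description

ranking = rank_projects(query, projects)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

With a ranking like this, a team searching a large repository would see the most closely related existing projects first, which is the reuse scenario described in the quote above.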
Jianfeng Gao, Principal Researcher at Microsoft, was asked by the Open Source Challenge committee to review submissions that made use of DSSM and related tools. When he read Akond’s submission, he knew immediately that it was exceptional: “This was such a surprising, innovative use of the [DSSM] tool. It had never occurred to me to apply it this way. This report was written by someone whose area of study isn’t related to DSSM or the theory it involves—natural language processing, AI—but who clearly understands how a tool like DSSM works. This student saw a way to make the tool serve his purposes, which were completely focused on the user. I think it’s really something special.”
“Open source is good for everyone—students, researchers, companies—because it allows us to build on the collective learning of the community.” Akond Rahman, Microsoft Open Source Challenge Grand Prize winner
A veteran of the CS research world, Jianfeng is excited by the results of Open Source for Academics, which sponsored the challenge. He feels events like the challenge “have brought attention to the availability of our open source tools and software. We’ve seen an important change in the culture. Open source helps create a community for the company, students, and researchers.”
On April 25, in addition to the Open Source Challenge Grand Prize winner, we announced the winners of three second prizes:
- Varun Agrawal (Georgia Tech), for “OneGroup—Automated Photo Sharing via Facial Recognition,” which uses Microsoft Cognitive Services (formerly Project Oxford) to create an automated photo-sharing feature that integrates Microsoft OneDrive and the Outlook Contacts API. It optimizes sharing flows for customers by answering the question “how do I share more easily?”
- Saeid Tizpaz Niari (University of Colorado-Boulder), for “CONfidentiality CERTifier, a Modeling and Verification Framework for Program Confidentiality,” which extracts a nondeterministic transducer abstraction from programs and uses transducer techniques for analysis. A prototype tool was built around the Z3 theorem prover.
- Yida Wang (Beijing University of Posts and Telecommunications), for “CNTK on Mac: 2D Object Restoration and Recognition Based on 3D Model,” which synthesizes and renders 2D images with and without background, and uses the Computational Network Toolkit (CNTK) to train a segmentation and restoration model to restore the foreground image. CNTK’s open source code was modified to run on the Mac, supporting object recognition based on 3D objects or ordinary photos.
The Open Source Challenge did exactly what we’d hoped: the winning students, some of whom hadn’t known about the offerings available through the Open Source for Academics program at Microsoft, found the tools they needed to solve real problems. Their takeaway from the experience, in addition to the prizes and, we hope, the well-deserved awe of their colleagues, is that they have a new source of tools to help them with future projects and future problems. Others, many of whom were already using Microsoft Research open source, seized the opportunity to put their work in front of the people who appreciate it most.
At the same time, the Challenge has introduced Microsoft’s researchers to a new wave of developers who can engage and assist in taking their tools forward. Open source is far more than a mechanism for releasing code; it’s a means for building a community of users who are also developers and who care about the direction and quality of the product, whether large or small. With their fresh approach, students can play a key role in creating these communities and helping direct the products Microsoft cares about. An example is the project submitted by Yida Wang, who adapted the CNTK platform for the Mac. Yida submitted a change that replaced a dependency on one open source library with another that is compatible with the Mac. Yida’s work involves looking at synthetic images and real photos, and he needs parallelization to achieve reasonable run times. His project filled his particular need, and this expanded capability is now available to all.
The no-strings-attached, ready availability of code and software that open source provides continues to drive advances and experimentation in software development. We hope that opportunities like the Open Source Challenge will continue to stimulate innovation and creative research approaches by connecting students and other innovators with code and data developed in our own labs.