Office Lens Is a Snap
The moment mobile-phone manufacturers added cameras to their devices, those devices stopped being just mobile phones. Lightweight phone cameras have not only made casual photography easy and spontaneous, they have also changed the way we record our lives. Now, with help from Microsoft Research, the Office team is out to change how we document our lives in another way, with the Office Lens app for Windows Phone 8.
Office Lens, now available in the Windows Phone Store, is one of the first apps to use the new OneNote Service API. The app is simple to use: Snap a photo of a document or a whiteboard, and upload it to OneNote, which stores the image in the cloud. If there is text in the uploaded image, OneNote’s cloud-based optical character-recognition (OCR) software turns it into editable, searchable text. Office Lens is like having a scanner in your back pocket. You can take photos of recipes, business cards, or even a whiteboard, and Office Lens will enhance the image and put it into your OneNote Quick Notes for reference or collaboration. OneNote can be downloaded for free.
OCR accuracy depends on the quality of the image being scanned. Camera-phone users capture images under more diverse conditions than those putting documents through a desktop scanner. Office Lens users might be taking photos from an angle, and photos could be over- or underexposed, blurred, or marred by glare from a whiteboard’s reflective surface. This is where the product team’s collaboration with Microsoft Research improved results for Office Lens.
“We keep things simple for the user,” says Chris Yu, principal group program manager for Office. “Office Lens automatically performs image correction and cleanup to photos on your phone before the files are uploaded to OneNote for storage and text conversion. Office Lens detects the edges of the document or whiteboard, or you have the option to set the border manually. You take a photo, and Office Lens cleans up the image and saves it to OneNote, where the OCR software in the cloud does the text-recognition work that allows you to have a searchable digital file.”
From Whiteboard to OCR
The technological starting point for Office Lens came from a research project originally called Whiteboard It. Whiteboards provide effective, economical, free-form collaboration, but their contents are difficult to archive and share with colleagues not in the room. Whiteboard image capture and data conversion pose unique challenges. Just ask Zhengyou Zhang, principal researcher and research manager of the Multimedia, Interaction, and Communication group.
“Our ultimate goal was to reproduce whiteboard content as a faithful, yet enhanced, electronic document,” Zhang says. “The original project, Whiteboard It, was something I worked on with research engineer Li-wei He.”
Whiteboard It identified key challenges in working with whiteboards, such as compensating for perspective distortion when users take photos from an angle, detecting edges to localize the board’s boundaries, white balancing to deliver a uniformly white background, and strongly saturating the colors of pen strokes. All these issues and more must be addressed to deliver a crisp image that can be integrated with any Office document.
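To make the perspective-distortion step concrete, here is a minimal sketch of one standard way to straighten a board photographed from an angle: estimate the 3x3 homography that maps the four detected corners onto an upright rectangle. This is not the Whiteboard It or Office Lens code, which the article does not show. The function names (`solve_linear`, `homography`, `warp_point`) and the corner coordinates are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a*x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Augment the matrix so every row operation also updates the right-hand side.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("corners are degenerate (for example, collinear)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography(src: Sequence[Point], dst: Sequence[Point]) -> List[List[float]]:
    """Return the 3x3 matrix that maps each of four src corners onto its dst corner."""
    rows: List[List[float]] = []
    rhs: List[float] = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence gives two linear equations in the eight unknowns.
        rows.append([x, y, 1.0, 0.0, 0.0, 0.0, -u * x, -u * y])
        rhs.append(u)
        rows.append([0.0, 0.0, 0.0, x, y, 1.0, -v * x, -v * y])
        rhs.append(v)
    h = solve_linear(rows, rhs)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def warp_point(h: List[List[float]], p: Point) -> Point:
    """Apply the homography to one point, including the perspective divide."""
    x, y = p
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return (
        (h[0][0] * x + h[0][1] * y + h[0][2]) / w,
        (h[1][0] * x + h[1][1] * y + h[1][2]) / w,
    )


# Hypothetical corners of a whiteboard as seen in an angled photo (pixels),
# listed clockwise from the top-left, and the upright rectangle to map them to.
photo_corners = [(120.0, 80.0), (900.0, 140.0), (860.0, 700.0), (60.0, 620.0)]
flat_corners = [(0.0, 0.0), (800.0, 0.0), (800.0, 600.0), (0.0, 600.0)]
H = homography(photo_corners, flat_corners)
```

A full rectification would typically compute the inverse mapping (from the flat rectangle back to the photo) and sample the photo once per output pixel. The sketch stops at the matrix because that is the part the four corners determine.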
Yu learned about Whiteboard It through senior research program manager PD Singh, the liaison between Microsoft Research and Microsoft Office product groups.
“Chris’ team tested the code and very quickly committed to integrating it with Office Lens,” Zhang recalls. “His group is based in Japan, so we had quite a few online meetings, as well as face-to-face meetings in Redmond, to look into use cases and brainstorm solutions.
“Even though Whiteboard It is a core technology of Office Lens, it’s just one component of the whole product. For example, the Office Lens product team had to design the best user experience possible for the general public. That’s a lot of work: thinking, designing, and testing. They looked at ways to cope with a person being part of a snapshot of a whiteboard and how to remove that person from the equation. Our technology prototype didn’t include such considerations. I’ve been really impressed with them.”
Yu, too, is impressed.
“Zhengyou had already envisioned Whiteboard It many years ago,” Yu says. “He saw the need for an intuitive way to enhance networked meetings. That was back when webcams had just started becoming affordable. Today, device technologies and the cloud have evolved and provide a new context for his work. We are delighted to be delivering his vision.”
Begin with High-Quality Images
Office Lens also incorporates algorithms from Advanced Image Editor (AIE), a project that researcher Lu Yuan of Microsoft Research’s Visual Computing Group and his colleagues—principal researcher Jian Sun, research development engineer Jiangyu Liu, and researcher Kaiming He—built to showcase different technologies for improving the quality of consumer photos.
“The main goal of AIE,” Yuan says, “was to provide a platform and user interface that made our core technologies easy to present to product teams. In addition, we used AIE to get feedback from internal users and product groups. The feedback they gave us further improved our techniques and drove new ideas. In fact, it was AIE that introduced our advanced image-processing algorithms to the Office team.”
Within a week of trying out AIE, a product team from Office was back in touch with Yuan to give feedback and to discuss features that would be relevant to their use cases. Yuan and Liu worked closely with product developers to integrate algorithms and code. Then followed an intense second stage of collaboration to test and improve the code, which became part of the Camera Scan feature in OneNote for Windows 8.1, released in November 2013. Their work improved existing camera image-capturing functionality for documents and whiteboards via automatic rotation, straightening, cropping, sharpening, and shadow removal.
“Then we continued our relationship by working with the Office Lens team,” Yuan says. “We contributed our latest techniques and helped them implement the code.”
Office Lens automatically applies some of the key image-enhancement algorithms from AIE to correct image tone and exposure, improve contrast, and reduce “noise.” This might sound like a repeat of Camera Scan, but consider that all of this image-processing functionality had to run efficiently on a mobile device—one of the biggest challenges for Office Lens developers and researchers.
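The article does not describe the AIE algorithms themselves, so the following is only a textbook-style sketch of two of the operations named above: a percentile-based contrast stretch and a 3x3 median filter for noise. The function names, the percentile defaults, and the plain list-of-lists grayscale representation are assumptions made for illustration. Production code on a phone would use far more optimized routines.

```python
from typing import List

Image = List[List[int]]  # grayscale rows, each value in 0..255


def stretch_contrast(img: Image, low_pct: float = 1.0, high_pct: float = 99.0) -> Image:
    """Linearly remap intensities so the chosen percentiles span 0..255."""
    values = sorted(v for row in img for v in row)
    lo = values[int((len(values) - 1) * low_pct / 100)]
    hi = values[int((len(values) - 1) * high_pct / 100)]
    if hi == lo:
        # A flat image has no contrast to stretch; return a copy unchanged.
        return [row[:] for row in img]
    scale = 255.0 / (hi - lo)
    return [[min(255, max(0, round((v - lo) * scale))) for v in row] for row in img]


def median_filter_3x3(img: Image) -> Image:
    """Replace each pixel by the median of its 3x3 neighborhood (edges clamped)."""
    h, w = len(img), len(img[0])
    out: Image = []
    for y in range(h):
        row = []
        for x in range(w):
            window = sorted(
                img[min(h - 1, max(0, y + dy))][min(w - 1, max(0, x + dx))]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            )
            row.append(window[4])  # middle of the nine sorted values
        out.append(row)
    return out
```

The median filter is used here because it removes isolated bright or dark specks while keeping pen-stroke edges sharper than a simple average would. Whether Office Lens uses a median filter is not stated in the article.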
“Working with a product group is very encouraging,” Yuan says. “It gives us passion and motivation, and the open dialog helps us understand what’s required for a better user experience. We want to develop techniques in computational photography that help users achieve high-quality photos, so our research goals are in perfect alignment with Office Lens goals.”
“I am very impressed with the efficacy of Chris’ group,” Zhang says. “Our collaboration with his team was very smooth. They’re open to suggestions and work really hard. I’m thrilled that the technology for Whiteboard It has been productized, and I want to thank the Office Lens team. I truly believe Office Lens will help improve productivity and collaboration.”
From the product side of the effort, Yu also is thrilled.
“The collaboration we’ve had with Microsoft Research for Office Lens has been a natural fit,” he says. “Microsoft Research has such deep expertise in so many aspects of image enhancement. For example, we were able to consult with Neel Joshi and Sebastian Nowozin about best practices for blur measurement and Piotr Dollár on edge detection.”
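The article does not say which blur measure the team settled on. As an illustration of what blur measurement can look like, here is a sketch of one widely used focus measure, the variance of the Laplacian: sharp images have strong second-derivative responses at edges, so the variance is high, while blurred images score low. The function name and the list-of-lists grayscale input are assumptions for this example.

```python
from typing import List

Image = List[List[int]]  # grayscale rows, each value in 0..255


def laplacian_variance(img: Image) -> float:
    """Return the variance of the 4-neighbor Laplacian over interior pixels.

    Higher values suggest a sharper image. The image must be at least 3x3.
    """
    h, w = len(img), len(img[0])
    responses = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            responses.append(
                img[y - 1][x] + img[y + 1][x] + img[y][x - 1] + img[y][x + 1]
                - 4 * img[y][x]
            )
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)


# A hard vertical edge versus the same edge smeared into a gradual ramp.
sharp_edge = [[0, 0, 0, 255, 255, 255] for _ in range(4)]
soft_ramp = [[0, 51, 102, 153, 204, 255] for _ in range(4)]
is_sharper = laplacian_variance(sharp_edge) > laplacian_variance(soft_ramp)
```

An app could compare such a score against a threshold, or across several frames, to decide whether to ask the user to retake a photo. Any threshold would need tuning on real images.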
What have early users been saying about Office Lens?
“Feedback has been incredibly positive,” Yu says. “They rave about how much time it saves and how it’s changed the way they take notes, now that they know they can just take a photo of a document and find it later in the OneNote cloud. We couldn’t have done this without Microsoft Research.”