By Janie Chang, Writer, Microsoft Research
Emre Kıcıman was online browsing the business news when he noticed a box around the name of a startup company in an article about its acquisition by an industry giant. When he moved his cursor over the box, a pop-up appeared to inform him that one of his friends now worked for the startup. Kıcıman promptly sent his friend a congratulatory message.
For Kıcıman, a researcher with the Internet Services Research Center (ISRC) at Microsoft Research Redmond, this is a typical example of how connecting browsers with social-networking services can deliver an enhanced user experience. Together with Chun-Kai Wang, ISRC research software-development engineer, Kıcıman has launched the Social Web Experience to explore new models for connecting social networks with the Web-browsing experience.
Their first experiment makes use of the Social Web Experience Toolbar for Internet Explorer. The toolbar, a research prototype, analyzes Web pages and summons related content from the user’s social networks. The goal is to provide an integrated experience that presents messages, updates, and tweets in context while browsing.
Even the most dedicated user of social networks will admit that keeping up with favorite blogs, news sites, and all those tweets and updates can get time-consuming. Kıcıman and Wang wanted a way to stay current without constantly checking social-networking sites or receiving disruptive alerts.
“We realized that, with some of the technology we have at Microsoft Research,” Kıcıman says, “we could take social-networking messages and show them to you when they’re relevant to what you’re doing. We’re increasing the chance of serendipity—providing more opportunities to find out whether friends are saying or doing things related to a topic you’re reading about, right now.”
With help from Silviu-Petru Cucerzan, a researcher with the Text Mining, Search and Navigation Research group, the pair leveraged technology for entity extraction, the analysis of text for recognition of entities such as names, places, objects, or dates.
“By taking that technology and applying it to the Web pages you’re browsing,” Wang says, “and also applying it to your social-network data, we can figure out when there’s something that matches between the two.”
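To make the matching idea concrete, here is a minimal sketch of how entities found on a page might be compared with entities found in social messages. It is an illustration only, not the toolbar's actual method: the `extract_entities` and `match_messages` functions are hypothetical, and a naive capitalized-word pattern stands in for the real entity-extraction technology, which the article does not describe in detail.

```python
import re

# Hypothetical stand-in for a real entity extractor: treats runs of
# capitalized words as candidate entities. This is an assumption for
# illustration, not the technique used by the toolbar.
ENTITY_PATTERN = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)*\b")


def extract_entities(text):
    """Return the set of candidate entities in text, lowercased for comparison."""
    return {match.group(0).lower() for match in ENTITY_PATTERN.finditer(text)}


def match_messages(page_text, messages):
    """Map each entity found on the page to the messages that also mention it."""
    page_entities = extract_entities(page_text)
    matches = {}
    for message in messages:
        # Only entities present in both the page and the message count as a match.
        for entity in extract_entities(message) & page_entities:
            matches.setdefault(entity, []).append(message)
    return matches


page = "Shares rose after Contoso agreed to acquire Fabrikam."
messages = [
    "Excited to start my new job at Fabrikam next week!",
    "Anyone up for lunch?",
]
result = match_messages(page, messages)
```

In this sketch, only the first message shares an entity with the page, so it is the only one that would be attached to a highlighted name.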
The Long and Short of Text Analysis
There are significant differences, though, between entity extraction and the kind of analysis Kıcıman and Wang are trying to achieve.
For one thing, work to date on text analysis has focused on large documents, where more text is available to help establish context. With social messaging, however, the researchers are working with short bursts of text. When a tweet says, ‘i luv cars,’ is this a reference to automobiles or to the animated feature from Pixar?
“Entity extraction isn’t trivial, even when you’re analyzing a long document,” Wang says. “But you do stand a better chance of finding words such as ‘movie’ or ‘plot’ to help narrow down context. In short messages, we don’t have that much data, so figuring out context is even tougher.”
The other difference is that while entity extraction looks for specifics such as proper nouns and phrases, movie titles, or places, Kıcıman and Wang also are seeking categories of information: topic extraction.
“I think there is a broad spectrum of approaches,” Kıcıman says, “for making matches between text and topics and then understanding when it is appropriate to display that text to a user. We are sorting through existing entity extraction and other analysis techniques to see how they could have applicability to short messages.”
From a research point of view, improving text analysis is where the team plans to focus its technical work. Kıcıman and Wang are examining various approaches to topic extraction for short social-networking messages, such as pulling in profile information and additional messages to fill contextual gaps.
“Our approach has been to build out the end-to-end tool with existing technology,” Wang says, “go back and see where they fall short, and then dive deep into those areas to improve them.”
A Smooth Delivery
Getting to a solution for reliable topic extraction is certainly critical to this project, but understanding its impact on user behavior is another major goal. The team needed the toolbar implementation to be as unobtrusive as possible, delivering social messages and data in a smooth, discreet fashion.
“Are you more likely to click and read an article,” Wang says, “if you knew that a friend had already read it and commented on it? If you were buying a plane ticket to Boston and saw a pop-up about a friend who lives there, are you more likely to e-mail about getting together for a beer? We want the user to come to the information smoothly and intuitively; otherwise they just won’t bother or, worse yet, abandon the experiment.”
One of the researchers’ first user-interface decisions was how to display matching social messages. Fortunately for Kıcıman and Wang, their experiment piqued the interest of Wissam Kazan, program manager with the Windows Live group, who contributed a number of ideas for the interface. They looked at different types of Web sites, particularly news pages with lots of headlines and topics.
“We found if we showed a list of matching messages off to the side,” Kıcıman says, “that it was hard to tell which items were related to the messages. That’s why we went with a mark-up approach, highlighting matched words with boxes and showing messages only when the cursor touches a box.”
Tweaking thresholds for messages is also part of the user-interface experience. Too many matches can be as intrusive as constant messaging alerts, so there is a slider bar that enables users to set a threshold for the number of messages to display, from only the most relevant to showing all of them. The Social Web Experience also has settings to configure the type of information it processes and which social-networking sites to include.
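The slider's behavior can be pictured as a simple cutoff on a relevance score. The sketch below assumes each match carries a score between 0 and 1; the `Match` class, the `relevance` field, and the `filter_matches` function are hypothetical names introduced for illustration, since the article does not say how the toolbar scores or filters matches.

```python
from dataclasses import dataclass


@dataclass
class Match:
    message: str
    relevance: float  # hypothetical score in [0, 1]; higher means more relevant


def filter_matches(matches, threshold):
    """Keep matches scoring at or above threshold, most relevant first."""
    kept = [m for m in matches if m.relevance >= threshold]
    return sorted(kept, key=lambda m: m.relevance, reverse=True)


candidates = [Match("a", 0.9), Match("b", 0.4), Match("c", 0.1)]

# Slider near the "most relevant" end: only the strongest match survives.
only_best = filter_matches(candidates, 0.8)

# Slider at the "show all" end: every match is displayed.
everything = filter_matches(candidates, 0.0)
```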
“We’ve already received interesting requests from users,” Wang says with a grin, “such as being able to ignore messages from certain friends.”
Defining Through Prototyping
“We built our end-to-end prototype toolbar to observe how it works and how it gets used,” Kıcıman explains. “And this is helping us set the scope and direction of our research in a way that would have been hard to define up front. So with end-to-end prototyping as a first step, we are able to get feedback that helps refine the research challenge.”
Releasing the prototype has highlighted interesting issues. While Kıcıman and Wang knew that improving text analysis would be the most difficult part of their research, the nature of communication via social networking adds to the text-analysis challenge.
“We are finding that language in a text page or article is very different from language in a social message,” Wang says. “Capitalization, for example, is used to convey emotion in social text, so that is something which throws off the analysis. Also, social text is very dynamic in that how people use words and phrases changes frequently, so in order to keep up with the language, we will need to re-optimize the engine from time to time.”
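One conceivable way to reduce the effect of emphatic capitalization is to normalize it before analysis. The sketch below is an assumption about what such preprocessing could look like, not a description of the researchers' engine; the `soften_shouting` function and its `min_length` parameter are hypothetical.

```python
def soften_shouting(message, min_length=3):
    """Lowercase words written entirely in capitals, leaving other words untouched."""
    words = []
    for word in message.split():
        # Ignore punctuation when deciding whether a word is all capitals.
        letters = [c for c in word if c.isalpha()]
        if len(letters) >= min_length and all(c.isupper() for c in letters):
            words.append(word.lower())
        else:
            words.append(word)
    return " ".join(words)


normalized = soften_shouting("I LOVE this movie")
```

A rule this simple also lowercases legitimate acronyms, which hints at why capitalization in social text is hard to handle with a fixed rule.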
The researchers also have noticed that the way people express themselves on Facebook is different from the way people communicate using Twitter. Similarly, Kıcıman and Wang suspect that the way people talk when they have millions of followers or represent a commercial organization will be different from messaging just between friends. These nuances influence their research requirements and would not have been possible to articulate without first seeing a prototype in action.
A New Browsing Experience
For those interested in trying a new browsing experience, the Social Web Experience toolbar is downloadable, and installation is simple.
“After you install it, you will see the changes in Internet Explorer right away,” Wang says. “You will see the new toolbar, and it takes about five seconds to download Twitter data. For Facebook, there is a button that logs you into Facebook to grant access to the toolbar, and you only need to do this once.”
Since the researchers are interested in user behavior, what kind of information do they collect?
“None, if you don’t want to provide any information,” Kıcıman says. “But usually, people who participate in an experimental prototype want to contribute feedback, so we are collecting good amounts of data. We receive purely quantitative statistics: how many matches were found, how many were from Twitter and how many from Facebook, the quality of the matches, and so on. These statistics are not unique or private. In addition, there are feedback buttons that allow you to send comments or suggestions to us.”
Wang stresses that all toolbar analysis happens on the client side, so there are no privacy issues. While the toolbar can grab information such as updates, tweets, and profiles, it does so while respecting all your friends’ privacy settings.
More information about the toolbar is available on the Social Web Experience FAQ.
A Vision for Integrated Browsing
If the notion of browser integration with social networks takes off, what sort of scenarios do the researchers envision in the future?
“We have only just launched this experiment,” Kıcıman says, “so while we’ve already received some good feedback, we need to wait a bit longer to get a good idea of how this changes the browsing experience.
“But we have tossed around some scenarios, such as travel planning. If you’re browsing travel sites, the toolbar could bring in photos of vacations from friends who had been to those destinations, and you could e-mail or text them to ask for travel tips. The question is: How much more attention would you give to social data if presented this way? Would this help you understand a task better, facilitate activities and decisions, or would you carry on as usual?”
In the meantime, they are working to improve text analysis and developing requirements for topic extraction, the most challenging part of their research. They will be using collected data to train the analysis engine, and they already have minor updates planned for the Social Web Experience. Each new release undoubtedly will elicit more feedback about the prototype.
“That’s what we want,” Kıcıman says, “and that’s the fun of this kind of research. We’re exploring new ways to browse and exploring the new technical challenges that go with it.”