Chris Pratley is the partner director of engineering for the Sway team.
Since announcing Sway on October 1st, we’ve heard from thousands of you who signed up to join the Sway Preview. We’re thrilled that so many of you see the benefit of Sway in your own lives. We also know that many of you want to understand more about why we developed Sway. In this post, I’ll take you through more of the background behind Sway and the philosophy behind the product.
Why did we develop Sway?
Sway comes from the Microsoft Office team. Many of the people on the team have worked on productivity software for years. We have deep experience in how people get stuff done, what they are trying to achieve, what matters to them, and how work, and content creation in particular, has evolved over time. Over the years we have improved our core applications, making them easier to use, more powerful, more connected to the internet, available on mobile devices, etc. Our core applications, such as Microsoft Word, Excel and PowerPoint, have over a billion users, and with the recent moves to make them available on all popular devices, that’s going to continue.
As I’ll outline, we felt that a number of factors were pointing toward an opportunity to provide a new and different experience people could choose depending on their content creation, formatting, presentation and sharing needs. Just as important, the time was right and the technology within reach to deliver on this innovative new approach.
Let’s do some time travel
Rewinding a few decades, some readers may recall that era in the mid-80s when the desktop publishing revolution happened. All of a sudden any person, using a personal computer, graphical tools, fonts and a laser printer could create documents that looked as though they had been made by a publishing house. The sense of empowerment was palpable—something had changed. For the first time, a typical author could both write their content and prepare it for final distribution by themselves.
It was around this time that the paradigm for content creation tools we are all familiar with today was developed and standardized: WYSIWYG (What You See Is What You Get, pronounced “wizzy-wig”). Given how often the term WYSIWYG is used we don’t always consider the origin of this term. The revolutionary idea was that with the advances in operating systems, graphics and font technology what you see on the screen would be what you get (when you printed—since that was all anyone really did back then!). This seems obvious now, but before WYSIWYG, you couldn’t see what your output would look like on the screen while you worked on it. People had to use arcane embedded codes somewhat like HTML (which came later) to get the output they were looking for. It involved a lot of trial and error and mental gymnastics: work which did not help with productivity.
Among the earliest content creation tools were word processors (a family that includes Word and many others). These tools focused on helping people place flowing text onto a fixed-size page and made it easy to interrupt that flow with images and tables, among other things. While word processors got easier to use, new tools appeared, such as presentation aids (PowerPoint and others). These initially had the aim of helping people create physical “slides” to use in a slide projector. Later, with the advent of digital projectors, people could skip that step and project digitally. Other, more specialized authoring tools appeared as well for page layout, charting, diagrams, etc. They all adopted the popular WYSIWYG method, which continues to serve us well today. While WYSIWYG remains powerful, there is an opportunity for an alternate model that tackles some of the challenges of today’s world.
Times are changing
The need for adaptive layout
With the advent of devices of different screen sizes, from wall-sized displays through desktops, laptops, tablets, phablets, phones, watches and even goggles, the author no longer controls how content is consumed as they did when they printed on paper. The content creation tools we use today, even modern web-based tools, more or less assume the output target is a fixed-size rectangle. Even tools that don’t target paper will target, say, a screen of a certain size (e.g. 1024 pixels wide). There are some newer tools that offer “responsive design,” where the content is flexible across screen sizes, provided you stick with the convention of vertically scrolling content. However, you generally have to compromise a lot with those tools. You either settle for simplistic layouts or rigid templates that tend to make your content look the same as everyone else’s, or you write your own code. People increasingly need a way to get great, device-appropriate output that helps them get their message across and makes their content look good and unique.
The design bar is higher
Who hasn’t noticed that the “professional” content they see these days, such as some articles and web pages, has moved beyond static, uninspired styling to be well-designed (attractive and balanced), visual (lots of media), dynamic (meaning it updates and animates), and interactive (content changes or is revealed in response to user input)? Yet the great majority of us ordinary users lack the skill or talent to build such content. Our own creations are often flat, static and dull in comparison.
Mobile is ascendant
There are about a billion PCs active in the world. There are many more phones, many of which are “smart,” meaning they run apps and you can get stuff done on them beyond basic communication. If you’re like me, you live on your phone. If I could do all my work on my phone, I would. There are plenty of people like me, and even more who don’t have the option: they have a phone or tablet but no easy access to a PC. What sort of authoring experience is right for a phone? Is it a WYSIWYG experience, just like a desktop, where you manually adjust the location and formatting of text and objects, only with bigger buttons and without a precision pointing device? That is one way, but mobile opens the door to a new approach.
People are busier than ever
In an age where people have more apps than they know what to do with, people have less time than ever to master new tools. Many people also have only basic design skills, and shy away from attempting anything fancy because they aren’t confident or aren’t willing to put in the time. Nevertheless, everyone wants their work to look good, but not look like they tried too hard. We need to work with others without hassle. We have to create, access, and modify our material from anywhere, from any device, more easily than ever before.
And now for something (completely) different
It was time to bring to market some ideas we had been working on quietly for some time. I say quietly, but we did tip our hand in some widely watched videos about the Future of Productivity. Our dream was to enable customers to just tell us what they wanted, and get that as a result. If this approach succeeded, we felt we could offer a completely different, natural experience. We would be able to engage with customers on a different level (at the level of meaning and intent, rather than low-level settings and properties). And we would really change the game when it comes to content authoring and presentation. We might call this new approach What You Get Is What You Want (WYGIWYW, pronounced “wiggy-woo”). But mostly we call it “intent-driven.”
For our intent-driven approach to work, we laid out some core requirements:
- To solve the adaptive layout problem, we need an algorithm that will allow us to take an arbitrary set of content and make it look good on any target device.
- To solve the design problem, we need an algorithm that can take any arbitrary content and style it in a way that looks professional, cool, or fun or whatever the user’s intent is.
- For the authoring experience:
  - To have the freedom to do the first two, we have to invent an authoring experience that gives us enough guidance from the user but doesn’t encourage the user to constrain us too much.
  - For mobile, we need an experience that lends itself to smaller devices, held in the hand, and lacking precision pointing mechanisms (fat fingers rather than mice).
Towards an intentional model
If we could invent these things, we would still need one more thing: a new “user contract.” We would need to change what the user expects from us and what we can expect from them. This part was the most controversial of all.
Many people have lived their whole lives with the WYSIWYG model and its handmaiden, direct formatting. In that model, when the user applies some kind of formatting, the tool faithfully reproduces it. The formatting is directly applied to text and objects, and it is unchanging. To make a direct formatting tool more powerful, you add more commands. You add macro buttons that set many properties at once to save time. Direct formatting uses commands such as Bold to get bold text and Italic to get italic text. It also means you can drag an object on the screen to place it in an exact location.
Our bet was that despite 30 years of habits around WYSIWYG and direct formatting, we could convince people that they didn’t actually want that some of the time. We felt that what they often really wanted was something more like a great shopping experience.
“What’s that now?!” I hear you asking…
Shopping and formatting
What I mean is that WYSIWYG expects users who want new clothes to be clothing designers who (as with direct formatting) specify thread colors, stitch pattern, thread count, cut of cloth, etc. If instead we could show them great-looking clothes and they could decide what worked and customize it, they wouldn’t need to have all that skill and spend all that time. But we couldn’t just show them a limited set of options “off the rack,” so to speak. Users want to look fabulous and unique, and to get something tailored to their needs and desires. For users to get the most out of what Sway can do, and be able to express themselves, we needed to invent an experience that lets them customize their output without needing to be an expert designer and tailor.
Our inspiration was the kind of shopping experience you might have when you go to buy a suit in a high-end men’s store or a dress at a custom dress shop. The salesperson might ask you what the occasion is for the garment (work, party, etc.). They will bring out a few items to get your reaction. Perhaps you have brought a photo of something you like. All of us find it much easier to say what we like or don’t like than to explain exactly what we want, or even why we like something. You can tell the salesperson which items you like, and maybe even which aspects of the design you like, but a good salesperson might not make you do that. They will then bring a few more designs from the back room that are inspired by your input. A few more rounds of this, and you will usually get a result that is “you” but also something you might never have been able to articulate without the salesperson’s expertise.
You see this model in software in a few places. Pandora asks you to tell it a few songs or artists that you like and it computes the rest—no need to explain about beats per minute, genre, etc. Netflix asks you to rate some movies and it helps you with suggestions. In those cases they are helping you navigate a fixed catalog of content, whereas Sway is (and will be) generating layout and style from your own input.
Breaking the rules
Bringing this back to documents, the biggest rule we were breaking was the basic direct formatting rule. After all, what sort of text editor doesn’t make it easy for you to specify bold, italic or underline? In our view, it would be a text editor that made the text look better than that for you, kept your document looking consistent as you explored styles, and adapted the look to match your other work or content and even the device your readers were using. These problems are ones a user might appreciate us solving, but that “downstream” value might be hard to appreciate when you just want to make something bold.
It would take some serious “upside” for users to move away from a familiar model they were comfortable with. We did some research asking people who had made various documents why they had chosen the formatting they ended up with—what were they trying to achieve? In the simple example of bold, italic and underline, at first people said, “I wanted this text to be bold.” But when we pushed them on why, they admitted they just wanted the text to stand out from the surrounding text. The same was true of italic, underline, color and so on. What we found was that with a few exceptions, people were trying to express an “intent” much more than they cared about the specific formatting they selected. The same is true of just about anything in a document: “I am going for an open airy look,” “This image matters more than the others,” “I am trying to look professional (or edgy, etc.),” “I would like more color or pictures to spruce this up.”
What you see in Sway today is the beginning of a new intent-driven content creation model, one where you tell Sway what really matters: sequence, importance, hierarchy, “keep together,” etc. You can even tell Sway to key off a favorite image and we’ll use it as the inspiration for the color palette. Sway takes your input and uses the full power of our formatting and layout engine to produce highly designed, animated, interactive content that respects your intent. Because Sway ensures your output is coherently designed and presented, at any time (including immediately!) you can declare “I’m done,” and your content will look great. Or, continue to refine until you are satisfied.
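For readers who think in code, the difference between direct formatting and an intent-driven model can be sketched in a few lines of Python. This is purely illustrative and is not how Sway is implemented: the names `Emphasis`, `Block`, `Device` and `render`, along with the pixel threshold and style values, are all hypothetical. The author records only what they mean ("this should stand out"), and a renderer picks concrete formatting for each device.

```python
from dataclasses import dataclass
from enum import Enum


class Emphasis(Enum):
    """What the author means, not how it should look."""
    NORMAL = "normal"
    STAND_OUT = "stand_out"


@dataclass(frozen=True)
class Block:
    """A piece of content tagged with intent instead of formatting."""
    text: str
    emphasis: Emphasis = Emphasis.NORMAL


@dataclass(frozen=True)
class Device:
    """A hypothetical output target, described only by its width."""
    name: str
    width_px: int


def render(block: Block, device: Device) -> dict:
    """Choose concrete formatting from the block's intent and the device.

    The threshold and style values are invented for illustration only.
    """
    narrow = device.width_px < 600
    style = {
        "text": block.text,
        "font_size": 14 if narrow else 16,
        "weight": "regular",
        "color": "body",
    }
    if block.emphasis is Emphasis.STAND_OUT:
        if narrow:
            # Little room on a small screen: make the text bold.
            style["weight"] = "bold"
        else:
            # Plenty of room: use a larger size and an accent color.
            style["font_size"] = 24
            style["color"] = "accent"
    return style


phone = Device("phone", 360)
wall = Device("wall display", 3840)
headline = Block("Quarterly results", Emphasis.STAND_OUT)

phone_style = render(headline, phone)
wall_style = render(headline, wall)
```

The author states the intent once, and the same block comes out bold on the phone but larger and accent-colored on the wall display. Under direct formatting, the author would have had to choose one of those looks up front.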
The options we present for refining “intent” are currently limited in the Sway Preview, but you will see a lot more in this area in the months to come. Maybe we’ll eventually even let you do direct formatting when you absolutely need, say, bold text in a certain RGB color or font, but it will be the exception. Our philosophy is that at every level of customization, Sway will offer some great, appropriate options for you to pick from. These will be unique to your context, content and usage history. It’s just easier with an assistant, and we’ll be refining and personalizing those options using machine learning over time.
Learning how our bets play out
Before we launched Sway Preview, we had a number of burning questions, which we turned into “bets”:
- Will people be able to let go of WYSIWYG and switch to intent-driven (WYGIWYW)?
- Will people be comfortable that their output is adapted to the device of the consumer/reader?
- In an age of “there’s an app for that,” can we develop a successful new general purpose Office app?
- Will people be comfortable with cloud-based documents and the lack of “files”?
- What will people make of the new type of creation that Sway makes? As an amorphous, interactive, cloud-based “document” composed of a blend of the user’s content (and some curated internet content) spanning a variety of multimedia, a Sway is authentically digital and has no analog among existing content types.
With Sway we took a new approach, not only with the app, but for Office and for Microsoft as well. We released a bare-bones Preview to get feedback from users, so that we could tune and react in an agile way as we continue to develop Sway. I am pleased to report that so far our bets are all going very well. We already have many users making multiple Sways, and some dedicated fans who love the experience even in its limited form. They are using Sway for all sorts of things—from work to school to family to running for elected office! We know that some people will take time to become comfortable with the cloud-based nature of Sway, and its overall approach—and that’s fine. But we’re thrilled that so many people are happy with it already.
In the weeks and months to come you’ll see Sway change and adapt as we release new capabilities and tune up experiences based on feedback. I hope you’ll join us in the UserVoice forum where we’re collecting requests—you can vote on your favorites.
—Chris Pratley, @ChrisPr