Finding What to Read: Visual Text Analytics Tools and Techniques to Guide Investigation


July 12, 2016


Christopher Collins


University of Ontario Institute of Technology


Text is one of the most prominent forms of open data available, from social media to legal cases. Text visualizations are often criticized for not being useful, for being unstructured, and for presenting data out of context (think: word clouds). I argue that we should not expect them to be a replacement for reading. In this talk I will briefly discuss the close/distant reading debate, then focus on where I think text visualization can be useful: hypothesis generation and guiding investigation. Text visualization can help someone form questions about a large text collection, then drill down to investigate through targeted reading of the underlying source texts. Over the past 10 years, my research has focused primarily on creating techniques and systems for text analytics using visualization, across domains as diverse as legal studies, poetics, social media, and automotive safety. I will review several of my past projects with particular attention to the capabilities and limitations of the technologies and tools we used, how we use semantics to structure visualizations, and the importance of providing interactive links to the source materials. In addition, I will discuss design challenges that, while common across visualization, are particularly important with text: legibility, label fitting, and finding appropriate levels of ‘zoom’.