Business Tips

5 Data Cleansing Mistakes Every Business Makes – Even Yours

Whether you’re mining your own data for ideas to help you grow your business, or you’re a full-time data analyst whose job is to reveal trends and information about others’ organizations, one thing’s certain: data quality is critical to your success.

However, even in the era of “big data” – with software that helps mine data for important discoveries and insights – cleaning up data by hand (aka “data wrangling” or “data munging”) is still a huge part of any project. In fact, it’s estimated that data scientists “spend from 50 percent to 80 percent of their time mired in this more mundane labor of collecting and preparing unruly digital data, before it can be explored for useful nuggets,” according to an article in the New York Times.

Photograph of a person seated at a conference table in a high-rise office building working on a Surface Book.

If you’re a business owner hoping to grow your company, your data informs what you do, how you grow, and where you head next. Therefore, your data must be as close to perfect as possible. But when you’re doing so much data cleansing by hand, making or overlooking errors – especially simple things that you’ve done hundreds of times – is easy to do. Which is why we’re taking a step back and looking at some of the basics that should be part of your best practices.

Best practices

  • Check spelling. Running a spell check is easy to forget, but it’s a simple step that can help make your data easier to understand. Not only can you use a spell checker to find misspelled words, but depending on the data cleansing tools you use, it might be able to help you find values that aren’t consistently used – like product or company names – which you can then add to a custom dictionary.
  • Remove spaces. Leading, trailing, or multiple embedded space characters, and nonprinting characters can cause results that are difficult to understand. To avoid them, be sure to remove spaces or unwanted characters before attempting to sort, search or filter your data.
  • Create uniform time and date formats. Because there are so many different time and date formats that can be confused with numbered part codes or other strings that contain slash marks, hyphens, periods, and colons, all times and dates should be reformatted.
  • Find and replace text. If your data has a common leading string that would be best to remove, like a label followed by a colon and space, or a suffix, like a parenthetic phrase at the end of the string that’s unnecessary, you can use a “find and replace” feature in your spreadsheet to automatically find instances of that text, then replace it with no text or another piece of text.
  • Remove duplicate rows. It’s not uncommon to find duplicate rows when you import data, but you have to be aware of the problem to remove them. To do this, filter your data for unique values, confirm that the results are what you want, then simply remove any duplicate values.
  • Merge and/or split columns. In order to make your data easier to digest, you might have to split some columns and merge others. For instance, you may want to split a column that contains a full name into two containing a first and last name. Or, you may want to split a column that contains an address field into several columns that contain the individual elements of the address, like street address, city, state, region, and/or postal code. Likewise, you may prefer one column with full names, or a single column for an address, and have to merge columns to organize your data correctly. Simply do what makes sense for your organization – and do it consistently.

Conclusion 

In an era where data seems to be everywhere, the importance of using clean data from reliable sources cannot be overstated. After all, if your data is messy, poorly designed, or inaccurate, it won’t give you information you can act on. So once your data is clean, keep it that way. Run periodic checks to make sure that your team is adhering to your standards and best practices. The quality of your data will improve, and you – and your entire team – can act on the data with confidence.

Jumpstart your business with a crash course in Microsoft 365

Empower your team to be productive every day, from virtually anywhere, with Microsoft 365. 

Learn more

Get started with Microsoft 365

It’s the Office you know, plus the tools to help you work better together, so you can get more done—anytime, anywhere.

Buy Now
Related content
Business Tips

Email analytics: Use data to write better emails

Read more
Business Tips

How Molly Moon’s uses data analytics to shape their menu

Read more
Business Tips

Use analytics to achieve a better work life balance

Read more

Business Insights and Ideas does not constitute professional tax or financial advice. You should contact your own tax or financial professional to discuss your situation..