Tables Part 6: Removing duplicates from tables (and ranges) of data

Removing duplicate rows from a table of data is a request we hear fairly often from our customers (and one of the top questions in the comments on this blog). Some users know that this capability exists in Excel today; unfortunately, it is buried under the Advanced Filter settings and is not terribly easy to use. So in Excel 12 we set out to build a better interface specifically for this task, so that any user could easily remove unnecessary data from their spreadsheet.

Remove Duplicates can be found in two places in Excel 12: on the Data ribbon and on the Table ribbon (just like sort and filter, it does not require a table). To use the feature, simply select the data you want to examine for duplicates and press the “Remove Duplicates” button. This brings up a dialog that looks like this:

[Screenshot: the Remove Duplicates dialog]

You’ll notice that all my column headers appear in the dialog. To remove duplicates, just select the columns that Excel should use to decide whether two rows are duplicates. For example, in my table above I want to remove all duplicate rows where the first name is the same and the last name is the same. In other words, if there is more than one row where FirstName = David and LastName = Gainer, then the extra duplicate rows will be removed. So my table, which looked like this:

[Screenshot: the table before removing duplicates]

now looks like the following after I remove duplicates:

[Screenshot: the table after removing duplicates]
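For readers who think in code, the column-based rule described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Excel's implementation: the function name `remove_duplicates`, the sample rows, and the `City` column are all made up for the example, and the sketch assumes exact value comparison and that the first row for each key is the one kept.

```python
def remove_duplicates(rows, key_columns):
    """Return the rows with duplicates removed, keeping the first row seen for each key.

    Two rows count as duplicates when they match on every column in key_columns;
    columns that were not selected are ignored in the comparison.
    """
    seen = set()
    unique_rows = []
    for row in rows:
        # Build the comparison key from the selected columns only.
        key = tuple(row[col] for col in key_columns)
        if key not in seen:
            seen.add(key)
            unique_rows.append(row)
    return unique_rows


# Hypothetical sample data (not the table from the screenshots).
people = [
    {"FirstName": "David", "LastName": "Gainer", "City": "Redmond"},
    {"FirstName": "Joe", "LastName": "Smith", "City": "Seattle"},
    {"FirstName": "David", "LastName": "Gainer", "City": "Bellevue"},
]

deduped = remove_duplicates(people, ["FirstName", "LastName"])
for row in deduped:
    print(row["FirstName"], row["LastName"], row["City"])
```

Because only FirstName and LastName are selected, the third row counts as a duplicate of the first even though its City differs. Selecting all three columns would keep every row in this sample.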

Note that Remove Duplicates physically removes data from your spreadsheet; it does not hide rows. You can, of course, undo Remove Duplicates if you make a mistake. If you want to look at the duplicate values first, you can use the new “Highlight Duplicate Values” feature that we have added to conditional formatting.
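The "look before you delete" idea can be sketched in Python as well. This is a rough analogy rather than how conditional formatting works internally: the helper name `flag_duplicate_values` and the sample values are hypothetical, and the sketch assumes every occurrence of a repeated value is flagged (the first one included) and that nothing is removed.

```python
from collections import Counter


def flag_duplicate_values(values):
    """Return one boolean per value: True if that value occurs more than once."""
    counts = Counter(values)
    return [counts[value] > 1 for value in values]


# Hypothetical column of last names.
last_names = ["Gainer", "Smith", "Gainer", "Jones"]
flags = flag_duplicate_values(last_names)

for name, is_duplicate in zip(last_names, flags):
    marker = "duplicate" if is_duplicate else ""
    print(f"{name:10}{marker}")
```

The input list is left unchanged, so this kind of check is a safe first step before removing anything.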

In my next post I will finish up table-specific features by reviewing the work we did with table styles.