Spreadsheets are great at handling large quantities of data, which is both a blessing and a curse. It's almost always easier to handle huge data sets in a spreadsheet than in any other format. When in comes time to drawing conclusions about your data, however, you often need to shrink your data set down to a more manageable size. A default spreadsheet in Excel, for instance, has over one million rows and over [cue the Doctor Evil voice] two billion cells. That's a bit much for anyone to handle.
Enter Filters, a tool that allows you to shrink your visible data set based on your preferences. In this tutorial, I'm going to cover how to wield filters to master large data sets and take one step closer to becoming the Sorcerer Supreme of Spreadsheets. There are other, more advanced tools like PivotTables that can do even more to summarize your data, but we'll cover those in a future installment.
How to Enable Filters
There are two ways to enable filters, depending on how 'clean' your data is. A clean data set, in this context, means that your spreadsheet has column headers and continuous data entries in every cell, like this:
If your data looks all clean and tidy, you can enable filters the easy way:
1 Select any non-blank cell in your data set.
2. Open the Data menu and select 'Filter'.
3. Your data is now filtered! You can tell because each column header how has a filter icon next to it. The color of the column letters and row numbers will also change:
If you don't have a 'clean' data set, you'll need to highlight your data using the Shift key combined with either your mouse or your keyboard. After selecting your data, follow the steps above to turn filtering on.
Now that we have filters enabled, let's look at an example for how we can use this new tool. If you want to follow along, you can grab a copy of this spreadsheet and try for yourself: https://docs.google.com/spreadsheets/d/17T_2EBB9wW11OlEv4Myylol9bI3mDdQ4aFSA2KRDNSQ/Copy
Let's say we're looking at some stream quality data with over 100 data points. But really we just want to focus on one of our sites, Kelly Warm Springs. We'll use filters to hide all of the other cells.
1. Click on the Filter icon (the inverted triangle) next to 'Site Name" in cell B1:
2. Under the section where it says "Filter by values..." tap the link labeled 'Clear'. This will uncheck all of the boxes next to our site names:
3. Next, scroll down the list (or better yet, use the search box) until you find the site you're interested in displaying. Select this box:
4. Click the blue OK box. Your data is now filtered, check it out. All of the non-Kelly Warm Springs sites are hidden and you can see that the filter icon in the Site Name column has changed to a closed filter shape. Also, notice that the rows numbers on the left-hand side of the screen are no longer sequential.
That's basic filtering in a nutshell: open the filter dialog box, clear all the checkmarks, then re-check the values that you want to see. Of course, there's a bit more you can do with filtering...
OK, the time has come to do a little data analysis and learn something from this data example set. Let's say, for sake of argument, that we're only interested in warm-temperature stream data. You can use filters to look for just the streams that had temperatures higher than 15C. Go back to your filter menu and try filtering by condition:
Or what if we only want stream samples taken since Jan 1, 2012?
Or we want any site that contains the words "Flat Creek"?
You can probably think up even better applications, but those are some basic ideas to get you started. If you have a custom filter that you want to refer back to in a hurry (say, streams with a pH lower than 7 that also have Dissolved Oxygen levels over 9ppm). Once you have applied those two filters, go to Data -> Filter Views -> Save as filter view:
This will create a 'bookmark' of your custom filter settings, for you to access easily in the future. You can name your custom filter view in the upper-left hand corner of the display:
...which will make it easier to find later from the Filter views menu:
There are two common mistakes that folks often make with filters.
- Not selecting your full range. Remember, when you enable filters, it's important that you highlight the entire data range you want to use. This is essential if you have messy data with blank rows. Non-highlghted data won't be included when you turn Filtering on.
- Forgetting filters are still applied. It's easy to do a bunch of filtering to a sheet and then walk away, forgetting that your filters are still applied. If you're suspicious that you might be looking at filtered data, look for the 'closed filter' icons in each column header. Also, look to see if the row and column labels (the letters and numbers) have a shaded green background. Remember that you can always go to the Data menu and turn filtering off, just to be sure.