antipaucity

fighting the lack of good ideas

stats values vs stats list in splunk

Splunk’s | stats functions are incredibly useful and powerful.

There are two, list and values that look identical…at first blush.

But they are subtly different. Here’s how they’re not the same.

values is an aggregating, uniquifying function.

list is an aggregating, not uniquifying function.

“Whahhuh?!” I hear you ask.

Here’s a prime example – say you’re aggregating on the field IP_addr all user values.

Your search might contain the following chunk: | stats values(user) as user by IP_addr. So for each unique IP address, you will collate a uniquified list of users. Maybe you have the following two IP addresses: 10.10.10.10 & 10.10.20.10. And you have the following user-IP address pairings: kingpin11 10.10.10.10, fergus97 10.10.20.10, gerfluggle 10.10.10.10, kingping11 10.10.10.10, jbobgorry 10.10.10.10.

values will aggregate all of the following users associated with IP addresses: 10.10.10.10 & gerfluggle, jbobgorry, kingping11; 10.10.20.10 & fergus97.

That’s nice – it’s pretty.

But it exports in lousy form if you need to further process the data in another tool (eg Microsoft Excel).

When Splunk exports those results in a CSV, instead of getting a nice, processable file, you get tabs separating what would otherwise be individual items that have all been grouped into one field.

Enter list.

list doesn’t uniquify the values given to it, so while you still only get one line per IP address (since that was our by clause in the snippet above), you get as many IP addresses listed as there are users (in this example).

This makes for an exportable, more processable set of results that a tool like Excel can ingest to perform further analysis with relatively little reformatting needed.

Come back tomorrow for how to get the export to work “out of the box”.