fighting the lack of good ideas

finally starting to get some good docs amassed

I had a decent library of documentation, templates, hand-offs, slide decks, etc in my pre-Splunk consulting life (technically, I still have them).

It’s nice to be finally getting a decent collection to draw from for my customers in my post-automation consulting life.

you can’t disaggregate

Had a customer recently ask how to disaggregate a Splunk search that had aggregated fields, because they export to CSV horribly.

Here’s the thing.

You can’t disaggregate aggregated fields.

And there’s a Good Reason™, too: aggregation, by definition, is a one-way street.

You can’t un-average something.

Average is an aggregation function.

So why would you think you could disaggregate any other Splunk aggregation operation (like values or list)?

You can’t.

And you shouldn’t be able to (as nice as the theoretical use case for it might be).

So what is a body to do when you have a use case for a clean-to-export report that looks as if it had been aggregated, but every field in each row cleanly plunks-out to a single comma-separated value?

Here’s what I did:

{parent search}
| join {some field that'll exist in the subsearch}
[ search {parent search}
 | stats {some stats functions here} ]
| fields - {whatever you don't want}
| sort - {fieldname}

What does that end up doing?

The subsearch is identical to the outer search, plus whatever filtering/where/|stats you might want/need to do.

Using the resultant, filtered set, join on a field you know will be unique [enough].

Then sort however you’d like, and remove whatever fields you don’t want in the final display.

Of course, be sure your subsearch will complete in under 60 seconds and/or return fewer than 10,000 lines (unless you’ve modified your Splunk limits.conf).
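If it helps to see the shape of that join outside of Splunk, here is a minimal Python sketch of the same idea. The sample events, the field names (IP_addr, user, bytes), and the helper names are all made up for illustration; it mimics what the search does, and is not how Splunk implements join.

```python
import csv
import io
from collections import defaultdict

# Hypothetical raw events standing in for {parent search}
events = [
    {"IP_addr": "10.0.0.1", "user": "alice", "bytes": 100},
    {"IP_addr": "10.0.0.1", "user": "bob", "bytes": 300},
    {"IP_addr": "10.0.0.2", "user": "carol", "bytes": 50},
]


def stats_by(rows, key):
    """Subsearch stand-in: one aggregated row per value of `key`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row["bytes"])
    return {
        k: {"event_count": len(v), "avg_bytes": sum(v) / len(v)}
        for k, v in groups.items()
    }


def join_stats(rows, key):
    """Attach each group's aggregates to every raw row in that group."""
    stats = stats_by(rows, key)
    return [{**row, **stats[row[key]]} for row in rows if row[key] in stats]


rows = join_stats(events, "IP_addr")
# Equivalent of: | fields - bytes
rows = [{k: v for k, v in row.items() if k != "bytes"} for row in rows]
# Equivalent of: | sort - avg_bytes
rows.sort(key=lambda row: row["avg_bytes"], reverse=True)

# Every cell holds a single value, so the CSV needs no cleanup
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

The point of the design is that nothing gets "un-aggregated": the raw rows are kept, and the aggregate is simply repeated on each of them.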

stats values vs stats list in splunk

Splunk’s | stats functions are incredibly useful and powerful.

There are two, list and values, that look identical…at first blush.

But they are subtly different. Here’s how they’re not the same.

values is an aggregating, uniquifying function.

list is an aggregating, not uniquifying function.

“Whahhuh?!” I hear you ask.

Here’s a prime example – say you’re aggregating all user values by the field IP_addr.

Your search might contain the following chunk: | stats values(user) as user by IP_addr. So for each unique IP address, you will collate a uniquified list of users.

Maybe you have two IP addresses, with the following users appearing across your user-IP address pairings: kingping11, fergus97, gerfluggle, kingping11, jbobgorry.

values will aggregate the users associated with each IP address: gerfluggle, jbobgorry, and kingping11 for one; fergus97 for the other.

That’s nice – it’s pretty.

But it exports in lousy form if you need to further process the data in another tool (eg Microsoft Excel).

When Splunk exports those results in a CSV, instead of getting a nice, processable file, you get tabs separating what would otherwise be individual items that have all been grouped into one field.

Enter list.

list doesn’t uniquify the values given to it, so while you still only get one line per IP address (since that was our by clause in the snippet above), you get as many user entries listed as there are user-IP address pairings (in this example), duplicates included.

This makes for an exportable, more processable set of results that a tool like Excel can ingest to perform further analysis with relatively little reformatting needed.
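To make the difference concrete, here is a small Python sketch that imitates the two functions. The IP addresses and the exact user-IP pairings are assumptions made up for illustration, as are the function names; the behavior being imitated is that values returns distinct values in lexicographical order, while list keeps every value in event order.

```python
from collections import defaultdict

# Hypothetical (IP_addr, user) pairings; the addresses are made up
events = [
    ("10.0.0.1", "kingping11"),
    ("10.0.0.2", "fergus97"),
    ("10.0.0.1", "gerfluggle"),
    ("10.0.0.1", "kingping11"),
    ("10.0.0.1", "jbobgorry"),
]


def stats_list(pairs):
    """Imitates list(user) by IP_addr: every value, in event order."""
    out = defaultdict(list)
    for ip, user in pairs:
        out[ip].append(user)
    return dict(out)


def stats_values(pairs):
    """Imitates values(user) by IP_addr: distinct values, sorted."""
    return {ip: sorted(set(users)) for ip, users in stats_list(pairs).items()}


for ip, users in stats_values(events).items():
    print("values", ip, users)
for ip, users in stats_list(events).items():
    print("list  ", ip, users)
```

In this sketch the repeated kingping11 shows up once under values and twice under list, which is the whole difference.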

Come back tomorrow for how to get the export to work “out of the box”.

a fairly comprehensive squid configuration for proxying all the http things

After combing through the docs and several howtos on deploying the Squid proxy server – none of which really did everything I wanted, of course – I’ve finally gotten to the format below.

Installing Squid is easy-peasy – it’s in the standard package repos for the major platforms (CentOS/Fedora/RHEL, Ubuntu/Debian, etc) – so just run yum install squid or apt install squid on your platform of choice (my exact install command on Ubuntu 18.04 was apt -y install squid net-tools apache2-utils).

What I wanted was an “open” (password-protected) proxy server with disk-based caching enabled that would cover all of the ports I could reasonably expect to run into.

Why “open”? Because I want to be able to turn it on and off on various mobile devices which may (or may not) have stable-ish public IPs.

Here is the config as I have it deployed, minus sensitive/site-specific items (usernames, passwords, port, etc), of course:

A working /etc/squid/squid.conf

acl SSL_ports port 443
acl SSL_ports port 8443
acl Safe_ports port 80		# http
acl Safe_ports port 21		# ftp
acl Safe_ports port 443		# https
acl Safe_ports port 1025-65535	# unregistered ports
acl Safe_ports port 280		# http-mgmt
acl Safe_ports port 488		# gss-http
acl Safe_ports port 777		# multiling http
acl Safe_ports port 8080

auth_param basic program /usr/lib/squid/basic_ncsa_auth /etc/squid/.htpasswd
auth_param basic children 15
# after "realm", put some descriptive, clever, or otherwise-identifying string that will appear when you login
auth_param basic realm Insert Incredibly Witty Title Here
auth_param basic credentialsttl 5 hours
acl password proxy_auth REQUIRED
http_access allow password

# Deny requests to certain unsafe ports
http_access deny !Safe_ports

# Deny CONNECT to other than secure SSL ports
http_access deny CONNECT !SSL_ports

# Only allow cachemgr access from localhost
http_access allow localhost manager
http_access deny manager

#http_access allow localnet
http_access allow localhost

# And finally deny all other access to this proxy
# commented-out to allow "open" use (ie password authenticated)
#http_access deny all

# Squid normally listens to port 3128
# change this line if you want it to listen on something other port
# http_port 

# Uncomment and adjust the following to add a disk cache directory.
#cache_dir ufs /var/spool/squid 100 16 256
# format is: cache_dir <type> <directory> <size in MB> <L1 dirs> <L2 dirs>
cache_dir ufs /tmp/squid-cache 768 16 256

# Leave coredumps in the first cache dir
coredump_dir /var/spool/squid

# Add any of your own refresh_pattern entries above these.
refresh_pattern ^ftp:		1440	20%	10080
refresh_pattern ^gopher:	1440	0%	1440
refresh_pattern -i (/cgi-bin/|\?) 0	0%	0
refresh_pattern (Release|Packages(.gz)*)$      0       20%     2880
refresh_pattern .		0	20%	4320

via off
forwarded_for off

request_header_access Allow allow all 
request_header_access Authorization allow all 
request_header_access WWW-Authenticate allow all 
request_header_access Proxy-Authorization allow all 
request_header_access Proxy-Authenticate allow all 
request_header_access Cache-Control allow all 
request_header_access Content-Encoding allow all 
request_header_access Content-Length allow all 
request_header_access Content-Type allow all 
request_header_access Date allow all 
request_header_access Expires allow all 
request_header_access Host allow all 
request_header_access If-Modified-Since allow all 
request_header_access Last-Modified allow all 
request_header_access Location allow all 
request_header_access Pragma allow all 
request_header_access Accept allow all 
request_header_access Accept-Charset allow all 
request_header_access Accept-Encoding allow all 
request_header_access Accept-Language allow all 
request_header_access Content-Language allow all 
request_header_access Mime-Version allow all 
request_header_access Retry-After allow all 
request_header_access Title allow all 
request_header_access Connection allow all 
request_header_access Proxy-Connection allow all 
request_header_access User-Agent allow all 
request_header_access Cookie allow all 
request_header_access All deny all

Finalize your Squid server system settings

Things you need to do once you do the above (prepend sudo to each command below if you’re not logged-in as root):

  1. Enable Squid to start at boot: systemctl enable squid
  2. Create the cache directories: squid -z
  3. Create a DNS entry for your proxy host (if you want it usable outside your home network, and don’t want to reference it by IP address only)
  4. Create the authentication file (located at /etc/squid/.htpasswd in this example): touch /etc/squid/.htpasswd
  5. Create a username and password: htpasswd -c /etc/squid/.htpasswd {username} (you will be prompted for the password – don’t forget this username/password combination!)
  6. Start Squid: systemctl start squid

Configure your browser to use your new proxy

Here’s where you need to go and what you need to change in Firefox:

  1. Navigate to about:preferences
  2. Click on Settings… under Network Proxy
  3. Enter your proxy host details:

To verify your proxy settings are correct, visit a site that reports your public IP address with the proxy off, and then again with it on.

If your reported IP address changes between visits (with the second check being your Squid server IP) – congratulations! You have successfully deployed a Squid proxy caching server.
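A scripted version of that before-and-after check might look like the sketch below. The proxy host, port, credentials, and the IP-echo URL are all placeholders I have assumed for illustration (substitute your own); the only real APIs used are the standard urllib ones that ship with Python.

```python
import urllib.request
from urllib.parse import quote


def proxy_url(host, port, user, password):
    """Build an http://user:pass@host:port proxy URL, escaping credentials."""
    return f"http://{quote(user, safe='')}:{quote(password, safe='')}@{host}:{port}"


def build_opener(proxy=None):
    """Return an opener that uses `proxy`, or no proxy at all if None."""
    # An empty mapping tells ProxyHandler to ignore any proxy environment variables
    mapping = {"http": proxy, "https": proxy} if proxy else {}
    return urllib.request.build_opener(urllib.request.ProxyHandler(mapping))


def reported_ip(opener, check_url):
    """Fetch a plain-text 'what is my IP' page and return its body."""
    with opener.open(check_url, timeout=10) as response:
        return response.read().decode().strip()


# Example usage (hypothetical host, credentials, and check URL):
# check_url = "https://ip-echo.example.com/"
# direct = reported_ip(build_opener(), check_url)
# proxied = reported_ip(
#     build_opener(proxy_url("proxy.example.com", 3128, "me", "secret")),
#     check_url,
# )
# print("proxy is working" if direct != proxied else "same IP both times")
```

If the two addresses differ, and the second one is your Squid server, the proxy and its password authentication are both doing their jobs.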

ben thompson missed *a lot* in his microsoft-github article

Ben Thompson is generally spot-on in his analysis of industry goings-on. But he missed a lot in The Cost of Developers this week.

Here’s what he got right about this acquisition:

  • Developers can be quite expensive (though, $7.5B (in equity) is only ~$265 per user (which is pretty cheap))
  • Microsoft is betting that a future of open-source, cloud-based applications that exist independent of platforms will be a large-and-increasing share of the future
  • That there is room in that future for a company to win by offering a superior user experience for developers directly, not simply exerting leverage on them
  • Microsoft is the best possible acquirer for GitHub
  • GitHub, having raised $350 million in venture capital, was not going to make it as an independent entity
  • Purely enterprise-focused companies like IBM or Oracle would be tempted to wring every possible bit of profit out of the company
  • What Microsoft wants is much fuzzier: it wants to be developers’ friend
  • [Microsoft] will be ever more invested in a world with no gatekeepers, where developer tools and clouds win by being better on the merits, not by being able to leverage users

And here’s what he missed and/or got wrong:

  • [Microsoft] is in second place in the cloud. Moreover, that second place is largely predicated on shepherding existing corporate customers to cloud computing; it is not clear why any new company — or developer — would choose Microsoft
  • It is very hard to imagine GitHub ever generating the sort of revenue that justifies this purchase price

Some of the below I commented on Google+ yesterday. The rest is in response to more idiocy & paranoia I’ve seen on some technical community mailing lists (bet you didn’t know those still existed) in the last 24 hours, or in response to specific items in Ben’s essay that are shortsighted, misguided, or incredibly wrong.

  • If you cannot see why new users, developers, and companies would go to Microsoft Azure offerings, you don’t understand what they’re doing
    • AWS is huge – but Azure and Google Cloud Platform (GCP) have huge technical (and economic) advantages
    • Amazon likes to throw new cloud features at the wall like spaghetti to see what sticks; Google and Microsoft have clearly thought-through this whole cloud business, and make incredibly solid business & technical sense to use over AWS in most use cases (the only [occasional] real exception being “but we already use AWS”). Have you not seen the Azure IoT offerings?
  • GitHub has not yet been profitable, and would probably have IPO’d (poorly) in the next year to keep from running out of cash
    • Arguably, GitHub would never become profitable on their own
  • Microsoft has a long history of contributing to OSS projects (most-to-all of which are on GitHub)
    • If they were going to acquire anyone in this space, GitHub is the only one that makes any sense
  • (This was tangentially-mentioned in Ben’s essay by linking to his analysis of the Microsoft-LinkedIn acquisition in 2016.) Alongside the LinkedIn acquisition a couple years back (which has an obvious play for an eventual IDaaS (fully-and-forever integration with Office365 regardless of where you work, everything follows automagically)), offering better integrations with their existing tools (Visual Studio already had git integrations – they should only get better with this acquisition) is a Good Thing™ for devs and end user alike (because making those excellent developer tools even better means they’ll be better whether they’re using GitHub, Bitbucket, GitLab, etc)
  • The more-or-less instantaneous expansion of offered items in the Windows Store (some kind of cloud-based/distributed build-on-demand for software when you want it (and which fork you want)) to “everything” on GitHub is a brilliant possibility
    • In light of Apple’s announcement yesterday about enabling iOS apps to come to macOS over the next releases of iOS and macOS, this should have been at the forefront of most people’s thought processes (after the keynote was done, of course)
    • Through this acquisition, it’s [probably] likely more developers will use Microsoft APIs (.NET, etc) in their projects
  • Echoing Ballmer’s chant, “Developers! Developers! Developers!”, while Microsoft doesn’t really care about Windows anymore (just look at the recent reorg), it is still THE most widespread end-user platform in the world – and bringing millions more developers “into the fold” is genius
    • Even if some small percentage will opt to go elsewhere, most won’t change because, well, change is hard
    • All the developers Microsoft had that weren’t yet using GitHub will have a huge reason to start
  • Microsoft has typically been a buy-don’t-build shop (there are exceptions, but look at the original DOS, PowerPoint, SQL Server, Skype, their failed attempt at Yahoo!, etc): they could have spent 5-10x as much building something “as good as” GitHub, or they could buy it; they opted for the “buy” (via equity, note, and not cash (smart from several business viewpoints (not least of which is the “enforced” interest the GitHub subsidiary (with its new CEO, etc) will have in continuing to ensure it is The place for developers to put their projects (after all, if that drops considerably, the equity aspect GitHub got in the deal is going to drop))))

don’t use symlinks unless you *know* you can

I first ran into this on Solaris in the context of [then] Opsware SAS (then HP SA, now owned by Microfocus). Bind mounts might be OK … so unless the tarball has symlinks included, don’t use them – they get traversed differently than “real” directories.

In short, when directory traversals are done, some processes look at the permissions bits, and if the first character is not a d (for a symlink, it’s always an l), they can fail.

Symlinking files is [possibly] a different story: though permissions are usually wonky on symlinks (most often lrwxrwxrwx vs -rw-r--r--, for example), since you cannot traverse into a file (whereas you can into a directory), it’s generally ok.
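You can see that first-character difference for yourself with a few lines of Python. This is only a sketch on a POSIX system, using throwaway temporary paths and a helper name (describe) I made up; it shows what a tool sees depending on whether it inspects the link itself or follows it.

```python
import os
import stat
import tempfile


def describe(path):
    """Return (first mode character, is-dir without following, is-dir when following)."""
    mode = os.lstat(path).st_mode  # lstat inspects the link itself
    return stat.filemode(mode)[0], stat.S_ISDIR(mode), os.path.isdir(path)


with tempfile.TemporaryDirectory() as tmp:
    real_dir = os.path.join(tmp, "real_dir")
    link_dir = os.path.join(tmp, "link_dir")
    os.mkdir(real_dir)
    os.symlink(real_dir, link_dir)

    real_info = describe(real_dir)
    link_info = describe(link_dir)

print("real directory:", real_info)
print("symlinked directory:", link_info)
```

A process that checks the mode without following the link sees an l and "not a directory", while one that follows the link sees a directory; which of the two a given tool does is exactly what you need to know before trusting it with symlinks.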

Also – sometimes when directory listings are pulled, the symlink is fully-dereferenced. Something that appears to be in, say, $SPLUNK_HOME/etc/deployment-apps but is really in, say, /some/other/place may then not get deployed, because Splunk decides it’s not where it “belongs”.

Also – checksums can be computed on the symlink and not the actual file, in some (perhaps all) instances: so if, for example, you have the same outputs.conf in several apps by way of symlink, and you change it in one, the checksums for all the others may not (and typically do not) get updated … so you can be left in an inconsistent state for your configs (because not all locations that should’ve received the updated outputs.conf have received it, since they’re symlinks and not a real file, and the checksum may not update on those particular apps).
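Here is a rough Python sketch of how both of those problems can arise. To be clear, this is not Splunk's actual checksum code; the two checksum helpers, the file names, and the sample outputs.conf contents are assumptions for illustration. It just shows that a checksum taken over the link itself does not move when the real file changes, and that a dereferenced path lands somewhere other than where the link appears to live.

```python
import hashlib
import os
import tempfile


def content_checksum(path):
    """Hash the file contents; open() follows the symlink."""
    with open(path, "rb") as handle:
        return hashlib.sha256(handle.read()).hexdigest()


def link_checksum(path):
    """Hash the link itself, i.e. the target path it stores."""
    return hashlib.sha256(os.readlink(path).encode()).hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    real = os.path.join(tmp, "outputs.conf")
    link = os.path.join(tmp, "app_b_outputs.conf")  # hypothetical second app

    with open(real, "w") as handle:
        handle.write("[tcpout]\nserver = old.example.com:9997\n")
    os.symlink(real, link)
    before = (content_checksum(link), link_checksum(link))

    # Change the real file; the symlink itself is untouched
    with open(real, "w") as handle:
        handle.write("[tcpout]\nserver = new.example.com:9997\n")
    after = (content_checksum(link), link_checksum(link))

    # Dereferencing the link points back at the real file's location
    resolved = os.path.realpath(link)
    real_resolved = os.path.realpath(real)

print("content checksum changed:", before[0] != after[0])
print("link checksum changed:", before[1] != after[1])
print("link resolves to real file:", resolved == real_resolved)
```

Any tool that behaves like link_checksum here will happily report "nothing changed" for every app that only holds a symlink.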

Moral of the story?

Unless you really know what you’re doing, never use symlinks with Splunk.

4 places to check your website’s ssl/tls security settings

Qualys –

High-Tech Bridge –

Comodo –

SSL Checker –