antipaucity

fighting the lack of good ideas

finally starting to get some good docs amassed

I had a decent library of documentation, templates, hand-offs, slide decks, etc in my pre-Splunk consulting life (technically, I still have them).

It’s nice to be finally getting a decent collection to draw from for my customers in my post-automation consulting life.

a lot of travel

Over the past month, and through the end of March, I’ve done, and will be doing, a lot of travel for work.

Nothing I haven’t done before, but it’s been a long time since I’ve had to be onsite for more than a couple weeks at a time – most customers leap at the chance to do remote work.

Sadly, that has not been possible with one customer, and the other is highly reluctant to allow contractors to engage remotely until they’ve put in several weeks of face time.

Face-to-face interactions are certainly important (I even noted so 4.5 years ago), but conference calls, webexes, and the like can most assuredly replace much of that.

they asked the right question

Let me compare the experience I wrote about yesterday to another I had the same year with the first customer I was ever sent to – HSBC.

Just a couple weeks after starting with ProServe in 2008, I was sent to Chicago to do a final PoC for HSBC. Someone else had done a PoC the previous year, but with HP’s acquisition of Opsware, HSBC (along with many other customers and potential customers) held off on signing a purchase contract so they could bundle “everything” they wanted from HP under one big honking purchase order.

And due to changes in the underlying product architecture, HSBC wanted a fresh demo to play with for a little while before writing that line item into their PO.

Enter me. A freshly-minted consultant who hadn’t yet developed a solid cheat sheet. So fresh, I thought staying 20 minutes away in a Comfort Inn to save $12 a night was smart (it’s not – always stay as close to your customer as you can (that is within budget) when you’re traveling). But I digress.

After a set of unexpected flight delays, instead of being able to start Monday before lunch, I didn’t even get to meet the customer team until almost end-of-business Monday. Tuesday morning, my main contact met me at the door, escorted me into their lab, and introduced me to the “spare” hardware I’d be working on – a ~5-year-old Sun server running Solaris 10 (thankfully – they’d only just upgraded from Sun OS 9 on that machine a couple weeks before).

Like my main contact in Nutley later that year, my main contact at HSBC was an old hat Solaris admin – he’d been using and administering Sun equipment for nearly 20 years. Smart guy (but, unlike the guy in NJ that summer, he wasn’t a Sun fanboi purist).

The reason we were using retired (and, possibly, resurrected) hardware was that they didn’t trust one of the sales reps (who had since been fired) who made some pretty sweeping promises to them early on in the sales cycle. And whoever had been in several months prior to do the first PoC had apparently complained bitterly about “having to use Sun”.

So they partially set me up to fail – but I was too dumb to realize it at the time…a perfect instance of the old phrase, “you can’t fool me, I’m too ignorant”.

I did have to suffer through slow network access (the NIC onboard “supported” 100Mbps … but it was flaky, so it had been down-throttled to just 10Mbps). To put this in a little context, that was slower than my home internet access – even then – 10 years ago!

Wednesday about lunchtime, the HSBC project manager for “HP automation initiatives” introduced herself and through our conversation, casually asked, “if you had your druthers, what kind of hardware would you install SA on to support our environment?”

So I answered what I’d use: each server in each SA Core (they were going to have 3) should have 16+ x86-64 CPUs, at least 32 GB RAM, and ample storage (at least 100 GB just for the install, let alone extra space which might be needed for the software and OS libraries). Oh. And it should be running RHEL – don’t use Solaris as the host OS for HPSA.

She pressed me to find out why I suggested this, and I told her, “because SA is written on Linux, and then ported to Solaris; every major issue SA has run into in the last few years regarding OS conflicts has happened on Sun hardware & OSes.”

A little while later, she thanked me for our conversation, thanked me for getting SA up and running so quickly (even on half decade out of date hardware, I had it installed and ready to demo to them in only a little over 1.5 days), which gave me time to go through its functionality, show-off some new things in 7.0 that hadn’t been possible (or as easy) in 6.1 (or 6.5, or 6.6), and even be told I could head out to the airport a little early on Thursday! Win-win-win all around.

Fast forward a few months.

I get a phone call from the engagement manager I’d worked with on the HSBC PoC week, and he asked me if I had a current passport. I told him, “yes,” and asked him why he wanted to know.

He then informed me that HSBC was getting ready to finalize a $12+ million hardware, software, and services sale … but would only be buying SA if I was available to install it.

That’s cool – getting asked back is always a Good Thing™ … but what does that have to do with having a current passport? Bob elaborated: HSBC has a policy of vendors doing installs on site (not weird). And two of those “on site” locations were not in the US: one would be in London, England, and the other in Hong Kong. “Would I be able to do that?”, he wanted to know.

“Yes. Yes, I would.”

“OK,” he said, “I’ll send travel dates and details in a few days.”

I hung up, then wondered if I’d said “yes” maybe a little too quickly: who gets asked to be the installation engineer who’s holding-up the finalization of a multi-million-dollar sale? Especially when I knew there were folks at least as qualified, if not much more so, available?

This was my first experience with being asked-back as a consultant (I’d been asked-for when I worked in Support, but that was very different).

And, ultimately, it’s what led to the single best services engagement I had for quite a while. And to a [partially] company-paid vacation to the UK. And to getting my first stamps in my passport. And to establishing a friendship with a customer contact in London whom I’ve stayed in touch with ever since.

All from not knowing the “project manager” was actually high-enough up in the HSBC management chain that her recommendations/requests for external personnel would be honored even on big contracts – and being truly honest with her when she asked what I viewed as a casual, throwaway question in a loud computer lab on a cool Wednesday afternoon in April.

The upshot is to always treat everyone you meet as “just another person” – whether a CEO or a janitor, they put their pants on the same way you do: one leg at a time.

but, i got them on sale!

Back in August 2008, I had a one-week “quick start” professional services engagement in Nutley, New Jersey. It was supposed to be a super simple week: install HP Server Automation at BT Global.

Another ProServe engineer was onsite to set up HP Network Automation.

Life was gonna be easy-peasy – the only deliverable was to set up and verify a vanilla HPSA installation.

Except, like every Professional Services engagement in history, all was not as it seemed.

First monkey wrench: our primary technical contact / champion was an old-hat Sun Solaris fan (to the near-exclusion of any other OS for any purpose – he even wanted to run SunOS on his laptop).

Second monkey wrench: expanding on the first, our technical contact was super excited about the servers he’d gotten just the weekend before from Sun because they were “on sale”.

It’s time for a short background digression. Because technical intricacies matter.

HP Server Automation was written on Red Hat Linux. It worked great on RHEL. But, due to some [large] customer requests, it also supported running on Sun Solaris.

In 2005, Sun introduced a novel architecture dubbed “Niagara”, or UltraSPARC T1, which they offered in their T1000 and T2000 series servers. Niagara did several clever things – it offered multiple threads running per core, with as many as 32 simultaneous threads running.

According to AnandTech, the UltraSPARC T1 was a “72 W, 1.2 GHz chip almost 3 times (in SpecWeb2005) as fast as four Xeon cores at 2.8 GHz”.

But there is always a tradeoff. The tradeoff Sun chose for the first CPU in the product line was to share a single FPU (floating point unit) between all of the integer cores and pipelines. For workloads that mostly involve static / simple data (i.e., not much in the way of calculation), they were blazingly fast.

But sharing an FPU brings problems when you need to actually do floating-point math – as cryptographic algorithms and protocols all end up relying upon for gathering entropy for their random value generation processes. Why does this matter? Well, in the case of HPSA, not only is all interprocess, intraserver, and interserver communication secured with HTTPS certificates, but because large swaths are written in Java, each JVM needs to emulate its own FPU – so not only is the single FPU shared between all of the integer cores of the T1 CPU, it is further time-sliced and shared amongst every JRE instance.

At the time, the “standard” reboot time for a server running in an SA Core was generally benchmarked at ~15-20 minutes. That time encompassed all of the following:

  • stop all SA processes (in the proper order)
  • stop Oracle
  • restart the server
  • start Oracle
  • start all SA components (in the proper order)

As you’ll recall from my article on the Sun JRE 1.4.x from 6.5 years ago, there is a Java component (the Twist) that already takes a long time to start as it seeds its entropy pool.

So when it is sharing the single FPU not only with other JVMs, but with every other process which might end up needing it, the total start time increases dramatically.

How dramatically? Shutdown alone was taking upwards of 20 minutes. Startup was north of 35 minutes.

That’s right – instead of ~15-20 minutes for a full restart cycle, if you ran HPSA on a T1-powered server, you were looking at ~60+ minutes to restart.

Full restarts, while not incredibly common, are not all that unordinary, either.

At the time, it was not unusual to want to fully restart an HPSA Core 2-3 times per month. And during initial installation and configuration, restarts need to happen 4-5 times in addition to the number of times various components are restarted during installation as configuration files are updated, new processes and services are started, etc.
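As a rough back-of-the-envelope sketch, the figures above can be turned into the extra waiting time the T1 boxes added during installation alone. The numbers are the approximate ones quoted in this post (~15-20 minutes normally, ~60 minutes on the T1, 4-5 full restarts during install); the variable and function names are made up for illustration, and component-level restarts are not counted.

```python
# Approximate figures quoted in the post (minutes per full restart cycle).
BASELINE_RESTART_MIN = (15, 20)   # typical SA Core server at the time
T1_RESTART_MIN = 60               # "~60+ minutes" on a T1-powered server
INSTALL_RESTARTS = (4, 5)         # full restarts during install/config


def extra_minutes(restarts, slow, normal):
    """Extra wall-clock minutes spent waiting on restarts."""
    return restarts * (slow - normal)


# Best case: fewest restarts, compared against the slowest "normal" restart.
low = extra_minutes(INSTALL_RESTARTS[0], T1_RESTART_MIN, BASELINE_RESTART_MIN[1])
# Worst case: most restarts, compared against the fastest "normal" restart.
high = extra_minutes(INSTALL_RESTARTS[1], T1_RESTART_MIN, BASELINE_RESTART_MIN[0])

print(f"Extra restart time during install: {low}-{high} minutes")
```

That is only the full-restart overhead; every individual component restart during configuration was slowed down as well.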

What should have been about a one-day setup, with 2-3 days of knowledge transfer – turned into nearly 3 days just to install and initially configure the software.

And why were we stuck on this “revolutionary” hardware? Because of what I noted earlier: our main technical contact was a die-hard Solaris fanboi who’d gotten these servers “on sale” (because their Sun rep “liked them”).

How big a “sale” did he get? Well, his sales rep told him they were getting these last-model-year boxes for 20% off list plus an additional 15% off! That sounds pretty good – depending on how you do the math, he was getting somewhere between 32% and 35% off the list price – for a little over $14,000 apiece (they’d bought two servers – one to run Oracle RDBMS (which Oracle themselves recommended not running on the T1 CPU family), and the other to run HPSA proper).

Except his sales rep lied. Flat-out lied. How do I know? Because I used Sun’s own server configurator site and was able to configure two identical servers for just a smidge over $15,000 each – with no discounts. That means they got 7% off list … tops.
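For anyone who wants to check the discount math, here is a minimal sketch. It uses the rounded prices from this post ($14,000 paid, $15,000 list), so the result is approximate; the function names are mine, purely for illustration.

```python
def stacked_discount(first, second):
    """Total fraction off when the second discount is applied to the
    already-discounted price (20% off, then 15% off what remains)."""
    return 1 - (1 - first) * (1 - second)


def additive_discount(first, second):
    """Total fraction off when both discounts are taken off list price."""
    return first + second


def effective_discount(list_price, paid):
    """Fraction off list actually received."""
    return 1 - paid / list_price


promised_low = stacked_discount(0.20, 0.15)
promised_high = additive_discount(0.20, 0.15)
actual = effective_discount(15_000, 14_000)

print(f"Promised: {promised_low:.0%} to {promised_high:.0%} off list")
print(f"Actual:   {actual:.1%} off list")
```

With a real 32% discount, a $15,000 server would have cost about $10,200, not $14,000.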

So not only were they running hardware barely discounted off list (and, interestingly, only slightly cheaper (less than $2000) than the next generation T2-powered servers which had a single FPU per core, not per CPU (which still had some performance issues, but at least weren’t dog-vomit slow)), but they were running on Solaris – which had always been a second-class citizen when it came to HPSA performance: all things being roughly equal, x86 hardware running RHEL would always smack the pants off SPARC hardware running Solaris under Server Automation.

For kicks, I configured a pair of servers from Dell (because their online server configurator worked a lot better than any other I knew of, and because I wanted to demonstrate that just because SA was an HP product didn’t mean you had to run HP servers), and was able to massively out-spec two x86 servers for less than $14,000 a pop (more CPU cores, more RAM, more storage, etc) and present my findings as part of our write-up of the week.

Also for kicks, I demoed SA running in a 2-CPU, 4GB VM on my laptop rebooting faster than either T1000 server they had purchased could.

What’s the moral of this story? There are two (at least):

  1. Always always always find out from your vendor if they have a preferred or suggested architecture before namby-pamby buying hardware from your favorite sales rep, and
  2. Be ever ready and willing to kick your preconceived notions to the sidelines when presented with evidence that they are not merely ill thought out, but out and out, objectively wrong

These are fundamental tenets of automation:

“Too many people try to take new tools and make them fit their current processes, procedures, and policies – rather than seeing what policies, procedures, and processes are either made redundant by the new tools, or can be improved, shortened, or – wait for it – automated!”

You must always be reviewing and rethinking your preconceived notions, what policies you’re currently following, etc. As I heard recently, you need to reverse your benchmarks: don’t ask, “why are we doing X?”; ask, “what would happen if we didn’t do X?”

That was a question never asked by anyone prior to our arrival to implement what sales had sold them.

on entropy, password/passphrase complexity, and if you’ve been part of a data breach (spoiler alert: you have)

I wrote an article on passwords, passphrases, entropy, and data breaches for my employer’s blog: https://augustschell.com/passwords-passphrases-complexity-length-crackability-memorability-data-breaches

meetings

The author of a recent Medium post is so close to right, it’s scary. Gary says the best thing you can do is to cut your meeting length in half.

And that is a phenomenal step. One that needs to happen. But one that needs to happen in conjunction with an even more monumental shift.

Change the start time of meetings to something “weird”.

Don’t start on the hour or half hour. Don’t even start on the quarter hour.

Start at 10 past or 10 til, and go for 15, 30, or 45 minutes – with a hard cut off. Just like college classes. Oh – and just like class days when all you had was a test, as soon as your part of the meeting is over, leave. You may have to wait to leave until the end. But once your piece is done, just like when you finished your test, walk out and get on with your day.

i’m not technical

I am. But not really.

To paraphrase my prelicensing class instructor, “95% of consulting is not technical work – it’s psychological”. 5% of consulting is delivery. The remainder is listening, empathizing, training, selling, encouraging, improving, and a whole bunch more gerunds.

I’m an unlicensed psychiatrist dabbling in technology – just call me Frasier Malone – the single person every consultant has to be (even though on Cheers they were two people).

Part of the Art of Consulting™ is conveying ROI in the right terms to your current audience. My job as an automation consultant, project manager, and team lead is to convince customers (at all levels) that the tools I’m there to deliver, configure, and utilize are not “taking their jobs away” (in the wrong sense of the term). Ideally, my customers not only see me as their Trusted Advisor, but as someone who has “been there, done that” just like they have, and that I truly am there to help them: to help them save time (for engineers), to save headcount (for managers), and to save money (for executives).

Good consultants are, in many ways, like bartenders – they listen to the problems their customers have, and hand them things they hope will help. Like a good bartender, you need to deliver what has been agreed to. And like a good bartender, you need to know when to tell your customer “that’s not the best option – try this instead”. And like a good bartender, you need to know when to tell your customer “no”.