Category Archives: commentary

but, i got them on sale!

Back in August 2008, I had a one-week “quick start” professional services engagement in Nutley New Jersey. It was a supposed to be a super simple week: install HP Server Automation at BT Global.

Another ProServe engineer was onsite to setup HP Network Automation.

Life was gonna be easy-peasy – the only deliverable was to setup and verify a vanilla HPSA installation.

Except, like every Professional Services engagement in history, all was not as it seemed.

First monkey wrench: our primary technical contact / champion was an old-hat Sun Solaris fan (to the near-exclusion of any other OS for any purpose – he even wanted to run SunOS on his laptop).

Second monkey wrench: expanding on the first, out technical contact was super excited about the servers he’d gotten just the weekend before from Sun because they were “on sale”.

It’s time for a short background digression. Because technical intricacies matter.

HP Server Automation was written on Red Hat Linux. It worked great on RHEL. But, due to some [large] customer requests, it also supported running on Sun Solaris.

In 2007, Sun introduced a novel architecture dubbed, “Niagara”, or UltraSPARC T1, which they offered in their T1000 and T2000 series servers. Niagara did several clever things – it offered multiple threads running per core, with as many as 32 simultaneous processes running.

According to AnandTech, the UltraSPARC T1 was a “72 W, 1.2 GHz chip almost 3 times (in SpecWeb2005) as fast as four Xeon cores at 2.8 GHz”.

But there is always a tradeoff. The tradeoff Sun chose for the first CPU in the product line was to share a single FPU (floating point unit) between the integer cores and pipelines. For workloads that mostly involve static / simple data (ie, not much in the way of calculation), they were blazingly fast.

But sharing an FPU brings problems when you need to actually do floating-point math – as cryptographic algorithms and protocols all end up relying upon for gathering entropy for their random value generation processes. Why does this matter? Well, in the case of HPSA, not only is all interprocess, intraserver, and interserver communication secured with HTTPS certificates, but because large swaths are written in Java, each JVM needs to emulate its own FPU – so not only is the single FPU shared between all of the integer cores of the T1 CPU, it is further time-sliced and shared amongst every JRE instance.

At the time, the “standard” reboot time for a server running in an SA Core was generally benchmarked at ~15-20 minutes. That time encompassed all of the following:

  • stop all SA processes (in the proper order)
  • stop Oracle
  • restart the server
  • start Oracle
  • start all SA components (in the proper order)

As you’ll recall from my article on the Sun JRE 1.4.x from 6.5 years ago, there is a Java component (the Twist) that already takes a long time to start as it seeds its entropy pool.

So when it is sharing the single FPU not only between other JVMs, but between every other process which might end up needing it, the total start time is reduced dramatically.

How dramatically? Shutdown alone was taking upwards of 20 minutes. Startup was north of 35 minutes.

That’s right – instead of ~15-20 minutes for a full restart cycle, if you ran HPSA on a T1-powered server, you were looking at ~60+ minutes to restart.

Full restarts, while not incredibly common, are not all that unordinary, either.

At the time, it was not unusual to want to fully restart an HPSA Core 2-3 times per month. And during initial installation and configuration, restarts need to happen 4-5 times in addition to the number of times various components are restarted during installation as configuration files are updated, new processes and services are started, etc.

What should have been about a one-day setup, with 2-3 days of knowledge transfer – turned into nearly 3 days just to install and initially configure the software.

And why were we stuck on this “revolutionary” hardware? Because of what I noted earlier: our main technical contact was a die-hard Solaris fanboi who’d gotten these servers “on sale” (because their Sun rep “liked them”).

How big a “sale” did he get? Well, his sales rep told him they were getting these last-model-year boxes for 20% off list plus an additional 15% off! That sounds pretty good – depending on how you do the math, he was getting somewhere between 32% and 35% off the list price – for a little over $14,000 a piece (they’d bought two servers – one to run Oracle RDBMS (which Oracle themselves recommended not running on the T1 CPU family), and the other to run HPSA proper).

Except his sales rep lied. Flat-out lied. How do I know? Because I used Sun’s own server configurator site and was able to configure two identical servers for just a smidge over $15,000 each – with no discounts. That means they got 7% off list …

So not only were they running hardware barely discounted off list (and, interestingly, only slightly cheaper (less than $2000) than the next generation T2-powered servers which had a single FPU per core, not per CPU (which still had some performance issues, but at least weren’t dog-vomit slow), but they were running on Solaris – which had always been a second-class citizen when it came to HPSA performance: all things being roughly equal, x86 hardware running RHEL would always smack the pants off SPARC hardware running Solaris under Server Automation.

For kicks, I configured a pair of servers from Dell (because their online server configurator worked a lot better than any other I knew of, and because I wanted to demonstrate that just because SA was an HP product didn’t mean you had to run HP servers), and was able to massively out-spec two x86 servers for less than $14,000 a pop (more CPU cores, more RAM, more storage, etc) and present my findings as part of our write-up of the week.

Also for kicks, I demoed SA running in a 2-CPU, 4GB VM on my laptop rebooting faster than either T1000 server they had purchased could run.

Whats the moral of this story? There’s two (at least):

  1. Always always always find out from your vendor if they have a preferred or suggested architecture before namby-pamby buying hardware from your favorite sales rep, and
  2. Be ever ready and willing to kick your preconceived notions to the sidelines when presented with evidence that they are not merely ill thought out, but out and out, objectively wrong

These are fundamental tenets of automation:

“Too many people try to take new tools and make them fit their current processes, procedures, and policies – rather than seeing what policies, procedures, and processes are either made redundant by the new tools, or can be improved, shortened, or – wait for it – automated!”

You must always be reviewing and rethinking your preconceived notions, what policies you’re currently following, etc. As I heard recently, you need to reverse your benchmarks: don’t ask, “why are we doing X?”; ask, “what would happen if we didn’t do X?”

That was a question never asked by anyone prior to our arrival to implement what sales had sold them.

what is “plan b” for iot security?

Schneier has a recent article on security concerns for IoT (internet of things) devices – IoT Cybersecurity: What’s Plan B?

We can try to shop our ideals and demand more security, but companies don’t compete on IoT safety — and we security experts aren’t a large enough market force to make a difference.

We need a Plan B, although I’m not sure what that is. Comment if you have any ideas.

There are loads of great comments on the post.

Here’s the start of some of my thoughts:

There are a host of avenues which need to be gone down and addressed regarding device security in general, and IoT security in particular.

Any certification program could be good .. right up until the vendor goes out of business. Or ends the product line. Or ends formal support. Unless we go to a lease model for everything, you’re going to have unsupported/unsupportable devices out there.

We can’t have patches ad infinitum because it’s not practical: every vendor EOLs products (from OSes to firearms to DB servers to cars, etc).

A few things which would be good:

  • safe/secure by default from the vendor – you have to manually de-safe it to use it (like a rifle which only becomes usable/dangerous/operable when you load a cartridge and put the safety off)
  • well-known, highly-publicized support lifecycles (caveating the vendor going out of business)
  • related to the above, notifications from the device as it nears end of support
  • notifications from the device as well as the vendor that updates/patches are available
  • liability regulations – and an associated insurance structure – affecting businesses which choose to offer IoT devices across a few levels:
    1. here it is :: you deal with it || no support, no insurance, whatever risk is there is your problem
    2. patches / updates for 1 year || basic insurance / guarantee of operation through supported period, as long as you’re patched up to date
    3. patches / updates for 3 years ||
    4. patches / updates for 5 years || first-level business offering || insurance against hacks / flaws that have been disclosed for more than 90 days so long as you have patched
    5. patches / updates for 10 years || enterprise / long-term support || “big” insurance coverage (up to a year, so long as you’re yp-to-date) || proactive notifications from the vendor to customers regarding flaws, patches, etc

There are probably other things which need to be considered.

But there’s my start.

modularity is great – if you commoditize the right complements

Google bought Android and made great things with it.

They also had an interesting audacity to announce an “open, modular” phone that ‘anyone’ could design from, and make components that would play nicely together (like IBM did with their initial ISA architecture releases back in the 80s). (Microsoft then flipped the tables on IBM and non-exclusively licensed MS-DOS to them, which meant hardware manufacturers could build entire replacement “[IBM] PC compatible” machines … that ran Microsoft software. )

But this only works if you’re Google – an advertising company that wants more eyeballs on its ads.

If you’re a phone manufacturer, like Motorola, the absolute last thing you want is for “anyone” to be able to replace all of the modules in your phone – because you’re not selling the OS, you’re selling hardware. As Joel Spolsky wrote 15 years ago,

If you can run your software anywhere, that makes hardware more of a commodity. As hardware prices go down, the market expands, driving more demand for software (and leaving customers with extra money to spend on software which can now be more expensive.)

Sun’s enthusiasm for WORA is, um, strange, because Sun is a hardware company. Making hardware a commodity is the last thing they want to do.

Motorola is a hardware company. They may want add-ons to be available to their base phone, but the certainly don’t want you replacing everything – unless it’s from them.

Jean-Louis Gassée notes these issues in his latest article, “Lazy Thinking: Modularity Always Works”,

In order to succeed, “disruptive modularity” needs a stable architecture with well-defined and documented boundaries. Module innovators need to be able to slide their creations into place without playing havoc with the rest of the edifice. This is how it worked in the Wintel PC world…sort of. In PC reality, as many of us have experienced, the sliding in and out of modules wasn’t so neat and often landed us in Device Driver purgatory. In the mid-nineties, one Microsoft director told me that the Redmond company actually spent more engineering resources on drivers than on Windows’ core software. …
Most important, strongly-worded theories are less interesting than exploring their cracks, where they don’t seem to work. This is how physics keeps moving forward and this is also how our understanding of business should advance. In the case of Project Ara, the unexamined consensual acceptance of Disruption Theory led many to believe that Modularity Always Wins meant smartphones would (and should) follow the same path as PCs.

I hope JLG (and I, and Joel Spolsky, and basic economics) are wrong.

But I doubt it.

kvp is a lousy way to teach 

Recently on one of the podcasts I listen to, I heard an offhanded comment made about how history is taught not in patterns but as facts. For example, “On the 18th of April in ’75, hardly a man is now alive, who remembers that famous day and year”.

Rarely are the “whys” explained – understandably so at early ages, but not understandably as maturation happens.

“Teaching” in so many subjects has become memorization of what really amount to key-value pairs. Like, Columbus: 1492. Norman invasion: 1066. Etc.

Certainly, facts are important. And some things truly are best learned in a rote memorization form – for example, the multiplication table through 12, 15, or 25. But what about states and their capitals? Sure, they’re “pairs” – but are they more?

This is awesome if you’re a trivia nut. But if you’re not, or you truly want to learn the material – not merely pass a test or regurgitate facts – then you need to understand more than just the “facts”.

Outside history classes, it’s especially prevalent in math – very little (if any) time is taken to explain why the quadratic formula works (or even what it is), instead algebra students are expected to just learn and use it.

My late aunt, who did a lot of tutoring in her life, summed-up the problem with algebra (and other math subjects past elementary school) thusly: before algebra, we give a problem like “3 plus box is 9; what goes in the box?” but in algebra, we swap the box for a t or x or g, and we freak out. She would teach the facts, but [almost] never without the whys.

The whys are illustrated and analyzed very well in some books – like Why Nations Fail (review). But, sadly, they’re not given in more places.

We definitely need more good teachers who want their students to understand not merely enough to pass the class (or the test), but to cultivate the curiosity we’re all born with to become lifelong learners.

First step: stop “teaching” as key-value pairs.

apple tv – how apple can beat amazon and google

In e99 of Exponent, Ben Thompson makes a compelling case for his idea that Amazon Echo (Alexa) is an operating system – and that Amazon has beaten Apple (with Siri) and Google Home (with Assistant) at the very game they both try to play.

And I think he’s onto the start of something (he goes on to elaborate a bit in his note that Apple TV turned 10 this week (along with the little thing most people have never heard of, iPhone)).

But he’s only on the *start* of something. See, Apple TV is cheaper than Amazon Echo – by $30 for the entry model (it’s $20 more for the model with more storage). Echo Dot is cheaper, but also is less interesting (imo). And Alexa doesn’t have any local storage (that I know of).

And neither of them will stream video.

By Apple TV has something going for it – it *already* has Siri enabled. In other words, it has the home assistant features many people want, and does video and audio streaming to boot.

It handles live TV via apps like DIRECTV or Sling. And Netflix and other options for streaming (including, of course, iTunes).

Oh, and it handles AirPlay, so you can plop whatever’s on your iPhone, iMac, etc onto your TV (like a Chromecast).

But Apple doesn’t seem to focus on any of that. They have a device which, by all rights, ought to be at least equal (and probably superior to) with its competition – but they seem to think their competition is Roku or the Fire Stick. From a pricing perspective, those are the wrong folks to be considering your competition.

It’s Google and Amazon Apple should have in its sights – because Apple TV *ought* to beat the ever living pants of both Home and Echo.

If HomeKit exists on Apple TV, and you have Siri on Apple TV, why is it not the center of home automation?

vampires vs zombies

A few years ago I wrote about why I like good vampire and zombie stories.

I had an epiphany this week related to that, that I thought you’d all find interesting.

If vampires exist, zombies can not exist [long] in the same universe. Why? Because they’d be eliminating the only source of food for the vampires. And since vampires are, more or less, indestructible (at least to the wiles of marauding zombies), when they eliminated zombie outbreaks, they’d do it quickly and efficiently – and, most likely, quietly.

tesla’s solarcity bid isn’t about energy production

Ben Thompson* (temporary paywall) makes an excellent first-order analysis of Elon Musk's bid to acquimerge SolarCity with Tesla. But he, uncharacteristically, stops short of seeing the mid- and long-term reasons for the acquimerge.

It's about SpaceX.

It's about Mars.

It's about the Moon.

Musk knows that he needs an incredibly-solid pipeline of technology to get SpaceX past its initial "toy" phases of being a launch company to the ISS.

He wants to ensure that he's able to support the future on non-terrestrial bodies – lunar missions, Mars missions, long-term space exploration, high-altitude space stations, etc.

Sure, it happens to be good for Tesla (integrating solar tech at Tesla charging stations is a no-brainer). But that's not the end game.

The goal is space.

* Follow Ben on Twitter – @benthompson