Category Archives: technical

but, i got them on sale!

Back in August 2008, I had a one-week “quick start” professional services engagement in Nutley, New Jersey. It was supposed to be a super simple week: install HP Server Automation at BT Global.

Another ProServe engineer was onsite to set up HP Network Automation.

Life was gonna be easy-peasy – the only deliverable was to set up and verify a vanilla HPSA installation.

Except, like every Professional Services engagement in history, all was not as it seemed.

First monkey wrench: our primary technical contact / champion was an old-hat Sun Solaris fan (to the near-exclusion of any other OS for any purpose – he even wanted to run SunOS on his laptop).

Second monkey wrench: expanding on the first, our technical contact was super excited about the servers he’d gotten just the weekend before from Sun because they were “on sale”.

It’s time for a short background digression. Because technical intricacies matter.

HP Server Automation was written on Red Hat Linux. It worked great on RHEL. But, due to some [large] customer requests, it also supported running on Sun Solaris.

In 2005, Sun introduced a novel architecture dubbed “Niagara”, or the UltraSPARC T1, which they offered in their T1000 and T2000 series servers. Niagara did several clever things – it ran multiple hardware threads per core (four on each of its eight cores), for as many as 32 simultaneous threads.

According to AnandTech, the UltraSPARC T1 was a “72 W, 1.2 GHz chip almost 3 times (in SpecWeb2005) as fast as four Xeon cores at 2.8 GHz”.

But there is always a tradeoff. The tradeoff Sun chose for the first CPU in the product line was to share a single FPU (floating point unit) between the integer cores and pipelines. For workloads that mostly involve static / simple data (ie, not much in the way of calculation), they were blazingly fast.

But sharing an FPU brings problems when you actually need to do floating-point math – which cryptographic algorithms and protocols all end up relying upon when gathering entropy for their random value generation. Why does this matter? Well, in the case of HPSA, not only is all interprocess, intraserver, and interserver communication secured with HTTPS certificates, but large swaths of the product are written in Java, and each JVM needs to emulate its own FPU – so the single FPU is not only shared between all of the integer cores of the T1 CPU, it is further time-sliced and shared amongst every JRE instance.

At the time, the “standard” reboot time for a server running in an SA Core was generally benchmarked at ~15-20 minutes. That time encompassed all of the following:

  • stop all SA processes (in the proper order)
  • stop Oracle
  • restart the server
  • start Oracle
  • start all SA components (in the proper order)

As you’ll recall from my article on the Sun JRE 1.4.x from 6.5 years ago, there is a Java component (the Twist) that already takes a long time to start as it seeds its entropy pool.

So when it is sharing the single FPU not only with other JVMs, but with every other process which might end up needing it, the total start time increases dramatically.

How dramatically? Shutdown alone was taking upwards of 20 minutes. Startup was north of 35 minutes.

That’s right – instead of ~15-20 minutes for a full restart cycle, if you ran HPSA on a T1-powered server, you were looking at ~60+ minutes to restart.

Full restarts, while not incredibly common, are not all that out of the ordinary, either.

At the time, it was not unusual to want to fully restart an HPSA Core 2-3 times per month. And during initial installation and configuration, full restarts need to happen 4-5 times, in addition to the various component restarts that happen as configuration files are updated, new processes and services are started, etc.

What should have been about a one-day setup, with 2-3 days of knowledge transfer, turned into nearly 3 days just to install and initially configure the software.

And why were we stuck on this “revolutionary” hardware? Because of what I noted earlier: our main technical contact was a die-hard Solaris fanboi who’d gotten these servers “on sale” (because their Sun rep “liked them”).

How big a “sale” did he get? Well, his sales rep told him they were getting these last-model-year boxes for 20% off list plus an additional 15% off! That sounds pretty good – depending on how you do the math, he was getting somewhere between 32% and 35% off the list price – for a little over $14,000 a piece (they’d bought two servers – one to run Oracle RDBMS (which Oracle themselves recommended not running on the T1 CPU family), and the other to run HPSA proper).

Except his sales rep lied. Flat-out lied. How do I know? Because I used Sun’s own server configurator site and was able to configure two identical servers for just a smidge over $15,000 each – with no discounts. That means they got 7% off list … tops.

So not only were they running hardware barely discounted off list (and, interestingly, only slightly cheaper – less than $2000 – than the next-generation T2-powered servers, which had a single FPU per core rather than per CPU, and which still had some performance issues, but at least weren’t dog-vomit slow), but they were running on Solaris – which had always been a second-class citizen when it came to HPSA performance: all things being roughly equal, x86 hardware running RHEL would always smack the pants off SPARC hardware running Solaris under Server Automation.

For kicks, I configured a pair of servers from Dell (because their online server configurator worked a lot better than any other I knew of, and because I wanted to demonstrate that just because SA was an HP product didn’t mean you had to run HP servers), and was able to massively out-spec two x86 servers for less than $14,000 a pop (more CPU cores, more RAM, more storage, etc) and present my findings as part of our write-up of the week.

Also for kicks, I demoed SA running in a 2-CPU, 4GB VM on my laptop, rebooting faster than either T1000 server they had purchased could.

What’s the moral of this story? There are two (at least):

  1. Always always always find out from your vendor if they have a preferred or suggested architecture before namby-pamby buying hardware from your favorite sales rep, and
  2. Be ever ready and willing to kick your preconceived notions to the sidelines when presented with evidence that they are not merely ill thought out, but out-and-out, objectively wrong.

These are fundamental tenets of automation:

“Too many people try to take new tools and make them fit their current processes, procedures, and policies – rather than seeing what policies, procedures, and processes are either made redundant by the new tools, or can be improved, shortened, or – wait for it – automated!”

You must always be reviewing and rethinking your preconceived notions, what policies you’re currently following, etc. As I heard recently, you need to reverse your benchmarks: don’t ask, “why are we doing X?”; ask, “what would happen if we didn’t do X?”

That was a question never asked by anyone prior to our arrival to implement what sales had sold them.

what is “plan b” for iot security?

Schneier has a recent article on security concerns for IoT (internet of things) devices – IoT Cybersecurity: What’s Plan B?

We can try to shop our ideals and demand more security, but companies don’t compete on IoT safety — and we security experts aren’t a large enough market force to make a difference.

We need a Plan B, although I’m not sure what that is. Comment if you have any ideas.

There are loads of great comments on the post.

Here’s the start of some of my thoughts:

There are a host of avenues which need to be explored and addressed regarding device security in general, and IoT security in particular.

Any certification program could be good … right up until the vendor goes out of business. Or ends the product line. Or ends formal support. Unless we go to a lease model for everything, you’re going to have unsupported/unsupportable devices out there.

We can’t have patches ad infinitum because it’s not practical: every vendor EOLs products (from OSes to firearms to DB servers to cars, etc).

A few things which would be good:

  • safe/secure by default from the vendor – you have to manually de-safe it to use it (like a rifle which only becomes usable/dangerous/operable when you load a cartridge and take the safety off)
  • well-known, highly-publicized support lifecycles (caveating the vendor going out of business)
  • related to the above, notifications from the device as it nears end of support
  • notifications from the device as well as the vendor that updates/patches are available
  • liability regulations – and an associated insurance structure – affecting businesses which choose to offer IoT devices across a few levels:
    1. here it is :: you deal with it || no support, no insurance, whatever risk is there is your problem
    2. patches / updates for 1 year || basic insurance / guarantee of operation through supported period, as long as you’re patched up to date
    3. patches / updates for 3 years ||
    4. patches / updates for 5 years || first-level business offering || insurance against hacks / flaws that have been disclosed for more than 90 days so long as you have patched
    5. patches / updates for 10 years || enterprise / long-term support || “big” insurance coverage (up to a year, so long as you’re up-to-date) || proactive notifications from the vendor to customers regarding flaws, patches, etc

There are probably other things which need to be considered.

But there’s my start.

fallocate vs dd for swap file creation

I recently ran across this helpful Digital Ocean community answer about creating a swap file at droplet creation time.

So I decided to test how long my old method (using dd) takes to run vs using fallocate.

Here’s how long it takes to run fallocate on a fresh 40GB droplet:

root@ubuntu:/# rm swapfile && time fallocate -l 1G /swapfile
real	0m0.003s
user	0m0.000s
sys	0m0.000s

root@ubuntu:/# rm swapfile && time fallocate -l 2G /swapfile
real	0m0.004s
user	0m0.000s
sys	0m0.000s

root@ubuntu:/# rm swapfile && time fallocate -l 4G /swapfile
real	0m0.006s
user	0m0.000s
sys	0m0.004s

root@ubuntu:/# rm swapfile && time fallocate -l 8G /swapfile
real	0m0.007s
user	0m0.000s
sys	0m0.004s

root@ubuntu:/# rm swapfile && time fallocate -l 16G /swapfile
real	0m0.012s
user	0m0.000s
sys	0m0.008s

root@ubuntu:/# rm swapfile && time fallocate -l 32G /swapfile
real	0m0.029s
user	0m0.000s
sys	0m0.020s

Interestingly, the relationship of size to time is non-linear when running fallocate.

Compare to building a 4GB swap file with dd (on the same server, it turned out using either a 16KB or 4KB bs gives the fastest run time):

time dd if=/dev/zero of=/swapfile bs=16384 count=262144 

262144+0 records in
262144+0 records out
4294967296 bytes (4.3 GB, 4.0 GiB) copied, 4.52602 s, 949 MB/s

real	0m4.528s
user	0m0.048s
sys	0m4.072s

Yes, you read that correctly – using dd with an “optimum” bs of 16KB (after testing many different bs settings) takes ~1000x as long as using fallocate to create the same “size” file!

How is fallocate so much faster? The details are in the man pages for it (emphasis added):

fallocate is used to manipulate the allocated disk space for a file, either to deallocate or preallocate it. For filesystems which support the fallocate system call, preallocation is done quickly by allocating blocks and marking them as uninitialized, requiring no IO to the data blocks. This is much faster than creating a file by filling it with zeroes.

dd will “always” work. fallocate will work almost all of the time … but if you happen to be using a filesystem which doesn’t support it, you need to know how to use dd.

But: if your filesystem supports fallocate (and it probably does), it is orders of magnitude more efficient to use it for file creation.
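For completeness, here is a minimal sketch of that “try fallocate, fall back to writing zeroes” pattern in Python (which exposes the same system call via os.posix_fallocate); the path and size below are just examples, and you would still need to run mkswap and swapon on the result to actually use it as swap:

import os

def preallocate(path, size_bytes, block=16384):
    """Preallocate a file with fallocate if the filesystem supports it,
    otherwise fall back to writing zeroes (the dd-style approach)."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        try:
            # fast path: the fallocate(2) system call (no IO to the data blocks)
            os.posix_fallocate(fd, 0, size_bytes)
        except OSError:
            # slow path: fill the file with zeroes, just like dd if=/dev/zero
            zeroes = b"\0" * block
            written = 0
            while written < size_bytes:
                written += os.write(fd, zeroes[: size_bytes - written])
    finally:
        os.close(fd)

# example: a 4 GiB file, the same size as the dd test above
preallocate("/swapfile", 4 * 1024**3)

Either path leaves you with the same file; the only difference is how long you wait for it.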

wonder how many zombie film/tv/game creators are/were computer science nerds

As you all know, I am a huge zombie fan.

And, as you probably know, I was a CIS/CS major/minor at Elon.

A concept I was introduced to at both Shodor and Elon was the ant colony simulation.

And I realized today that many people have been introduced to the basic concepts of ant colony simulations through films like Night of the Living Dead or World War Z and shows like Z Nation or The Walking Dead.

In short, ant colony optimization simulations, a part of swarm intelligence, use the “basic rules” of ant intelligence to game-out problems of traffic patterns, crowd control, logistics planning, and all kinds of other things.

Those basic rules the ants follow more or less come down to the following:

  • pick a random direction to wander
  • continue walking straight until you hit something
    • if you hit a wall
      • turn a random number of degrees between 1 and 359
      • loop up one level
      • if you hit food
        • if you are carrying trash
          • turn a random number of degrees between 1 and 179 or 181 and 359
          • loop up two levels
        • if you are carrying food
          • drop it
          • turn 180 degrees, loop up two levels
        • if you are not carrying anything
          • pick it up
          • either turn 180 degrees and loop up two levels, or
          • loop up two levels (ie, continue walking straight)
      • if you hit trash (dead ants, etc)
        • if you are carrying trash
          • drop it
          • turn 180 degrees, loop up two levels
        • if you are carrying food
          • turn a random number of degrees between 1 and 179 or 181 and 359
          • loop up two levels
        • if you are carrying nothing
          • pick it up
          • either turn 180 degrees and loop up two levels, or
          • loop up two levels (ie, continue walking straight)
      • if you hit an ant
        • a new ant spawns in a random cell next to the two existing ants (with a 1/grid-shape probability, in a square grid, this would be a ~10% chance of spawning a new ant; in a hex grid, it would be a ~12.5% chance of a spawn), IF there is an empty cell next to either ant
        • if you are both carrying the same thing,
          • start a new drop point
          • turn around 180 degrees
          • loop up two levels
        • if you are carrying different things (or not carrying anything)
          • turn a random number of degrees between 1 and 359
          • loop up two levels
    • if you have been alive “too long” (parameterizable), you die and become trash, dropping whatever you are carrying “next” to you in a random grid point (for example, if the grid is square and you are in position “5”, your cargo could land in any of positions 1-4 or 6-9)

There are more rules you can add or modify (maybe weight your choice of direction when you turn based on whether an ant has been there recently (ie, simulated pheromone trails)), but those are the basics. Take randomly-distributed “stuff” (food, walls, ants, trash, etc) on a board of size B, an ant population – P – of 10% * B, a generation frequency – F – of 9% * B, an iteration count of 5x board-size, and a life span – L – of 10% * B, let it run, and you will see piles of trash, food, etc accumulate on the board.

They may accumulate differently on each run due to the random nature of the inputs, but they’ll accumulate – showing how large numbers of relatively unintelligent things can do things that look intelligent to an outside observer.
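To make that concrete, here’s a minimal sketch of that kind of simulation in Python. It only implements a subset of the rules above (a wrap-around square grid, wandering ants, and food that gets picked up when you hit it empty-handed and dropped when you hit more food while carrying), and the grid size, counts, and iteration budget are arbitrary, but it’s enough to watch the piles form:

import random

SIZE, ANTS, FOOD, STEPS = 40, 60, 150, 20000

# grid[y][x] holds the number of food items sitting in that cell
grid = [[0] * SIZE for _ in range(SIZE)]
for _ in range(FOOD):
    grid[random.randrange(SIZE)][random.randrange(SIZE)] += 1

# each ant is [x, y, carrying-food?]
ants = [[random.randrange(SIZE), random.randrange(SIZE), False] for _ in range(ANTS)]

def step(ant):
    x, y, carrying = ant
    # wander: move to a random neighboring cell (the board wraps, so no walls here)
    x = (x + random.choice((-1, 0, 1))) % SIZE
    y = (y + random.choice((-1, 0, 1))) % SIZE
    if grid[y][x] and not carrying:
        # hit food while empty-handed: pick one piece up
        grid[y][x] -= 1
        carrying = True
    elif grid[y][x] and carrying:
        # hit food while carrying: drop yours here, growing the pile
        grid[y][x] += 1
        carrying = False
    ant[:] = [x, y, carrying]

for _ in range(STEPS):
    for ant in ants:
        step(ant)

# crude map of the board: food piles show up as digits, empty cells as dots
for row in grid:
    print("".join(str(min(cell, 9)) if cell else "." for cell in row))

Run it a few times and the food ends up clumped into a handful of piles, in different spots each run – the same “dumb rules, smart-looking result” effect the zombie hordes are trading on.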

And that’s how zombies appear in most pop culture depictions: they wander more-or-less aimlessly until attracted by something (sound, a food source (aka the living), fire, etc). And while they seem to exhibit mild group/swarm intelligence, it’s just that – an appearance to outside observers.

So if you like zombie stories, you might like computer science.

pi-hole revisited

Back in November, I was really up on Pi-hole.

But after several more months of running it … I am far less psyched than I had been. I’m sure part of that is having gotten better internet services at my house – so the impact of ads is less noticeable.

But a major part of it is that Pi-hole is just too aggressive. Far far too aggressive. Aggressive to the point that my whitelist was growing sometimes minute-by-minute just to get some websites to work.

Is that a problem with the site? No doubt, somewhat. But it’s also a problem of blacklists. When domains and IPs are just blanket refused (and not in a helpful way), you get a broken experience.

Pi-hole has also gone to a quasi-hijack approach: when a domain has been blocked, instead of it just silently not working, it now returns a message to contact your Pi-hole admin to update the block lists.

I hate intrusive ads as much as the next person … but that shouldn’t mean that all ads are blocked. I have unobtrusive ads on a couple of my domains (this one included).

But even with Pi-hole, not all ads are blocked.

Part of that is due to the ever-changing landscape of ad servers. Part of it is due to the inherent problems with the blacklist/whitelist approach.

Content creators should be entitled to compensation for their efforts (even if they voluntarily choose to give that content away). Bombarding visitors with metric buttloads of advertising, however, makes you look either desperate, uncaring, or greedy.

The current flipside to that, though, is the pay-wall / subscription approach. Surely subscriptions are appropriate for some things – but I’m not going to pay $1/mo (or more) to every site that wants me to sign up to see one thing: just today, that would’ve encumbered me with over $100/mo in new recurring bills.

Maybe there needs to be a per-hour, per-article, per-something option – a penny for an hour, for example (which, ftr, comes out to a monthly fee of about $7) – so that viewers can toss some scrilla towards the creators, but aren’t permanently encumbered by subscriptions they’ll soon forget about (though, of course, that recurring subscription revenue would surely look enticing to publishers).

As with the per-song/episode purchase model that iTunes first made big about 15 years ago, you could quickly find out what viewers were most interested in, and focus your efforts there. (Or, continue focusing your efforts elsewhere, understanding that less-popular content will not garner as much revenue as popular content will).

Imagine, using my example of $0.01/hr, how much more engagement you could end up garnering while visitors are actively on your site! A penny is “nothing” to most people – and probably just about all who’re online. Maybe you’ll have a handful of people “abusing” the system by opening a thousand pages in new tabs in their hour … but most folks’ll drop the virtual coin in the nickelodeon, watch the video / read the page / whathaveyounot, and move on about their day.

And not everyone will opt for the charge model. Sites that do utilize it can have some things marked “free” or “free for the next 24 hours” or “free in 7 days” or whatever.

Ad companies like Google could still work as the middleman on handling transactions, too – any time you visit per-X content, there could be a small pop-up that indicated you’d be withdrawing Y amount from your balance to view the site (I’m sure there’ll be competition in the space, so PayPal, Facebook, Stripe, Square, etc etc can get in on the “balance management” piece). And at the end of whatever period (day, week, month), Google can do a mass-settle of all the micropayments collected for each site from each visitor (with some percentage off the top, of course).

No ads. You’d actually Get What You Pay For™, and issues like the recent Admiral thing would go in a corner and die.