antipaucity

fighting the lack of good ideas

improve your entropy pool in linux

A few years ago, I ran into a known issue with one of the products I use that manifests when the Red Hat Linux server it’s running on has a low entropy pool. And, as highlighted in that question, the steps I found 5 years ago didn’t work for me (it turns out modifying the -t parameter from ‘1’ to ‘.1’ did work: rngd -r /dev/urandom -o /dev/random -f -t .1 – though that option is no longer valid in CentOS 7, but I digress).
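
If you want to see how starved a box actually is, the kernel exposes the current pool size directly (a quick sketch – nothing here is specific to rngd or any one tool):

    # how much entropy (in bits) the kernel currently has available
    cat /proc/sys/kernel/random/entropy_avail

    # watch the pool drain in real time while something consumes randomness
    watch -n1 cat /proc/sys/kernel/random/entropy_avail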

In playing around with the Mozilla-provided SSL configurator, I noticed a line in the example SSL config that referenced “truerand”. After a little Googling, I found an open-source implementation called “twuewand”.

A little more Googling about adding entropy led me to this interesting tutorial from Digital Ocean on “haveged” (which, interestingly enough, allowed me to answer a 6-month-old question on Server Fault about CloudLinux).

Haveged “is an attempt to provide an easy-to-use, unpredictable random number generator based upon an adaptation of the HAVEGE algorithm. Haveged was created to remedy low-entropy conditions in the Linux random device that can occur under some workloads, especially on headless servers.”

And twuewand “is software that creates hardware-generated random data. It accomplishes this by exploiting the fact that the CPU clock and the RTC (real-time clock) are physically separate, and that time and work are not linked.”

For workloads that require lots of entropy (generating SSL keys, SSH keys, PGP keys, and pretty much anything else that wants lots of random (or strong pseudorandom) seeding), the very real problem of running out of entropy – especially on headless boxes or virtual machines – is something you can run into surprisingly often.

Enter solutions like OpenRNG, which are hardware entropy generators (that one is a USB dongle; see also this skh-tec post). Those are awesome – unless you’re running in cloud space somewhere, or even just a “traditional” virtual machine.

One of the funny things about getting “random” data is that it’s actually very very hard to get. It’s easy to describe, but generating “truly” random data is incredibly difficult. (If you want to have an aneurysm (or you’re like me and think this stuff is unendingly fascinating), go read the Wikipedia entry on “Cryptographically Secure Pseudo Random Number Generator“.)

If you’re in a situation, though, like I was (and still am), where you need to maintain a fairly steady supply of decent entropy (probably close to CSPRNG level), use haveged. And run twuewand occasionally – at the very least when starting Apache (assuming you’re running HTTPS – which you should be, since it’s so easy now).
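
Getting haveged running on a RHEL/CentOS box is about as simple as it gets (a sketch only – it assumes the package is available, eg via EPEL, and uses pre-systemd service commands):

    # install haveged and keep it running across reboots (EPEL assumed)
    yum install -y haveged
    chkconfig haveged on      # on systemd boxes: systemctl enable --now haveged
    service haveged start

    # confirm the pool is staying topped up
    cat /proc/sys/kernel/random/entropy_avail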

what to automate

I have been in the world of automation for quite a while – specifically the realms of server, datacenter, and cloud automation – but I’ve been interested and/or involved in tasks that tend towards automation (even if only briefly) for far longer than just my post-college time in the world of HPSA and its related ilk.

One of the first questions customers ask us when we arrive onsite (heck – even way back in the technical presales cycle) is NOT what can be automated, but rather what should we automate and/or what can we automate first.

Analyzing the environment and finding some prime, low-hanging fruit to target in an initial automation push is vital.

To quote Donald Knuth, “We should forget about small efficiencies, say about 97% of the time; premature optimization is the root of all evil.”

In the realm of automating, that means skipping the tasks that, while the tools at hand could make quick work of them, are done so infrequently as to not warrant an immediate focus – the ROI on infrequently-done tasks is not going to be readily seen*.

This is part of where being a good architect comes to bear.

No.

That’s not true.

This is where being a good listener and collator comes to bear. In a future post I’ll talk more fully about the art of architecting – but for today’s topic, let’s focus on the true key personality traits you must display to get a successful project started, implemented, and running.

You need to listen. You do not need to merely “hear” – you need to process what is being said, ask it back, take notes, ask for clarifications, etc. In the counseling world, this is called “active listening”. In the rest of life, it’s called being an attentive, thoughtful, caring, intelligent, adult human being.

When you hear a customer say they have a real problem with some task or other (beware – managementspeak coming!) – ie, they have “pain points” in various places – ask what those individual tasks actually comprise. Investigate what can be touched today, what can be planned-for tomorrow, and what needs to be tabled for a future engagement (for you architects and sales folks reading, this translates into “what can we sell them later – after this project is successful?” – how can we build and strengthen this relationship?).

Take these notes and conversations you have to your colleagues and tease-out coherent lines of attack. Collate all the notes from everyone involved into commonalities – what has everyone heard a customer say? What did only one guy hear? What order did each person hear them in?

After you’ve listened, after you’ve taken your notes, after you’ve powwowed with your colleagues – then comes the fun part of any engagement: the actual automation!

Bring your cleaned-up and trimmed-down notes back to the customer in an easily-digestible form, and give a solid plan for what we will do now, what we want to do soon, and what really needs to wait until later. Put an N, S, or L next to each item on your list – it’s a first-cut priority draft. Then ask your customer how they view those tasks, and listen to what they say are their priorities (including “real” dates, if any exist). You may need to reorganize your list, but keeping it involved in all project discussions will show you’re truly paying attention to them.

And at the end of the day, everyone’s favorite topic is themselves. Always – even shy people want to hear themselves bragged-up, talked-about, promoted, and given attention.

When you showcase your individual focus and attention on your customer, it will show in their willingness to accept you into their closer rings of trust – their readiness to receive you as a “trusted advisor”, which is what you want to be for them: you want to be who they can talk to about problems they’re seeing in their environment (current or potential) so you can bring your expertise to bear on their issues.

The role of any consultant who wants to be more than a mere grunt rests not so much on technical or business acumen as on being the customer’s business therapist and/or best friend. You want to be able to say with Frasier Crane, “I’m listening”. And you want them to know that you really are.

Some of the early steps you can take today to bring yourself there are to:

  • avoid electronic distraction in meetings
  • document everything you do for work
  • be detailed
  • know industry trends, what competitors are doing, etc
  • treat everyone you come in contact with at a customer as if they were the most important person there
  • anticipate what you may be asked, and where you want to go
  • never speak authoritatively about that which you do not know
  • learn – be a “Lifelong Learner”: the day you stop learning is the day you stop growing, and the day you stop being reliable to others

*Unless, of course, those infrequent tasks are only infrequent because they’re “hard”, and therefore automating them will yield a solid ROI by allowing them to be done more often

a smart[ish] dhcpd

After running into some wacky networking issues at a recent customer engagement, I had a brainstorm about a smart[ish] DHCPd server that could work in conjunction with DNS and static IP assignment to more intelligently fill subnet space.

Here’s the scenario we had:

  • Lab network space was fairly-heavily populated with statically-assigned addresses – in a /23 network (ie ~500 available addresses on the subnet), about 420 addresses were in use.
  • Not all statically-assigned IPs were registered in DNS.
  • The in-use addresses did not leave much contiguous, unused space (little groups of 2 or 4 addresses open – not ~80 in a row, or even a couple of small batches of 20-30).
  • DNS was running on a Windows 2012 host.
  • DHCPd (ISC’s) was set up on an RHEL 5 x64 Linux machine.

The problem with using the ISC DHCPd server, as supplied by HPSA, is that while you can configure multiple subnets to hand-out addresses on, you cannot configure multiple ranges on a single subnet. So we were unable to effectively utilize all the little gaps in assigned addresses.

Maybe this is something DNS/DHCP can do from a Windows DC, but I have an idea for how DHCPd could work a little smarter:

  • give a very large range on a given subnet (perhaps all but the gateway and broadcast addresses)
  • before handing an address out, in addition to checking the leases file to see whether it is free, check against DNS to see if it is in use
  • if an address is in use because it is static, update the leases file with the statically-assigned information as if it were assigned dynamically – but give it an unusually-long lease time (eg 1 month instead of 4 hours)
  • on a periodic basis (perhaps once an hour, day, week – it should be configurable), scan the whole subnet for in-use addresses (via something like nmap and checking against DNS)
    • remove all lease file entries for unused/available IPs
    • update lease file entries for used/unavailable IPs, if not already recorded
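
As far as I know, none of this exists in ISC’s dhcpd, but the periodic scan step above could be prototyped today with standard tools. A rough, hypothetical sketch (assumes nmap and dig are installed; the subnet and DNS server addresses are placeholders):

    #!/bin/bash
    # hypothetical sketch: build a list of in-use addresses on the subnet so a
    # smarter dhcpd (or a cron job massaging its leases file) could avoid them
    SUBNET="10.0.0.0/23"     # placeholder - the lab subnet
    DNS_SERVER="10.0.0.5"    # placeholder - the Windows 2012 DNS host

    # ping-scan for hosts that answer right now
    nmap -sn -n "$SUBNET" -oG - | awk '/Up$/ {print $2}' > /tmp/in-use.txt

    # add anything DNS knows about, even if it did not answer the scan
    for ip in $(seq -f "10.0.0.%g" 1 254) $(seq -f "10.0.1.%g" 1 254); do
        if dig +short -x "$ip" @"$DNS_SERVER" | grep -q .; then
            echo "$ip" >> /tmp/in-use.txt
        fi
    done

    sort -u -o /tmp/in-use.txt /tmp/in-use.txt
    # /tmp/in-use.txt now holds every address dhcpd should treat as taken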

This would have the advantage of intelligently filling address gaps on a given subnet, and would require less interaction between teams that want/need to be able to use DHCP and those that need/want static addresses.

Or maybe what I’m describing has already been solved, and I just don’t know how to find it.

automating or automation?

I have been working in the realm of “automation” – specifically data center automation – for several years.

Merriam-Webster defines “automating” thusly:

  1. to operate by automation
  2. to convert to largely automatic operation <automate a process>

Notice the subtle difference with M-W’s definition of “automation“:

  1. the technique of making an apparatus, a process, or a system operate automatically
  2. the state of being operated automatically
  3. automatically controlled operation of an apparatus, process, or system by mechanical or electronic devices that take the place of human labor

These words tend to be used interchangeably – but they are different. Most of the customers I have worked with think they are “doing automation”, when in actuality they are only barely starting to “automate”.

What do I mean by this?

Most customers that bring automation tools in-house (whether simple cron jobs or complex tools like HPSA) take their current, manual processes and merely write wrappers around them to make them “automatic”. That is the first step of automation – but it’s only the first step.

Too many people try to take new tools and make them fit their current processes, procedures, and policies – rather than seeing what policies, procedures, and processes are either made redundant by the new tools, or can be improved, shortened, or – wait for it – automated!

Think of a physical example – you’re a Shaker cabinetmaker in the late 1800s. You’re making end tables. Cutting dovetails with a dovetail saw. Sanding with a block and sandpaper. Cutting pieces to size by hand. Drilling mortises and cutting tenons with a manual auger and small saw. Turning on a treadle-powered lathe.

Jump forward 100 years. You’re WGBH in Boston wanting to come up with a how-to program to air weekly. You find a fellow named Norm Abram, and pay him to do the show. You put New Yankee Workshop on TV for years. Norm visits places like Hancock Shaker Village for inspiration. But Norm doesn’t use loads of hand tools – he uses radial arm saws, drill presses, table saws, routers, jointers, etc. Why? Because he wants to make a prototype or three in a few days. And then he wants to film a show in two days (or occasionally longer for big projects) and be able to give you the confidence, along with a set of measured drawings, that you could, more-or-less, replicate what he did in your own home.

Which approach is better? Neither – they both yield end tables. Which approach is more repeatable – especially by someone with little experience? That’d be Norm’s method: the automated method. But the New Yankee Workshop is only slightly down the road of automating the entire process.

Automation (in the furniture world, at least) is found at Ikea. Thousands of identical Ingos and Karls roll out the door every year. Produced by automated machinery.

The job of the Information Technologist has a great deal of art to it – but it’s also a science: at the core of everything a computer does is logic (perhaps poor logic, but logic nonetheless). Everyone in information technology (though this applies to myriad other industries, too) should be striving to make themselves replaceable – because no one is irreplaceable. I’ve seen it come true in scores of settings: the person who makes themselves “irreplaceable” never gets promoted, and is eventually replaced by someone else – either management removes him or finds a way around him, or he leaves the organization.

Therefore, preemptively make yourself replaceable. This was Ken Moellman’s campaign when he ran for State Treasurer in KY: to eliminate the very job he was running for.

There’s a secret to making yourself replaceable – and that is that when you can show that you’re replaceable, especially in the world I work in, you tend to be promoted. You also tend to grow because you’re learning more. Because you learn more and grow, you become more valuable. Because you become more valuable, you can move up the chain as you like.

Do artisan works still have a market? Of course – go look at any “artsy” type store that showcases “local craftsmen” in almost any part of the country: they’re offering their hard work for your consideration … at a hard price.

Is the artisan chair fundamentally any better than the Ikea chair? Maybe, maybe not. It sure looks better. But it’s not as repeatable.

And repeatable tasks get automated, because if you have to do it twice, you need to write it down. And if you have to do it more than twice, you need a process anyone can follow. But processes are always open for refinement and replacement. The process to get a piece of lumber from a tree is not completely dissimilar today from how we did it before the advent of the sawmill – but with laser-guided blades, the sawmill of 2013 can optimize lumber out of a log in a way very, very few people ever could … and can cut the log into its constituent boards faster than any person.

The process for manually provisioning a RHEL server is pretty simple – but shortly after introducing Anaconda, Red Hat introduced kickstart (modeled after Solaris’ JumpStart). Microsoft, likewise, has unattend.txt and unattend.xml (for the DOS and WinPE methods of installing, respectively). And SuSE, HP-UX, and AIX have their own systems (AutoYaST, Ignite-UX, and NIM).

Why do these tools exist? So administrators can rapidly deploy machines without having to do a lot of extra setup work by hand. The same can be asked of chainsaws, table saws, and circular saws: why do they exist? So you can cut wood more rapidly and more repeatably than by hand. You can fell a tree with an axe. You can fell a tree with a saw. But for a single person, felling a tree with a chainsaw is best. Or you can use a Tiger Cat.

The first step of automating needs to be building wrappers to reliably repeat manual processes.
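
As a toy illustration (a sketch only – the package names and steps are placeholders, not anyone’s real runbook), that first step is often nothing fancier than capturing the manual checklist in a script so it runs the same way every time:

    #!/bin/bash
    # toy example: wrap a manual "stand up a web server" checklist in a script
    set -euo pipefail

    yum install -y httpd                                  # step 1 of the old checklist
    cp ./standard-httpd.conf /etc/httpd/conf/httpd.conf   # step 2: drop in the standard config
    chkconfig httpd on                                    # step 3: start on boot
    service httpd start                                   # step 4: start it now
    curl -fsS http://localhost/ >/dev/null                # step 5: sanity-check it answers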

The second, and far more important, step in the paradigm shift from manual methods to automation, is to systematically go through all of your processes, procedures, and policies and see what can be refined, what can be replaced, and, most importantly, what can be removed.

What legacy activities are you doing that should be eliminated, updated, or cleaned-up?

delivering solutions – “shipping is a feature!”

Back in 2009, Joel Spolsky wrote an article called The Duct Tape Programmer. Of everything he has written, I think this is the very pinnacle, and it is summed up in one simple sentence in the middle: “Shipping is a feature.”

I’ve referenced this article twice before (in Feb and Sep of ’11).

Why is this so important in my mind?

I went back to school in 2003 to complete my bachelor’s degree in CIS. I had graduated in 2001 with an AAS in CIS from HVCC, and after 2 years of searching and finding nothing better than the job I had at Hertz, I decided a 4-year degree might help. I graduated from Elon University in Dec 2006 with my newly-minted BA in CIS. During my tenure at school I discovered that I didn’t really like the development end of Computer Science, and instead preferred the analytical and integrational aspects of systems work – tying disparate tools together, improving internal workflow, etc – to help make individuals’ lives better and easier. In other words, I enjoyed finding ways to automate time-consuming and repetitive tasks to allow myself (and others) to focus on more interesting work – like figuring out how to automate more tasks to move up the chain.

I worked for a few places while I was at school (two different departments at the school itself, a pair of non-profits, and some freelance side work doing web site development). When I graduated, therefore, it was only natural that I ended-up with a pair of offers to work with automation tools – one from a company called Opsware, and one from a place called Network General. For a variety of reasons, I chose Opsware.

It wasn’t long after I started in Support for Opsware’s Server Automation [System] product that I became more and more sold on the product, and grew bored doing support – troubleshooting is fun, but with the paucity of good support tickets*, the large similarity of cases coming from customers, etc … it just wasn’t “me”. Shortly after HP purchased Opsware, I put in to move from Support to Professional Services – to, hopefully, get a chance to work on harder integrational problems than I would ever see helping people over the phone and via email.

Beginning in March of 2008^, I moved from Support to ProServe, and started to get a taste for the bigger systemic problems that could be solved with the Opsware HP BTO suite. While with HP, I had the opportunity to do the global delivery of HPSA 7.5 for HSBC – performing both installation and onsite mentoring/training in Chicago, NYC, London, and Hong Kong. I also did the replacement install of HPSA 7.0 (a non-upgrade-to release) for Home Depot in Atlanta to manage their 2200 stores. There were other customers I worked with, too – but those were the two biggest.

One of the issues that has arisen with [nearly] every customer I have ever worked with is that they want what they’ve agreed to pay for in the Statement of Work (SOW) signed, sealed, and delivered by the end of the project – and if it’s not, they want good reasons to sign a CO (change order) to modify the SOW.

And it’s no surprise. When you cost someone nearly 7 cents per second to work for them, they want to see results!

One of the constraints, therefore, that needs to be constantly watched is scope creep – the insidious tendency for all projects to go beyond their intended purpose (violating law 47 of the 48 laws), exceed budget, and never deliver what is really needed.

My primary goal when I work with a customer is not, perhaps paradoxically, to “make them happy”. One thing I learned when working in support is that the customer is never right!. You may have to pretend that they’re kinda right – but they’re always wrong. They do not know what they want. They do not know what they need. And they certainly do not know what is wrong if you ask them.

My primary goal when I work with a customer is to deliver what they have paid for. When possible, I will change course slightly (following proper CO processes) – but I want them to get what they have agreed to pay for. Ideally, especially now that I am in the architecture end of the world much more than ‘just’ delivery, I can work with them in the pre-sales process to get the SOW to something that approximates what they need. But I always aim to give them what they have paid for. Everything else is window dressing.

At the end of a project – whether as outside consultants, students, internal employees, at home, for work, etc – what needs to be seen is what was paid for.

Ship. Deliver.

Without those two, nothing gets done.


* I have grown so frustrated with support processes that I spent time a couple years ago writing a small eBook that includes a section on how to make good tickets. I’ve also written on ways to improve your support organization before.
^ Just realized that means I’ve been doing ProServe or PS-like work for 5 years running, and have been with the automation suite for more than 6 now.
! Before you become too concerned – I do realize there are a few good customers out there. But they are just that – few, and VERY far between.

call

I learned about the call command in Windows recently.

Some context – I was trying to run a command via HPSA at a customer site, but kept getting an error that the program was not a recognized internal or external command.

Very frustrating.

Then one of the guys I worked with suggested adding a “call” to the front of my script. That worked like a champ. Here’s why.

When the HPSA Agent on a Managed Server receives a script to run from the Core, it runs it in a headless terminal session. This means that while environment variables (eg %ProgramFiles%) expand properly, if the first part of the command is NOT a built-in from cmd.exe, it won’t execute. Unlike *nix, which is designed to run most things headless, Windows never was (and still isn’t as of Win2k8R2).

The built-in command ‘call‘ forks the next command to a full session (albeit still headless), and enables cmd.exe to run it properly.

Now you know.

storage strategies – part 4

Last time I talked about storage robustifiers.

In the context of a couple applications with which I am familiar, I want to discuss ways to approach balancing storage types and allocations.

Storage Overview

Core requirements of any modern server, from a storage standpoint, are the following:

  • RAM
  • swap
  • Base OS storage
  • OS/application log storage
  • Application storage

Of course, many more elements could be added – but these are the ones I am going to consider today. From a cost perspective, hardware is almost always the least expensive part of an application deployment – licensing, development, maintenance, and other non-“physical” costs will typically far outweigh the expenses of hardware.

RAM

RAM is cheap, comparatively. Any modern OS will lap-up as much memory as is given to it, so I always try to err on the side of generous.

swap

After having found an instance where Swap Really Mattered™, I always follow the Red Hat guidelines, which state that swap space should be equal to installed RAM plus 2 gigabytes. For example, if 16GB of RAM is installed, swap should be at least 18GB. Whether swap should be on a physical disk or in a logical volume is up for debate, but do not chintz on swap! It is vital to the healthy operation of almost all modern systems!
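
A quick way to eyeball that target on a running box (just a one-liner sketch of the RAM-plus-2GB rule above):

    # print installed RAM and the "RAM + 2GB" swap target, in GB
    awk '/MemTotal/ {ram=$2/1024/1024; printf "RAM: %.1f GB -> swap target: %.1f GB\n", ram, ram+2}' /proc/meminfo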

Base OS

This will vary on a per-platform basis, but a common rule-of-thumb is that Linux needs about 10GB for itself. Windows 2008 R2 requests a minimum of 20GB, but 40GB is substantially better.

OS/application logs

Here is another wild variable – though most applications have pretty predictable log storage requirements. For example, HP’s Server Automation (HPSA) tool will rarely exceed 10GB in total log file usage. Some things, like Apache, may have varying log files depending on how busy a website is.

Application

Lastly, and definitely most importantly, is the discussion surrounding the actual space needed for an application to be installed and run. On two ends of the spectrum, I will use Apache and HPSA for my examples.

The Apache application only requires a few dozen megabytes to install and run. The content served-up by Apache can, of course, be a big variable – a simple, static website might only use a few dozen megabytes, whereas a big, complex website (like StackOverflow) might be using a few hundred gigabytes (in total, with any dynamically-generated content which might be in a database, or similar)*.

The best way to address varying storage needs, in my opinion, is to use a robustifying tool like LVM – take storage presented to the server, and conglomerate it into a single mount point (perhaps /var/www/html) so that as content needs change, it can be grown transparently to the website.
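
A minimal sketch of that approach (device names, volume names, and sizes are all placeholders; assumes the LVM2 tools are installed):

    # conglomerate two presented disks/LUNs into one growable mount point
    pvcreate /dev/sdb /dev/sdc
    vgcreate vg_web /dev/sdb /dev/sdc
    lvcreate -n lv_html -l 100%FREE vg_web
    mkfs.ext4 /dev/vg_web/lv_html
    mkdir -p /var/www/html
    mount /dev/vg_web/lv_html /var/www/html

    # later, when more storage is presented, grow it transparently to the website
    pvcreate /dev/sdd
    vgextend vg_web /dev/sdd
    lvextend -l +100%FREE -r /dev/vg_web/lv_html   # -r grows the filesystem too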

Likewise, with HPSA, there are several base storage needs – for Oracle, initial software content, the application itself, etc. Picking-up on a previous post on bind mounts, I think it a Very Good Thing™ to present a mass of storage to a single initial mount point, like /apps, and then put several subdirectories in place to hold the “actual” application. Storage usage for the “variables” of HPSA – Software Repository, OS Media, Model Repository (database) – is very hard to predict, but a base guideline is that you need 100GB in total to start.
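
The bind-mount layout might look something like this (a sketch only – the target paths below are placeholders, not HPSA’s documented defaults, so substitute the locations your product actually expects):

    # one big pool of storage mounted at /apps, carved into subdirectories
    mkdir -p /apps/oracle /apps/osmedia /apps/swrepo

    # bind each subdirectory into the path the application expects
    mount --bind /apps/oracle  /u01/app/oracle
    mount --bind /apps/osmedia /path/to/os-media
    mount --bind /apps/swrepo  /path/to/software-repo

    # matching /etc/fstab entries so the binds survive a reboot:
    # /apps/oracle   /u01/app/oracle         none  bind  0 0
    # /apps/osmedia  /path/to/os-media       none  bind  0 0
    # /apps/swrepo   /path/to/software-repo  none  bind  0 0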

Choosing Storage Types

My recommendations

This is how I like to approach storage allocation on a physical server (virtual machines are a whole other beast, which I’ll address in a future post) for HP Server Automation:

Base OS

I firmly believe this should be put on local storage – ideally a pair of mirror-RAIDed 73GB drives (these could be SSDs to accelerate boot time, but otherwise the OS, per se, should not be “used” much). You could easily get away with 36GB drives, but since drives are cheap, using slightly more than you “need” is fine.

swap

Again, following the Red Hat guidelines (plus my patented fudge growth factor), ideally I want swap to be on either a pair of 36GB or a pair of 73GB drives – not RAIDed (neither striping nor mirroring swap makes a great deal of sense). Yes, this means you should create a pair of swap partitions and present the whole shebang to the OS.
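
Setting that up is quick (a sketch; the device names are placeholders). Giving both partitions the same priority lets the kernel spread swap activity across both drives without any RAID involved:

    # turn both drives into swap and activate them at equal priority
    mkswap /dev/sdb1
    mkswap /dev/sdc1
    swapon -p 1 /dev/sdb1
    swapon -p 1 /dev/sdc1
    swapon -s    # confirm both are active

    # /etc/fstab entries so they come back after a reboot:
    # /dev/sdb1  swap  swap  pri=1  0 0
    # /dev/sdc1  swap  swap  pri=1  0 0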

OS/application logs

Maybe this is a little paranoid, but I like to have at least 30GB for log space (/var/log). I view logs as absolutely vital in the monitoring and troubleshooting arenas, so don’t chintz here!

Application

HPSA has four main space hogs, so I’ll talk about them as subheadings.

Oracle

It is important that the database has plenty of space – start at 200GB (if possible), and present it as a logically-managed volume group, preferably made up of one or more [growable] LUNs from a SAN.

Note: Thin-provisioning is a perfectly-acceptable approach to use, by the way (thin provisioning presents space as “available” but not yet “allocated” from the storage device to the server).

Core application

The application really doesn’t grow that much over time (patches and upgrades do cause growth, but they are pretty well-defined).

Since this is the case, carve 50-60GB and present it as a [growable] LUN via LVM to the OS.
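
“Growable” is the operative word in all of these – when the storage team later expands a LUN, LVM can pick up the new space without downtime. A sketch (the multipath device, volume group, and logical volume names are placeholders):

    # after the LUN behind /dev/mapper/mpathb has been grown on the SAN
    # (the OS may need a SCSI rescan first - details vary by HBA/multipath setup):
    pvresize /dev/mapper/mpathb                    # tell LVM the PV is bigger
    lvextend -l +100%FREE -r /dev/vg_sa/lv_core    # grow the LV and its filesystem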

OS Media

Depending on data retention policies, number of distinct OS flavors you need to deploy, and a few other factors, it is a Good Idea™ to allocate a minimum of 40GB for holding OS media (raw-copied content from vendor-supplied installation ISOs). RHEL takes about 3.5GB per copy, and Windows 2008 R2 takes about the same. Whether this space is presented as an NFS share from a NAS, or as a [growable] LUN under an LVM group from a SAN isn’t vitally-important, but having enough space most certainly is.

Software Library

This is truly the biggest wildcard of them all – how many distinct packages do you plan to deploy? How big are they? How many versions need to be kept? How many target OSes will you be managing?

I prefer to start with 50GB available to the Library. But I also expect that usage to grow rapidly once the system is in live use – Software Libraries exceeding 300GB are not uncommon in my field. As with the OS Media discussion, it isn’t vitally-important whether this space is allocated from a NAS or a SAN, but it definitely needs to be growable!

Closing comments (on HPSA and storage)

If separate storage options are not available for the big hogs of SA, allocating one big LVM volume (made up of LUNs and/or DAS volumes) and then relying on bind mounts is a great solution (and avoids the need to worry about any given chunk of the tool exceeding its bounds too badly – especially if other parts aren’t being as heavily-used as might have been anticipated).


*Yes, yes, I know – once you hit a certain size, the presentation and content layers should be split into separate systems. For purposes of this example, I’m leaving it all together.