fighting the lack of good ideas

http is a stateless protocol

The ubiquitous protocol that enables the internet as we know it, http, is stateless.

Stateless merely means that any given request has nothing to do with the previous request or the next one. This enables the world wide web, as web servers do not need to keep track of who is receiving data, nor how much they have: they get a request, and ship data to the requestor.
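A minimal sketch of what "stateless" looks like from the server's side, assuming a made-up page store (`PAGES`) and a hypothetical `handle_request` function. It is not how any particular web server is written, just an illustration that the response is computed from the request alone:

```python
from typing import Tuple

# Hypothetical page store; the paths and contents are made up for illustration.
PAGES = {
    "/": "<h1>home</h1>",
    "/about": "<h1>about</h1>",
}


def handle_request(method: str, path: str) -> Tuple[int, str]:
    """Build a (status, body) response from the request alone.

    Nothing is read from or written to any per-client record, so the
    server cannot tell whether this caller has been here before.
    """
    if method != "GET":
        return 405, "method not allowed"
    body = PAGES.get(path)
    if body is None:
        return 404, "not found"
    return 200, body


# The same request yields the same response, whoever sends it and
# however many times it is sent.
first = handle_request("GET", "/about")
second = handle_request("GET", "/about")
assert first == second
```

Once the function returns, the server keeps no memory of the exchange, which is the point of the creditor-and-check analogy.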

It is up to the requestor (often a web browser) to handle the incoming data.

If not every part of a web page, for example, is sent, the browser will display what it can.

This is analogous to a creditor sending you a bill (request), and you sending a check back to them – once the bill has been sent, the creditor knows nothing about the state of the bill until he receives a payment. Likewise, once the check is dropped in the mail, the payor knows nothing about his bill until the check clears his bank.

Why is this important? Because of an oft-repeated “request for enhancement” to the product I use on a daily basis. When the implementors of Opsware SAS were deciding how a user should communicate with the system, they chose to run everything over http(s). They picked http because it’s commonplace, well-understood, and easy to work with.

One of the things about statelessness is that you cannot know how many people are using a given web page at the same time. Google cannot tell anyone how many people are actually looking at one of its pages at this moment. They can tell you how many loaded it, and how many just pressed “Search”, but they can’t know what percentage of the loaders promptly went elsewhere – either to a different page, or a different room in their home.

One way around the statelessness of http is to utilize cookies or session data – but that merely adds a check layer to the interaction, it does not provide true “statefulness”.
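To make the "check layer" concrete, here is a rough sketch of cookie-style sessions. The names `SESSIONS`, `log_in`, and `identify` are hypothetical and chosen for illustration; real frameworks differ in the details:

```python
import secrets
from typing import Dict, Optional

# Server-side table mapping a session token to a user name.
SESSIONS: Dict[str, str] = {}


def log_in(user: str) -> str:
    """Create a session and return the token to hand back as a cookie."""
    token = secrets.token_hex(16)
    SESSIONS[token] = user
    return token


def identify(cookie_token: Optional[str]) -> Optional[str]:
    """Look up the user for the token the client chose to send, if any."""
    if cookie_token is None:
        return None
    return SESSIONS.get(cookie_token)


token = log_in("alice")
# Each request is still independent: the server recognises alice only
# because the client re-sends the token with every request.
assert identify(token) == "alice"
assert identify(None) is None
```

The table records that a session was issued, not that anyone is still sitting at the other end of it.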

Several times during my time in Support at Opsware (and after HP’s acquisition), I would have a customer who was looking for the ability to determine who was logged-in at any given time (in similar fashion to running `w` or `who` or `finger` on a Linux/Unix system). This could be important to know whether a user is “doing something” before doing an application restart.

However, since communication is all done via http, there can be no state known in the tool. Once you load a web page, it is being viewed/rendered on your local machine in your web browser – the server could be shut off, your network connection removed, or any of a host of other simulations of restarting the application. And your browser would be none-the-wiser, nor should it be: it has the data it requested/received, and you’re doing something with it.

This carries over to the product I work with. Jobs might be scheduled by a user to run every day at 0200 – but he doesn’t need to be actively logged-in to have them run. Likewise, someone may have logged-in, but is not “doing” anything currently (maybe they’re at lunch).
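The closest a web application can usually get to `who` is a guess based on when each user last sent a request. The sketch below assumes a hypothetical `LAST_SEEN` table and an arbitrary 15-minute window; it is not how Opsware SAS works, only an illustration of why the answer is an estimate:

```python
from typing import Dict, List

# user name -> time of that user's most recent request, in seconds
LAST_SEEN: Dict[str, float] = {}


def record_request(user: str, now: float) -> None:
    """Note that `user` sent a request at time `now`."""
    LAST_SEEN[user] = now


def recently_active(now: float, window: float = 900.0) -> List[str]:
    """Users with a request in the last `window` seconds: a guess at "logged in"."""
    return sorted(u for u, seen in LAST_SEEN.items() if now - seen <= window)


record_request("alice", now=1000.0)
record_request("bob", now=1800.0)
print(recently_active(now=2000.0))  # ['bob']
```

Alice may still be reading the page she loaded, and a job scheduled for 0200 never shows up here at all, so a list like this cannot say whether a restart is safe.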

Another case of why technical intricacies matter 🙂

technical career development

Career development. Career path. Development opportunities. Taking your career to the next level.

Terms and phrases we all hear and pretty much pass over in our day-to-day lives. Right up until we want to move to a new/better job or performance reviews roll around.

But what do they mean, and how can you advance your career (presuming, of course, that you want to)?

This is by no means an exhaustive list – indeed, I’d appreciate any other ideas / feedback / improvements y’all may suggest 🙂

For a software developer:

  • be the documentation KING of your code – if it’s not right, make it right
  • own every bug in your code – even when it’s not “yours”
  • be The Guy™ who learns a new component of the code/product (at least conversationally) every few weeks
  • write at least one tutorial a month on the internal wiki/kb about something you found or did with the code
  • write at least one tutorial or similar a month externally (maybe a personal blog) in a general fashion about something you learned or did

For a systems consultant:

  • be the documentation KING of every project you work on – make ABSOLUTELY sure the next guy can do more after you leave
  • own every issue you find, even when it’s really somebody else’s problem (no throwing it over the fence)
  • be The Guy™ who learns something new about the environment or product every couple weeks
  • write at least one tutorial a month and/or give an overview talk of something you learned/did
  • write about what you’ve done (changing names to protect the innocent) on a blog or elsewhere
  • teach as many people as are willing to learn what you know (in your company / on your team / etc)

Focus – decide where you want to be, and plot a course to get there.

Finally, NEVER make yourself “irreplaceable” – the instant you make yourself irreplaceable, you also make yourself unpromotable: after all, if you’re the Only Guy™ who can do your job, why would your boss/manager/supervisor even think of moving you into a new role?

As a side note – if you’re ever working at a customer site, don’t take calls from anyone other than the customer while you’re at your desk/cube/workspace: even if it’s project related, take it in a different room 🙂

the ticket smash, raw metrics, and communication – how to have a successful support organization

When I worked at Opsware, and for a while after HP bought us, we used to try to have once- or twice-a-week meetings for each support group wherein we would bring our most difficult cases (with the difficulty being determined by the case owner), and have an opportunity for everyone on the team to ask questions, contribute, and maybe even solve the problem our customer was having.

Novel idea, isn’t it? The typical Support team is driven by stats – the number of tickets in their queue, age of the ticket, number solved/closed, number escalated, etc. Support is driven by these numbers because managers don’t think of any better way to do it.

All things being equal, if you can close 40 cases in a week, that’s a lot better than your podmate who “only” finished-out 12. But what about the complexity of each of those cases? And how much effort did each engineer put into them? Did the customer come back and ask for it to be closed because it’s either no longer an issue, or they solved it themselves? Is it a question that can be answered with a reference to a specific page/section of a manual? Or was it a problem that took multiple webex engagements, and dozens of contacts back and forth to find a solution because it was a deep bug?

Theoretically, the goal of “support” is to, well, support – get the problem reporter a solution of some kind they can use. That solution may be a bug fix, an RFE, a reference to a tutorial, reconfiguring, or a work around / alternative approach to their problem. A big problem with this setup is that the reporter rarely asks the right question. They ask what they have pre-determined the question to be – but by biasing their initial report, they can often end-up dragging-out the solution process far longer than it should take. I recently wrote a guide on creating effective support tickets, based on my experience working in support, and interacting with various support organizations both before and since.

Reporter bias is the hardest issue to overcome, in my opinion; engineer bias is easier to get past because (hopefully) there are folks you can bounce the problem off of in the team who can help narrow-down the problem and find a solution … or at least figure out where to try looking next.

Communication is the key to solving problems – when I was at Opsware we utilized internal IRC channels and (gasp!) talking with each other to try to find solutions to customer issues. We also spent a lot of time wording inquiries to the reporter to try to gain as much information as possible on each iteration of the communication process.

Another key to solving problems was to make records of cases with the following:

  • initial reported behavior (or lack thereof)
  • actual problem
  • solution

Those records were sometimes on wiki pages, sometimes in our Plone internal KB, and sometimes got “promoted” out to the customer-facing KB. All of these approaches helped us get problems solved faster – either by offloading the “work” to the customer (via a KB reference), or by being able to apply previous answers more quickly when new-but-similar/identical problems were reported.

The end goal of a support team is not to outdo one another on how many cases one engineer has in his queue, or how many another has closed – the end goal is to solve customer problems. “Works well in a team setting” is a qualification typically associated with support engineering employment listings – but all too often that gets reduced to a cliche that practically means “tries to outdo his cubemates by closing more cases than the next guy”.

I’m as much a fan of personal responsibility and action as the next red-blooded capitalist, so don’t take this next section to imply I’m promoting communalism.

The way a support team should work is the way [good] sports teams work, or the way a Nascar team operates: yeah, it’s the driver of the car who gets the “glory”, but without his pit and maintenance crew, he’d be no better than you or me going to the grocery store. Any given support engineer gets to have his name tagged to the case for posterity – both with the good things he did, and the not so good ones. But since the goal is really to get the customer’s problem addressed, the ego of the engineer needs to be removed from the equation.

Bob Smith might be “the guy” who informed his customer of a solution, but generating the solution involved the other 7 people in his office. He gets the “fame” from Universal Widgets LLC, but he was just one of the [important] cogs in the process of resolving the issue.

The number of cases Bob has in his queue should have [almost] ZERO correlation to his skill as a technical engineer: it’s the 7 people behind him whom he can ask and brainstorm with that get the job done.

Maybe Bob gets to handle most of the “customer” action, but the other 7 are writing bug reports, solutions articles, etc. When evaluating that team, management needs to do just that: evaluate the team first, and the individuals second.

debugging authorized_keys and ssh

I saw an interesting question this morning on ServerFault, entitled “SSH Prompts for password even though private keys are available, presented to server and known to it”.

  • when my user is not already connected to the server (first ssh connexion), it prompts for password even though privates keys are availiable (PuTTY + Pagent). After that first connection, if I open a secondary or a third connection it gets connected with the keys.
  • If I close all connections and open a new one it prompts for the password.
  • If I have let say 4 open connections and I close the first one (the one that prompted for the password), the fifth connection will be opened with the keys

Now that is an interesting problem. The answer supplied, with follow-on comments was also interesting, but the process behind solving this is even more fascinating, I think.

The issue is that password-less logins should work. sshd_config has been set properly, and there is a set of matching keys in authorized_keys.

But it doesn’t work, obviously – or there’d be no question raised.

A list of items to look into, both from the supplied answer, and from my own thoughts (somebody else beat me to an answer):

  • permissions on .ssh/authorized_keys (should be 600; sshd rejects the file if it, .ssh, or the home directory is writable by group or others)
  • verify sshd has been started/restarted post changes to sshd_config
  • check to see if home directory is remotely mounted / mounted on demand
  • check to see if key has a passphrase in use
  • look at /var/log/auth.log for errors
  • check to see if the home directory is encrypted (actual answer)
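The permission items in the list above can be checked mechanically. The following is a rough sketch, assuming a hypothetical helper named `check_ssh_permissions`; it only approximates what sshd does with its default strict checking, and it does not look at file ownership or at sshd_config:

```python
import os
import stat
from typing import List


def check_ssh_permissions(home: str) -> List[str]:
    """Return a list of likely permission problems under `home`."""
    problems = []
    targets = [
        (home, "home directory"),
        (os.path.join(home, ".ssh"), ".ssh directory"),
        (os.path.join(home, ".ssh", "authorized_keys"), "authorized_keys"),
    ]
    for path, label in targets:
        try:
            mode = os.stat(path).st_mode
        except OSError as exc:
            # A missing path is also what an unmounted or still-encrypted
            # home directory looks like from the outside.
            problems.append(f"{label}: cannot stat {path} ({exc.strerror})")
            continue
        # Flag anything that group or others are allowed to write to.
        if mode & (stat.S_IWGRP | stat.S_IWOTH):
            problems.append(
                f"{label}: writable by group or others ({stat.filemode(mode)})"
            )
    return problems
```

Calling `check_ssh_permissions(os.path.expanduser("~"))` returns an empty list when nothing looks wrong. Note that running it while already logged in says nothing about the encrypted-home case, because by then the home directory has been mounted; the check would need to run as another user (or root) while the account in question has no open sessions.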

Debugging is something I have written about recently – it seems to come up over and over in my line of work.

It’s a skill that’s vital to have in the IT world, and yet an awful lot of folks do not.

The answer, for those interested:

It sounds like, for whatever reason, the user’s home directory is not available if the user is not logged in, so that sshd can’t find the authorized_keys file

The user’s home directory must be using ecrypt or something like that

that’d be the cause, then, since sshd can’t decrypt the contents of the home directory

Ubuntu Desktop asks if you want to encrypt the home directory (why not?) without mentioning what it may do to ssh… a simple “note: this will effect SSH…” would be helpful

new connexions collection available

I have been working on my Connexions submissions again recently, and have a collection ready for use (it will be growing as time goes on): “Debugging and Supporting Software Systems”

I realize there are some small typos in the current text, but I will be addressing those in an upcoming revision 🙂

I’d love to get feedback from anyone on how it could be improved/expanded.


A few years ago I was working for Sigma Xi as an intern, and was introduced to the then-young Connexions project from Rice University.

This week I was reminded of the service, and have started looking into ways I can contribute to their open repository of educational materials.

I’d written two articles published there when I was at Sigma Xi, and while one of them now looks somewhat quaint and dated, I think there are some other areas that I could contribute to that would be helpful.

CNX is free and open to anyone to use, add-to, modify, and reference – so have fun 🙂