antipaucity

fighting the lack of good ideas

gaming expense reports? really?

At various stages in my career, I have traveled extensively – yet never even thought of “gaming” the expense reporting system the way it has been recently reported by CNN.

Being terminated over charging a movie to your room? Seems harsh (getting the $9.95 back from the employee would seem to be easier) – but breaking the rule is breaking the rule.

Being terminated over buying gum? Ok, so I WOULD terminate somebody over that … but I hate the stuff 😉

But it’s repulsive, revolting, and wrong
chewing and chewing all day long
The way that a cow does*

There are a host of ways listed in the article – that I find truly shocking – to cheat on expense reports: blank receipts? buying gifts and then selling them on eBay? double-billing? Wow. The sheer effort taken by some people to cheat is astonishing!

My current employer issues a corporate credit card to every traveling employee. The only time we submit non-AmEx charges is when a place doesn’t accept AmEx: it’s just way easier to use the corporate card than to assemble all the supporting documentation for a personal card. Plus, there’s the benefit that it’s not my personal limit that is affected if a customer delays in paying a bill.

Everyone that works where I do now also follows the expense guidelines we have – don’t exceed the IRS per diem rate for your region (on average). If you want to eat someplace nice for dinner – that’s fine. Just eat someplace less expensive the next day. Sticking within the rules isn’t that hard … so why would you want to try to evade them and end up with employment history issues like termination on your record?

defaulting pxe boots with hpsa

I recently found a very helpful nugget with regard to OS Provisioning with HP’s Server Automation product.

OS Prov is most typically done using PXE (or the similar bootp process). SA provides a PXE server that gives a boot menu to network-booted systems. That menu contains a variety of choices: linux, windows, winpe, etc.

In most environments, one particular OS will be dominant, and typically one particular version of that OS – whether it be RHEL 5 x64 or Windows 2008 R2, it’s usually just one that makes up the lion’s* share of systems on the network.

If you know that you will be provisioning, say, Windows 2008 R2 90% of the time, it would be nice not to have to pick it manually from the PXE boot menu every time.

Here’s what you need to do to make that happen (presuming you want to use Build Plans):

Edit the following file:

/opt/opsware/boot/tftpboot/pxelinux.cfg/default

For example, if you have this at the beginning of the file:

prompt 1
default local
timeout 100
display pxelinux.msg
implicit 0

Change it to this for the OGFS version of winpe64:

prompt 1
default winpe64-ogfs
timeout 100
display winpe64-ogfs.msg
implicit 0

You can use the above process – modified, of course – for any of the available boot images.
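If you would rather script the change than hand-edit it, here is a minimal sketch of the same edit. It assumes the file only has one top-level `default` line and one `display` line, as in the snippet above; the function name `set_pxe_default` is my own invention, not anything SA ships. It works on a string only, so reading and writing the real file (and backing it up first) is left to you.

```python
def set_pxe_default(config_text, label, message_file):
    """Return config_text with the 'default' and 'display' lines rewritten.

    Hypothetical helper: assumes one top-level 'default' directive and one
    'display' directive, as in the example from the post.
    """
    out = []
    for line in config_text.splitlines():
        lowered = line.strip().lower()
        if lowered.startswith("default "):
            out.append("default " + label)
        elif lowered.startswith("display "):
            out.append("display " + message_file)
        else:
            out.append(line)  # every other directive is left untouched
    return "\n".join(out) + "\n"


ORIGINAL = """prompt 1
default local
timeout 100
display pxelinux.msg
implicit 0
"""

updated = set_pxe_default(ORIGINAL, "winpe64-ogfs", "winpe64-ogfs.msg")
print(updated)
```

Swap in the label and message file of whichever boot image is dominant in your environment.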


*And if you’re provisioning Mac OS X 10.7, “Lion” makes up all the share 🙂

doing technical phone screens

Related to a previous post on career development, I thought it could be interesting to look at one approach to the technical screen that I have used over the past few years when interviewing candidates.

  1. for folks with no “real” experience yet, I ask them to rank themselves on a few key technologies on the “Google scale”
    • the range is 0..10 where a 0 is no knowledge, 1 is some, 10 is “you wrote the book”, 9 is you could’ve written the book, or you edited/contributed
    • on a few occasions, I have had folks ask to change their ranking from their initial [overconfident] statement to one that is much more in line with their true experience/comfort/knowledge level – and that’s OK in my book – honesty is always the best policy here
  2. a couple quick “about us” questions – open-ended inquiries that let the candidate tell me what they’ve done for work
    • this verifies their resume
    • gets them warmed-up for the rest of the call
    • allows the candidate to brag on something
  3. perhaps a couple quick probes to find out more about a specific experience
  4. a few basic / intermediate questions to assess candidate’s technical chops (ie, verify that their resume is accurate)
    • this goes along with my personal rule of “never put anything on a resume you don’t want to be asked about”
  5. open-ended, intentionally-vague questions to gauge problem solving ability, and methodologies
    • see how they go about refining the problem statement (if at all)
    • gauge estimation skills
    • gauge teamwork and delegation aptitude
  6. a few intermediate/advanced questions about an area they *don’t* know anything about – to gauge their response to unfamiliar/stressful situations
    • in my field in particular, it is impossible to know every new technology or even (probably) to be truly 100% aware of those that you do use every single day
  7. a few intermediate/advanced questions in their now-articulated fields of expertise (presuming I have any knowledge of those fields myself)
    • this verifies more of their stated (and unstated) job experience, and helps determine at what title/work level they should start
  8. lifestyle/workstyle questions
    • how much they enjoy travel
    • how they handle last-minute demands and “requests” by customers and management
  9. a few questions to gauge flexibility of response to changing requirements
    • for example, switching a project from being Solaris-based to Windows-based part way into implementation because a new CIO has come in, or new licensing is available, etc
  10. open time for them to ask me whatever they may wish to know that I can tell them
    • this usually ends-up being very short because the candidate was stressed-out over the interview, and can’t think of anything about the company they want to know on the spot

What I try to NEVER ask:

  • “trivia” questions – I bet there are C questions even K&R couldn’t answer 🙂
    • I guarantee I can ask you a question about your area of expertise you cannot answer…just like I guarantee you could do the same to me
    • since that is the case, trivia questions are pretty pointless, and more of an ego stroke to the asker than anything else
  • pointless “MindTrap”, lateral-thinking questions
    • riddles are fun – but only add to the stress of the interview (like “why are manhole covers round”)
  • pointless problem-solving and estimation problems
    • for example, “how would you move Mt Fuji”, or “how many gallons of water flow into New York Harbor from the Hudson River per hour”
    • estimation problems are wonderful tools and games to play, but not in an interview
  • illegal questions
    • sometimes they slip out, but it’s never intentional 🙂

I adjust my questioning to fit the situation, timing, and candidate responses – so it’s [somewhat] different every time.

When the interview is done, I write up my evaluation of the candidate and send it on to the hiring manager. In line with Joel Spolsky’s “Guerrilla Guide to Interviewing”, I make sure to put my firm conclusion of Hire/No-Hire near the top, and again at the bottom – with my reasoning in between.

One thing I have noticed about almost every interview I have ever taken or given is that I end up learning something in the process – and not just about the candidate (or company). It’s important to listen both to how the candidate responds to questions and to what they say.

So, if you ever get the chance to interview with me, you have an idea of how I’m going to run the show 🙂

http is a stateless protocol

The ubiquitous protocol that enables the internet as we know it, http, is stateless.

Stateless merely means that any given request has nothing to do with the previous or the next request. This enables the world wide web, as web servers do not need to keep track of who is receiving data, nor how much they have: they get a request, and ship data to the requestor.
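To make that concrete, here is a toy sketch of a stateless handler. It is not a real web server, and the `PAGES` table and `handle_request` function are made up for illustration. The point is that the answer depends only on what is in the request itself, so whatever was asked before or after makes no difference.

```python
from typing import Tuple

# Hypothetical content table standing in for whatever a server would serve.
PAGES = {"/": "home page", "/about": "about page"}


def handle_request(method: str, path: str) -> Tuple[int, str]:
    """Answer one request using only the request itself; nothing is remembered."""
    if method != "GET":
        return 405, "method not allowed"
    if path in PAGES:
        return 200, PAGES[path]
    return 404, "not found"


first = handle_request("GET", "/about")
handle_request("GET", "/")  # an unrelated request in between
second = handle_request("GET", "/about")
print(first == second)  # the earlier requests left no trace on the handler
```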

It is up to the requestor (often a web browser) to handle the incoming data.

If not every part of a web page, for example, is sent, the browser will display what it can.

This is analogous to a creditor sending you a bill (request), and you sending a check back to them – once the bill has been sent, the creditor knows nothing about the state of the bill until he receives a payment. Likewise, once the check is dropped in the mail, the payor knows nothing about his bill until the check clears his bank.

Why is this important? Because of an oft-repeated “request for enhancement” to the product I use on a daily basis. When the implementors of Opsware SAS were picking how a user should communicate with the system, they picked to run everything over http(s). They chose to utilize http because it’s commonplace, well-understood, and easy to work with.

One of the things about statelessness is that you cannot know how many people are using a given web page at the same time. Google cannot tell anyone how many people are actually looking at www.google.com at this moment. They can tell you how many loaded it, and how many just pressed “Search”, but they can’t know what percentage of the loaders promptly went elsewhere – either to a different page, or a different room in their home.

One way around the statelessness of http is to utilize cookies or session data – but that merely adds a check layer to the interaction, it does not provide true “statefulness”.
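Here is a rough sketch of what that check layer amounts to, assuming a simple in-memory table keyed by a cookie token. The `SessionStore` class and its method names are hypothetical and have nothing to do with how Opsware SAS actually does it. Timestamps are passed in by hand so the example is repeatable.

```python
import secrets


class SessionStore:
    """Hypothetical in-memory session table keyed by a cookie token."""

    def __init__(self):
        self.last_seen = {}  # token -> (user, time of the last request)

    def login(self, user, now):
        token = secrets.token_hex(16)  # the value that would go in the cookie
        self.last_seen[token] = (user, now)
        return token

    def touch(self, token, now):
        """Record a request carrying this token; False if the token is unknown."""
        if token not in self.last_seen:
            return False
        user, _ = self.last_seen[token]
        self.last_seen[token] = (user, now)
        return True

    def recently_seen(self, now, window):
        """Users whose last request arrived within `window` seconds of `now`."""
        return sorted({u for u, ts in self.last_seen.values() if now - ts <= window})


store = SessionStore()
alice = store.login("alice", now=0)
bob = store.login("bob", now=0)
store.touch(alice, now=500)
print(store.recently_seen(now=600, window=300))  # ['alice']
```

All the server really has is the time of each token’s last request. Bob may still be staring at the page he loaded at time 0, or he may have gone to lunch, and the table looks the same either way.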

Several times during my time in Support at Opsware (and after HP’s acquisition), I would have a customer who was looking for the ability to determine who was logged-in at any given time (in similar fashion to running `w` or `who` or `finger` on a Linux/Unix system). This could be important to know whether a user is “doing something” before doing an application restart.

However, since communication is all done via http, there can be no state known in the tool. Once you load a web page, it is being viewed/rendered on your local machine in your web browser – the server could be shut off, your network connection removed, or any of a host of other simulations of restarting the application. And your browser would be none-the-wiser, nor should it be: it has the data it requested/received, and you’re doing something with it.

This carries over to the product I work with. Jobs might be scheduled by a user to run every day at 0200 – but he doesn’t need to be actively logged-in to have them run. Likewise, someone may have logged-in, but is not “doing” anything currently (maybe they’re at lunch).

Another case of why technical intricacies matter 🙂

technical career development

Career development. Career path. Development opportunities. Taking your career to the next level.

Terms and phrases we all hear and pretty much pass over in our day-to-day lives. Right up until we want to move to a new/better job or performance reviews roll around.

But what do they mean, and how can you advance your career (presuming, of course, that you want to)?

This is by no means an exhaustive list – indeed, I’d appreciate any other ideas / feedback / improvements y’all may suggest 🙂

For a software developer:

  • be the documentation KING of your code – if it’s not right, make it right
  • own every bug in your code – even when it’s not “yours”
  • be The Guy™ who learns a new component of the code/product (at least conversationally) every few weeks
  • write at least one tutorial a month on the internal wiki/kb about something you found or did with the code
  • write at least one tutorial or similar a month externally (maybe a personal blog) in a general fashion about something you learned or did

For a systems consultant:

  • be the documentation KING of every project you work on – make ABSOLUTELY sure the next guy can do more after you leave
  • own every issue you find, even when it’s really somebody else’s problem (no throwing it over the fence)
  • be The Guy™ who learns something new about the environment or product every couple weeks
  • write at least one tutorial a month and/or give an overview talk of something you learned/did
  • write about what you’ve done (changing names to protect the innocent) on a blog or elsewhere
  • teach as many people as are willing to learn what you know (in your company / on your team / etc)

Focus – decide where you want to be, and plot a course to get there.

Finally, NEVER make yourself “irreplaceable” – the instant you make yourself irreplaceable, you also make yourself unpromotable: after all, if you’re the Only Guy™ who can do your job, why would your boss/manager/supervisor even think of moving you into a new role?


As a side note – if you’re ever working at a customer site, don’t take calls from anyone other than the customer while you’re at your desk/cube/workspace: even if it’s project related, take it in a different room 🙂

ogsh/ogfs for fun and profit

The absolute coolest feature of HP’s Server Automation suite is the OGSH (or OGFS) – the Opsware Global SHell (or FileSystem).

I worked for Opsware before HP acquired them, and the OGSH was a new feature to the product (then called Opsware SAS (Server Automation System)). It’s a FUSE module that gives a [limited] bash interface to the managed environment by presenting a live query/view into the database, and, ultimately, allowing manipulation of managed servers in the environment.

For example, to access a list of all managed servers, you login to global shell, then

cd /opsw/Server/@

The ‘@’ sign is used to indicate you are “there” – at the limit of that particular filter (in this case, “Server”).

Since it’s bash, you can run most common *nix utilities and commands. But the one that’s most handy, in my opinion, is rosh – the Remote Opsware SHell.

Remote shell opens an authenticated, logged session to a remote machine (*nix or Windows – doesn’t matter), based on your user’s/group’s permissions. For testing purposes, I always configure one group (and add myself) that can connect using root for *nix machines (and Administrator on Windows).

The basic command to connect to a machine is:

rosh -l [username] -n [machine]

You can also pass commands to rosh like it was an ssh session:

rosh -l [username] -n [machine] '[command]'

For the fullest power of rosh, though, use it in a script or loop. For example:

for sn in *; do rosh -l root -n "$sn" 'uptime ; uname -a'; done

That will remote shell into every server in the current view, using standard shell expansion of the splat (*), and run uptime and uname -a, printing the results to screen. That particular command is handy for quick-and-dirty reports on the managed environment to see:

  • which servers are up, and which aren’t
  • how long they’ve been up
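If you want to keep the results rather than just read them off the screen, the same loop can be wrapped in a script. What follows is only a sketch under a few assumptions: that a Python interpreter is available wherever `rosh` is on the PATH (which may not be true of your global shell session), and that `rosh` exits non-zero when it cannot reach a server. The helper names `rosh_command` and `collect` are mine. The `-l` and `-n` flags are the ones shown above.

```python
import subprocess


def rosh_command(user, server, remote_command):
    """Build the argument list for one rosh call (-l and -n as shown above)."""
    return ["rosh", "-l", user, "-n", server, remote_command]


def collect(servers, user="root", remote_command="uptime ; uname -a", run=None):
    """Run the command on each server; return {server: output, or None on failure}."""
    if run is None:
        def run(argv):
            # Assumption: a non-zero exit status means the server was unreachable.
            done = subprocess.run(argv, capture_output=True, text=True)
            return done.stdout if done.returncode == 0 else None
    return {name: run(rosh_command(user, name, remote_command)) for name in servers}


# Dry run with a stand-in runner, so this works without rosh installed.
def fake_run(argv):
    return None if argv[4] == "db02" else "up 12 days"


report = collect(["web01", "db02"], run=fake_run)
down = [name for name, output in report.items() if output is None]
print(down)  # ['db02']
```

Drop the `run=fake_run` argument to call the real `rosh`. The server names here are placeholders.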

In addition to rosh, global shell provides a near-complete exposing of the SA API (which is also accessible via Java, web services, and Python (using the “PyTwist” bindings written to access the Java interfaces)).

the ticket smash, raw metrics, and communication – how to have a successful support organization

When I worked at Opsware, and for a while after HP bought us, we used to try to have once- or twice-a-week meetings for each support group wherein we would bring our most difficult cases (with the difficulty being determined by the case owner), and have an opportunity for everyone on the team to ask questions, contribute, and maybe even solve the problem our customer was having.

Novel idea, isn’t it? The typical Support team is driven by stats – the number of tickets in their queue, age of the ticket, number solved/closed, number escalated, etc. Support is driven by these numbers because managers don’t think of any better way to do it.

All things being equal, if you can close 40 cases in a week, that’s a lot better than your podmate who “only” finished-out 12. But what about the complexity of each of those cases? And how much effort did each engineer put into them? Did the customer come back and ask for it to be closed because it’s either no longer an issue, or they solved it themselves? Is it a question that can be answered with a reference to a specific page/section of a manual? Or was it a problem that took multiple webex engagements, and dozens of contacts back and forth to find a solution because it was a deep bug?

Theoretically, the goal of “support” is to, well, support – get the problem reporter a solution of some kind they can use. That solution may be a bug fix, an RFE, a reference to a tutorial, reconfiguring, or a work around / alternative approach to their problem. A big problem with this setup is that the reporter rarely asks the right question. They ask about what they have already decided the problem is – but by biasing their initial report, they can often end up dragging out the solution process far longer than it should take. I recently wrote a guide on creating effective support tickets, based on my experience working in support, and interacting with various support organizations both before and since.

Reporter bias is the hardest issue to overcome, in my opinion; engineer bias is easier to get past because (hopefully) there are folks you can bounce the problem off of in the team who can help narrow-down the problem and find a solution … or at least figure out where to try looking next.

Communication is the key to solving problems – when I was at Opsware we utilized internal IRC channels and (gasp!) talking with each other to try to find solutions to customer issues. We also spent a lot of time wording inquiries to the reporter to try to gain as much information as possible on each iteration of the communication process.

Another key to solving problems was to make records of cases with the following:

  • initial reported behavior (or lack thereof)
  • actual problem
  • solution

Those records were sometimes on wiki pages, sometimes in our Plone internal KB, and sometimes got “promoted” out to the customer-facing KB. All of these approaches helped us get problems solved faster – either by offloading the “work” to the customer (via a KB reference), or by being able to apply previous answers more quickly when new-but-similar/identical problems were reported.

The end goal of a support team is not to outdo one another on how many cases one engineer has in his queue, or how many another has closed – the end goal is to solve customer problems. “Works well in a team setting” is a qualification typically associated with support engineering employment listings – but all too often that gets reduced to a cliche that practically means “tries to outdo his cubemates by closing more cases than the next guy”.

I’m as much a fan of personal responsibility and action as the next red-blooded capitalist, so don’t take this next section to imply I’m promoting communalism.

The way a support team should work is the way [good] sports teams work, or the way a NASCAR team operates: yeah, it’s the driver of the car who gets the “glory”, but without his pit and maintenance crew, he’d be no better than you or me going to the grocery store. Any given support engineer gets to have his name tagged to the case for posterity – both with the good things he did, and the not so good ones. But since the goal is really to get the customer’s problem addressed, the ego of the engineer needs to be removed from the equation.

Bob Smith might be “the guy” who informed his customer of a solution, but generating the solution involved the other 7 people in his office. He gets the “fame” from Universal Widgets LLC, but he was just one of the [important] cogs in the process of resolving the issue.

The number of cases Bob has in his queue should have [almost] ZERO correlation to his skill as a technical engineer: it’s the 7 people behind him whom he can ask and brainstorm with that get the job done.

Maybe Bob gets to handle most of the “customer” action, but the other 7 are writing bug reports, solutions articles, etc. When evaluating that team, management needs to do just that: evaluate the team first, and the individuals second.