
that’s right – we’re not falling behind

There was an article recently in Business Week (here) on how the US is not falling behind in math, science, and technology. In fact, we seem to be turning out more technologists and engineers than we can use. I disagree.

The problem seems to be that those technically-minded people that US schools are churning out won’t work for “starting” wages. There’s a glut of tech jobs. And, ironically, there’s a glut of tech people. But there are few tech jobs that pay what qualified people want.

Imported tech workers, such as from China and India, are willing to work for less money than their American counterparts. They aren’t necessarily any better or worse at the job, but they’ll work for less money.

Guess what? That means that American graduates end up not getting jobs in-field, or get underpaid.

a perfect hash function?

As I was walking to get my turkey pot pie today that was cooking in the microwave in our break room, I looked at the parking lot below and realized that parking lots are approximately perfect hash functions.

Think about it: cars come in in some semi-random order; spaces are available in semi-random fashion; cars park; and the owner comes back to the same spot to retrieve the item later. Admittedly, it isn’t necessarily replicable every day – but it’s an approximation.

Perhaps a better example would be a professor who tells his students on the first day of class to remember where they are sitting, because that’s their seat for the rest of the semester. The spaces were filled in random fashion once, then always in the same way in the future: if Sarah isn’t in class, her slot is empty – it doesn’t get filled by anyone else because they’re in their slots.

The real trick will be to figure out how to replicate this behavior functionally.
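Here's a minimal sketch of the classroom version, just to make the property concrete. The names (`SeatingChart`, `assign`, `attendance`) are made up for illustration, and it leans on Python's built-in dict to remember who sits where, so it shows the behavior (no collisions, same slot every time) rather than answering the real question of how to compute the slot from the key alone.

```python
class SeatingChart:
    """First-day seating: each new name claims the next free seat for good."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.seat_of = {}  # name -> seat index, never changes once assigned

    def assign(self, name):
        """Seat a student; repeat visitors get the seat they already own."""
        if name in self.seat_of:
            return self.seat_of[name]
        if len(self.seat_of) >= self.capacity:
            raise ValueError("no seats left")
        seat = len(self.seat_of)  # next unused seat, so no two names collide
        self.seat_of[name] = seat
        return seat

    def attendance(self, present_names):
        """Return the row of seats for one day; absent students leave gaps."""
        row = [None] * self.capacity
        for name in present_names:
            row[self.seat_of[name]] = name
        return row


chart = SeatingChart(4)
for student in ["Sarah", "Bob", "Ana"]:
    chart.assign(student)

# Sarah is out today: her seat stays empty instead of being reused.
today = chart.attendance(["Bob", "Ana"])
```

The catch is the table itself: a true perfect hash function would get from "Sarah" to her seat without storing the assignment anywhere.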

i know why search is broken

Search is broken. Google, Yahoo, Ask, AltaVista, and on and on the list goes.

Hundreds of companies, thousands of individuals. I know why search is broken, and I know what needs to be fixed. Now to figure out the how of fixing.

When you’re looking for information, you search on keywords. Google’s been nice enough to rank results by ‘popularity’ (yeah, it’s called PageRank, and it’s proprietary, but it’s a popularity/relevance ranking). The problem is that you have to know what keywords were used. Some places are nice enough to suggest spelling fixes (it’s not ‘brittany spears’, it’s ‘britney spears’).

But that’s not the issue. The issue is that you don’t know what word, term, or phrase to look for. You have the concept you need to find, like ‘module’. Except you don’t think of that word, you think of ‘chunk’. Bam! You’re out of luck: no author would use the word ‘chunk’ when they mean ‘module’, right?

To fix search, we need to search on not just the keyword, but the concept. In English, you’d use a thesaurus.

So, you’re thinking: “This is easy! I’ll just build a comparator that looks at the keyword and then goes through an index of a thesaurus and finds stuff. And we’ll all be rich!”
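That naive idea might look something like the sketch below. The synonym groups and sample documents are invented for illustration (a real thesaurus would be far bigger), and the matching is plain whole-word lookup, nothing smarter.

```python
# Hypothetical toy thesaurus: each set is a group of interchangeable words.
SYNONYM_GROUPS = [
    {"module", "chunk", "component", "unit"},
    {"car", "automobile", "auto"},
]


def expand(term):
    """Return the term plus every word that shares a synonym group with it."""
    terms = {term}
    for group in SYNONYM_GROUPS:
        if term in group:
            terms |= group
    return terms


def search(query, documents):
    """Return the documents containing the query word or any of its synonyms."""
    wanted = expand(query.lower())
    return [doc for doc in documents if wanted & set(doc.lower().split())]


docs = [
    "Load the kernel module first",
    "Chunk the file into pieces",
    "Nothing relevant here",
]
hits = search("chunk", docs)  # now also finds the 'module' document
```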

Hold it, buster. You missed something. This is a perfectly valid English sentence, and you can figure out what I’m saying, too: “Bring me the cooler cooler cooler from the cooler’s cooler.” Cooler is used five times, with the following meanings (at least): hip, less warm, box to keep things cool, jail cell, big refrigerator.
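To see why that hurts, here's a small sketch using a made-up sense table for 'cooler'. The sense labels and synonyms are my own illustrative picks, not from any real thesaurus. Expanding blindly across every sense means a search for a picnic cooler drags in jails and refrigerators too.

```python
# Hypothetical sense inventory: one word, five unrelated meanings.
SENSES = {
    "cooler": {
        "hip": {"trendier", "hipper"},
        "less warm": {"colder", "chillier"},
        "insulated box": {"icebox", "ice chest"},
        "jail": {"jail", "lockup"},
        "walk-in refrigerator": {"refrigerator", "fridge"},
    }
}


def expand_all_senses(word):
    """Naive expansion: pile together the synonyms of every sense of the word."""
    expanded = {word}
    for synonyms in SENSES.get(word, {}).values():
        expanded |= synonyms
    return expanded


def expand_one_sense(word, sense):
    """What we actually want: only the synonyms for the intended meaning."""
    return {word} | SENSES[word][sense]


everything = expand_all_senses("cooler")
picnic_only = expand_one_sense("cooler", "insulated box")
```

The second function is the easy part. Knowing which `sense` to pass in, from nothing but the query, is the problem.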

That’s the problem with trying to fix search. Words can mean far too many things in English. But here’s your big chance to figure out a solution: I’ve told you the problem, and I’ve given you the target.

Now go make it work.

oh vista, vista, whyfore art thou vista?

I’ve been playing with Windows Vista Beta 2 recently on my home computer, and my overall impression is pretty blah. I must agree with many other reviews I’ve read that it’s really XP SP3. The eye candy is nice (taken from Apple and the OSS world), but nothing worth upgrading over. The new Start menu is better laid out, but again – not worth upgrading for. User management is a bit better, and the sidebar is a spiffy feature – but you can already get that for free with either Google Desktop or Konfabulator.

I kinda feel sorry for the engineers at Microsoft who’ve poured millions of man hours and years of effort individually into this new edition of Windows – there’s no compelling reason for anyone I know to buy it.

When you factor in the minimum system requirements (and you lose a lot of eye candy if you go with the minimums) – 1.5 GHz CPU, 512 MB RAM, 64 MB video card, 16 GB free drive space – the system is hogging all the basic resources of any new computer. Budget-minded consumers who snag Dell’s latest weekend special won’t have enough oomph to run Vista. XP Pro runs fine on a system with 256 MB RAM and a 1 GHz CPU (I should know – one of my home boxes is such a beast). I do not see any reason why this “upgrade” has to be such a resource hog.

Sure, power users, gamers, and businesses will buy machines that can run Vista well – but Vista is going to be sucking the life out of those systems so those self-same buyers will end up needing even beefier hardware to get the “most” from their computing experience.

It’s sad when I can install any other desktop OS (distros of Linux with heavy or light window managers, XP Pro, OS X, Zeta, etc.) on a system with 256 MB or 512 MB of RAM and expect it to run acceptably – along with all the apps I need to use – but Microsoft has to push its customers into machines formerly relegated to true heavy users (gamers, developers, etc.).

Maybe some miracle will happen in the next several months and Vista won’t demand so many resources – but I’m not holding out for one.

virtually speaking

I’ve gotten very interested in virtualization technology recently. There’s a high probability I will be working with VMware this summer, and several of my websites (including this one) run on a virtual private server provided by Tektonic, running CentOS 3 through Virtuozzo.

Virtualization is a fascinating concept. Instead of needing gobs of physical servers, you run operating systems through a virtualization layer, so several servers can run off one physical piece of hardware. With several options available – including Xen, VMware, Virtuozzo, User-Mode Linux, and Virtual Server – deciding on a particular route is difficult at best. Depending on your budget, actual server OS requirements, and available physical hardware, any of the above may end up being a viable option.

Because several guest operating systems will be running inside or on top of the host virtualizer, underlying hardware generally has to be pretty hefty. However, some of the available virtualization options will allow as many as 100 guest operating environments – so installing just a few high-end servers can replace potentially dozens or hundreds of pieces of hardware.
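The consolidation math is simple enough to sketch. The function name is made up, and the 100-guests-per-host figure is just the upper end mentioned above; real density depends entirely on what the guests are doing.

```python
import math


def hosts_needed(guest_count, guests_per_host):
    """Smallest number of physical hosts that can hold all the guests."""
    if guests_per_host <= 0:
        raise ValueError("guests_per_host must be positive")
    return math.ceil(guest_count / guests_per_host)


# 250 lightly loaded servers at up to 100 guests per host
optimistic = hosts_needed(250, 100)

# the same 250 servers if each host can only carry 10 heavier guests
conservative = hosts_needed(250, 10)
```

Even the conservative case turns 250 boxes into 25, which is where the appeal comes from.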

Solutions such as the new edition of VMware ESX Server are actually smart enough to automatically shift virtual instances from one piece of physical hardware to another based on server load, or in the event of hardware problems.
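I don't know how ESX Server decides what to move, so the following is only a toy illustration of the general idea of load-based shifting, not VMware's method. The host names, VM names, load numbers, and the 0.2 threshold are all invented.

```python
def rebalance_once(hosts, threshold=0.2):
    """Move one VM from the busiest host to the idlest, if that helps.

    hosts maps host name -> {vm name: load as a fraction of one host}.
    Returns (vm, source, destination), or None if nothing was moved.
    """
    load = {host: sum(vms.values()) for host, vms in hosts.items()}
    src = max(load, key=load.get)
    dst = min(load, key=load.get)
    gap = load[src] - load[dst]
    if gap <= threshold or not hosts[src]:
        return None  # already balanced enough

    # Moving a VM with load x changes the gap to |gap - 2x|;
    # pick the VM that leaves the two hosts closest together.
    vm = min(hosts[src], key=lambda v: abs(gap - 2 * hosts[src][v]))
    if abs(gap - 2 * hosts[src][vm]) >= gap:
        return None  # no single move would improve things

    hosts[dst][vm] = hosts[src].pop(vm)
    return vm, src, dst


cluster = {
    "host-a": {"web": 0.5, "mail": 0.3},
    "host-b": {"db": 0.1},
}
first_move = rebalance_once(cluster)   # shifts 'mail' over to host-b
second_move = rebalance_once(cluster)  # loads are now close, so no move
```

A real product also has to copy the running guest's memory across the network without dropping connections, which is the genuinely hard part and isn't modeled here.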

User-Mode Linux, aka UML, is actually Linux ported to run on an abstract hardware standard implemented in Linux – so it’s Linux ported to run on itself. Now that hurts to think about.

As I get more personal experience with virtualization technology, I’m sure I’ll be writing more about it.

containerized datacenters

Expanding on Cringely’s posts late last year (first, second), I was wondering why companies don’t offer turn-key datacenters for businesses.

Imagine, for a moment, that you were in need of several servers – email, web, HR, inventory, file storage, applications – and support architecture – routers, switches, firewalls, etc. Locating suppliers for all of these can be a very time-consuming process, and if everything is not purchased at the same time, you can run into compatibility issues. So, why not have a business whose sole purpose in life is to integrate datacenter needs for customers, and then deliver those datacenters ready to roll?

For example, let’s say you need to provide email for 5000 users, handle user authentication for workstations, serve a medium-use website (>10000 hits/day), manage documents, and handle human resources-related stuff (employee contracts, sick/vacation time use, benefits, time tracking). From my understanding, a typical organization that needs to do this will solicit proposals from several vendors, fight its internal bureaucracy over how much should be spent, what OS to use, etc., and then finally start purchasing equipment after several months. In a perfect world, the vendor supplies support and training to administrators so they can run the hardware for their organization, but otherwise leaves the ‘real’ work up to the customer.

I think a very profitable business could be run in which a vendor receives such a request from a customer and, instead of anyone having to worry about which hardware goes in which closet or whether there is enough rack space already, provides the entire package in a container that can be delivered via truck (or train). Said container could include its own HVAC unit, and only need a couple of connectors to the outside world to become a ‘usable’ server room when it’s delivered.

My vision for this is to install lots of rack space into a default arrangement in the container, preroute cooling and ventilation ducts, wire the whole container for power, phone, and network, and install insulation inside the container, so that the HVAC unit won’t be working overtime to keep the box cold.

Containers have lots of space inside of them, and could easily be used to hold dozens of servers, storage units, and networking infrastructure hardware. Once a customer settled on what they need in terms of current and future capacity, minimum networking requirements, OS, etc., the vendor would install all of the necessary hardware into the racks inside the container and install non-proprietary software onto the hardware – basically everything the systems administrators would have to do when the hardware arrived at their location. The vendor would then just close the doors on the container, hire a trucking outfit to deliver it, and have it dropped off at the customer’s location.

All that would be left for the customer would be to decide where they wanted their datacenter, connect power and network, and turn it on.

What do you think?