Over the last few months, I have repeatedly stumbled upon mentions and articles about the infamous “partition alignment problem”, which can cause serious I/O performance hits, particularly in virtualized environments and on RAID-based disk subsystems.

I won’t start from scratch trying to explain what the problem is, since many gurus have already done an amazing job of describing the issue. My favorite post on the topic is Duncan’s: http://www.yellow-bricks.com/2010/04/08/aligning-your-vms-virtual-harddisks/ , where he clearly explains the issue behind partition alignment, and (particularly for VMware environments) what you should and could do to prevent/fix it.

The easy part in this scenario is aligning VMFS partitions: as long as you use vCenter Server to create the partition, it takes care of proper alignment, and that side of the problem goes away. Then there’s the “guest side” of the issue, where each and every OS takes a different approach to partition alignment. Basically, partitioning has always been approached using the CHS (Cylinder, Head, Sector) technique, which doesn’t consider the blocks and tracks where the actual data is written. It’s good for its simplicity, but definitely not good performance-wise.

For example, common Linux distros and even MS Windows Server 2003 misalign partitions, since they rely on cylinder boundaries. Windows Server 2008, by contrast, aligns partitions to 1MB, which is generally safe regardless of the storage array that sits behind your virtualized environment (different storage arrays have different chunk sizes, so your “alignment needs” may vary – refer to your storage vendor’s docs to find out what your chunk size is).
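As a quick sanity check you can do yourself, a partition is 1MB-aligned when its start sector is a multiple of 2048 (2048 sectors of 512 bytes = 1MB). The start sector below is a hypothetical value; on a live Linux box you would read the real one with `parted -s /dev/sda unit s print` or from `/sys/block/sda/sda1/start`:

```shell
# Hypothetical start sector: classic CHS-style partitioning starts
# the first partition at sector 63, which is NOT a multiple of 2048.
START_SECTOR=63
if [ $(( START_SECTOR % 2048 )) -eq 0 ]; then
    echo "sector $START_SECTOR: aligned to 1MB"
else
    echo "sector $START_SECTOR: misaligned"
fi
```

A Windows Server 2008 style layout starts at sector 2048 instead, which passes the same check.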

Ok, so we’re talking about Linux here, and in my case, RHEL environments. What should I do to:

a) install properly-aligned systems

b) fix already installed systems w/ misaligned partitions.

Before I go on, let me underline the fact that while this may seem like a “tweak”, and maybe not an easy/cheap one, it can make a huge difference, particularly if you consider the many guests that waste I/O resources by unnecessarily stressing your consolidated storage array. Adopting partition alignment as a best practice can improve your performance and save you bucks!

In this post I’m gonna show you how we solved the first scenario, by preparing a customized kickstart file that takes care of the partition alignment process. At least this allows us to avoid creating new VMs that will need to be fixed sooner or later 🙂 .

In a future post I will describe the procedure we’re taking in order to fix already installed and misaligned systems (we’re still evaluating the different possibilities).

The kickstart file

Our approach to Linux installs is to use a PXE server to boot the newly created VM, and to install the Linux guest over the network. Setting up a PXE boot server is easy and covered in great detail throughout the web, so I won’t bother you with that part of the process. The good thing here is that RHEL systems provide you with a way of customizing the installation process beforehand, so that nearly every aspect of it can be automated (partitioning, GRUB installation, package selection, post-installation activities, etc.). On one hand this speeds up your installations A LOT, and on the other it gives you an automated/repeatable setup (and maybe the chance to align partitions properly 😉 ).

Regarding the “partitioning issue”, we discovered that the default partitioning methods available in the RHEL installer simply can’t be used, since they rely on the CHS mechanism and thus align partitions to cylinder boundaries, which is not what we need. We need to align to sector boundaries, and to choose the exact sector we want our partitions to start at.

The “%pre” section in the kickstart file lets you run commands before the installation process starts (I’ll let you guess what the %post section does 😉 ). So we decided to dig into this, and to use GNU parted to define the partitions the way we wanted.
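For context, here’s roughly where %pre sits inside a kickstart file. This skeleton is illustrative only: the install-tree URL and the omitted directives are placeholders, not our actual config:

```
install
url --url http://your-pxe-server/rhel/    # hypothetical install tree
# ... language, keyboard, rootpw, %packages, etc. ...

%pre
# shell commands placed here run BEFORE the installer touches the disk

%post
# shell commands placed here run AFTER package installation
```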

So, here’s a sample %pre section that does the whole job:

%pre

# section 1: size the swap partition based on the available RAM
TOTAL_RAM=`expr $(free | grep ^Mem: | awk '{print $2}') / 1024`
if [ $TOTAL_RAM -lt 2048 ]; then
        SWAP_SIZE=2048
else
        SWAP_SIZE=4096
fi

# section 2: wipe the partition table and create an empty msdos label
dd if=/dev/zero of=/dev/sda bs=512 count=1
parted -s /dev/sda mklabel msdos

# section 3: compute aligned partition boundaries (in 512-byte sectors)
TOTAL=`parted -s /dev/sda unit s print free | grep Free | awk '{print $3}' | cut -d "s" -f1`
SWAP_START=`expr $TOTAL - $SWAP_SIZE \* 1024 \* 2`
SWAP_START=`expr $SWAP_START / 64 \* 64`
ROOT_STOP=`expr $SWAP_START - 1`
parted -s /dev/sda unit s mkpart primary ext3 128 $ROOT_STOP
parted -s /dev/sda unit s mkpart primary linux-swap $SWAP_START 100%

What are we doing here?

In this simple scenario, we want to detect the amount of RAM available on the system in order to size the swap partition. Then we want to split our disk into two partitions, one holding / and the other holding the swap.

The first section calculates the amount of available RAM and picks one of two possible swap partition sizes: 2GB for systems with <2GB of RAM, 4GB for larger systems.

Then we get to the interesting part:

The second section starts by erasing the partition table (by using dd) and creating an empty one.

The third section calculates the number of sectors available on the disk, then calculates the start sector for the swap partition, which sits one sector past the end of the root partition.

If you look at the math being done here (okok, I know… it's a quickie 😉 ), we're subtracting N GB (in sectors) from the end of the disk, dividing by 64 to get an integer, and multiplying back by 64, to get a sector number that is aligned. This also gives us an aligned swap partition! Subtracting one sector from this figure gives us the end sector of the root partition. The remaining stuff is easy: we define the two partitions: the first starts at sector 128 and runs up to SWAP_START – 1, the second is the swap and runs from SWAP_START to the end of the disk.
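To make the math concrete, here's the same calculation run standalone with hypothetical numbers: a disk of 41,943,103 sectors (just under 20GB) and a 2GB swap:

```shell
TOTAL=41943103     # total sectors reported by parted (hypothetical disk)
SWAP_SIZE=2048     # swap size in MB
# MB -> 512-byte sectors: *1024 gives KB, *2 gives 512-byte sectors
SWAP_START=$(( TOTAL - SWAP_SIZE * 1024 * 2 ))   # 37748799, not aligned yet
SWAP_START=$(( SWAP_START / 64 * 64 ))           # 37748736, a multiple of 64
ROOT_STOP=$(( SWAP_START - 1 ))                  # 37748735
echo "root: sectors 128-$ROOT_STOP, swap: sectors $SWAP_START-end"
```

Note how the divide-then-multiply step rounds 37748799 down to the nearest multiple of 64, which is what keeps the swap partition aligned.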


Mike Laverick’s ‘stupid IT’ series ( http://bit.ly/bY5vxS ), which I liked a lot, inspired me to write this post. But this one isn’t directly related to IT stupidity. Instead, it’s more about the great flexibility that comes with modern IT technologies, and the chaos, panic, and disasters that ‘old-style-IT-guys’ create by not grasping the potential and understanding the inner dynamics of these new multi-dimensional environments.

‘Multiple dimensions’ is one key factor here, the other one being flexibility in those dimensions. Many IT environments have multiple dimensions, and those dimensions are there for a reason: they address specific requirements, and have unique features that involve planning, managing, monitoring, etc in a very specific way. Each dimension of course is part of a whole, and ‘control’ is what you get when you understand and manage ‘the whole thing’ ( this is ‘governance’, in the parlance of our times 😉 ).

A quick example: storage systems are multi-dimensional in their very nature, even if some single-neuron/single-threaded/mono-dimension IT guys think that storage = a bunch of rotating rust that’s supposed to hold data, eventually. Dimensions here are: protection level (RAID), throughput (IOPS), bandwidth (Gbps), scalability, tiering capabilities (automated or not), just to name a few. Add to that storage-related functions, such as SANs or replication, and a whole lot of new dimensions come into play, with varying degrees of impact on the overall architecture.

Some may argue that this multi-dimensional nature leads inevitably to complexity. I believe it mostly leads to flexibility (and coolness, but that’s the geek in me speaking). Flexibility is imho what makes an architecture really successful, as it allows for change and growth when your business requirements vary: that’s why enterprise architectures exist, after all.

Flexibility comes at a cost, indeed: enterprise architectures force you to think in many ways at the same time, and require careful planning, operation and monitoring of the whole thing, to keep it current and aligned to business goals/requirements. That’s where the ‘rigid flexibility’ comes from: the multi-dimensional nature is there, whether you want it or not. It’s up to you to get the whole picture and transform flexibility into business advantage (or instead take everything for granted and prepare to suffer massive pain 😉 ).

I had the luck of living through the growth of such an architecture, from the very basic needs (a few standalone servers) up to a mature environment (consolidated storage, replication, D/R-B/C, virtualization, and so on). That allowed me to deal with one (well, sometimes more than one) dimension at a time, and to fit the new scenarios into the whole design.

I like to think that one of my tasks is to keep up to date with technology and with business requirements, so that both keep converging. If I look back on what we’ve done with our architecture, it has really been an evolving journey towards flexibility. Accordingly, the ‘rigid flexibility’ I mentioned is a good thing imho, since it forced us to broaden our thinking instead of letting us choose some shorter (and maybe simpler) path.

Obviously, pushing towards the cutting/bleeding edge isn’t always well accepted by coworkers and users (I don’t really understand the need to sit and watch as most do, but I take it as a component of the whole environment), but by taking the risk of breaking some eggs, I’ve seen a good number of well-made omelettes.
Actually, knowing which eggs you should break comes down to knowledge, again. The more you know your architecture, and the products and solutions the world has to offer (and where they’re going, too), the more successful you can be in planning and designing your own evolution. That’s why plan & design is soooo crucial nowadays (and that’s why a VCDX is a killer figure 🙂 ). Sure, flexibility allows for remediation, but that’s something you’d like to avoid if possible, right?

Ironically, what I’m seeing lately is that while there are some great vendors that clearly show their faith in and endorsement of innovation, some others are actually pulling the handbrake and slowing things down, unacceptably. This is done by applying ridiculous/anachronistic licensing or support policies (yes, ORCL, I’m talking to you), or by throwing tons of FUD against mainstream/leading technologies such as virtualization.

While this pisses me off (A LOT), I usually take a deep breath and try to remind myself that this is evolution. Evolution will allow those who embrace flexibility and innovation to succeed, and at the same time will leave in the dust those sitting on their golden support contracts and once-shining technology.

So, dear vendor, since I have no intention of standing still, you’d better not stand still either… Or I’ll look elsewhere
>:-D

It’s 2010 (according to reliable sources 😉 ), and I can hardly believe that there are still sooo many software vendors that freak out as soon as I propose virtual servers for hosting their apps (virtual has been our default for the last 2+ years).

Whether they’ve been hit by evil FUD or are simply (and that’s the vast majority of cases) unaware of what virtualization is nowadays, I simply don’t accept it, unless there are SERIOUS reasons (e.g. insanely huge performance/capacity requirements, which account for .00001 of our workloads – to be fair – or a dumb support/licensing policy (ORCL, can you heaaar meeee?)).

So, virtual is the way to go. But as a *seasoned* 😉 IT admin, I know my chickens, and know well that sooner or later they’re going to show up and blame the virtual environment for their app’s poor performance, and waste my precioussss time to prove they’re wrong (I’m the server guy and they’re the software monkeys… They should already know that it’s their fault… But they’re monkeys, so… 😉 ).

vBastard mode to the rescue. Joking apart, since my assumptions and proposals about the virtual environment are not based on black magic or voodoo rituals, but rather on performance metrics, years of statistics for a wide variety of application workloads, and plain ‘experience’ in the field, and since I take responsibility anyway for the platform hosting those applications, I bent the ‘politics of server consolidation’ a bit, introducing the vBastard mode: simply put, if the vScared app vendor insists on using FUD against the virtual environment, I’ll drop the virtual proposal and embrace a fully physical environment, adding some makeup to the thing: ‘oh no!!! Your oh-so-cute-and-critical-and-precious Crap, ahem… App should be given its own iron!!! That V thing is for toys and dhcp servers… Don’t worry, I’ll bring you the gozillahertz and ultrabytes you need. Oh, and if you see something like ‘vmtools’ running in the box, or vmx* sorts of device drivers/modules, don’t worry… We install them just… for… standard… governance… guidelines’ (most of the time, the monkey will stare at you wondering what a device driver is 😉 ).

Needless to say, it ends up being virtual (most of the time, at least). And you already know it… The V thing simply does its job (and more).

I know this sounds so BOFH, but since some people won’t EVER learn, I prefer to let them live under their pRock, while I can go on with the vParty!
And most important of all, I can deliver a better service to my employer, which is what matters at the end of the day.

While walking out of the office yesterday evening, I stumbled upon this one, and couldn’t resist taking a picture:

where information lives

I think this deserves a post, and some “comments”… like, for example:

– Hey boss, when I told you to get rid of all the paper and put all the documents in an EMC box… I didn’t mean it LITERALLY 😉

– Boss!, ok… I know storage ain’t cheap… but hey! 😉

– How the heck am I gonna find an HBA to tie this one to my SAN? 😉

– Man, these EMC guys are dead serious about storage tiering! FC, SATA, EFD, and now this! 😉

– What if I sit down on this? What do you call it? Compression or dedupe? 😉

 Eheh,

 Drakpz

Slaves to contingency

November 21, 2009

This week I attended many meetings, which started as ‘planning/strategy meetings’ but sadly ended up being something like ‘let’s just fix that’, ‘let’s extinguish that fire’, ‘let’s just do that and then we’ll see’.
Apparently, there’s a growing adoption of the lifestyle that I decided to call ‘slave to contingency’. Whether it’s due to high pressure (see previous posts), lack of enthusiasm (see previous posts), or lack of common sense/practical wisdom (see previous posts), it seems that as things go on, more and more people lose their ability to focus and innovate, and sadly accept the fact that as long as you stick to the rules, and don’t let fires burn everything to dust… You’re good to go.

Again, this leads you into surviving vs living, into watching your life pass by vs grasping it. I’ve always been able and willing to extinguish fires as needed, but still, I need to be sure that there’s a wiser, longer-term plan that will lead to further innovation, quality, success.

So, bring on the contingency, the issues and the catastrophes… I’m going to face ’em, and still… Seeing the ‘big picture’. 🙂

Curb your enthusiasm. That’s the name of a TV series, and a sentence whose sound I like when it’s pronounced. What I don’t like is its meaning: enthusiasm, in my opinion, is something that should never be curbed… But nonetheless, it generally is.
Over time, I’ve had the luck to get in touch with people with different degrees of enthusiasm in what they do and what they are. You just feel it… It’s in the words those people use, in the gestures they make, in the tireless effort and courage they put into things. Enthusiastic people are those who make things happen, who don’t sit and wait for time to pass by. Enthusiasm should, by the way, generate enthusiasm, which is totally normal (and highly welcome) for me… But I’ve found that to be the exception.
Fact is, most of the time I’ve found that those who were enthusiastic at first got ‘curbed’ over time, and most of the time it wasn’t their fault: they just gave in to overwhelming pressure to avoid enthusiasm and stick to the ‘rules’.
This is especially true in the workplace, where bureaucracy and endless hierarchies seem to have a vital need to get rid of enthusiasm. Eventually I understood that need: it starts as a way to make certain that basic rules (indeed needed) are followed, but it blatantly ends up as a way of controlling what you don’t know or can’t understand. Too many rules, by definition, curb enthusiasm and creativity.
In the long run, you’ll probably end up with something that’s as controlled as it is useless.
Luckily, a few bright minds keep their enthusiasm, no matter what… I don’t know if I’ve become picky over time, but I find myself growing an exclusive preference for this kind of people. I’m not interested in shallow, controlled, ruled minds: their predictability is both their virtue and the cause of their ultimate failure. I believe that, in the long run, this difference (enthusiasm vs lack of enthusiasm) will define something bigger: the difference between living and surviving. Count me in for the former.

Just a quick one, to let you know that I’ve seriously jumped on the “virtualize everything” bandwagon. Yeah, I know, this is an EMC initiative, but apart from borrowing the expression from them, my commitment is to fight against EVERYONE that doesn’t support virtualization, and more specifically that turns 180 degrees away from that kind of tech. I’m totally pissed off by companies literally building up tons of excuses or commercial mazes just to lock you into the physical world forever, or at least (yeah, ORCL, I’m talking to you) taking their time to figure out how to get some more $$ out of your wallet. I am the one who should choose where my apps go (p or v), and any unjustified lock into the pworld simply won’t be accepted anymore. I’ll bite.