Feb 22 10

Backing Up Two Ways from Sunday

by karlkatzke

One method of backup or recovery isn’t enough. Period. No matter what anyone tells you, what the book says, what your boss says, or what you think you need, you need to be backing things up in many ways.

Here are a few examples.

MySQL

Theoretically, you could recover anything you needed from the binary log, as long as you’ve got a good starting point and a good ending point. (This, by the way, is a good reason to flush the binary logs and take a backup on a regular basis.) What if your binary log’s corrupted, though? You need to fall back to a full SQL backup … which you’re doing regularly, right?
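Here is a minimal sketch of that "flush the logs and take a backup" routine, written as a Python helper that a nightly job might use to build the command. The user name and output path are hypothetical, credentials are assumed to live in an option file, and the flags are ordinary mysqldump options rather than anything specific to the setup described here.

```python
import shlex

def mysqldump_command(outfile, user="backup"):
    """Build a mysqldump call that rotates the binary log as it dumps."""
    return [
        "mysqldump",
        "--user=" + user,            # hypothetical backup account
        "--all-databases",
        "--single-transaction",      # consistent snapshot for InnoDB tables only
        "--flush-logs",              # start a fresh binary log at the dump point
        "--master-data=2",           # note the binlog file/position as a comment
        "--result-file=" + outfile,
    ]

# Print the command so it can be reviewed before it goes into cron.
print(shlex.join(mysqldump_command("/backup/full.sql")))
```

With a dump like this on hand, the binary logs written after it are the only ones you need to replay, and a corrupted log costs you at most the time since the last dump.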

If your binary log is corrupted, any mirrors you are using that are based on that binary log are corrupted as well.

Case in point: I had a client with a very active, very large database… north of 15GB in InnoDB. The binary log hit a bug and corrupted itself. The backups were being done from a mirror so that they didn’t interrupt the main machine’s processing, but they only kept a few days’ worth, so we couldn’t use those backups to restore. The most recent uncorrupted dump from the main machine had been taken three months before. Luckily, the client had done some application-level backups to an XML format, and we were able to (laboriously) restore from that. It cost about $3,000 because they didn’t want to degrade their forum’s performance for a half hour every night and pay for an extra TB or so of storage to keep more than a few days’ worth of backups.

Servers

Scenario: Hard drive gets corrupted or dies. You need to get the machine back up quickly. You have a snapshot of the machine … but your snapshot is on the same storage as that machine unless you back it up somewhere else.

On top of that, storage requirements have been growing rapidly for servers. Where a Linux server can take less than 1GB, Windows Server 2008 R2 can take up 20GB with system files alone. (In fact, if you plan to have any data on that server, or keep any logs, we’d recommend going with 40GB minimum for your C: drive.) It’s important to back that up to something that’s not on the same system disks.

Better yet, take a hint from the application-level backups, and back up your registry, configuration files, and data separately from the snapshot. We tend to use rsync for this role and put it in a rolling-backup mode with the --link-dest option to ease recovery.
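For anyone who hasn’t used the rolling-backup trick, here is a rough sketch of how such an rsync call can be assembled. The directory layout (one dated directory per night under a local backup root) and the function name are assumptions for illustration, not the exact setup described above.

```python
import os
import shlex

def rsync_snapshot_command(source, backup_root, today, yesterday):
    """Build an rsync call that hard-links unchanged files to the last snapshot."""
    return [
        "rsync",
        "-a",          # archive mode: recurse, keep permissions, times, symlinks
        "--delete",    # drop files that no longer exist on the source
        "--link-dest=" + os.path.join(backup_root, yesterday),
        source.rstrip("/") + "/",              # trailing slash: copy the contents
        os.path.join(backup_root, today),
    ]

cmd = rsync_snapshot_command("/etc", "/backup/etc", "2010-02-22", "2010-02-21")
print(shlex.join(cmd))
```

Each dated directory then looks like a full copy, but files that did not change are hard links into the previous night’s directory, so restoring is a plain copy and the extra disk cost is only what changed.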

VMWare

Same principle as above. Snapshots are usually stored in the same datastore. Datastore goes bye-bye, so do your snapshots.

There are some great products out there that can really help with this issue. The one we use is Veeam Backup & Replication. It can be used to replicate a snapshot to another VMWare cluster, or back up the datastore files at a consistent snapshot point and then copy them elsewhere all in one step. We use a two-step process: we keep them locally on the backup server and also transmit them to another datacenter across campus.

When using Veeam with Windows, make sure that VMWare Tools is installed and that you enable the VSS integration. (You’ll also need to make sure that the administrative shares on the system drives are enabled, and that the appropriate firewall ports are opened.) This ensures that you’ve got a transactionally consistent backup snapshot.

Practice, practice, practice

The only way to make sure that you can recover from a disaster is to test recovering from a disaster. At least once a year, we practice recovering from a worst-case scenario. That means bringing up a new machine from scratch, re-implementing all of the options and configurations, and then restoring the data. Despite that kind of restoration being something that should never happen, it does — and practice gives you insights into how to improve the processes and turns a recovery operation from an expensive nightmare that sets back all of your other processes into something that you can execute quickly and professionally.

Feb 13 10

Buzz not worth the Buzz

by karlkatzke

Besides the obvious, the thing that pisses me off most about Google Buzz is having to mark things read twice: once in Google Reader, once in Buzz. I’m still experimenting to see if I can hide/unfollow people in Google Reader and not have them unfollowed in Buzz, or vice versa.

Feb 10 10

Sun/Oracle Merger

by karlkatzke

I’m happy to see it; I’m happy to be involved in it.

Sun has some of the best ideas in the world. From a creativity point of view, they’re pretty amazing. From an implementation point of view, with some notable exceptions (ex: Fishworks), they’re pitiful. Sun couldn’t get laid in a whorehouse wearing a suit made of hundred dollar bills.

Half of Sun’s ideas were half-baked. (Either go fully baked, à la Steve Jobs, or lay off whatever makes you write bad haiku, mmkay?) The x45xx line of servers is a wonderful idea and a wonderful form factor, and Sun overcame significant engineering challenges to develop it. Unfortunately, the first gen fell down hard under load and was practically unusable. The second gen is still suffering from some high replacement part and add-on costs that don’t justify the price in many cases. The integration of ZFS and SSDs as ZIL/L2ARC is wonderful, but there are a ton of technical problems that customers keep running into and Sun keeps refusing to acknowledge. It took three months to solve the problem I was having with SSDs and ZIL. I place the blame for the former on poor management controls, and the latter on excessive outsourcing of core competencies. Both are failures of management to execute the brilliant ideas that engineers come up with.

It’s nice to see a company with a reputation for being able to execute and capitalize on new ideas come in. Oracle’s already started to cut, and all of the cuts I know of so far in my various interactions with the company have been well-justified. I’m really excited, from the point of view of someone with several relationships with the company, to see what comes of this merger.

Jan 21 10

Why Redhat’s Losing Market Share

by karlkatzke

Check this out:

RHN Fail

Yeah, that’s what you see when you visit rhn.redhat.com — which you need to use to administer redhat subscriptions. I can’t get my servers to subscribe while the site’s down, and I can’t manage my entitlements or buy new ones.

One of my consulting projects has been on hold for days while RHN sorts itself out. Worse, you can’t even log in to report the problem. If you click on the “contacting us” link, you get taken to a page with a couple of mailing lists. Well, why join a mailing list? I know the site’s down. I want to file an engineering report. I click the last option, which is supposed to allow me to file such a report. It says I need to log in to file a report. FAIL.

It does seem that there’s some awareness of the problem. Poking around in the rest of the redhat.com domain, I got messages like this:

[Screenshot: 2010-01-21 at 3.33.32 AM]

Why’s Redhat losing market share? They can’t even run a website well. Who’s going to trust their server distro when they can’t get a website right?

Nov 23 09

LSI Logic SAS Driver for VMWare vSphere 4

by karlkatzke

If you’re using raw access to storage LUNs with VMWare, and you’re using Windows, you can use the LSI Logic SAS virtual SCSI adapter option and create virtual drives. This is better than using the Microsoft iSCSI initiator because you can edit the drive mappings with the machine powered off and you can clone the machine and easily redirect all of your storage before powering the machine on.

The correct driver to be using is the LSI SAS 1068 driver. You’ll need to make a floppy image using an image tool. If you have access to a Linux box, just use dd to create the image and then mount it and write files to it. If you’re on Windows, the venerable WinImage and other utilities exist. Either way, you’ll need to rename the file with a .flp extension and mount it on boot in your Windows VM in order to load the driver to see the drives.
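As a rough illustration of the image-creation step, this sketch writes a blank 1.44 MB floppy image, which is what a dd run from /dev/zero with 2880 blocks of 512 bytes would give you. The function name is made up for the example. The blank image is assumed to still need a FAT filesystem (for instance via mkfs.vfat) before you can mount it and copy the driver files in.

```python
import os

# A 1.44 MB floppy holds 2880 sectors of 512 bytes each.
FLOPPY_BYTES = 2880 * 512

def create_blank_floppy(path):
    """Write a zero-filled floppy image and return its size in bytes."""
    with open(path, "wb") as image:
        image.write(b"\x00" * FLOPPY_BYTES)
    return os.path.getsize(path)
```

Naming the output with a .flp extension from the start saves the rename later.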

I’ve been getting some pretty darned good performance with that and iSCSI LUNs on my Solaris server. I haven’t (yet) put together a decent test and some metrics to back it, but the machines on raw device LUNs feel a *lot* snappier than the machines that are on a 400GB VMFS. A good basic tutorial with iSCSI and ZFS is here: Running ZFS Over iSCSI as a vmware VMFS store — but note that I’m using raw LUNs after not being happy with the VMFS performance with a half-dozen hosts doing heavy I/O.

Nov 19 09

VMWare explorations…

by karlkatzke

GhettoVCB – VCB for free. Doesn’t get better than that.

Understanding VMWare Snapshots – Also, it’s probably a good idea to learn this stuff.

Nov 10 09

Biggest Problem in Airbus A380: Software

by karlkatzke

Software problems are the #1 thing that will keep an Airbus A380 on the ground. Yes, airplanes are complicated things … but at the same time, not much is required to keep most of them in the air.

A few key quotes speak volumes to me about these problems.

Clark says that the problem with the nuisance warnings has been their diverse nature, but “the common thread” is the software. He says Airbus executive vice-president programmes Tom Williams and his team “have sat in my office many times and said they can’t identify trends, which is the worst possible thing”.

Clark blames the software’s design. “There was a philosophy of utopia – I suspect that Airbus was blessed with some boffins who said ‘we’ve got to make this absolutely perfect – no flexibility’. The slightest surge causes one [sensor] to trip and then six more as they’re all linked,” he says.

Anyone willing to take guesses about the type of architecture and software developers at Airbus?

Oct 14 09

Detecting and Resolving LAMP Stack Problems – Scheduled Downtime

by karlkatzke

In the last issue of my current consulting saga, Detecting and Resolving LAMP Stack Performance Problems, we talked about a Drupal site that was being brought offline every few hours due to poor tuning of the LAMP stack. With the default settings, a site isn’t going to take much before it just falls flat on its face.

After triaging and addressing the main issues based on the logs, we were left with two more issues. The first was the inability of Drupal to perform well in an environment where it had to rebuild every page from source for every page view. This is well documented in the Drupal community; there are many pages in the documentation area of Drupal that deal with caching and performance optimization. The second issue was MySQL performance and the long table lock/scan times we were seeing on some queries that could not be further optimized.

We scheduled a 2 hour downtime with the customer to install some tools. Our checklist was installing memcached and PHP-APC. I also wanted to take the time to back up the MySQL database and run a good check_table on each of the MyISAM tables. (Yes, I know. MyISAM. More on that later.)

Side note: I would typically prefer xcache, which in my mind is superior to APC because I have an easier time working with it and prefer its management interface and tuning parameters. However, APC was available as a binary package for the platform we were on, and xcache was not. To make things faster and easier, we chose APC. Despite the endless debate about which is superior, both are usable and work. I have not run into problems using APC on an 8-core system, despite oft-reported-but-never-proven flock() issues.

APC was fast to install and required minimal tuning. It produced a noticeable performance improvement. However, the number of deadlocked Apache threads (and total number of Apache threads) went up, and the other Apache errors that dealt with clients timing out did not cease.

We installed the Drupal Memcache implementation along with the appropriate PECL module. We configured two pools, both using up to 1 GB of RAM (which we had to spare on the web server.) The ‘hot’ pool would mostly handle cached pages for non-logged-in users, and the other one would handle some higher volume caching for users that are logged in, as well as some internal/custom functionality to go along with specialized RSS feed parsing. (Side note: We found that the Cache and Cacherouter plugins did not work as expected. Rather than waste downtime troubleshooting them, we used what worked.)

Again, we saw a huge performance boost. We needed to do some tuning (changing certain cache settings and analyzing performance), but that was essentially everything that we could find to do from a single-server web server side of things.

While we’re on the topic of Drupal: Don’t forget that Drupal has a ‘cron’ program that should be getting called remotely. It’s sort of a poor man’s cron solution, but it works. It was causing our load to spike every 20 minutes. We occasionally disabled it during testing to be sure we understood its effects.

The next beast to tackle was the database. As previously mentioned, it was on MyISAM tables. Obviously, this isn’t ideal. We found that node lookups, statistics lookups, and searches were taking up a disproportionate amount of server time. The weirdest part was that we were seeing some full table scans in the slow query log (i.e. 3 million rows scanned) but a later ‘explain’ statement couldn’t replicate the performance recorded in the slow query log.

We batted around adding indexes. The issue was that Drupal’s search and nodes tables are frequently altered, which means the indexes become scrambled quickly. And really, what was taking time was the size of the table we were dealing with — the table wouldn’t fit in memory, so it was copying it to a disk temporary table and then doing a filesort.

Running check_table did the trick to re-sort the indexes and ‘defrag’ the files, but the benefits only lasted so long.

What we ended up doing was taking the database down, dumping everything out to a SQL file, and re-importing everything to InnoDB. Make sure that innodb_file_per_table is enabled, or you might end up with some unexpectedly big files; this depends on your architecture and filesystem. Remember that InnoDB files cannot currently shrink. (Also: You can do the table changes online, but it’s really not recommended. It takes a long time, especially when some of your tables are larger than 1GB.) Don’t forget to set innodb_buffer_pool_size appropriately.
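One way to handle the "re-import as InnoDB" step is to rewrite the engine clause in the dump before loading it back. The sketch below is an illustration under assumptions, not the exact procedure used on this gig. It assumes a standard mysqldump file in which each CREATE TABLE statement closes with a line starting with ")" that carries the ENGINE option, and the function name is made up.

```python
import re

# Matches the engine clause that mysqldump writes after a table definition.
ENGINE_RE = re.compile(r"\bENGINE\s*=\s*MyISAM\b", re.IGNORECASE)

def convert_dump_line(line):
    """Switch MyISAM to InnoDB, but only on the closing line of a CREATE TABLE."""
    if line.lstrip().startswith(")"):
        return ENGINE_RE.sub("ENGINE=InnoDB", line)
    return line  # leave INSERT data and everything else untouched

sample = ") ENGINE=MyISAM DEFAULT CHARSET=utf8;\n"
print(convert_dump_line(sample), end="")
```

Restricting the substitution to the closing line keeps row data that happens to contain the text "ENGINE=MyISAM" from being altered. Tables with FULLTEXT indexes need separate handling, since InnoDB did not support them before MySQL 5.6.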

The change to InnoDB, the implementation of caching for both PHP engine-level opcode and fully built pages, and the careful tuning of Apache and MySQL parameters led to stability for this client.

There were some further problems, but they were with an unrelated product that causes a nightly load spike on the database machine. Tomorrow night I’ll be covering the cleanup work: NFS IOPS vs. local disk, binary logging and the lack of backups in the original configuration, and building some redundancy into the system so that it can tolerate faults more smoothly.

Oct 13 09

Detecting and Resolving LAMP Stack Performance Problems

by karlkatzke

As sysadmins, we sometimes run into performance problems with multiple angles and portions. It’s sometimes not particularly obvious where the actual performance problem is, and resolving one problem that you can see might bring another couple of problems to the surface.

The below comes from a consulting gig that I’ve been working on recently. The parties will remain nameless. I’m going to break this into several parts, since it took over three weeks to resolve all of the immediate problems with the site, and we’re still not all the way done with the task list.

Going in, I knew that we were dealing with a heavily loaded Drupal site that shared a MySQL database with a wiki and a forum. The site would go down at random times, sometimes multiple times per hour. Upon logging into the server the first time, it seemed slow, so I immediately called ‘uptime’ and the answer came back with all three time period load averages over 90 on an 8-core server. There were 125 Apache processes running, but most of them were in a deadlocked state. The very second command I ran on the server was killall -9 httpd, which is never the way you want to start out a consulting gig…
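To put those uptime numbers in perspective, it helps to divide the load averages by the core count. This is a small sketch of that arithmetic. The function names are made up, and the threshold of 1.0 runnable process per core is a common rule of thumb assumed for the example, not a hard limit.

```python
import os

def load_per_core(load_averages, cores):
    """Scale the 1, 5 and 15 minute load averages by the number of cores."""
    return tuple(round(load / cores, 2) for load in load_averages)

def is_overloaded(load_averages, cores, threshold=1.0):
    """True when every window is above the per-core threshold (sustained load)."""
    return all(load / cores > threshold for load in load_averages)

# os.getloadavg() returns the same three numbers that uptime prints (Unix only).
print(load_per_core(os.getloadavg(), os.cpu_count() or 1))
```

Load averages of 90 on 8 cores work out to more than 11 waiting processes per core across all three windows, which is why the box felt slow just logging in.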

While that was busy killing off processes, I checked the Apache configuration. Sure enough, it was still at the stock settings. I immediately cranked up the requests per process to 20,000 and upped the server limit to 300. (Remember, we’re dealing with prefork here.) I restarted Apache and watched it churn. It handled the load far more gracefully with some room to move around, and I quickly saw the number of Apache processes spike, and then sink down to about 80 and stay there.

The next step was looking through the logs. A quick aside about logs: I like my logs to be clean. I don’t like debug messages, I don’t like status messages, and I don’t want to see either of them. If I have a lot of a certain type of status message that I *do* want to trap, I make sure that syslog puts it into its own file or I handle the problem that’s causing it. In this case, /var/log/messages had a bunch of SNMP messages logging each get, and some messages about martian packets. The martian packets issue could be (and was) resolved with a quick firewall tweak to reject packets from an illegal source. The SNMP issue was resolved by editing snmpd’s startup configuration to log to local1 instead of the default (check your man page for snmpd to make sure you get the right flags, it’s changed…), and then editing syslog’s configuration to log everything on local1 to /var/log/snmpd. And don’t forget to add it to logrotate!

Now we were down to two classes of errors. The first was obvious and sort of easy to troubleshoot: “MySQL server has gone away.” Log into the MySQL server. See if there are slow-running queries. Nope? Well, double check the timeout that’s set in /etc/my.cnf. On this server, slow-query-time was set to twenty seconds, but timeout was set to ten seconds. Well, that’s not very useful. Also, check your caches and table types. In this case, everything was MyISAM. More on that later. For now, just make sure you’re using the right kind of caching strategy for your table type and system specs, which in this case is MyISAM key cache (and lots of it!). Try to fit all of your most-used tables in memory.
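A quick way to get a feel for how much key cache "lots of it" means is to add up the MyISAM index files on disk, since the key cache holds index blocks from the .MYI files (table data is left to the operating system’s file cache). This is only a sizing sketch. The function name is made up, and the data directory you pass in is whatever your installation uses.

```python
import os

def myisam_index_bytes(datadir):
    """Sum the sizes of MyISAM index files (*.MYI) under a MySQL data directory."""
    total = 0
    for root, _dirs, files in os.walk(datadir):
        for name in files:
            if name.endswith(".MYI"):
                total += os.path.getsize(os.path.join(root, name))
    return total
```

Pointing it at the data directory (often /var/lib/mysql on Linux packages) gives an upper bound to compare against the configured key_buffer_size. If the total is far larger than the buffer, the most-used indexes are probably being evicted and re-read.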

On this gig, we got the site back on its feet with these things. Downtime went from multiple events an hour down to one or two events per six-hour period. Unfortunately, we were also out of easy things to change. Next time I post, we’ll start to get into fixes that will cause downtime.

Oct 13 09

Sun/Oracle OpenWorld & Flash Storage

by karlkatzke

At the Sun OpenWorld conference keynote today, there were a few new products listed in the Flash storage arena, most notably the F5100 that everyone’s jibber-jabbering about.

As a smaller customer, I’m far more interested in the SunFlash F20 PCIe card — which I don’t see many people blogging about. Looks like I could add that to not only my existing systems, but non-Sun systems that can make use of that sort of storage. That, ladies and germs, is something worth the name “OpenWorld” — as in, a world of open wallets.