<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Karl Katzke &#187; sysadmin</title>
	<atom:link href="http://www.karlkatzke.com/categories/linux/sysadmin/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.karlkatzke.com</link>
	<description>PHP, Puppies, and other Geekery</description>
	<lastBuildDate>Wed, 10 Mar 2010 22:07:37 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Backing Up Two Ways from Sunday</title>
		<link>http://www.karlkatzke.com/backing-up-two-ways-from-sunday/</link>
		<comments>http://www.karlkatzke.com/backing-up-two-ways-from-sunday/#comments</comments>
		<pubDate>Mon, 22 Feb 2010 18:25:07 +0000</pubDate>
		<dc:creator>karlkatzke</dc:creator>
				<category><![CDATA[howto]]></category>
		<category><![CDATA[sysadmin]]></category>
		<category><![CDATA[vmware]]></category>

		<guid isPermaLink="false">http://www.karlkatzke.com/?p=555</guid>
		<description><![CDATA[One method of backup or recovery isn&#8217;t enough. Period. No matter what anyone tells you, what the book says, what your boss says, or what you think you need, you need to be backing things up in many ways. 
Here are a few examples. 
MySQL
Theoretically, you could recover anything you needed from the binary log, as [...]]]></description>
			<content:encoded><![CDATA[<p>One method of backup or recovery isn&#8217;t enough. Period. No matter what anyone tells you, what the book says, what your boss says, or what you think you need, you need to be backing things up in many ways. </p>
<p>Here are a few examples. </p>
<h3>MySQL</h3>
<p>Theoretically, you could recover anything you needed from the binary log, as long as you&#8217;ve got a good starting point and a good ending point.  (This, by the way, is a good reason to flush the binary logs and take a backup on a regular basis.) What if your binary log&#8217;s corrupted, though? You need to fall back to a full SQL backup &#8230; which you&#8217;re doing regularly, right? </p>
<p>If your binary log is corrupted, any mirrors you are using that are based on that binary log are corrupted as well. </p>
<p>Case in point: I had a client with a very active, very large database&#8230; north of 15GB in InnoDB. The binary log hit a bug and corrupted itself. The backups were being taken from a mirror so that they didn&#8217;t interrupt the main machine&#8217;s processing, but the mirror was fed by that same binary log and the client only kept a few days&#8217; worth, so we couldn&#8217;t use those backups to restore. The most recent uncorrupted dump from the main machine had been taken three months before. Luckily, the client had done some application-level backups to an XML format, and we were able to (laboriously) restore from that. The recovery cost about $3,000, all because they didn&#8217;t want to degrade their forum&#8217;s performance for a half hour every night or pay for an extra TB or so of storage to keep more than a few days&#8217; worth of backups. </p>
<h3>Servers</h3>
<p>Scenario: Hard drive gets corrupted or dies. You need to get the machine back up quickly. You have a snapshot of the machine &#8230; but your snapshot is on the same storage as that machine unless you back it up somewhere else. </p>
<p>On top of that, storage requirements have been growing rapidly for servers. Where a Linux server can take less than 1GB, Windows 2008R2 can take up 20GB with system files alone. (In fact, if you plan to have any data on that server, or keep any logs, we&#8217;d recommend going with 40GB minimum for your C: drive.) It&#8217;s important to back that up to something that&#8217;s not on the same system disks. </p>
<p>Better yet, take a hint from the application-level backups and back up your registry, configuration files, and data separately from the snapshot. We tend to use rsync for this role and put it in a rolling-backup mode with the <code>--link-dest</code> option to ease recovery. </p>
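<p>A minimal sketch of what that rolling-backup mode can look like, assuming one dated directory per snapshot under a backup root. The function name, the paths, and the dates are made up for illustration; only the rsync flags (<code>-a</code>, <code>--delete</code>, <code>--link-dest</code>) are real. The script builds the command line rather than running it, so you can inspect it first.</p>

```python
from datetime import date
from pathlib import PurePosixPath


def rsync_snapshot_command(source, backup_root, today, previous=None):
    """Build an rsync command line for one dated snapshot directory.

    All names and paths here are hypothetical; adjust them to your layout.
    """
    dest = PurePosixPath(backup_root) / today.isoformat()
    cmd = ["rsync", "-a", "--delete"]
    if previous is not None:
        # Files unchanged since the previous snapshot become hard links into it,
        # so every snapshot looks complete but only changed files use new space.
        link_target = PurePosixPath(backup_root) / previous.isoformat()
        cmd.append(f"--link-dest={link_target}")
    # Trailing slash on the source: copy the directory's contents, not the directory itself.
    cmd += [f"{source.rstrip('/')}/", str(dest)]
    return cmd


if __name__ == "__main__":
    cmd = rsync_snapshot_command(
        "/etc", "/backup/web01", date(2010, 2, 22), previous=date(2010, 2, 21)
    )
    print(" ".join(cmd))
```

<p>Because each snapshot directory is a full tree, recovery is a plain copy out of whichever date you need, and deleting an old snapshot only frees the files that no newer snapshot still links to.</p>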
<h3>VMWare</h3>
<p>Same principle as above. Snapshots are usually stored in the same datastore. Datastore goes bye-bye, so do your snapshots. </p>
<p>There are some great products out there that can really help with this issue. The one we use is <a href="http://www.veeam.com/vmware-esx-backup.html">VEEAM Replication and Backup</a>. It can be used to replicate a snapshot to another VMWare cluster, or back up the datastore files at a consistent snapshot point and then copy them elsewhere all in one step. We use a two-step process: we keep the backups locally on the backup server and also transmit them to another datacenter across campus. </p>
<p>When using VEEAM with Windows, make sure that VMWare Tools is installed and that you enable the VSS integration. (You&#8217;ll also need to make sure that the administrative shares on the system drives are enabled, and that the appropriate firewall ports are opened.) This ensures that you&#8217;ve got a transactionally consistent backup snapshot. </p>
<h3>Practice, practice, practice</h3>
<p>The only way to make sure that you can recover from a disaster is to test recovering from a disaster. At least once a year, we practice recovering from a worst-case scenario. That means bringing up a new machine from scratch, re-implementing all of the options and configurations, and then restoring the data. Despite that kind of restoration being something that should never happen, it does &#8212; and practice gives you insights into how to improve the processes and turns a recovery operation from an expensive nightmare that sets back all of your other processes into something that you can execute quickly and professionally. </p>
]]></content:encoded>
			<wfw:commentRss>http://www.karlkatzke.com/backing-up-two-ways-from-sunday/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Detecting and Resolving LAMP Stack Problems &#8211; Scheduled Downtime</title>
		<link>http://www.karlkatzke.com/detecting-and-resolving-lamp-stack-problems-scheduled-downtime/</link>
		<comments>http://www.karlkatzke.com/detecting-and-resolving-lamp-stack-problems-scheduled-downtime/#comments</comments>
		<pubDate>Thu, 15 Oct 2009 04:51:15 +0000</pubDate>
		<dc:creator>karlkatzke</dc:creator>
				<category><![CDATA[linux]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[php]]></category>
		<category><![CDATA[sysadmin]]></category>
		<category><![CDATA[apc]]></category>
		<category><![CDATA[cache]]></category>
		<category><![CDATA[caching]]></category>
		<category><![CDATA[Drupal]]></category>
		<category><![CDATA[innodb]]></category>
		<category><![CDATA[myisam]]></category>
		<category><![CDATA[xcache]]></category>

		<guid isPermaLink="false">http://www.karlkatzke.com/?p=536</guid>
		<description><![CDATA[In the last issue of my current consulting saga, Detecting and Resolving LAMP Stack Performance Problems, we talked about a Drupal site that was being brought offline every few hours due to poor tuning of the LAMP stack. With the default settings, a site isn&#8217;t going to take much before it just falls flat on [...]]]></description>
			<content:encoded><![CDATA[<p>In the last issue of my current consulting saga, <a href="http://www.karlkatzke.com/detecting-and-resolving-lamp-stack-performance-problems/">Detecting and Resolving LAMP Stack Performance Problems</a>, we talked about a Drupal site that was being brought offline every few hours due to poor tuning of the LAMP stack. With the default settings, a site isn&#8217;t going to take much before it just falls flat on its face. </p>
<p>After triaging and addressing the main issues based on the logs, we were left with two more issues. The first was the inability of Drupal to perform well in an environment where it had to rebuild every page from source for every page view. This is well documented in the Drupal community; there are many pages in the documentation area of Drupal that deal with caching and performance optimization. The second issue was MySQL performance and the long table lock/scan times we were seeing on some queries that could not be further optimized. </p>
<p>We scheduled a two-hour downtime window with the customer to install some tools. Our checklist was installing <a href="http://www.danga.com/memcached/">memcached</a> and <a href="http://pecl.php.net/package/APC">PHP-APC</a>. I also wanted to take the time to back up the MySQL database and run a good check_table on each of the MyISAM tables. (Yes, I know. MyISAM. More on that later.) </p>
<p><small>Side note: I would typically prefer <a href="http://xcache.lighttpd.net/">xcache</a>, which in my mind is superior to APC because I have an easier time working with it and prefer its management interface and tuning parameters. However, APC was available as a binary package for the platform we were on, and xcache was not. To make things faster and easier, we chose APC. Despite the endless debate about which is superior, both are usable and work. I have not run into problems using APC on an 8-core system, despite oft-reported-but-never-proven flock() issues.</small> </p>
<p>APC was fast to install and required minimal tuning. It produced a noticeable performance improvement. However, the number of deadlocked Apache threads (and total number of Apache threads) went up, and the other Apache errors that dealt with clients timing out did not cease. </p>
<p>We installed <a href="http://drupal.org/project/memcache">the Drupal Memcache implementation</a> along with the appropriate PECL module. We configured two pools, both using up to 1 GB of RAM (which we had to spare on the web server). The &#8216;hot&#8217; pool would mostly handle cached pages for non-logged-in users, and the other one would handle some higher volume caching for users that are logged in, as well as some internal/custom functionality to go along with specialized RSS feed parsing. (Side note: We found that the <a href="http://drupal.org/project/cache">Cache</a> and <a href="http://drupal.org/project/cacherouter">Cacherouter</a> plugins did not work as expected. Rather than waste downtime troubleshooting them, we used what worked.)</p>
<p>Again, we saw a huge performance boost. We needed to do some tuning (changing certain cache settings and analyzing performance), but that was essentially everything that we could find to do from the single-server web server side of things. </p>
<p>While we&#8217;re on the topic of Drupal: Don&#8217;t forget that Drupal has a &#8216;cron&#8217; program that should be getting called remotely. It&#8217;s sort of a <a href="http://drupal.org/cron">poor man&#8217;s cron solution</a>, but it works. It was causing our load to spike every 20 minutes. We occasionally disabled it during testing to be sure we understood its effects. </p>
<p>The next beast to tackle was the database. As previously mentioned, it was on MyISAM tables. Obviously, this isn&#8217;t ideal. We found that node lookups, statistics lookups, and searches were taking up a disproportionate amount of server time. The weirdest part was that we were seeing some full table scans in the slow query log (i.e. 3 million rows scanned) but a later <code>EXPLAIN</code> statement couldn&#8217;t replicate the performance recorded in the slow query log. </p>
<p>We batted around adding indexes. The issue was that Drupal&#8217;s search and nodes tables are frequently altered, which means the indexes become scrambled quickly. And really, what was taking time was the size of the table we were dealing with: the table wouldn&#8217;t fit in memory, so MySQL was copying it to an on-disk temporary table and then doing a filesort. </p>
<p>Running check_table did the trick to re-sort the indexes and &#8216;defrag&#8217; the files, but the benefits only lasted so long. </p>
<p>What we ended up doing was taking the database down, dumping everything out to a SQL file, and re-importing everything to InnoDB. Make sure that <code>innodb_file_per_table</code> is enabled, or you might end up with some unexpectedly big files (this depends on your architecture and filesystem). Remember that InnoDB files cannot currently shrink. (Also: You <b>can</b> do the table changes online, but it&#8217;s really not recommended. It takes a long time, especially when some of your tables are larger than 1GB.) Don&#8217;t forget to <a href="http://www.mysqlperformanceblog.com/2007/11/03/choosing-innodb_buffer_pool_size/">set innodb_buffer_pool_size</a> appropriately.</p>
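<p>For the dump-and-reimport route, one way to switch engines is to rewrite the table definitions in the dump before loading it. The following is only a sketch under assumptions: it expects mysqldump-style output where each <code>CREATE TABLE</code> ends with a line starting with <code>)</code> that carries the <code>ENGINE=</code> clause, and the function name is made up. It is not necessarily how the migration described here was done.</p>

```python
import re

# The storage engine clause mysqldump writes on the closing line of CREATE TABLE.
ENGINE_CLAUSE = re.compile(r"\bENGINE=MyISAM\b", re.IGNORECASE)


def convert_dump_to_innodb(lines):
    """Yield dump lines with MyISAM table definitions switched to InnoDB."""
    for line in lines:
        # Only touch the closing line of a table definition, e.g.
        # ") ENGINE=MyISAM DEFAULT CHARSET=utf8;", so that row data which
        # happens to contain the same text is left alone.
        if line.lstrip().startswith(")"):
            line = ENGINE_CLAUSE.sub("ENGINE=InnoDB", line)
        yield line


if __name__ == "__main__":
    import sys

    # Stream a dump through: python convert.py (dump on stdin, result on stdout).
    sys.stdout.writelines(convert_dump_to_innodb(sys.stdin))
```

<p>Streaming line by line keeps memory use flat even on a multi-gigabyte dump. One caveat worth checking first: tables with FULLTEXT indexes will not import as InnoDB on MySQL versions older than 5.6, so those may need to stay on MyISAM.</p>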
<p>The change to InnoDB, the caching of both PHP engine-level opcode and fully built pages, and the careful tuning of Apache and MySQL parameters led to stability for this client. </p>
<p>There were some further problems, but they were with an unrelated product that causes a nightly load spike on the database machine. Tomorrow night I&#8217;ll be covering the cleanup work: NFS iops vs. local disk, binary logging and the lack of backups in the original configuration, and building some redundancy into the system so that it can tolerate faults more smoothly.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.karlkatzke.com/detecting-and-resolving-lamp-stack-problems-scheduled-downtime/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Detecting and Resolving LAMP Stack Performance Problems</title>
		<link>http://www.karlkatzke.com/detecting-and-resolving-lamp-stack-performance-problems/</link>
		<comments>http://www.karlkatzke.com/detecting-and-resolving-lamp-stack-performance-problems/#comments</comments>
		<pubDate>Wed, 14 Oct 2009 03:45:34 +0000</pubDate>
		<dc:creator>karlkatzke</dc:creator>
				<category><![CDATA[linux]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[php]]></category>
		<category><![CDATA[sysadmin]]></category>
		<category><![CDATA[apache]]></category>
		<category><![CDATA[deadlock]]></category>
		<category><![CDATA[lamp]]></category>
		<category><![CDATA[logging]]></category>
		<category><![CDATA[problems]]></category>
		<category><![CDATA[syslog]]></category>
		<category><![CDATA[troubleshooting]]></category>

		<guid isPermaLink="false">http://www.karlkatzke.com/?p=534</guid>
		<description><![CDATA[As a sysadmin, we sometimes run into performance problems with multiple angles and portions. It&#8217;s sometimes not particularly obvious where the actual performance problem is, and resolving one problem that you can see might bring another couple of problems to the surface. 
The below comes from a consulting gig that I&#8217;ve been working on recently. [...]]]></description>
			<content:encoded><![CDATA[<p>As sysadmins, we sometimes run into performance problems with multiple causes and moving parts. It&#8217;s sometimes not particularly obvious where the actual performance problem is, and resolving one problem that you <i>can</i> see might bring another couple of problems to the surface. </p>
<p>The below comes from a consulting gig that I&#8217;ve been working on recently. The parties will remain nameless. I&#8217;m going to break this into several parts, since it took over three weeks to resolve all of the immediate <i>problems</i> with the site, and we&#8217;re still not all the way done with the task list. </p>
<p>Going in, I knew that we were dealing with a heavily loaded Drupal site that shared a MySQL database with a wiki and a forum. The site would go down at random times, sometimes multiple times per hour. Upon logging into the server the first time, it seemed slow, so I immediately ran <code>uptime</code> and the answer came back with all three load averages over 90 on an 8-core server. There were 125 Apache processes running, but most of them were in a deadlocked state. The very second command I ran on the server was <code>killall -9 httpd</code>, which is never the way you want to start out a consulting gig&#8230; </p>
<p>While that was busy killing off processes, I checked the Apache configuration. Sure enough, it was still at the stock settings. I immediately cranked up the requests per process to 20,000 and upped the server limit to 300. (Remember, we&#8217;re dealing with prefork here.) I restarted Apache and watched it churn. It handled the load far more gracefully with some room to move around, and I quickly saw the number of Apache processes spike, and then sink down to about 80 and stay there. </p>
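<p>Before raising a prefork process limit like that, it is worth checking that the box has the memory for it, since every prefork child is a full process. Here is a rough sanity check as a sketch. The RAM size, the amount reserved for everything else, and the per-process footprint are all hypothetical numbers, not measurements from this server; measure your own with <code>ps</code> or <code>top</code>.</p>

```python
def max_prefork_processes(total_ram_mb, reserved_mb, per_process_mb):
    """Return how many Apache prefork children fit in RAM without swapping."""
    if not per_process_mb > 0:
        raise ValueError("per_process_mb must be positive")
    # Whatever is not reserved for the OS, caches, and other daemons is
    # what Apache children can share between them.
    available_mb = max(total_ram_mb - reserved_mb, 0)
    return available_mb // per_process_mb


def server_limit_fits(server_limit, total_ram_mb, reserved_mb, per_process_mb):
    """True if the chosen process limit fits in the memory left for Apache."""
    return max_prefork_processes(total_ram_mb, reserved_mb, per_process_mb) >= server_limit


if __name__ == "__main__":
    # Hypothetical box: 16 GB RAM, 4 GB kept for everything else,
    # and about 40 MB resident per PHP-enabled Apache child.
    print(max_prefork_processes(16384, 4096, 40))
    print(server_limit_fits(300, 16384, 4096, 40))
```

<p>If the check fails, a lower limit that fits in RAM will usually serve more traffic than a higher one that pushes the machine into swap.</p>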
<p>The next step was looking through the logs. A quick aside about logs: I like my logs to be clean. I don&#8217;t like debug messages, I don&#8217;t like status messages, and I don&#8217;t want to see either of them. If I have a lot of a certain type of status message that I <i>do</i> want to trap, I make sure that syslog puts it into its own file or I handle the problem that&#8217;s causing it. In this case, <code>/var/log/messages</code> had a bunch of SNMP messages logging each get, and some messages about martian packets. The martian packets issue could be (and was) resolved with a quick firewall tweak to reject packets from an illegal source. The SNMP issue was resolved by editing snmpd&#8217;s startup configuration to log to local1 instead of the default (check your man page for snmpd to make sure you get the right flags, since they&#8217;ve changed&#8230;), and then editing syslog&#8217;s configuration to log everything on local1 to <code>/var/log/snmpd</code>. And don&#8217;t forget to add it to logrotate! </p>
<p>Now we were down to two classes of errors. The first was obvious and sort of easy to troubleshoot: &#8220;MySQL server has gone away.&#8221; Log into the MySQL server. See if there are slow-running queries. Nope? Well, double check the timeout that&#8217;s set in <code>/etc/my.cnf</code>. On this server, slow-query-time was set to twenty seconds, but timeout was set to ten seconds. Well, that&#8217;s not very useful. Also, check your caches and table types. In this case, everything was MyISAM. More on that later. For now, just make sure you&#8217;re using the right kind of caching strategy for your table type and system specs, which in this case is the MyISAM key cache (and lots of it!). Try to fit all of your most-used tables in memory. </p>
<p>On this gig, we got the site back on its feet with these things. Downtime went from multiple events an hour down to one or two events per six-hour period. Unfortunately, we were also out of easy things to change. Next time I post, we&#8217;ll start to get into fixes that will <i>cause</i> downtime. </p>
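<p>That kind of mismatch is easy to check for mechanically. The sketch below reads a <code>my.cnf</code>-style file and compares the slow-query threshold against a timeout. The post doesn&#8217;t say which timeout variable was involved, so the key is a parameter and the <code>wait_timeout</code> default is only a guess; <code>long_query_time</code> is the real name of the slow-query threshold. The function name is made up.</p>

```python
import configparser


def thresholds_consistent(my_cnf_text, timeout_key="wait_timeout"):
    """True if the timeout is longer than the slow-query threshold.

    timeout_key is an assumption; pass whichever timeout applies to you.
    """
    parser = configparser.ConfigParser(
        allow_no_value=True,  # my.cnf allows bare flags such as skip-name-resolve
        strict=False,         # tolerate repeated keys
        interpolation=None,   # treat % characters in values literally
    )
    # MySQL accepts both long-query-time and long_query_time, so normalize.
    parser.optionxform = lambda name: name.lower().replace("-", "_")
    parser.read_string(my_cnf_text)
    section = parser["mysqld"]
    long_query_time = float(section["long_query_time"])
    timeout = float(section[timeout_key])
    return timeout > long_query_time


if __name__ == "__main__":
    sample = "[mysqld]\nlong_query_time = 20\nwait_timeout = 10\n"
    print(thresholds_consistent(sample))
```

<p>Run against the values described above (twenty seconds and ten seconds), this reports the settings as inconsistent.</p>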
]]></content:encoded>
			<wfw:commentRss>http://www.karlkatzke.com/detecting-and-resolving-lamp-stack-performance-problems/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Sun/Oracle OpenWorld &amp; Flash Storage</title>
		<link>http://www.karlkatzke.com/sunoracle-openworld-flash-storage/</link>
		<comments>http://www.karlkatzke.com/sunoracle-openworld-flash-storage/#comments</comments>
		<pubDate>Wed, 14 Oct 2009 03:09:14 +0000</pubDate>
		<dc:creator>karlkatzke</dc:creator>
				<category><![CDATA[sysadmin]]></category>
		<category><![CDATA[flash]]></category>
		<category><![CDATA[Sun]]></category>
		<category><![CDATA[zfs]]></category>

		<guid isPermaLink="false">http://www.karlkatzke.com/?p=532</guid>
		<description><![CDATA[At the Sun OpenWorld conference keynote today, there were a few new products listed in the Flash storage arena, most notably the F5100 that everyone&#8217;s jibber-jabbering about. 
As a smaller customer, I&#8217;m far more interested in the SunFlash F20 PCIe card &#8212; which I don&#8217;t see many people blogging about. Looks like I could [...]]]></description>
			<content:encoded><![CDATA[<p>At the <a href="http://www.cuddletech.com/blog/pivot/entry.php?id=1079">Sun OpenWorld conference keynote today</a>, there were a few new products listed in the Flash storage arena, most notably the <a href="http://www.c0t0d0s0.org/archives/6003-Sun-Storage-F5100-officially-announced.html">F5100 that everyone&#8217;s jibber-jabbering about</a>. </p>
<p>As a smaller customer, I&#8217;m far more interested in the <a href="http://www.sun.com/storage/disk_systems/sss/f20/">SunFlash F20 PCIe card</a> &#8212; which I don&#8217;t see many people blogging about. Looks like I could add that to not only my existing systems, but <b>non-Sun</b> systems that can make use of that sort of storage. <i>That</i>, ladies and germs, is something worth the name &#8220;OpenWorld&#8221; &#8212; as in, a world of open wallets.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.karlkatzke.com/sunoracle-openworld-flash-storage/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Reading/Googling List</title>
		<link>http://www.karlkatzke.com/readinggoogling-list/</link>
		<comments>http://www.karlkatzke.com/readinggoogling-list/#comments</comments>
		<pubDate>Tue, 22 Sep 2009 04:42:54 +0000</pubDate>
		<dc:creator>karlkatzke</dc:creator>
				<category><![CDATA[reading list]]></category>
		<category><![CDATA[sysadmin]]></category>
		<category><![CDATA[economics]]></category>
		<category><![CDATA[iPhone]]></category>
		<category><![CDATA[iSCSI]]></category>
		<category><![CDATA[recession]]></category>
		<category><![CDATA[san]]></category>
		<category><![CDATA[vmware]]></category>
		<category><![CDATA[vSphere]]></category>
		<category><![CDATA[webcam]]></category>

		<guid isPermaLink="false">http://www.karlkatzke.com/?p=523</guid>
		<description><![CDATA[vSphere, SAN or iSCSI-related:

Using iSCSI with vSphere &#8211; Pretty much the bible, they covered it all.
2TB drives are here, but Stephen Foskett identifies the issues with bringing them to the enterprise. In the same vein, he covers the death of RAID as a storage technology, and what lies beyond.
I need to research if our iSCSI [...]]]></description>
			<content:encoded><![CDATA[<ul>
<li>vSphere, SAN or iSCSI-related:
<ul>
<li><a href="http://virtualgeek.typepad.com/virtual_geek/2009/09/a-multivendor-post-on-using-iscsi-with-vmware-vsphere.html">Using iSCSI with vSphere</a> &#8211; Pretty much the bible, they covered it all.</li>
<li><a href="http://blog.fosketts.net/2009/08/14/2-tb-enterprise-drives/">2TB drives are here, but Stephen Foskett identifies the issues with bringing them to the enterprise.</a> In the same vein, he covers the <a href="http://blog.fosketts.net/2008/09/14/turning-page-raid/">death of RAID as a storage technology, and what lies beyond</a>.</li>
<li>I need to research if our <a href="http://www.vmwareinfo.com/2009/01/iscsi-hardware-or-software-how-many.html">iSCSI TOE cards are supported</a> by vSphere&#8230; </li>
<li><a href="http://blog.laspina.ca/">Ubiquitous Talk</a> might be my new favorite high-quality techie blog.</li>
</ul>
</li>
<li>Other:
<ul>
<li><a href="http://www.hightechdad.com/2009/08/12/webcam-monitoring-streaming-via-a-5-iphone-application-icam/">Streaming live webcams to your iPhone</a></li>
<li>This has been linked all over, but <a href="http://www.dailymail.co.uk/home/moslive/article-1212013/Revealed-The-ghost-fleet-recession-anchored-just-east-Singapore.html">the Ghost Fleet of the Recession</a> is anchored just off of Singapore, and it doesn&#8217;t look like it&#8217;s going anywhere soon. Sis wondered why she hadn&#8217;t seen the Florida in port recently; she ships a lot of containers with Maersk.</li>
</ul>
</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.karlkatzke.com/readinggoogling-list/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Dude, you&#8217;re not getting a Dell.</title>
		<link>http://www.karlkatzke.com/dude-youre-not-getting-a-dell/</link>
		<comments>http://www.karlkatzke.com/dude-youre-not-getting-a-dell/#comments</comments>
		<pubDate>Fri, 18 Sep 2009 03:39:28 +0000</pubDate>
		<dc:creator>karlkatzke</dc:creator>
				<category><![CDATA[sysadmin]]></category>
		<category><![CDATA[solaris]]></category>
		<category><![CDATA[Sun]]></category>
		<category><![CDATA[work]]></category>
		<category><![CDATA[zfs]]></category>

		<guid isPermaLink="false">http://www.karlkatzke.com/?p=521</guid>
		<description><![CDATA[Despite the recent pot-banging around the Sun/Oracle merger and the allegations that Sun&#8217;s getting its customer base stolen out from under it, I just pushed the button on a fairly large cluster with Sun as the hardware vendor. 
Simply put, I couldn&#8217;t find machines with better stats for the money. Even with the academic matching [...]]]></description>
			<content:encoded><![CDATA[<p>Despite the recent pot-banging around the <a href="http://www.informationweek.com/news/global-cio/interviews/showArticle.jhtml?articleID=220000577&#038;pgno=1&#038;queryText=&#038;isPrev=">Sun/Oracle merger</a> and the allegations that Sun&#8217;s getting its customer base stolen out from under it, I just pushed the button on a fairly large cluster with Sun as the hardware vendor. </p>
<p>Simply put, I couldn&#8217;t find machines with better stats for the money. Even with the academic matching grant program tabled for now, we STILL got amazing promotional pricing on the x4150. I can&#8217;t even find anything that can compare to an x4250 for on-board storage: 16 on-board drives. Dell&#8217;s MD1000 chassis supports only 15 drives. There&#8217;s no better hardware to run Solaris on. The Sun ILOM support is leagues better than Dell&#8217;s DRAC or even HP&#8217;s iLO. All the machines come with <i>at least</i> four on-board ethernet ports. The storage array options are also superior. No one else sells a 24-slot SATA chassis with hot-swap drives backed with three controllers.</p>
<p>Simply put, the Sun option was the fastest, most scalable option. The hardware is put together well, with the same sort of build quality you&#8217;ve come to expect from HP&#8230; far superior to Dell or IBM. And the management and tuning options are awesome. I&#8217;m really, really excited to see the hardware racked in a few weeks. They also maintain and stock a parts &#8220;locker&#8221;/cache on our campus so that a technician has access to all the parts they might need for our systems without having to courier them or drive for them. </p>
<p>Am I concerned about Sun going away? Not now that they&#8217;ve been bought by Oracle. They&#8217;ve got so many compelling offerings, and I hope that IW and other tech rags stop trashing Sun &#8212; I&#8217;m a fanboy from here on out. </p>
]]></content:encoded>
			<wfw:commentRss>http://www.karlkatzke.com/dude-youre-not-getting-a-dell/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Struggling with Budget SAN Speed</title>
		<link>http://www.karlkatzke.com/struggling-with-budget-san-speed/</link>
		<comments>http://www.karlkatzke.com/struggling-with-budget-san-speed/#comments</comments>
		<pubDate>Mon, 14 Sep 2009 05:22:10 +0000</pubDate>
		<dc:creator>karlkatzke</dc:creator>
				<category><![CDATA[sysadmin]]></category>
		<category><![CDATA[opensolaris]]></category>
		<category><![CDATA[raid]]></category>
		<category><![CDATA[san]]></category>
		<category><![CDATA[solaris]]></category>
		<category><![CDATA[storage]]></category>
		<category><![CDATA[zfs]]></category>

		<guid isPermaLink="false">http://www.karlkatzke.com/?p=517</guid>
		<description><![CDATA[It&#8217;s Monday morning. Your boss strolls into your office. You just finished with the trouble tickets from the weekend, and this is his favorite time to ruin your entire week. He says, &#8220;I have a project for you. I need a cluster with a primary and backup SAN that is going to store about 8TB [...]]]></description>
			<content:encoded><![CDATA[<p>It&#8217;s Monday morning. Your boss strolls into your office. You just finished with the trouble tickets from the weekend, and this is his favorite time to ruin your entire week. He says, &#8220;I have a project for you. I need a cluster with a primary and backup SAN that is going to store about 8TB of infrequently accessed images and it will also need to host virtual machines and an Oracle database. You&#8217;ll have to fit a budget for two sites in there, but the second site is a cold, hourly-synch backup. And it has to scale. And we&#8217;d prefer if you used a vendor solution and didn&#8217;t homebrew things.&#8221; </p>
<p>Talk about a list of contradictory feature requests! You&#8217;ve got a limited budget, and it&#8217;s hard to squeeze 8 usable TB out of your average entry-level 12-disk array (e.g. the HP MSA60 or Sun J4200 disk array, with Dell&#8217;s 15-disk MD1000 and AC&#038;NC&#8217;s 16-disk 516-series being notable exceptions) when you factor in a double parity stripe and a couple of hot spares. </p>
<p>In most cases, you&#8217;ll do just fine. What happens when the load on the infrequently accessed (slow) portion of the array is &#8216;peaky&#8217;, though? During one of those peaks, you&#8217;ll max out a gigabit-per-second line in. Depending on what you&#8217;ve got driving the array, that might be your entire bandwidth budget. What&#8217;s Oracle, which is also running in one of the VMs, going to do then? It doesn&#8217;t like having slow access to its log files, which means it&#8217;ll be consuming RAM and swapping heavily on its VM, which means the VM image will also be trying to write to disk. Triple-whammy until something gives: either load decreases or something fails. </p>
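<p>The capacity arithmetic is quick to sketch. This assumes 1TB disks, a single double-parity group (two disks&#8217; worth of parity), and two hot spares; the function names are made up, and real arrays lose a little more to filesystem overhead.</p>

```python
def usable_capacity_tb(total_disks, disk_tb=1.0, parity_disks=2, hot_spares=2):
    """Capacity left for data after parity and hot spares, in vendor TB."""
    data_disks = total_disks - parity_disks - hot_spares
    if not data_disks > 0:
        raise ValueError("no disks left for data")
    return data_disks * disk_tb


def tb_to_tib(tb):
    """Convert vendor terabytes (10**12 bytes) to the TiB (2**40 bytes) an OS reports."""
    return tb * 10**12 / 2**40


if __name__ == "__main__":
    for disks in (12, 15, 16):
        tb = usable_capacity_tb(disks)
        print(disks, "disks:", tb, "TB, about", round(tb_to_tib(tb), 2), "TiB")
```

<p>Under those assumptions a 12-disk shelf lands at exactly 8 TB on paper and about 7.28 TiB as the operating system counts it, which is why the 15- and 16-disk chassis are the comfortable choices.</p>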
<p>The obvious choice is ZFS and Solaris. And the obvious choice for hardware is also Sun; you get four NICs by default on Sun hardware, with management ports and ILO ports out of band on their own interfaces. (Side note: When you have 7 Cat5e cables, a KVM dongle, and 2 power cords running to a 1u chassis, yes, you really do want the cable management arm.) ZFS support with Sun is excellent. Their storage products are also excellent. </p>
<p>By the time you get done buying storage, you&#8217;re through most of your budget: those 1TB disks aren&#8217;t cheap, and neither are the arrays themselves. Your maximum speed across the SAS backplane for the J4200 or J4400 series is going to be 3 or 6 Gb/s, and your input is only going to be 1 Gb/s actual even with bonded ports, but you&#8217;d probably rather not skip all over the place on the array as you try to write 3,000 10GB (compressed) images and then try to write to the Oracle logs. The question still remains: how do you squeeze in a budget for some faster storage for the VM images and database storage while still paying for the bulk storage you need and some room to grow? </p>
<p>Answer: What are you using to drive that array? Buy a bigger chassis, and put the fast storage inboard. The 2.5&#8243; 10k SAS drives aren&#8217;t hideously expensive, and the additional grand for a larger chassis beats the hell out of buying an entire extra J4200. Note that you can&#8217;t mix the 10k SAS disks in an array with the 7,200 RPM SATA disks&#8230; on any vendor that I know of, at least. But inboard on the system&#8217;s backplane, you can run SAS and then run SATA on the outside. </p>
<p>Bonus points: This may not last, and it might just be the academic pricing that we get at work, but right now I can buy a half-full J4400 (24 disks) for less than I can buy a fully-loaded J4200. Guess which we&#8217;re getting? It&#8217;ll be half full of blanks, but those are free. As our 8TB grows over the next year, we&#8217;re going to just slot additional disks in. ZFS&#8217;s ability to add disks to pools relatively painlessly has made this a realistic goal. ZFS also has a built-in management server (which we&#8217;ll restrict to our private network and people will have to VPN in, but that&#8217;s trivial&#8230;) which makes management&#8217;s acceptance of the technology dead simple. </p>
<p>Also, don&#8217;t forget that if you can acquire some SSDs, you&#8217;ll be able to drive your storage even faster by offloading the ZFS log writes to the much-faster SSD. SSDs have a limited lifespan, though, and a log device is a write-intensive application, so consider whether it&#8217;s really worth it to you and make sure that you plan for their wear-out and replacement. </p>
<p>Our total server budget for this project (a compute-/storage-intensive academic project where data loss is not acceptable) was only $70k. We managed to squeeze an insanely fast cluster out of a paltry budget. </p>
]]></content:encoded>
			<wfw:commentRss>http://www.karlkatzke.com/struggling-with-budget-san-speed/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>nfs4, ldap authentication, and idmapd</title>
		<link>http://www.karlkatzke.com/nfs4-ldap-authentication-and-idmapd/</link>
		<comments>http://www.karlkatzke.com/nfs4-ldap-authentication-and-idmapd/#comments</comments>
		<pubDate>Mon, 27 Jul 2009 17:15:51 +0000</pubDate>
		<dc:creator>karlkatzke</dc:creator>
				<category><![CDATA[linux]]></category>
		<category><![CDATA[sysadmin]]></category>
		<category><![CDATA[idmap]]></category>
		<category><![CDATA[idmapd]]></category>
		<category><![CDATA[nfs]]></category>
		<category><![CDATA[nfsv3]]></category>
		<category><![CDATA[nfsv4]]></category>
		<category><![CDATA[sles]]></category>
		<category><![CDATA[sles10]]></category>
		<category><![CDATA[SLES11]]></category>

		<guid isPermaLink="false">http://www.karlkatzke.com/?p=477</guid>
		<description><![CDATA[We&#8217;re trying not to use &#8216;old stuff&#8217; as we&#8217;re building out our new cluster, but we have a big need for nfs or some other ad-hoc shared filesystem designed for high i/o on content servers. We&#8217;d been using ocfs2, but it&#8217;s slower than molasses and doesn&#8217;t scale n-ward as you increase the number of systems [...]]]></description>
			<content:encoded><![CDATA[<p>We&#8217;re trying not to use &#8216;old stuff&#8217; as we&#8217;re building out our new cluster, but we have a big need for nfs or some other ad-hoc shared filesystem designed for high i/o on content servers. We&#8217;d been using ocfs2, but it&#8217;s slower than molasses and doesn&#8217;t scale n-ward as you increase the number of systems attached to a filesystem (due to the need for a journal for each node), whereas we can mount as many as we can support if our nfs server&#8217;s hardware will tolerate it. </p>
<p>Anyway, so the preference against &#8216;old stuff&#8217; means that we&#8217;d shy away from the nfs v2 and v3 that are well-documented and stable on linux, and towards the hairy, thorny wilds of nfs 4. There&#8217;s a multitude of websites about nfs4, but they all seem to be incomplete or to apply to Solaris&#8217;s implementation, which is thorough and well-documented. </p>
<p>And don&#8217;t mistake me, nfs4 does run. It runs with TCP, it runs quickly, and we haven&#8217;t run into any issues save one &#8212; mapping users between servers. With nfsv3, between two servers it <i>just works</i>. With nfs4, you have to have a shared user authentication system and idmapd has to be running and configured correctly. </p>
<p>Idmapd is essentially undocumented on linux. Or, if there is documentation, I have not been able to find it. There is a man page giving basic options for the daemon. It <i>seems</i> that a configuration file syntax guide is living in <code>/usr/share/doc/packages/nfsidmap/README</code>, but I can&#8217;t verify that. The configuration file man page states that only Nobody-User and Nobody-Group are permitted in the [Mapping] area. </p>
<p>For what it&#8217;s worth, the following configuration is working for me on SLES11. </p>
<ol>
<li>Get some sort of shared authentication working between the servers. Since we&#8217;re a Novell shop, we&#8217;re doing it on eDirectory with the &#8216;linux user&#8217; option enabled on the accounts, which assigns a uid and gid to the user.</li>
<li>Set the domain in idmapd.conf to the same value on each server. (Ours, predictably, is &#8216;tamu.edu&#8217;.)</li>
<li>Add the mount point to /etc/exports on the server. Don&#8217;t forget that if you&#8217;re using nfs4 you need to bind the mount point on the server inside of the pseudofilesystem, and then set the pseudofs in the /etc/exports file as fsid=0. Start the server, and make sure that idmapd runs.</li>
<li>On the client, set the domain in idmapd.conf to your domain, add the mount point to the fstab, and then start the nfs service. Double check to make sure that idmapd is running.</li>
</ol>
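<p>Since a mismatched domain is the easiest thing to get wrong in the steps above, here is a minimal sketch of a sanity check. The two configuration texts are made-up samples (on a real system you would read each host&#8217;s idmapd.conf), and the helper names are hypothetical; it only assumes the file uses the INI-style [General] and [Mapping] sections described in the man page.</p>

```python
import configparser

# Hypothetical idmapd.conf contents for a server and a client.
SERVER_CONF = """
[General]
Domain = tamu.edu

[Mapping]
Nobody-User = nobody
Nobody-Group = nobody
"""

CLIENT_CONF = """
[General]
Domain = tamu.edu
"""


def idmapd_domain(conf_text):
    """Return the Domain value from the [General] section of idmapd.conf text."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    return parser.get("General", "Domain")


def domains_match(*conf_texts):
    """True when every supplied configuration names the same domain."""
    return len({idmapd_domain(text) for text in conf_texts}) == 1


print(domains_match(SERVER_CONF, CLIENT_CONF))
```

<p>This only compares the domain setting; it says nothing about whether idmapd is actually running on either machine.</p>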
<p>Notes: </p>
<ul>
<li>I have root squashing enabled, but don&#8217;t set the nobody-user or nobody-group on the server. I don&#8217;t know what effect it would have if I did&#8230; haven&#8217;t tried. Need to move on. </li>
<li>SLES10sp2 doesn&#8217;t start idmapd when you start the nfs service; you need to set it to start manually. </li>
<li>You could probably manually create users and manually set their uid/gid specifically. Again, I did not try this since we already had a solution in place to manage it. We just run our web servers and other clients as specific users that are defined in our ldap tree but disallowed logins. As a side bonus, our Novell logging infrastructure logs attempted logins/accesses for those user IDs.</li>
</ul>
<p>Write a file as a user to the nfs mount on the client, and then check it on the server (and vice versa). It should show up as the same uid and gid &#8212; if you see an exceptionally long one, or get &#8216;nobody&#8217;, it&#8217;s not working for you. I&#8217;m sorry, but I don&#8217;t have time right now to hack around and try different ways to get it to work!</p>
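<p>If you would rather script that check than eyeball <code>ls</code> output, something like the following sketch would do. The helper name is hypothetical, and the demo runs against a local temporary file only so that it is self-contained; in practice you would point it at a file on the nfs mount and pass the uid and gid your directory assigns to that user.</p>

```python
import os
import tempfile


def ownership_matches(path, expected_uid, expected_gid):
    """True when the file at path is owned by the expected uid and gid."""
    info = os.stat(path)
    return (info.st_uid, info.st_gid) == (expected_uid, expected_gid)


# Demo on a local temporary file standing in for a file on the nfs mount.
with tempfile.NamedTemporaryFile() as handle:
    info = os.stat(handle.name)
    print("uid:", info.st_uid, "gid:", info.st_gid)
    print("owned by current user:",
          ownership_matches(handle.name, os.geteuid(), info.st_gid))
```

<p>Run it on both the client and the server against the same file; if the two sides disagree, or either one reports &#8216;nobody&#8217;, the mapping is not working.</p>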
]]></content:encoded>
			<wfw:commentRss>http://www.karlkatzke.com/nfs4-ldap-authentication-and-idmapd/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Wolfram Jealousy</title>
		<link>http://www.karlkatzke.com/wolfram-jealousy/</link>
		<comments>http://www.karlkatzke.com/wolfram-jealousy/#comments</comments>
		<pubDate>Sat, 16 May 2009 01:07:42 +0000</pubDate>
		<dc:creator>karlkatzke</dc:creator>
				<category><![CDATA[sysadmin]]></category>

		<guid isPermaLink="false">http://www.karlkatzke.com/?p=468</guid>
		<description><![CDATA[As a sysadmin who&#8217;s been getting into clustered virtualized hardware stuff, I&#8217;m unbelievably jealous of Wolfram Alpha&#8217;s custom Dell hardware (note: Youtube video with horrible music, I suggest hitting mute) &#8212; it&#8217;s a 2u, quad-board, dual-socket, quad-core system. You fit four servers into 2U of space. It&#8217;s essentially one of our 1950 or sc1435 virtualization [...]]]></description>
			<content:encoded><![CDATA[<p>As a sysadmin who&#8217;s been getting into clustered virtualized hardware stuff, I&#8217;m unbelievably jealous of <a href="http://blog.wolframalpha.com/2009/04/30/rack-n-roll/">Wolfram Alpha&#8217;s custom Dell hardware</a> (note: Youtube video with horrible music, I suggest hitting mute) &#8212; it&#8217;s a 2u, quad-board, dual-socket, quad-core system. You fit four servers into 2U of space. It&#8217;s essentially one of our 1950 or sc1435 virtualization hosts, but in 2u of space with an infiniband backplane. </p>
<p>I wish Dell would make toys like this publicly available. They&#8217;ve already done the design and engineering work, and I think there&#8217;s a huge market for this type of unit given Dell&#8217;s complete and utter failure in the blade market. </p>
]]></content:encoded>
			<wfw:commentRss>http://www.karlkatzke.com/wolfram-jealousy/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>LSB -&gt; OCF:HB compatibility</title>
		<link>http://www.karlkatzke.com/lsb-ocfhb-compatibility/</link>
		<comments>http://www.karlkatzke.com/lsb-ocfhb-compatibility/#comments</comments>
		<pubDate>Mon, 11 May 2009 15:39:54 +0000</pubDate>
		<dc:creator>karlkatzke</dc:creator>
				<category><![CDATA[linux]]></category>
		<category><![CDATA[sysadmin]]></category>
		<category><![CDATA[lsb]]></category>
		<category><![CDATA[ocf]]></category>
		<category><![CDATA[pacemaker]]></category>

		<guid isPermaLink="false">http://www.karlkatzke.com/?p=462</guid>
		<description><![CDATA[You can always use LSB scripts (/etc/init.d) with pacemaker, but it&#8217;s better to make them LSB-compliant&#8230; 
]]></description>
			<content:encoded><![CDATA[<p>You can always use LSB scripts (/etc/init.d) with pacemaker, but it&#8217;s better to make them <a href="http://www.linux-ha.org/LSBResourceAgent">LSB-compliant</a>&#8230; </p>
]]></content:encoded>
			<wfw:commentRss>http://www.karlkatzke.com/lsb-ocfhb-compatibility/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
