• 28Jul
    Author: trent Categories: Infrastructure Comments: 4

    TCP header, Lego (tm) style

    The older I get, the more lessons I seem to learn (or not learn) over and over. Have you ever seen TCP offload work correctly? Of course not! I’ve been bitten by a TCP offload (aka TCP Offload Engine, or TOE) problem in just about every environment I’ve touched in the last 20 years, and sadly this week was no exception.

    To make a long story short, we have a production VMware ESXi 4.1 host with both Linux (CentOS) and Windows Server 2008 guests. No problems were reported (or measured) with the Linux guests, but the Windows Server 2008 guests suffered from extremely choppy network connections, including lost connections, for common services like Remote Desktop and backups. As you probably know, I’m big on actually investigating the underlying cause of a problem rather than randomly throwing darts at it, so I grabbed some packet traces with Wireshark. Check this out:

    Read more »

  • 20Jul
    Author: ben Categories: IT Management, Ramblings Comments: 0

    I know you’ve all been waiting with bated breath for this day: UNIX and Linux System Administration Handbook, 4th edition, is finally out! More than two years in the making, this edition covers six major operating systems in 1300 pages of fresh deliciousness. Plenty of new topics, including virtualization, green IT, scripting, and modern storage and security. Copies available at Amazon, Barnes & Noble, or from Pearson Education.

    Writing a book of this magnitude is an intense process, as I learned firsthand. The steps to produce the book, from inception to dead trees, include:

    • Brainstorm and agree on full topic list
    • Brainstorm and agree on contributing authors
    • Assign chapters to authors and contributors
    • Write chapter, distribute for review
    • Integrate reviewed comments from all other authors, distribute to external reviewers
    • Integrate external review comments
    • Repeat for all 32 chapters
    • Edit chapters
    • Index chapters individually
    • Engage artist (Lisa Haney) for new chapter cartoons, dividers and cover art
    • Engage outside organizations (IBM, Sun, HP) for test equipment
    • Regular (semi-weekly) meetings with authors, occasional meetings with publisher
    • Read and revise page proofs, searching for any obvious errors or inconsistencies
    • Deliver final manuscript to publisher and wait patiently

    One of the biggest challenges in producing this edition was the distributed collaboration effort. We Skyped regularly to stay in sync. Evi was around for much of the development, but we also corresponded with her while she was sailing in the Caribbean and the Pacific. We kept the Adobe FrameMaker source files in a Subversion repository to avoid stomping on each other’s work. I’d say this met with mixed success; Frame’s binary files are hard to merge, despite Garth’s valiant efforts at a scripted solution.

    Special thanks to our named and unnamed contributors whose efforts are highly appreciated and certainly worthy of recognition. This is the best edition yet!

  • 09Jul

    The latest version of The Barking Seal is here, and it is filled with a variety of applicable and accessible treats. Want some? Keep reading for a taste…

    Goodie #1: Learn why version control is important for all businesses across the board.

    Goodie #2: Get some assistance in deciding “Git or Subversion? Git or Subversion? Git…?”

    Goodie #3 (otherwise known as the cherry on top): Meet Jim Turpin, one of our fabulous network engineers, who embodies the concept of multi-discipline to a T both inside and outside of the office.

    Click here to read Q3 2010, and, as always, enjoy the treat!

    We’d love to hear from you, so please post your comments and questions here.

  • 06Jul

    The fine folks at Twitter Engineering recently posted about the performance issues they have had over the holiday weekend. Since Saturday, the site has been slow for users and API calls. While AppliedTrust hasn’t (yet) made the leap to Twitter, we recognize how important it is for delivering World Cup news. I give Twitter Engineering tons of credit for being so transparent about the details of the problem – they say:

    In brief, we made three mistakes:
    * We put two critical, fast-growing, high-bandwidth components on the same segment of our internal network.
    * Our internal network wasn’t appropriately being monitored.
    * Our internal network was temporarily misconfigured.

    Twitter is well known for great application-layer monitoring and instrumentation, so this gap in monitoring is a surprise. It exposes a common misconception among social software companies: that their server and network infrastructure is “covered” by their hosting provider. As web applications scale to even 1/1000 the size of Twitter, the software becomes critically dependent on the underlying network. Infrastructure should be instrumented and monitored at least as closely as the software that depends on it.
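
    To make "instrument the network" a little more concrete, here is a minimal sketch of a TCP connect-latency probe in Python. It is not what Twitter (or anyone in particular) runs; the function names and the 0.2 second warning threshold are assumptions chosen for illustration, and a real deployment would feed results like these into an existing monitoring system.

```python
import socket
import time
from typing import Optional


def tcp_connect_latency(host: str, port: int, timeout: float = 2.0) -> Optional[float]:
    """Return the seconds taken to open a TCP connection, or None on failure.

    Hypothetical helper for illustration only.
    """
    start = time.monotonic()
    try:
        # create_connection resolves the host and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        # Covers refused connections, timeouts, and resolution failures.
        return None


def classify(latency: Optional[float], warn_after: float = 0.2) -> str:
    """Map a probe result to a coarse status (threshold is an assumed value)."""
    if latency is None:
        return "DOWN"
    return "SLOW" if latency > warn_after else "OK"
```

    Run on a schedule against each internal segment (for example, `classify(tcp_connect_latency(host, port))` for every service endpoint), a probe like this gives an early hint of the kind of internal-network trouble described above, independent of application-layer metrics.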
