• 20Aug
    Author: chris Categories: Infrastructure, Ramblings Comments: 0

    If you’ve been reading the Barking Seal for a while, you probably already know that we use Nagios to monitor a variety of things here at AppliedTrust.  It’s been a great platform for us, and we’ve put a lot of time and energy into writing custom plugins, integrating performance graphing tools, and generally making it work for us.  I’m a huge fan of Nagios because of its stability, openness (the documentation is truly excellent), and flexibility.  Historically, though, Nagios has had two weak spots: auto-discovery and usability.  Unfortunately, the first issue is likely to be with us for quite some time: it seems that the process of going out over the network and automatically discovering and configuring servers and services is just too complex for the moment.  Or perhaps it just hasn’t been worthwhile for anyone to solve that problem yet… either way, I haven’t found a good solution as of now.  But amazingly, the last couple of years have seen a lot of development in the second area.  There are now several different open source projects that provide easy-to-use GUIs for configuring Nagios!
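
    For readers who haven’t written one, a custom Nagios plugin is simply a program that prints a one-line status and exits with 0 (OK), 1 (WARNING), 2 (CRITICAL), or 3 (UNKNOWN).  Here is a minimal sketch of that convention in Python.  The function names and the 80/90 percent thresholds are made up for illustration; this is not one of our production plugins.

```python
import shutil
import sys

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def classify(used_pct, warn=80.0, crit=90.0):
    """Map a percent-used value to a status code (thresholds are illustrative)."""
    if used_pct >= crit:
        return CRITICAL
    if used_pct >= warn:
        return WARNING
    return OK


def check_disk(path="/"):
    """Return (status_code, status_line) for the filesystem containing path."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        return UNKNOWN, f"DISK UNKNOWN - {exc}"
    if usage.total == 0:
        return UNKNOWN, f"DISK UNKNOWN - {path} reports zero size"
    used_pct = 100.0 * usage.used / usage.total
    code = classify(used_pct)
    # Text after the '|' is performance data that graphing add-ons can consume.
    line = f"DISK {LABELS[code]} - {path} {used_pct:.1f}% used | used_pct={used_pct:.1f}%"
    return code, line


def main(argv):
    """Print the status line and return the exit code Nagios should see."""
    code, line = check_disk(argv[1] if len(argv) > 1 else "/")
    print(line)
    return code

# As a plugin entry point, finish the script with: sys.exit(main(sys.argv))
```

    Nagios only looks at the exit code and the first line of output, so the same skeleton should work for just about any check you can express as a function returning a status and a message.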

    Centreon Configuration Screen


  • 16Aug

    We’ve blogged in the past about Nagios, the open source monitoring software. Another great open source alternative is Cacti. Both Nagios and Cacti do a great job of graphing system resources.

    Like Nagios, Cacti is capable of monitoring your servers, as well as processor, memory, network, and disk utilization on your networking devices. After initial installation, adding hosts to be monitored can be completed using nothing but the web interface.

  • 13Aug

    Network World is reporting that 2010 will be the year of Open Source.  According to the article, half of the organizations surveyed are using Open Source Software in some capacity already, and the vast majority (71% in the US) said they are planning to greatly increase their use of OSS in the coming year.

    More interestingly, perhaps, many of the participants in the survey reported that they are switching to OSS for non-financial reasons such as:


  • 28Jul
    Author: trent Categories: Infrastructure Comments: 4

    TCP header, Lego (tm) style

    The older I get, the more lessons I seem to learn (or not learn) over and over.  Have you ever seen TCP offload work correctly?  Of course not!  I’ve been bitten by a TCP offload (aka TCP Offload Engine or TOE) problem in just about every environment I’ve touched in the last 20 years, and sadly this week was no exception.

    To make a long story short, we have a production VMware ESXi 4.1 host with both Linux (CentOS) and Windows Server 2008 guests.  No problems were reported (or measured) with the Linux guests, but the Windows 2008 guests suffered from extremely choppy network connections (including lost connections) for common services like Remote Desktop and backups.  As you probably know, I’m big into actually investigating the underlying cause of a problem rather than randomly throwing darts at it, and as such I grabbed some packet traces with Wireshark.  Check this out:


  • 20Jul
    Author: ben Categories: IT Management, Ramblings Comments: 0

    I know you’ve all been waiting with bated breath for this day:  UNIX and Linux System Administration Handbook, 4th edition, is finally out! More than two years in the making, this edition covers six major operating systems in 1300 pages of fresh deliciousness. It has plenty of new topics, including virtualization, green IT, scripting, and modern storage and security. Copies are available at Amazon, Barnes & Noble, or from Pearson Education.

    Writing a book of this magnitude is an intense process, as I learned firsthand. The steps to produce the book, from inception to dead trees, include:

    • Brainstorm and agree on full topic list
    • Brainstorm and agree on contributing authors
    • Assign chapters to authors and contributors
    • Write chapter, distribute for review
    • Integrate reviewed comments from all other authors, distribute to external reviewers
    • Integrate external review comments
    • Repeat for all 32 chapters
    • Edit chapters
    • Index chapters individually
    • Engage artist (Lisa Haney) for new chapter cartoons, dividers and cover art
    • Engage outside organizations (IBM, Sun, HP) for test equipment
    • Regular (semi-weekly) meetings with authors, occasional meetings with publisher
    • Read and revise page proofs, searching for any obvious errors or inconsistencies
    • Deliver final manuscript to publisher and wait patiently

    One of the biggest challenges in producing this edition was the distributed collaboration effort. We Skyped regularly to stay in sync. Evi was around for much of the development, but we also corresponded with her while she was sailing in the Caribbean and the Pacific. We used a Subversion repository for the Adobe FrameMaker source files to avoid stomping on each other’s work. I’d say this met with mixed success; Frame’s binary files are hard to merge, despite Garth’s valiant efforts at a scripted solution.

    Special thanks to our named and unnamed contributors whose efforts are highly appreciated and certainly worthy of recognition. This is the best edition yet!

  • 09Jul

    The latest version of The Barking Seal is here, and it is filled with a variety of applicable and accessible treats.  Want some? Keep reading for a taste…

    Goodie #1: Learn why version control is important for all businesses across the board.

    Goodie #2: Get some assistance in deciding “Git or Subversion? Git or Subversion? Git…?”

    Goodie #3 (otherwise known as the cherry on top): Meet Jim Turpin, one of our fabulous network engineers, who embodies the concept of multi-discipline to a T both inside and outside of the office.

    Click here to read Q3 2010, and, as always, enjoy the treat!

    We’d love to hear from you, so please post your comments and questions here.

  • 06Jul

    The fine folks at Twitter Engineering recently posted about the performance issues they had over the holiday weekend. Since Saturday, the site has been slow for users and API calls. While AppliedTrust hasn’t (yet) made the leap to Twitter, we recognize how important it is for delivering World Cup news. I give Twitter Engineering tons of credit for being so transparent about the details of the problem – they say:

    In brief, we made three mistakes:
    * We put two critical, fast-growing, high-bandwidth components on the same segment of our internal network.
    * Our internal network wasn’t appropriately being monitored.
    * Our internal network was temporarily misconfigured.

    Twitter is well known for great application-layer monitoring and instrumentation, so this gap in monitoring is a surprise. It exposes a common misconception among social software companies – that their server and network infrastructure is “covered” by their hosting provider.  As web applications scale to even 1/1000 the size of Twitter, software becomes critically interdependent on the underlying network. Infrastructure should be instrumented and monitored at least as closely as the software that depends on it.
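
    As one illustration of what network-level instrumentation can look like, the sketch below turns two samples of an interface byte counter (the kind of value typically exposed over SNMP) into a utilization percentage and flags a busy segment.  The function names, the sample values, and the 80 percent threshold are assumptions for illustration only; none of them come from Twitter’s post.

```python
def utilization_pct(octets_start, octets_end, interval_s, link_bps, counter_bits=64):
    """Percent utilization of a link between two byte-counter samples."""
    if interval_s <= 0 or link_bps <= 0:
        raise ValueError("interval_s and link_bps must be positive")
    delta = octets_end - octets_start
    if delta < 0:
        # The counter wrapped around between the two samples.
        delta += 2 ** counter_bits
    # Bytes -> bits, then divide by what the link could have carried.
    return 100.0 * (delta * 8) / (interval_s * link_bps)


def is_saturated(pct, threshold_pct=80.0):
    """True when utilization meets or exceeds the (illustrative) threshold."""
    return pct >= threshold_pct


# Hypothetical samples: 4.5e9 bytes moved in 60 seconds on a 1 Gb/s link.
pct = utilization_pct(1_000_000_000, 5_500_000_000, 60, 1_000_000_000)
print(f"{pct:.1f}% utilized, saturated={is_saturated(pct)}")
```

    Feeding a number like this into the same alerting system that watches the application is one way to notice when two fast-growing components start competing for the same segment.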


  • 30Jun
    Author: ned Categories: Infrastructure Comments: 0

    This month, AppliedTrust re-launched our web site on the Drupal CMS. Although the “look and feel” of the site hasn’t changed much, this upgrade has been a breakthrough in terms of both performance and manageability. I would give our previous CMS, Joomla, a grade of B- in comparison to Drupal’s solid A. Here are six reasons why Drupal is a great fit for www.appliedtrust.com:


  • 24Jun

    Saturday morning I was up and out the door early for a long run before the heat set in too much. As I was running I was thinking to myself, “Gosh, having a good exercise routine is kind of like having a good information security program.” I had lots of time to ponder this particular issue, as my iPod was unfortunately not charged and I had no one to talk to. Here are a few things I thought of that make exercise and security so alike.

    1) Set goals: Both in exercise and in information security, it is good to set goals. For example, before I can write up a training plan for myself, I need to know what race I’m training for, and what my target pace is. Similarly, before I can write up my information security plan, I need to know what information I need to protect and how much protection I need (is this credit card data, or is it records of what color paint my store sold last year?)


  • 04Jun
    Author: ned Categories: IT Management, Ramblings Comments: 1

    Sadly, disasters happen, and when they do there are often valuable lessons to be learned. Unfortunately, poor IT infrastructure will limit the lessons the oil industry can learn from this incident.

    The Deepwater Horizon rig was equipped with a vessel management system (VMS), which records dozens of different metrics about the conditions on the rig and in the well. These VMS logs would contain valuable details about the blowout, much as an airplane’s “black box” is essential to understanding a plane crash.
