• 06Jul

    The fine folks at Twitter Engineering recently posted about the performance issues they have had over the holiday weekend. Since Saturday, the site has been slow for users and API calls. While AppliedTrust hasn’t (yet) made the leap to Twitter, we recognize how important it is for delivering World Cup news. I give Twitter Engineering tons of credit for being so transparent about the details of the problem – they say:

    In brief, we made three mistakes:
    * We put two critical, fast-growing, high-bandwidth components on the same segment of our internal network.
    * Our internal network wasn’t appropriately being monitored.
    * Our internal network was temporarily misconfigured.

    Twitter is well known for great application-layer monitoring and instrumentation, so this gap in monitoring is a surprise. It exposes a common misconception among social software companies – that their server and network infrastructure is “covered” by their hosting provider. As web applications scale to even 1/1000 the size of Twitter, the software becomes critically dependent on the underlying network. Infrastructure should be instrumented and monitored at least as closely as the software that depends on it.

    For more articles from The Barking Seal on monitoring and troubleshooting, see:

  • 02Aug
    Author: ned Categories: Infrastructure, Security Comments: 13

    Virtual Private Networks (VPNs) offer a way to securely connect different locations that are each connected to the Internet. Internet VPNs are far cheaper than private lines leased from a telco, but unfortunately they are often much less reliable. When an Internet VPN “drops”, the distant offices can no longer communicate. As network administrators, we want to know right away so we can fix it before our users notice anything!

    This post shows one way to monitor site-to-site VPNs configured on a Cisco ASA firewall using SNMP and Nagios.
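
    To give a flavor of what such a check can look like, here is a minimal sketch of a Nagios-style plugin in Python. It is not the method from the full post, just an illustration under stated assumptions: it shells out to the Net-SNMP `snmpwalk` command, and it assumes the ASA exposes the remote peer address of each active IKE tunnel under `cikeTunRemoteValue` (OID 1.3.6.1.4.1.9.9.171.1.2.3.1.7 in CISCO-IPSEC-FLOW-MONITOR-MIB). Verify that OID against your own device before relying on it. The function names and the usage string are made up for this example; only the exit codes (0 = OK, 2 = CRITICAL, 3 = UNKNOWN) follow the standard Nagios plugin convention.

```python
import subprocess

# Assumption: cikeTunRemoteValue from CISCO-IPSEC-FLOW-MONITOR-MIB.
# Confirm with a manual snmpwalk against your ASA before using this.
PEER_OID = "1.3.6.1.4.1.9.9.171.1.2.3.1.7"

# Standard Nagios plugin exit codes.
OK, CRITICAL, UNKNOWN = 0, 2, 3


def parse_peers(snmpwalk_output):
    """Return the set of peer addresses from `snmpwalk -Oqv` output.

    With -Oqv, snmpwalk prints one bare value per line; string values
    may be wrapped in double quotes, which are stripped here.
    """
    peers = set()
    for line in snmpwalk_output.splitlines():
        value = line.strip().strip('"')
        if value:
            peers.add(value)
    return peers


def check_tunnel(active_peers, expected_peer):
    """Map 'is the expected peer present?' to a Nagios status and message."""
    if expected_peer in active_peers:
        return OK, f"OK - VPN tunnel to {expected_peer} is up"
    return CRITICAL, f"CRITICAL - no active VPN tunnel to {expected_peer}"


def walk_peers(host, community):
    """Query the firewall with snmpwalk (SNMP v2c) and return active peers."""
    result = subprocess.run(
        ["snmpwalk", "-v2c", "-c", community, "-Oqv", host, PEER_OID],
        capture_output=True, text=True, timeout=10, check=True,
    )
    return parse_peers(result.stdout)


def main(argv):
    """Hypothetical usage: check_asa_vpn.py HOST COMMUNITY PEER_IP"""
    if len(argv) != 4:
        print("UNKNOWN - usage: check_asa_vpn.py HOST COMMUNITY PEER_IP")
        return UNKNOWN
    host, community, peer = argv[1:4]
    try:
        peers = walk_peers(host, community)
    except (OSError, subprocess.SubprocessError) as exc:
        # Covers a missing snmpwalk binary, timeouts, and non-zero exits.
        print(f"UNKNOWN - SNMP query failed: {exc}")
        return UNKNOWN
    status, message = check_tunnel(peers, peer)
    print(message)
    return status
```

    To use it as a real plugin, end the file with `import sys` and `sys.exit(main(sys.argv))` so that Nagios sees the exit code. Parsing and decision logic are kept separate from the SNMP call so they can be tested without a firewall on hand.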
