Customer

ANZ Bank

Description

This client wanted to proactively monitor its web servers and resolve failures in a completely automated way, with no team involvement.

  • Proactively monitor 10 different URLs (via synthetic HTTP transactions)

  • Alert when 3 of these URLs fail

  • Automate creation of a ticket in ServiceNow

  • Auto-restart the 3 failing web servers

Solution

  • Proactively monitor 10 different URLs (via synthetic HTTP transactions)

    • Add 10 web servers in Data Sources (germainAPM workspace > left menu > Data Sources)

    • Add 10 “http scenarios” (germainAPM workspace > left menu > wizard > Http Scenario Component Deployment)

    • Set up a single KPI to match the synthetic transactions generated by the HTTP monitors (germainAPM workspace > Analytics > KPIs; Add New Configuration)

    • Create a fact-based SLA to color individual synthetic transactions based on their success / failure state

  • Alert when 3 of these URLs fail

    • Create an aggregate SLA that triggers when 3 or more servers are not reachable. The schedule of the aggregate SLA should match the one selected for the HTTP monitors (for example, if your monitors connect every 5 minutes, also evaluate the SLA every 5 minutes)

  • Automate creation of a ticket in ServiceNow

    • As the final step, select your ServiceNow (SNOW) HTTP Action as the action to execute when there are 3 or more failures

  • Auto-restart the 3 failing web servers

    • Once the SLA has been created, you can set up an action to restart the web servers in case of failure. (germainAPM workspace > Automation > SSH; Add New Configuration)

The exact restart command depends on your OS and web server.
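As an illustration of how the command differs by platform, the sketch below maps a few common OS and web server combinations to their usual restart commands. The `RESTART_COMMANDS` table and the `restart_command` helper are hypothetical names used only for this example and are not part of germainAPM. Verify each command against your own environment before using it in an SSH action.

```python
# Illustrative lookup of restart commands by (OS family, web server).
# The keys and helper name are assumptions made for this example.
RESTART_COMMANDS = {
    ("debian", "apache"): "sudo systemctl restart apache2",
    ("rhel", "apache"): "sudo systemctl restart httpd",
    ("debian", "nginx"): "sudo systemctl restart nginx",
    ("rhel", "nginx"): "sudo systemctl restart nginx",
    ("windows", "iis"): "iisreset /restart",
}


def restart_command(os_family: str, web_server: str) -> str:
    """Return the restart command for a platform, or raise ValueError."""
    key = (os_family.lower(), web_server.lower())
    try:
        return RESTART_COMMANDS[key]
    except KeyError:
        raise ValueError(f"no restart command known for {key}") from None


if __name__ == "__main__":
    print(restart_command("RHEL", "Apache"))
```

The returned string is what you would paste into the command field of the SSH action for servers of that type.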

Select the fact-based SLA created earlier as the trigger for your new action.
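To make the two SLA levels concrete, here is a minimal sketch of the logic they express: each check result is judged individually (the fact-based SLA), and an alert fires only when 3 or more checks in the same evaluation window have failed (the aggregate SLA). This is not germainAPM code. `CheckResult`, `evaluate`, and `build_incident_payload` are hypothetical names, the example URLs are placeholders, and the payload assumes the HTTP Action posts to a ServiceNow incident table using the standard `short_description` and `description` fields.

```python
from dataclasses import dataclass
from typing import List, Tuple

FAILURE_THRESHOLD = 3  # from the requirement: alert when 3 URLs fail


@dataclass
class CheckResult:
    url: str
    success: bool  # outcome of one synthetic HTTP transaction


def evaluate(results: List[CheckResult],
             threshold: int = FAILURE_THRESHOLD) -> Tuple[bool, List[str]]:
    """Return (breached, failed_urls) for one evaluation window."""
    failed = [r.url for r in results if not r.success]
    return len(failed) >= threshold, failed


def build_incident_payload(failed_urls: List[str]) -> dict:
    """Build a ticket body; the field names assume a ServiceNow incident."""
    return {
        "short_description": f"{len(failed_urls)} web servers unreachable",
        "description": "Failed URLs:\n" + "\n".join(failed_urls),
    }


if __name__ == "__main__":
    # Placeholder data: servers 1 to 3 fail, servers 4 to 10 succeed.
    window = [CheckResult(f"https://web{i}.example.com/", i > 3)
              for i in range(1, 11)]
    breached, failed = evaluate(window)
    if breached:
        print(build_incident_payload(failed))
```

Evaluating once per monitoring interval, as recommended above, means each window holds exactly one result per URL, so the failure count is not inflated by repeated checks of the same server.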