APM as a Service: 4 steps to monitor real user experience in production

From the Compuware APM Blog, 15 May 2013

With our new service platform and the convergence of dynaTrace PurePath Technology with the Gomez Performance Network, we are proud to offer an APMaaS solution that sets a higher bar for complete user experience management, with end-to-end monitoring technologies that include real-user, synthetic, and third-party service monitoring as well as business impact analysis.

To showcase these capabilities we used the free trial on our own about:performance blog as a demonstration platform. The blog is based on the popular WordPress technology, which uses PHP and MySQL as its implementation stack. With only four steps we get full availability monitoring as well as visibility into every one of our visitors, and we can pinpoint any problem on our blog to its root cause in the browser (JavaScript errors, slow third-party content, …), the network (slow connectivity, a bloated website, …) or the application itself (slow PHP code, inefficient MySQL access, …).

Before we get started, let’s have a look at the Compuware APMaaS architecture. To collect real user performance data, all you need to do is install a so-called Agent on the web and/or application server. The data is sent in an optimized and secure way to the APMaaS Platform. Performance data is then analyzed through the APMaaS Web Portal, with drill-down capabilities into the dynaTrace Client.

Compuware APMaaS is a secure service to monitor every single end user on your application end-to-end (browser to database)

4 Steps to set up APMaaS for our Blog powered by WordPress on PHP

From a high-level perspective, joining Compuware APMaaS and setting up your environment consists of four basic steps:

  1. Sign up with Compuware for the Free Trial
  2. Install the Compuware Agent on your Server
  3. Restart your application
  4. Analyze Data through the APMaaS Dashboards

In this article, we assume that you’ve successfully signed up, and will walk you through the actual setup steps to show how easy it is to get started.

After signing up with Compuware, the first sign of your new Compuware APMaaS environment will be an email notifying you that a new environment instance has been created:

Follow the steps explained in the Welcome Email to get started

While you can immediately take a peek into your brand new APMaaS account at this point, there’s not much to see yet: before we can collect any data for you, you have to finish the setup in your application by downloading and installing the agents.

After installation is complete and the web server is restarted, the agents start sending data to the APMaaS Platform – and with dynaTrace 5.5, this also includes the PHP agent, which gives insight into what’s really going on in the PHP application!

The Agent Overview shows us that both the Web Server and PHP agents were successfully loaded

Now we are ready to go!

For Ops & Business: Availability, Conversions, User Satisfaction

Through the APMaaS Web Portal, we start with some high-level web dashboards that are also very useful for our Operations and Business colleagues. These show Availability, Conversion Rates, User Satisfaction and Error Rates. To show the integrated capabilities of the complete Compuware APM platform, Availability is measured using Synthetic Monitors that constantly check our blog, while all of the other values are taken from real end-user monitoring.
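
To make the synthetic side concrete, here is a minimal TypeScript sketch (Node 18+, where fetch is built in) of the concept behind such a monitor: a probe that periodically requests a site and records availability and response time. The URL and interval are illustrative assumptions; the Gomez network of course runs far richer checks from many distributed locations.

```ts
// Minimal synthetic availability probe. The target URL and interval
// are assumptions for illustration, not a real configuration.
const TARGET = "https://example-blog.example.com/"; // hypothetical URL
const INTERVAL_MS = 60_000; // probe once per minute

async function probe(): Promise<void> {
  const start = Date.now();
  try {
    const res = await fetch(TARGET, { redirect: "follow" });
    const elapsedMs = Date.now() - start;
    // Any completed HTTP response counts as "up"; record status and timing.
    console.log(`${new Date().toISOString()} up status=${res.status} time=${elapsedMs}ms`);
  } catch (err) {
    // Network failure: the site counts as unavailable for this probe.
    console.log(`${new Date().toISOString()} DOWN: ${(err as Error).message}`);
  }
}

void probe();
setInterval(probe, INTERVAL_MS);
```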

Operations View: Automatic Availability and Response Time Monitoring of our Blog

Business View: Real Time Visits, Conversions, User Satisfaction and Errors

For App Owners: Application and End User Performance Analysis

Through the dynaTrace Client we get a richer view of the real end-user data. The PHP agent we installed is a full equivalent of the dynaTrace Java and .NET agents, and features like the application overview, together with our self-learning automatic baselining, work the same way regardless of the server-side technology:

Application-level details show us that we had a response time problem and that we currently have several unhappy end users

Before drilling down into the performance analytics, let’s have a quick look at the key user experience metrics such as where our blog users actually come from, the browsers they use, and whether their geographical location impacts user experience:

The UEM Key Metrics dashboards give us the key metrics of web analytics tools and tie them together with performance data. Visitors from remote locations are clearly impacted in their user experience
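
For readers curious how real-user timings like these are gathered in principle, here is a browser-side TypeScript sketch based on the standard Navigation Timing API. The /rum-beacon endpoint is a hypothetical stand-in; dynaTrace UEM injects and manages its own instrumentation automatically.

```ts
// Collect basic page-load timings once the page has finished loading.
window.addEventListener("load", () => {
  const [nav] = performance.getEntriesByType(
    "navigation"
  ) as PerformanceNavigationTiming[];
  if (!nav) return;
  const metrics = {
    url: location.href,
    dnsMs: nav.domainLookupEnd - nav.domainLookupStart,
    connectMs: nav.connectEnd - nav.connectStart,
    ttfbMs: nav.responseStart - nav.requestStart,
    loadMs: nav.loadEventStart - nav.startTime,
  };
  // sendBeacon queues the report without blocking the page.
  navigator.sendBeacon("/rum-beacon", JSON.stringify(metrics));
});
```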

If you are responsible for User Experience and interested in some of our best practices I recommend checking our other UEM-related blog posts – for instance: What to do if A/B testing fails to improve conversions?

Going a bit deeper – What impacts End User Experience?

dynaTrace automatically detects important URLs as so-called “Business Transactions.” In our case we have different blog categories that visitors can click on. The following screenshot shows that dynamic baselines are automatically calculated for these identified business transactions:

Dynamic baselining detects a significant violation of the baseline during a 4.5-hour period last night
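
The idea behind such self-learning baselines can be sketched in a few lines of TypeScript: learn what “normal” looks like from recent response times and flag samples that deviate significantly. dynaTrace’s actual baselining is considerably more sophisticated; the rolling mean and standard deviation below are only a minimal illustration.

```ts
// Minimal illustration of automatic baselining: flag response times
// that deviate strongly from the recent norm.
class Baseline {
  private samples: number[] = [];
  constructor(private windowSize = 1000, private tolerance = 3) {}

  /** Record a response time; returns true if it violates the baseline. */
  observe(responseTimeMs: number): boolean {
    const n = this.samples.length;
    let violation = false;
    if (n >= 30) { // require some history before judging
      const mean = this.samples.reduce((a, b) => a + b, 0) / n;
      const variance =
        this.samples.reduce((a, b) => a + (b - mean) ** 2, 0) / n;
      violation = responseTimeMs > mean + this.tolerance * Math.sqrt(variance);
    }
    this.samples.push(responseTimeMs);
    if (this.samples.length > this.windowSize) this.samples.shift();
    return violation;
  }
}

// Usage: keep one baseline per business transaction and feed it
// each request's response time.
const categoryBaseline = new Baseline();
if (categoryBaseline.observe(4200)) console.log("baseline violation");
```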

Here we see that our overall response time for requests by category slowed down on May 12. Let’s investigate what happened here, and move to the transaction flow, which visualizes PHP transactions from the browser to the database and maps infrastructure health data onto every tier that participated in these transactions:

The Transaction Flow shows us a lot of interesting points, such as errors that happen both in the browser and in the WordPress instance. It also shows that we are heavy on third-party content but good on server health

Since we are always striving to improve our users’ experience, the first troubling thing on this screen is that we see errors happening in browsers – maybe someone forgot to upload an image when posting a new blog entry? Let’s drill down to the Errors dashlet to see what’s happening here:

Third-party widgets throw JavaScript errors and thereby impact end-user experience.

Apparently, some of the third-party widgets we have on the blog caused JavaScript errors for some users. Using the error message, we can investigate which widget causes the issue and where it happens. We can also see which browsers, versions and devices are affected, to focus our optimization efforts. If you rely on third-party plugins, you will want to check the blog post You only control 1/3 of your Page Load Performance.
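
For reference, the general technique for capturing such errors in the browser is a global error listener, sketched below in TypeScript. The /error-beacon endpoint is hypothetical; this shows the underlying idea, not dynaTrace’s implementation.

```ts
// Capture script errors, including those thrown by third-party widgets,
// and report them with enough context to slice by browser and device.
window.addEventListener("error", (event: ErrorEvent) => {
  const report = {
    message: event.message,
    source: event.filename, // often points at the offending third-party script
    line: event.lineno,
    page: location.href,
    userAgent: navigator.userAgent, // lets us break errors down by browser/device
  };
  navigator.sendBeacon("/error-beacon", JSON.stringify(report));
});
```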

PHP Performance Deep Dive

We will analyze the performance problems on the PHP server side in a follow-up blog post, where we will show the steps to identify problematic PHP code. In our case, the culprit actually turned out to be a problematic plugin that helps us identify bad requests (requests from bots, …).

Conclusion and Next Steps

Stay tuned for more posts on this topic, or try Compuware APMaaS out yourself by signing up here for the free trial!


Amazon explains how to measure streaming video performance

Learn who the customers are and understand what’s important to them. An Amazon exec offers 12 best practices.

When it comes to online video performance, every second literally matters. In his address at the recent Streaming Media West conference in Los Angeles, Nathan Dye, software development manager for Amazon Web Services, revealed that studies have shown that a one-second delay in an e-commerce web site’s loading time can reduce revenues by seven percent.

Loading times are just as crucial for online video. Shoppers often don’t come back if videos are slow to load.

“Poor performance and video interruptions lead to less return traffic and less video viewed overall,” Dye said. “IMDB, of course, knows this very well. Their operations team is constantly using their performance measurement, their metrics and dashboards, to find issues with their infrastructures or find problems their customers are experiencing, pinpointing those issues and finally fixing them. Ultimately, that’s what performance measurement is all about: it’s about improving the streaming performance of your customers by first finding those issues and then fixing them.”

In his presentation, Dye offered 12 best practices for measuring streaming video performance.

“You have to start with your customers. If you don’t know what your customers care about, you won’t be able to measure it,” Dye explained. “You need to know what they’re watching, where they’re watching it from, how frequently they’re watching it. Depending on who your customers are, you may care about different performance criteria. For example, if you’re vending feature-length films, you may care a lot more about ensuring that customers get a high-quality stream that’s uninterrupted.”

Feature film vendors might decide to sacrifice some start-up latency to ensure that viewers get an uninterrupted stream, Dye explained.
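
To make these two metrics concrete, here is a small TypeScript sketch that measures them with standard HTML5 media events: time from page load to first frame (start-up latency) and the number of stalls during playback. The #player element and /video-metrics endpoint are assumptions for illustration.

```ts
// Measure start-up latency and rebuffering stalls for a video element.
const video = document.querySelector<HTMLVideoElement>("#player");
if (video) {
  const t0 = performance.now();
  let started = false;
  let stalls = 0;

  video.addEventListener("playing", () => {
    if (!started) {
      started = true; // first frame rendered: report start-up latency
      const startupMs = performance.now() - t0;
      navigator.sendBeacon("/video-metrics", JSON.stringify({ startupMs }));
    }
  });

  // After playback has begun, each "waiting" event is a rebuffering stall.
  video.addEventListener("waiting", () => {
    if (started) stalls++;
  });

  video.addEventListener("ended", () => {
    navigator.sendBeacon("/video-metrics", JSON.stringify({ stalls }));
  });
}
```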

For the other 11 best practices for measuring streaming video, watch the full presentation below.

HOW-TO: Best Practices for Measuring Performance of Streaming Video

In this presentation, you’ll learn about best practices for measuring and monitoring the quality of your videos streamed to end users. We will provide practical guidance using external agent-based measurements and real user monitoring techniques, and discuss CDN architectures and how they relate to performance measurement. Finally, we’ll walk through real-world CDN performance monitoring implementations used by Amazon CloudFront customers for video delivery.
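
As one example of the real-user techniques such a talk covers, the browser’s Resource Timing API exposes per-object download timings, which a page can use to report how quickly CDN-hosted assets or video segments arrived. The cdn.example.com host and /cdn-metrics endpoint in this TypeScript sketch are assumptions, not CloudFront specifics.

```ts
// Observe resource downloads and report timings for CDN-hosted objects.
const observer = new PerformanceObserver((list) => {
  for (const entry of list.getEntries() as PerformanceResourceTiming[]) {
    if (!entry.name.includes("cdn.example.com")) continue;
    const sample = {
      url: entry.name,
      ttfbMs: entry.responseStart - entry.requestStart,
      downloadMs: entry.responseEnd - entry.responseStart,
      // transferSize is 0 cross-origin unless Timing-Allow-Origin is set
      bytes: entry.transferSize,
    };
    navigator.sendBeacon("/cdn-metrics", JSON.stringify(sample));
  }
});
observer.observe({ type: "resource", buffered: true });
```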

Speaker: Nathan Dye, Software Development Manager, Amazon Web Services


WebTuna SharePoint monitoring – 24×7 real user monitoring for SharePoint

By WebTuna

WebTuna for SharePoint screengrab

When it comes to SharePoint end-user performance, there are several issues facing SharePoint teams:

  1. They have no visibility into the performance being delivered to users.
  2. They don’t know what percentage of users actually use SharePoint or what content they are accessing.
  3. They know performance is worse for overseas offices, but cannot quantify it.
  4. They cannot establish the performance impact of code, configuration, upgrade and hardware changes before going live.
  5. They don’t know there is a performance issue unless users complain.

What WebTuna, the SharePoint end-user monitoring tool, provides…

  1. Each and every user’s actual performance at all times from all locations.
  2. Which content is being accessed, by whom and when.
  3. Real page load times of users hitting the SharePoint site, in real time and historically. Every page view from every user is captured.
  4. A geographical map showing usage by country, office and individual user, highlighting in real time regions that experience poor load times.
  5. User performance broken down by country, browser type, operating system, page title, URL and many more entities (see the sketch below).
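
As a sketch of the kind of breakdown described in item 5, here is how median page load time per country (or browser, operating system, …) could be computed from raw page views in TypeScript. The PageView shape is illustrative, not WebTuna’s actual schema.

```ts
// Illustrative page-view record; real schemas carry many more fields.
interface PageView {
  country: string;
  browser: string;
  loadTimeMs: number;
}

// Group page views by a chosen dimension and compute median load times.
function medianLoadTimeBy(
  views: PageView[],
  key: (v: PageView) => string
): Map<string, number> {
  const groups = new Map<string, number[]>();
  for (const v of views) {
    const k = key(v);
    const bucket = groups.get(k);
    if (bucket) bucket.push(v.loadTimeMs);
    else groups.set(k, [v.loadTimeMs]);
  }
  // Median is more robust to outliers than the mean for load times.
  const medians = new Map<string, number>();
  for (const [k, times] of groups) {
    times.sort((a, b) => a - b);
    medians.set(k, times[Math.floor(times.length / 2)]);
  }
  return medians;
}

// Example: medianLoadTimeBy(views, v => v.country) or v => v.browser
```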

New blog launched by OPNET: Application Performance Matters.

A new blog has been launched by OPNET called “Application Performance Matters”. With it, OPNET aims to develop a new forum for discussion of APM concepts, techniques, challenges, and directions.

APM, and more generally IT Service Assurance, is an area of utmost importance to business because virtually every enterprise today is driven by processes and information. Software applications, which started out as a means to enhance productivity and enforce policies, have now evolved into the very embodiment of these processes and the reference model of an organization’s approach to conducting business. Today, the question of whether a change to organisational practices can be implemented is virtually inseparable from the question “can we do that in our systems?”

Given the fundamental role of applications, and the increasing complexity and sophistication of application architectures, managing performance has become a hotbed of activity, and there is much information to share in this area. The technologies of APM continue to evolve rapidly, as does the entire IT environment. Many enterprises at varying stages of adopting APM wish to learn about approaches that would enable them to reap the most benefit. The blog aims to cover a full spectrum of topics, ranging from detailed technical problem solving all the way to organisational best practices.

Their first post is intended to define an initial set of terms to serve as a basis for future discussion. Download the “APM: An Evolving Lexicon” whitepaper. We hope you enjoy participating in “Application Performance Matters” and find it useful to your initiatives and daily activities in APM.