Performance Testing

OpenLMIS focuses on performance metrics that are typical of web applications:

  • Calls to the server - how many milliseconds a single operation takes, and whether its memory usage is reasonable.
  • Network load - how large the resources returned from the server are. OpenLMIS is typically deployed in network-constrained locations, so the size, in bytes, of each resource is important.
  • The number of calls the Reference UI makes - again, networks being what they are, we want to minimize the number of connections needed to accomplish a user workflow, as each connection adds overhead.
  • Size of the “working” data set. Here working data is defined as the data a user needs to accomplish a task. Examples are typically Reference Data: # of Products, # of Facilities, # of Users, etc., though the # of Requisitions or # of Stock Cards might also factor into a user’s working data. Since OpenLMIS typically serves entire countries, it’s important that we’re efficient in managing country-level data sets.

There are, however, some areas of performance that OpenLMIS typically doesn’t focus on as much:

  • Scaling - typically we’re not concerned with tens of thousands of people using the system concurrently. Likewise we don’t yet worry about surges or dips in user activity requiring more or fewer resources to serve those users.

Getting Started

OpenLMIS uses Apache JMeter to test RESTful endpoints. We use Taurus, and its YAML format, to write our test scenarios and generate reports, which our CI server can present as an artifact of every successful deployment to our CD test server.

Keeping to our conventions, Taurus is run through a Docker image via a simple script located at ./performance/test.sh, with tests in the directory ./performance/tests/ of a Service. Any *.yml file in that test directory will be fed to Taurus and run against https://test.openlmis.org.

Running test.sh will place JMeter output as well as Taurus output under ./build/performance-artifacts/. The file stats.xml has the final summary performance metrics (a sketch of the Taurus reporter that typically produces it follows the list below). Files of note when developing test scenarios:

  • error-N.jtl - Contains the errors returned by the HTTP server and the requests that led to them.
  • JMeter-N.err - Contains JMeter errors, i.e. cases where JMeter didn’t understand the test scenario.
  • modified_requests-N.jmx - Contains the generated JMeter requests (after Taurus generation).
  • kpi-N.jtl - Contains the individual request metrics of a test scenario.
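
The stats.xml summary is the kind of file Taurus’ final-stats reporter writes; OpenLMIS’ shared configuration likely enables it already, but as a minimal sketch (the option values here are illustrative) a test file could request it explicitly:

reporting:
    - module: final-stats
      summary: true        # print overall sample counts and failure percentage
      percentiles: true    # print average times and response-time percentiles
      dump-xml: stats.xml  # write the final summary metrics to an XML file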

Running in CI

Tests run in a Jenkins job whose name ends in -performance. This job runs as part of each Service’s build pipeline that results in a deployment to the test server.

The reports are presented using the Jenkins Performance Plugin. When looking at this report you’ll see:

  • A graph that shows all of the endpoints (requests) over time.
  • A per-build report that includes an average over time, as well as a table showing the KPIs of each request.

A simple Scenario (with authentication)

Nearly all of our RESTful resources require authentication, so in this example we’ll show a basic test scenario that includes authentication. The syntax and features used here are documented on Taurus’ JMeter executor page.

execution:
  - concurrency: 1
    hold-for: 1m
    scenario: users-get-one
scenarios:
  get-user-token:
    requests:
      - url: ${__P(base-uri)}/api/oauth/token
        method: POST
        label: GetUserToken
        headers:
          Authorization: Basic ${__base64Encode(${__P(basic-auth)})}
        body:
          grant_type: password
          username: ${__P(username)}
          password: ${__P(password)}
        extract-jsonpath:
          access_token:
            jsonpath: $.access_token
  users-get-one:
    requests:
      - include-scenario: get-user-token
      - url: ${__P(base-uri)}/api/users/a337ec45-31a0-4f2b-9b2e-a105c4b669bb
        method: GET
        label: GetAdministratorUser
        headers:
          Authorization: Bearer ${access_token}

The execution block defines a run of our test scenario users-get-one with 1 concurrent user, held for one minute. Notice that this is the simplest of test executions - 1 user, run enough times to get a useful sample. We use this sort of test execution first to get a sense of an endpoint’s single-user characteristics.

Next notice that we have two scenarios defined:

  1. get-user-token - this is a reusable scenario which gets a user authentication token and, through extract-jsonpath, saves it to a variable named access_token.
  2. users-get-one - this is the test scenario we’re primarily interested in: it exercises /api/users/{a specific user’s UUID}. We pass the previously obtained access_token through the HTTP request’s headers. How the ${__P(...)} properties used by both scenarios are supplied is sketched below.
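
The ${__P(...)} expressions in both scenarios are JMeter property lookups, so base-uri, basic-auth, username and password have to be supplied from outside the test file. With Taurus this can be done through the JMeter module’s properties setting (or overridden on the command line with -o). A minimal sketch with placeholder values - how OpenLMIS actually supplies them (e.g. from test.sh or CI) may differ:

modules:
  jmeter:
    properties:
      base-uri: https://test.openlmis.org    # read by ${__P(base-uri)}
      basic-auth: some-client:some-secret    # encoded by __base64Encode in the scenario; placeholder value
      username: administrator                # placeholder demo-data style username
      password: not-the-real-password        # placeholder; supply the real value at run time
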

Summary

  • First test the most basic of environments: 1 user, run enough times to get a useful sample size.
  • Re-use the scenario to obtain an access_token using include-scenario.
  • It’s generally OK to use demo-data identifiers (the user’s UUID) - though it couples the test to the demo-data, it will provide consistent results.
  • Give each request a clear, semantic label. This will be used later in pass-fail criteria.

Testing collections

Building on the simple scenario, we’re now going to test the performance of returning a collection of a resource:

users-search-one-page:
  requests:
    - include-scenario: get-user-token
    - url: ${__P(base-uri)}/api/users/search?page=1&size=10
      method: POST
      label: GetAUserPageOfTen
      body: '{}'
      headers:
        Authorization: Bearer ${access_token}
        Content-Type: application/json

Here we’re testing the Users resource by asking for 1 page of 10 users.

Summary

  • When testing the performance of collections, the result will be influenced by the number of results returned. Because of this, prefer to test a paginated resource, and always ask for a number that exists (i.e. don’t ask for 50 when demo-data only has 40).
  • Searching often requires a POST; in that case the query parameters must be in the URL.

Testing complex workflows

A complex workflow might be:

  1. GET a list of periods for which requisitions may be initiated.
  2. Create a new Requisition resource by POSTing with one of the previously returned available periods.
  3. DELETE the previously created Requisition resource, so that we may test again.

initiate-requisition:
  requests:
    - url: ${__P(base-uri)}/api/oauth/token
      method: POST
      label: GetUserToken
      headers:
        Authorization: Basic ${__base64Encode(${__P(user-auth)})}
      body:
        grant_type: password
        username: ${__P(username)}
        password: ${__P(password)}
      extract-jsonpath:
        access_token:
          jsonpath: $.access_token
    # program = family planning, facility = comfort health clinic
    - url: ${__P(base-uri)}/api/requisitions/periodsForInitiate?programId=10845cb9-d365-4aaa-badd-b4fa39c6a26a&facilityId=e6799d64-d10d-4011-b8c2-0e4d4a3f65ce&emergency=false
      method: GET
      label: GetPeriodsForInitiate
      headers:
        Authorization: Bearer ${access_token}
      extract-jsonpath:
        periodUuid:
          jsonpath: $.[:1].id
      jsr223:
        script-text: |
          String uuid = vars.get("periodUuid");
          uuid = uuid.replaceAll(/"|\[|\]/, "");
          vars.put("periodUuid", uuid);
    - url: ${__P(base-uri)}/api/requisitions/initiate?program=10845cb9-d365-4aaa-badd-b4fa39c6a26a&facility=e6799d64-d10d-4011-b8c2-0e4d4a3f65ce&suggestedPeriod=${periodUuid}&emergency=false
      method: POST
      label: InitiateNewRequisition
      headers:
        Authorization: Bearer ${access_token}
        Content-Type: application/json
      extract-jsonpath:
        reqUuid:
          jsonpath: $.id
      jsr223:
        script-text: |
          String uuid = vars.get("reqUuid");
          uuid = uuid.replaceAll(/"|\[|\]/, ""); // remove quotes and []
          vars.put("reqUuid", uuid);
    - url: ${__P(base-uri)}/api/requisitions/${reqUuid}
      method: DELETE
      label: DeleteRequisition
      headers:
        Authorization: Bearer ${access_token}

Summary

  • When creating a new RESTful resource (e.g. PUT or POST), we may need to clean up after ourselves in order to run more than one test.
  • JSR223 blocks allow us to execute basic Groovy (the default). This can be especially useful when you need to clean up a JSON result from a previous response, such as a UUID, before using it in the next request.

Simple stress testing

As mentioned, OpenLMIS performance tests tend to focus first on basic execution environments where we’re only testing 1 user interaction at a time. However, there is a need to do basic stress testing, especially for endpoints that are used frequently. For example, we’ve seen the authentication resource used repeatedly in all our previous examples. Let’s stress test it.

modules:
  local:
    sequential: true

execution:
  - concurrency: 10
    hold-for: 2m
    scenario: get-user-token
  - concurrency: 50
    hold-for: 2m
    scenario: get-service-token

scenarios:
  get-user-token:
    requests:
      - url: ${__P(base-uri)}/api/oauth/token
        method: POST
        label: GetUserToken
        headers:
          Authorization: Basic ${__base64Encode(${__P(user-auth)})}
        body:
          grant_type: password
          username: ${__P(username)}
          password: ${__P(password)}
  get-service-token:
    requests:
      - url: ${__P(base-uri)}/api/oauth/token
        method: POST
        label: GetServiceToken
        headers:
          Authorization: Basic ${__base64Encode(${__P(service-auth)})}
        body:
          grant_type: client_credentials

Here we’ve defined 2 tests:

  1. Authenticate as if you’re a person.
  2. Authenticate as if you’re another Service (a Service token).

The stress testing here introduces important changes in our execution block:

- concurrency: 10
  hold-for: 2m
  scenario: get-user-token

Instead of defining 1 user, here we’ll have 10 concurrent ones. Instead of running the test for 1 minute, we’re going to run it as many times as we can for 2 minutes. For further options see the Taurus Execution doc.
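
Taurus supports further execution options beyond concurrency and hold-for; for instance ramp-up spreads the start of the concurrent users over a period, and iterations caps how many times each user runs the scenario. A sketch with illustrative values (not OpenLMIS conventions):

execution:
  - concurrency: 10          # 10 concurrent users
    ramp-up: 30s             # start them gradually over 30 seconds
    hold-for: 2m             # then hold full concurrency for 2 minutes
    scenario: get-user-token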

When stress testing, it’s important to remember that too much concurrency simply isn’t useful and only slows down the test. Nor do we presently have test infrastructure in place that allows tests to originate from multiple hosts.

Summary

  • You can define multiple execution definitions for the same scenario: the first might give us the basic performance characteristics, while the second might be a stress test.
  • By default the tests defined in the execution block are run in parallel. This can be changed to run them sequentially with sequential: true.
  • Choose a reasonable number of concurrent users. Typically fewer than a dozen is enough.
  • Choose a reasonable time to hold the test for. Typically 1-2 minutes is enough, and no more than 5 minutes unless justifiable.
  • Remember that we don’t have a performance testing infrastructure in place that can concurrently send requests to our application from multiple hosts. OpenLMIS performance testing typically only requires the most basic stress testing.

Testing file uploads

In this short example we’re going to send a request to the catalog items endpoint and upload some items as a CSV file.

upload-catalog-items:
  requests:
    - include-scenario: get-user-token
    - url: ${__P(base-uri)}/api/catalogItems?format=csv
      method: POST
      label: UploadCatalogItems
      headers:
        Authorization: Bearer ${access_token}
      upload-files:
        - param: file
          path: /tmp/artifacts/catalog_items.csv

Summary

  • When uploading a file we don’t have to worry about setting the correct content header, as Taurus takes care of it on its own when the upload-files block is used. This behavior is described in the HTTP Requests section of the Taurus user manual.

Pass-fail criteria

With the above tests defined, we can now write pass-fail criteria. This is especially useful if we want our test to fail when performance is worse than the thresholds we’ve defined.

reporting:
    - module: passfail
      criteria:
        - avg-rt of GetUserToken>300ms, continue as failed
        - avg-rt of GetServiceToken>300ms, continue as failed

This allows us to fail the test if the average response time for either of the two tests was greater than 300ms. See the Taurus Passfail doc for more.
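
Average response time isn’t the only subject that can be checked. As an illustrative sketch (the thresholds are examples, not OpenLMIS standards), the passfail module also accepts percentiles and success rates:

reporting:
    - module: passfail
      criteria:
        - p90 of GetUserToken>500ms, continue as failed
        - succ of GetUserToken<100%, continue as failed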

Summary

  • Write the pass-fail criteria within the test definition.

Performance Acceptance Criteria

With Taurus we can now add basic acceptance criteria when working on new issues. For example the acceptance criteria might say:

  • the endpoint to retrieve 10 users should complete in 500ms for 90% of requests

This would lead us to write a performance test for this new GET operation to retrieve 10 users, and we’d add a pass-fail criteria such as:

reporting:
    - module: passfail
      criteria:
        Get 10 Users is too slow: p90 of Get10Users>500ms, continue as failed

Read the Taurus Passfail doc for more.
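
The Get10Users label in that criterion must match the label of the request in the scenario that exercises the new endpoint. A minimal sketch of such a scenario - the URL, paging parameters and scenario name are hypothetical, modeled on the collection example earlier:

users-get-ten:
  requests:
    - include-scenario: get-user-token
    - url: ${__P(base-uri)}/api/users?page=1&size=10
      method: GET
      label: Get10Users
      headers:
        Authorization: Bearer ${access_token}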

Next Steps (WIP)

We’ve covered basic performance testing, stress testing, and pass-fail criteria. Next we’ll be adding:

  • Loading performance-oriented data sets (e.g. what happens to these requests when there are 10,000 products).
  • Using Selenium to mimic browser interactions, to give us:
    • How many HTTP requests a page incurs.
    • Network payload size.
  • Failing deployments based on performance results.