It is important to size the backend for your application correctly. If you overestimate your potential load, you will consume resources that sit underutilized, reducing what is available to other workloads on the cloudlet(s). If you underestimate your load, you may degrade the end-user experience. Fortunately, MobiledgeX provides two policies, Autoscale and Autoprovision, that can help mitigate the risks of an undersized application.
Autoscale is available for Kubernetes and Helm deployed workloads. It creates additional application instances when resource usage of the existing instance(s) reaches a defined threshold, and removes those instances when resource usage subsides. This helps absorb unexpected increases in load, for example when an application is featured on the Apple App Store and traffic to the backends surges.
Autoprovision is available for all deployment types that use a load balancer. It provisions additional application instances to a defined list of cloudlets when certain connection thresholds are reached. Although similar to Autoscale, the key differences are that Autoprovision acts on connection information and can operate across cloudlets. It also de-provisions instances once connection volume drops below the defined thresholds.
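The policies above are configured on the MobiledgeX platform itself, but the underlying decision logic is a simple threshold rule with hysteresis. The sketch below is illustrative only: the function name, thresholds, and instance limits are hypothetical values chosen to show the shape of that logic, not the platform's actual implementation.

```python
def desired_instances(current: int, cpu_util: float,
                      scale_up_at: float = 0.70,
                      scale_down_at: float = 0.30,
                      min_instances: int = 1,
                      max_instances: int = 4) -> int:
    """Return the instance count a threshold-based policy would move toward."""
    if cpu_util >= scale_up_at and current < max_instances:
        return current + 1          # add capacity before resource exhaustion
    if cpu_util <= scale_down_at and current > min_instances:
        return current - 1          # shed idle capacity once load subsides
    return current                  # inside the hysteresis band: no change
```

Note the gap between the scale-up and scale-down thresholds: without it, a load level near a single threshold would cause instances to be added and removed repeatedly.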
This procedure assumes the following:
- You are able to deploy and run your backend locally.
- You have written a test harness that can generate load against the backend.
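If you do not yet have a test harness, the following standard-library sketch shows the minimum it needs to do: drive concurrent requests at an endpoint and record per-request errors and latency. The built-in stub server and all names here are illustrative; point the URL at your own backend instead.

```python
# Minimal load-harness sketch: drives concurrent HTTP requests and
# summarizes error count and 95th-percentile latency.
import http.server
import threading
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

def start_stub_server():
    """Local stand-in for a real backend, for demonstration only."""
    srv = http.server.ThreadingHTTPServer(
        ("127.0.0.1", 0), http.server.SimpleHTTPRequestHandler)
    threading.Thread(target=srv.serve_forever, daemon=True).start()
    return srv

def hit(url):
    """Issue one request; return (success, latency_seconds)."""
    start = time.perf_counter()
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            ok = resp.status == 200
    except Exception:
        ok = False
    return ok, time.perf_counter() - start

def run_load(url, concurrency=20, requests=200):
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(hit, [url] * requests))
    latencies = sorted(lat for ok, lat in results if ok)
    errors = sum(1 for ok, _ in results if not ok)
    p95 = latencies[int(0.95 * len(latencies)) - 1] if latencies else None
    return {"requests": requests, "errors": errors, "p95_s": p95}

if __name__ == "__main__":
    srv = start_stub_server()
    port = srv.server_address[1]
    print(run_load(f"http://127.0.0.1:{port}/", concurrency=10, requests=50))
    srv.shutdown()
```

A real harness would also ramp concurrency up in steps so you can observe where the error rate and latency begin to climb.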
Note: In this procedure we refer to the virtual hardware configuration of a local VM as a “resource profile”; this is roughly analogous to a “flavor” in the MobiledgeX platform.
The following are the standard flavors provided with OpenStack. They are listed for information only; you should view the current list of available flavors in the region(s) where you plan to deploy.
Name | Memory (MB) | Disk (GB) | vCPU |
---|---|---|---|
m1.tiny | 512 | 1 | 1 |
m1.small | 2048 | 10 | 1 |
m1.medium | 4096 | 10 | 2 |
m1.large | 8192 | 10 | 4 |
m1.xlarge | 16384 | 10 | 8 |
Note: For testing purposes, the two key dimensions to benchmark are vCPU and memory.
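On Unix-like systems you can get a rough local measurement of both dimensions with only the standard library, as sketched below. The helper and workload names are hypothetical, and `ru_maxrss` is reported in kilobytes on Linux but bytes on macOS, so treat the absolute numbers as indicative rather than exact.

```python
# Rough CPU-time and peak-memory measurement for a local workload.
import resource
import time

def profile(workload, *args):
    """Run workload(*args); return its result plus CPU/wall/memory stats."""
    before = resource.getrusage(resource.RUSAGE_SELF)
    wall_start = time.perf_counter()
    result = workload(*args)
    wall = time.perf_counter() - wall_start
    after = resource.getrusage(resource.RUSAGE_SELF)
    return result, {
        # user + system CPU seconds consumed during the run
        "cpu_s": (after.ru_utime + after.ru_stime)
                 - (before.ru_utime + before.ru_stime),
        "wall_s": wall,
        "max_rss": after.ru_maxrss,   # peak resident set size so far
    }

def busy(n):
    """Stand-in for your backend's hot path."""
    return sum(i * i for i in range(n))

_, stats = profile(busy, 1_000_000)
```

For a containerized backend, sampling `docker stats` (or the Kubernetes metrics API) during the load run gives the per-container view you will compare against flavor limits.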
The following steps will help you arrive at an ideally sized backend for your application. Note that if you expect load to vary across cloudlets or regions, you may need to repeat this test for each intended level of concurrently connected users (CCU) in order to size all of your backends appropriately.
These questions will help provide the data necessary to start the benchmarking:
For example, the developers of ApplicationX expect an average of 200 CCU at any given time, a potential max of 800, and they have an application that takes roughly 2 minutes to initialize and begin to serve traffic.
The following steps need to be completed for each resource profile, and the results tabulated.
When reviewing the data you are looking for the following:
You need to find the resource profile that keeps client errors and latency within the rates the developers have defined as acceptable, while keeping resource usage under 70%. This will be your starting point when you deploy to the Edge.
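The selection rule just described can be expressed over your tabulated results. The table values below are illustrative placeholders, not real benchmark data, and the acceptable-rate bounds are hypothetical examples of what your developers might define.

```python
# Pick the smallest resource profile whose benchmark run stayed within the
# acceptable error/latency bounds and under 70% resource usage.

PROFILES = [  # (name, error_rate, p95_latency_ms, cpu_util, mem_util)
    ("m1.small",  0.04, 310, 0.95, 0.88),   # illustrative results only
    ("m1.medium", 0.01, 140, 0.72, 0.60),
    ("m1.large",  0.00,  90, 0.55, 0.45),
    ("m1.xlarge", 0.00,  85, 0.30, 0.25),
]

def pick_profile(rows, max_err=0.01, max_p95_ms=150, max_util=0.70):
    """Rows are ordered smallest-first; return the first acceptable profile."""
    for name, err, p95, cpu, mem in rows:
        if err <= max_err and p95 <= max_p95_ms \
                and cpu < max_util and mem < max_util:
            return name
    return None

print(pick_profile(PROFILES))
```

In this made-up data, m1.medium meets the error and latency bounds but exceeds 70% CPU, so the rule selects m1.large, leaving headroom for load spikes while an autoscale or autoprovision action is still in flight.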
Now that you have benchmarked locally and determined a resource profile that works for your application, the next step is to deploy to the MobiledgeX platform and conduct further testing to validate the choice.
Just as above, when reviewing the data you are looking for the following:
You need to find the flavor that keeps client errors and latency within the rates the developers have defined as acceptable, while keeping resource usage under 70%. If you plan to use an autoprovision policy, you will also want to record the number of connections that corresponds to the highest CCU count that still maintains the desired quality of service.
An additional dimension that needs to be benchmarked and validated is how long it takes to spin up an additional application instance. This factors into how you configure the autoscale and autoprovision policies. You will want to note the time from starting the application instance deployment (either from the CLI or from the console) until it begins to serve traffic.
The results from this test will help you define the thresholds for the application policies. The ultimate goal is to ensure that a new instance is created (either via autoscale or autoprovision) as the existing instance approaches resource exhaustion, but before the application begins to shed load, throttle traffic, or degrade the user experience.
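Measuring "time until the instance serves traffic" can be as simple as starting a timer when you kick off the deployment and polling a readiness URL until it answers. In the sketch below, the URL, timeout, and poll cadence are placeholders; substitute your application's real health check.

```python
# Poll a health/readiness URL until it responds, returning elapsed seconds.
import time
import urllib.request

def time_until_serving(url, timeout_s=600, poll_s=2.0):
    """Start this immediately after initiating the instance deployment."""
    start = time.monotonic()
    while time.monotonic() - start < timeout_s:
        try:
            with urllib.request.urlopen(url, timeout=3) as resp:
                if resp.status == 200:
                    return time.monotonic() - start   # seconds until ready
        except Exception:
            pass                                      # not up yet; keep polling
        time.sleep(poll_s)
    raise TimeoutError(f"{url} not serving after {timeout_s}s")
```

Run this several times and keep the worst observed value: the policy thresholds must leave at least that much headroom between "threshold crossed" and "existing instance exhausted."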
Once you have defined the flavors for your initial deployment, it is time to use the test harness to exercise your scaling and provisioning policies. In this process you want to be able to do the following: