It is important to size the backend for your application correctly. If you overestimate your potential load, you will consume resources that sit underutilized, reducing what is available to other workloads on the cloudlet(s). If you underestimate your load, you may degrade the end-user experience. Fortunately, MobiledgeX provides two policies, Autoscale and Autoprovision, that can help mitigate the risks of an undersized application.
Autoscale is available for Kubernetes and Helm deployed workloads. It creates additional application instances when resource usage of the existing instance(s) reaches a defined threshold, and removes those instances when resource usage subsides. This helps absorb unexpected increases in load, for example when an application is featured on the Apple App Store and traffic to the backends surges.
Autoprovision is available for all deployment types that use a load balancer. It provisions additional application instances to a defined list of cloudlets when certain connection thresholds are reached. Although similar to Autoscale, the key differences are that Autoprovision acts on connection information and can operate across cloudlets. It also de-provisions instances once connection volume drops below the defined thresholds.
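The policies above are configured on the MobiledgeX platform itself, but the underlying decision logic is a simple threshold rule with hysteresis. The sketch below is illustrative only: the function name, thresholds, and instance limits are hypothetical values chosen to show the shape of that logic, not the platform's actual implementation.

```python
def desired_instances(current: int, cpu_util: float,
                      scale_up_at: float = 0.70,
                      scale_down_at: float = 0.30,
                      min_instances: int = 1,
                      max_instances: int = 4) -> int:
    """Return the instance count a threshold-based policy would move toward."""
    if cpu_util >= scale_up_at and current < max_instances:
        return current + 1          # add capacity before resource exhaustion
    if cpu_util <= scale_down_at and current > min_instances:
        return current - 1          # shed idle capacity once load subsides
    return current                  # inside the hysteresis band: no change
```

Note the gap between the scale-up and scale-down thresholds: without it, a load level near a single threshold would cause instances to be added and removed repeatedly.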
This procedure assumes the following:
- You are able to deploy and run your backend locally.
- You have written a test harness that can generate load against the backend.
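If you do not yet have a test harness, the following standard-library sketch shows the minimum it needs to do: drive concurrent requests at an endpoint and record per-request errors and latency. The built-in stub server and all names here are illustrative; point the URL at your own backend instead.

```python
# Minimal load-harness sketch: drives concurrent HTTP requests and
# summarizes error count and 95th-percentile latency.
import http.server
import threading
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

def start_stub_server():
    """Local stand-in for a real backend, for demonstration only."""
    srv = http.server.ThreadingHTTPServer(
        ("127.0.0.1", 0), http.server.SimpleHTTPRequestHandler)
    threading.Thread(target=srv.serve_forever, daemon=True).start()
    return srv

def hit(url):
    """Issue one request; return (success, latency_seconds)."""
    start = time.perf_counter()
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            ok = resp.status == 200
    except Exception:
        ok = False
    return ok, time.perf_counter() - start

def run_load(url, concurrency=20, requests=200):
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(hit, [url] * requests))
    latencies = sorted(lat for ok, lat in results if ok)
    errors = sum(1 for ok, _ in results if not ok)
    p95 = latencies[int(0.95 * len(latencies)) - 1] if latencies else None
    return {"requests": requests, "errors": errors, "p95_s": p95}

if __name__ == "__main__":
    srv = start_stub_server()
    port = srv.server_address[1]
    print(run_load(f"http://127.0.0.1:{port}/", concurrency=10, requests=50))
    srv.shutdown()
```

A real harness would also ramp concurrency up in steps so you can observe where the error rate and latency begin to climb.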
Note: In this procedure we refer to the virtual hardware configuration of a local VM as a “resource profile”; this is roughly analogous to a “flavor” in the MobiledgeX platform.
The following are the standard flavors provided with OpenStack. They are listed for information only; you should view the current list of available flavors in the region(s) where you plan to deploy.
Name | Memory (MB) | Disk (GB) | vCPU |
---|---|---|---|
m1.tiny | 512 | 1 | 1 |
m1.small | 2048 | 10 | 1 |
m1.medium | 4096 | 10 | 2 |
m1.large | 8192 | 10 | 4 |
m1.xlarge | 16384 | 10 | 8 |
Note: For testing purposes, the two key dimensions to benchmark are vCPU and memory.
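On Unix-like systems you can get a rough local measurement of both dimensions with only the standard library, as sketched below. The helper and workload names are hypothetical, and `ru_maxrss` is reported in kilobytes on Linux but bytes on macOS, so treat the absolute numbers as indicative rather than exact.

```python
# Rough CPU-time and peak-memory measurement for a local workload.
import resource
import time

def profile(workload, *args):
    """Run workload(*args); return its result plus CPU/wall/memory stats."""
    before = resource.getrusage(resource.RUSAGE_SELF)
    wall_start = time.perf_counter()
    result = workload(*args)
    wall = time.perf_counter() - wall_start
    after = resource.getrusage(resource.RUSAGE_SELF)
    return result, {
        # user + system CPU seconds consumed during the run
        "cpu_s": (after.ru_utime + after.ru_stime)
                 - (before.ru_utime + before.ru_stime),
        "wall_s": wall,
        "max_rss": after.ru_maxrss,   # peak resident set size so far
    }

def busy(n):
    """Stand-in for your backend's hot path."""
    return sum(i * i for i in range(n))

_, stats = profile(busy, 1_000_000)
```

For a containerized backend, sampling `docker stats` (or the Kubernetes metrics API) during the load run gives the per-container view you will compare against flavor limits.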
The following steps will help you arrive at an ideally sized backend for your application. Note that if you expect load to vary across cloudlets or regions, you may need to repeat this test for each intended level of concurrently connected users (CCU) in order to size all of your backends appropriately.
These questions will help provide the data necessary to start the benchmarking:
For example, the developers of ApplicationX expect an average of 200 CCU at any given time, a potential max of 800, and they have an application that takes roughly 2 minutes to initialize and begin to serve traffic.
The following steps need to be completed for each resource profile, and the results tabulated.
When reviewing the data you are looking for the following:
You need to find the resource profile that keeps client errors and latency within the rates the developers have defined as acceptable, while keeping resource usage under 70%. This will be your starting point when you deploy to the Edge.
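The selection rule just described can be expressed over your tabulated results. The table values below are illustrative placeholders, not real benchmark data, and the acceptable-rate bounds are hypothetical examples of what your developers might define.

```python
# Pick the smallest resource profile whose benchmark run stayed within the
# acceptable error/latency bounds and under 70% resource usage.

PROFILES = [  # (name, error_rate, p95_latency_ms, cpu_util, mem_util)
    ("m1.small",  0.04, 310, 0.95, 0.88),   # illustrative results only
    ("m1.medium", 0.01, 140, 0.72, 0.60),
    ("m1.large",  0.00,  90, 0.55, 0.45),
    ("m1.xlarge", 0.00,  85, 0.30, 0.25),
]

def pick_profile(rows, max_err=0.01, max_p95_ms=150, max_util=0.70):
    """Rows are ordered smallest-first; return the first acceptable profile."""
    for name, err, p95, cpu, mem in rows:
        if err <= max_err and p95 <= max_p95_ms \
                and cpu < max_util and mem < max_util:
            return name
    return None

print(pick_profile(PROFILES))
```

In this made-up data, m1.medium meets the error and latency bounds but exceeds 70% CPU, so the rule selects m1.large, leaving headroom for load spikes while an autoscale or autoprovision action is still in flight.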
Now that you have benchmarked locally and determined a resource profile that works for your application, the next step is to deploy to the MobiledgeX platform and conduct further testing to validate the choice.
Just as above, when reviewing the data you are looking for the following:
You need to find the flavor that keeps client errors and latency within the rates the developers have defined as acceptable, while keeping resource usage under 70%. If you plan to use an autoprovision policy, you will also want to record the number of connections that corresponds to the highest CCU count that still maintains the desired quality of service.
An additional dimension that needs to be benchmarked and validated is how long it takes to spin up an additional application instance. This factors into how you configure the autoscale and autoprovision policies. You will want to note the time from starting the application instance deployment (either from the CLI or from the console) until it begins to serve traffic.
The results from this test will help you define the thresholds for the application policies. The ultimate goal is to ensure that a new instance is created (either via autoscale or autoprovision) as the existing instance approaches resource exhaustion, but before the application begins to shed load, throttle traffic, or degrade the user experience.
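Measuring "time until the instance serves traffic" can be as simple as starting a timer when you kick off the deployment and polling a readiness URL until it answers. In the sketch below, the URL, timeout, and poll cadence are placeholders; substitute your application's real health check.

```python
# Poll a health/readiness URL until it responds, returning elapsed seconds.
import time
import urllib.request

def time_until_serving(url, timeout_s=600, poll_s=2.0):
    """Start this immediately after initiating the instance deployment."""
    start = time.monotonic()
    while time.monotonic() - start < timeout_s:
        try:
            with urllib.request.urlopen(url, timeout=3) as resp:
                if resp.status == 200:
                    return time.monotonic() - start   # seconds until ready
        except Exception:
            pass                                      # not up yet; keep polling
        time.sleep(poll_s)
    raise TimeoutError(f"{url} not serving after {timeout_s}s")
```

Run this several times and keep the worst observed value: the policy thresholds must leave at least that much headroom between "threshold crossed" and "existing instance exhausted."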
Once you have defined the flavors for your initial deployment, it is time to use the test harness to exercise your scaling and provisioning policies. In this process you want to be able to do the following: