How To Optimize Android For Gaming

Players enjoy a game most when it has short loading times, a consistent framerate, and dependable input responsiveness.

If you have experience developing games for a computer or a console, you might be surprised by how much these devices differ from mobile devices when it comes to GPU size and flash memory performance. This contrast in system structure can make it difficult to predict your game's performance on Android.

This guide helps you optimize your game so that its performance is as dependable as possible on the variety of Android devices that run your game. In particular, this guide explains how to configure system tracing of an Android game. The guide then describes how you can use the output report from a system trace to help you consider specific aspects of your game's performance.

Set up a game-based system trace

The Systrace tool, available both as a command-line program and an on-device service, captures a CPU and thread profile of your app over a short period of time. You can use the results shown in a Systrace report to better understand how your game is performing on Android, and to identify how best to improve your game's efficiency and responsiveness.

Systrace is a very low-level tool, which offers the following benefits:

  • Provides ground truth. Systrace captures output directly from the kernel, so the metrics that it captures are nearly identical to those that a series of system calls would report.
  • Consumes few resources. Systrace introduces very low overhead on the device, usually less than 1%, because it streams data into an in-memory buffer.

Optimal settings

Regardless of how you capture the system trace, it's important to give the tool a reasonable set of arguments:

  • Categories: The best set of categories to enable for a game-based system trace is: {sched, freq, idle, am, wm, gfx, view, sync, binder_driver, hal, dalvik}.
  • Buffer size: A general rule is that a buffer size of 10 MB per CPU core allows for a trace that's about 20 seconds long. For example, if a device has two quad-core CPUs (8 cores total), an appropriate value to pass into the systrace program is 80,000 KB (80 MB).

    If your game performs a great deal of context-switching, however, you should increase the buffer to 15 MB per CPU core.

  • Custom events: If you define custom events to capture in your game, enable the -a flag, which allows Systrace to include these custom events in the output report.

If you're using the systrace command-line program, use the following command to capture a system trace that applies best practices for category set, buffer size, and custom events:

python systrace.py -a com.example.myapp -b 80000 -o my_systrace_report.html \
  sched freq idle am wm gfx view sync binder_driver hal dalvik
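The buffer-size rule of thumb and the command above can be sketched as a small helper. This is only an illustration: the function names `buffer_size_kb` and `systrace_command` are hypothetical, and the sketch counts 1 MB as 1,000 KB because the guide equates 80 MB with 80,000 KB.

```python
# Categories recommended in this guide for a game-based system trace.
CATEGORIES = ["sched", "freq", "idle", "am", "wm", "gfx", "view",
              "sync", "binder_driver", "hal", "dalvik"]


def buffer_size_kb(cpu_cores, heavy_context_switching=False):
    """Apply the rule of thumb: 10 MB per core, or 15 MB per core for
    games that perform a great deal of context switching."""
    mb_per_core = 15 if heavy_context_switching else 10
    # The guide treats 80 MB as 80,000 KB, so 1 MB counts as 1,000 KB here.
    return cpu_cores * mb_per_core * 1000


def systrace_command(package, cpu_cores, output="my_systrace_report.html",
                     heavy_context_switching=False):
    """Assemble the argument list for the systrace command-line program."""
    size_kb = buffer_size_kb(cpu_cores, heavy_context_switching)
    return (["python", "systrace.py", "-a", package,
             "-b", str(size_kb), "-o", output] + CATEGORIES)


# Example for a device with 8 cores.
print(" ".join(systrace_command("com.example.myapp", cpu_cores=8)))
```

Returning the arguments as a list rather than a single string makes the result easy to pass to a process launcher without worrying about shell quoting.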

If you're using the Systrace system app on a device, complete the following steps to capture a system trace that applies best practices for category set, buffer size, and custom events:

  1. Enable the Trace debuggable applications option.
  2. Under Buffer size, select 65536 (about 64 MB). Note that, in order to use this setting, the device must have 256 MB or 512 MB available (depending on whether the CPU has 4 or 8 cores), and each 64-MB piece of memory must be available as a contiguous chunk.
  3. Choose Categories, then enable the categories in the following list:

    • am: Activity Manager
    • binder_driver: Binder Kernel driver
    • dalvik: Dalvik VM
    • freq: CPU Frequency
    • gfx: Graphics
    • hal: Hardware Modules
    • idle: CPU Idle
    • sched: CPU Scheduling
    • sync: Synchronization
    • view: View System
    • wm: Window Manager
  4. Enable Record tracing.

  5. Load your game.

  6. Perform the interactions in your game corresponding to the gameplay whose device performance you want to measure.

  7. Shortly after you see undesirable behavior in your game, turn system tracing off. You've now captured the performance statistics needed to further analyze the issue.

To save disk space, on-device system traces save files in a compressed trace format (*.ctrace). To uncompress this file when generating a report, use the command-line program and include the --from-file option:

python systrace.py --from-file=/data/local/traces/my_game_trace.ctrace \
  -o my_systrace_report.html

Improve specific performance areas

This section highlights several common performance concerns in mobile games and describes how to identify and improve these aspects of your game.

Loading speed

Players want to get into your game's action as quickly as possible, so it's important to improve your game's load times as much as possible. The following measures usually help load times:

  • Perform lazy loading. If you use the same assets across consecutive scenes or levels in your game, load these assets only once.
  • Reduce the size of your assets. That way, you can bundle uncompressed versions of these assets with your game's APK.
  • Use a disk-efficient compression method. An example of such a method is zlib.
  • Use IL2CPP instead of Mono. (Applies only if you're using Unity.) IL2CPP provides better execution performance for your C# scripts.
  • Make your game multithreaded. For more details, see the framerate consistency section.
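As a rough illustration of the lazy-loading and compression measures above, the following sketch loads each asset at most once and decompresses it with zlib on first use. The `AssetCache` class and the `read_compressed` callable are hypothetical names, and the sketch assumes that assets are stored zlib-compressed; a real engine would use its own asset pipeline.

```python
import zlib


class AssetCache:
    """Loads each asset on first request and reuses it afterwards."""

    def __init__(self, read_compressed):
        # read_compressed is an assumed callable: asset name -> compressed bytes.
        self._read_compressed = read_compressed
        self._assets = {}
        self.loads = 0  # counts how many times storage was actually read

    def get(self, name):
        if name not in self._assets:
            # First use: read from storage and decompress once.
            self._assets[name] = zlib.decompress(self._read_compressed(name))
            self.loads += 1
        return self._assets[name]


# Usage: two consecutive levels request the same asset, but it loads once.
storage = {"hero.png": zlib.compress(b"pixel data")}
cache = AssetCache(storage.__getitem__)
level_one_asset = cache.get("hero.png")
level_two_asset = cache.get("hero.png")
```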

Framerate consistency

One of the most important elements of gameplay experience is achieving a consistent framerate. To make this goal easier to achieve, follow the optimization techniques discussed in this section.

Multithreading

When developing for multiple platforms, it's natural to place all activity within your game in a single thread. Although this method of execution is simple to implement in many game engines, it's far from optimal when running on Android devices. As a result, single-threaded games often load slowly and lack a consistent framerate.

The Systrace shown in Figure 1 displays behavior that's typical of a game running on only one CPU at a time:

Diagram of threads  within a system trace

Figure 1. Systrace report for a single-threaded game

To improve your game's performance, make your game multithreaded. Typically, the best model is to have 2 threads:

  • A game thread, which contains your game's main modules and sends render commands.
  • A render thread, which receives render commands and translates them into graphics commands that a device's GPU can use to display a scene.
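The two-thread model can be sketched structurally as a producer and a consumer joined by a command queue. This is a simplified illustration, not engine code: the command tuples and the `run_frames` function are hypothetical, and Python threads do not run in parallel the way native threads do, so the sketch shows only the hand-off pattern.

```python
import queue
import threading


def run_frames(frame_count):
    """Run a game thread and a render thread that share a command queue."""
    # A small bound keeps the game thread from running far ahead of rendering.
    commands = queue.Queue(maxsize=2)
    rendered = []

    def game_thread():
        for frame in range(frame_count):
            # Game logic would run here, then emit a render command.
            commands.put(("draw_scene", frame))
        commands.put(None)  # sentinel: no more frames

    def render_thread():
        while True:
            command = commands.get()
            if command is None:
                break
            # A real renderer would translate this into graphics commands.
            rendered.append(command)

    threads = [threading.Thread(target=game_thread),
               threading.Thread(target=render_thread)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return rendered


frames = run_frames(3)
```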

The Vulkan API expands upon this model, given its capability to push two command buffers in parallel. Using this feature, you can distribute multiple render threads across multiple CPUs, further improving a scene's rendering time.

You can also make some engine-specific changes to enhance your game's multithreading performance:

  • If you're developing your game using the Unity game engine, enable the Multithreaded Rendering and GPU Skinning options.
  • If you're using a custom rendering engine, make sure that the render command pipeline and graphics command pipeline are aligned correctly; otherwise, you could introduce delays in displaying your game's scenes.

After applying these changes, you should see your game occupying at least 2 CPUs simultaneously, as shown in Figure 2:

Diagram of threads  within a system trace

Figure 2. Systrace report for a multi-threaded game

UI element loading

Diagram of a frame    stack within a system trace

Figure 3. Systrace report for a game that's rendering dozens of UI elements at the same time

When creating a feature-rich game, it's tempting to show many different options and actions to the player at the same time. To maintain a consistent framerate, however, it's important to consider the relatively small size of mobile displays and keep your UI as simple as possible.

The Systrace report shown in Figure 3 is an example of a UI frame that's attempting to render too many elements relative to a mobile device's capabilities.

A good goal is to reduce the UI update time to 2-3 milliseconds. You can achieve such quick updates by performing optimizations similar to the following:

  • Update only the elements on screen that have moved.
  • Limit the number of UI textures and layers. Consider combining graphics calls, such as shaders and textures, that use the same material.
  • Defer element animation operations to the GPU.
  • Perform more aggressive frustum and occlusion culling.
  • If possible, perform draw operations using the Vulkan API. The draw call overhead is lower on Vulkan.
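The first optimization in the list, updating only the elements that have moved, is often implemented with a dirty flag. The following is a minimal sketch under that assumption; `UiElement`, `move_to`, and `update_ui` are hypothetical names rather than part of any Android or engine API.

```python
class UiElement:
    """A UI element that remembers whether it needs to be redrawn."""

    def __init__(self, name, x, y):
        self.name = name
        self.x = x
        self.y = y
        self.dirty = True  # a new element needs its first draw

    def move_to(self, x, y):
        # Mark the element dirty only if its position actually changed.
        if (x, y) != (self.x, self.y):
            self.x, self.y = x, y
            self.dirty = True


def update_ui(elements, redraw):
    """Redraw only dirty elements and return how many were redrawn."""
    redrawn = 0
    for element in elements:
        if element.dirty:
            redraw(element)
            element.dirty = False
            redrawn += 1
    return redrawn


# Usage: after the first frame, only the element that moved is redrawn.
drawn = []
elements = [UiElement("score", 0, 0), UiElement("health", 10, 0)]
first_frame = update_ui(elements, drawn.append)
elements[0].move_to(5, 0)
elements[1].move_to(10, 0)  # same position, so it stays clean
second_frame = update_ui(elements, drawn.append)
```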

Power consumption

Even after making the optimizations discussed in the previous section, you might notice that your game's framerate deteriorates within the first 45-50 minutes of gameplay. Furthermore, the device might begin to heat up and consume more battery over time.

In many cases, this undesirable set of thermals and power consumption is related to how your game's workload is distributed across a device's CPUs. To increase your game's power consumption efficiency, apply the best practices shown in the following sections.

Keep memory-heavy threads on one CPU

On many mobile devices, the L1 caches reside on specific CPUs, and L2 caches reside on the set of CPUs that share a clock. To maximize L1 cache hits, it's generally best to keep your game's main thread, along with any other memory-heavy threads, running on a single CPU.

Defer short-duration work to lower-powered CPUs

Most game engines, including Unity, know to defer worker thread operations onto a different CPU relative to your game's main thread. However, the engine isn't aware of a device's specific architecture and cannot anticipate your game's workload as well as you can.

Most system-on-a-chip devices have at least 2 shared clocks, one for the device's fast CPUs and one for the device's slow CPUs. A consequence of this architecture is that, if one fast CPU needs to operate at maximum speed, all the other fast CPUs also operate at maximum speed.

The example report shown in Figure 4 shows a game that takes advantage of fast CPUs. However, this high activity level consumes a great deal of power and generates heat quickly.

Diagram of threads  within a system trace

Figure 4. Systrace report showing a suboptimal assignment of threads to the device's CPUs

To reduce overall power usage, it's best to suggest to the scheduler that shorter-duration work (such as loading audio, running worker threads, and executing the choreographer) be deferred to the set of slow CPUs on a device. Transfer as much of this work onto the slow CPUs as you can while maintaining a desired framerate.

Most devices list the slow CPUs before the fast CPUs, but you cannot assume that your device's SOC uses this order. To check, run commands similar to the ones shown in this CPU topology discovery code on GitHub.
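One possible way to tell slow CPUs from fast ones is to compare each core's maximum frequency. The sketch below is not the GitHub code referenced above; it assumes that the Linux sysfs file `cpufreq/cpuinfo_max_freq` is readable for each core, and it treats the cores with the lowest maximum frequency as the slow set. The function names are hypothetical.

```python
import os


def read_max_frequencies(cpu_root="/sys/devices/system/cpu"):
    """Return {cpu index: max frequency in kHz} for cores that expose it."""
    frequencies = {}
    for entry in os.listdir(cpu_root):
        if entry.startswith("cpu") and entry[3:].isdigit():
            path = os.path.join(cpu_root, entry, "cpufreq", "cpuinfo_max_freq")
            try:
                with open(path) as freq_file:
                    frequencies[int(entry[3:])] = int(freq_file.read().strip())
            except (OSError, ValueError):
                continue  # core is offline or the file is not readable
    return frequencies


def slow_cpus(max_frequencies):
    """Return the sorted indices of the cores with the lowest max frequency."""
    if not max_frequencies:
        return []
    lowest = min(max_frequencies.values())
    return sorted(cpu for cpu, freq in max_frequencies.items() if freq == lowest)


# Example with made-up frequencies for a device with 4 slow and 4 fast cores.
example = {0: 1800000, 1: 1800000, 2: 1800000, 3: 1800000,
           4: 2400000, 5: 2400000, 6: 2400000, 7: 2400000}
print(slow_cpus(example))
```

If every core reports the same maximum frequency, this heuristic returns all of them, so treat the result as a hint rather than a definitive topology.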

After you know which CPUs are the slow CPUs on your device, you can declare affinities for your short-duration threads, which the device's scheduler follows. To do so, add the following code within each thread:

#include <sched.h>
#include <sys/types.h>
#include <unistd.h>

pid_t my_pid; // PID of the process containing your thread.

// Assumes that cpu0, cpu1, cpu2, and cpu3 are the "slow CPUs".
cpu_set_t my_cpu_set;
CPU_ZERO(&my_cpu_set);
CPU_SET(0, &my_cpu_set);
CPU_SET(1, &my_cpu_set);
CPU_SET(2, &my_cpu_set);
CPU_SET(3, &my_cpu_set);
sched_setaffinity(my_pid, sizeof(cpu_set_t), &my_cpu_set);

Thermal stress

When devices get too warm, they may throttle the CPU and/or GPU, and this can affect games in unexpected ways. Games that contain complex graphics, heavy computation, or sustained network activity are more likely to encounter issues.

Use the thermal API to monitor temperature changes on the device and take action to maintain lower power usage and cooler device temperature. When the device reports thermal stress, back off ongoing activities to reduce power usage. For example, reduce the frame rate or polygon tessellation.

First, declare the PowerManager object and initialize it in the onCreate() method. Add a thermal status listener to the object.

Kotlin

class MainActivity : AppCompatActivity() {
    lateinit var powerManager: PowerManager

    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        powerManager = getSystemService(Context.POWER_SERVICE) as PowerManager
        powerManager.addThermalStatusListener(thermalListener)
    }
}

Java

public class MainActivity extends AppCompatActivity {
    PowerManager powerManager;

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        ...
        powerManager = (PowerManager) getSystemService(Context.POWER_SERVICE);
        powerManager.addThermalStatusListener(thermalListener);
    }
}

Define the actions to take when the listener detects a status change. If your game uses C/C++, add code to the thermal status levels in onThermalStatusChanged() to call into your native game code using JNI or use the native Thermal API.

Kotlin

val thermalListener = object : PowerManager.OnThermalStatusChangedListener {
    override fun onThermalStatusChanged(status: Int) {
        when (status) {
            PowerManager.THERMAL_STATUS_NONE -> {
                // No thermal status, so no action necessary
            }

            PowerManager.THERMAL_STATUS_LIGHT -> {
                // Add code to handle light thermal increase
            }

            PowerManager.THERMAL_STATUS_MODERATE -> {
                // Add code to handle moderate thermal increase
            }

            PowerManager.THERMAL_STATUS_SEVERE -> {
                // Add code to handle severe thermal increase
            }

            PowerManager.THERMAL_STATUS_CRITICAL -> {
                // Add code to handle critical thermal increase
            }

            PowerManager.THERMAL_STATUS_EMERGENCY -> {
                // Add code to handle emergency thermal increase
            }

            PowerManager.THERMAL_STATUS_SHUTDOWN -> {
                // Add code to handle immediate shutdown
            }
        }
    }
}

Java

PowerManager.OnThermalStatusChangedListener thermalListener =
    new PowerManager.OnThermalStatusChangedListener() {

    @Override
    public void onThermalStatusChanged(int status) {

        switch (status)
        {
            case PowerManager.THERMAL_STATUS_NONE:
                // No thermal status, so no action necessary
                break;

            case PowerManager.THERMAL_STATUS_LIGHT:
                // Add code to handle light thermal increase
                break;

            case PowerManager.THERMAL_STATUS_MODERATE:
                // Add code to handle moderate thermal increase
                break;

            case PowerManager.THERMAL_STATUS_SEVERE:
                // Add code to handle severe thermal increase
                break;

            case PowerManager.THERMAL_STATUS_CRITICAL:
                // Add code to handle critical thermal increase
                break;

            case PowerManager.THERMAL_STATUS_EMERGENCY:
                // Add code to handle emergency thermal increase
                break;

            case PowerManager.THERMAL_STATUS_SHUTDOWN:
                // Add code to handle immediate shutdown
                break;
        }
    }
};
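What a listener does at each level is up to the game. As one possible policy, the sketch below maps each thermal status to a target frame rate, following the earlier suggestion to reduce the frame rate under thermal stress. The integer values mirror the PowerManager thermal status constants (0 for none through 6 for shutdown); the frame-rate targets and the `target_fps` function are assumptions chosen only for illustration.

```python
# Values mirror PowerManager.THERMAL_STATUS_* (NONE = 0 ... SHUTDOWN = 6).
THERMAL_STATUS_NONE = 0
THERMAL_STATUS_LIGHT = 1
THERMAL_STATUS_MODERATE = 2
THERMAL_STATUS_SEVERE = 3
THERMAL_STATUS_CRITICAL = 4
THERMAL_STATUS_EMERGENCY = 5
THERMAL_STATUS_SHUTDOWN = 6

# Hypothetical frame-rate targets; tune these for your own game.
TARGET_FPS = {
    THERMAL_STATUS_NONE: 60,
    THERMAL_STATUS_LIGHT: 60,
    THERMAL_STATUS_MODERATE: 45,
    THERMAL_STATUS_SEVERE: 30,
    THERMAL_STATUS_CRITICAL: 30,
    THERMAL_STATUS_EMERGENCY: 20,
    THERMAL_STATUS_SHUTDOWN: 20,
}


def target_fps(status):
    """Return the frame-rate target for a thermal status.

    Unknown statuses fall back to the most conservative target.
    """
    return TARGET_FPS.get(status, min(TARGET_FPS.values()))
```

Keeping the policy in a table makes it straightforward to adjust the thresholds after profiling on real devices.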

Touch-to-display latency

Games that render frames as quickly as possible create a GPU-bound scenario, where the frame buffer becomes overstuffed. The CPU needs to wait for the GPU, which causes a noticeable delay between a player's input and the input taking effect on screen.

To determine whether you could improve your game's frame pacing, complete the following steps:

  1. Generate a Systrace report that includes the gfx and input categories. These categories contain particularly useful measurements for determining touch-to-display latency.
  2. Check the SurfaceView section of a Systrace report. An overstuffed buffer causes the number of pending buffer draws to oscillate between 1 and 2, as shown in Figure 5:

    Diagram of  buffer queue within a system trace

    Figure 5. Systrace report showing an overstuffed buffer that is periodically too full to accept drawing commands

To mitigate this inconsistency in frame pacing, complete the actions described in the following sections:

Integrate the Android Frame Pacing API into your game

The Android Frame Pacing API helps you perform frame swaps and define a swap interval such that your game maintains a more consistent framerate.

Reduce the resolution of your game's non-UI assets

The displays on modern mobile devices contain many more pixels than a player can process, so it's OK to downsample such that a run of 5 or even 10 pixels all contain one color. Given the structure of most display caches, it's best to reduce the resolution along one dimension only.

However, don't reduce the resolution of your game's UI elements. It's important to preserve the line thickness on these elements to maintain a large enough touch target size for all of your players.

Rendering smoothness

When SurfaceFlinger latches onto a display buffer to show a scene in your game, the CPU activity momentarily increases. If these spikes in CPU activity occur unevenly, it's possible to see stuttering in your game. The diagram in Figure 6 depicts the reason why this occurs:

Diagram of frames  missing a Vsync window because they started drawing too late

Figure 6. Systrace report showing how a frame can miss a Vsync

If a frame starts drawing too late, even by a few milliseconds, it might miss the next display window. The frame must then wait until the next Vsync to be displayed (33 milliseconds when running a game at 30 FPS), which causes a noticeable delay from the player's perspective.
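The cost of a missed Vsync can be worked out with simple arithmetic. The sketch below assumes an idealized display whose Vsync signals arrive at exact multiples of the frame interval starting at time zero; `display_time_ms` is a hypothetical helper, not part of any Android API.

```python
import math


def display_time_ms(ready_ms, fps=30):
    """Return the time of the first Vsync at or after the frame is ready."""
    interval = 1000.0 / fps  # about 33.3 ms at 30 FPS
    return math.ceil(ready_ms / interval) * interval


# A frame that is ready at 30 ms makes the first Vsync (about 33.3 ms),
# while a frame that is ready at 35 ms waits for the second (about 66.7 ms).
on_time = display_time_ms(30)
late = display_time_ms(35)
```

In this model, finishing 5 milliseconds later delays the frame's appearance by a full interval, which is why small scheduling differences can show up as visible stutter.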

To address this situation, use the Android Frame Pacing API, which always presents a new frame on a VSync wavefront.

Memory state

When running your game for an extended period of time, it's possible for the device to experience out-of-memory errors.

In this situation, check the CPU activity in a Systrace report and see how often the system is making calls to the kswapd daemon. If there are many calls during your game's execution, it's best to take a closer look at how your game is managing and cleaning up memory.

For more information, see Manage memory effectively in games.

Thread state

When navigating through the typical elements of a Systrace report, you can view the amount of time that a given thread spent in each possible thread state by selecting the thread within the report, as shown in Figure 7:

Diagram of a  Systrace report

Figure 7. Systrace report showing how selecting a thread causes the report to display a state summary for that thread

As Figure 7 shows, you might find that your game's threads aren't in the "running" or "runnable" state as often as they should be. The following list shows several common reasons why a given thread might be periodically transitioning to an unusual state:

  • If a thread is sleeping for an extended period of time, it might be suffering from lock contention or waiting for GPU activity.
  • If a thread is constantly blocked on I/O, you're either reading too much data from disk at a time, or your game is thrashing.

Additional resources

To larn more virtually improving your game's functioning, meet the post-obit additional resources:

Videos

  • Systrace for Games presentation from Android Game Developer Summit 2018

Source: https://developer.android.com/games/optimize

Posted by: kimfortind.blogspot.com
