Regardless of the engine or graphics API you are using, there are a number of tweaks that can be applied to game assets to optimize them for Galaxy devices.
In this guide, we highlight optimizations that help you reduce download size, installation size and load time, and increase run-time performance.
When your players decide to buy your game, one of the first things they will look at is package size. For example, do they have enough space to install the game? Can they afford to download the game over a cellular data connection, or will they need Wi-Fi?
The smaller you can make your game, the better the player’s first impression of the game will be.
Consider splitting assets into APK or OBB bundles that are specific to texture format. Use the Google Play Developer Console to configure which bundle should be downloaded, depending on the target device’s supported texture compression formats.
Consider file compression to reduce package sizes.
Avoid redundancy in assets. For example, avoid sparsely-occupied texture atlases.
If mesh data uses significant space, consider 16-bit vertex attribute formats.
If assets can’t be downloaded at install time, avoid long download and set-up times on first run. Consider downloading a subset of assets initially.
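As a rough illustration of the 16-bit vertex attribute suggestion, the sketch below packs the same attributes as 32-bit and as 16-bit (half-precision) floats using Python's standard `struct` module, as an offline asset tool might. The mesh data and the `pack_vertices` helper are hypothetical; real savings depend on which attributes tolerate reduced precision.

```python
import struct

# Hypothetical mesh: each vertex has a 2-component UV and a 3-component normal.
vertices = [
    ((0.0, 0.0), (0.0, 0.0, 1.0)),
    ((1.0, 0.0), (0.0, 1.0, 0.0)),
    ((0.5, 1.0), (1.0, 0.0, 0.0)),
]

def pack_vertices(verts, component_code):
    """Pack every component with the given struct format code.

    "f" is a 32-bit float, "e" is a 16-bit (half-precision) float.
    """
    fmt = "<5" + component_code  # little-endian, 5 components per vertex
    return b"".join(struct.pack(fmt, *uv, *normal) for uv, normal in verts)

full = pack_vertices(vertices, "f")
half = pack_vertices(vertices, "e")
print(f"32-bit: {len(full)} bytes, 16-bit: {len(half)} bytes")
```

Half-precision floats have far fewer mantissa bits, so it is worth checking the visual result for attributes with a large range, such as positions in big meshes.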
Keeping the install size of your game to a minimum also improves the player’s initial impression. If the install size is small, there’s a greater chance they will purchase the game. They may want to have many large games installed simultaneously. If yours is the smallest, there’s less chance of it being uninstalled when they run out of space!
If the game needs to download assets on first launch, warn the user if they do not have sufficient storage space before you start downloading files.
Consider file compression to reduce package size, but beware of the run-time decompression cost!
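One way to weigh package size against decompression cost is to measure both on representative data. The sketch below uses Python's standard `zlib` module purely as a stand-in for whatever compressor your pipeline uses; the `compression_report` helper and the sample payload are hypothetical, and timings on a development machine will differ from those on a device.

```python
import time
import zlib

def compression_report(payload, levels=(1, 6, 9)):
    """Return {level: (compressed_size, decompress_seconds)} for a payload."""
    report = {}
    for level in levels:
        packed = zlib.compress(payload, level)
        start = time.perf_counter()
        restored = zlib.decompress(packed)
        elapsed = time.perf_counter() - start
        if restored != payload:
            raise ValueError("round trip failed")
        report[level] = (len(packed), elapsed)
    return report

# Hypothetical asset: repetitive data standing in for a level file.
asset = b"tile:grass;tile:grass;tile:water;" * 4096

for level, (size, seconds) in compression_report(asset).items():
    print(f"level {level}: {len(asset)} -> {size} bytes, "
          f"{seconds * 1e3:.3f} ms to decompress")
```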
Each Galaxy device your game targets will support one or more compressed texture formats that can be read natively by the device’s GPU. These formats are great for run-time storage space and keeping memory bandwidth consumption low, as the compressed textures can consume significantly fewer bytes of memory than the original assets they were generated from.
GPU compressed texture formats commonly found on Galaxy devices include:
GPU compressed texture formats tend to be lossy. Developers need to find a balance between the compression size used and an acceptable loss of fidelity.
Prefer GPU compressed texture formats. Only consider uncompressed formats if the compressed quality is unacceptable.
Consider compressing to more than one format to improve device/GPU coverage. Split game binaries into asset-specific bundles to reduce install size (see Optimize for download size).
GPU compressed formats are lossy, but most compressors can be configured. Tweak compression settings to suit your needs, such as block size (for ASTC) and whether the alpha channel is needed.
Consider pre-processing images to make the data texture compressor friendly!
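To make the block-size trade-off concrete, the following sketch estimates the storage of a single texture level in ASTC at several block footprints and compares it with uncompressed RGBA8. It relies on the fact that every ASTC block occupies 128 bits regardless of its footprint; the function names and the 1024x1024 texture are illustrative assumptions.

```python
ASTC_BLOCK_BYTES = 16  # every ASTC block is 128 bits, whatever its footprint

def astc_size_bytes(width, height, block_w, block_h):
    """Bytes needed for one texture level; partial blocks are rounded up."""
    blocks_x = (width + block_w - 1) // block_w
    blocks_y = (height + block_h - 1) // block_h
    return blocks_x * blocks_y * ASTC_BLOCK_BYTES

def rgba8_size_bytes(width, height):
    """Bytes needed for one uncompressed level with 8 bits per channel."""
    return width * height * 4

width = height = 1024
print("RGBA8   :", rgba8_size_bytes(width, height))
for block in ((4, 4), (6, 6), (8, 8)):
    size = astc_size_bytes(width, height, *block)
    print(f"ASTC {block[0]}x{block[1]}:", size)
```

Larger footprints spread the same 128 bits over more texels, so they save more space at the cost of fidelity.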
Level of detail scaling is a technique commonly used in 3D games to render high-fidelity assets in the foreground and simplified assets farther away from the camera. Storing assets at multiple levels of detail increases memory consumption, but enables the engine to reduce the cost of rendering objects farther from the camera.
For texturing, the standard solution to LOD is mipmapping. Mipmapped texture sampling is performed efficiently by GPU hardware. It can significantly enhance aesthetics by avoiding moiré patterns. Additionally, it can improve performance by reducing the chance of costly texture cache misses. As objects are rendered farther from the camera, the rate of change between the texture coordinates of each pixel rendered for a given object increases. Sampling from a mip level that matches this rate of change reduces the chance of texture cache misses, as each texel of a smaller mip level represents a larger range of texture coordinates than the previous level.
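The sketch below illustrates two points from the paragraph above: how many levels a full mip chain contains and how much extra memory it costs, and how a mip level can be matched to the rate of change of texture coordinates. The level selection is a deliberately simplified model (log2 of texels covered per pixel, clamped), not a description of any particular GPU; both function names are hypothetical.

```python
import math

def mip_chain(width, height):
    """Return the (width, height) of every mip level down to 1x1."""
    levels = [(width, height)]
    while width > 1 or height > 1:
        width = max(1, width // 2)
        height = max(1, height // 2)
        levels.append((width, height))
    return levels

def mip_level_for_rate(texels_per_pixel, level_count):
    """Simplified level choice: log2 of the texel-to-pixel rate, clamped."""
    if texels_per_pixel <= 1.0:
        return 0
    return min(level_count - 1, int(math.log2(texels_per_pixel)))

levels = mip_chain(1024, 1024)
base_texels = 1024 * 1024
total_texels = sum(w * h for w, h in levels)
print(f"{len(levels)} levels, {total_texels / base_texels:.3f}x the base level")
print("level for 4 texels per pixel:", mip_level_for_rate(4.0, len(levels)))
```

A full chain costs roughly one third more memory than the base level alone.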
For mesh rendering, developers are responsible for providing multiple levels of detail and implementing logic to scale between the LODs based on object-camera distance. Creating meshes at multiple levels of detail offline can be done automatically by tools; however, this can introduce artefacts as it is difficult for the most important features to be preserved through a procedural process. Because of this problem, artists may generate some or all of their mesh LODs by hand.
An additional benefit of LOD scaling is that it makes it much easier for games to adjust visual fidelity depending on the capabilities of the target device. For example, when running on a mid-tier mobile device a game may choose to only upload the lower-fidelity LOD levels to the GPU. Doing so reduces run-time memory consumption and reduces the total number of cycles required to render the assets.
Texture mipmaps are easy to generate offline or at run time. Always use them in 3D games.
Use mesh LODs as much as possible. Use automated LOD generation where possible. Create LODs by hand if automated generation gives poor results for key assets, e.g. game characters.
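A minimal sketch of distance-based mesh LOD selection follows. The switch distances, the `select_lod` name and the `min_lod` parameter (used here to model a lower-tier device that never loads the most detailed meshes) are assumptions for illustration; engines typically add hysteresis or screen-size metrics on top of this.

```python
import bisect

def select_lod(distance, switch_distances, min_lod=0):
    """Pick a mesh LOD index for an object at `distance` from the camera.

    switch_distances[i] is the distance at which LOD i hands over to
    LOD i + 1, sorted in ascending order. LOD 0 is the most detailed.
    min_lod lets a lower-tier device skip the most detailed levels.
    """
    lod = bisect.bisect_right(switch_distances, distance)
    return max(lod, min_lod)

# Hypothetical thresholds for a mesh with four LODs (0 to 3).
switch_distances = [10.0, 30.0, 80.0]

for distance in (5.0, 10.0, 50.0, 100.0):
    print(distance, "->", select_lod(distance, switch_distances))
```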
To avoid duplicating vertex attributes in memory, index buffers are commonly used so that triangles can share vertices: each vertex is stored once and referenced by index as many times as needed. To ensure mesh data is accessed efficiently by drivers and GPUs, attribute and index data should be sorted to improve spatial locality. There are numerous publicly-documented and open-source triangle sorting libraries available, including:
Optimize attribute and index data for efficient cache access.
Ensure each draw call references every vertex between its lowest and highest index values, so that the range contains no unused vertices.
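Both recommendations can be checked offline. The sketch below finds vertices inside a draw's index range that no triangle references, and estimates vertex-cache efficiency as misses per triangle. The FIFO cache and its size are simplifying assumptions rather than a model of any specific GPU, and both function names are hypothetical.

```python
from collections import deque

def unreferenced_in_range(indices):
    """Return vertex indices between min and max that no triangle uses."""
    used = set(indices)
    return [i for i in range(min(indices), max(indices) + 1) if i not in used]

def average_cache_miss_ratio(indices, cache_size=16):
    """Vertex-cache misses per triangle, using a simple FIFO cache model."""
    cache = deque(maxlen=cache_size)  # the oldest entry is evicted when full
    misses = 0
    for index in indices:
        if index not in cache:
            misses += 1
            cache.append(index)
    return misses / (len(indices) // 3)

# The same four triangles in two different submission orders.
coherent = [0, 1, 2,  2, 1, 3,  2, 3, 4,  4, 3, 5]
scattered = [0, 1, 2,  4, 3, 5,  2, 1, 3,  2, 3, 4]

print("coherent :", average_cache_miss_ratio(coherent, cache_size=4))
print("scattered:", average_cache_miss_ratio(scattered, cache_size=4))
print("unused   :", unreferenced_in_range([0, 1, 2, 2, 1, 5]))
```

A lower ratio means more vertices are reused while they are still cached; triangle sorting libraries aim to reduce this figure.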
To conserve memory and bandwidth requirements, games should consider using the minimum acceptable precision for each attribute in a mesh.
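One common way to lower attribute precision is to quantize values to normalized 16-bit integers over a known range, such as a mesh's bounding box. The sketch below shows the idea and the worst-case error it introduces; the function names and the example range are assumptions.

```python
def quantize_unorm16(value, lo, hi):
    """Map a value in [lo, hi] to an unsigned 16-bit integer (0..65535)."""
    t = (value - lo) / (hi - lo)
    t = min(1.0, max(0.0, t))  # clamp values outside the range
    return round(t * 65535)

def dequantize_unorm16(q, lo, hi):
    """Recover an approximate value from its 16-bit representation."""
    return lo + (q / 65535) * (hi - lo)

# Hypothetical mesh whose positions lie within [-2, 2] units on each axis.
lo, hi = -2.0, 2.0
step = (hi - lo) / 65535  # spacing between representable values

for value in (-2.0, -0.3333, 0.0, 1.2345, 2.0):
    q = quantize_unorm16(value, lo, hi)
    error = abs(dequantize_unorm16(q, lo, hi) - value)
    print(f"{value:+.4f} -> {q:5d}, error {error:.2e} (max {step / 2:.2e})")
```

The error is bounded by half the step size, so whether 16 bits is acceptable depends on the range covered and the level of detail required.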
Interleaving attributes into a single buffer results in very efficient memory transfers, as the driver and GPU can read from and write to contiguous blocks of memory. In modern mobile GPU architectures, vertex shader execution may be split into two steps: vertex transformation (which accesses position attributes) and varying execution (which accesses all other attributes). To make the most of these architectures, we recommend using a dedicated vertex buffer for positional data and another to interleave the remaining attributes.
Store positional attribute data in one buffer.
Interleave all remaining attributes in a second buffer.
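The two-buffer layout can be sketched as follows: positions go into one tightly packed buffer and the remaining attributes are interleaved per vertex in a second. The vertex contents, format strings and `build_buffers` helper are hypothetical, and 32-bit floats are used only to keep the example simple.

```python
import struct

# Hypothetical vertices: position (3 floats), normal (3 floats), uv (2 floats).
vertices = [
    {"position": (0.0, 0.0, 0.0), "normal": (0.0, 0.0, 1.0), "uv": (0.0, 0.0)},
    {"position": (1.0, 0.0, 0.0), "normal": (0.0, 0.0, 1.0), "uv": (1.0, 0.0)},
    {"position": (0.0, 1.0, 0.0), "normal": (0.0, 0.0, 1.0), "uv": (0.0, 1.0)},
]

POSITION_FORMAT = "<3f"   # 12 bytes per vertex, positions only
VARYING_FORMAT = "<3f2f"  # normal then uv, interleaved, 20 bytes per vertex

def build_buffers(verts):
    """Return (position_buffer, varying_buffer) as bytes objects."""
    positions = bytearray()
    varyings = bytearray()
    for v in verts:
        positions += struct.pack(POSITION_FORMAT, *v["position"])
        varyings += struct.pack(VARYING_FORMAT, *v["normal"], *v["uv"])
    return bytes(positions), bytes(varyings)

position_buffer, varying_buffer = build_buffers(vertices)
print(len(position_buffer), "bytes of positions,",
      len(varying_buffer), "bytes of interleaved attributes")
```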
There are some situations in 3D rendering, such as cloth simulation, where vertex attributes need to be updated dynamically. To perform these updates efficiently, you should split vertex attributes into interleaved buffers that match the frequency at which the attributes are updated (e.g. static vs. per-frame updates) and consider double-buffering content that changes.
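The double-buffering idea can be sketched with two equally sized buffers that swap roles each frame: one is written by the simulation while the other is read for rendering. The class below is a hypothetical illustration that uses plain `bytearray` objects in place of real GPU buffers, and it ignores the synchronization a real renderer needs.

```python
class DoubleBufferedAttribute:
    """Two equally sized buffers: one being written, one being read."""

    def __init__(self, size_bytes):
        self._buffers = [bytearray(size_bytes), bytearray(size_bytes)]
        self._write_index = 0

    @property
    def write_buffer(self):
        """Buffer the simulation fills during the current frame."""
        return self._buffers[self._write_index]

    @property
    def read_buffer(self):
        """Buffer holding the previous frame's completed data."""
        return self._buffers[1 - self._write_index]

    def swap(self):
        """Call once per frame after the write buffer has been filled."""
        self._write_index = 1 - self._write_index

# Per-frame data (e.g. cloth positions) lives here; static data stays elsewhere.
cloth_positions = DoubleBufferedAttribute(size_bytes=8)
cloth_positions.write_buffer[:] = b"frame--1"
cloth_positions.swap()
print(bytes(cloth_positions.read_buffer))
```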
To ensure caches are accessed efficiently, vertex attributes should not cross 4-byte boundaries. For example, if two 3-component vector attributes are needed (with one byte per component), 1 byte of padding should be added at the end of each. Alternatively, a single-component value should be added after each. When 2-component vectors are used, they should be packed to 4 bytes, either by following them with two single-component attributes or by padding them to the 4-byte boundary.
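The padding rule can be sketched with Python's `struct` module, where `x` in a format string denotes one pad byte. The example assumes 8-bit components, which is what the 1-byte padding figure above implies; the format strings and the `padded_size` helper are hypothetical.

```python
import struct

def padded_size(size_bytes, alignment=4):
    """Round an attribute size up to the next multiple of `alignment`."""
    return (size_bytes + alignment - 1) // alignment * alignment

# Assumed layouts with 8-bit components:
VEC3_PADDED = "<3bx"        # 3 signed bytes + 1 pad byte = 4 bytes
TWO_VEC3_PADDED = "<3bx3bx"  # two padded 3-component vectors = 8 bytes
VEC2_PADDED = "<2B2x"       # 2 unsigned bytes + 2 pad bytes = 4 bytes
VEC2_PLUS_SCALARS = "<2B2B"  # 2-component vector + two single components

for name, fmt in (("vec3 + pad", VEC3_PADDED),
                  ("2 x (vec3 + pad)", TWO_VEC3_PADDED),
                  ("vec2 + pad", VEC2_PADDED),
                  ("vec2 + 2 scalars", VEC2_PLUS_SCALARS)):
    print(f"{name}: {struct.calcsize(fmt)} bytes")

# Packing a normal: the pad byte takes no argument and is written as zero.
packed_normal = struct.pack(VEC3_PADDED, 0, 0, 127)
print(len(packed_normal), "bytes per padded normal")
```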