Lineage 2 Mobile is one of the top-grossing games on Google Play. It is an MMORPG made by NCSoft in Korea. We worked on it for almost four months, online and offline, with nearly all of our GameDev colleagues. We integrated optimizations from previous projects, developed new ones, and analyzed how content changes affected performance so that we could make well-informed choices when optimizing that content.
This article introduces two changes: one is specific to Vulkan, and the other relates to general rendering.
The first is an optimization related to buffer binding. As you probably know, GLES is a state machine, so once a vertex or index buffer is bound, that binding remains in place. Vulkan, in contrast, has no global state machine; instead, state is recorded into a command buffer. So if consecutive draw calls recorded into the same command buffer use the same vertex or index buffer, it is enough to bind it once.
The engine now skips re-binding a vertex or index buffer that the previous draw call already bound, provided the conditions match (same pipeline, same CommandBuffer, same frame number), because the binding is already recorded in the command buffer and still applies to the next draw.
To compare performance under identical conditions, we fixed the CPU and GPU frequencies. The change gave a benefit of 1 fps.
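The skip logic described above can be sketched as a small cache that remembers what was last bound. This is only an illustration under stated assumptions, not the engine's actual code: `Handle`, `BindDecision`, and `BindCache` are hypothetical names, and plain integers stand in for the real Vulkan handles. Where it reports `true`, a real renderer would record `vkCmdBindVertexBuffers` or `vkCmdBindIndexBuffer`.

```cpp
#include <cstdint>

// Hypothetical stand-in for VkCommandBuffer / VkPipeline / VkBuffer handles.
using Handle = std::uint64_t;

// Tells the caller which bind commands still have to be recorded.
struct BindDecision {
    bool bindVertex;
    bool bindIndex;
};

// Remembers the last binding recorded for a command buffer (illustrative only).
class BindCache {
public:
    BindDecision update(Handle commandBuffer, Handle pipeline,
                        std::uint64_t frameNumber,
                        Handle vertexBuffer, Handle indexBuffer) {
        // The cached bindings are only trusted under the same conditions the
        // article lists: same pipeline, same command buffer, same frame number.
        const bool sameContext = valid_ &&
                                 commandBuffer == commandBuffer_ &&
                                 pipeline == pipeline_ &&
                                 frameNumber == frameNumber_;

        const BindDecision decision{
            !sameContext || vertexBuffer != vertexBuffer_,
            !sameContext || indexBuffer != indexBuffer_};

        // Store the new state for the next draw call.
        commandBuffer_ = commandBuffer;
        pipeline_ = pipeline;
        frameNumber_ = frameNumber;
        vertexBuffer_ = vertexBuffer;
        indexBuffer_ = indexBuffer;
        valid_ = true;
        return decision;
    }

    // Forget everything, for example when a new command buffer begins.
    void reset() { valid_ = false; }

private:
    Handle commandBuffer_ = 0;
    Handle pipeline_ = 0;
    Handle vertexBuffer_ = 0;
    Handle indexBuffer_ = 0;
    std::uint64_t frameNumber_ = 0;
    bool valid_ = false;
};
```

In this sketch, any change of pipeline, command buffer, or frame number forces both binds again, which is the conservative choice; a draw that changes only the vertex buffer re-binds only that buffer.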
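// Read the custom stencil value from the red channel and scale it from [0, 1] back to [0, 255].
MaterialFloat Stencil = Texture2DSample(MobileSceneTextures.MobileCustomStencilTexture, MobileSceneTextures.MobileCustomStencilTextureSampler, UV).r * 255.0;
// Round to the nearest integer stencil value.
Stencil = floor(Stencil + 0.5);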
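The rounding step in the shader snippet above can be checked on the CPU. The following is a minimal C++ sketch, assuming the stencil value is stored in an 8-bit normalized channel (which the `* 255.0` scale suggests); `sampleAsUnorm` and `decodeStencil` are hypothetical helper names used only for illustration.

```cpp
#include <cmath>
#include <cstdint>

// Models sampling an 8-bit value from a normalized (UNORM) channel:
// the sampler returns stored / 255 as a float in [0, 1].
inline float sampleAsUnorm(std::uint8_t stored) {
    return static_cast<float>(stored) / 255.0f;
}

// Mirrors the shader: scale back to [0, 255], then round to the nearest
// integer so small floating-point errors cannot produce the wrong value.
inline int decodeStencil(float sampledRed) {
    return static_cast<int>(std::floor(sampledRed * 255.0f + 0.5f));
}
```

Adding 0.5 before `floor` matters because the scaled value can land slightly below the intended integer, in which case a plain `floor` would return the value one lower.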
After this change, the color channel of the framebuffer is different too. The color changes from gray to red, but because only one color channel is used, the final result is the same.
Sometimes switching to a format with fewer bits per pixel can help performance, even if the change discards some color data. There is a ‘PostProcessMaterial’ pass that draws the characters’ outlines. We found that it used 4xFP16, but this could be reduced to a packed 32-bit format because the pass is only used for drawing outlines.
With the maximum frame rate capped at 30 fps, the actual difference shows up in GPU usage: a 2% benefit.
The rendered scene is not exactly the same, but the difference is hard to recognize. If we changed the format for base rendering, the difference would be easy to see, but because this pass only draws the characters’ outlines, it is usually hard to notice.
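To get a feel for the saving, the uncompressed size of the render target can be estimated from its bits per pixel. This is a rough sketch only: the 1920x1080 resolution is an assumption chosen for illustration, `renderTargetBytes` is a hypothetical helper, and the article does not name the exact packed 32-bit format. Driver-side compression is ignored here.

```cpp
#include <cstdint>

// Uncompressed size in bytes of one render target.
constexpr std::uint64_t renderTargetBytes(std::uint32_t width,
                                          std::uint32_t height,
                                          std::uint32_t bitsPerPixel) {
    return static_cast<std::uint64_t>(width) * height * bitsPerPixel / 8;
}

constexpr std::uint32_t kFp16x4BitsPerPixel = 4 * 16;  // four FP16 channels
constexpr std::uint32_t kPacked32BitsPerPixel = 32;    // any packed 32-bit format

// Assumed resolution, for illustration only.
constexpr std::uint32_t kWidth = 1920;
constexpr std::uint32_t kHeight = 1080;

constexpr std::uint64_t kFp16x4Bytes =
    renderTargetBytes(kWidth, kHeight, kFp16x4BitsPerPixel);    // 16,588,800
constexpr std::uint64_t kPacked32Bytes =
    renderTargetBytes(kWidth, kHeight, kPacked32BitsPerPixel);  // 8,294,400
```

Under these assumptions the packed format halves the data written and read for that pass on every frame, which is where a bandwidth saving would come from.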
When changing an asset format, it is wise to check that the new format supports GPU driver compression such as Arm AFBC or Qualcomm UBWC. If the format does not support compression, the change could decrease performance.
With all of our optimizations integrated, we get a benefit of 2 to 4 fps compared with GLES.
On lower-end devices such as the S8, the benefit is larger. This was measured with the maximum frame rate set to 30 fps.
Both of these optimizations are fairly well known, and it is easy to assume that each will bring only a minor benefit. We all like to find huge optimizations that make a game run twice as fast, but those are rare. Diligently working through changes like these, while not so glamorous, is often the main opportunity to improve the user experience. Additionally, because these changes were easy to implement, making them was fully justified.
Thanks to the GameDev engineers: Alon Or-bach, Aton Sinyavskiy, Dohyun Kim, Fedir Nekrasov, Igor Nazarov, Inae Kim, Joonyong Park, Junsik Kong, Lewis Gordon, Kostiantyn Drabeniuk, Munseong Kang, Nataliia Kozoriz, Oleksii Vasylenko, Serhii Pavliv, Seunghwan Lee