Performance recommendations for Unity

This article builds on the performance recommendations for mixed reality, but focuses on Unity-specific improvements.

We recently released an application called Quality Fundamentals that covers common performance, design, and environment issues and solutions for HoloLens 2 apps. This app is a great visual demo for the content that follows.

The most important first step when optimizing performance of mixed reality apps in Unity is to be sure you're using the recommended environment settings for Unity. That article contains content with some of the most important scene configurations for building performant Mixed Reality apps. Some of these recommended settings are highlighted below, as well.

How to profile with Unity

Unity provides the Unity Profiler built-in, which is a great resource to gather valuable performance insights for your particular app. Although you can run the profiler in-editor, these metrics don't represent the true runtime environment, so results should be used cautiously. We recommend remotely profiling your application while running on device for the most accurate and actionable insights. Further, Unity's Frame Debugger is also a powerful and insightful tool to use.

Note

Unity provides the ability to easily modify the render target resolution of your application at runtime through the XRSettings.renderViewportScale property. The final image presented on device has a fixed resolution. The platform will sample the lower resolution output to build a higher resolution image for rendering on displays.

                UnityEngine.XR.XRSettings.renderViewportScale = 0.7f;                              

Unity provides great documentation for:

  1. How to connect the Unity profiler to UWP applications remotely
  2. How to effectively diagnose performance issues with the Unity Profiler

Note

With the Unity Profiler connected and after adding the GPU profiler (see Add Profiler in top right corner), one can see how much time is being spent on the CPU & GPU respectively in the middle of the profiler. This allows the developer to get a quick approximation of whether their application is CPU or GPU bound.

Unity CPU vs GPU

CPU performance recommendations

The content below covers more in-depth performance practices, especially targeted for Unity & C# development.

Cache references

We recommend caching references to all relevant components and GameObjects at initialization because repeating function calls such as GetComponent<T>() and Camera.main are more expensive relative to the memory cost to store a pointer. Camera.main just uses FindGameObjectsWithTag() underneath, which expensively searches your scene graph for a camera object with the "MainCamera" tag.

    using UnityEngine;
    using System.Collections;

    public class ExampleClass : MonoBehaviour
    {
        private Camera cam;
        private CustomComponent comp;

        void Start()
        {
            // Look up the references once and keep them.
            cam = Camera.main;
            comp = GetComponent<CustomComponent>();
        }

        void Update()
        {
            // Good
            this.transform.position = cam.transform.position + cam.transform.forward * 10.0f;

            // Bad
            this.transform.position = Camera.main.transform.position + Camera.main.transform.forward * 10.0f;

            // Good
            comp.DoSomethingAwesome();

            // Bad
            GetComponent<CustomComponent>().DoSomethingAwesome();
        }
    }

Note

Avoid GetComponent(string)
When using GetComponent(), there are a handful of different overloads. It is important to always use the Type-based implementations and never the string-based searching overload. Searching by string in your scene is significantly more costly than searching by Type.
(Good) Component GetComponent(Type type)
(Good) T GetComponent<T>()
(Bad) Component GetComponent(string)

Avoid expensive operations

  1. Avoid use of LINQ

    Although LINQ can be clean and easy to read and write, it generally requires more computation and memory than if you wrote the algorithm manually.

        // Example Code
        using System.Linq;

        List<int> data = new List<int>();
        data.Any(x => x > 10);

        var result = from x in data
                     where x > 10
                     select x;
  2. Common Unity APIs

    Certain Unity APIs, although useful, can be expensive to execute. Most of these involve searching your entire scene graph for some matching list of GameObjects. These operations can generally be avoided by caching references or implementing a manager component for the GameObjects to track the references at runtime.

        GameObject.SendMessage()
        GameObject.BroadcastMessage()
        GameObject.Find()
        GameObject.FindWithTag()
        UnityEngine.Object.FindObjectOfType()
        UnityEngine.Object.FindObjectsOfType()
        GameObject.FindGameObjectsWithTag()

Note

SendMessage() and BroadcastMessage() should be eliminated at all costs. These functions can be on the order of 1000x slower than direct function calls.
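One way to avoid both the scene searches and the message-based calls described above is to let components register themselves with a manager and then call them directly. The following is a minimal sketch of that idea, not a prescribed design: `Enemy`, `EnemyRegistry`, `Alert`, and `AlertAll` are hypothetical names chosen for illustration.

```csharp
using System.Collections.Generic;
using UnityEngine;

// Hypothetical component that registers itself instead of being searched for.
// In a Unity project this class would live in its own file, Enemy.cs.
public class Enemy : MonoBehaviour
{
    void OnEnable()  { EnemyRegistry.Register(this); }
    void OnDisable() { EnemyRegistry.Unregister(this); }

    // Invoked through a direct call rather than SendMessage("Alert").
    public void Alert()
    {
        // React to the alert here.
    }
}

// Hypothetical registry that tracks the enabled instances at runtime.
public static class EnemyRegistry
{
    private static readonly List<Enemy> enemies = new List<Enemy>();

    public static void Register(Enemy enemy)   { enemies.Add(enemy); }
    public static void Unregister(Enemy enemy) { enemies.Remove(enemy); }

    public static void AlertAll()
    {
        // Iterate backwards so an enemy that disables itself inside Alert()
        // (and so unregisters) does not cause another entry to be skipped.
        for (int i = enemies.Count - 1; i >= 0; i--)
        {
            enemies[i].Alert();
        }
    }
}
```

With this in place, calling `EnemyRegistry.AlertAll()` replaces a `FindObjectsOfType` search followed by `SendMessage` calls.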

  1. Beware of boxing

    Boxing is a core concept of the C# language and runtime. It's the process of wrapping value-typed variables such as char, int, bool, etc. into reference-typed variables. When a value-typed variable is "boxed", it's wrapped in a System.Object, which is stored on the managed heap. Memory is allocated and eventually when disposed must be processed by the garbage collector. These allocations and deallocations incur a performance cost and in many scenarios are unnecessary or can be easily replaced by a less expensive alternative.

    To avoid boxing, be sure that the variables, fields, and properties in which you store numeric types and structs (including Nullable<T>) are strongly typed as specific types such as int, float? or MyStruct, instead of using object. If putting these objects into a list, be sure to use a strongly typed list such as List<int> rather than List<object> or ArrayList.

    Example of boxing in C#

        // boolean value type is boxed into object boxedMyVar on the heap
        bool myVar = true;
        object boxedMyVar = myVar;
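To make the list recommendation above concrete, the sketch below contrasts an ArrayList, which boxes every int it stores, with a List<int>, which does not. The class and method names (`BoxingExample`, `SumBoxed`, `SumTyped`) are made up for this illustration.

```csharp
using System.Collections;
using System.Collections.Generic;

public static class BoxingExample
{
    // Boxes every element: ArrayList stores object references.
    public static int SumBoxed(int count)
    {
        ArrayList values = new ArrayList();
        for (int i = 0; i < count; i++)
        {
            values.Add(i);          // int is boxed into an object on the heap
        }

        int sum = 0;
        foreach (object value in values)
        {
            sum += (int)value;      // unboxing cast
        }
        return sum;
    }

    // No boxing: List<int> stores the int values directly.
    public static int SumTyped(int count)
    {
        List<int> values = new List<int>(count);
        for (int i = 0; i < count; i++)
        {
            values.Add(i);
        }

        int sum = 0;
        foreach (int value in values)
        {
            sum += value;
        }
        return sum;
    }
}
```

Both methods return the same result; the difference is that the first one creates one heap allocation per element for the garbage collector to clean up later.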

Repeating code paths

Any repeating Unity callback functions (such as Update) that are executed many times per second and/or frame should be written carefully. Any expensive operations here will have a huge and consistent impact on performance.

  1. Empty callback functions

    Although the code below may seem innocent to leave in your application, especially since every Unity script auto-initializes with an Update method, these empty callbacks can become expensive. Unity operates back and forth between an unmanaged and managed code boundary, between UnityEngine code and your application code. Context switching over this bridge is fairly expensive, even if there's nothing to execute. This becomes especially problematic if your app has 100s of GameObjects with components that have empty repeating Unity callbacks.

                      void Update() { }                                  

Note

Update() is the most common manifestation of this performance issue, but other repeating Unity callbacks, such as the following, can be equally as bad, if not worse: FixedUpdate(), LateUpdate(), OnPostRender(), OnPreRender(), OnRenderImage(), etc.

  1. Operations to favor running once per frame

    The following Unity APIs are common operations for many Holographic Apps. Although not always possible, the results from these functions can commonly be computed once and the results reused across the application for a given frame.

    a) It's good practice to have a dedicated Singleton class or service to handle your gaze Raycast into the scene and then reuse this result in all other scene components, instead of making repeated and identical Raycast operations by each component. Some applications may require raycasts from different origins or against different LayerMasks.

        UnityEngine.Physics.Raycast()
        UnityEngine.Physics.RaycastAll()

    b) Avoid GetComponent() operations in repeated Unity callbacks like Update() by caching references in Start() or Awake()

        UnityEngine.Object.GetComponent()

    c) It's good practice to instantiate all objects, if possible, at initialization and use object pooling to recycle and reuse GameObjects throughout runtime of your application

        UnityEngine.Object.Instantiate()
  2. Avoid interfaces and virtual constructs

    Invoking function calls through interfaces vs direct objects or calling virtual functions can often be much more expensive than using direct constructs or direct function calls. If the virtual function or interface is unnecessary, then it should be removed. However, the performance hit for these approaches is worth the trade-off if using them simplifies development collaboration, code readability, and code maintainability.

    Generally, the recommendation is to not mark fields and functions as virtual unless there's a clear expectation that this member needs to be overwritten. One should be especially careful around high-frequency code paths that are called many times per frame or even once per frame such as an UpdateUI() method.

  3. Avoid passing structs by value

    Unlike classes, structs are value-types and when passed directly to a function, their contents are copied into a newly created instance. This copy adds CPU cost, as well as additional memory on the stack. For small structs, the effect is minimal and thus acceptable. However, for functions repeatedly invoked every frame as well as functions taking large structs, if possible modify the function definition to pass by reference. Learn more here
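As an illustration of passing structs by reference, the sketch below compares a by-value parameter with an `in` (read-only reference) parameter. It assumes a compiler that supports C# 7.2 or later; on older versions `ref` can be used instead. `LargeData` and `StructPassing` are hypothetical names.

```csharp
// Hypothetical 64-byte struct, large enough that copying it is not free.
public struct LargeData
{
    public double A, B, C, D, E, F, G, H;
}

public static class StructPassing
{
    // The whole struct is copied into the parameter on every call.
    public static double SumByValue(LargeData data)
    {
        return data.A + data.B + data.C + data.D
             + data.E + data.F + data.G + data.H;
    }

    // Only a read-only reference is passed; the struct is not copied for the call.
    public static double SumByReference(in LargeData data)
    {
        return data.A + data.B + data.C + data.D
             + data.E + data.F + data.G + data.H;
    }
}
```

A call site looks like `StructPassing.SumByReference(in myData)`. Whether the difference is measurable depends on struct size and call frequency, so it is worth confirming with the profiler before changing signatures.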

Miscellaneous

  1. Physics

    a) Generally, the easiest way to improve physics is to limit the amount of time spent on Physics or the number of iterations per second. This will reduce simulation accuracy. See TimeManager in Unity

    b) The types of colliders in Unity have widely different performance characteristics. The order below lists the most performant colliders to least performant colliders from left to right. It's important to avoid Mesh Colliders, which are substantially more expensive than the primitive colliders.

    Sphere < Capsule < Box <<< Mesh (Convex) < Mesh (non-Convex)

    See Unity Physics Best Practices for more info

  2. Animations

    Disable idle animations by disabling the Animator component (disabling the game object won't have the same effect). Avoid design patterns where an animator sits in a loop setting a value to the same thing. There's considerable overhead for this technique, with no effect on the application. Learn more here.

  3. Complex algorithms

    If your application is using complex algorithms such as inverse kinematics, path finding, etc, look to find a simpler approach or adjust relevant settings for their performance
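The physics and animation suggestions above could be applied from script roughly as follows. This is a sketch under assumptions: the 30 Hz physics rate is an arbitrary illustrative value (Unity's default fixed timestep is 0.02 s), and `PerformanceTuning` and `SetIdle` are hypothetical names.

```csharp
using UnityEngine;

public class PerformanceTuning : MonoBehaviour
{
    private Animator animator;

    void Awake()
    {
        // Fewer physics iterations per second: cheaper, but less accurate.
        // 30 Hz is an example value only; tune it for your application.
        Time.fixedDeltaTime = 1.0f / 30.0f;

        // Cache the Animator once (it may be absent on this GameObject).
        animator = GetComponent<Animator>();
    }

    // Hypothetical hook: call this from whatever logic decides
    // that the object has stopped or resumed animating.
    public void SetIdle(bool idle)
    {
        if (animator != null)
        {
            // Disable the Animator component itself, not the GameObject.
            animator.enabled = !idle;
        }
    }
}
```

The same fixed timestep can also be set without code in the Time settings of the project.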

CPU-to-GPU performance recommendations

Generally, CPU-to-GPU performance comes down to the draw calls submitted to the graphics card. To improve performance, draw calls need to be strategically a) reduced or b) restructured for optimal results. Since draw calls themselves are resource-intensive, reducing them will reduce overall work required. Further, state changes between draw calls require costly validation and translation steps in the graphics driver and thus, restructuring of your application's draw calls to limit state changes (for example, different materials) can boost performance.

Unity has a great article that gives an overview and dives into batching draw calls for their platform.

  • Unity Draw Call Batching

Single pass instanced rendering

Single Pass Instanced Rendering in Unity allows for draw calls for each eye to be reduced down to one instanced draw call. Because of cache coherency between two draw calls, there's also some performance improvement on the GPU.

To enable this feature in your Unity Project

  1. Open Player XR Settings (go to Edit > Project Settings > Player > XR Settings)
  2. Select Single Pass Instanced from the Stereo Rendering Method drop-down menu (Virtual Reality Supported checkbox must be checked)

Read the following articles from Unity for details with this rendering approach.

  • How to maximize AR and VR performance with advanced stereo rendering
  • Single Pass Instancing

Note

One common issue with Single Pass Instanced Rendering occurs if developers already have existing custom shaders not written for instancing. After enabling this feature, developers may notice some GameObjects only render in one eye. This is because the associated custom shaders do not have the appropriate properties for instancing.

See Single Pass Stereo Rendering for HoloLens from Unity for how to address this problem

Static batching

Unity is able to batch many static objects to reduce draw calls to the GPU. Static Batching works for most Renderer objects in Unity that 1) share the same material and 2) are all marked as Static (Select an object in Unity and select the checkbox in the top right of the inspector). GameObjects marked as Static cannot be moved throughout your application's runtime. Thus, static batching can be difficult to leverage on HoloLens where virtually every object needs to be placed, moved, scaled, etc. For immersive headsets, static batching can dramatically reduce draw calls and thus improve performance.

Read Static Batching under Draw Call Batching in Unity for more details.

Dynamic batching

Since it's problematic to mark objects as Static for HoloLens development, dynamic batching can be a great tool to compensate for this lacking feature. It can also be useful on immersive headsets. However, dynamic batching in Unity can be difficult to enable because GameObjects must a) share the same Material and b) meet a long list of other criteria.

Read Dynamic Batching under Draw Call Batching in Unity for the full list. Most commonly, GameObjects become invalid to be batched dynamically, because the associated mesh data can be no more than 300 vertices.

Other techniques

Batching can only occur if multiple GameObjects are able to share the same material. Typically, this will be blocked by the need for GameObjects to have a unique texture for their respective Material. It's common to combine Textures into one big Texture, a method known as Texture Atlasing.

Furthermore, it's preferable to combine meshes into one GameObject where possible and reasonable. Each Renderer in Unity will have its associated draw call(s) versus submitting a combined mesh under one Renderer.

Note

Modifying properties of Renderer.material at runtime will create a copy of the Material and thus potentially break batching. Use Renderer.sharedMaterial to modify shared material properties across GameObjects.
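The difference between the two properties can be sketched as follows. `SharedTint` is a hypothetical component name, and the example assumes the material's shader exposes a main color property, which is what `Material.color` reads and writes.

```csharp
using UnityEngine;

public class SharedTint : MonoBehaviour
{
    void Start()
    {
        Renderer rend = GetComponent<Renderer>();
        if (rend == null)
        {
            return;
        }

        // Accessing .material creates a per-object copy of the Material,
        // which can prevent this object from batching with others:
        // rend.material.color = Color.red;

        // Accessing .sharedMaterial edits the one Material that every
        // renderer using it shares, so batching is preserved:
        rend.sharedMaterial.color = Color.red;
    }
}
```

Keep in mind that a change made through `sharedMaterial` affects every object using that material, and in the Editor it can also persist in the material asset itself.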

GPU performance recommendations

Learn more about optimizing graphics rendering in Unity

Bandwidth and fill rates

When rendering a frame on the GPU, an application is either bound by memory bandwidth or fill rate.

  • Memory bandwidth is the rate of reads and writes the GPU can do from memory
    • In Unity, change Texture Quality in Edit > Project Settings > Quality Settings.
  • Fill rate refers to the pixels that can be drawn per second by the GPU.
    • In Unity, use the XRSettings.renderViewportScale property.

Optimize depth buffer sharing

It's recommended to enable Depth buffer sharing under Player XR Settings to optimize for hologram stability. When enabling depth-based late-stage reprojection with this setting however, it's recommended to select 16-bit depth format instead of 24-bit depth format. The 16-bit depth buffers will drastically reduce the bandwidth (and thus power) associated with depth buffer traffic. This can be a big win both in power reduction and performance improvement. However, there are two possible negative outcomes by using 16-bit depth format.

Z-Fighting

The reduced depth range fidelity makes z-fighting more likely to occur with 16-bit than 24-bit. To avoid these artifacts, modify the near/far clip planes of the Unity camera to account for the lower precision. For HoloLens-based applications, a far clip plane of 50 m instead of the Unity default 1000 m can generally eliminate any z-fighting.
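The far clip plane can be set in the camera's Inspector or from script. The sketch below applies the 50 m value mentioned above to the main camera at startup; `ClipPlaneSetup` is a hypothetical component name, and the near plane is deliberately left at whatever your project already uses.

```csharp
using UnityEngine;

public class ClipPlaneSetup : MonoBehaviour
{
    // 50 m follows the guidance above; adjust it to your scene's scale.
    [SerializeField] private float farClipDistance = 50.0f;

    void Start()
    {
        // Camera.main is looked up once here, not every frame.
        Camera cam = Camera.main;
        if (cam != null)
        {
            cam.farClipPlane = farClipDistance;
        }
    }
}
```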

Disabled Stencil Buffer

When Unity creates a Render Texture with 16-bit depth, there's no stencil buffer created. Selecting 24-bit depth format, per Unity documentation, will create a 24-bit z-buffer, as well as an 8-bit stencil buffer (https://docs.unity3d.com/Manual/SL-Stencil.html) (if 32-bit is applicable on a device, which is generally the case such as HoloLens).

Avoid full-screen effects

Techniques that operate on the full screen can be expensive since their order of magnitude is millions of operations every frame. It's recommended to avoid post-processing effects such as anti-aliasing, bloom, and more.

Optimal lighting settings

Real-time Global Illumination in Unity can provide outstanding visual results but involves expensive lighting calculations. We recommend disabling real-time Global Illumination for every Unity scene file via Window > Rendering > Lighting Settings > Uncheck Real-time Global Illumination.

Furthermore, it's recommended to disable all shadow casting as these also add expensive GPU passes onto a Unity scene. Shadows can be disabled per light but can also be controlled holistically via Quality settings.

Edit > Project Settings, then select the Quality category > Select Low Quality for the UWP Platform. One can also just set the Shadows property to Disable Shadows.
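For projects that prefer to enforce this from code, the shadow settings described above have script equivalents. The following is a sketch only: `ShadowDisabler` and its `lights` field are hypothetical, and the lights are assumed to be assigned in the Inspector so that no scene search is needed.

```csharp
using UnityEngine;

public class ShadowDisabler : MonoBehaviour
{
    // Hypothetical list of lights, assigned in the Inspector.
    [SerializeField] private Light[] lights;

    void Awake()
    {
        // Global switch, equivalent to the Shadows quality setting.
        QualitySettings.shadows = ShadowQuality.Disable;

        // Per-light switch, for cases where only some lights should change.
        if (lights != null)
        {
            foreach (Light sceneLight in lights)
            {
                if (sceneLight != null)
                {
                    sceneLight.shadows = LightShadows.None;
                }
            }
        }
    }
}
```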

We recommend that you use baked lighting with your models in Unity.

Reduce poly count

Polygon count is reduced by either

  1. Removing objects from a scene
  2. Asset decimation, which reduces the number of polygons for a given mesh
  3. Implementing a Level of Detail (LOD) System into your application, which renders far away objects with a lower-polygon version of the same geometry
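LOD levels are usually configured in the Editor with an LOD Group component, but they can also be set up from script. The sketch below assumes two renderers for the same object, one high-detail and one low-detail, assigned in the Inspector; `LodSetup`, the field names, and the 0.5 and 0.1 transition heights are illustrative assumptions.

```csharp
using UnityEngine;

public class LodSetup : MonoBehaviour
{
    // Hypothetical references, assigned in the Inspector.
    [SerializeField] private Renderer highDetail;
    [SerializeField] private Renderer lowDetail;

    void Start()
    {
        LODGroup group = gameObject.AddComponent<LODGroup>();

        // Each value is the relative screen height at which the group
        // transitions away from that level. The numbers are examples only.
        LOD[] lods = new LOD[]
        {
            new LOD(0.5f, new Renderer[] { highDetail }),
            new LOD(0.1f, new Renderer[] { lowDetail })
        };

        group.SetLODs(lods);
        group.RecalculateBounds();
    }
}
```

With these values, the high-detail renderer is used while the object is large on screen, the low-detail one takes over as it shrinks, and the object is culled once it falls below the last threshold.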

Understanding shaders in Unity

An easy approximation to compare shaders in performance is to identify the average number of operations each executes at runtime. This can be done easily in Unity.

  1. Select your shader asset or select a material, then in the top-right corner of the inspector window, select the gear icon followed by "Select Shader"

    Select shader in Unity

  2. With the shader asset selected, select the "Compile and show code" button under the inspector window

    Compile Shader Code in Unity

  3. After compiling, look for the statistics section in the results with the number of different operations for both the vertex and pixel shader (Note: pixel shaders are often also called fragment shaders)

    Unity Standard Shader Operations

Optimize pixel shaders

Looking at the compiled statistic results using the method above, the fragment shader will generally execute more operations than the vertex shader, on average. The fragment shader, also known as the pixel shader, is executed per pixel on the screen output while the vertex shader is only executed per-vertex of all meshes being drawn to the screen.

Thus, not only do fragment shaders have more instructions than vertex shaders because of all the lighting calculations, fragment shaders are almost always executed on a larger dataset. For example, if the screen output is a 2k by 2k image, then the fragment shader can get executed 2,000*2,000 = 4,000,000 times. If rendering two eyes, this number doubles since there are two screens. If a mixed reality application has multiple passes, full-screen post-processing effects, or rendering multiple meshes to the same pixel, this number will increase dramatically.

Therefore, reducing the number of operations in the fragment shader can generally give far greater performance gains over optimizations in the vertex shader.

Unity Standard shader alternatives

Instead of using a physically based rendering (PBR) or another high-quality shader, look at utilizing a more performant and cheaper shader. The Mixed Reality Toolkit provides the MRTK standard shader that has been optimized for mixed reality projects.

Unity also provides an unlit, vertex lit, diffuse, and other simplified shader options that are faster compared to the Unity Standard shader. See Usage and Performance of Built-in Shaders for more detailed information.

Shader preloading

Use Shader preloading and other tricks to optimize shader load time. In particular, shader preloading means you won't see any hitches due to runtime shader compilation.
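Shaders can be preloaded through the project's Graphics settings or warmed up from script. The sketch below assumes you have created a ShaderVariantCollection asset and assigned it in the Inspector; `ShaderWarmup` and the `variants` field are hypothetical names.

```csharp
using UnityEngine;

public class ShaderWarmup : MonoBehaviour
{
    // Hypothetical asset reference, assigned in the Inspector.
    [SerializeField] private ShaderVariantCollection variants;

    void Start()
    {
        // Prepare the listed shader variants up front (for example, behind
        // a loading screen) so they are not compiled on first use.
        if (variants != null && !variants.isWarmedUp)
        {
            variants.WarmUp();
        }
    }
}
```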

Limit overdraw

In Unity, one can display overdraw for their scene by toggling the draw mode menu in the top-left corner of the Scene view and selecting Overdraw.

Generally, overdraw can be mitigated by culling objects ahead of time before they're sent to the GPU. Unity provides details on implementing Occlusion Culling for their engine.

Memory recommendations

Excessive memory allocation & deallocation operations can have adverse effects on your holographic application, resulting in inconsistent performance, frozen frames, and other detrimental behavior. It's especially important to understand memory considerations when developing in Unity since memory management is controlled by the garbage collector.

Garbage collection

Holographic apps will lose processing compute time to the garbage collector (GC) when the GC is activated to analyze objects that are no longer in scope during execution and whose memory needs to be released so it can be made available for reuse. Constant allocations and de-allocations will generally require the garbage collector to run more often, thus hurting performance and user experience.

Unity has provided an excellent page that explains in detail how the garbage collector works and tips to write more efficient code in regards to memory management.

  • Optimizing garbage collection in Unity games

One of the most common practices that leads to excessive garbage collection is not caching references to components and classes in Unity development. Any references should be captured during Start() or Awake() and reused in later functions such as Update() or LateUpdate().

Other quick tips:

  • Use the StringBuilder C# class to dynamically build complex strings at runtime
  • Remove calls to Debug.Log() when no longer needed, as they still execute in all build versions of an app
  • If your holographic app generally requires lots of memory, consider calling System.GC.Collect() during loading phases such as when presenting a loading or transition screen

Object pooling

Object pooling is a popular technique for reducing the cost of continuous object allocation and deallocations. This is done by allocating a large pool of identical objects and reusing inactive, available instances from this pool instead of constantly spawning and destroying objects over time. Object pools are great for reusable components that have variable lifetime during an app.

  • Object Pooling Tutorial in Unity
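A minimal pool along the lines described above might look like the following. This is a sketch rather than the approach from the linked tutorial: `SimplePool`, `Get`, `Release`, and the default size of 16 are assumptions made for illustration.

```csharp
using System.Collections.Generic;
using UnityEngine;

public class SimplePool : MonoBehaviour
{
    // Hypothetical settings, assigned in the Inspector.
    [SerializeField] private GameObject prefab;
    [SerializeField] private int initialSize = 16;

    private readonly Queue<GameObject> available = new Queue<GameObject>();

    void Awake()
    {
        // Pay the instantiation cost once, at initialization.
        for (int i = 0; i < initialSize; i++)
        {
            available.Enqueue(CreateInactive());
        }
    }

    private GameObject CreateInactive()
    {
        GameObject instance = Instantiate(prefab, transform);
        instance.SetActive(false);
        return instance;
    }

    // Hands out an inactive instance, growing the pool only if it is empty.
    public GameObject Get(Vector3 position, Quaternion rotation)
    {
        GameObject instance = available.Count > 0
            ? available.Dequeue()
            : CreateInactive();

        instance.transform.SetPositionAndRotation(position, rotation);
        instance.SetActive(true);
        return instance;
    }

    // Returns an instance to the pool instead of destroying it.
    public void Release(GameObject instance)
    {
        instance.SetActive(false);
        available.Enqueue(instance);
    }
}
```

Callers use `Get` where they would have called Instantiate and `Release` where they would have called Destroy. Pooled objects keep their previous state, so any per-use reset logic is the caller's responsibility.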

Startup performance

Consider starting your app with a smaller scene, then using SceneManager.LoadSceneAsync to load the rest of the scene. This allows your app to get to an interactive state as fast as possible. There may be a large CPU spike while the new scene is being activated, and any rendered content might stutter or hitch. One way to work around this is to set the AsyncOperation.allowSceneActivation property to "false" on the scene being loaded, wait for the scene to load, clear the screen to black, and then set it back to "true" to complete the scene activation.
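The sequence above can be sketched as a coroutine. The scene name "MainScene" and the `DeferredSceneLoader` class are placeholders, and the step that clears the screen to black is left as a comment because it depends on how your app renders its transition.

```csharp
using System.Collections;
using UnityEngine;
using UnityEngine.SceneManagement;

public class DeferredSceneLoader : MonoBehaviour
{
    // Placeholder scene name; it must be present in the build settings.
    [SerializeField] private string sceneName = "MainScene";

    IEnumerator Start()
    {
        AsyncOperation load = SceneManager.LoadSceneAsync(sceneName);
        if (load == null)
        {
            yield break; // the scene could not be found
        }

        // Load in the background but hold off on activation.
        load.allowSceneActivation = false;

        // While activation is held back, progress stops at 0.9.
        while (load.progress < 0.9f)
        {
            yield return null;
        }

        // Clear the screen to black here (app-specific, not shown).

        // Allow the activation to proceed; the hitch now happens
        // while nothing is being shown to the user.
        load.allowSceneActivation = true;
    }
}
```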

Remember that while the startup scene is loading, the holographic splash screen will be displayed to the user.

See also

  • Optimizing graphics rendering in Unity games
  • Optimizing garbage collection in Unity games
  • Physics Best Practices [Unity]
  • Optimizing Scripts [Unity]