By properly leveraging GPU rendering, you can increase the performance of your Flash games and AIR applications and achieve higher frame rates than you would in CPU render mode. This article teaches two methods for converting your graphics into GPU-optimized bitmaps and explains key differences you will experience when publishing your application in GPU mode that are not explicitly covered in Adobe’s optimization guide.
Before Getting Started…
- Download the tutorial files: gpuRenderFiles.zip
- Ensure your publish mode is set to Adobe AIR for Android and that the Render Mode is set to GPU. Adobe AIR currently offers three rendering modes for publishing applications to mobile devices: Auto, CPU, and GPU. Auto currently means CPU mode for all devices. CPU mode means that all rendering operations are handled by the device’s central processor (this setting is always used for Flash Player within the browser). GPU mode means that the device’s graphics processor creates the offscreen pixel buffers and performs the final scene compositing used to output the display. GPU mode is currently available only for mobile devices.
Note: Prior to AIR 2.6, iOS and AIR for Android used their own unique GPU renderers. iOS formerly used a method called GPU Blend, a combination of CPU/GPU rendering where the CPU handles creation of pixel buffers and the GPU handles scene compositing. Android uses a method called GPU Vector where the GPU handles both of these tasks. With AIR 2.6, iOS has transitioned to GPU Vector, although Adobe engineers are, at the time of this article’s publication, still working to bring iOS to parity with Android such that GPU modes perform the same for both devices. Throughout this document, GPU Mode will refer to GPU Vector rendering.
Making Bitmaps out of Vectors
The greatest gain you receive from rendering in GPU mode is faster visuals. The GPU is optimized for quickly moving bitmaps around on the display list, so the first step to maximizing your application’s performance is understanding how to convert CPU-intensive vectors into fast-rendering bitmaps. Throughout this document, I will refer to vectors, bitmaps, cached vectors, and drawn vectors and how they relate to the GPU. In this section, you learn the differences between these types of graphics objects and how to code each one using ActionScript 3.0.
A vector graphic is defined by points, lines and curves that are plotted in space. Unlike bitmaps, vectors can be rescaled to any dimension without visual distortion. Vectors typically require less memory than bitmaps, but require heavier CPU operations to render on the display list. Vector graphics most commonly exist as the shapes that are drawn with the pencil and paint tools. A bitmap is a type of graphic defined by a grid of pixels. Each pixel represents the color information for one square on the grid. Bitmaps have a fixed resolution. This means if you resize or rotate a bitmap image, it will typically suffer visual distortions and jagged edges. Bitmaps are typically the images you import to your library, such as BMPs, JPEGs, and PNGs, but they can also be created dynamically using code.
In Flash, there are two ways you can convert existing vector graphics into bitmaps: caching and drawing.
Caching Vectors as Bitmaps
To bitmap cache a vector means to internally render the graphic as a bitmap such that the computations involved in plotting the graphic do not need to be recalculated when the object is moved on the display list. The code to cache a vector is very simple:
vectorClip.cacheAsBitmap = true;
Bitmap caching greatly improves rendering performance, with the following caveats. If you cache a movieclip or display object with children that animate, Flash must re-cache the bitmap for each frame the movieclip plays or changes its contents. Because creating the cache is system intensive, caching in this way will actually hurt your application’s performance. Also, when using just the cacheAsBitmap property, you should only move the cached object along the x or y axis. Scaling, rotating, adjusting alpha, or applying any other kind of transformation will require Flash to re-cache the object, again hurting performance.
With GPU rendering enabled, you can add an additional line of code that tells the GPU to handle your graphics transformations:
vectorClip.cacheAsBitmap = true;
vectorClip.cacheAsBitmapMatrix = new Matrix();
With cacheAsBitmapMatrix enabled, you can rotate, scale, and change alpha values without regenerating the cached bitmap. Note that cacheAsBitmapMatrix only works in GPU render mode on mobile devices.
In this article, cached vectors refer to vector graphics that have both the cacheAsBitmap and cacheAsBitmapMatrix properties activated.
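As a rough sketch of how the two properties might be wrapped up for reuse, the timeline code below defines a helper and applies it to a stand-in shape. The helper name cacheForGPU and the demoClip sprite are my own illustrative assumptions and are not part of the tutorial files; the code also assumes you are compiling for AIR, since cacheAsBitmapMatrix is not available in the browser player.

```actionscript
import flash.display.DisplayObject;
import flash.display.Sprite;
import flash.geom.Matrix;

// Hypothetical helper: enables both caching properties on a display object.
function cacheForGPU(target:DisplayObject):void {
    target.cacheAsBitmap = true;
    // An identity matrix caches the object at its natural size and orientation.
    target.cacheAsBitmapMatrix = new Matrix();
}

// Stand-in vector artwork (hypothetical); a library movieclip works the same way.
var demoClip:Sprite = new Sprite();
demoClip.graphics.beginFill(0x3366FF);
demoClip.graphics.drawRoundRect(0, 0, 80, 50, 12, 12);
demoClip.graphics.endFill();
addChild(demoClip);

cacheForGPU(demoClip);

// In GPU render mode these changes should reuse the cached bitmap.
demoClip.rotation = 45;
demoClip.scaleX = demoClip.scaleY = 1.5;
demoClip.alpha = 0.5;
```

Remember that the caveat about animated children still applies: a helper like this is only worth calling on artwork whose contents do not change from frame to frame.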
Drawing Vectors as Bitmaps
A bitmap drawn vector is an image created by populating pixel data of a vector graphic at a fixed resolution onto a BitmapData object. In other words, you draw a bitmap copy of your original vector graphic. Here is what the code looks like:
var bd:BitmapData = new BitmapData(vectorClip.width, vectorClip.height, true, 0x00000000);
bd.draw(vectorClip);
var bm:Bitmap = new Bitmap(bd);
addChild(bm);
First you create a new BitmapData object with dimensions that match your vector graphic, with the transparency property set to true and the background color set to transparent. Then you call the draw method to make a bitmap out of the vector clip. It is important to note that the draw process starts from coordinates (0, 0) of the vector clip and that the clip’s scaling and other transformations are not applied. Finally, you create a new Bitmap object, assign the newly created BitmapData to it, and add the bitmap to the display list.
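Because drawing starts at (0, 0), artwork that extends to the left of or above the clip’s registration point would be cut off. One possible way around this is to measure the clip’s bounds and shift the artwork with a matrix while drawing. The helper drawAsBitmap and the circle sprite below are hypothetical names of my own, offered as a sketch rather than the method used in the tutorial files.

```actionscript
import flash.display.Bitmap;
import flash.display.BitmapData;
import flash.display.DisplayObject;
import flash.display.Sprite;
import flash.geom.Matrix;
import flash.geom.Rectangle;

// Hypothetical helper: draws a display object into a new Bitmap, including
// artwork that sits left of or above the object's registration point.
function drawAsBitmap(source:DisplayObject):Bitmap {
    // Bounds measured in the source's own coordinate space.
    var bounds:Rectangle = source.getBounds(source);
    // BitmapData rejects zero-sized dimensions, so clamp to at least 1 pixel.
    var w:int = Math.max(1, Math.ceil(bounds.width));
    var h:int = Math.max(1, Math.ceil(bounds.height));
    var data:BitmapData = new BitmapData(w, h, true, 0x00000000);
    // Shift the artwork so its top-left corner lands on (0, 0) of the bitmap.
    var shift:Matrix = new Matrix();
    shift.translate(-bounds.x, -bounds.y);
    data.draw(source, shift);
    return new Bitmap(data);
}

// Stand-in artwork (hypothetical): a circle centered on its registration point,
// so half of it lies at negative coordinates.
var circle:Sprite = new Sprite();
circle.graphics.beginFill(0xFF0000);
circle.graphics.drawCircle(0, 0, 20);
circle.graphics.endFill();

// The vector itself is never added to the display list, only the bitmap copy.
var circleBitmap:Bitmap = drawAsBitmap(circle);
addChild(circleBitmap);
```

Note that the returned bitmap’s own origin is its top-left corner, so you may need to offset its x and y if you want it to line up with where the original vector sat.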
Drawn vectors differ from cached vectors in a few significant ways. They exist independent of the original vector graphic. This means that if you use drawn vectors, you will typically want to remove the original vector from the display list, or, better yet, draw the vector off the display list to begin with. The drawn vector exists as a BitmapData object and will leverage the strengths and weaknesses of this class. It will scale and transform like a bitmap which means potential for distortion and jagged edges (this can be mitigated by smoothing which you will learn about later in this article). Yet, because it is a bitmap, it will not need to be re-cached when transformed and will scale, rotate, and move around the stage faster than a vector. In Flash Player 10.1 or later, the BitmapData class uses single reference. This means that you can have just one copy of the BitmapData in memory and send that data to any number of bitmaps on the display list without having to store additional memory for each image. Finally, drawn vectors do not require GPU rendering (though they benefit from using it) so they can improve animation performance whether you are publishing your application for AIR or Flash Player.
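To illustrate the single-reference behavior described above, here is a minimal sketch that draws one vector off the display list and then shares the resulting BitmapData among many Bitmap instances. The tile sprite, the sharedData name, and the count of 50 are arbitrary assumptions chosen for the example.

```actionscript
import flash.display.Bitmap;
import flash.display.BitmapData;
import flash.display.Sprite;

// Stand-in vector artwork (hypothetical), kept off the display list.
var tile:Sprite = new Sprite();
tile.graphics.beginFill(0xFFCC00);
tile.graphics.drawRect(0, 0, 32, 32);
tile.graphics.endFill();

// Draw the vector once...
var sharedData:BitmapData = new BitmapData(32, 32, true, 0x00000000);
sharedData.draw(tile);

// ...then point many Bitmap instances at the same pixel data.
var copies:Vector.<Bitmap> = new Vector.<Bitmap>();
for (var n:int = 0; n < 50; n++) {
    var copy:Bitmap = new Bitmap(sharedData);
    copy.x = Math.random() * stage.stageWidth;
    copy.y = Math.random() * stage.stageHeight;
    addChild(copy);
    copies.push(copy);
}
```

Each Bitmap can be moved, scaled, and rotated independently, while the pixel data itself is stored only once.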
The downside to using drawn vectors is that they require more coding and resource management than caching. Flash Player has measures to automatically release cached vectors from memory when they are not on the display list, but the BitmapData from drawn vectors must be explicitly released using the dispose() method. On the other hand, drawn vectors have some performance and appearance advantages over cached vectors, as you will discover when running the included test files. Using drawn vectors also ensures that you don’t break any caching guidelines that could affect performance, namely caching a movieclip that contains moving parts. Since the drawn vector is literally a bitmap, it won’t contain any parts that move or change and would cause it to re-cache.
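The cleanup step might look something like the following sketch. The helper name destroyBitmap is hypothetical, and the code assumes that no other Bitmap is still displaying the same BitmapData, since disposing shared pixel data would blank every bitmap that uses it.

```actionscript
import flash.display.Bitmap;
import flash.display.BitmapData;

// Hypothetical helper: removes a drawn bitmap from the display list
// and frees the memory held by its pixel data.
function destroyBitmap(target:Bitmap):void {
    if (target.parent) {
        target.parent.removeChild(target);
    }
    if (target.bitmapData) {
        target.bitmapData.dispose(); // releases the pixel memory immediately
        target.bitmapData = null;    // drop the reference to the disposed data
    }
}

// Example usage with a throwaway bitmap.
var tempBitmap:Bitmap = new Bitmap(new BitmapData(64, 64, true, 0x00000000));
addChild(tempBitmap);
destroyBitmap(tempBitmap);
```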
How GPU Rendering Affects Your Application
Rendering in GPU mode affects both the appearance and performance of your application. In this section, you will learn how the GPU renderer impacts the performance of moving bitmaps, blitting, and memory usage, as well as how it alters the appearance of vectors, bitmaps, text, and filters.
The included test_files.zip contains several FLAs that allow you to test and experiment with the topics discussed below. By publishing these FLAs as mobile applications, you can measure your own results for how GPU rendering will affect applications on your specific mobile device.
In GPU mode, BitmapData objects are rendered with the software renderer but composited by the GPU. For this reason, the GPU excels at moving bitmaps around the screen. If you’re moving lots of bitmaps or bitmap cached vectors across the stage, performance will be much faster on most devices in GPU mode. This makes GPU mode an ideal choice for games rendering their animations via Partial Blitting or Bitmap Armature techniques.
The copyPixels method of the BitmapData class is a CPU operation and will perform faster in CPU render mode than in GPU mode. However, a noticeable difference in performance is only apparent when running many copyPixels operations at once, so if you’re Partial Blitting, you will likely experience higher performance in GPU mode due to its ability to quickly move bitmaps on the display list. If you are Stage Blitting, you will get better performance rendering in CPU mode.
The sample file blittingPerformanceText.fla shows CPU mode blitting to the stage approximately 1.4x faster than GPU render mode.
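For readers unfamiliar with blitting to the stage, the sketch below shows the general idea as I understand it: a single full-stage BitmapData acts as a canvas, and every frame the source image is copied onto it with copyPixels. This is not the code from the sample file; the tile image, the 200 positions, and all variable names are assumptions made for illustration.

```actionscript
import flash.display.Bitmap;
import flash.display.BitmapData;
import flash.events.Event;
import flash.geom.Point;
import flash.geom.Rectangle;

// A single full-stage canvas is the only object placed on the display list.
var canvasData:BitmapData = new BitmapData(stage.stageWidth, stage.stageHeight, false, 0x000000);
addChild(new Bitmap(canvasData));

// Hypothetical 16x16 source image; in practice this would be a drawn vector.
var tileData:BitmapData = new BitmapData(16, 16, false, 0xFF3300);
var tileRect:Rectangle = tileData.rect;
var destPoint:Point = new Point();

// Hypothetical object positions, stored as x, y pairs.
var positions:Vector.<Number> = new Vector.<Number>();
for (var p:int = 0; p < 200; p++) {
    positions.push(Math.random() * stage.stageWidth, Math.random() * stage.stageHeight);
}

function blitFrame(e:Event):void {
    canvasData.lock();                                 // hold display updates until unlock()
    canvasData.fillRect(canvasData.rect, 0xFF000000);  // clear the previous frame
    for (var i:int = 0; i < positions.length; i += 2) {
        positions[i] = (positions[i] + 1) % canvasData.width; // drift right and wrap around
        destPoint.x = positions[i];
        destPoint.y = positions[i + 1];
        canvasData.copyPixels(tileData, tileRect, destPoint);
    }
    canvasData.unlock();
}
addEventListener(Event.ENTER_FRAME, blitFrame);
```

Since every copyPixels call here runs on the CPU, this is the style of rendering that tends to favor CPU render mode.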
Bitmaps that are scaled or rotated in Flash will often appear to have jagged edges and/or pixelated appearance due to mathematical estimations involved in rendering the transformed image. This distortion can typically be eliminated by enabling bitmap smoothing, an anti-aliasing effect that will smooth the jagged edges of bitmaps so that they appear to have cleaner and more vector-like curves.
Bitmap smoothing behaves differently depending on the device you’re targeting. Take note of the following:
- In the browser version of Flash Player, bitmap smoothing is enabled by setting the “smoothing” property of a particular bitmap to true or by setting “Enable Bitmap Smoothing” on an image in the library by right-clicking on it and checking the box.
- In CPU mode for Android devices, bitmap smoothing does not work at all, even if it is set explicitly.
- In GPU mode for Android devices, bitmap smoothing is applied automatically by the renderer to dynamically created bitmaps or images in the FLA library that have either the “Export for ActionScript” or “Bitmap Smoothing” option enabled. Note: Smoothing is only applied by the renderer, so if you attempt to bitmap draw a smoothed bitmap, it will not appear smoothed.
- In the browser version of Flash Player, if you set “Export for ActionScript” on an image in the library to true, it will not appear smoothed on stage unless you explicitly set the smoothing property of the particular image instance to true. This is because images exported for ActionScript are treated as BitmapData rather than bitmap instances. When you add an exported bitmap to the stage, it creates a new bitmap instance for which you must explicitly set the smoothing property. As bitmap images cannot be assigned instance names, the only way to do this is by referencing the index of the bitmap on the display list. For example: Bitmap(getChildAt(0)).smoothing = true.
- On iOS devices, bitmap smoothing will work in CPU mode as it does in the browser version of Flash Player.
Smoothed bitmaps will have a much better appearance than non-smoothed bitmaps that are rotated or scaled. If you intend to use these transformations in your application and are targeting Android devices, you should consider GPU rendering to achieve the ideal visual appearance.
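Where smoothing has to be set explicitly, a sketch like the following may help. It shows smoothing requested through the Bitmap constructor for a dynamically created bitmap, plus a hypothetical helper (smoothAllBitmaps, my own name) that walks a container instead of relying on a hard-coded child index.

```actionscript
import flash.display.Bitmap;
import flash.display.BitmapData;
import flash.display.DisplayObject;
import flash.display.DisplayObjectContainer;
import flash.display.PixelSnapping;

// Dynamically created bitmap: the third constructor argument enables smoothing.
var checkerData:BitmapData = new BitmapData(64, 64, false, 0x336699);
var smoothBitmap:Bitmap = new Bitmap(checkerData, PixelSnapping.AUTO, true);
smoothBitmap.rotation = 15; // a transformation where smoothing becomes visible
addChild(smoothBitmap);

// Hypothetical helper: turns smoothing on for every Bitmap that is a
// direct child of the given container.
function smoothAllBitmaps(container:DisplayObjectContainer):void {
    for (var i:int = 0; i < container.numChildren; i++) {
        var child:DisplayObject = container.getChildAt(i);
        if (child is Bitmap) {
            Bitmap(child).smoothing = true;
        }
    }
}

smoothAllBitmaps(this);
```

The type check avoids a runtime error if the child at a given index turns out not to be a Bitmap.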
Vector Graphic Appearance
In GPU mode, vector shapes are tessellated, a process where they are converted to triangles. As a result of this process, the edges of curved vector graphics will appear to have jagged edges vs. the smooth edges you get in CPU mode. The jagged edges are especially apparent in smaller vectors with fine details and will appear even if you cache the vector using cacheAsBitmap = true and cacheAsBitmapMatrix = new Matrix(). In order to eliminate the jagged edges, you must draw the vector as actual bitmap data using the BitmapData.draw() method described in the first section.
Avoid static text that is anti-aliased for animation in GPU mode if the font size is small because, like vector graphics, the character glyphs will have jagged edges. Text that is anti-aliased for animation will look smooth in any other text setting (Dynamic, TLF). In CPU mode, text that is anti-aliased for animation will look blurrier than text anti-aliased for readability in all text settings, but will not have jagged edges. Text that is anti-aliased for readability will appear clear in all text settings in both CPU and GPU modes.
You can experiment with text appearance using the included sample file: textTest.fla.
Very simple vector graphics, for example, circles with a gradient that fades out to alpha 0 like the ones commonly used for particle effects such as fire and smoke, can be moved around the stage equally as fast as and sometimes even faster than bitmap renders of the same image when in GPU render mode. The precise performance result you will get is dependent on the amount of data being sent from the CPU to the GPU and how the data needs to be converted. Simple shapes contain a small number of vertices and will be smaller than bitmap equivalents. However, complex vectors often contain more data than bitmap equivalents. Bitmaps need to be swizzled, or converted into a format that can be used by the GPU rasterizer, while vector data does not. Finally, in order to draw a vector with the GPU, it must first be tessellated. Simple shapes will convert faster, but complex vectors will require more triangles and hence more data to be processed.
As a general rule, it is usually better to keep simple vectors as vectors rather than caching them or using bitmap equivalents since they will perform comparably and use less memory. When possible, test both alternatives in your application and measure the results.
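If you want a quick way to compare alternatives, a simple frame counter along these lines can be dropped onto the timeline. This is only a sketch with names of my own choosing; it reports through trace(), so it assumes you are running with a debugger or log viewer attached, and you may prefer to write the value into a text field on the device instead.

```actionscript
import flash.events.Event;
import flash.utils.getTimer;

var frameCount:int = 0;
var lastSample:int = getTimer();

// Counts rendered frames and reports the average rate roughly once per second.
function measureFPS(e:Event):void {
    frameCount++;
    var now:int = getTimer();
    var elapsed:int = now - lastSample;
    if (elapsed >= 1000) {
        var fps:Number = frameCount * 1000 / elapsed;
        trace("FPS: " + fps.toFixed(1));
        frameCount = 0;
        lastSample = now;
    }
}
addEventListener(Event.ENTER_FRAME, measureFPS);
```

Run the same scene once with vectors and once with their cached or drawn equivalents, and compare the reported numbers on your target device.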
The mere presence of complex vector art (detailed shapes with many curves and edges that cannot be readily represented by mathematical functions) on the display list when in GPU mode can cause a noticeable drop in FPS, even when the vectors are static (non-animated) and the images being animated are elsewhere on the stage (not overlapping the vectors). While the issue does not necessarily occur in all situations, it is definitely something you should look out for when you’re experiencing unexplained performance drops in your application.
My results on the T-Mobile G2 show that the application rendered faster with non-cached vectors than with cached vectors, but rendered fastest when the vectors were drawn as bitmaps (as fast as removing the vectors from the display entirely). View the breakdown below:
- Leave the vectors as vector (41FPS)
- Cache the vectors as bitmap matrix (34FPS)
- Draw the vectors as BitmapData (47FPS)
- Remove vectors from stage (47FPS)
Filters (drop shadows, glows, bevels, etc.) are currently not supported in GPU mode. This means that if you apply a filter to a display object on the stage you will not see the filter effect. If you apply a filter to a vector, however, you will see an even greater jagged edge effect on the shape than if you had not filtered. It is also possible that filtering a vector on the display list can cause a performance decrease in your application due to the fact that filters automatically cache vectors as bitmaps (see above notes on Complex Vectors and how caching vectors affects performance).
While GPU mode will not render filter effects on the stage, it is possible to achieve filter effects by drawing filtered objects using BitmapData.draw(). The sample file bitmapDrawExample.fla demonstrates how to do this.
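The general approach might be sketched as follows; this is not the code from the sample file. A filtered shape is placed inside a holder sprite that never goes on the display list, and the holder is drawn into a BitmapData that is padded so the glow has room to render. The badge and holder names and the padding value are assumptions for illustration.

```actionscript
import flash.display.Bitmap;
import flash.display.BitmapData;
import flash.display.Sprite;
import flash.filters.GlowFilter;
import flash.geom.Matrix;

// Stand-in artwork (hypothetical): a circle of radius 30 with a white glow.
var badge:Sprite = new Sprite();
badge.graphics.beginFill(0x3366FF);
badge.graphics.drawCircle(0, 0, 30);
badge.graphics.endFill();
badge.filters = [new GlowFilter(0xFFFFFF, 1, 8, 8)];

// Drawing a parent container renders the child together with its filter.
var holder:Sprite = new Sprite();
holder.addChild(badge);

// Extra room around the shape so the glow is not clipped (the amount is a guess).
var padding:int = 16;
var size:int = 60 + padding * 2;
var filteredData:BitmapData = new BitmapData(size, size, true, 0x00000000);

// Move the circle's center to the middle of the bitmap before drawing.
var offset:Matrix = new Matrix();
offset.translate(30 + padding, 30 + padding);
filteredData.draw(holder, offset);

// Only the bitmap copy goes on the display list.
var filteredBitmap:Bitmap = new Bitmap(filteredData);
addChild(filteredBitmap);
```

Because the result is an ordinary bitmap, the filter effect is baked into the pixels and no longer depends on the renderer supporting filters.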
Applications rendered in GPU mode will consume more RAM than the same application rendered in CPU mode. In my tests I saw a range of 25-40% more memory usage, with the actual memory differences ranging from approximately 4MB for an empty app (9MB CPU vs. 13MB GPU) to 18MB for a game application (55MB CPU vs. 73MB GPU). Below is a list of free apps available on the Android marketplace that you can use to test the actual memory usage of your AIR application on your device:
- Super Task Killer 2011 by NetQin Mobile Inc.
- Gemini App Manager by Grace.Liu
- Memory Usage by TwistByte LLC
Available memory plays a critical role in the performance of mobile apps, so it is even more important to reduce RAM usage when using GPU rendering.
Where to go from here
GPU mode is an alternative way to render mobile applications with the potential to deliver superior performance to CPU rendering in games and applications with heavy visual components. At the same time, GPU mode can be more difficult to work with. Many features readily available in CPU render mode, such as filters and crisp vector graphic appearance, do not readily work the same in GPU render mode and demand alternative, work-around solutions. To make matters more complicated, GPU performance is not consistent for all devices. This means that even if you optimize your code for GPU mode, your application may still perform better on some devices in CPU mode.
Rest assured, adapting your application to use bitmaps instead of vectors will also help improve application performance in CPU mode. Also, the technology behind mobile GPUs and hardware rendering is still very new and under heavy development. We can hope that Adobe will eventually implement the Auto render mode to detect the best settings based on device or, even better, include a way for developers to switch rendering modes at runtime or explicitly delegate operations to the CPU or GPU. In the meantime, always test your application on as many device platforms as possible to ensure you’re getting your desired results.
Here are some good resources for further reading:
- Optimizing Performance for the Adobe Flash Platform – Adobe’s Flash optimization handbook. This must-read document contains chapters for improving every aspect of your application’s performance: conserving memory, minimizing CPU usage, ActionScript 3.0 optimization, rendering performance, and more.
- Adobe AIR for Android Developer’s Notes – An excellent follow up to Adobe’s optimization handbook, this document covers some additional optimization points specific to Android development.
- Rendering animated models in mobile games – Learn how to create optimized game animations using stage blitting, partial blitting, and bitmap armature models.
- Bitmap Caching – Adobe’s help page that explains how to properly use bitmap caching to improve rendering performance.
- Cached bitmap transform matrixes in AIR – Adobe’s help page that explains how to implement GPU cached graphics on mobile processors in AIR.