Video and Memory usage on iOS devices

June 11, 2013


Today's post is all about video and memory usage on iOS devices. Previous posts covered colors and the way a color is represented as a pixel. This post will focus on how a video is represented in memory and how much memory is required to hold all the data contained in a video. This is an important detail that a developer must understand when considering possible implementations. Video takes up a LOT of memory, so much, in fact, that it can be a little hard to believe at first. The following example makes memory usage concrete by providing actual sizes in bytes.


The animated GIF above is a web friendly version of the original video, with dimensions 480 x 320 at 24 bits per pixel. These dimensions match the screen size of the original iPhone display in landscape orientation. This video is a small clip made up of 41 frames (images) shown in a loop. The way video works is that one image (aka frame) after another is displayed on the screen. As long as the frames are displayed quickly enough, the video looks like smooth movement instead of a series of images. If the video above were viewed as a series of images on a filmstrip, it might look like this:


A film projector displays a filmstrip by shining light through each frame at a certain framerate. Digital video is not too conceptually different, except that each frame is contained in a file and the frames of video are displayed one after another on the screen. In digital video terms, to blit a video frame is to display it on screen at exactly the right moment so that the viewer sees smooth motion instead of a series of frames.
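In code, "exactly the right moment" just means that frame i is due at i / FPS seconds after playback starts. A tiny sketch of that timing calculation (the frame rate and frame count here are illustrative, not from any real player API):

```python
fps = 15          # frames per second (illustrative)
frame_count = 5   # just the first few frames

# Each frame is due at a fixed offset from the start of playback.
# A player blits frame i when the playback clock reaches i / fps.
display_times = [i / fps for i in range(frame_count)]
print(display_times)
```

A real player schedules these times against a display clock; the point here is only that smooth motion is a sequence of blits at evenly spaced deadlines.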

Conceptually, this all sounds pretty easy. It is only when a developer sits down to write code to implement this type of video playback that all the problems start to become clear. The first problem is the shocking amount of memory that uncompressed video takes up.

In the example above, the video clip has dimensions 480 x 320 and each pixel is stored at 24 bits per pixel. The video is displayed at 15 FPS (frames per second), so the whole series of 41 frames is displayed in about 3 seconds. A quick calculation shows how many bytes that is when each 24-bit pixel is represented in memory by a 32 bit word.

32 bits -> 4 bytes per pixel
480 * 320 -> 153600 pixels
153600 * 4 -> 614400 bytes per frame
614400 * 41 -> 25190400 bytes for 41 frames
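The arithmetic above is easy to check with a few lines of Python (a sketch; the 41-frame count and 32-bit-per-pixel storage come straight from the example above):

```python
# Uncompressed size of the example clip: 480 x 320, 24-bit color
# stored as a 32 bit word (4 bytes) per pixel, 41 frames total.
width, height = 480, 320
bytes_per_pixel = 4
frame_count = 41

pixels_per_frame = width * height                     # 153600 pixels
bytes_per_frame = pixels_per_frame * bytes_per_pixel  # 614400 bytes
total_bytes = bytes_per_frame * frame_count           # 25190400 bytes

print(bytes_per_frame, total_bytes)
```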

So, this quick calculation shows that each frame of video takes up about 600 kB, a bit more than half a megabyte. Taken together, a file that contains all the video frames as raw pixels would be roughly 25 megabytes. That is a really big file, and this is a very simple animation that is only about 3 seconds long. A similar clip that is 10 seconds long would require something like 90 megabytes of memory (150 frames at 614400 bytes each). Yikes!

Let's do one more quick calculation just to put things in perspective. Assume for a moment that we want to process a 30 FPS video at 960 x 640, the screen size of the Retina iPhone 4. If the video were 30 seconds long, it would require about this many bytes:

32 bits -> 4 bytes per pixel
960 * 640 -> 614400 pixels
614400 * 4 -> 2457600 bytes per frame
2457600 * 30 -> 73728000 bytes for 1 second
73728000 * 30 -> 2211840000 bytes for 30 seconds
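The same multiplications generalize to any clip. A small helper function (hypothetical, simply restating the arithmetic above) makes it easy to compare configurations:

```python
def uncompressed_video_bytes(width, height, fps, seconds, bytes_per_pixel=4):
    """Bytes needed to hold every frame of an uncompressed clip in memory."""
    bytes_per_frame = width * height * bytes_per_pixel
    return bytes_per_frame * fps * seconds

# 30 seconds of 960 x 640 video at 30 FPS, 4 bytes per pixel
print(uncompressed_video_bytes(960, 640, 30, 30))  # 2211840000 bytes
```

Plugging in other dimensions or durations shows how quickly the totals grow: the result scales linearly in every parameter.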

Yes, that is more than 2.2 gigabytes of memory needed to hold the data for a 30 second video! There is no way an iOS device could ever hold that much data in memory. There might not even be enough room in flash storage to write that much data to disk. Video just takes up a lot of room.

The astute reader will no doubt wonder how video compression plays into all this. After all, video can be compressed down to a much smaller size with frame-to-frame deltas and other types of data compression. It is true that the size of the file written to disk can be reduced by various compression methods, but the focus of this post is the memory usage of the uncompressed video. Once compressed video has been decompressed into app memory, it takes up the full amount of memory calculated above. So, to keep things simple, the complexity of video compression can be ignored when looking at the memory required to hold the uncompressed video data.

To understand how a developer ends up running into this type of problem in iOS, have a look at these two stackoverflow posts:

The basic problem illustrated in the above posts is that the developer lacks an understanding of exactly how much memory the uncompressed video is going to take up at runtime on the actual device. As a result, code might seem to work for very small examples or in the Simulator. But once that code is put into practice, the result can be an app crash when run on the device. Under iOS, the system will notice when an app is taking up too much memory and, in certain cases, automatically kill it.

So, the first thing to do is make sure that any code dealing with video does not load all the video frames into memory at the same time. For this reason, developers should simply avoid the UIImageView animationImages API under iOS. Any code that allocates a UIImage or CGImageRef for every frame of a video will end up crashing on the device in certain cases. For example, a video with large dimensions or a longer duration will crash, while a smaller video might not.
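To see why loading every frame is a dead end, compare the totals above against a plausible app memory budget. The 100 MB figure below is an assumption chosen for illustration, not a documented iOS limit:

```python
budget_bytes = 100 * 1024 * 1024   # assumed app memory budget (illustrative)
bytes_per_frame = 480 * 320 * 4    # 614400 bytes, from the example above
fps = 15

# How many whole frames fit in the budget, and how many seconds that covers
max_frames = budget_bytes // bytes_per_frame
print(max_frames, max_frames / fps)  # 170 frames, only about 11 seconds
```

Even at the original iPhone's small screen size, the entire budget is exhausted by a clip shorter than a TV commercial, which is why frames must be streamed in and out rather than held all at once.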

While it may seem obvious that a developer should not use up all app memory on an embedded device like iOS, many developers simply do not understand this basic memory usage issue when dealing with images and video. For example, here are a few projects found online that contain this most basic design flaw (all frames being loaded into memory at once):

It is tempting to look around online for existing code and copy and paste what seems to be an easy solution. But this type of cargo cult programming has serious implications. When dealing with complex topics like audio and video, it is better to use an existing solution that already deals with the complexity.

Fully solving this memory usage problem was the initial reason I created AVAnimator. The AVAnimator library uses memory mapped files to load into memory only those video frames that are actually in use at any one time. AVAnimator also includes an exceptionally fast blit implemented in ARM asm, which makes it possible to decode delta frames and apply frame deltas efficiently under iOS. A developer might also be interested in this simplified PNG animation example code if AVAnimator seems too complex at first.