Making Media from Scratch, Part 2
by Chris Adamson
This is the second of a two-part series on creating QuickTime movies "from scratch" in Java. By that, I mean we're creating our own media data, piece by piece, to assemble the movie. Doing things at this low level is tricky, but I hope you'll agree after this installment that it's remarkably powerful.
Part 1 began with the structure of a QuickTime movie as a collection of tracks, each of which has exactly one Media object that in turn references media data that can be in the movie file, in another file, or out on the network. The Media has tables that indicate how to find specific "samples," individual pieces of audio, video, text, or other content to be rendered at a specific time in the movie. Part 1 used easy-to-create text tracks to show how to build up a Media structure, first by creating a simple all-text movie and then by adding textual "time-code" samples as a new text track in an existing movie.
In this part, we'll move on to creating video tracks from scratch, building up a video media object by adding graphic samples.
The goal of this article's sample code is to take a graphic file and make a movie out of it by "moving" around the image — you may have seen this concept in iMovie, where Apple calls it the "Ken Burns Effect," after the director who used it extensively in PBS' The Civil War and other documentaries. There is also a shareware application called Photo to Movie that does much the same thing.
Download the source code for the examples.
We can make this work because of the concept of persistence of vision, which says that the human eye perceives a series of images, alternated sufficiently quickly, as motion. To do an image-to-movie effect, we show slightly different parts of the picture in each distinct image or "frame," creating the illusion of moving from one part of the picture to another.
In creating text tracks, the approach was to:
- Create a movie on disk.
- Create a track.
- Add a Media object to it.
- Get a MediaHandler and use that to add samples to the Media.
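Those steps can be sketched in QTJ-flavored pseudocode. The method names below are approximations from memory, not exact QTJ signatures — consult Part 1's sample code for the real argument lists:

```
// Pseudocode sketch of Part 1's text-track approach
movie   = Movie.createMovieFile (file, ...)        // 1. create a movie on disk
track   = movie.newTrack (width, height, volume)   // 2. create a track
media   = new TextMedia (track, timeScale)         // 3. add a Media object to it
handler = media.getTextHandler()                   // 4. get the MediaHandler...
handler.addTextSample (text, ...)                  //    ...and add samples with it
```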
The same approach generally works for video, except that the VisualMediaHandler doesn't do anything for us. Instead, we need to create a compression sequence, or CSequence, to prepare samples, encoded and compressed with a codec supported by QuickTime. We'll then add these samples directly to the Media.
The CSequence class has a method called compressFrame, which is what we need to generate samples. Its signature is:
public CompressedFrameInfo compressFrame(QDGraphics src, QDRect srcRect, int flags, RawEncodedImage data) throws StdQTException
That doesn't look too bad. We just need a QDGraphics as the source of our image, a rectangle describing what part of the image to use, some behavior flags, and a RawEncodedImage buffer into which to put the compressed frame.
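Putting compressFrame to work takes a little setup around that one call. The following pseudocode sketches the flow; the CSequence constructor arguments, buffer sizing, and addSample parameters are approximations rather than exact QTJ signatures:

```
// Pseudocode: compressing frames with a CSequence
seq       = new CSequence (gworld, rect, pixelDepth, codecType,
                           codec, spatialQuality, temporalQuality,
                           keyFrameRate, ...)
imageDesc = seq.getDescription()               // describes the encoded frames
buffer    = new RawEncodedImage (worstCaseCompressedSize)
for each frame:
    // draw this frame's portion of the image into gworld
    info = seq.compressFrame (gworld, rect, flags, buffer)
    media.addSample (buffer, 0, info.getDataSize(), frameDuration,
                     imageDesc, 1, sampleFlags)    // add it to the Media
```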
All Around the GWorld
"So what's a QDGraphics?", you might be wondering. The name is presumably meant to evoke thoughts of the AWT's Graphics, and the two are remarkably similar: each represents a drawing surface, either on-screen or off-, with methods for drawing lines, circles, arcs, polygons, and text strings.
One clever thing that QDGraphics does under the covers is to offer an isolation layer that hides whether the drawing surface is on-screen or off-screen (unless you specifically ask) and what native structures (GWorld) are involved. One odd side effect of this arrangement is that while there are many getGWorld() methods throughout the QTJ API, there's no GWorld class to return, so you get a QDGraphics instead.
In fact, the GraphicsImporter offers a getGWorld() method, and if you guessed that this class offers a way to get an image into QuickTime, you're right. So now we have some idea of how we're going to connect the dots to make a movie from an image:
- A GraphicsImporter can read an image file.
- It has a getGWorld() that returns a QDGraphics.
- The QDGraphics can go to compressFrame.
- Frames from compressFrame can be added to our Media.
One strategy for getting the frames is to:
- Get starting and ending rectangles, where a rectangle is a QDRect representing an upper-left corner point and width-by-height dimensions.
- Calculate a series of intermediate rectangles that take us from the starting rectangle to the ending rectangle.
- For each of these intermediate rectangles, call compressFrame to make a frame from that portion of the original image. Add each frame as a sample.
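The intermediate rectangles in the second step are just linear interpolation. Here's a minimal sketch in plain Java — it uses an int[]{x, y, width, height} array in place of QTJ's QDRect, so it runs without QuickTime installed:

```java
// Linearly interpolate between a starting and ending rectangle.
// Each rect is {x, y, width, height}; step runs from 0 (start) to steps (end).
public class RectInterpolator {

    public static int[] lerpRect(int[] from, int[] to, int step, int steps) {
        int[] result = new int[4];
        for (int i = 0; i < 4; i++) {
            // integer linear interpolation: from + (to - from) * step / steps
            result[i] = from[i] + ((to[i] - from[i]) * step) / steps;
        }
        return result;
    }

    public static void main(String[] args) {
        int[] from = {0, 0, 320, 240};     // start at the upper-left
        int[] to   = {300, 200, 320, 240}; // end lower-right; same size = pan, no zoom
        int steps = 100;                   // number of intermediate frames
        for (int i = 0; i <= steps; i++) {
            int[] r = lerpRect(from, to, i, steps);
            System.out.println(r[0] + "," + r[1] + " " + r[2] + "x" + r[3]);
        }
    }
}
```

Each rectangle this produces would become the srcRect argument of one compressFrame call.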
If you have QuickTime 5 or better, you can see the result here.
This strategy works, but it is limited by the size of the original image, and that is pretty much a fatal flaw. If the image is only slightly larger than the movie size (i.e., the size of the rectangles), there isn't much room to move around. If it's smaller than our movie, the strategy won't work at all. On the other hand, if the image is much larger than our desired movie dimensions, then we might not be able to get the parts of the picture we want — it's not very useful if we can't fit someone's entire face in the movie and instead have to settle for a shot that moves from their nose to their chin.
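That size constraint is easy to state in code. A small plain-Java check (again using an int[]{x, y, width, height} array in place of QDRect) that a proposed source rectangle stays inside the image:

```java
public class PanBoundsCheck {

    // Returns true if rect {x, y, w, h} lies entirely within an image
    // of the given dimensions.
    public static boolean fits(int[] rect, int imgWidth, int imgHeight) {
        return rect[0] >= 0 && rect[1] >= 0
            && rect[0] + rect[2] <= imgWidth
            && rect[1] + rect[3] <= imgHeight;
    }

    public static void main(String[] args) {
        int imgW = 640, imgH = 480;
        int[] start = {0, 0, 320, 240};
        int[] end   = {320, 240, 320, 240};
        // If both endpoints fit, every linearly interpolated rectangle
        // between them fits too, since interpolation stays in their bounds.
        System.out.println(fits(start, imgW, imgH) && fits(end, imgW, imgH));
    }
}
```

A movie-sized rectangle that fails this test at either endpoint of the pan is exactly the "fatal flaw" case described above.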
Scaling the image would be a nice improvement, but we can actually do better
than that. If we could scale each
fromRect, then we could
"zoom" in or out of the picture by using progressively larger or
smaller source regions. But how do we do this?