Professional Video Editing on Linux with Cinelerra
by Howard Wen
Final Cut Pro on the Mac and Premiere on Windows both provide professional-quality video editing. Cinelerra is the closest and best Linux equivalent. First released in 1996 (under its original name, Broadcast 2000), this freely distributed non-linear editor (NLE) was developed natively and solely for Linux. The program continues to be updated and improved to this day.
Cinelerra includes many of the features of the pricey professional editors and some extras: real-time visual effects, FireWire input/output, render-farm capability, and even support for HDTV formats and Ogg Vorbis. The downside is that its hardware demands are quite unforgiving; the recommended configuration is a dual 2GHz Athlon system with 1GB of RAM and a 200GB hard drive.
Who's behind this impressive program? We don't know. Cinelerra, along with other very useful multimedia utilities for Linux released by the same author (or authors), is shrouded in mystery. Though this program's code is available for all to see and contribute to, its creator (or creators) prefers to remain anonymous, for reasons best explained by "Jack Crossfire" (a pseudonym, of course). As he put it in an email interview:
"In a shrinking industry like we're in now, managers aren't ready to see staff engineers building killer apps outside their day jobs, and they aren't afraid to get rid of anyone who ignores the system. You can't release software under an individual name when that happens, so 'Heroine Virtual Ltd.' became the entity under which all our content creation tools would appear. We leave it to your imagination how many people are behind it."
The Need to Edit
The inspiration to create Cinelerra came from "some very basic, practical needs," as Crossfire describes them. "Humans need to edit video and audio. Like the typewriter, the multimedia editor makes everything possible: video email, audio email, streaming media, watching TV, virtually everything we do when we're not eating and sleeping," he elaborates philosophically. "In the late 90s, there was no multimedia content creation system on any UNIX platform for less than $100,000. That got Broadcast 2000 off the ground. Then, as a natural course, Cinelerra elaborated on that functionality."
The nebulous Heroine developers have relied on a combination of C, C++, NASM assembly, and GAS assembly throughout the project's life. They found C to be the most useful for the coding of general-purpose libraries, using C++ for application-specific code. Between the general-purpose libraries and the application code, there's a "middle layer" written in C++ as well.
"Unfortunately, platform-specific assembly language is becoming more and more important as newer CPUs rely more on vectored assembly language to gain performance," says Crossfire. "We're looking into alternative languages to C and assembly, which can be easily converted to either scalar assembly or any of the vectored assembly languages out there."
However, many of the existing "vectored C" languages lock development into IA-32 assembly, do not produce the best optimization of vectored op-codes, can be difficult to read, and typically require an expensive compiler. So the Cinelerra developers are considering a derivative of Forth as the best means of producing platform-independent, vectored object code.
Necessity as the Mother of Multimedia Invention
Heroine incorporated a few outside libraries into the Cinelerra codebase, including libdv and FreeType. The project has also spun off significant multimedia libraries. During Cinelerra's incarnation as Broadcast 2000, no general-purpose MPEG-2 decoders for Linux supported video editing. The Heroine group thus wrote Libmpeg3, a set of almost entirely refurbished MPEG reference implementations. Today Libmpeg3 is developed outside of Cinelerra.
Likewise, Heroine developed QuickTime support for Linux in 1999 out of necessity. "Today, there are many QuickTime libraries, but there are so many application-specific things in QuickTime for Linux — like libdv [and] FireWire wrappers — that it's not clear if it's going to be replaced in the near future," says Crossfire.
Cinelerra's user interface also had to be developed internally. When principal coding on Broadcast 2000 began, GTK+ was not yet capable enough for Heroine's needs, and Qt had yet to be released as open source. (Using open source materials is a project requirement.) So Heroine built its own GUI library, intending eventually to wrap it around GTK+.
"Six years later, however, GTK+ and Qt still involve a lot more work than necessary, just to keep up with the API changes and the growing dependencies," says Crossfire. So Cinelerra continues to use the Heroine-built GUI library, because it has reasonably fast graphics rendering, decent object orientation, and easy compilation.
Future Innovation with HDTV
This innovation by necessity continues to this day: the program's developers anticipate that their future technical challenges will involve improving (or creating) more multimedia libraries and capabilities for Linux.
Cinelerra's background render-farm feature is one such example. Its design involves transparent load balancing and restart detection. Every time the user performs an edit, the network jumps to work: every node stops, re-syncs with the editor's timeline, and balances itself with the overall load. In most cases the user can simply drop visual effects onto the timeline and see them rendered immediately at full frame rates, something many modern-day commercial NLEs still struggle to pull off without a specialized graphics card or other hardware.
Theoretically, Cinelerra can pull off this feat even for footage in HDTV format. "Anyone with a 100-node rack should try this, since we never could actually afford a render farm to see what was supposed to happen," says Crossfire.
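The rebalancing step described above can be pictured with a short Python sketch. It is only an illustration, not Cinelerra's actual scheduler: the function name `split_frames`, the idea of measuring each node's speed in frames per second, and the proportional split itself are all assumptions made for the example.

```python
def split_frames(total_frames, node_speeds):
    """Divide frames [0, total_frames) into contiguous ranges, one per node.

    Each range is sized in proportion to that node's measured speed
    (hypothetical unit: frames rendered per second). Returns a list of
    (start, end) tuples with `end` exclusive.
    """
    if total_frames < 0:
        raise ValueError("total_frames must be non-negative")
    if not node_speeds or any(speed <= 0 for speed in node_speeds):
        raise ValueError("every node needs a positive speed")

    total_speed = sum(node_speeds)
    # Ideal, possibly fractional, share of the timeline for each node.
    ideal = [total_frames * speed / total_speed for speed in node_speeds]
    counts = [int(share) for share in ideal]

    # Rounding down leaves a few frames unassigned; hand them to the
    # nodes whose ideal share had the largest fractional part.
    leftover = total_frames - sum(counts)
    by_remainder = sorted(
        range(len(ideal)), key=lambda i: ideal[i] - counts[i], reverse=True
    )
    for i in by_remainder[:leftover]:
        counts[i] += 1

    # Turn the per-node counts into contiguous frame ranges.
    ranges = []
    start = 0
    for count in counts:
        ranges.append((start, start + count))
        start += count
    return ranges


if __name__ == "__main__":
    # Three hypothetical nodes; the third is twice as fast as the others.
    for node, (start, end) in enumerate(split_frames(100, [1.0, 1.0, 2.0])):
        print(f"node {node}: frames {start}..{end - 1}")
```

In this picture, an edit on the timeline would simply trigger another call to `split_frames` with the current speed measurements, so every node drops its old range and picks up a new one.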
The Cinelerra developers try to make a new release every three months. New versions mainly improve the stability of the code. "The next best thing is probably going to be selective use of vectored assembly language to speed things up," says Crossfire, but he and his mysterious cohorts see Cinelerra's evolution as lying in better handling of high-definition video. "The future is HDTV. Right now, you can edit HDTV broadcasts with a render farm and a certain amount of patience, but it could be a lot faster," he says.
The Developer Speaks
"Jack Crossfire" is a pseudonym for one of the developers (or is he just the only one?) behind Cinelerra. He recently agreed to an interview with the O'Reilly Network.
O'Reilly Network: So how would you say Cinelerra compares to Final Cut Pro or Premiere?
Jack Crossfire: Cinelerra will probably never have the relevance in content creation that Final Cut Pro, Premiere, and, more importantly, Avid Express have. There isn't the marketing horsepower or the volunteer programmer army to create a bottomless pit of features. Cinelerra is more likely to emphasize basic features like color correction, non-destructive editing, render-farm support, and features that rely more on software than hardware.
In commercial software, however, you have to show big hardware boxes with lots of circuit boards and exciting user interfaces to get the attention of the trade show jocks. The commercial [video] editors are more likely to emphasize eye-popping features, and features that depend more on hardware than software. They have a lot of 3D-animated effect icons, smoothly scrolling time lines, and talking paperclips.
ORN: Does Cinelerra have any features or technologies that we won't find in a commercial video editor?
JC: It's been such a long time since I've actually used another system for editing content that it's not clear where the advantage is. Cinelerra probably has a shorter learning curve than commercial packages because they're piling on shortcut after shortcut to build up their trade show demos.
Secondly, the commercial packages once had so many user interface bugs that it justified a new editor. They took a long time to navigate an audio waveform. They required many steps to accomplish the simplest importing and exporting.
Finally, you don't have to pay for Cinelerra. The user has complete ownership of the source code when they download it. That goes a long way toward peace of mind. If Avid goes out of business or Apple decides to mothball their content creation business and go pure servers, you're a lot better off if you have the source code.
No matter what the future of binary formats and operating systems, the only requirement for running Cinelerra is going to be a compiler.
ORN: Describe the biggest technical challenges you faced in putting together Cinelerra.
JC: The biggest challenges are mainly software, with somewhat smaller challenges involving hardware. The capability to route one [video or audio] track through any other track, and layer any number of [video] effects under the tracks, was a significant problem. The ability to render the timeline in the background over a cluster was another problem.
Nowadays, there's a renewed frenzy in ColorModel support. In 1997, everyone wanted 8-bit Pseudocolor so they could make animated GIFs. Now everyone wants their own crazy color model, either 16-bit floating point, 32-bit floating point, 16-bit fixed point, 10-bit YUV. YUV was a big one in [the year] 2000 because everyone was converting VHS to DVD, which is a pure YUV process. Cinelerra has a choice of what seem to be the eight most useful ColorModels, RGB and YUV, 16- and 8-bits per channel, with and without alpha.
Now, they're not supported in every operation. This is partly intentional and partly unintentional in a debugging kind of way. You have to experiment to make sure the effect you want behaves expectedly in the desired ColorModel. Supporting YUV and RGB at runtime has proven difficult because the math operations for each are completely different. The result is, of course, most of the time you can do internal processing in YUV when source footage is in YUV, and RGB when source footage is in RGB, thus eliminating several steps.
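To make the point about differing math concrete, here is a minimal Python sketch of one operation, an additive brightness change, written once for 8-bit RGB and once for 8-bit YUV. The conversion uses the full-range BT.601 coefficients familiar from JPEG; whether Cinelerra uses these exact coefficients is an assumption, and all function names are hypothetical.

```python
def _clamp8(value):
    """Round to the nearest integer and clamp into the 8-bit range 0..255."""
    return max(0, min(255, int(round(value))))


def rgb_to_yuv(pixel):
    """Convert an (R, G, B) pixel to (Y, U, V), full-range BT.601 (assumed)."""
    r, g, b = pixel
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    v = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return (_clamp8(y), _clamp8(u), _clamp8(v))


def yuv_to_rgb(pixel):
    """Inverse of rgb_to_yuv, up to rounding error."""
    y, u, v = pixel
    r = y + 1.402 * (v - 128)
    g = y - 0.344136 * (u - 128) - 0.714136 * (v - 128)
    b = y + 1.772 * (u - 128)
    return (_clamp8(r), _clamp8(g), _clamp8(b))


def add_brightness_rgb(pixel, delta):
    """In RGB, a brightness offset is added to all three channels."""
    return tuple(_clamp8(channel + delta) for channel in pixel)


def add_brightness_yuv(pixel, delta):
    """In YUV, the same offset touches only luma; chroma is left alone."""
    y, u, v = pixel
    return (_clamp8(y + delta), u, v)


if __name__ == "__main__":
    rgb = (100, 50, 200)
    print(rgb_to_yuv(add_brightness_rgb(rgb, 20)))   # brighten in RGB
    print(add_brightness_yuv(rgb_to_yuv(rgb), 20))   # brighten in YUV
    print(yuv_to_rgb(rgb_to_yuv((255, 0, 0))))       # round trip is lossy
```

The last line shows why staying in the footage's native model helps: each conversion rounds to 8 bits, so a round trip may not return the exact pixel it started from, on top of the time the conversion itself takes.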
ORN: When it comes to developing a video editor, what technical hurdles are there in dealing with video on the Linux platform?
JC: Linux has virtually non-existent import and export capability for [video] footage. To get to this point, Linux would need three things:
- A way to transfer the raw video data itself between an external device and the Linux box.
- A way to seek the external device to an exact position in its storage medium from the Linux box.
- A way to convert to and from the external device's video encoding format on the Linux box.
Now, before we get into another war about evil murdering dictators oppressing free software developers by withholding important technical documents, the lack of import and export capability is probably going to solve itself in the long term.
More and more of the compatibility between external devices and Linux boxes is moving from hardware implementation to software implementation. For some new high definition camcorders coming out, the interface change is purely in software. Instead of creating new I/O boards with new registers sets and logic waveforms, they're using the existing I/O boards while changing the software to decode MPEG-2 instead of DV. It's virtually impossible to support new I/O boards, while it's relatively easy to support new software protocols.
That isn't to say the evil murdering dictators should continue refusing interviews with their driver developers, but free software developers can get their biggest gains by doing more in software instead of hardware.
ORN: What kind of help could Cinelerra use from those willing to volunteer their skills?
JC: The biggest contributions would be detailed explanations of how to crash it. There are a lot of crash situations, but they're very hard to reproduce.
We're always interested in new developments in image processing, new directions the content creation industry is headed in, and platform-independent assembly languages.
Finally, it takes a long time to incorporate outside changes. New code has to be verified against the current stability, and it has to be maintainable. So that limits the amount of new code that can be integrated to bug fixes or big, major features people are going to use all the time.
Supposedly, this is why a lot of programs have macro languages and language bindings. The problem is, the guys who use the macro languages to add new features are normally more interested in programming the language than using the program to create content.
A lot of people like the Autoconf system and want to rebuild the Cinelerra tree to use Autoconf. That system has proven to be real hard to cross compile with and it fills the screen with huge amounts of linker wrappers, compiler flags, compiler wrappers, concealing the important messages. Furthermore, these systems of layered build scripts and package configurators have grown so huge that it's become just as hard to hunt down the right script flags as it is to configure a makefile.
ORN: Any advice for those interested in modifying the Cinelerra code to help improve/stabilize it? Or advice in developing Linux multimedia applications that deal with video?
JC: There are very few features that are going to justify the amount of work required to implement them. Unless you've got millions of dollars and a large staff of slaves, features that are big, major, and lasting in impact are the things you'll be most rewarded for in your private software adventures.
Finally, nowadays everyone wants to use Linux as a front-end to Win32 binaries, writing a few lines of open source code, and calling into Win32 binaries to do the real work. While this may get Linux recognized by one or two marketing guys and Windows bigots, remember there's nothing like knowing the code you've written is always going to work regardless of what agenda one company or another takes.
ORN: Throughout your working on Cinelerra, what have you personally learned as a programmer?
JC: Moore's law may have applied to CPU clock speeds, but it doesn't apply to computer systems as a whole. In 1997, we thought general purpose computers would be fast enough by 2002 for a C program to decode compressed 2048x1024 video to an abstracted color model, perform any operation you wanted, and display it in real-time in the display's color model.
Six years later, the instructions-per-clock-cycle, memory bandwidth, and memory latency are largely unchanged since the Pentium II. The affordable hard drive still maxes out at 20MB/sec most of the time. Affordable memory still takes 200ns per request.
A lot of the things that were supposed to be absorbed by Moore's law got done by moving C code to either assembly or hardware-specific implementation. YUV-RGB display moved from C to XVideo. Libdv is almost entirely IA-32 assembly language. A massive permutation of color conversions is used, instead of an abstracted color space. This is all done to get around slow performance, and it's not very maintainable.
A lot is learned about software planning from doing free software projects like Cinelerra that you couldn't learn any other way. Producing 150,000 lines of useful application, finishing something you start, keeping the end in mind through a long process, are things you can't do any other way.
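The figures in the last answer can be put side by side with a little arithmetic. The Python sketch below estimates the data rate of decoded 2048x1024 video and compares it with the 20MB/sec disk figure Crossfire quotes. The frame rate of 30 frames per second and the 3 bytes per pixel (8-bit RGB without alpha) are assumptions chosen for illustration, and the function names are hypothetical.

```python
def frame_bytes(width, height, bytes_per_pixel):
    """Size of one uncompressed frame in bytes."""
    return width * height * bytes_per_pixel


def required_bytes_per_sec(width, height, bytes_per_pixel, fps):
    """Data rate needed to move uncompressed frames at `fps` frames per second."""
    return frame_bytes(width, height, bytes_per_pixel) * fps


def sustainable_fps(width, height, bytes_per_pixel, device_bytes_per_sec):
    """Frame rate a device with the given throughput could sustain."""
    return device_bytes_per_sec / frame_bytes(width, height, bytes_per_pixel)


if __name__ == "__main__":
    WIDTH, HEIGHT = 2048, 1024   # frame size quoted in the interview
    BYTES_PER_PIXEL = 3          # assumption: 8-bit RGB, no alpha
    FPS = 30                     # assumption: not stated in the interview
    DISK_RATE = 20_000_000       # 20MB/sec, the quoted hard drive figure

    needed = required_bytes_per_sec(WIDTH, HEIGHT, BYTES_PER_PIXEL, FPS)
    print(f"one frame: {frame_bytes(WIDTH, HEIGHT, BYTES_PER_PIXEL)} bytes")
    print(f"decoded stream: {needed / 1e6:.1f} MB/sec")
    print(f"ratio to disk rate: {needed / DISK_RATE:.1f}x")
    fps = sustainable_fps(WIDTH, HEIGHT, BYTES_PER_PIXEL, DISK_RATE)
    print(f"uncompressed frames from disk: {fps:.2f} per second")
```

Under these assumptions a single decoded frame is about 6.3MB, so the decoded stream runs to roughly nine times what the quoted disk can deliver. Every effect applied to the footage has to move that much data through memory for each pass.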