My wife has a research project in which she needs to analyze brief (8-second) segments of hundreds of much longer videos. My goal was to take the videos (~30 minutes each), cut out only the relevant sections, and splice them together, with a static marker between each segment. This should let her and her colleagues analyze the videos quickly and at precise time-points (instead of scrubbing a slider in a video player to locate and estimate time-points). I’ve posted my notes from this process below for my own reference, and in case they prove useful to anyone else.
To my knowledge, the best tool for the job is FFmpeg, an open source video tool. FFmpeg provides much of the underlying processing functionality for other popular video tools, such as Handbrake, Miro, and MPlayer. Since you compile FFmpeg from source code, it should run on any system with a C compiler, including my OS X box. Unfortunately, it’s not the most user-friendly software package in the world.
The developers recommend using the latest version from Git (which finally forced me to install Git, something I’d been meaning to do anyway). One how-to for compiling FFmpeg on OS X suggested that I’d also need LAME in order to process audio. (The audio is irrelevant for my use case, so I didn’t bother with LAME.) But I couldn’t compile FFmpeg because, apparently, OS X doesn’t include GCC. To get GCC on OS X the official Apple way, I needed Xcode from the Apple Developer Tools, and to get those you have to sign up for an Apple Developer account. Welcome to Dependency Hell, now featuring bureaucracy!
Hours later, I’ve compiled FFmpeg. However, the videos I’m dealing with are raw H.264 video files from a Night Owl K-44500-C surveillance system, and I want to save the clips as H.264-encoded MP4 files. That meant I needed additional H.264 support (or, at least, I thought I did) from the x264 project. And you need YASM to compile x264.
I was actually unable to compile YASM from the Git repository, although I was able to compile it following the instructions in this Handbrake for Mac OS X guide.
I recompiled FFmpeg with the --enable-libx264 and --enable-gpl switches:
./configure --enable-libx264 --enable-gpl
make
sudo make install
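(A quick sanity check, not strictly necessary: any ffmpeg invocation prints the build’s configuration banner to stderr, so you can confirm the new switches actually took effect. For example:
ffmpeg 2>&1 | grep -- --enable-libx264
should print the configuration line if the x264-enabled build is the one on your PATH.)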
To select just the relevant portions of the video, I used the -ss (start/seek position) and -t (time/duration) flags, e.g.:
ffmpeg -f h264 -i input-video-file.264 -ss 180 -t 8 output-video-file.mp4
The above example takes 8 seconds of the input video starting at the 3-minute (180-second) mark.
However, when I played back the output, it played much too fast. The source videos include a timestamp, and about 4 seconds ticked by for every second of video! It turned out that the source videos were recorded at 7 fps (frames per second), so I added a flag to specify the framerate:
ffmpeg -f h264 -r:v 7 -i input-video-file.264 -ss 180 -t 8 output-video-file.mp4
After running this a few times for different 8-second segments, I needed to put the segments back together again. This is a common enough use case that FFmpeg covers it in its FAQ under “How can I join video files?”. The first method, concatenating MPEG-2 files, seemed like the easiest option; however, MPEG-2 doesn’t support my framerate (7 fps).
I tried the other suggested method of concatenating videos, using named pipes. This worked, although the Bash script involved was convoluted and very particular.
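For reference, a stripped-down, video-only version of the FAQ’s named-pipe approach looks roughly like this (the clip filenames are placeholders, I drop the audio entirely since it’s irrelevant for my videos, and this is a simplified sketch rather than my exact script):
mkfifo temp1.v temp2.v
ffmpeg -i clip1.mp4 -an -f yuv4mpegpipe - > temp1.v < /dev/null &
ffmpeg -i clip2.mp4 -an -f yuv4mpegpipe - > temp2.v < /dev/null &
cat temp1.v temp2.v | ffmpeg -f yuv4mpegpipe -i - -r:v 7 combined.mp4
rm temp1.v temp2.v
The yuv4mpegpipe intermediates are uncompressed, which is why the FAQ calls this approach almost lossless, but the fifos and background jobs are exactly what made the script so fiddly.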
Another thing I wanted to add to the video was a separator (some static frames to divide each 8-second clip) and a title card. At first I created a JPEG and turned it into a 1-frame video, concatenated it with itself to create a 2-frame video, and then 4, 8, 16, etc. However, I discovered a much easier method of creating a video from a single image using the -loop flag:
ffmpeg -f image2 -loop 1 -r:v 7 -i image.jpeg -pix_fmt yuv420p -an -t 2 image-movie.mpeg
(The -pix_fmt flag sets the correct color space, and the -an flag disables the audio channel.)
Now I had title cards, clip separators, and 8-second video clips that I could combine into a single video. But I needed to do this hundreds of times! I wrote a Python script to generate the appropriate Bash script based on the input filename. The Bash script would create the title cards and clip separators using ImageMagick, and then call the appropriate FFmpeg commands to create and concatenate the video.
The ImageMagick commands look like this:
convert -size 704x480 -background SteelBlue1 -fill black -font Helvetica -pointsize 72 -gravity center label:[Video Title] titlecard.jpg
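The clip separators can be generated the same way; something as simple as this produces a plain gray card (the color and filename here are just illustrative, not the exact values from my script):
convert -size 704x480 xc:gray50 separator.jpg
The -loop command above then turns that JPEG into a short 7 fps clip.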
Then I used the find command to run the Python script on all the video files:
find *.264 -maxdepth 1 -exec ./process.sh '{}' \; -print
Eureka! It worked. Almost perfectly.
Almost.
Some (but not all) of the output videos would produce solid gray frames after a certain time point. When I reviewed the FFmpeg output for those files, I found this error:
[h264 @ 0x9460340] FMO not supported
FMO stands for Flexible Macroblock Ordering, and based on the response to a libavcodec issue from 2009, the FFmpeg developers don’t plan to support it (although one of the developers suggested that, if someone would like to create a software patch to enable FMO support, the community would welcome it).
I wrote to technical support for the camera system and asked if they had any suggestions. They replied that, although conversion to MPEG formats was not supported, they do provide a conversion utility to convert to AVI. I was able to successfully convert the AVIs to MP4s. An annoying extra step, but one that is only necessary in a subset of cases.
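The AVI-to-MP4 step itself is just one more FFmpeg call; something like the following should work with the libx264 build above (the filenames are placeholders, and the -an is there because the audio doesn’t matter for these clips):
ffmpeg -i camera-clip.avi -an -c:v libx264 camera-clip.mp4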
This solution took me several weeks to figure out, although it should save quite a bit of time in the long run.
Thanks for this great post. I can’t believe nobody left a thank-you post here yet, since this post saves a lot of research time. Kudos!
Wow… Thumbs up!! 😀
Thanks for posting these. If the video chunks have an audio track and you want to add static titles generated from a PNG, I found you need to generate a ‘silent’ audio track for the titles; otherwise the audio gets out of sync when you concatenate the video chunks.
e.g., for a 5-second slice of audio:
ffmpeg -ar 44100 -t 5 -f s16le -acodec pcm_s16le -i /dev/zero -ab 128K -f mp2 -acodec mp2 -y silence.mp2
Then combine it with the 5-second title:
ffmpeg -i title_without_audio.mpg -i silence.mp2 -y title_with_audio.mpg
(and probably something similar for mp4)
Great job! Thanks a lot for this post. Please help me to solve this problem sir. Thanks a lot in advance. http://stackoverflow.com/questions/17311708/how-to-join-two-video-files-using-python?answertab=active#tab-top
Rash, try making the intermediate files MPEGs, as in the example on http://ffmpeg.org/faq.html#Concatenating-using-the-concat-protocol-_0028file-level_0029
You can concatenate the MPEGs and then use another call to FFmpeg to convert the combined file back to WebM. That may not be the most efficient way to do it, but it should work.
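Untested, but the whole round trip would look something like this (the filenames are placeholders):
ffmpeg -i clip1.webm -qscale:v 2 intermediate1.mpg
ffmpeg -i clip2.webm -qscale:v 2 intermediate2.mpg
cat intermediate1.mpg intermediate2.mpg > combined.mpg
ffmpeg -i combined.mpg -c:v libvpx -c:a libvorbis combined.webm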
Thank you so much! This is the best solution I have found to pulling video selections using FFMPEG. You just saved me a ton of time.
Thank you!
Sir, please help me out. In my project, I need to convert an array of images into a video. Please provide me with a solution. It’s an Android project (Java).
@Manmohan, I’ve never done that with Java, but check out the Java Media Framework (JMF). The documentation includes a possible solution for your project: Generating a Movie File from a List of (JPEG) Images