In this post, we’ll use OpenCV-Python to process video from a camera (or from a video file) and Python’s matplotlib package to plot a histogram of the video’s pixel intensities in real time. This is the final product:
What is a histogram and what is it good for?
Histograms represent the frequency with which something occurs. In the context of images (or video), a histogram shows us the distribution of the intensity of all the pixels in the image: in other words, how much of the image is light, how much of the image is dark, and everything in between. In OpenCV, pixel intensity is represented by an unsigned 8-bit integer, i.e., by a number from 0 to 255, where 0 is black and 255 is white. In an image with a lot of bright shades, more of the pixels will be closer to 255. In contrast, in an image with a lot of dark shades, a relatively large number of pixels will be closer to 0. Usually, the pixel intensity range of 0 to 255 is subdivided into groups of equal size, called “bins,” to reduce computation time. For example, if we chose to divide the range into 16 bins, the first bin (let’s call it bin 0) would contain pixel intensities from 0 to 15. Any pixel with a value from 0 to 15 would fall into this first bin. The next bin, bin 1, would contain pixel intensities from 16 to 31, and so on.
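To make the binning concrete, here is a minimal sketch that maps an 8-bit intensity to its bin. The helper name bin_index and its parameters are made up for illustration and are not part of OpenCV; the sketch assumes the number of intensity levels divides evenly by the number of bins.

```python
def bin_index(intensity, num_bins=16, num_levels=256):
    """Return the index of the equal-width bin that an intensity falls into.

    Hypothetical helper for illustration; assumes num_levels is evenly
    divisible by num_bins.
    """
    bin_width = num_levels // num_bins  # 16 intensity levels per bin by default
    return intensity // bin_width


# Collect the intensities that land in bin 0 and in bin 1.
first_bin = [v for v in range(256) if bin_index(v) == 0]
second_bin = [v for v in range(256) if bin_index(v) == 1]
```

With the default of 16 bins, first_bin holds the values 0 through 15 and second_bin holds 16 through 31, matching the description above.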
Histograms often serve important purposes in computer vision and image processing. They can be used to determine how similar two images are, or they can be used to differentiate objects in the foreground from the background, since one will often be lighter than the other (which aids in thresholding). A histogram also makes it easy to determine when an image has changed or when something has moved. In the video above, you can see how even slight changes in lighting or shadow are reflected in the histogram.
Let’s dive into the code and see how the video was created. Note that this post assumes you have OpenCV and the OpenCV-Python bindings installed and set up. Installing OpenCV can be an involved and nontrivial process that is considerably outside the scope of this post. You might find the OpenCV installation documentation for Linux, Windows, or iOS useful. Personally, I’m currently running OpenCV 3.2.0 on Ubuntu 16.04 and found Adrian Rosebrock’s instructions at pyimagesearch to be invaluable.
The code
Open your favorite editor and create a file named real_time_histogram.py, or grab the file from my GitHub and follow along.
First, we import the necessary packages and set up the argument parser. All arguments are optional; by default, the script will take video input from a camera without resizing the video frames, and will display a grayscale histogram with 16 bins.
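As a rough sketch of what such a parser could look like, the block below defines four optional arguments with the defaults described above. The flag names (--file, --color, --bins, --width) and the helper build_parser are assumptions made for illustration and may differ from the actual script.

```python
import argparse


def build_parser():
    """Build a parser whose options are all optional (flag names are assumed)."""
    parser = argparse.ArgumentParser(
        description="Plot a real-time histogram of video pixel intensities."
    )
    parser.add_argument("-f", "--file", default=None,
                        help="path to a video file (default: use the camera)")
    parser.add_argument("-c", "--color", choices=["gray", "rgb"], default="gray",
                        help="histogram color space (default: gray)")
    parser.add_argument("-b", "--bins", type=int, default=16,
                        help="number of histogram bins (default: 16)")
    parser.add_argument("-w", "--width", type=int, default=0,
                        help="resize frames to this width; 0 keeps the original size")
    return parser


# Parsing an empty argument list yields the defaults.
defaults = build_parser().parse_args([])
```

Here a width of 0 is used as a stand-in for "do not resize", which is one of several reasonable conventions.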
Next, we process the arguments. On lines 19 and 21, we create an object called capture, an instance of the VideoCapture class. If using a camera, cv2.VideoCapture() must be supplied with an integer representing the device ID. If there’s only one camera connected, we can simply pass 0, as on line 19. Alternatively, to read from a video file or image sequence, we must pass it a filename, as on line 21.
Here, we initialize the plot and axis, as well as set the plot title and the x and y axis labels.
Next, we initialize the line(s) that will actually represent the histogram(s). In the case of the RGB histogram (lines 40-42), we have three line objects, one for each channel: red, green, and blue. In the grayscale histogram (line 44), there’s only one channel and, consequently, one line object. All the lines are initialized with the specified number of bins on the x axis, with the x axis values spanning the range from 0 to bins - 1, which is accomplished with np.arange(bins). Because a line requires both x values and y values to be initialized, we also pass an array of zeros for the y data using np.zeros((bins,)). The lw keyword argument sets the line width. The c keyword argument sets the color. The alpha keyword argument sets the transparency of the line.
Lines 45-46 set the x and y axis limits, respectively. On line 47, we turn on interactive plotting. Although our plot will not be interactive, interactive plotting allows other code to execute while the plot is open—in this case, the “other code” is the upcoming block that processes the video. Line 48 displays the plot window.
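The plot setup described above might look roughly like the following sketch. The helper init_histogram_lines, the specific line width and alpha values, and the y axis limit of 1 (which assumes the histogram is normalized to relative frequencies) are assumptions for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np


def init_histogram_lines(ax, bins, color="gray"):
    """Create one flat line per channel and return the Line2D objects."""
    x = np.arange(bins)      # x values 0, 1, ..., bins - 1
    y = np.zeros((bins,))    # placeholder y values until the first frame arrives
    channel_colors = ["r", "g", "b"] if color == "rgb" else ["k"]
    lines = []
    for c in channel_colors:
        (line,) = ax.plot(x, y, c=c, lw=3, alpha=0.5)  # ax.plot returns a list
        lines.append(line)
    ax.set_xlim(0, bins - 1)
    ax.set_ylim(0, 1)        # assumes frequencies are normalized to [0, 1]
    return lines


fig, ax = plt.subplots()
lines = init_histogram_lines(ax, bins=16, color="rgb")
plt.ion()  # interactive mode, so showing the figure does not block the script
```

Keeping references to the line objects is the important part: the video loop only needs to update their y data rather than redraw the whole plot from scratch.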
We arrive now at the loop that will continuously process each frame of the video. On line 52, we utilize the read() method of our VideoCapture class instance, capture. The read() method grabs, decodes, and returns the next frame of the video, which we store in the variable frame. It also returns a Boolean True if a frame was successfully grabbed or False if not, which we store in the variable grabbed. A False value is returned at the end of the video when reading from a file, or if the camera is disconnected when reading from a camera.
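One way to express this read-until-failure pattern is a small generator. The name iter_frames is hypothetical; the only thing it relies on is that the object passed in has a read() method returning a (grabbed, frame) pair, as VideoCapture does.

```python
def iter_frames(capture):
    """Yield frames from a VideoCapture-like object until read() reports failure."""
    while True:
        grabbed, frame = capture.read()  # read() returns (success flag, frame)
        if not grabbed:
            break  # end of the file, or the camera stopped delivering frames
        yield frame
```

With this helper, the main loop could be written as "for frame in iter_frames(capture):", which keeps the end-of-stream check in one place.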
On lines 58-62, frame is resized to the specified width in pixels (if a width was given as one of the arguments to the script). In OpenCV-Python, images are represented by NumPy arrays, so we can use standard NumPy functions, as we do on line 59, to get the height and width of the frame.
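The arithmetic behind an aspect-ratio-preserving resize can be sketched as follows. The helper resized_dimensions is a made-up name, and rounding the height to the nearest pixel is one possible choice among several.

```python
def resized_dimensions(frame_shape, target_width):
    """Return (width, height) for a resize that preserves the aspect ratio.

    frame_shape is the .shape of a NumPy image array: (height, width[, channels]).
    """
    height, width = frame_shape[:2]
    scale = target_width / width
    return target_width, int(round(height * scale))


# A 640x480 frame resized to a width of 320 keeps its 4:3 aspect ratio.
new_size = resized_dimensions((480, 640, 3), 320)
```

The result is ordered (width, height) because that is the order cv2.resize() expects for its size argument, which is the reverse of the (height, width) order in a NumPy array’s shape.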
At last, the main event. For the RGB case, we first split the three-channel image array into three single-channel arrays with cv2.split() on line 68. It is important to note that OpenCV uses the BGR format to represent images by default, hence the ordering of the tuple for the output arrays, (b, g, r) = cv2.split(frame). To actually compute the histograms, we use cv2.calcHist(), which does the heavy lifting for us. The first argument to calcHist() is a list of the source images. In this case, we’re computing a one-dimensional histogram for each channel, so we only provide one image for each histogram. However, the calcHist() function can also create multidimensional histograms. For example, a two-dimensional RG histogram would provide information on how frequently red and green (at varying intensities) occur together in the same pixel.
The second argument to calcHist() is a list of the indices of the channels from the source images to use for the histogram. Again, since we’re supplying a single-channel source image for each histogram, there’s only one index: [0]. The third argument is an optional mask. If we were only interested in a certain part of the image, we could create a mask: an 8-bit 2D array of the same width and height as the source images that contains nonzero values for the pixels we’re interested in and zeros for the pixels we want to ignore. In this case, we’re interested in the whole image, so we set the mask argument to None. The fourth argument to calcHist() is a list of the number of bins for each dimension of the histogram. Since we’re creating one-dimensional histograms, we supply it with a list containing a single value. The fifth argument is a list of the min, max values of the bin boundaries for each dimension. OpenCV represents images with unsigned 8-bit integers, which take on a range of values from 0 to 255. Note that calcHist() treats the lower boundary as inclusive and the upper boundary as exclusive, so an upper boundary of 256 is needed for pixels with an intensity of exactly 255 to be counted.
After computing the histograms, the plot line objects defined earlier are updated with the new frequencies on lines 72-74.
The grayscale histogram is similar, except we first convert the image from BGR to grayscale with cv2.cvtColor() on line 76. Because there’s only one channel (and one corresponding plot line object), we only need one call to calcHist(). In both cases, the image is displayed with cv2.imshow(), whose first argument is a string for the title of the window and whose second argument is the image to display.
To actually refresh the plot, we call fig.canvas.draw().
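The update-and-redraw step can be sketched in isolation. The histogram values below are made up to stand in for one frame’s normalized result; the column shape (bins, 1) mirrors what calcHist() returns, which is why it is flattened before being handed to the line.

```python
import matplotlib.pyplot as plt
import numpy as np

bins = 16
fig, ax = plt.subplots()
(line,) = ax.plot(np.arange(bins), np.zeros((bins,)), c="k", lw=3)
ax.set_ylim(0, 1)

# Stand-in for one frame's normalized histogram (values are made up).
histogram = np.full((bins, 1), 1.0 / bins, dtype=np.float32)

line.set_ydata(histogram.ravel())  # flatten the (bins, 1) column to 1-D
fig.canvas.draw()                  # re-render the figure with the new data
```

Only the y data changes from frame to frame; the x values and axis limits set during initialization are reused.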
Finally, the if statement on line 82 exits the loop if the user presses the Q key. The function cv2.waitKey() waits for a keypress for a number of milliseconds determined by the input argument. cv2.waitKey(1) means it waits 1 millisecond. If it’s given an integer less than or equal to zero, it waits indefinitely (if we did this instead of a positive value like 1, the loop wouldn’t proceed to the next iteration until a key was pressed). If a key is pressed, a 32-bit int is returned, but only the last 8 bits of the value correspond to the ASCII representation of the key. The bitwise AND operator & is used to extract these 8 bits (0xFF is a hex value that is equivalent to 11111111 in binary, i.e., 0b11111111). This 8-bit ASCII value is then compared to the ASCII value of “q,” which is given by the built-in Python function ord().
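The masking logic can be tried without a camera or window by feeding in integers directly. The helper is_quit_key is a hypothetical name; in the real loop, raw_key would be the return value of cv2.waitKey(1), which is -1 when no key was pressed.

```python
def is_quit_key(raw_key, quit_char="q"):
    """Return True if the low 8 bits of a waitKey() result match quit_char."""
    key_code = raw_key & 0xFF  # keep only the last 8 bits
    return key_code == ord(quit_char)


# The hex and binary spellings describe the same 8-bit mask.
mask_values_match = (0xFF == 0b11111111)
```

Note that the comparison is case-sensitive: ord("q") is 113, while an uppercase "Q" (for example, with Caps Lock on) is 81 and would not exit the loop.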
Once the loop is exited, the VideoCapture method release() closes the video file or camera input, and cv2.destroyAllWindows() closes any open OpenCV windows.
Hopefully, that was relatively straightforward. Perhaps we’ll explore uses of histograms in a future post.