Video Analytics 101 – Object Classification

Object Classification using Video

Object Classification determines the type of video object and assigns it to a predefined “class.”

Most of us are familiar with video analytics and its ability to add intelligence to security camera feeds. However, the many capabilities, features and terms within the category can get a little confusing, so sometimes it is good to step back for a refresher. One such feature is Object Classification. What is it and why should you care?

What is Object Classification?

The simplest definition of “classification” is the “ability to determine the type of object.” Rather than merely registering that something is there, classification determines what kind of object it is: human, animal, car, truck, boat. Discussions of classification often slip into Johnson Criteria terminology (i.e. “recognition”), required pixels on object, or learning systems. These are all interesting topics, but none is needed to understand the basic point: classification tells you what type of object you are dealing with.

What is Classification not?

While discussing object classification, it is also good to understand what it is not. Generally, for security surveillance, objects of interest are classified as cars, trucks, people, and other. More refined classification, such as specific models of cars (e.g. Ford Pinto, AMC Gremlin), is technically feasible, but it requires additional algorithms and processing power. So unless there is a need and a budget to refine the classification further, it generally is not done. As more and more attributes of an object are understood, the capability begins to move into a different category called Identification. Algorithms in this category include Facial Recognition, the ability to uniquely determine a person through unique facial features, and Unique Object Identification, the ability to sub-classify using a larger number of attributes: color, hair type, glasses and clothing for a person, or car model, license plate type, grill features and tire marks for a vehicle.

Why should critical facilities care?

Video Analytics Classification of a person

Classification takes the operator directly to the task of assessing the actions to mitigate the threat.

Like most establishments, critical facilities must increasingly try to do more with less. This applies equally to the protection of facilities, assets and personnel. One way to achieve this is to use sensors that add intelligence, such as object classification. This intelligence helps eliminate false alarms, filters out nuisance events and reduces the workload of the operator, allowing them to manage more sensors and/or larger facilities.

In the case of object classification, a typical scenario would involve the breach of the perimeter fence.  In a normal deployment, the operator would need to be monitoring the camera and witness the breach, or they would need to have some other detection device, such as video analytics, a fence sensor or a proximity sensor to detect the event.  Detection itself, however, is not enough; the operator is now required to take action to determine what type of object has created the breach: a person, a car, an animal or perhaps a piece of debris.  The operator must perform some type of object classification before they may take action.  This is one area where classification can add value.

More and more video management systems are moving to a dynamic map-based interface, in which camera sensors and object locations are displayed dynamically on a map of the facility. In these systems, classification also allows the object type to be shown at its real-time location. Now the operator knows the detection type and has a visual indication of where that object is currently located. In this manner, object classification makes operators more efficient and effective by automating the detection and classification steps and bringing them directly to the task of assessing the actions to mitigate the threat.

How does software “Classify” anyway?

Map-based (geospatial) video management systems dynamically display targets

Map-based video management systems can display the object’s class dynamically on the user interface

We often take for granted how we can look at a video object and quickly determine the type of object we are looking at. What is even more amazing is our ability to translate that intelligence into a bunch of ones and zeroes that allow a computer to do the same. When creating software to classify video objects, the first step is to partition the image into meaningful regions (e.g. objects and background). The next step is to select a set of features that will help discriminate between the object classes of interest. When distinguishing between vehicles, humans and animals, popular feature choices include object size, aspect ratio, compactness and color, and in some cases attributes of the object’s physical movement. The software then compares the attributes of a specific object with those of each class of interest. When a high enough level of similarity is found, the new object is assigned to that class.
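The comparison step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular product works: the three features (estimated height, aspect ratio, compactness), the class prototype values, the similarity formula and the 0.8 threshold are all hypothetical choices made for the example. It also assumes the object has already been separated from the background and measured.

```python
import math
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class Features:
    height_m: float      # estimated real-world height of the object (size)
    aspect_ratio: float  # bounding-box width divided by height
    compactness: float   # 4*pi*area / perimeter**2 (1.0 for a perfect circle)


def compactness(area: float, perimeter: float) -> float:
    """Shape compactness of a segmented region; higher means rounder."""
    return 4.0 * math.pi * area / (perimeter ** 2)


# Hypothetical "typical" feature values for each class of interest.
# A real system would derive these from measured or training data.
PROTOTYPES = {
    "person": Features(height_m=1.7, aspect_ratio=0.4, compactness=0.35),
    "car": Features(height_m=1.5, aspect_ratio=2.5, compactness=0.60),
    "truck": Features(height_m=3.5, aspect_ratio=2.8, compactness=0.65),
}


def similarity(obj: Features, proto: Features) -> float:
    """Score in (0, 1]; 1.0 means the object matches the prototype exactly."""
    # Relative difference per feature, so no single unit dominates the score.
    diffs = [
        abs(o - p) / max(abs(p), 1e-9)
        for o, p in zip(astuple(obj), astuple(proto))
    ]
    return 1.0 / (1.0 + sum(diffs) / len(diffs))


def classify(obj: Features, threshold: float = 0.8) -> str:
    """Assign the most similar class, or "other" if nothing is similar enough."""
    best_label, best_score = max(
        ((label, similarity(obj, proto)) for label, proto in PROTOTYPES.items()),
        key=lambda pair: pair[1],
    )
    return best_label if best_score >= threshold else "other"


if __name__ == "__main__":
    walker = Features(height_m=1.8, aspect_ratio=0.45, compactness=0.33)
    debris = Features(height_m=0.3, aspect_ratio=1.0, compactness=0.9)
    print(classify(walker))
    print(classify(debris))
```

The threshold is what produces the “other” class: an object that does not resemble any prototype closely enough, such as a small piece of debris, is left unclassified rather than forced into the nearest class.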

I get it…so how do I get it?

Understanding the benefits of classification is only part of the value proposition. The other piece is enabling your current or future surveillance system with this capability. The ability to classify objects typically resides in software in one of three locations: in the camera, in an edge device, or in a server. The choice of where analytics should reside generally comes down to network bandwidth limitations and the ability to upgrade the software when more powerful algorithms become available. Analytics built into cameras are typically inexpensive, but they have limited performance and are not upgradable. Edge devices and servers leverage off-the-shelf hardware technology and therefore can be more easily upgraded. When network bandwidth is limited, the analytics need to run in or near the camera. If good network connectivity can be provided from the camera to an environmentally controlled enclosure, a server may be the best choice, as it is more easily updated and generally costs less when multiple cameras are equipped with analytics.

By detecting and then classifying objects, video analytics systems provide more than just the understanding that there is something there. Now they can be very specific and indicate “what” is there. This removes the need for the operator to take this step and allows them to focus on assessing and reacting to the threat.

You can read about other types of video analytics here.

(An earlier version of this article first appeared in Remote Magazine)
