
Why DRI/DORI Cannot Be Used to Claim Autonomous AI Classification Performance in Modern Security Systems - Whitepaper

Jan 12, 2026

1. Introduction

Some surveillance vendors claim they can automatically detect and classify humans and vehicles at extreme distances, sometimes 1 to 6 miles, using only a few pixels per target; in some cases, figures as low as 2–100 total pixels are claimed. These claims are often justified by citing DRI (Detection, Recognition, Identification) or DORI (Detect, Observe, Recognize, Identify) tables, which were created decades ago for human observers, not autonomous computer-vision systems.

 

At the same time, some vendors attempt to avoid AI and machine learning altogether, relying instead on simple motion detection, thresholding, or rule-based analytics, while still claiming “automatic detection.” Both approaches—misusing DRI/DORI or avoiding ML entirely—lead to systems that fail in real-world deployments.

 

These claims conflict with the physics of imaging, the limitations of sensors, and the fundamental requirements of modern machine learning (ML), especially in environments where atmospheric turbulence, reduced contrast, camera shake, background complexity, partial occlusions, animals, and environmental motion are common.

 

This white paper explains why DRI and DORI apply only to human perception, why they cannot be used to predict autonomous classification performance, why ML systems require substantially more pixels on target, and why systems that do not use AI/ML suffer from unacceptably high false-alarm rates. It also explains why long-range conditions require even more margin, and why no company can bypass physics with software. Any extraordinary claim must be validated through a Proof of Concept (POC).

 

2. What DRI and DORI Actually Measure

2.1 DRI (Detection, Recognition, Identification)

DRI was developed in 1958, as part of the Johnson criteria, to estimate how far a human observer could visually interpret a target using optical or thermal equipment. It describes whether a person can detect that "something is there," recognize a general category (such as human versus vehicle), or identify a specific type.

 

Humans can often recognize a person with very limited visual information—on the order of 12–16 vertical pixels, which might correspond to roughly:

  • 12 pixels high × ~4 pixels wide ≈ ~48 total pixels, or

  • 16 pixels high × ~5 pixels wide ≈ ~80 total pixels

This is possible because the human brain can infer missing detail, guess intent, and apply context.
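
To see why pixel counts fall so quickly with distance, the geometry can be worked out directly. The sketch below computes how many pixels a target subtends from focal length, detector pitch, and range; the lens and sensor parameters are illustrative assumptions, not values from any specific product.

    # Pixels-on-target from basic imaging geometry (illustrative parameters).
    def pixels_on_target(target_size_m, range_m, focal_length_mm, pixel_pitch_um):
        # Instantaneous field of view (IFOV): angle subtended by one pixel.
        ifov_rad = (pixel_pitch_um * 1e-6) / (focal_length_mm * 1e-3)
        # Small-angle approximation: pixels = target angular size / IFOV.
        return target_size_m / (range_m * ifov_rad)

    # A 1.8 m person viewed through a 100 mm lens on a 12 um-pitch detector:
    for range_m in (500, 1000, 2000, 5000):
        px = pixels_on_target(1.8, range_m, 100, 12)
        print(f"{range_m:>5} m -> {px:5.1f} pixels tall")

At 2 km this illustrative system puts only about 7–8 pixels on a standing person, within the range a human observer can work with but, as the following sections argue, far below what autonomous classification requires.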

 

DRI was never designed to evaluate autonomous systems.

 

2.2 DORI (IEC 62676-4:2015)

DORI (Detect, Observe, Recognize, Identify) extends similar ideas to CCTV system design and again describes what a human operator can interpret when viewing video. Recognition-level DORI values often correspond to only 7–12 pixels across the target width, still assuming a human is making the judgment.

 

Neither DRI nor DORI evaluates whether a computer can autonomously classify a target, nor do they account for turbulence, camera shake, background complexity, camouflage, or occlusions.

 

3. Why DRI/DORI Cannot Be Applied to Machine Learning

Machine-learning systems such as Convolutional Neural Networks (CNNs) and Transformers classify objects by extracting visual features from the image, including shape, edges, texture gradients, motion consistency, and frame-to-frame stability.

If these features do not physically exist in the pixels, the ML system cannot classify the object.

 

For example, a person appearing 10 pixels high × ~3 pixels wide ≈ ~30 total pixels does not contain enough information to reliably determine head shape, limb movement, torso structure, or vehicle geometry. A human observer might guess; an algorithm cannot.
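
This information ceiling is easy to demonstrate. The sketch below, assuming OpenCV is available and using a hypothetical image file name, shrinks a person crop to 10 × 3 pixels and scales it back up: whatever detail does not survive the tiny representation is gone, and no downstream model can recover it.

    import cv2
    import numpy as np

    # Hypothetical input: a grayscale crop of a person (file name is illustrative).
    crop = cv2.imread("person_crop.png", cv2.IMREAD_GRAYSCALE)

    # Reduce the crop to 3 px wide x 10 px high (~30 total pixels)...
    tiny = cv2.resize(crop, (3, 10), interpolation=cv2.INTER_AREA)
    # ...then scale back to the original size.
    restored = cv2.resize(tiny, (crop.shape[1], crop.shape[0]),
                          interpolation=cv2.INTER_LINEAR)

    # Detail lost in the 30-pixel representation cannot be recovered:
    err = np.abs(crop.astype(np.float32) - restored.astype(np.float32)).mean()
    print(f"mean absolute error after 10x3 round trip: {err:.1f} gray levels")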

 

DRI/DORI recognition thresholds describe what humans can guess from incomplete data. ML systems require real, measurable information.

 

4. What Happens If You Do NOT Use AI / Machine Learning

It is equally important to understand the consequences of not using AI/ML at all.

Systems that rely solely on traditional video analytics—such as simple motion detection, pixel change thresholds, background subtraction, or rule-based logic—lack the ability to understand what is moving. They can detect motion, but they cannot reliably classify it.

 

As a result, non-AI systems typically suffer from:

  • Extremely high false-alarm rates

  • Inability to distinguish humans from animals

  • Inability to reject nuisance motion

  • Poor scalability to large or complex environments

 

4.1 Why Non-AI Systems Generate Excessive False Alarms

Without ML classification, a system must alarm on any motion that meets basic criteria. This includes:

  • Animals

  • Blowing vegetation

  • Shadows

  • Clouds and moving sun patterns

  • Heat shimmer and atmospheric turbulence

  • Camera shake

  • Insects and birds

  • Rain, snow, and dust

 

Rule-based filters can reduce some noise, but they quickly break down in real environments because natural motion is highly variable. As thresholds are tightened to reduce false alarms, real threats are missed. As thresholds are loosened to avoid misses, false alarms explode.
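
The dilemma is visible in even a minimal rule-based detector. The sketch below (a generic frame-differencing pipeline using OpenCV; the video file name is hypothetical) alarms on any pixel change exceeding its thresholds. Raising THRESH or MIN_AREA suppresses nuisance motion and real intruders alike; lowering them floods the operator.

    import cv2

    THRESH = 25      # per-pixel change threshold (gray levels)
    MIN_AREA = 50    # minimum changed-region size (pixels)

    cap = cv2.VideoCapture("perimeter_feed.mp4")  # hypothetical source
    prev = None
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.GaussianBlur(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY), (5, 5), 0)
        if prev is not None:
            diff = cv2.absdiff(prev, gray)
            mask = cv2.threshold(diff, THRESH, 255, cv2.THRESH_BINARY)[1]
            contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                           cv2.CHAIN_APPROX_SIMPLE)
            for c in contours:
                if cv2.contourArea(c) >= MIN_AREA:
                    # No notion of WHAT moved: a person, a deer, a shadow, and
                    # blowing brush all trigger the same alarm.
                    print("ALARM: motion region of area", cv2.contourArea(c))
        prev = gray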

 

This tradeoff cannot be solved without classification.

 

4.2 Non-AI Systems Cannot Scale

As coverage areas grow larger or more complex, non-AI systems become unmanageable:

  • Operators are overwhelmed by alarms

  • Alarm fatigue sets in

  • Systems are ignored or turned down

  • Real threats are lost in noise

 

In practice, many non-AI deployments are eventually disabled or relegated to “monitoring only” because they generate too many alarms to be useful.

 

4.3 Detection Without Classification Is Operationally Dangerous

A system that "detects motion" but cannot determine whether the object is a human, vehicle, animal, or irrelevant noise is not an autonomous security system. It simply shifts the burden to the operator, increasing workload and the chance of human error.

 

This is why modern perimeter security requires both detection and classification, and why AI/ML—used correctly and within physical limits—is essential.

 

5. Long-Range Physics Further Increase ML Requirements (1–6 Miles)

At long ranges, multiple physical effects degrade imagery beyond what DRI/DORI assume:

  • Reduced contrast

  • Background complexity

  • Atmospheric turbulence

  • Camera shake

  • Loss of gradients

 

PureTech mitigates camera-induced motion by performing its proprietary image stabilization as the first processing step, ensuring downstream analytics operate on a stable image. Even so, long-range ML classification requires more pixels, not fewer.
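
As a generic illustration of why stabilization must come first (this is a textbook phase-correlation alignment, not PureTech's proprietary method), the sketch below removes global frame-to-frame translation before any analytics run:

    import cv2
    import numpy as np

    def stabilize(ref_gray, cur_gray):
        # Estimate the global translation between consecutive frames.
        (dx, dy), _ = cv2.phaseCorrelate(np.float32(ref_gray),
                                         np.float32(cur_gray))
        # Shift the current frame back so downstream analytics see a
        # stable scene instead of camera shake.
        h, w = cur_gray.shape
        M = np.float32([[1, 0, -dx], [0, 1, -dy]])
        return cv2.warpAffine(cur_gray, M, (w, h))

Without such a step, camera shake at long focal lengths turns the entire frame into apparent motion, overwhelming both rule-based and ML analytics.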

 

6. Occlusions: Why Real-World Systems Must Design for More Pixel Margin

Real environments include frequent occlusions caused by vegetation, terrain, infrastructure, and partial self-occlusion. When only part of a target is visible, the effective usable pixel count drops sharply.

For example:

  • 40 px high × 15 px wide ≈ 600 total pixels may be sufficient for a fully visible person.

  • Seeing only half the body may require significantly more total pixels to maintain classification confidence, as the sketch below illustrates.
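
A simple linear model makes the margin concrete. It rests on an illustrative assumption that usable information scales with the visible fraction of the target; real requirements also depend on which parts are hidden.

    def required_total_pixels(min_usable_px, visible_fraction):
        # If only part of the target is visible, the full-body pixel
        # budget must grow so the visible part alone still carries
        # enough information. (Illustrative linear assumption.)
        return min_usable_px / visible_fraction

    print(required_total_pixels(600, 1.0))   # fully visible: 600 px
    print(required_total_pixels(600, 0.5))   # half occluded: 1200 px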

 

Designing only to ideal conditions guarantees failure.

 

7. Independent Evidence: Pixel Requirements for Reliable Autonomous Classification

Independent research and industry experience consistently show that reliable autonomous classification cannot be achieved with only a handful of pixels, regardless of algorithm choice or marketing claims.

 

In practical deployments, autonomous classification systems must achieve high probability of correct classification, low false-alarm rates, and low misclassification rates simultaneously. Achieving all three requires substantial spatial and temporal information about the target.
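
These three requirements pull against each other and must be measured together. The sketch below computes them from a confusion matrix; the event counts are invented for illustration, not measured results.

    import numpy as np

    # Rows = truth, columns = prediction: [nuisance, human, vehicle].
    cm = np.array([[950,  30,  20],    # nuisance events
                   [ 10, 180,  10],    # real humans
                   [  5,   8, 187]])   # real vehicles

    p_correct = np.trace(cm) / cm.sum()              # overall correct rate
    false_alarm = cm[0, 1:].sum() / cm[0].sum()      # nuisance flagged as target
    missed = cm[1:, 0].sum() / cm[1:].sum()          # targets dismissed as nuisance
    misclassified = (cm[1, 2] + cm[2, 1]) / cm[1:].sum()  # human/vehicle swapped

    print(f"P(correct)={p_correct:.3f}  FAR={false_alarm:.3f}  "
          f"miss={missed:.3f}  misclass={misclassified:.3f}")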

 

Across a wide range of studies and real-world deployments, several consistent observations emerge:

  • Very small targets (on the order of only tens of total pixels) do not contain sufficient structure for reliable autonomous classification.

  • As pixel counts increase into the hundreds of total pixels, classification accuracy improves substantially, particularly when combined with temporal information such as motion consistency.

  • At long ranges, additional factors—including atmospheric turbulence, reduced contrast, background complexity, and partial occlusions—further reduce usable information, increasing the amount of image data (pixels) required to maintain high accuracy.

 

Importantly, there is no single universal pixel threshold that guarantees reliable classification at long range. The effective pixel requirement depends on multiple factors, including sensor modality, environmental conditions, target contrast, degree of occlusion, and system architecture.

 

Systems that rely primarily on static image appearance and single-frame analysis tend to require significantly larger target images (thousands of total pixels) to achieve acceptable performance under degraded conditions. More advanced systems that exploit stabilized imagery, coherent motion over time, and real-world constraints can extract more information from the same imagery. Even so, no credible system can achieve reliable autonomous classification at DRI/DORI recognition levels, on the order of 2 to a few tens of total pixels.

 

For long-range applications, it is realistic to expect that classification accuracy improves as available target information (pixel count) grows from a few tens of pixels into the hundreds or more, depending on conditions. Claims of reliable classification far below this regime are not supported by physics, industry experience, or independent research.

 

8. Training Data: Why Good ML Requires Large, Clean, Real-World Datasets

ML performance depends heavily on training data quality and quantity. Modern vision models typically require hundreds of thousands to millions of representative examples.

 

PureTech has been training visible and thermal ML models for 8 years, using hundreds of thousands of real-world images collected under operational conditions.

 

Garbage In, Garbage Out

Poor training data leads directly to poor performance. Garbage data includes:

  • Targets that are too small

  • Unrealistic close-ups never seen in deployment

  • Low-contrast imagery

  • Partial fragments without sufficient structure

  • Severe blur or turbulence distortion

  • Incorrect or inconsistent labeling

 

PureTech applies proprietary preprocessing and quality controls to prevent such data from contaminating training.
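
The exact controls are proprietary, but the idea can be sketched generically. The filter below (fields and thresholds are illustrative assumptions, not PureTech's actual criteria) rejects the categories of garbage data listed above before they reach training:

    from dataclasses import dataclass

    @dataclass
    class Sample:
        width_px: int          # labeled target width
        height_px: int         # labeled target height
        contrast: float        # target-vs-background contrast, 0..1
        blur_score: float      # sharpness metric, 0..1 (1 = sharp)
        label_verified: bool   # passed label consistency review

    def passes_quality_gate(s: Sample, min_total_px=300,
                            min_contrast=0.10, min_sharpness=0.30) -> bool:
        # Thresholds are illustrative, not PureTech's actual criteria.
        if s.width_px * s.height_px < min_total_px:
            return False                 # target too small
        if s.contrast < min_contrast:
            return False                 # low-contrast imagery
        if s.blur_score < min_sharpness:
            return False                 # severe blur or turbulence distortion
        return s.label_verified          # labeling must be consistent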

 

9. Thermal vs. Visible Imaging

Thermal imaging often outperforms visible cameras at long range and at night because it measures emitted heat rather than reflected light. Advantages include better target-background separation, no need for lighting, reduced impact from shadows, and reduced effectiveness of visual camouflage.

Thermal does not eliminate physics limits, but it improves signal quality under difficult conditions.

 

10. MWIR, LWIR, and SWIR Overview

  • LWIR (8–14 µm): uncooled, durable, good short- to medium-range performance

  • MWIR (3–5 µm): superior long-range performance, higher contrast, requires cooling

  • SWIR (~1–2 µm): reflected-light imaging, good detail in low light, poor in fog or total darkness

 

Each has tradeoffs; none can violate physics.

 

11. PureTech’s Physics-Aligned Multi-Cue Approach

PureTech Systems combines the following stages (a simplified sketch follows the list):

  1. Image stabilization (first step)

  2. Terrain-mapped object tracking for real-world size, speed, and direction

  3. Motion consistency filtering

  4. Shape plausibility checks

  5. Speed profiling

  6. Contextual and trajectory filtering

  7. ML classification as the final decision stage
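
As a simplified sketch of the gating order (stage logic, field names, and thresholds are illustrative; the actual implementation is proprietary), each inexpensive cue must pass before the ML classifier renders the final verdict:

    # Track fields (height_m, speed_mps, ...) come from terrain-mapped
    # tracking on stabilized imagery (stages 1-2). Values are illustrative.
    def motion_consistent(t):  return t["frames_tracked"] >= 5
    def plausible_shape(t):    return 1.5 <= t["height_m"] / t["width_m"] <= 4.0
    def plausible_speed(t):    return 0.2 <= t["speed_mps"] <= 4.0
    def valid_trajectory(t):   return t["heading_into_zone"]

    CUES = [motion_consistent, plausible_shape, plausible_speed, valid_trajectory]

    def classify_track(track, ml_model):
        if all(cue(track) for cue in CUES):   # stages 3-6
            return ml_model(track)            # stage 7: final ML decision
        return "rejected"                     # filtered before ML runs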

 

PureTech holds 16 issued patents covering image processing, stabilization, and computer vision.

 

12. Why This Matters: Missed Detections, False Alarms, and ROI

A missed detection can mean loss of life, loss of critical infrastructure, regulatory penalties, lawsuits, and reputational damage.

False alarms waste time, consume resources, cause alarm fatigue, and obscure real threats. Excessive false alarms are functionally equivalent to missed detections because operators stop responding appropriately.

 

Organizations that choose systems based solely on lowest acquisition cost often incur far higher total cost of ownership and risk exposure. Investing upfront in systems designed around physics, robust ML, stabilization, terrain mapping, and multi-cue validation delivers far better ROI by avoiding catastrophic failures and operational collapse.

 

13. Proof of Concept: The Only Valid Verification

Any vendor claiming autonomous classification at DRI/DORI pixel levels (fewer than several hundred total pixels), especially under extreme long-range or occluded conditions, must demonstrate the claim in a Proof of Concept (POC).

Physics always wins.

 

14. Conclusion

DRI and DORI describe what humans can infer. They do not describe what autonomous systems require. Systems that ignore ML generate unacceptable false alarms. Systems that misuse ML or ignore physics miss real threats.

PureTech Systems delivers reliable autonomous detection and classification by respecting physical reality, using stabilized imagery, terrain-mapped measurements, multi-cue analytics, disciplined ML training, and patented computer-vision technology. The result is operational security systems that work in practice, as demonstrated by real-world deployments in the most challenging environments, such as national borders.
