December 7, 2023
August 19, 2025

A Tale of Two Datasets: The Disparity Between IT and OT Data

Welcome back, fellow travelers on the digital frontier. Joe Baxter here, and today we're diving into the heart of what makes our IT and OT worlds so different: their data. Before we begin, in case you have missed any of the previous installments, here are quick links:

  1. Priorities … Here
  2. Personnel … Here
  3. Programs … Here
  4. Processing … Here
  5. Parameters … You are Here
  6. Privileges … coming soon
  7. Placement … coming soon
  8. Protocols … coming soon

Parameters (aka data). It's a fascinating subject, and when you peel back the layers, you find that the very fabric of how each world handles information is fundamentally distinct, again emphasizing the difference between IT and OT.

The Foundation: Logging, Tagging, and Consumption

In the IT world, data is a living, breathing entity. We've built an entire industry around classifying, protecting, and monetizing it. Our software is a marvel of efficiency, capable of tagging, encrypting, and storing vast amounts of data in a way that's easily consumable. What's more, we’ve learned to use this data to make our systems smarter. We're constantly sending metadata and statistics to the cloud, feeding hungry AI models with the very metrics needed for everything from basic capacity planning to advanced threat hunting. This is a dynamic, evolving process.

In the OT world, the approach is much more… grounded. Data is primarily a historical record. We log access for compliance, sure, but the crown jewels are stodgy, old, uninteresting stuff like Sequence of Events (SOE) data. This information is meticulously collected to serve one primary purpose: post-incident forensics. When a turbine unexpectedly trips or a valve fails, this data helps us understand why. At present (although that is changing now!), this data rarely undergoes the kind of sophisticated AI analysis we see in the IT space. It's a static record, a historical ledger of what happened.
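
To make the forensic use of SOE data concrete, here is a minimal Python sketch of putting events back in order after a trip. The record fields, point names, and timestamps are all invented for illustration; a real SOE recorder will have its own format and time resolution.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class SoeEvent:
    timestamp: datetime  # when the device recorded the state change
    point: str           # hypothetical point name
    state: str           # new state of the point


def reconstruct_sequence(events):
    """Return the events oldest first, regardless of arrival order."""
    return sorted(events, key=lambda e: e.timestamp)


def first_event(events):
    """Return the earliest recorded event, a common starting point for forensics."""
    return min(events, key=lambda e: e.timestamp)


# Invented sample data: events often arrive out of order from different devices.
t0 = datetime(2025, 1, 1, 12, 0, 0)
events = [
    SoeEvent(t0 + timedelta(milliseconds=42), "TURBINE_1_TRIP", "ACTIVE"),
    SoeEvent(t0 + timedelta(milliseconds=7), "LUBE_OIL_PRESSURE_LOW", "ACTIVE"),
    SoeEvent(t0 + timedelta(milliseconds=19), "BEARING_TEMP_HIGH", "ACTIVE"),
]

ordered_points = [e.point for e in reconstruct_sequence(events)]
```

In this made-up case the trip is the last thing that happened, not the first, which is exactly the kind of answer the historical ledger exists to give.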

The truly important data in OT doesn’t issue massive sell orders to the stock market or transfer money between offshore accounts. OT data looks rather pedestrian by comparison. We care about ramping a generating unit to a particular setpoint or closing a valve once the metered flow reaches x million gallons (or liters).
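
The valve example boils down to a running total. The sketch below is only an illustration of that idea, assuming flow-rate samples in gallons per minute taken at a fixed interval; the function name, units, and numbers are hypothetical and not taken from any particular control system.

```python
def sample_index_to_close_valve(flow_samples_gpm, sample_interval_s, target_gallons):
    """Totalize flow-rate samples and return the index of the sample at which
    the accumulated volume first reaches the target, or None if it never does."""
    total_gallons = 0.0
    for index, rate_gpm in enumerate(flow_samples_gpm):
        # Volume passed during this interval: rate (gal/min) * minutes elapsed.
        total_gallons += rate_gpm * (sample_interval_s / 60.0)
        if total_gallons >= target_gallons:
            return index
    return None


# Invented numbers: a steady 600 gal/min sampled once a minute.
steady_flow = [600.0] * 10
close_at = sample_index_to_close_valve(steady_flow, 60, 1800.0)
never = sample_index_to_close_valve(steady_flow, 60, 10_000.0)
```

Nothing glamorous: one number goes up until it crosses another number, and then something physical happens.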

The Scale and Scope of Data

The sheer volume of data in IT and OT is another key differentiator. IT databases are gargantuan, often containing petabytes or even exabytes of both structured and unstructured data. We use complex Relational Database Management Systems (RDBMS) that consume a significant portion of our storage and processing power. We'll even “cube” this data into historical warehouses for deep analysis.

Contrast this with the OT world, where performance is king. We favor flat file databases because they're fast, efficient, and (frankly) a little brain-dead. These databases are tiny by comparison, often only a few gigabytes, or terabytes at most. They contain the unstructured data necessary for the systems to function, but they're not built for the kind of massive, multifaceted queries that are commonplace in IT.
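
Why is a flat file fast? One reason is that a lookup can be plain arithmetic on a byte offset. The sketch below assumes a fixed-width binary record, which is only one flavor of flat file; the layout (point ID, value, quality flag) is hypothetical and chosen purely for illustration.

```python
import struct

# Hypothetical record layout, little-endian with no padding:
# 4-byte unsigned point ID, 8-byte float value, 1-byte quality flag.
RECORD = struct.Struct("<IdB")


def pack_records(rows):
    """Serialize (point_id, value, quality) tuples into one contiguous blob."""
    return b"".join(RECORD.pack(pid, value, quality) for pid, value, quality in rows)


def read_record(blob, index):
    """Read record number `index` by jumping straight to its byte offset."""
    return RECORD.unpack_from(blob, index * RECORD.size)


# Invented sample data.
blob = pack_records([(1, 0.0, 1), (2, 48.5, 1), (3, 120.25, 0)])
second = read_record(blob, 1)
```

There is no query planner, no join, and no schema negotiation here. That is the "brain-dead" part, and it is also why this style of storage cannot answer the open-ended questions an IT warehouse handles routinely.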

The Purpose of Retention

Why we keep data is arguably the most significant difference. In the IT world, our data retention policies are dominated by a web of legal and regulatory compliance. Regulations like GDPR, SOX, HIPAA, and GLBA bind us. Even in the private sector, we adhere to standards like PCI-DSS. We're constantly preparing for eDiscovery requests and safeguarding Personally Identifiable Information (PII).

In the OT world, the data retention playbook is simpler. The raison d'être is historical evidence and forensics. We're not worried about eDiscovery or PII in the same way. We’re focused on having the necessary information to reconstruct an event, diagnose a problem, and get the system back online as quickly as possible. Until PLCs start demanding legal status for privacy, we are probably immune to the privacy challenges that overlay IT today.

The Nature of Change: Dynamic vs. Static

Finally, let's talk about change. In IT, our data models are constantly evolving. New fields are added, records are calculated, and the data schema is continuously in flux to meet new business needs. It’s a dynamic, ever-changing environment.

OT data, on the other hand, is remarkably static. A setpoint for a generating unit in a SCADA system, for example, will only ever fluctuate between zero and its nameplate rating. The data structure and the range of values are fixed. The purpose and function of the data remain constant over time. This predictability is a cornerstone of OT system stability.
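
A fixed range like this is easy to express in code. The sketch below is a hedged illustration of the setpoint example, not how any particular SCADA product does it; the function name, the megawatt units, and the 250 MW rating are assumptions made up for the example.

```python
def is_valid_setpoint(setpoint_mw, nameplate_mw):
    """Return True if the requested setpoint lies between zero and the
    unit's nameplate rating, inclusive."""
    return 0.0 <= setpoint_mw <= nameplate_mw


# Hypothetical 250 MW generating unit.
NAMEPLATE_MW = 250.0
requests = [0.0, 125.0, 250.0, 250.1, -5.0]
accepted = [mw for mw in requests if is_valid_setpoint(mw, NAMEPLATE_MW)]
```

Because the valid range never changes, a check this simple stays correct for the life of the unit, which is a luxury a constantly shifting IT schema rarely offers.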

So there you have it: a deep dive into the fundamental differences between IT and OT data. This isn't just an academic exercise; understanding these differences is crucial for anyone hoping to bridge the gap and build effective, integrated cybersecurity solutions.

Until next time, this is Joe Baxter signing off.
