Responsible AI – When is Probability Dangerous?

I read an interesting article a while back, and I'm going to break down some concerns about probability and AI.

Both are critical to discuss as we roll forward into an untamed future where unregulated AI intersects not just with business practices, but with the human condition, and even human existence itself. Here I'll discuss the use of big data and AI to make decisions about life without the knowledge, regulation, or oversight to do so ethically.

An existential example of this absence is in the field of reproductive technology. AI is being used to evaluate a person’s life before birth. Polygenic screening within IVF treatments is a fascinating study of how unregulated AI, with incomplete data and opaque algorithms, is influencing deeply personal, and truly existential, decisions.

Unregulated IVF

On April 1, 2025, the New York Times published an opinion piece, "Should Human Life Be Optimized". It's a fascinating article. It opens with a woman whose mother suffered from a genetic defect, an experience that spurred her to pursue genetic screening of embryos' DNA for a variety of conditions. From the emotional account of screening for a genetic switch for blindness, the piece moves nearly seamlessly into polygenic screening, describing it as a risk profile for conditions such as heart disease.

What is Polygenic Screening, Really?

To understand ethical concerns around polygenic embryo screening, we have to start with what these predictions are actually based on, and how fragile that foundation is.

Polygenic Risk Score (PRS)

Now, if you want to actually look up polygenic screening, you will want to search for Polygenic Risk Score (PRS). A PRS is the sum of the effects of Single Nucleotide Polymorphisms (SNPs), which are small differences in a DNA sequence that vary from person to person. One SNP has minimal impact, but adding many SNPs together produces a score that estimates an individual's risk for a given condition1.

This means we are stacking thousands of tiny genetic differences to generate a profile. It’s probabilistic, not predictive; built on patterns, not certainties.
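To make that concrete, here is a minimal sketch of the arithmetic behind a PRS: a weighted sum of effect-allele counts. The SNP IDs, effect weights, and genotype below are hypothetical, purely for illustration; real scores stack thousands of such terms.

```python
# Minimal PRS sketch: a weighted sum of effect-allele counts.
# SNP IDs, effect weights, and the genotype are hypothetical.

snp_weights = {
    "rs0000001": 0.02,   # per-allele effect size (beta) from a GWAS
    "rs0000002": -0.01,
    "rs0000003": 0.05,
}

# Genotype: how many copies (0, 1, or 2) of each effect allele a person carries.
genotype = {"rs0000001": 2, "rs0000002": 0, "rs0000003": 1}

def polygenic_risk_score(weights: dict, alleles: dict) -> float:
    """Sum of (allele count x effect weight) over all scored SNPs."""
    return sum(weights[snp] * alleles.get(snp, 0) for snp in weights)

print(round(polygenic_risk_score(snp_weights, genotype), 4))  # 0.02*2 + 0.05*1 = 0.09
```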

Single Nucleotide Polymorphisms (SNP)

There are more than 10 million SNPs within the human genome, and when they interact with each other, they can influence a variety of outcomes, like height, tumor risk, or even psychiatric conditions. Because SNPs interact in complex ways, predicting outcomes from them is inherently uncertain. Stacking millions of SNPs, along with their conditions and mutations, alters gene expression in ways that affect how tall you can grow, your likelihood of developing medical conditions, and more.

This is very exciting technology, a huge leap forward in the human understanding of how we are made and how all our genes stack to make us whole. But it is in its infancy, and we have over 10 million SNPs to account for. Even if we had only 10 million independent SNPs, and each SNP had just 2 possible alleles (versions of a gene), the number of possible combinations is too large to even begin to fathom:

2^10,000,000 ≈ 10^3,010,299

While this formula is a simplification, the point remains: the sheer scale of genetic variation makes prediction practically impossible.
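As a quick sanity check on that number, the conversion above worked in code:

```python
import math

# 10 million independent, biallelic SNPs give 2**10,000,000 combinations.
# Convert to a power of ten via 2**n == 10**(n * log10(2)).
exponent = 10_000_000 * math.log10(2)
print(f"2^10,000,000 ≈ 10^{exponent:,.0f}")  # ≈ 10^3,010,300
```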

And here’s one more key data point: if you add together the participants of three of the most comprehensive publicly available genome sequencing datasets, gnomAD v4, UK Biobank, and the 1000 Genomes Project, they collectively represent only about 0.15% of the European population2. That means probabilities are being generated from an incredibly small and incomplete dataset.
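The 0.15% figure follows from the approximate participant counts given in footnote 2; here is the arithmetic, ignoring any overlap between the cohorts:

```python
# Approximate counts from footnote 2: (participants, share of European descent).
datasets = {
    "UK Biobank":           (500_000, 0.95),
    "1000 Genomes Project": (2_504, 0.24),
    "gnomAD v4":            (800_000, 0.83),
}
european_population = 750_000_000  # rough estimate from footnote 2

european_participants = sum(n * share for n, share in datasets.values())
print(f"{european_participants:,.0f} participants of European descent")
print(f"{european_participants / european_population:.2%} of the European population")
# ~1,139,601 participants -> ~0.15%
```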

We Don’t Have All the Information Yet.

Clearly, a fundamental problem is that the data being used to build these predictive models is incomplete. SNPs come from limited datasets, and companies are using those incomplete samples to build statistical probabilities about complex human traits. The incomplete foundation makes the outputs, the “predictions”, fundamentally unreliable.

On top of that, the SNPs are then added and combined to create the PRS. The risk score is therefore a set of numbers drawn from an incomplete dataset, combined into relationships with other incomplete datasets, to predict complex human traits. Building predictions on incomplete datasets is foundationally unsound.

In a well-cited Nature Genetics paper, Common SNPs explain a large proportion of heritability for human height, the authors state that “SNPs identified to date explain only ~5% of the phenotypic variance for height”3. Even for something as measurable as height, our best models still explain only a small fraction; the study used height because it is one of the most identifiable, quantifiable, and best understood polygenic traits to date. What does this imply about our ability to predict more complex traits?

If the height prediction is among the most current, well-defined, and well-understood SNP applications to date, we must accept that combining even less defined and less understood variants to aggregate a probability score is risky at best, and misleading at worst.

Carefully Selected as an Embryo

Returning to the NY Times article, Orchid is described as “… offering what is essentially a risk profile on each embryo’s propensity for conditions such as heart disease, for which the genetic component is far more complex.” This is a very simplified way of describing the SNPs and PRS discussed above4.

The article also states Orchid can screen for conditions like obesity, autism, intellectual ability, and height. However, it glosses over a critical point: the United States has little to no regulation in the IVF field. These PRS-based screenings have made the U.S. a destination for global fertility patients, not because the science is more advanced, but because the regulatory barriers are fewer.

Today, Preimplantation Genetic Testing (PGT) is used in over half of all IVF procedures. Several American companies now offer PGT-P, which incorporates Polygenic Risk Scores. This is happening even though, in adults, the results of these scores remain ethically controversial and scientifically uncertain.

Most other countries have recognized concerns regarding PRS. Many have banned or strictly limited the use of non-medical predictive scores in IVF. Several European nations prohibit the use of polygenic embryo screening for traits like intelligence or psychiatric risk, citing ethical and scientific concerns5. In contrast, the lack of oversight in the United States has fueled a booming and largely unregulated market. It is a market driven by hope, but based on data that does not justify that level of confidence.

Big Data

Continuing with Orchid as our example, we know they run modeling techniques on both partners' DNA to map inheritance patterns.

Advanced statistical models are used, but they add complexity and reduce transparency for the prospective parent. Unless the parent is already a geneticist or a computer scientist, they are unlikely to understand Orchid's description of its statistical modeling6, which includes Monte Carlo simulations, Bayesian probability, recombination modeling, and more, to predict embryo outcomes. These aren't simple calculators; they are big data aggregators running probabilistic scenarios on incomplete datasets. Their complexity challenges any principle of transparency and informed consent that we would want from responsible AI. At the end of this, hopeful future parents receive an embryo report, allowing them to make an informed decision about which embryo to prioritize for transfer. But are they informed? Given the complexity of the models and the opacity of the language used to describe them, the answer is almost certainly no. Informed consent becomes nearly impossible under these conditions.
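To give a flavor of what "Monte Carlo simulation" means in this context, here is a deliberately toy sketch, and emphatically not Orchid's model: simulating Mendelian inheritance of a single hypothetical risk allele from two heterozygous parents.

```python
import random

# Toy Monte Carlo inheritance sketch (NOT any vendor's model): both parents
# are heterozygous (one risk allele each) at a single hypothetical site.
# Each child inherits one allele, chosen at random, from each parent.
def simulate_child_risk_alleles(trials: int = 100_000) -> dict:
    counts = {0: 0, 1: 0, 2: 0}
    for _ in range(trials):
        child = random.choice([0, 1]) + random.choice([0, 1])  # one allele per parent
        counts[child] += 1
    return {k: v / trials for k, v in counts.items()}

print(simulate_child_risk_alleles())  # ~{0: 0.25, 1: 0.50, 2: 0.25}
```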

This entire process is built on large datasets and intricate modeling methodologies. It's interesting, and it promises all sorts of great things; but it is all probabilistic. Consider what it means to tell someone to select an embryo that is 50% less likely to develop schizophrenia, when the base risk for the condition is already under 1 percent, the prediction is drawn from data representing less than 0.15 percent of the population, and the models themselves have wide margins of uncertainty.
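The relative-versus-absolute distinction is easy to miss, so here is that arithmetic spelled out, assuming a hypothetical 1 percent base rate:

```python
base_risk = 0.01              # hypothetical ~1% population base rate
relative_reduction = 0.50     # "50% less likely"

selected_risk = base_risk * (1 - relative_reduction)
absolute_reduction = base_risk - selected_risk

print(f"risk: {base_risk:.1%} -> {selected_risk:.1%}")   # 1.0% -> 0.5%
print(f"absolute reduction: {absolute_reduction:.1%}")   # half a percentage point
```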

The numbers may imply precision, but the science and data behind them do not.

AI Studies

Several recent studies highlight how artificial intelligence is being used to improve polygenic risk prediction.

Aside from the excitement of showing that AI can boost the performance of polygenic prediction, these studies each use predictions that are among the best known and best defined, such as height, breast cancer risk, and cholesterol. They didn't all choose identical prediction sets, but each stayed within a set of well-defined, well-studied combinations.

As discussed earlier, even for a trait like height, only about 5% of the variation we observe in people’s height can be explained by the SNPs identified so far. That means even the most advanced tools remain limited. Applying machine learning and AI to these best-known examples allows for more effective testing and training, which leads to more consistent outputs. This success does not automatically carry over to traits that are more complex or poorly understood.

Make no mistake, it's incredibly exciting that some machine learning tools are capable of increasing the accuracy of probabilistic risk estimates for breast cancer or high blood pressure, but it's still just a probability. These studies show AI can improve prediction slightly for well-understood conditions, but the models still depend on incomplete data from small data groups7. The excitement around AI's predictive power must be tempered by the reality that these are still just probabilities, not certainties.

Increasing probability is not the same as achieving accuracy, and when considering embryo selection, that distinction matters.

Responsible AI and Global Regulation

All of this brings us back to the ethical foundation of AI use. Responsible AI is supposed to be transparent, fair, inclusive, and accountable8. Yet the use of PRS in embryo selection in the U.S. is none of these. It is opaque, unregulated, built on incomplete data, and applied in deeply personal, high-stakes decisions.

By contrast, the European Union and United Kingdom have laws in place that prohibit the use of polygenic screening in IVF for non-medical traits or poorly validated conditions. They have drawn clear ethical boundaries that can be seen in multiple countries, and from multiple governing bodies.

Conclusion: Building the Tech, Ignoring the Guardrails

What we are seeing is a path of two futures: the U.S. builds the technology, while the EU builds the guardrails. Without urgent alignment between innovation and ethical governance, we risk letting unproven AI systems make decisions that shape human life before it even begins. This reflects a complete abdication of governance. As described in this paper, the amount of information a person would need to absorb to be well enough informed to understand these tools, and what their outputs mean, is unreasonable.

Handing a person who wants to be a parent a sheet of paper with predictions based on incomplete data, and then having them choose which life to pursue? This paper hasn't even begun to scrape the deeper ethical questions of what happens when that child isn't the tallest, the smartest, or whatever trait they were selected for. Do parents raise that child as if they purchased a pre-made pizza, wondering where the toppings they ordered are? These are probabilistic outcomes on a human existence, being used to decide whether one cell cluster is of greater value than another. The problems here lie much deeper than the selection or avoidance of potential heart disease.

The USA stands out among countries with advanced in vitro fertilization for its lack of regulations governing PGT-P. Often, its ethical and legal guardrails are shaped by international influences, like the GDPR-inspired California and Colorado privacy laws. Here's hoping that the states start looking at their IVF industries and begin to explore how to support future families.

At some point, we may reconsider whether what is being offered by these clinics is truly "probabilistic" at all. When the data behind these models is so limited, the output becomes less a scientific prediction and more a matter of hope. It is almost certain that legal teams have carefully worded disclaimers to avoid claims of certainty or guarantees, yet it may be inevitable that, at some point, disappointed parents decide the probability of a lawsuit is worth pursuing.

Could the IVF clinics perhaps be curtailed from selling designer babies under false advertising or predatory practices? Nothing is decided, and I don’t have the answer. But I do think we should be asking these questions.

I look forward to watching and learning more as our future unfolds.


Footnotes

  1. Psychiatry at the Margins ↩︎
  2. Approximate participant counts in known genome projects that become part of the data lake: UK Biobank (500k), 1000 Genomes Project (2,504), gnomAD v4 (800k). The estimated share of participants of European descent was 95%, 24%, and 83%, respectively. Estimated European population: 750 million. ↩︎
  3. National Library of Medicine | National Center for Biotechnology Information | Sizing up human height variation, published in May 2008, and Common SNPs explain a large proportion of heritability for human height. These two articles go into great depth and detail on the combining of SNPs, as well as the fact that there is an average of 45% variance even when measuring SNPs of the most recognized type (height). They also highlight that SNPs (at current understanding) account for only a small fraction of genetic variation. ↩︎
  4. US Leadership in AI, p. 8, states that trust requires accuracy, reliability, explainability, objectivity, and more; the use of AI and probabilistic models for PGT-P misses that mark. ↩︎
  5. In the UK, the Human Fertilisation and Embryology Authority (HFEA) prohibits the use of PGT-P for non-medical purposes. Similar laws can be found in most EU member states as well. ↩︎
  6. On the Orchidhealth.com page The Science Behind our GRS, they mention simulations, modeling patterns, statistical computing, and recombination. ↩︎
  7. Nature | Analysis of polygenic risk score usage and performance in diverse human populations. Most common data sets for PRS and SNPs are of white European descent (67%), East Asian (19%), and others in smaller quantities. ↩︎
  8. KPMG's Trusted AI governance approach lists fairness, transparency, explainability, accountability, data integrity, reliability, security, safety, privacy, and sustainability as the principles within the pillars of KPMG Trusted AI. Using that same set of pillars, PGT-P lacks at least five of the ten. We haven't reviewed the other five in this paper. ↩︎

References


Adrien Badre, L. Z. (2023, July 24). Arxiv | Quantitative Biology > Quantitative Methods | Deep neural network improves the estimation of polygenic risk scores for breast cancer. https://arxiv.org/abs/2307.13010

Aftab, A. (2024, February 17). Psychiatry at the Margins | Polygenic Embryo Screening and Schizophrenia. https://www.psychiatrymargins.com/p/polygenic-embryo-screening-and-schizophrenia

UK Biobank. (2025). UK Biobank | Whole genome sequencing. https://www.ukbiobank.ac.uk/enable-your-research/about-our-data/genetic-data

Elgart, M., Lyons, G., Romero-Brufau, S., Kurniansyah, N., Brody, J. A., Guo, X., . . . Sofer, T. (2022, August 22). Communications Biology | Non-linear machine learning models incorporating SNPs and PRS improve polygenic prediction in diverse human populations. Retrieved from Communications Biology: https://www.nature.com/articles/s42003-022-03812-z

Gabriel Lázaro-Muñoz, P. J. (2023, November 9). ELSI Hub | Screening Embryos for Psychiatric Conditions: Public Perspectives, Ethical and Social Issues. https://elsihub.org/sites/default/files/2025-05/Screening%20Embryos%20for%20Psychiatric%20Conditions_Nov%202023%20version.pdf

Yang, J., et al. (2010). National Library of Medicine | Common SNPs explain a large proportion of heritability for human height. Nature Genetics. https://pmc.ncbi.nlm.nih.gov/articles/PMC3232052/

Global Resilience Federation. (2023). The Leadership Guide to Securing AI. https://static1.squarespace.com/static/60ccb2c6d4292542967cece7/t/64de2fcdedf2a93df1177eea/1692282832064/AI+Balancing+Act_DASDesign+FINAL_digital+Secured.pdf

Global Resilience Federation. (2025). Global Resilience Federation | AI Security. https://www.grf.org/ai-security

HHS. (Revised 2024, March 27). Federal Policy for the Protection of Human Subjects (‘Common Rule’). https://www.hhs.gov/ohrp/regulations-and-policy/regulations/common-rule/index.html

Human Fertilisation & Embryology Authority. (2025, April 15). Embryo testing and treatments for disease. https://www.hfea.gov.uk/treatments/embryo-testing-and-treatments-for-disease

IGSR: International Genome Sample Resource. (2025). How many individuals have been sequenced in IGSR projects. https://www.internationalgenome.org/faq/how-many-individuals-have-been-sequenced-in-igsr-projects-and-how-were-they-selected/

Jan Henric Klau, C. M. (2023, June 26). Frontiers | AI – based multi-PRS models outperform classical single-PRS models. https://www.frontiersin.org/journals/genetics/articles/10.3389/fgene.2023.1217860/full

Chao, K., et al. (n.d.). gnomAD v4.0. https://gnomad.broadinstitute.org/news/2023-11-gnomad-v4-0/

KPMG. (2023, December). KPMG Trusted AI Approach. https://assets.kpmg.com/content/dam/kpmgsites/xx/pdf/2023/12/kpmg-trusted-ai-approach.pdf?v=latest

L. Duncan, H. S. (2019, July 25). Nature Communications | Analysis of polygenic risk score usage and performance in diverse human populations. https://www.nature.com/articles/s41467-019-11112-0

Logan, J. (2022, September 2). Mad in America | Genetic Embryo Screening for Psychiatric Risk Not Supported by Evidence, Ethically Questionable. Mad in America | Science, Psychiatry and Social Justice: https://www.madinamerica.com/2022/09/genetic-screening-ethically-questionable/

Merriam Webster Dictionary. (2025, May 4). Merriam Webster | Dictionary | Morals. https://www.merriam-webster.com/dictionary/morals

NIST. (2019, August 9). NIST | U.S. Leadership in AI: A Plan for Federal Engagement in Developing Technical Standards and Related Tools. https://www.nist.gov/system/files/documents/2019/08/10/ai_standards_fedengagement_plan_9aug2019.pdf

OECD.AI and GPAI. (2025). OECD | Policies, data and analysis for trustworthy artificial intelligence. https://oecd.ai/en/

Sussman, A. L. (2025, April 01). Should Human Life Be Optimized. https://www.nytimes.com/interactive/2025/04/01/opinion/ivf-gene-selection-fertility.html

The White House. (2023, November 01). Federal Register | Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence | EO 14110. https://www.federalregister.gov/documents/2023/11/01/2023-24283/safe-secure-and-trustworthy-development-and-use-of-artificial-intelligence

The White House. (2025, 01 31). Federal Register | Removing Barriers to American Leadership in Artificial Intelligence | EO 14179. https://www.federalregister.gov/documents/2025/01/31/2025-02172/removing-barriers-to-american-leadership-in-artificial-intelligence

Beauchamp, T. L., & Childress, J. F. (2009, 2013). Principles of Biomedical Ethics. Retrieved from Internet Archive: https://archive.org/details/principlesofbiom0000beau_k8c1/page/n5/mode/2up

Non-Human Identity (NHI) and PLCs

Introduction: PLCs in Everyday Automation

When people consider robots, many think of the humanoid replacements seen on TV, with digitized faces and funny walking gaits. Most don't consider how robotics are already part of nearly every aspect of our everyday lives. The backbone of control for much of our infrastructure, critical systems such as water, transportation, and turbines, as well as non-critical systems such as automated warehouses and similar robotics applications, is the humble Programmable Logic Controller (PLC).

To illustrate this, consider something as simple as buying a bottle of water at your local market. That bottle has traveled through multiple automated networks before reaching the shelf. It was likely stored in an automated warehouse, where PLCs controlled its journey from storage to a truck, merging with other beverage products into a perfectly stacked, mixed pallet. This entire process is orchestrated by a network of PLCs ensuring seamless movement.

PLCs are the backbone of millions of automation services; to explore their ubiquity and importance, we will follow the PLC through an automated warehouse to understand how something so common can be so critical.

What does a PLC do?

As mentioned above, a PLC is a Programmable Logic Controller, but that isn't enough for the lay person to understand what a PLC does. A PLC is, in fact, a computer used for industrial automation.

It is designed to repeat a set operation or process over and over again, while collecting and processing vital data from connected systems such as sensors, SCADA (Supervisory Control and Data Acquisition) systems, and HMIs (Human Machine Interfaces).

Based on this input, the PLC determines the appropriate response, such as activating motors for conveyors, lifts, and other components within an automated warehouse. The PLC itself operates in a repeating cycle: it reads the state of everything it is connected to, executes its repeatable logic, and writes outputs back to those connected devices. In our example, that means turning the motors for conveyors, lifts, and other parts of an automated warehouse system on or off, while also broadcasting status to the larger system.1
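That read-evaluate-write loop is often called the scan cycle. Below is a minimal sketch of it in Python; the sensor and motor names are hypothetical, and a real PLC would run ladder logic or structured text against physical I/O rather than Python.

```python
import time

# Minimal PLC scan-cycle sketch. Sensor and motor names are hypothetical;
# a real PLC reads/writes physical terminals, not Python dictionaries.

inputs = {"bay_photo_eye": False, "conveyor_jam": False}   # read from sensors
outputs = {"release_motor": False, "alarm": False}         # written to devices

def read_inputs() -> None:
    """Stand-in for sampling the physical input terminals."""
    pass  # hardware access would go here

def execute_logic() -> None:
    """The repeatable control logic: run the motor only when safe."""
    outputs["release_motor"] = inputs["bay_photo_eye"] and not inputs["conveyor_jam"]
    outputs["alarm"] = inputs["conveyor_jam"]

def write_outputs() -> None:
    """Stand-in for energizing outputs and broadcasting status upstream."""
    pass  # hardware access / status broadcast would go here

while True:           # the scan cycle, repeated for as long as the PLC runs
    read_inputs()
    execute_logic()
    write_outputs()
    time.sleep(0.01)  # real scan times are typically a few milliseconds
```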

PLC in its “Natural Environment”

Before diving into how PLCs identify and communicate, let's explore an environment in which they operate. Consider an automated warehouse as an amusement park ride for your bottle of water. Products start at different bays and ride conveyors that climb, twist, merge, and more, all to end at the loading dock; PLCs manage every transition.

Automated warehouses function as massive logistical hubs, moving thousands of products daily through an intricate system of conveyors, sorters, and palletizers. These warehouses are structured into zones, each managed by dedicated PLCs that control specific, individually identified (NHI) conveyor types, motors, actuators, etc. Here are some human-readable examples of identifiers:

  1. PLC: (PLC001, PLC002, etc.) – Each PLC must be identified and identifiable for communication and control to happen.
  2. Bay: (B001, B002, etc.) – where the product waits to be “picked” and released by the PLC onto a conveyor.
  3. Release Conveyor: (RC001, RC002, etc.) – moves products out of storage.
  4. Merge Conveyor: (MC001, MC002, etc.) – where multiple conveyors merge into a single conveyor, monitored by sensor controllers (e.g., Raspberry Pi devices).
  5. Divert Conveyor: (D001, D002, etc.) – where boxes on a conveyor are split onto two or more conveyors.
  6. Sequence Check: (SC001, SC002, etc.) – found at intersections to verify proper order.
  7. Palletizer Merge: we will call the conveyor that delivers product to the palletizer zone “PM”. Within the palletizer there are multiple zones of its own.
  8. Palletization & Patterns – different automated systems have different palletization or loading patterns. (We will not go into the specifics of any individual warehouse’s pattern system.)

At each stage, PLCs ensure that products follow the correct path, communicating with sensors and using their programmed logic to maintain order. At many PLC locations, there may be another controller working in tandem: a sensor controller, often a Raspberry Pi. This little card-sized controller is used specifically to capture data from sensors, lasers, and the like, identifying a container as it rolls through a sequence check. This device must have its own identification and be able to communicate with, and be recognized by, the PLC. Now that we have a high-level overview of the warehouse flow, we can explore how PLCs identify and communicate with each other within this environment.

PLC Identity and Non-Human Identity (NHI)

For PLCs to function in an automated system, they must be able to recognize and communicate with all the devices around them – motors, actuators, and even secondary controllers like the Raspberry Pi. To do this, they rely on a set of unique identifiers known as Non-Human Identity (NHI). These identifiers allow PLCs to track and communicate with every connected device in real time, enabling the automation of operations.

Some of the key NHI mechanisms used in Industrial Automation include:

  • IP or MAC Addresses – Common in modern Ethernet-based networks.
  • Industrial Protocols – Such as Ethernet/IP, Modbus TCP/IP, and Profinet.
  • Legacy Network Identifiers – Older systems use Profibus, CANopen, and DeviceNet, which assign Node IDs instead of IP addresses, enabling PLCs to communicate with different machines.
  • Memory Addresses & Tags – PLCs store references to connected devices, ensuring recognition even after hardware replacements.
  • Routing Tables & Network Maps – Define communication pathways in complex systems.
  • Raspberry Pi Running Node-RED – fetches data from an Allen-Bradley PLC using Modbus TCP/IP as a quick SCADA alternative, and can in some instances form a sub-network within a PLC network.

In the warehouse, these identifiers allow different zones to work in tandem. When a product is released from storage, the release conveyor's PLC communicates with merge and divert PLCs, ensuring proper sequencing for palletization. If anything goes wrong, like a PLC not recognizing a product's assigned path, it will trigger a fault, forcing human workers to intervene and correct it. Even a single miscommunication can create delays that ripple through the entire warehouse.
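For a concrete sense of what such an exchange looks like on the wire, here is a minimal polling sketch in the style of the Raspberry Pi sensor controller described above. It assumes the pymodbus library (3.x); the PLC address, unit ID, and register numbers are invented, and keyword names (e.g. slave=) vary slightly between pymodbus versions.

```python
# Polling a PLC over Modbus TCP, assuming pymodbus 3.x is installed.
# The IP address, unit/slave id, and register numbers are hypothetical.
from pymodbus.client import ModbusTcpClient

client = ModbusTcpClient("192.168.10.21", port=502)  # hypothetical PLC address
client.connect()

# Read two holding registers, e.g. a zone status word and a box counter.
result = client.read_holding_registers(address=0, count=2, slave=1)
if not result.isError():
    zone_status, box_count = result.registers
    print(f"zone status: {zone_status}, boxes seen: {box_count}")

# Note what is absent: no authentication, no encryption. Anyone who can
# reach this port can issue the same reads (and writes).
client.close()
```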

Mixed-Age Systems and Heartbeat Identification

Industrial automation systems change, adapt, and evolve over time. As facilities upgrade, they often end up with mixed-age systems, where legacy PLCs must coexist with modern networked controllers and machines2. In such environments, older PLCs often rely on heartbeat signals: simple, periodic pings that confirm a device is online. If a heartbeat is lost, the system assumes failure and may trigger emergency shutdowns.

While this mechanism ensures safety, it also presents a risk: heartbeat ID spoofing could allow an unauthorized device to mimic a PLC's presence, potentially disrupting warehouse operations. (We'll discuss this in more depth below.)
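As a sketch of why this is fragile: a heartbeat is often little more than a periodic datagram carrying a device ID, so anything on the segment that can send that ID can impersonate the device. The port and ID format below are invented for illustration.

```python
import socket
import time

# Toy heartbeat sender, illustrating why unauthenticated heartbeats are
# spoofable: the "identity" is just a string in a UDP datagram. The port
# and device ID format are invented for illustration.
HEARTBEAT_PORT = 15000        # hypothetical
DEVICE_ID = b"PLC007"         # hypothetical legacy PLC identifier

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)

while True:
    # Nothing here proves the sender really is PLC007: no key, no signature,
    # no challenge-response. Any host on the segment can send the same bytes.
    sock.sendto(DEVICE_ID, ("255.255.255.255", HEARTBEAT_PORT))
    time.sleep(1.0)
```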

Multiple PLC Networks and Leader-Follower Configurations

In automated warehouses, PLCs do not operate in isolation; they are part of segmented networks. To manage complexity, PLCs are often grouped into leader-follower configurations, where a leader PLC oversees several subordinate controllers. This structure:

  • Reduces network congestion by centralizing decision-making.
  • Ensures coordinated actions across multiple warehouse zones.
  • Helps isolate faults—if a follower PLC fails, the leader can reroute operations or trigger alerts.

When a PLC broadcasts an error, its effects ripple through the system. The leader PLC broadcasts the error, and upstream PLCs must determine whether it will impact their operations. If an issue is detected, they halt to prevent further errors. Meanwhile, downstream PLCs continue running until a sequence check further along the process detects a missing product. At that point, the downstream PLC registers the failure and alerts its own upstream systems, triggering a secondary shutdown.

This cascading effect can halt sections of the warehouse or, in extreme cases, bring the entire facility to a standstill if warehouse staff and Controls Engineers do not quickly identify and resolve the originating PLC failure.

For example, consider a merge-zone PLC detecting a sequencing error. The PLC immediately notifies its leader PLC, which then signals upstream systems to pause product flow. By stopping movement before the issue spreads further, the system minimizes disruption and reduces downtime.
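A toy model of that upstream-halt behavior, with hypothetical zone names and topology, might look like this:

```python
# Toy cascade sketch: when a zone faults, every upstream zone halts.
# Zone names and the upstream topology are hypothetical.
upstream_of = {
    "palletizer": ["merge"],
    "merge": ["release_a", "release_b"],
    "release_a": [],
    "release_b": [],
}

def halt_upstream(faulted_zone: str, halted: set) -> None:
    """Recursively signal every zone feeding the faulted one to pause."""
    for zone in upstream_of.get(faulted_zone, []):
        if zone not in halted:
            halted.add(zone)
            print(f"{zone}: pausing product flow (fault downstream)")
            halt_upstream(zone, halted)

halted = set()
print("merge: sequencing fault detected, notifying leader")
halt_upstream("merge", halted)  # release_a and release_b pause; downstream zones keep running
```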

The Interconnected Nature of PLCs

The ability of PLCs to recognize and communicate with each other and partner systems is what keeps an automated system running smoothly. But as warehouses grow more complex, integrating mixed-age networks, external controllers, and industrial IoT devices, the question of identity becomes just as important as function. Without strong Non-Human Identity (NHI) mechanisms, PLCs cannot securely authenticate the machines they interact with, leaving gaps for errors and exploitation.

In the next section, we will explore some of the mechanisms PLCs use to establish identity. From IP/MAC addressing to legacy network identifiers, each method plays a role in ensuring that every PLC, sensor, and actuator knows its place in the system. These identities and identity methods allow PLCs to interact reliably, but they come with limitations and challenges.

Key Non-Human Identity Methods in Automated Warehouses

We continue by exploring some of the top uses and vulnerabilities of Non-Human Identity in automated warehouses, and how they relate specifically to the PLC.

IP or MAC Address-Based Identification

When properly set up, PLCs rely on IP or MAC addresses for network communication and identification. In most warehouse environments, leader PLCs may use multiple identifiers for redundancy and protection, while subordinate PLCs may be identified by their MAC address for simplicity.

While MAC spoofing doesn't get much news coverage, it does happen; a 2016 MAC spoofing attack cost millions of dollars3. In an industrial setting, even if a malicious actor is successfully blocked from traveling laterally or upstream by network segmentation, we have seen how a single PLC error can cascade and affect the whole system. Strong segmentation may not be enough to prevent disruptions.

And recall, the PLC is not only communicating with other PLCs, but with actuators and other devices such as sensors and interfaces. Without a good, regularly maintained inventory of all connected devices, the impact of an identity failure can cascade across the entire system.

Industry Protocols (Ethernet/IP, Modbus TCP/IP, Profinet, etc.)

Industrial control systems were originally designed with isolation in mind, and isolation was considered secure when Operational Technology (OT) networks were separate from IT infrastructure. However, as automation environments have become more interconnected, these once-closed networks are now discoverable, and they face security risks they were never designed for.

Many industry-standard protocols, including Ethernet/IP, Modbus TCP/IP, and Profinet, were developed assuming that the network was closed and secure. These protocols were designed without encryption or authentication mechanisms4,5, making them inherently insecure for communication over modern networks.

This introduces a path to access and capture MAC addresses, verification protocols, and other operational information, widening the door for attack. There are security add-ons, but the core issue remains: these protocols were not designed with cybersecurity in mind, leaving critical systems vulnerable.

Legacy Network Identifiers

Recall the "mixed-age systems" discussed above? Older PLCs may not be fully compatible with newer PLCs, even within the same brand. When a facility upgrades, it is very unlikely to replace all existing hardware and components; instead, legacy products that still work remain, sometimes segmented by network, or even using "heartbeat" protocols, where an older PLC broadcasts a heartbeat (ping) as proof of life.

The problem: this heartbeat/follower protocol lacks any NHI identifiers at all, and opens another avenue for entry and disruption. Combined with an unencrypted network protocol, a threat actor may be able to map older network segments, identify vulnerable devices, and make plans accordingly.

Claroty Team82 has demonstrated the risk in multiple ways; one of the most interesting involves leveraging a legacy PLC to reach the SCADA systems. The fastest way to achieve this? Trigger a fault in a legacy PLC so that an engineer connects with SCADA or an HMI to investigate, at which point the attacker can pivot into the engineer's SCADA environment and beyond6. If older devices are using a bare heartbeat as their identifier, the bar for that initial access is pretty low.

Protect the Non-Human Identity – Protect the System

By now, it should be clear just how deeply PLCs are embedded in modern life. They don't just move your bottle of water from storage to shipment; they quietly control much of the world's infrastructure, from manufacturing and logistics to water plants and critical utilities.

A warehouse shutdown is inconvenient, but what happens when a PLC error does more than stop operations? What if instead of halting a system, it mistakenly activates equipment? What if a disrupted PLC logic sequence sends the wrong command at the wrong time?

Can you imagine an entire pallet of water falling seven stories from a warehouse bay? Who was working there at the time, and how were they affected? Now take that same failure and apply it to a water treatment plant. What happens when a gate controlling chemical flow opens too early, or too late?

Non-Human Identity in industrial automation is established through control systems, MAC and IP addresses, industrial protocols, and authentication mechanisms that help machines communicate with their intended counterparts. As automation networks grow more complex and interconnected, protecting these identity structures becomes critical. If a PLC’s identity is spoofed, or compromised, the consequences could ripple far beyond a single warehouse, impacting safety, security, and infrastructure at a much larger scale.


  1. DO Supply, Explaining HMI, SCADA, and PLCs, What They Do, and How They Work Together ↩︎
  2. The Robot Report, Automated Warehouse, Overcoming Common Software Implementation Challenges (p. 8) ↩︎
  3. Secure W2, MAC Spoofing Attacks Explained: A Technical Overview ↩︎
  4. Veridify Security, OT Security: Cybersecurity for Modbus ↩︎
  5. ODVA, Overview of CIP Security ↩︎
  6. Claroty Team82, Evil PLC Attack: Using a Controller as Predator Rather than Prey ↩︎

References

Allen-Bradley. (2005). Rockwell Automation | Literature | Documents | ag-um008. Retrieved from Rockwell Automation: https://literature.rockwellautomation.com/idc/groups/literature/documents/um/ag-um008_-en-p.pdf

DO Supply. (2019, February 4). Explaining HMI, SCADA, and PLCs, What They Do, and How They Work Together. Retrieved from DO Supply: https://www.dosupply.com/tech/2019/02/04/explaining-hmi-scada-and-plcs-what-they-do-and-how-they-work-together/

Hughes, C. (2025, February 20). Understanding OWASP's Top 10 List of non-human identity critical risks. Retrieved from CSO: https://www.csoonline.com/article/3828216/understanding-owasps-top-10-list-of-non-human-identity-critical-risks.html

ODVA. (n.d.). ODVA | Technology Standards | Distinct CIP Services. Retrieved from ODVA: https://www.odva.org/wp-content/uploads/2023/07/PUB00319R2_CIP-Security-At-a-Glance.pdf

Panduit. (2022, October). Panduit | Markets | Documents | Infrastructure Warehouse Automation. Retrieved from Panduit: https://www.panduit.com/content/dam/panduit/en/website/solutions/markets/documents/infrastructure-warehouse-automation-cpcb261.pdf

OWASP. (2025). OWASP Non-Human Identities Top 10. Retrieved from OWASP: https://owasp.org/www-project-non-human-identities-top-10/2025/

Rockwell Automation. (2024, June). Rockwell Automation | Literature | PlantPAx Distributed Control System Configuration and Implementation. Retrieved from Rockwell Automation: https://literature.rockwellautomation.com/idc/groups/literature/documents/um/proces-um100_-en-p.pdf

Secure W2. (2025). MAC Spoofing Attacks Explained: A Technical Overview. Retrieved from Secure W2: https://www.securew2.com/blog/how-do-mac-spoofing-attacks-work

Sharon Brizinov, M. S. (2022, August 13). Claroty | Team82 | Evil PLC Attack: Using a Controller as Predator Rather than Prey. Retrieved from Claroty: https://claroty.com/team82/research/evil-plc-attack-using-a-controller-as-predator-rather-than-prey

Tecsys. (2024). infohub Tecsys | Resources | e-book | Improving Warehouse Operations with Low Code Application Platforms. Retrieved from Tecsys: https://infohub.tecsys.com/resources/e-book/improving-warehouse-operations-with-low-code-application-platforms

The Robot Report. (2024). Automated Warehouse | Overcoming Common Software Implementation Challenges. WTWH Media LLC.

Veridify Security. (n.d.). OT Security: Cybersecurity for Modbus. Retrieved from Veridify Security: https://www.veridify.com/ot-security-cybersecurity-for-modbus/

Blog Sample – Serverless

A sample of technical writing via Blog.

What is Serverless – in Layman's Terms

"Serverless application" is an interesting name that really has less to do with the application, and more to do with the technology hosting and storing it. Serverless applications do make use of servers; it's just that they use them differently than in the past.

If you consider an application to be a product, activity, or service, you can in turn also think of the server as the house in which that product, activity, or service is homed. In traditional server systems, that house is static, probably like your house, or mine.

In the current "serverless" system, you can have that same product, activity, or service, but the house can change as needs grow or shrink, like adding a room when you need more space, or renting a room out when the space is not being used.

Serverless technology has benefits for both the server hub and the producer of the application. Applications using serverless architecture pay for services only when actively using them, as in executing a process.

Let’s Take a More Technical Look

The most well-known and understood advantage and selling point of serverless computing is that it economizes the use of cloud resources. Serverless providers charge only for the time that code is executing, maximizing function and profitability for both the provider and the developer. Interestingly, serverless has also increased stability, thanks to services/instances being spun up as needed and redundancy being built into the system.

The number of applications and services that have moved to serverless is a testament to its economical use and function.

Additional interesting strengths are even greater cost reductions when multiple applications share common components, and ease in defining workflows.

Current thinking on defining and describing serverless includes calling it Event Driven, or a Function as a Service (FaaS) protocol. Serverless architecture is best utilized to process events, or discrete chunks of data generated as a time series.

How it Works

Data arrives at the application (via a human or an endpoint), and the architecture incorporates an API gateway that accepts the data and determines which serverless component receives it.

Regardless of which host is used for the application's serverless architecture, the runtime environment passes the data to the component, where it is processed and then either returned to the gateway for further processing by other runtime functions, or returned to the user completed.

  1. Application Development
    • Developers write code, and deploy to the cloud provider.
  2. Cloud Host
    • Application Code is hosted by the cloud provider, and homed in a fleet of servers.
  3. Application Use
    • Requests are made to execute the Application code.
    • The cloud provider creates a new container to run the code in.
    • The container is deleted when the execution has been completed
      • Usually after a time period of inactivity
[Figure: How Serverless Works – simple flow diagram]
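As a concrete sketch of step 1, here is roughly the smallest deployable unit in this model, an AWS Lambda-style Python handler. The event field names are hypothetical; a real event's shape depends on the trigger (API gateway, queue, etc.).

```python
import json

# Minimal AWS Lambda-style handler sketch. The cloud provider creates a
# container, invokes this function per request, and may delete the
# container after a period of inactivity.
def lambda_handler(event, context):
    name = event.get("name", "world")  # hypothetical event field
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"hello, {name}"}),
    }

# Local usage example (no cloud required):
if __name__ == "__main__":
    print(lambda_handler({"name": "serverless"}, None))
```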

Considerations

It's important to keep in mind that serverless systems are not intended to become the complete application. Successful use of serverless requires a separation of data input from computing actions. This separation will affect all stages of development and testing.

Timed out

One challenge is that serverless isn't as successful with longer computation times. For example, if processing takes too long, the serverless function can be stopped and require a cold start; it simply may not work for that longer time period. There are workarounds for this, but they can be problematic. One fix is to break the work into lots of little computations that, taken apart, are fast enough to work well in a serverless environment; but the amount of coding time and rebuilding required of developers can be prohibitive.

Serverless is Stateless (lack of persistence & its impact)

Another consideration is that serverless functions are stateless: individual functions accept input, process that input, and output a result. By design, there is no local or persistent storage.

The lack of persistence has impacts in both development and testing. For example, developers of data processing applications often want to temporarily persist data that may be needed a few steps later, and testing can depend on maintaining state from one step to the next in a workflow, where the results of previous operations serve as input to subsequent steps.

It becomes challenging to test more than one function at a time, and replicating a serverless system to test a process that uses multiple functions is not always possible.

The most common approach is to break development and tests into even smaller processes. This requires a heavy lift at the beginning to transform the workflow, as well as rethinking development and test coverage in terms of micro units rather than full processes.

Some testers and developers have resorted to ad hoc methods of persisting data, such as creating and writing files to a cloud database. This can make an application more difficult to maintain, and can have security impacts depending on the platform/product/material being stored.
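As an illustration of that pattern, here is a hedged sketch of two stateless functions sharing intermediate state through an external object store. The boto3 calls are the standard AWS SDK for Python, but the bucket and key names are invented and would need to exist in a real account.

```python
import json
import boto3

# Sketch: stateless functions persisting intermediate state externally.
# The bucket/key names are invented for illustration.
s3 = boto3.client("s3")
BUCKET = "example-pipeline-state"  # hypothetical bucket

def step_one(event, context):
    partial = {"order_id": event["order_id"], "subtotal": 42.0}
    # Persist the intermediate result so a later, separate invocation can use it.
    s3.put_object(
        Bucket=BUCKET,
        Key=f"orders/{event['order_id']}.json",
        Body=json.dumps(partial),
    )
    return {"statusCode": 202}

def step_two(event, context):
    # A different invocation, possibly minutes later, reloads that state.
    obj = s3.get_object(Bucket=BUCKET, Key=f"orders/{event['order_id']}.json")
    partial = json.loads(obj["Body"].read())
    return {"statusCode": 200, "total": partial["subtotal"] * 1.08}  # e.g. add tax
```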

Major providers now have documentation, best-practice methods, and workarounds for providing persistence. AWS has introduced Step Functions, Microsoft Azure has Durable Functions and Logic Apps, and there are open-source add-on solutions as well.

Wrap Up

Serverless, or Function as a Service, is one of the greatest transitions in recent computational history and demand. As the cost of moving data becomes more affordable, the relative cost shifts to storage and computation. Serverless architecture is a leap forward here, moving our storage and computation from a static system to a kinetic one, allowing peaks and valleys to be reflected in both costs and savings for providers and consumers. Finding a way to distribute the costs of both storage and functions based on live, active use is a huge leap forward, and we are still at the beginning stages of this.

What's coming for serverless? Things to keep an eye on include security, persistent storage, and data integrity. The global serverless computing market is expected to grow at a compound annual rate of more than 22% between 2024 and 2031.1




Footnote

  1. https://www.skyquestt.com/report/serverless-architecture-market ↩︎

Welcome!

Hello,

Welcome to my sample site.

Features here include some sample documents and training videos.

For now, select the above image.
I captured this image while exploring last summer.

Selecting the image will take you to the blog post, where more menus are available.

Once on the post itself, you will notice a link to Training and Development on the right.
The link takes you to a video; a quick overview of the beginning processes of training and development.

This site has only just begun, so please forgive its sparseness.
I am using orphan pages to practice code, and explore some of the fine points of CSS.

As this site grows, please feel free and comfortable to explore.