How Can Machine Learning Help Agencies Safeguard Supply Chains?

By Erik Ekwurzel

SVP & CTO, Public Sector Solutions

Dun & Bradstreet

July 21, 2021

Machine learning is a powerful tool for robust commercial due diligence.

Since the mid-20th century, machine learning (ML) and its potential impact on artificial intelligence (AI) have fired the imaginations of academics, scientists, and engineers. In the 21st century, computing power is making that impact a reality. Interest in the idea that systems (machines) can identify patterns in data, “learn” from those patterns, and make decisions with minimal human involvement has quickly spread through the private and public sectors, prompting numerous research and development efforts.

Particularly for government, ML is a powerful asset for mitigating risk in the supply chain – a need that’s been heightened by recent executive orders and highly publicized disruptions. During the past three years, Dun & Bradstreet’s government team has worked hard to help agencies map supply chains, conduct due diligence, identify weaknesses, and spot adversarial infiltration among their supplier networks.

To understand how ML works in supply chain discovery, let’s first look at the factors that help shape its output: types of supply chains, types of data, and the steps ML uses to “teach” itself.

What Are the Three Types of Supply Chains?

Through our work, we’ve seen how important it is for agencies to first outline their goal for supply chain discovery. That goal will be significantly shaped by the type of supply chain they want to protect. The first type is also the simplest: the companies that supply a particular organization. For example, consider a large or medium-sized government prime contractor. An agency may want to use machine learning to help understand who all of this organization’s suppliers are, what resilience issues exist in that network, and where adversarial influence may occur.

Using machine learning to identify vulnerabilities within a platform supply chain offers a somewhat different perspective. An example of a platform could be an IT system. As an agency, you might have two or three IT systems of similar nature, and your goal may be to understand overlaps in their supply networks. Machine learning can help you pinpoint when and where those overlaps occur, and what risks they generate.
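
To make the overlap idea concrete, here is a minimal Python sketch. It assumes each system’s supplier list is already known, and the system and supplier names are hypothetical placeholders. It illustrates only the overlap check, not the machine learning that discovers the suppliers.

```python
from itertools import combinations

# Hypothetical supplier lists for three similar IT systems (illustrative only).
platform_suppliers = {
    "system_a": {"Acme Chips", "Borealis Boards", "Cobalt Cables"},
    "system_b": {"Acme Chips", "Delta Displays", "Cobalt Cables"},
    "system_c": {"Echo Enclosures", "Acme Chips", "Foxtrot Firmware"},
}


def pairwise_overlaps(suppliers_by_platform):
    """Return the suppliers shared by each pair of platforms."""
    overlaps = {}
    for a, b in combinations(sorted(suppliers_by_platform), 2):
        overlaps[(a, b)] = suppliers_by_platform[a] & suppliers_by_platform[b]
    return overlaps


def shared_by_all(suppliers_by_platform):
    """Return the suppliers that appear in every platform's network."""
    return set.intersection(*suppliers_by_platform.values())


overlaps = pairwise_overlaps(platform_suppliers)
common = shared_by_all(platform_suppliers)
```

In this toy data, the one supplier common to all three systems is the kind of overlap an agency would want surfaced, because a problem there touches every system at once.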

Industry is the third type of supply chain. An example here could be personal medical equipment. As an agency, perhaps you want to use ML to evaluate all the suppliers involved in creating heart monitors to be sure your country will have continued access.

Each type of supply chain relies on different kinds of data – another key factor in ML results. In our work, ML produced the most beneficial results for organizational supply chain discovery using what Dun & Bradstreet calls “behavioral counterparty data.” Think of that as the information provided from global accounts receivables, from thousands of businesses across 45 countries.

With platform supply chains – recall our IT systems example here – the counterparty data is useful, but we find that behavioral inquiry data is even more useful. Inquiry data records when one counterparty requests information on another counterparty.

Supply chain discovery for industry, like our medical equipment example, usually means starting with open web data to find all the discoverable vending companies that are involved. Once you have that, you can then dig further into the behavioral and counterparty data to understand each vending company’s supply chain.

Why Is the Right Data Important?

After setting a goal for supply chain discovery, agencies need access to reliable, comprehensive data. The Dun & Bradstreet Data Cloud can provide the kinds of data that are helpful for machine learning, such as public data, which comes from business registries, suits, liens, judgments, or UCC filings. Proprietary data is also helpful, and that includes things like maritime shipping, IP addresses, and ultimate beneficial ownership.

Our Data Cloud also includes entrusted data, such as data in accounts receivables and private company financials. Entrusted means that these companies trust Dun & Bradstreet to hold this data securely. Behavioral data is probably the most important data for machine learning and encompasses inquiries (what counterparties are interested in other counterparties), device fingerprints, match audits, and more.

The machine learning that we’ve been using for supply chain discovery also leverages Dun & Bradstreet’s “below the waterline” data. What does that mean? Picture an iceberg partially submerged at sea. The portion above the water represents the data we provide through products and services to government agencies and the private sector. The portion under the water represents data that Dun & Bradstreet is contractually obligated to protect from public exposure, but from which we can develop information derivatives in the form of scores or insights created when this data is aggregated.

How Does Machine Learning “Learn”?

With an understanding of supply chain and data types in mind, let’s now imagine that a government agency issues a contract award for an IT system. That contract award pushes a ripple of activities through the signal and behavioral data systems that Dun & Bradstreet monitors — a ripple started when the prime contractor makes several inquiries on potential subcontractors. Those subcontractors then make inquiries on their subcontractors, and that continues all the way down the line to the raw materials providers.
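
The ripple can be pictured as a walk through a graph of inquiries. The sketch below is a simplified illustration, not Dun & Bradstreet’s method: the company names and inquiry records are hypothetical, and it assumes every inquiry reflects a real supplier relationship, which noisy real-world data would not guarantee.

```python
from collections import deque

# Hypothetical inquiry records: (company making the inquiry, company inquired about).
inquiries = [
    ("Prime Co", "Sub A"),
    ("Prime Co", "Sub B"),
    ("Sub A", "Parts X"),
    ("Sub B", "Parts X"),
    ("Parts X", "Raw Materials Y"),
]


def tiers_from_inquiries(prime, records):
    """Breadth-first walk from the prime; tier = number of inquiry hops from it."""
    graph = {}
    for source, target in records:
        graph.setdefault(source, []).append(target)

    tiers = {prime: 0}
    queue = deque([prime])
    while queue:
        company = queue.popleft()
        for target in graph.get(company, []):
            if target not in tiers:  # first visit is the shallowest tier
                tiers[target] = tiers[company] + 1
                queue.append(target)
    return tiers


tiers = tiers_from_inquiries("Prime Co", inquiries)
```

Here the prime sits at tier 0, its direct subcontractors at tier 1, and the raw materials provider at tier 3.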

The activities associated with the contract award don’t occur in a vacuum, nor do they get amplified. Because the government makes purchases continually, as does private industry, the contract award ripple occurs within a tremendous amount of noise. So how can we teach machine learning to filter the signals that are important for just the platform (IT system) supply chain in our award event? What needs to happen?

Step 1: Augment customer-provided seed data to identify an initial group of suppliers.
What this means is finding a way to prime the machine learning pump. At this step, you’re gathering all the publicly exposed evidence you can of actual suppliers in the supply chain. Usually you can come up with between 25 and 75 prime contractors and tier-one suppliers, and perhaps one or two tier-two suppliers; trying to go any deeper at this point is difficult, as the data becomes opaque.
Step 2: Identify and confirm additional suppliers.
In step two, you feed that data into the machine learning model and “turn the crank.” Looking at all the behavioral data signals, the ML model will output more suppliers that it thinks are involved with the IT system from our contract award example.
Step 3: Calculate the strength of identified connections.
Finding more suppliers that are tied to the IT system is helpful, but an agency will also want to understand the strength of all those connections. ML can help by combining multiple data points to estimate that strength.
Step 4: Separate positives from false positives to continuously train the machine learning after each cycle.
There is no “easy button,” so in the final step, researchers review the ML model output to determine which results are positives and which are false positives. Then we turn the crank again for each tier until we see a clearer picture of the supply chain. That’s where the ML finishes supply chain discovery.
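
The four steps above can be sketched as a loop. This is a toy illustration under stated assumptions, not the actual model: a simple inquiry count stands in for the ML model and for connection strength, a callback stands in for the human researcher, and every name, threshold, and record below is hypothetical.

```python
from collections import Counter


def discover_suppliers(seed, inquiries, review, min_score=2, max_rounds=5):
    """Iteratively expand a confirmed supplier set.

    seed      -- step 1: suppliers known from customer and public data
    inquiries -- (inquirer, subject) pairs standing in for behavioral signals
    review    -- callable standing in for the human researcher (step 4)
    """
    confirmed, rejected = set(seed), set()
    strength = {}
    for _ in range(max_rounds):
        # Step 2: propose candidates that confirmed suppliers inquired about.
        scores = Counter(
            subject
            for inquirer, subject in inquiries
            if inquirer in confirmed
            and subject not in confirmed
            and subject not in rejected
        )
        # Step 3: treat the inquiry count as a crude connection strength.
        candidates = {c: n for c, n in scores.items() if n >= min_score}
        if not candidates:
            break
        # Step 4: the reviewer separates positives from false positives.
        for candidate, score in candidates.items():
            if review(candidate):
                confirmed.add(candidate)
                strength[candidate] = score
            else:
                rejected.add(candidate)
    return confirmed, rejected, strength


# Hypothetical signals: the caterer is noise that a reviewer would reject.
inquiries = [
    ("Prime Co", "Sub A"), ("Prime Co", "Sub A"),
    ("Prime Co", "Caterer Z"), ("Prime Co", "Caterer Z"),
    ("Prime Co", "Sub B"),
    ("Sub A", "Parts X"), ("Sub A", "Parts X"), ("Sub A", "Parts X"),
]


def review(name):
    """Stand-in for manual research; rejects the known false positive."""
    return name != "Caterer Z"


confirmed, rejected, strength = discover_suppliers({"Prime Co"}, inquiries, review)
```

Each pass of the loop is one turn of the crank. Rejected candidates are remembered so they are not proposed again, which loosely mirrors how reviewed false positives feed back into training after each cycle.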

How Does ML Continuously Monitor Resilience Weakness and Adversarial Influence?

To help illustrate how ML detects and continuously monitors weaknesses in resilience and possible adversarial influence, let’s add more detail to our imagined IT system contract award. Pretend our contract is for printers, and that we are using ML to better understand the suppliers that help build them.

Starting with step 1, we discover – through publicly available data – 51 tier one suppliers, and a handful of tier two suppliers. We feed that data into the ML model, turn the crank, and examine tiers two, three, and four. So far, we’ve been able to discover 177 confirmed participants in the supply chain, plus another 111 potentials. Moving a potential to a confirmed requires much deeper research, but we’ll stop here for our illustration.

Within that confirmed data, we can see foreign-owned suppliers for various printer components. Based on the countries in which these suppliers are located, we can start to identify where adversarial influence may exist in the supply chain.

To help ensure resilience, let’s pretend that the agency from our example decides to order three separate IT systems that do essentially the same thing from three major prime contractors. From the agency’s perspective, spreading orders across three contractors may feel like resilience. Let’s say that, at the outset, all three prime contractors have relatively independent supply chains, achieving resilience for government. However, over time, ML picks up that changes in subcontractors have narrowed the field to eight tier-one suppliers shared among all three primes.

So while our agency is mitigating its resilience risk across three prime contractors, there may, over time, be greater risks in the tier-one level below that. If one of those eight tier-one subcontractors experiences issues, then two or three of the prime contractors are likely to experience resilience issues in their supply chains.
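
A simple way to see this kind of concentration is to count how many primes depend on each tier-one supplier. The sketch below uses hypothetical prime and supplier names and assumes the tier-one lists have already been discovered; it shows only the counting step.

```python
from collections import Counter

# Hypothetical tier-one suppliers for three prime contractors.
tier_one = {
    "Prime 1": {"Supplier A", "Supplier B", "Supplier C"},
    "Prime 2": {"Supplier A", "Supplier B", "Supplier D"},
    "Prime 3": {"Supplier A", "Supplier E", "Supplier D"},
}


def shared_dependencies(suppliers_by_prime, min_primes=2):
    """Map each supplier used by at least `min_primes` primes to that count."""
    counts = Counter(
        supplier
        for suppliers in suppliers_by_prime.values()
        for supplier in suppliers
    )
    return {s: n for s, n in counts.items() if n >= min_primes}


exposure = shared_dependencies(tier_one)
```

Any supplier with a count of two or three is a shared point of failure: trouble there would reach more than one prime, even though the agency spread its orders across all three.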

Use Machine Learning as Part of Your Toolbox

Machine learning analysis is a valuable asset for supply chain risk management. It’s an efficient way to help agencies quickly gain a sharper view of the supplier landscape and most of the players in it, especially where it’s not easy to ask the suppliers to reveal themselves or to examine suppliers for every piece of a deliverable. (An example could be the companies that supply each component of a single microchip, which is then used within a more complicated technology involving many other suppliers, such as a smartphone.)

At Dun & Bradstreet, we continue to use ML to help agencies map supply chains, monitor supply chains, spot potential resilience issues, and guard against prohibited equipment, systems, and subsidiaries. Learn more about our public sector solutions.

The information provided is a suggestion only and is based on best practices. Dun & Bradstreet is not liable for the outcome or results of specific programs or tactics. Please contact an attorney or tax professional if you need legal or tax advice.