Dr. Raphael Nagel (LL.M.), Founding Partner Tactical Management, on Proprietary Data as AI Competitive Advantage
Dr. Raphael Nagel (LL.M.), Founding Partner, Tactical Management
From the work · ALGORITHMUS

Proprietary Data as AI Competitive Advantage: Why Domain Data Beats Capital in the AI Era

Proprietary Data as AI Competitive Advantage is the strategic position in which decades of domain-specific operational data, combined with algorithmic competence, create an AI capability no hyperscaler can replicate. Dr. Raphael Nagel (LL.M.) argues in ALGORITHMUS that domain data, not capital scale, defines defensible positions for European industrial champions.

Proprietary Data as AI Competitive Advantage is the structural market position created when a company owns exclusive, high-quality, domain-specific operational data and possesses the algorithmic competence to transform that data into decision intelligence competitors cannot replicate. The advantage is not the volume of data but its specificity: sensor readings from 100,000 installed machines, thirty years of clinical trial results, two decades of route-optimization records. In ALGORITHMUS, Dr. Raphael Nagel (LL.M.) frames this as the new refinery economics of the AI era, where owning both the crude, meaning domain data, and the refinery, meaning applied AI capability, produces moats that capital alone cannot buy.

Why Proprietary Data Has Become the AI Era’s True Scarcity

Proprietary domain data has replaced raw data volume as the scarce input in AI value creation. Dr. Raphael Nagel (LL.M.) argues in ALGORITHMUS that the phrase data is the new oil is misleading: what creates defensible advantage is not data mass but the combination of exclusive domain data and the algorithmic competence to refine it into decision intelligence.

The empirical record supports this thesis. Google held more data than any firm in 2010, yet Microsoft, Amazon, Meta and hundreds of startups still built successful digital businesses. Netflix owned more viewing data than every film studio combined, and the studios retained relevant market positions. Bloomberg held more financial data than the entire hedge fund industry, yet quantitative funds like Renaissance Technologies, whose Medallion Fund generated a 66 percent average annual return from 1988 to 2018, won through superior modeling, not superior data access.

The strategic reframing matters for European decision-makers. Foundation model competition is essentially closed: training GPT-4 cost between 63 and 100 million dollars, and Epoch AI projects that the next frontier model will cost over one billion dollars. Domain data, by contrast, cannot be purchased at any price if the competitor never generated it. This is where European industrial firms hold structural positions that American hyperscalers cannot breach.

The Refinery Metaphor: Why Data Without Algorithmic Competence Is Worthless

Owning proprietary data without algorithmic competence is owning crude without a refinery. Dr. Raphael Nagel (LL.M.) makes this point explicitly in ALGORITHMUS: the competitive moat forms only when domain data and applied AI capability are combined under the same governance. A Mittelstand firm with thirty years of sensor data but no modeling team holds a potential asset, not a realized one.

The refinery must be actively maintained. Data requires continuous updating, models require retraining as operating conditions drift, and domain knowledge must be codified and integrated into the system. MLOps, the operational discipline for machine learning, becomes the functional equivalent of refinery operations: without it, models degrade silently as the world they were trained on changes.
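The silent degradation described above can be made concrete. Below is a minimal sketch of a drift check that could trigger retraining: it tests whether live sensor readings have shifted materially from the distribution the model was trained on. The function name, thresholds and data are illustrative assumptions, not part of ALGORITHMUS or any specific MLOps product.

```python
import statistics

def needs_retraining(train_window, live_window, z_threshold=3.0):
    """Flag silent model drift: has the live feature distribution
    shifted materially from the training-time distribution?"""
    mu = statistics.mean(train_window)
    sigma = statistics.stdev(train_window)
    live_mu = statistics.mean(live_window)
    # z-score of the live-window mean against the training distribution
    z = abs(live_mu - mu) / (sigma / len(live_window) ** 0.5)
    return z > z_threshold

# Stable sensor readings close to training conditions: no retrain signal
baseline = [20.0, 20.5, 19.8, 20.2, 20.1, 19.9, 20.3, 20.0]
print(needs_retraining(baseline, [20.1, 20.0, 20.2, 19.9]))  # False
# Drifted operating conditions: retrain signal
print(needs_retraining(baseline, [23.5, 23.8, 23.6, 23.9]))  # True
```

In production such a check would run per feature and per model on a schedule; the point here is only that "refinery operations" reduce to concrete, automatable monitoring logic.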

The corollary is sharp. Firms that have data but outsource all modeling to generic foundation model providers effectively sell their crude to someone else’s refinery. That decision may be rational for non-core processes, but for capabilities that define competitive position it amounts to licensing away the most defensible asset a company owns. Tactical Management frequently encounters portfolio candidates who have not yet recognized that their process data is a balance sheet item.

Siemens Xcelerator and the Industrial Mittelstand Playbook

Siemens Xcelerator is the clearest working example of proprietary data converted into an AI competitive advantage. The platform uses decades of machine operating data from hundreds of thousands of installed Siemens assets worldwide to train predictive maintenance, process optimization and fault diagnosis models. No general industrial foundation model can replicate this corpus because no one else generated it.

TRUMPF follows the same logic with its smart-factory platform, using laser-technology operating data accumulated over decades. Bosch Connected Industry and KUKA Robotics pursue analogous strategies. What these firms share is recognition that the data produced by their installed base is not exhaust but asset. Bosch reported billions of connected devices producing telemetry that, properly structured, becomes a training corpus competitors cannot assemble.

For the broader Mittelstand, the playbook has three moves. First, audit what proprietary data exists: machine sensor data, transaction records, customer interaction logs, process documentation. Second, transform product sales into service-plus-model offerings, where the model learns from the installed base and each new customer strengthens the moat. Third, integrate with European sovereign infrastructure where possible, because the US CLOUD Act of 2018 exposes data held by American hyperscalers to extraterritorial access requests regardless of the data’s physical location.

Capital Versus Data: The European Strategic Answer

Capital asymmetry cannot be closed, but data asymmetry can be exploited. American AI startups received more than 50 billion dollars in venture capital in 2023 according to PitchBook data cited in ALGORITHMUS, while European counterparts received roughly six billion euros. No European policy intervention will close that gap in a relevant timeframe.

What European firms can do is refuse to compete on the terrain where capital dominates and concentrate on the terrain where domain data dominates. Foundation model development at frontier scale is closed. Vertically specialized applications built on proprietary operational data are wide open. Veeva Systems reached 2.3 billion dollars in revenue in 2023 with over 35 percent EBITDA margin by building pharma-specific software that Salesforce and Microsoft could not displace, because Veeva understood pharmaceutical regulatory workflows better than any general CRM.

Dr. Raphael Nagel (LL.M.), Founding Partner of Tactical Management, treats this as the primary value creation thesis for European industrial private equity over the coming decade. A Mittelstand target with 100 million euros revenue and 10 percent EBITDA margin can realistically reach 15 to 18 percent through systematic AI integration on proprietary data, producing 40 to 64 million euros of enterprise value at an eight-times exit multiple on a one-to-three million euro integration cost.
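The arithmetic behind this value-creation thesis is simple enough to state as a formula. The sketch below reproduces the figures from the paragraph above; the function name is illustrative, and the exit multiple is the eight-times assumption stated in the text.

```python
def ev_uplift(revenue_m, margin_before, margin_after, exit_multiple=8.0):
    """Enterprise-value gain from an EBITDA-margin improvement,
    valued at a fixed exit multiple (all figures in EUR millions)."""
    ebitda_gain = revenue_m * (margin_after - margin_before)
    return ebitda_gain * exit_multiple

# Mittelstand target from the text: EUR 100m revenue, margin 10% -> 15-18%
low = ev_uplift(100, 0.10, 0.15)
high = ev_uplift(100, 0.10, 0.18)
print(f"EV uplift range: {low:.0f} to {high:.0f} million euros")  # 40 to 64
```

Set against a one-to-three million euro integration cost, the asymmetry of the bet is the whole argument.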

Legal and Governance Dimensions of Data Moats

Proprietary data as competitive advantage rests on legal foundations that most firms underestimate. The EU AI Act, in force since August 2024, classifies systems in credit, personnel, critical infrastructure and other domains as high-risk, triggering documentation, bias-testing and audit obligations with fines of up to seven percent of global annual turnover. A data moat built on non-compliant processing is a liability, not an asset.

The NIS2 Directive, with its transposition deadline in October 2024, expands critical infrastructure definitions and introduces personal liability of management boards for cybersecurity implementation. Proprietary training data in energy, water, food supply and digital infrastructure sectors is now subject to governance requirements where directors bear direct responsibility. The GDPR framework, combined with the 2018 US CLOUD Act, creates structural uncertainty for European data held on American hyperscaler servers, even when physically located in Frankfurt or Dublin.

The practical consequence is that data strategy and legal strategy must be designed together. Privacy by design, synthetic data generation for sensitive categories, sovereign cloud architecture for regulated data, and explicit documentation of training provenance are no longer optional hygiene items. They are the conditions under which proprietary data remains a defensible rather than a litigable asset.

The strategic conclusion of ALGORITHMUS on this question is direct: the AI era does not reward those with the most data; it rewards those most disciplined in the use of domain data they alone possess. For European industrial firms, Mittelstand champions, regulated service providers and institutional investors, proprietary data combined with algorithmic competence is the only moat that survives the collapse of foundation model pricing and the commoditization of general AI services. Capital cannot buy it, hyperscalers cannot synthesize it, and regulators increasingly reward its responsible stewardship through the EU AI Act and NIS2 frameworks.

Dr. Raphael Nagel (LL.M.), Founding Partner of Tactical Management, frames this as the defining strategic decision of the next twelve to twenty-four months: firms that convert their operational data into defensible AI assets now will hold structurally superior positions in 2030, while those that delegate the refinery to external providers will find themselves paying platform rent on assets they once owned.

The forward-looking claim is unambiguous. Domain data is the new crude of the AI era, and applied AI capability its refinery; the question is no longer whether to build on that data, but whether to build on it before competitors and regulators foreclose the window. The decisions that determine this are being made now, in boardrooms and investment committees, and they cannot be postponed to the next planning cycle.

Frequently asked questions

What makes proprietary data a competitive advantage in AI rather than just a technical asset?

Proprietary data becomes a competitive advantage when it is domain-specific, exclusively held, and paired with the algorithmic competence to refine it into decision intelligence. Dr. Raphael Nagel (LL.M.) argues in ALGORITHMUS that general data is abundant and commoditized, while operational data from an installed base of industrial machines, a clinical trial history, or a logistics network cannot be purchased or synthetically generated by competitors. The moat is structural: a competitor cannot replicate twenty years of sensor readings even with unlimited capital, which is why this form of advantage survives even as foundation model pricing collapses toward zero.

Why can’t European firms simply compete with OpenAI or Google on foundation models?

Because frontier foundation model training requires capital at a scale only a handful of global actors can mobilize. Training GPT-4 cost between 63 and 100 million dollars according to Stanford researchers, and Epoch AI projects the next generation will exceed one billion dollars per training run. European venture capital invested roughly six billion euros in AI in 2023 versus more than 50 billion dollars in the United States. The strategic answer Dr. Raphael Nagel (LL.M.) outlines is not to contest the foundation layer but to dominate vertical applications built on proprietary European industrial data.

How does Siemens Xcelerator illustrate the proprietary data advantage?

Siemens Xcelerator trains AI models for predictive maintenance, process optimization and fault diagnosis on decades of operating data from hundreds of thousands of installed Siemens machines worldwide. No general industrial foundation model has access to this corpus because no competitor generated it. The result is a defensible moat that cannot be overtaken by more compute or more general training data. Dr. Raphael Nagel (LL.M.) highlights this in ALGORITHMUS as the template for how European industrial champions transform installed-base telemetry into a balance sheet asset that Silicon Valley platforms cannot acquire.

What legal risks does a data-based AI strategy carry under the EU AI Act and NIS2?

The EU AI Act classifies many industrial, HR and credit applications as high-risk, imposing documentation, bias-testing and audit requirements with fines of up to seven percent of global turnover. The NIS2 Directive, whose transposition deadline passed in October 2024, introduces personal liability of board members for cybersecurity in expanded critical infrastructure categories. The US CLOUD Act of 2018 creates extraterritorial access risk for data held on American hyperscaler infrastructure, even when physically located in Europe. Tactical Management treats privacy-by-design, sovereign architecture and provenance documentation as the minimum conditions for a defensible data strategy.

How should a Mittelstand CEO begin building proprietary data into an AI advantage?

Start with an honest inventory: what exclusive operational data exists, in what quality, in what format. Sensor data, transaction logs, service records, and customer interaction histories are typically underexploited. Next, define a Build, Buy or Control decision per function, keeping internal control over data assets that differentiate the business. Finally, transform product offerings into service-plus-model bundles where the installed base continuously strengthens the data moat. Dr. Raphael Nagel (LL.M.) argues this sequence is the only realistic path to defensible AI positions for firms that cannot outspend American hyperscalers.

Claritáte in iudicio · Firmitáte in executione

For weekly analysis on capital, leadership and geopolitics: follow Dr. Raphael Nagel (LL.M.) on LinkedIn →


Author: Dr. Raphael Nagel (LL.M.)