Modelling the Unknown: A Practical Guide to Synthetic Data in Modern Analytics

By Fredric Lundgren, Senior Product Owner 1st Party Data & Adtech, INGKA

Featured talk at

11 June 2025

Partners

“Is it real, or is it made up?”
This was the question posed by Fredric Lundgren at the June 11, 2025 Web Analytics Wednesday in Copenhagen, where he led an audience of analytics professionals through the nuanced world of synthetic data. As the Data & Machine Learning Team Lead at INGKA, Lundgren is no stranger to the tensions between statistical modeling and operational reality.

As data privacy regulations tighten, user behaviour becomes harder to observe, and platform transparency shrinks, synthetic data (also known as modelled or estimated data) has become a mainstay of the analytics toolkit. But how it is used, when it should be trusted, and what its limitations are remain largely undiscussed.

This article explores Lundgren’s perspective on the strategic and ethical implications of synthetic data, grounded in INGKA’s real-world application.

The Problem: Analytics in a Consent-Constrained World

“Web and marketing analysts live in a different data reality than economists or academics,” Lundgren told the crowd. “Our data quality isn’t just poor, it’s often fractured.”

At INGKA, as with most global retailers, user consent is a key constraint. On average, only about 60 percent of website sessions are tracked, with the remaining 40 percent lost to consent refusal, ad blockers, or technical limitations. In mobile apps, consent rates can drop even lower, and tracking is bound to users rather than sessions, adding layers of complexity.
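To make the arithmetic concrete, here is a minimal sketch of the simplest possible gap-filling approach: dividing observed sessions by the tracked share. The function name and the figure of 600,000 sessions are illustrative and not from the talk. The approach assumes tracked sessions are a representative sample of all sessions, which is rarely exactly true.

```python
def estimate_total_sessions(observed_sessions: int, tracked_share: float) -> float:
    """Scale observed sessions up to an estimated total.

    Assumes tracked sessions are a representative sample of all sessions.
    """
    if not 0 < tracked_share <= 1:
        raise ValueError("tracked_share must be in the range (0, 1]")
    return observed_sessions / tracked_share


# Hypothetical example: 600,000 observed sessions at a 60 percent tracked share.
observed = 600_000
estimated_total = estimate_total_sessions(observed, 0.60)
estimated_untracked = estimated_total - observed

print(f"Estimated total sessions: {estimated_total:,.0f}")
print(f"Of which modelled (never observed): {estimated_untracked:,.0f}")
```

Every session in the second figure is an assumption, not a measurement, which is why the rest of the talk focused on when such estimates can be trusted.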

Without complete data, foundational metrics like website visitation, product interactions, or campaign performance become unreliable. Synthetic data fills the gaps, not just to report past performance but to steer future strategy.

What Is Synthetic Data?

Synthetic data refers to statistically modelled data points created when observed data is missing or insufficient. It can be used to:

  • Forecast future outcomes

  • Label users based on inferred characteristics

  • Aggregate or reconstruct incomplete datasets

Lundgren outlined this visually: raw data is the INGKA furniture schematic, synthetic data is the missing screw you’ve statistically estimated based on what’s in the box, and the final dashboard is the assembled product, held together with assumptions.

The Use Case: Modeling Visitation at INGKA

Why does INGKA need synthetic data? The answer lies in the intersection of product development and digital strategy.

“We no longer distribute the iconic INGKA catalogue,” Lundgren explained. “Our first customer interaction often happens online, via web or app. That visitation is now a primary KPI, especially because it predicts store visitation.”

But with only partial tracking available, measuring digital engagement became unreliable. So, INGKA built a model using consented data and actual sales, estimating overall visitation patterns across platforms. This allowed them to:

  • Validate the ROI of digital product features

  • Justify headcount in tech and product teams

  • Stabilise reporting for internal stakeholders

Importantly, this modelling wasn’t marketing-driven. It originated from product development’s need to demonstrate value.
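The talk did not describe the internals of INGKA’s model, so the following is only a sketch of one way consented data and actual sales could be combined: a ratio estimator. It scales consented visits by the ratio of all recorded orders to the orders seen in consented sessions. The class, field names, and figures are hypothetical, and the method assumes consenting and non-consenting visitors convert at the same rate.

```python
from dataclasses import dataclass


@dataclass
class PlatformPeriod:
    platform: str
    consented_visits: int   # visits observed with consent
    consented_orders: int   # orders attributable to those consented visits
    actual_orders: int      # all orders recorded by the sales system


def estimate_total_visits(row: PlatformPeriod) -> float:
    """Ratio estimator: scale consented visits by actual / consented orders.

    Assumes the conversion rate is the same with and without consent.
    """
    if row.consented_orders <= 0:
        raise ValueError("need at least one consented order to calibrate")
    return row.consented_visits * row.actual_orders / row.consented_orders


# Hypothetical figures for one reporting period.
rows = [
    PlatformPeriod("web", consented_visits=60_000, consented_orders=1_200, actual_orders=2_000),
    PlatformPeriod("app", consented_visits=30_000, consented_orders=900, actual_orders=1_800),
]

estimates = {row.platform: estimate_total_visits(row) for row in rows}
for platform, visits in estimates.items():
    print(f"{platform}: estimated {visits:,.0f} total visits")
```

Because sales are measured independently of consent, they act as an anchor. If non-consenting visitors convert differently, the estimate is biased in a predictable direction, which is easier to reason about than an opaque vendor figure.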

The Double-Edged Sword: Continuity vs. Blind Trust

While synthetic data restores continuity and comparability, it also introduces risk. “You can’t evaluate the black box,” Lundgren warned. “Platforms like Google and Meta serve up modelled metrics, but their methodology isn’t transparent. You can’t verify their assumptions.”

Worse still, synthetic data often becomes the training input for further models, creating a recursive cycle of assumptions feeding assumptions. If your first model is biased, your entire stack may inherit that flaw.

“Incomplete data leads to modelling,” said Lundgren, “which then feeds machine learning. It’s both the chicken and the egg.”

A Framework for Critical Use

Lundgren urged analysts to stay critical, even if they have few alternatives. Here are his practical takeaways:

1. Know Your Basis

Always start by understanding how much observed data your models are built on. Is it 60 percent of sessions, or 20 percent? Quantify the gap before trusting the outcome.
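One way to quantify the gap is a simple sensitivity check, sketched below. It asks how far an overall rate could move if untracked visitors behaved differently from tracked ones. The multipliers of 0.5 and 1.5 and the 3 percent observed rate are arbitrary illustrative assumptions, not figures from the talk.

```python
def overall_rate_range(
    observed_rate: float,
    coverage: float,
    low_multiplier: float = 0.5,   # assumption: untracked rate at least half the observed rate
    high_multiplier: float = 1.5,  # assumption: untracked rate at most 1.5x the observed rate
) -> tuple[float, float]:
    """Return the (low, high) overall rate implied by the coverage level."""
    if not 0 < coverage <= 1:
        raise ValueError("coverage must be in the range (0, 1]")
    unobserved_share = 1 - coverage
    observed_part = coverage * observed_rate
    low = observed_part + unobserved_share * observed_rate * low_multiplier
    high = observed_part + unobserved_share * observed_rate * high_multiplier
    return low, high


# Compare a 60 percent basis with a 20 percent basis for the same observed rate.
for coverage in (0.60, 0.20):
    low, high = overall_rate_range(observed_rate=0.03, coverage=coverage)
    print(f"coverage {coverage:.0%}: overall rate between {low:.2%} and {high:.2%}")
```

Under these assumptions the plausible range at 20 percent coverage is twice as wide as at 60 percent, so the same dashboard number deserves far less confidence.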

2. Validate When You Can

Use independent sources (like sales data, CRM metrics, or qualitative research) to triangulate model accuracy.
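As a sketch of what triangulation can look like in practice, the snippet below compares modelled values with an independent benchmark using mean absolute percentage error. The weekly figures are invented for illustration, and the choice of error metric is an assumption rather than something Lundgren prescribed.

```python
from collections.abc import Sequence


def mean_absolute_percentage_error(
    modelled: Sequence[float], benchmark: Sequence[float]
) -> float:
    """Average relative deviation of modelled values from a benchmark."""
    if not modelled or len(modelled) != len(benchmark):
        raise ValueError("inputs must be non-empty and of equal length")
    if any(b == 0 for b in benchmark):
        raise ValueError("benchmark values must be non-zero")
    errors = [abs(m - b) / abs(b) for m, b in zip(modelled, benchmark)]
    return sum(errors) / len(errors)


# Hypothetical weekly figures (in thousands): model output vs. an independent source.
modelled_weekly = [100.0, 120.0, 90.0]
benchmark_weekly = [110.0, 115.0, 100.0]

mape = mean_absolute_percentage_error(modelled_weekly, benchmark_weekly)
print(f"Mean absolute percentage error: {mape:.1%}")
```

Tracking this error over time matters more than any single value, since a drifting error is an early sign that the model’s assumptions no longer hold.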

3. Be Transparent with Stakeholders

Label synthetic data clearly in reports. Explain what’s estimated, what’s observed, and how that impacts decisions.
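A minimal sketch of such labelling, assuming a report that carries the observed and modelled parts of a metric separately. The function name, label wording, and figures are hypothetical.

```python
def format_metric(name: str, observed: int, modelled: int) -> str:
    """Render a metric with its observed and modelled shares made explicit."""
    total = observed + modelled
    if total <= 0:
        raise ValueError("total must be positive")
    return (
        f"{name}: {total:,} "
        f"({observed / total:.0%} observed, {modelled / total:.0%} modelled)"
    )


# Hypothetical figures: 600,000 observed visits plus 400,000 estimated visits.
label = format_metric("Visits", observed=600_000, modelled=400_000)
print(label)
```

Keeping the two components as separate fields in the underlying data, rather than storing only the total, means the label can never silently disappear from a downstream report.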

4. Challenge Black Box Models

If you can’t audit the methodology, consider using synthetic models internally, tailored to your business logic, rather than relying entirely on vendor outputs.

The Road Ahead

Looking forward, Lundgren offered four possible scenarios: consent levels could rise, fall, fluctuate, or become irrelevant if regulation changes. In all of them, synthetic data will remain necessary.

As he put it, “We need to ensure our teams aren’t penalised for gaps beyond their control. That’s what synthetic data helps with, but only if we remain vigilant.”

His parting thought? A reminder from the founder of IKEA: “To a glorious future for all of us.” That future will undoubtedly be modelled, but let’s make sure it’s modelled well.

Fredric Lundgren

Senior Product Owner 1st Party Data & Adtech, INGKA

Fredric Lundgren is a data-driven leader with a passion for turning insights into business impact. As Head of Online Analytics at INGKA Group (IKEA), he leads a talented team pioneering data-driven ways of working to enhance online performance and machine learning — with a special focus on the IKEA App.

Known for fostering trust, clarity, and collaboration, Fredric empowers teams to move from reactive to creative thinking through transparency, accountability, and shared purpose.

About Ingka Group

Ingka Group is the largest franchisee of the IKEA brand, operating most IKEA stores globally as part of the IKEA franchise system. While the IKEA brand and concept are owned by Inter IKEA Group, Ingka Group runs the retail, digital, and customer-facing operations in over 30 markets—so when you think of “IKEA,” you’re often interacting with Ingka Group.
