- What source systems and related data underpin the reports that are produced?
- What data transformations and aggregations have been applied, and how is their accuracy evidenced?
- Which controls are in place that warrant reliance on data reports?
- Who owns and governs data pipelines and controls?
- How are changes and exceptions managed and documented?
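One way to evidence the accuracy of an aggregation is a reconciliation control: compare a control total from the source extract with the total in the report, and store the result alongside a fingerprint of the input. The sketch below is a minimal illustration under stated assumptions, not a prescribed method. The record fields (`category`, `actions`), the function names, and the sample data are all hypothetical.

```python
import hashlib
import json
from collections import defaultdict


def aggregate_by_category(records):
    """Hypothetical transformation: sum moderation actions per category."""
    totals = defaultdict(int)
    for record in records:
        totals[record["category"]] += record["actions"]
    return dict(totals)


def reconcile(records, report_totals):
    """Compare source and report control totals and return an evidence record."""
    source_total = sum(r["actions"] for r in records)
    report_total = sum(report_totals.values())
    # Fingerprint the source extract so the evidence ties to the exact input data.
    fingerprint = hashlib.sha256(
        json.dumps(records, sort_keys=True).encode("utf-8")
    ).hexdigest()
    return {
        "source_rows": len(records),
        "source_total": source_total,
        "report_total": report_total,
        "difference": report_total - source_total,
        "passed": source_total == report_total,
        "source_sha256": fingerprint,
    }


# Illustrative sample data only.
source_records = [
    {"id": 1, "category": "illegal_content", "actions": 12},
    {"id": 2, "category": "spam", "actions": 30},
    {"id": 3, "category": "illegal_content", "actions": 8},
]
report = aggregate_by_category(source_records)
evidence = reconcile(source_records, report)
print(report)
print(evidence["passed"], evidence["difference"])
```

Retaining the evidence record for each reporting run gives an auditor a dated, repeatable artefact rather than a verbal assurance that the numbers tie out.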
Various regulatory frameworks include explicit provision for data audits, particularly where societal safety, systemic risk, AI and digital services are concerned. This is most relevant to VLOPs (Very Large Online Platforms), which face increasing regulatory scrutiny of their data and reporting obligations.
VLOP: Defined under the Digital Services Act as an online platform with an average of at least 45 million monthly active users within the European Union. You can view the full list of VLOPs here: Supervision of the designated very large online platforms and search engines under DSA.
Which areas of the regulatory reporting data pipeline landscape are considered hotspots?
Digital Services Act (DSA): Requires platforms to provide transparency reports that include data on areas such as advertising, algorithmic systems, illegal content and content moderation. Auditable data lineage, logs and documentation are essential.
Digital Markets Act (DMA): Gatekeepers must submit detailed annual compliance reports covering all core platform services. Reports include evidence of compliance with obligations such as data sharing, interoperability, and advertising metrics transparency.
EU AI Act: Requires providers of high-risk AI systems to maintain comprehensive technical documentation and records. Additionally, providers must report serious incidents to national authorities within 2–15 days, depending on severity; fulfil transparency obligations by providing deployers with clear information on system capabilities, limitations, and risks; and conduct and document impact assessments.
Online Safety Act (UK): Requires transparency reports covering illegal and harmful content moderation, including the measures taken and their effectiveness; risk assessments for harmful content and age assurance measures; and complaints and dispute resolution statistics.
“Regulators and auditors expect organisations to operate a robust control framework over data pipelines.”
Example investigations that reinforce why organisations cannot afford complacency
- DSA - Investigation of X (previously Twitter): Risk management, content moderation, dark patterns, advertising transparency, data access for researchers. Read more here.
- DSA - Investigation of TikTok: Protection of minors, addictive design, harmful content, advertising transparency, data access for researchers. Read more here.
- DMA - Scrutiny of data ecosystems across Alphabet, Apple and Meta leaves gatekeepers no room for complacency. Read more here.
"Auditors seek efficiency and confidence in their reviews. A strong control environment signals that data risks are understood and managed systematically."
A quick-reference snapshot of the essentials for a controlled data environment:
- Source systems are catalogued: Data sources are documented, with ownership and brief descriptions.
- Data pipeline scripts are version-controlled: Processes are documented, with clear links to source and target systems.
- Data pipeline changes are governed: Changes are managed through formal processes, with approvals, testing, and rollback plans.
- Governance roles are assigned: Every report has a named owner responsible for its integrity and compliance, which includes ownership of the underlying data pipelines.
- Control owners and operators are in place: Automated and manual controls are assigned and evidenced through logs, sign-offs, and testing.
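The checklist above can also be held as structured data, so that gaps are found by a script rather than by manual review. The following is a minimal sketch under stated assumptions. The `PipelineRecord` fields, the `find_gaps` function, and every sample value (system names, owner title, commit and ticket references) are hypothetical and would need to be mapped to an organisation's own catalogue and change-management tooling.

```python
from dataclasses import dataclass, field


@dataclass
class PipelineRecord:
    """Hypothetical register entry for one report's data pipeline."""
    name: str
    source_systems: list
    report_owner: str = ""      # named owner of the report and its pipeline
    repo_commit: str = ""       # version-control reference for the pipeline script
    change_ticket: str = ""     # approval reference for the latest change
    control_evidence: list = field(default_factory=list)  # logs, sign-offs, tests


def find_gaps(record, catalogued_sources):
    """Return the checklist items that this pipeline record does not satisfy."""
    gaps = []
    uncatalogued = [s for s in record.source_systems if s not in catalogued_sources]
    if uncatalogued:
        gaps.append("uncatalogued sources: " + ", ".join(uncatalogued))
    if not record.repo_commit:
        gaps.append("no version-control reference")
    if not record.change_ticket:
        gaps.append("no approved change record")
    if not record.report_owner:
        gaps.append("no named report owner")
    if not record.control_evidence:
        gaps.append("no control evidence")
    return gaps


# Illustrative sample data only.
catalogued = {"moderation_db", "ads_db"}
complete = PipelineRecord(
    name="transparency_report",
    source_systems=["moderation_db"],
    report_owner="Head of Trust and Safety",
    repo_commit="a1b2c3d",
    change_ticket="CHG-001",
    control_evidence=["reconciliation log"],
)
incomplete = PipelineRecord(name="ads_metrics", source_systems=["ads_db", "legacy_export"])

print(find_gaps(complete, catalogued))
print(find_gaps(incomplete, catalogued))
```

Running a check like this on a schedule turns the snapshot into a control in its own right, with each run's output serving as evidence.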
Key artifacts that support answering questions from regulators and auditors:
- Risk assessment and mitigation evidence: Documentation showing that risks in source-to-report data flows have been identified, assessed, and addressed.
- Control framework: A structured set of controls covering data lineage, transformation accuracy, and reporting integrity, with evidence of design and operating effectiveness.
- Governance and change management records: Clear documentation of ownership, decision-making, approvals, and how changes, exceptions, and control failures were managed and resolved.
"The real question is not whether an auditor or regulator might ask. It’s whether you can answer with confidence when they do."
Organisations that invest in a controlled data environment not only accelerate audits but also strengthen regulatory trust and operational resilience.