
A Data Product is a self-contained, discoverable, and accessible dataset with clear ownership, documentation, and access controls. It treats data as a product, with the core characteristics listed below.
Core Characteristics:
- Discoverable: Easy to find through the marketplace
- Addressable: Has a unique identifier and an API endpoint
- Self-Describing: Complete metadata and documentation
- Trustworthy: Quality metrics and SLAs
- Secure: Access controls and governance
- Interoperable: Standard formats and interfaces
Data Product Structure:
Data Product
├── Metadata (name, description, owner)
├── Schema (fields, data types, constraints)
├── Data (actual dataset)
├── API Endpoint (programmatic access)
├── Documentation (usage guides, examples)
├── Access Controls (who can access)
├── Semantics (business term mappings)
└── Quality Metrics (completeness, accuracy)
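The structure above can be sketched as a plain data class. This is only an illustration of how the eight parts fit together: the class names, field names, and the endpoint path are assumptions and do not reflect an actual product API.

```python
from dataclasses import dataclass, field


@dataclass
class Field:
    """One schema entry (hypothetical shape)."""
    name: str
    data_type: str
    nullable: bool = True  # simple stand-in for "constraints"


@dataclass
class DataProduct:
    # Metadata
    name: str
    description: str
    owner: str
    # Schema and the dataset itself
    schema: list[Field]
    data: list[dict]
    # API endpoint for programmatic access (illustrative path only)
    api_endpoint: str
    # Documentation, access controls, semantics, quality metrics
    documentation: str = ""
    allowed_readers: set[str] = field(default_factory=set)
    semantics: dict[str, str] = field(default_factory=dict)  # field name -> business term
    quality_metrics: dict[str, float] = field(default_factory=dict)

    def describe(self) -> dict:
        """Return the self-describing part of the product, without the data."""
        return {
            "name": self.name,
            "owner": self.owner,
            "fields": [f.name for f in self.schema],
            "endpoint": self.api_endpoint,
            "quality": self.quality_metrics,
        }


status_codes = DataProduct(
    name="transaction-status-codes",
    description="Lookup list of transaction status codes",
    owner="data-governance",
    schema=[Field("code", "string", nullable=False), Field("label", "string")],
    data=[{"code": "C", "label": "Completed"}],
    api_endpoint="/products/transaction-status-codes",
    quality_metrics={"completeness": 1.0},
)
print(status_codes.describe()["fields"])  # ['code', 'label']
```

Keeping `describe()` separate from `data` mirrors the idea that a consumer can inspect metadata and schema before being granted access to the dataset.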

Definition of Reference Data Products:
Reference Data Products are special data products that contain standardized, shared data used across multiple applications and processes. They serve as the single source of truth for common reference data.
Examples:
- Country codes (ISO 3166)
- Currency codes (ISO 4217)
- Product categories
- Customer segments
- Organization hierarchies
- Status codes
Characteristics:
- Stable: Changes infrequently
- Shared: Used by multiple applications
- Authoritative: Single source of truth
- Standardized: Follows industry standards
- Versioned: Changes tracked carefully
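The "Stable" and "Versioned" characteristics can be illustrated with a small in-memory store that never mutates a published version. The `ReferenceData` class, its method names, and the version labels are hypothetical and chosen only for this sketch.

```python
from types import MappingProxyType
from typing import Optional


class ReferenceData:
    """Keeps every published version of a small lookup table."""

    def __init__(self, name: str):
        self.name = name
        self._versions: dict[str, MappingProxyType] = {}
        self._latest: Optional[str] = None

    def publish(self, version: str, entries: dict[str, str]) -> None:
        """Add a new explicit version; existing versions are never changed."""
        if version in self._versions:
            raise ValueError(f"version {version} already published")
        # MappingProxyType gives a read-only view over a private copy.
        self._versions[version] = MappingProxyType(dict(entries))
        self._latest = version

    def lookup(self, key: str, version: Optional[str] = None) -> str:
        """Look up a key in the requested version, or in the latest one."""
        chosen = version or self._latest
        if chosen is None:
            raise LookupError("no version published yet")
        return self._versions[chosen][key]


currencies = ReferenceData("currency-codes")
currencies.publish("v1", {"USD": "US Dollar"})
currencies.publish("v2", {"USD": "US Dollar", "EUR": "Euro"})

print(currencies.lookup("EUR"))                # Euro
print(currencies.lookup("USD", version="v1"))  # US Dollar
```

Consumers that pin a version keep getting the same answers after a new version is published, which is one way to limit the organization-wide impact of a change.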
Differences Between Data Products and Reference Data Products
| Aspect | Data Product | Reference Data Product |
|---|---|---|
| Update Frequency | Regular (hourly/daily) | Infrequent (quarterly/yearly) |
| Size | Large datasets | Small, curated lists |
| Ownership | Domain-specific team | Central data governance |
| Purpose | Transactional/analytical data | Lookup values, standards |
| Change Impact | Isolated to domain | Organization-wide |
| Versioning | Continuous | Explicit versions |
| Access Pattern | Query/filter/aggregate | Lookup by key |
| Examples | Customer transactions | Country codes |
Usage Example:
Regular Data Product:
Customer Transactions
- 10M+ records
- Updated hourly
- Complex queries
- Domain: Finance
Reference Data Product:
Transaction Status Codes
- 10 records
- Updated yearly
- Simple lookups
- Domain: Enterprise-wide
Usage:
Transaction.status_code → References → Status Codes.code
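The reference from `Transaction.status_code` to `Status Codes.code` amounts to a lookup by key. A minimal sketch follows; the codes, labels, and transaction rows are made-up sample values, and `resolve_status` is a hypothetical helper.

```python
# Reference data product: a small, keyed lookup table (illustrative values).
STATUS_CODES = {
    "P": "Pending",
    "C": "Completed",
    "F": "Failed",
}

# Regular data product: transaction rows that carry only the code.
transactions = [
    {"id": 1, "amount": 120.0, "status_code": "C"},
    {"id": 2, "amount": 75.5, "status_code": "P"},
    {"id": 3, "amount": 10.0, "status_code": "X"},  # code missing from the reference data
]


def resolve_status(transaction: dict, codes: dict[str, str]) -> dict:
    """Return a copy of the transaction with its status label attached."""
    label = codes.get(transaction["status_code"])  # lookup by key; None if unknown
    return {**transaction, "status_label": label}


resolved = [resolve_status(t, STATUS_CODES) for t in transactions]

# Rows whose code is not in the reference data point to a data quality issue.
unresolved = [t["id"] for t in resolved if t["status_label"] is None]
print(unresolved)  # [3]
```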
Data Product Owner
Responsibilities:
- Strategic Ownership: Overall accountability for data product
- Business Decisions: Prioritize features and access requests
- Approval Authority: Final say on publishing and deprecation
- Stakeholder Communication: Represent product to organization
Permissions:
- Create and edit product metadata
- Approve/reject access requests
- Publish data products
- Assign data stewards
- Deprecate or retire products
Typical Roles:
- Business unit leaders
- Product managers
- Department heads
Data Steward
Responsibilities:
- Quality Assurance: Ensure data accuracy and completeness
- Documentation: Maintain product documentation
- Technical Oversight: Manage schema and sync strategies
- Issue Resolution: Address data quality issues
Permissions:
- Edit product metadata and schema
- Update documentation
- Manage field definitions
- Add/remove semantics
- Cannot publish (owner approval needed)
Typical Roles:
- Data analysts
- Data engineers
- Subject matter experts
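The two permission lists above differ mainly in who may publish: the owner can, while the steward edits but needs owner approval. One possible way to encode that is a role-to-permission map. The role keys, the `Action` enum, and both functions are assumptions made for this sketch, not a real API.

```python
from enum import Enum, auto


class Action(Enum):
    EDIT_METADATA = auto()
    EDIT_SCHEMA = auto()
    UPDATE_DOCS = auto()
    MANAGE_SEMANTICS = auto()
    APPROVE_ACCESS = auto()
    PUBLISH = auto()
    ASSIGN_STEWARD = auto()
    DEPRECATE = auto()


# Permissions transcribed from the two lists above (role keys are hypothetical).
ROLE_PERMISSIONS = {
    "owner": {
        Action.EDIT_METADATA,
        Action.APPROVE_ACCESS,
        Action.PUBLISH,
        Action.ASSIGN_STEWARD,
        Action.DEPRECATE,
    },
    "steward": {
        Action.EDIT_METADATA,
        Action.EDIT_SCHEMA,
        Action.UPDATE_DOCS,
        Action.MANAGE_SEMANTICS,
    },
}


def is_allowed(role: str, action: Action) -> bool:
    """Unknown roles get no permissions."""
    return action in ROLE_PERMISSIONS.get(role, set())


def publish(product_name: str, role: str) -> str:
    """Publish a product, or fail if the role lacks the PUBLISH permission."""
    if not is_allowed(role, Action.PUBLISH):
        raise PermissionError(f"{role} cannot publish {product_name}; owner approval needed")
    return f"{product_name} published"


print(is_allowed("steward", Action.EDIT_SCHEMA))  # True
print(is_allowed("steward", Action.PUBLISH))      # False
print(publish("customer-transactions", "owner"))  # customer-transactions published
```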
Data Governance
Responsibilities:
- Policy Enforcement: Ensure compliance with data policies
- Standards Maintenance: Define and enforce standards
- Access Governance: Oversee access approval workflows
- Audit and Compliance: Monitor data usage and access
- Risk Management: Identify and mitigate data risks
Permissions:
- View all data products
- Override access decisions (in exceptional cases)
- Configure approval workflows
- Access audit logs
- Enforce data retention policies
Typical Roles:
- Chief Data Officer
- Data governance team
- Compliance officers
Data Consumer
Responsibilities:
- Proper Usage: Use data according to agreements
- Compliance: Adhere to access terms and conditions
- Feedback: Report issues or suggest improvements
- Credential Management: Secure API keys and credentials
Permissions:
- Read access to granted data products
- API access with rate limits
- Request additional access
- View product documentation
Typical Entities:
- Business intelligence tools
- Analytics applications
- Machine learning models
- Reporting systems
- Third-party integrations
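The permission "API access with rate limits" does not specify how limits are enforced. A token bucket is one common technique and is shown here purely as an assumption; the class name, parameters, and injectable clock are illustrative.

```python
import time


class TokenBucket:
    """Allows bursts up to `capacity` calls, refilled at a steady rate."""

    def __init__(self, capacity: int, refill_per_second: float, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._clock = clock  # injectable so the limiter can be tested without sleeping
        self._tokens = float(capacity)
        self._last = clock()

    def allow(self) -> bool:
        """Return True and consume one token if a request may proceed."""
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill in proportion to elapsed time, never above capacity.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_per_second)
        if self._tokens >= 1:
            self._tokens -= 1
            return True
        return False


# One bucket per consumer credential: 2 calls of burst, 1 call per second sustained.
fake_time = [0.0]
bucket = TokenBucket(capacity=2, refill_per_second=1.0, clock=lambda: fake_time[0])

print(bucket.allow(), bucket.allow(), bucket.allow())  # True True False
fake_time[0] = 1.0  # one second later, one token has been refilled
print(bucket.allow())  # True
```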