Book GitHub Repository
github.com/data-contract-book/chapter-7-implementing-data-contracts
Chapter 7: Implementing Data Contracts
Hands-on sandbox environment demonstrating end-to-end data contract architecture using open-source tools. Companion code for the O'Reilly book "Data Contracts: Developing Production-Grade Pipelines at Scale."
What's in the Repo
A realistic museum application scenario using Met Museum API data (~3,000 records) covering four architectural components:
- Data Assets — PostgreSQL database setup, Alembic migrations, and raw data files
- Contract Definition — JSON Schema-based data contract specs (e.g., object_images_contract_spec.json)
- Detection — Python scripts to build a data catalog, extract contract specs, and identify violations
- Prevention — Unit tests embedded in CI/CD workflows (GitHub Actions) to catch violations before deployment
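
To make the Contract Definition and Detection components concrete, here is a minimal sketch of checking one record against a JSON Schema-style spec. It is an illustration under stated assumptions, not the repo's actual code: the spec contents, the field names (object_id, image_url, is_public_domain), and the find_violations helper are all hypothetical, and only a small subset of JSON Schema (required and type) is handled, using the standard library alone.

```python
from __future__ import annotations

# Hypothetical, heavily simplified contract spec. The field names are
# illustrative and are NOT taken from object_images_contract_spec.json.
CONTRACT_SPEC = {
    "type": "object",
    "required": ["object_id", "image_url"],
    "properties": {
        "object_id": {"type": "integer"},
        "image_url": {"type": "string"},
        "is_public_domain": {"type": "boolean"},
    },
}

# Maps JSON Schema type names to Python types (small subset only).
JSON_TYPES = {
    "integer": int,
    "string": str,
    "boolean": bool,
    "number": (int, float),
}


def find_violations(record: dict, spec: dict) -> list[str]:
    """Return human-readable contract violations for a single record."""
    violations = []

    # Every field listed under "required" must be present.
    for field in spec.get("required", []):
        if field not in record:
            violations.append(f"missing required field: {field}")

    # Fields that are present must match their declared type.
    for field, rules in spec.get("properties", {}).items():
        if field not in record:
            continue
        value = record[field]
        expected = JSON_TYPES[rules["type"]]
        # bool is a subclass of int in Python, so reject it for non-boolean types.
        bool_mismatch = isinstance(value, bool) and rules["type"] != "boolean"
        if bool_mismatch or not isinstance(value, expected):
            violations.append(
                f"{field}: expected {rules['type']}, got {type(value).__name__}"
            )

    return violations


if __name__ == "__main__":
    bad_record = {"object_id": "1"}  # wrong type, and image_url is missing
    for problem in find_violations(bad_record, CONTRACT_SPEC):
        print(problem)
```

The same helper could in principle serve both roles described above: run against database rows to detect existing violations, or called from a unit test in a CI workflow so that a failing assertion blocks a change before deployment. A full JSON Schema validator would cover far more of the specification than this sketch does.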
Getting Started
- GitHub Codespaces — easiest option, auto-configures everything
- Docker Desktop + VS Code — dev containers with full local control
Both options automatically set up Python and PostgreSQL, install dependencies, and seed the database.
Links
- GitHub — Source code
- Get the book — Amazon, Barnes & Noble, Target, O'Reilly