Data Types & Constraints: Concept
Module: Foundational Concepts
Data types define what kind of data a column can store (numbers, text, dates), while constraints are rules that enforce data validity.
Data types and constraints work together to maintain data quality:
Data Types:
- INTEGER: Whole numbers (-2B to 2B)
- DECIMAL(10,2): Exact precision for money
- VARCHAR(n): Variable-length text
- DATE: Date values with validation
- BOOLEAN: TRUE/FALSE values
Constraints:
- PRIMARY KEY: Unique identifier, cannot be NULL
- FOREIGN KEY: References another table's primary key
- NOT NULL: Field must have a value
- UNIQUE: No duplicate values allowed
- CHECK: Custom validation rules
- DEFAULT: Auto-fills value if not provided
Choosing correct data types impacts:
1. Storage efficiency - smaller types use less disk space
2. Query performance - proper types enable faster comparisons
3. Data validity - type checking prevents impossible data
Modern Data Types (2025):
• JSON/JSONB: PostgreSQL JSONB (binary, indexed); SQL Server JSON (native 2025, compressed); MySQL JSON (validated). Use for API responses, configs.
• Temporal with Timezones: PostgreSQL TIMESTAMPTZ (UTC internal, local display); MySQL DATETIME (no TZ); store UTC always.
• Precise Numeric: DECIMAL(10,4) for finance (no float rounding); avoid FLOAT for money.
Conversion Performance Rules:
• CAST (ANSI): CAST(col AS type) – portable, fastest.
• CONVERT (SQL Server): Style codes for dates.
• TRY_CAST: Safe, returns NULL on failure (no error).
• Implicit conversions in WHERE: Table scan (no index).
• Benchmark: Implicit VARCHAR→INT on 10M rows: 45s vs CAST: 2s.
Data types and constraints form the foundation of database integrity. Every production database uses constraints to enforce business rules and prevent invalid data.
Financial systems require exact precision (DECIMAL for money). Inventory systems need validation (CHECK for non-negative stock). User registration needs unique email enforcement. Multi-tenant SaaS normalizes user data from API JSON streams with TRY_CAST for analytics. Each relies on proper data types and constraints.