Learning Focus

Use this lesson to understand Performance Implication with practical syntax and examples.

Performance Implications of MySQL Data Types

Storage Efficiency

Smaller Data Types: Reduce disk I/O by fitting more rows per page.
- Example: Use TINYINT (1 byte) instead of INT (4 bytes) for status flags.
- Impact: A table with 1 million rows saves 3 MB with this change.
Fixed vs Variable Types:
- CHAR wastes space for variable-length data but is faster for comparisons.
- VARCHAR saves space but adds 1-2 bytes of length overhead.

Indexing Impact

Index Size: Smaller data types -> smaller indexes -> faster scans.
- Example: Indexing a BIGINT (8 bytes) vs INT (4 bytes) doubles index size.
- Use SMALLINT (2 bytes) for foreign keys if possible.

Prefix Indexes: For TEXT/VARCHAR, index only the necessary prefix:

CREATE INDEX idx_name ON users (name(10));  -- First 10 characters

Query Execution

Data Type Conversions: Implicit conversions in WHERE clauses cause full scans.

-- [X] Slow: phone (VARCHAR) vs INTEGER
SELECT * FROM users WHERE phone = 123456789;
-- [OK] Fast: Match types
SELECT * FROM users WHERE phone = '123456789';

Temporal Types:
- TIMESTAMP (4 bytes) vs DATETIME (5-8 bytes): Smaller but limited to 1970-2038.
- DATE (3 bytes) is optimal for date-only storage.

Join Operations

Key Size Impact: Larger keys slow down joins.
- Example: Joining on VARCHAR(255) vs INT increases memory usage.
- Best Practice: Use integer foreign keys where possible.

Memory Usage

Temporary Tables: Queries using TEXT/BLOB may spill to disk, slowing sorting/grouping.
- Mitigation: Avoid SELECT * when unnecessary.

Specialized Types

ENUM: Stored as integers but string-readable. Faster than VARCHAR for filtering.
```
-- ENUM('red','green','blue') uses 1 byte vs VARCHAR(6) using 6+ bytes.
```
SET: Bitmask storage for multi-value columns. Compact but limited to 64 options.

Collation & Character Sets

utf8mb4: Uses 4 bytes/character but supports emojis.
latin1: 1 byte/character. Use for ASCII-only data to save space.
Impact: A VARCHAR(255) in utf8mb4 uses 1,020 bytes (2554) vs 255 bytes in latin1.

NULL Handling

Overhead: Each NULL column adds a bit to the row's NULL bitmap.
- Example: 8 NULL columns -> 1 byte overhead.
- Best Practice: Use NOT NULL DEFAULT where appropriate.

Floating-Point Tradeoffs

FLOAT/DOUBLE: Faster math but approximate.
DECIMAL: Exact precision but slower computations.
- Use DECIMAL for financial data, FLOAT for scientific calculations.

BLOB/TEXT Considerations

Off-Page Storage: InnoDB stores >~8KB BLOB/TEXT externally, requiring extra I/O.

Retrieval Overhead:

-- [X] Slow:
SELECT content FROM articles WHERE ...;
-- [OK] Better:
SELECT id, title FROM articles WHERE ...;

Performance Optimization Checklist

Downsize Columns: Use the smallest type that accommodates your data.
Index Wisely: Prefer numeric types for keys and limit index length.
Avoid Implicit Conversions: Match data types in joins/filters.
Normalize Data: Replace repetitive strings with integer IDs.
Monitor Row Size: Stay under 65,535 bytes to prevent row overflow.

Example: Optimizing a User Table

Before:

CREATE TABLE users (
    id BIGINT PRIMARY KEY,                -- 8 bytes
    email VARCHAR(255) CHARACTER SET utf8mb4,  -- 1020 bytes
    country VARCHAR(50),                 -- 50 bytes
    created_at DATETIME                  -- 5 bytes
);

Total Row Size: ~1,083 bytes

After:

CREATE TABLE users (
    id INT UNSIGNED PRIMARY KEY,         -- 4 bytes
    email VARCHAR(100) CHARACTER SET latin1,  -- 100 bytes
    country CHAR(2),                     -- 2 bytes
    created_at TIMESTAMP                 -- 4 bytes
);

Total Row Size: 110 bytes (90% reduction)

Next Steps:

Explain Plans: Use EXPLAIN to analyze query performance.
Benchmark: Test with realistic data volumes.
Profile Storage: Monitor disk usage with SHOW TABLE STATUS.

By aligning data types with use cases and storage needs, you can achieve 10x+ performance gains in large-scale applications.

Concept Map

flowchart LR
    A[Schema Context] --> B[Performance Implication]
    B --> C[Query Pattern]
    C --> D[Validation]
    D --> E[Production Use]

Common Pitfalls

Pitfall	Consequence	Prevention
Executing queries without validating sample rows	Logic errors reach production data or reports	Start with `SELECT ... LIMIT 10` and inspect edge cases
Ignoring NULL and duplicate behavior	Aggregations and filters return misleading results	Test with `NULL`, duplicates, and empty sets explicitly
Using advanced syntax before checking schema	Queries fail due to missing columns/indexes	Verify structure with `DESCRIBE table_name;` and adapt query design

Quick Reference

CREATE TABLE sample_performance_implication (id INT PRIMARY KEY, value PERFORMANCE_IMPLICATION);
DESCRIBE sample_table;
SHOW CREATE TABLE sample_table;

What's Next

Previous: Storage Consideration - Review the previous lesson to reinforce context.
Module Overview - Return to this module index and choose another related lesson.

Storage Efficiency​

Indexing Impact​

Query Execution​

Join Operations​

Memory Usage​

Specialized Types​

Collation & Character Sets​

NULL Handling​

Floating-Point Tradeoffs​

BLOB/TEXT Considerations​

Performance Optimization Checklist​

Concept Map​

Common Pitfalls​

Quick Reference​

What's Next​