Skip to main content
Learning Focus

Use this lesson to understand Performance Implication with practical syntax and examples.

Performance Implications of MySQL Data Types

Storage Efficiency

  • Smaller Data Types: Reduce disk I/O by fitting more rows per page.
    • Example: Use TINYINT (1 byte) instead of INT (4 bytes) for status flags.
    • Impact: A table with 1 million rows saves 3 MB with this change.
  • Fixed vs Variable Types:
    • CHAR wastes space for variable-length data but is faster for comparisons.
    • VARCHAR saves space but adds 1-2 bytes of length overhead.

Indexing Impact

  • Index Size: Smaller data types -> smaller indexes -> faster scans.

    • Example: Indexing a BIGINT (8 bytes) vs INT (4 bytes) doubles index size.
    • Use SMALLINT (2 bytes) for foreign keys if possible.
  • Prefix Indexes: For TEXT/VARCHAR, index only the necessary prefix:

    CREATE INDEX idx_name ON users (name(10));  -- First 10 characters

Query Execution

  • Data Type Conversions: Implicit conversions in WHERE clauses cause full scans.

    -- [X] Slow: phone (VARCHAR) vs INTEGER
    SELECT * FROM users WHERE phone = 123456789;
    -- [OK] Fast: Match types
    SELECT * FROM users WHERE phone = '123456789';

  • Temporal Types:

    • TIMESTAMP (4 bytes) vs DATETIME (5-8 bytes): Smaller but limited to 1970-2038.
    • DATE (3 bytes) is optimal for date-only storage.

Join Operations

  • Key Size Impact: Larger keys slow down joins.
    • Example: Joining on VARCHAR(255) vs INT increases memory usage.
    • Best Practice: Use integer foreign keys where possible.

Memory Usage

  • Temporary Tables: Queries using TEXT/BLOB may spill to disk, slowing sorting/grouping.
    • Mitigation: Avoid SELECT * when unnecessary.

Specialized Types

  • ENUM: Stored as integers but string-readable. Faster than VARCHAR for filtering.

    -- ENUM('red','green','blue') uses 1 byte vs VARCHAR(6) using 6+ bytes.

  • SET: Bitmask storage for multi-value columns. Compact but limited to 64 options.

Collation & Character Sets

  • utf8mb4: Uses 4 bytes/character but supports emojis.
  • latin1: 1 byte/character. Use for ASCII-only data to save space.
  • Impact: A VARCHAR(255) in utf8mb4 uses 1,020 bytes (2554) vs 255 bytes in latin1.

NULL Handling

  • Overhead: Each NULL column adds a bit to the row's NULL bitmap.
    • Example: 8 NULL columns -> 1 byte overhead.
    • Best Practice: Use NOT NULL DEFAULT where appropriate.

Floating-Point Tradeoffs

  • FLOAT/DOUBLE: Faster math but approximate.
  • DECIMAL: Exact precision but slower computations.
    • Use DECIMAL for financial data, FLOAT for scientific calculations.

BLOB/TEXT Considerations

  • Off-Page Storage: InnoDB stores >~8KB BLOB/TEXT externally, requiring extra I/O.

  • Retrieval Overhead:

    -- [X] Slow:
    SELECT content FROM articles WHERE ...;
    -- [OK] Better:
    SELECT id, title FROM articles WHERE ...;

Performance Optimization Checklist

  1. Downsize Columns: Use the smallest type that accommodates your data.
  2. Index Wisely: Prefer numeric types for keys and limit index length.
  3. Avoid Implicit Conversions: Match data types in joins/filters.
  4. Normalize Data: Replace repetitive strings with integer IDs.
  5. Monitor Row Size: Stay under 65,535 bytes to prevent row overflow.

Example: Optimizing a User Table

Before:

CREATE TABLE users (
id BIGINT PRIMARY KEY, -- 8 bytes
email VARCHAR(255) CHARACTER SET utf8mb4, -- 1020 bytes
country VARCHAR(50), -- 50 bytes
created_at DATETIME -- 5 bytes
);

Total Row Size: ~1,083 bytes

After:

CREATE TABLE users (
id INT UNSIGNED PRIMARY KEY, -- 4 bytes
email VARCHAR(100) CHARACTER SET latin1, -- 100 bytes
country CHAR(2), -- 2 bytes
created_at TIMESTAMP -- 4 bytes
);

Total Row Size: 110 bytes (90% reduction)


Next Steps:

  1. Explain Plans: Use EXPLAIN to analyze query performance.
  2. Benchmark: Test with realistic data volumes.
  3. Profile Storage: Monitor disk usage with SHOW TABLE STATUS.

By aligning data types with use cases and storage needs, you can achieve 10x+ performance gains in large-scale applications.

Concept Map

flowchart LR
A[Schema Context] --> B[Performance Implication]
B --> C[Query Pattern]
C --> D[Validation]
D --> E[Production Use]

Common Pitfalls

PitfallConsequencePrevention
Executing queries without validating sample rowsLogic errors reach production data or reportsStart with SELECT ... LIMIT 10 and inspect edge cases
Ignoring NULL and duplicate behaviorAggregations and filters return misleading resultsTest with NULL, duplicates, and empty sets explicitly
Using advanced syntax before checking schemaQueries fail due to missing columns/indexesVerify structure with DESCRIBE table_name; and adapt query design

Quick Reference

CREATE TABLE sample_performance_implication (id INT PRIMARY KEY, value PERFORMANCE_IMPLICATION);
DESCRIBE sample_table;
SHOW CREATE TABLE sample_table;

What's Next