Data sharding:

  • A way to scale a database by:
    • splitting a large dataset into smaller pieces (shards), and
    • storing those pieces on different database servers/nodes.
  • Each shard holds only a subset of the rows (horizontal split).
  • A shard key (e.g., user_id) decides which shard a record goes to.
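
The routing step can be sketched roughly as follows. This is a minimal illustration, assuming hash-based routing (one common strategy; range-based and directory-based routing also exist). The shard count, the `shard_for` helper, and the sample rows are all hypothetical, and the "shards" are plain in-memory lists rather than real database nodes.

```python
import hashlib

NUM_SHARDS = 4  # hypothetical number of shards


def shard_for(user_id: int, num_shards: int = NUM_SHARDS) -> int:
    """Map a shard key to a shard index using a stable hash."""
    # hashlib gives the same result on every run and every machine,
    # unlike Python's built-in hash() for strings.
    digest = hashlib.sha256(str(user_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Stand-ins for separate database servers: one list of rows per shard.
shards = {i: [] for i in range(NUM_SHARDS)}

# Hypothetical rows; each one lands on exactly one shard (horizontal split).
rows = [{"user_id": uid, "name": f"user{uid}"} for uid in range(1, 11)]
for row in rows:
    shards[shard_for(row["user_id"])].append(row)

for index, stored in shards.items():
    print(index, [r["user_id"] for r in stored])
```

Note that a plain modulo scheme like this remaps most keys when the shard count changes; consistent hashing is the usual way to soften that.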

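Cardinality = number of distinct values in a column/attribute.
  • Relevant to sharding: a shard key cannot spread rows over more shards than it has distinct values.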
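
As a small sketch of the definition, cardinality can be computed by collecting a column's values into a set. The `cardinality` helper and the sample rows below are hypothetical and only meant to illustrate the idea.

```python
def cardinality(rows, column):
    """Return the number of distinct values found in the given column."""
    return len({row[column] for row in rows})


# Hypothetical sample data.
rows = [
    {"user_id": 1, "country": "US"},
    {"user_id": 2, "country": "US"},
    {"user_id": 3, "country": "DE"},
    {"user_id": 4, "country": "US"},
]

print(cardinality(rows, "user_id"))  # 4 distinct values
print(cardinality(rows, "country"))  # 2 distinct values
```

In this sample, user_id has higher cardinality than country, so it offers more distinct values to distribute across shards.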