Kafka
- Event streaming platform: a distributed commit log used for event streaming
- Producers append events to topic partitions
- Consumers read them as streams, tracking position via offsets
Producer (writes) → Kafka (stores & replicates) → Consumer (reads, processes, commits offsets)
Concepts
- Topic
- A topic is the name of a category of messages, like a channel.
- e.g., project-events
- Message/Record
- A single event record within a topic.
- e.g., a PROJECT_APPROVED event whose payload contains projectId, budget, approvedBy…
- Partition
- A topic is split into multiple partitions (shards)
- Parallelism: multiple partitions can be consumed by multiple consumers in parallel
- Ordering: Kafka only guarantees ordering within a single partition
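A minimal sketch (plain Python, no Kafka client) of why keyed messages keep per-key ordering: the partitioner hashes the record key, so all events with the same key land in the same partition. Kafka's real default partitioner uses murmur2; the hash below is just illustrative.

```python
# Toy model of key-based partitioning (illustrative hash, not murmur2).
NUM_PARTITIONS = 3

def partition_for(key: str) -> int:
    # Same key -> same partition -> per-key ordering is preserved.
    return sum(key.encode()) % NUM_PARTITIONS

events = ["p1:created", "p2:created", "p1:approved", "p1:closed"]
partitions = {p: [] for p in range(NUM_PARTITIONS)}
for e in events:
    key = e.split(":")[0]  # partition by projectId
    partitions[partition_for(key)].append(e)

# All p1 events sit in one partition, in the order they were sent.
print(partitions[partition_for("p1")])
```

Events for different keys (p1 vs p2) may land in different partitions and be processed in any relative order; only the per-key order is guaranteed.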
- Offset
- Offset = the sequence number of a message within a partition (starts at 0 and increases monotonically).
- Broker
- A single server in a Kafka cluster is called a broker.
- Replica
- For each partition:
  - Leader: the primary replica; serves writes (producer) and reads (consumer)
  - Follower: a backup replica that copies the leader's data
- Each partition can have multiple replicas spread across different brokers, so data survives a machine failure.
- ISR (In-Sync Replicas) ✅ core
- ISR = the set of replicas that are keeping up with the leader.
- Only followers in the ISR count as "in sync".
- With acks=all, Kafka waits until the ISR has the write before reporting success (better durability).
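A toy sketch of the acks=all decision (not a real broker API, just the invariant): the leader acknowledges a write only once the number of in-sync replicas holding the record meets min.insync.replicas.

```python
# Illustrative model of the acks=all durability check.
def can_ack(isr_with_record: int, min_insync_replicas: int) -> bool:
    # Leader replies OK only when enough in-sync replicas have the write.
    return isr_with_record >= min_insync_replicas

# 3 replicas total, min.insync.replicas = 2:
print(can_ack(isr_with_record=2, min_insync_replicas=2))  # True -> ack
print(can_ack(isr_with_record=1, min_insync_replicas=2))  # False -> retriable error
```

This is why acks=all combined with min.insync.replicas=2 tolerates one broker failure without losing acknowledged writes.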
- Producer / Consumer
- Producer: the party that sends messages (e.g., ProjectService publishing events)
- Consumer: the party that receives messages (e.g., ContractService subscribing to events to create contracts)
- Consumer Group ✅ core
- Kafka's mechanism for having a group of machines cooperate on consumption.
- Rules:
  - Within the same consumer group, a partition is assigned to only one consumer at a time
  - This enables parallel consumption without duplicate processing (within the group)
- Example: run 3 instances of ContractService in the same group; they split the partitions and each reads a different subset.
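A rough simulation of the "one partition per consumer within a group" rule, using a simple round-robin assignment. (Kafka's actual assignors are range/round-robin/sticky; the consumer names here are made up.)

```python
def assign(partitions: list[int], consumers: list[str]) -> dict[str, list[int]]:
    """Round-robin partitions across the group's consumers."""
    assignment: dict[str, list[int]] = {c: [] for c in consumers}
    for i, p in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(p)
    return assignment

# 6 partitions, 3 ContractService instances in one group:
a = assign(list(range(6)), ["contract-1", "contract-2", "contract-3"])
print(a)

# Invariant: every partition is owned by exactly one consumer in the group.
all_assigned = sorted(p for ps in a.values() for p in ps)
print(all_assigned)  # [0, 1, 2, 3, 4, 5]
```

When a consumer joins or leaves, recomputing this assignment over the new member list is essentially what a rebalance does.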
- Rebalance
- When consumers are added, removed, crash, or the group scales, Kafka reassigns partitions among the group's consumers; this process is called a rebalance.
- Delivery (At-most-once / At-least-once / Exactly-once)
- At-most-once: may lose messages, never duplicates (rarely used)
- At-least-once: no loss, but duplicates are possible (most common)
- Exactly-once: aims for no loss and no duplicates (more complex)
Delivery semantics
- We first choose the desired delivery semantics (e.g., at-least-once).
- Then we implement it by combining
  - producer settings (acks/retries/idempotence) and
  - consumer offset commit strategy (commit after processing)
  - plus idempotent downstream writes to tolerate duplicates.
Delivery models
- At-most-once (commit → process)
- processed 0 or 1 time (no duplicates), but can lose messages.
- How it happens: consumer commits offset before processing (or drops on error).
- Use when: losing an event is acceptable (rare), or you prioritize latency over correctness.
- At-least-once (process → commit) - most common
- processed 1 or more times (no loss), but duplicates are possible.
- How it happens: consumer processes first, then commits offset; if it crashes after processing but before commit → it reprocesses.
- Use when: correctness matters; handle duplicates via idempotency/dedup.
- Exactly-once (process + commit + produce)
- processed exactly 1 time (no loss, no duplicates) from the system’s point of view.
- In Kafka: achievable with idempotent producer + transactions and careful consumer/processing design (often summarized as “EOS”).
- Use when: duplicates are very costly and you can afford complexity (payments-like workflows, some stream processing).
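The difference between the two commit orders can be replayed with a toy consumer that crashes on one message and then resumes from the last committed offset (all names here are illustrative, no real client involved): commit-first loses the in-flight message, process-first replays it.

```python
# Toy replay of at-most-once vs at-least-once commit ordering.
messages = ["m0", "m1", "m2"]

def run(commit_first: bool, crash_at: int) -> list[str]:
    """Consume all messages; crash once at `crash_at`, resume from committed offset."""
    committed, processed, crashed = 0, [], False
    while committed < len(messages):
        offset = committed
        if commit_first:
            committed = offset + 1            # at-most-once: commit before processing
            if not crashed and offset == crash_at:
                crashed = True                # crash before processing -> message lost
                continue
            processed.append(messages[offset])
        else:
            processed.append(messages[offset])  # at-least-once: process first
            if not crashed and offset == crash_at:
                crashed = True                # crash after processing, before commit
                continue                      # -> offset not committed, will replay
            committed = offset + 1
    return processed

print(run(commit_first=True, crash_at=1))   # ['m0', 'm2']        -> m1 lost
print(run(commit_first=False, crash_at=1))  # ['m0', 'm1', 'm1', 'm2'] -> m1 duplicated
```

The duplicate in the second run is exactly what idempotent downstream writes are meant to absorb.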
Producer side and Consumer side
- Producer side:
  - acks
    - acks=0 (fastest, highest risk of loss)
      - Producer sends the message
      - Producer does not wait for any response and treats it as success
    - acks=1 (leader only, good latency)
      - Producer sends the message to the partition leader
      - Leader appends the record to its log
      - Leader replies OK immediately
    - acks=all / acks=-1 (strongest durability, higher latency)
      - Producer sends the message to the leader
      - Leader appends to its log
      - Leader waits for acknowledgements from in-sync replicas (ISR)
      - Once the number of in-sync replicas that have written the record meets min.insync.replicas, the leader replies OK
  - retries (retry on failure)
  - idempotent producer: "enable.idempotence": True
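Putting the producer-side knobs together, a durability-focused configuration might look like the sketch below. The property names follow the standard Kafka client convention; the bootstrap address is a placeholder, and the retries value is just "retry a lot", mirroring the common default for idempotent producers.

```python
# Durability-focused producer settings (sketch; "localhost:9092" is a placeholder).
producer_config = {
    "bootstrap.servers": "localhost:9092",
    "acks": "all",               # wait for the ISR (per min.insync.replicas)
    "retries": 2147483647,       # retry transient network/broker failures
    "enable.idempotence": True,  # retries won't write duplicates into the log
}
print(producer_config["acks"])  # all
```

With enable.idempotence on, a retried send that already landed is deduplicated by the broker, so high retries is safe.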
- Consumer side:
  - poll → process → commit offset (order depends on the delivery model)
- At-least-once
  - Producer:
    - Set acks=all to improve durability
    - Enable retries to survive transient network/broker failures (avoid dropping messages)
    - Enable the idempotent producer
  - Consumer:
    - Commit offsets only after processing succeeds (core)
    - Make downstream writes idempotent
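The consumer half of this recipe, sketched without a real Kafka client (record values and the store are made up for illustration): process first, commit after, and key the downstream write by (partition, offset) so a replay after a crash is a no-op.

```python
# At-least-once consumer loop, simulated: process -> commit,
# with an idempotent downstream write keyed by (partition, offset).
store: dict[tuple[int, int], str] = {}   # stand-in for a downstream database
committed_offset = 0

def handle(partition: int, offset: int, value: str) -> None:
    # Idempotent write: replaying the same record changes nothing.
    store.setdefault((partition, offset), value)

records = [(0, 0, "PROJECT_APPROVED:p1"), (0, 1, "PROJECT_APPROVED:p2")]
for partition, offset, value in records:
    handle(partition, offset, value)     # 1. process
    committed_offset = offset + 1        # 2. commit only after success

# Simulate a redelivery of offset 1 after a crash-before-commit:
handle(0, 1, "PROJECT_APPROVED:p2")
print(len(store), committed_offset)      # still 2 rows, offset = 2
```

The setdefault-style write is the simplest form of dedup; in a real system this is typically a unique constraint or upsert on an idempotency key.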