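As a result, we regularly lose messages, which is bad for downstream consumers because we cannot simply regenerate the incoming traffic.
The error messages are: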
[Producer clientId=producer-5] Received invalid metadata error in produce request on partition topic-21 due to org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition.. Going to request metadata update now
[Producer clientId=producer-5] Got error produce response with correlation id 974706 on topic-partition topic-21, retrying (8 attempts left). Error: NOT_LEADER_FOR_PARTITION
[Producer clientId=producer-5] Got error produce response with correlation id 974707 on topic-partition topic-21, retrying (1 attempts left). Error: NOT_LEADER_FOR_PARTITION
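Is there a known way to avoid this? Should we go back to the default setting for the maximum number of retries? Why does the producer keep sending to the same broker even though that broker responds with NOT_LEADER_FOR_PARTITION?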
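To make the retry question concrete, here is a minimal sketch of the producer settings involved. The values are illustrative assumptions, not our production configuration: the broker address is a placeholder, the class and method names are made up for this post, and the serializers are just the stock string ones. As far as I know, `retries` defaults to `Integer.MAX_VALUE` in clients from 2.1 on, with `delivery.timeout.ms` (default 120000) acting as the overall bound, so "going back to the default" would look roughly like this:

```java
import java.util.Properties;

public class ProducerRetrySketch {

    // Builds producer properties that rely on delivery.timeout.ms rather than
    // a small fixed retry count. All values here are assumptions for illustration.
    public static Properties retryFriendlyProps() {
        Properties props = new Properties();
        // Placeholder address, not a real cluster.
        props.setProperty("bootstrap.servers", "localhost:9092");
        props.setProperty("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.setProperty("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        // Wait for all in-sync replicas before a send counts as successful.
        props.setProperty("acks", "all");
        // Effectively unlimited retries; the delivery timeout below is the real limit.
        props.setProperty("retries", String.valueOf(Integer.MAX_VALUE));
        // Upper bound on the total time a record may spend being sent and retried.
        props.setProperty("delivery.timeout.ms", "120000");
        // Pause between retries so a metadata refresh has a chance to complete
        // (500 ms is an arbitrary example value, not a recommendation).
        props.setProperty("retry.backoff.ms", "500");
        return props;
    }

    public static void main(String[] args) {
        // Print the resulting settings; in a real application these would be
        // passed to new KafkaProducer<>(props) from the kafka-clients library.
        retryFriendlyProps().forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

With a small fixed retry count and a short backoff, all attempts can be used up before the producer has learned the new partition leader, which might be what the log lines above show. We have not confirmed that this is the cause.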
Any hints are welcome.
Edit: We just noticed that the broker metric kafka_network_requestmetrics_responsequeuetimems rose around that time, but its maximum was only about 2.5 seconds.