Description
Describe the bug
We are noticing that Loki distributors keep trying to send requests to ingesters long after those ingesters have left. We can observe this as dropped packets reported by Cilium. We have ingester_client.pool_config.health_check_ingesters: true set, and the logs and metrics show that the ingester clients appear to be removed, but Cilium still shows the distributors sending some kind of request or traffic to old ingester IPs on port 9095.
If we restart the distributor deployment, this traffic disappears until ingester pods start to be recycled; then the issue manifests again.
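For readers unfamiliar with the setting, the option referenced above would sit in the Loki configuration roughly as follows. This is a minimal sketch, not the full config from this deployment: only the `health_check_ingesters: true` value comes from this report, and every other key a real deployment would carry is omitted.

```yaml
# Sketch of the relevant Loki config fragment (assumption: all other
# settings in this block are left at their defaults).
ingester_client:
  pool_config:
    # Distributors periodically health-check pooled ingester clients
    # and remove the ones that fail.
    health_check_ingesters: true
```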
To Reproduce
Steps to reproduce the behavior:
- Loki v3.5.3 running in distributed mode in Kubernetes
- Kill or cycle an ingester pod
Expected behavior
The distributor logs a warning that the ingester client is being removed, like:
level=warn ts=2025-12-07T19:42:16.989021087Z caller=pool.go:250 component=distributor msg="removing ingester failing healthcheck" addr=172.19.150.60:9095 reason="rpc error: code = DeadlineExceeded desc = context deadline exceeded while waiting for connections to become ready"
No further communication is made to the ingesters that have left.
Environment:
- Infrastructure: Kubernetes
- Deployment tool: Helm
Screenshots, Promtail config, or terminal output
If applicable, add any output to help explain your problem.