Panic in inputs.zfs after upgrading to 1.36.3-1 #17952

@acranox

Description

Relevant telegraf.conf

# # Read metrics of ZFS from arcstats, zfetchstats, vdev_cache_stats, pools and datasets
# # This plugin ONLY supports Linux & FreeBSD
[[inputs.zfs]]
  interval="300s"
#   ## ZFS kstat path. Ignored on FreeBSD
#   ## If not specified, then default is:
#   # kstatPath = "/proc/spl/kstat/zfs"
#   
#   ## By default, telegraf gather all zfs stats
#   ## Override the stats list using the kstatMetrics array:
#   ## For FreeBSD, the default is:
#   # kstatMetrics = ["arcstats", "zfetchstats", "vdev_cache_stats"]
#   ## For Linux, the default is:
#   # kstatMetrics = ["abdstats", "arcstats", "dnodestats", "dbufcachestats",
#   #     "dmu_tx", "fm", "vdev_mirror_stats", "zfetchstats", "zil"]
  kstatMetrics = [""] 
##  
#   ## By default, don't gather zpool stats
#   # poolMetrics = false
  poolMetrics = true
#
#   ## By default, don't gather dataset stats
#   # datasetMetrics = false
  datasetMetrics = false

#   ## Report fields as the type defined by ZFS (Linux only)
#   ## This is disabled for backward compatibility but is STRONGLY RECOMMENDED
#   ## to be enabled to avoid overflows. This requires UINT support on the output
#   ## for most fields.
#   ## useNativeTypes = false
  useNativeTypes = true

Logs from Telegraf

Nov 05 17:25:00 hostname.example.com telegraf[839584]: 2025-11-05T22:25:00Z E! FATAL: [inputs.zfs] panicked: runtime error: index out of range [1] with length 0, Stack:
Nov 05 17:25:00 hostname.example.com telegraf[839584]: goroutine 265 [running]:                                                                                                               
Nov 05 17:25:00 hostname.example.com telegraf[839584]: github.com/influxdata/telegraf/agent.panicRecover(0xc0004cec00)    
Nov 05 17:25:00 hostname.example.com telegraf[839584]:         /go/src/github.com/influxdata/telegraf/agent/agent.go:1202 +0x6d        
Nov 05 17:25:00 hostname.example.com telegraf[839584]: panic({0xa0babc0?, 0xc0074d6240?})                                                                                                     
Nov 05 17:25:00 hostname.example.com telegraf[839584]:         /usr/local/go/src/runtime/panic.go:783 +0x132                                                                       
Nov 05 17:25:00 hostname.example.com telegraf[839584]: github.com/influxdata/telegraf/plugins/inputs/zfs.(*Zfs).processProcFile(0xc0074d6210?, {0x0?, 0x1?, 0xc0056e4c60?})                   
Nov 05 17:25:00 hostname.example.com telegraf[839584]:         /go/src/github.com/influxdata/telegraf/plugins/inputs/zfs/zfs_linux.go:252 +0xa13
Nov 05 17:25:00 hostname.example.com telegraf[839584]: github.com/influxdata/telegraf/plugins/inputs/zfs.(*Zfs).Gather(0xc00075b040, {0xb8c5d00, 0xc004908f20})
Nov 05 17:25:00 hostname.example.com telegraf[839584]:         /go/src/github.com/influxdata/telegraf/plugins/inputs/zfs/zfs_linux.go:78 +0x4e7
Nov 05 17:25:00 hostname.example.com telegraf[839584]: github.com/influxdata/telegraf/models.(*RunningInput).Gather(0xc0004cec00, {0xb8c5d00, 0xc004908f20})
Nov 05 17:25:00 hostname.example.com telegraf[839584]:         /go/src/github.com/influxdata/telegraf/models/running_input.go:263 +0x244
Nov 05 17:25:00 hostname.example.com telegraf[839584]: github.com/influxdata/telegraf/agent.(*Agent).gatherOnce.func1()
Nov 05 17:25:00 hostname.example.com telegraf[839584]:         /go/src/github.com/influxdata/telegraf/agent/agent.go:590 +0x58                                              
Nov 05 17:25:00 hostname.example.com telegraf[839584]: created by github.com/influxdata/telegraf/agent.(*Agent).gatherOnce in goroutine 146                                                   
Nov 05 17:25:00 hostname.example.com telegraf[839584]:         /go/src/github.com/influxdata/telegraf/agent/agent.go:588 +0xf7
Nov 05 17:25:00 hostname.example.com telegraf[839584]: goroutine 1 [sync.WaitGroup.Wait]:
Nov 05 17:25:00 hostname.example.com telegraf[839584]: sync.runtime_SemacquireWaitGroup(0xc00403e820?, 0x80?)                    
Nov 05 17:25:00 hostname.example.com telegraf[839584]:         /usr/local/go/src/runtime/sema.go:114 +0x2e
Nov 05 17:25:00 hostname.example.com telegraf[839584]: sync.(*WaitGroup).Wait(0xc00566a610)
Nov 05 17:25:00 hostname.example.com telegraf[839584]:         /usr/local/go/src/sync/waitgroup.go:206 +0x85
Nov 05 17:25:00 hostname.example.com telegraf[839584]: github.com/influxdata/telegraf/agent.(*Agent).Run(0xc001794010, {0xb877710, 0xc00075be50})
Nov 05 17:25:00 hostname.example.com telegraf[839584]:         /go/src/github.com/influxdata/telegraf/agent/agent.go:208 +0xb6a
Nov 05 17:25:00 hostname.example.com telegraf[839584]: main.(*Telegraf).runAgent(0xc001aa0000, {0xb877710, 0xc00075be50}, 0x0?)
Nov 05 17:25:00 hostname.example.com telegraf[839584]:         /go/src/github.com/influxdata/telegraf/cmd/telegraf/telegraf.go:565 +0x19e5
Nov 05 17:25:00 hostname.example.com telegraf[839584]: main.(*Telegraf).reloadLoop(0xc001aa0000) 
Nov 05 17:25:00 hostname.example.com telegraf[839584]:         /go/src/github.com/influxdata/telegraf/cmd/telegraf/telegraf.go:207 +0x26b
Nov 05 17:25:00 hostname.example.com telegraf[839584]: main.(*Telegraf).Run(0xc001aa0000)
Nov 05 17:25:00 hostname.example.com telegraf[839584]:         /go/src/github.com/influxdata/telegraf/cmd/telegraf/telegraf_posix.go:20 +0xb8
Nov 05 17:25:00 hostname.example.com telegraf[839584]: main.runApp.func1(0xc000b20800)
Nov 05 17:25:00 hostname.example.com telegraf[839584]:         /go/src/github.com/influxdata/telegraf/cmd/telegraf/main.go:261 +0xdc9
Nov 05 17:25:00 hostname.example.com telegraf[839584]: github.com/urfave/cli/v2.(*Command).Run(0xc001c9c580, 0xc000b20800, {0xc0001ec000, 0x5, 0x5})
Nov 05 17:25:00 hostname.example.com telegraf[839584]:         /go/pkg/
Nov 05 17:25:00 hostname.example.com telegraf[839584]: 2025-11-05T22:25:00Z E! PLEASE REPORT THIS PANIC ON GITHUB with stack trace, configuration, and OS information: https://github.com/influxdata/telegraf/issues/new/choose

System info

Telegraf 1.36.3-1, Debian 13.1

Docker

No response

Steps to reproduce

  1. Upgrade telegraf to 1.36.3-1

...

Expected behavior

Telegraf should not panic.

Actual behavior

Telegraf panics at every interval on which the zfs plugin is set to run, and then restarts.

Additional info

I upgraded Telegraf from 1.36.2-1 to 1.36.3-1, and it crashed after restarting.
After reading through recent changes to the ZFS plugin, I added the useNativeTypes = true option to the config file, but it had no effect; Telegraf still crashed.
Downgrading to 1.36.2-1 resolved the crashes.
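
For reference, the message in the log (index out of range [1] with length 0) is what the Go runtime reports when code reads element 1 of an empty slice. The sketch below is not the plugin's code, and I have not confirmed what zfs_linux.go:252 actually does. It only assumes, for illustration, that a line is split into fields and the second field is read without a length check. The helper names unguardedSecondField and secondField are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// unguardedSecondField is a hypothetical helper, not code from the zfs plugin.
// It indexes the split result without checking its length, then recovers the
// resulting panic so the runtime's message can be inspected as an error.
func unguardedSecondField(line string) (value string, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("%v", r)
		}
	}()
	fields := strings.Fields(line) // an empty line yields a slice of length 0
	return fields[1], nil          // panics when fewer than two fields exist
}

// secondField is the guarded variant (also hypothetical): it reports ok=false
// instead of panicking when the line has fewer than two fields.
func secondField(line string) (value string, ok bool) {
	fields := strings.Fields(line)
	if len(fields) < 2 {
		return "", false
	}
	return fields[1], true
}

func main() {
	_, err := unguardedSecondField("")
	fmt.Println("unguarded:", err)

	v, ok := secondField("")
	fmt.Printf("guarded: value=%q ok=%v\n", v, ok)
}
```

If the plugin does something along these lines, a length check before indexing would turn the crash into a skipped line or a returned error, but that is a guess until someone looks at the code around the line in the stack trace.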

Metadata

Labels

bug: unexpected problem or unintended behavior
