Debugging an ioctl Problem on OpenBSD

posted on feb 16th, 2022 with tags debugging and openbsd

I was trying to use a V4L2 Ruby module on my OpenBSD laptop but ran into a problem where sending the V4L2 ioctls from this module would fail, while other V4L2 programs on OpenBSD worked fine.

Since I got a few questions recently about kernel development and debugging, I thought I’d write up how I finally tracked it down and fixed it. (Spoiler: it was not an OpenBSD problem.)

Table of Contents

  1. The Issue
  2. Enabling Debugging in the Kernel Driver
  3. The Bug

The Issue

After first getting the module to compile on OpenBSD, I ran some simple Ruby code to set the parameters and start fetching data from my camera:

camera = Video4Linux2::Camera.open("/dev/video1")
camera.image_width  = 640
camera.image_height = 480
camera.framerate    = 10/1r
camera.format       = :YUYV
camera.start

But this failed with an ENOTTY error from ioctl:

ioctl(VIDIOC_S_FMT): Inappropriate ioctl for device

The Ruby module’s C code that does this looks pretty straightforward:

...
  struct v4l2_format fmt;
...

  BZERO(fmt);

  fmt.type                = V4L2_BUF_TYPE_VIDEO_CAPTURE;
  fmt.fmt.pix.width       = wd;
  fmt.fmt.pix.height      = ht;
  fmt.fmt.pix.pixelformat = fcc;
  fmt.fmt.pix.field       = V4L2_FIELD_INTERLACED;

  err = xioctl(fd, VIDIOC_S_FMT, &fmt);
  if (err < 0) {
	perror("ioctl(VIDIOC_S_FMT)");
	...

The module is using a BZERO macro that just calls bzero, and an xioctl wrapper function that retries ioctl when it fails with errno == EINTR.

If I look at the code for OpenBSD’s video(1) utility, it sets the format pretty similarly:

	bzero(&fmt, sizeof(struct v4l2_format));
	fmt.type = d->buf_type;
	fmt.fmt.pix.width = vid->width;
	fmt.fmt.pix.height = vid->height;
	fmt.fmt.pix.pixelformat = encs[vid->enc].id;
	fmt.fmt.pix.field = V4L2_FIELD_ANY;
	if (ioctl(d->fd, VIDIOC_S_FMT, &fmt) < 0) {
		warn("VIDIOC_S_FMT");
		return 0;
	}

I tried changing the field to V4L2_FIELD_ANY to match the OpenBSD utility, but it still failed with ENOTTY.

I usually turn to ktrace to diagnose syscall issues, but in this case it just showed that it was calling ioctl on a filehandle 14, sending it VIDIOC_S_FMT, and passing some data (a v4l2_format struct).

90052 ruby27   CALL  ioctl(14,VIDIOC_S_FMT,0x7f7ffffd47c0)
90052 ruby27   RET   ioctl -1 errno 25 Inappropriate ioctl for device

Enabling Debugging in the Kernel Driver

The ioctl is sent through /dev/video1, which arrives at the video(4) driver’s videoioctl function. It then does some error checking and passes it along to the USB-specific uvideo(4) driver’s uvideo_s_fmt function.

There are already some debugging printfs in uvideo_s_fmt, which are done through a DPRINTF macro which actually prints only when UVIDEO_DEBUG is enabled:

#ifdef UVIDEO_DEBUG
int uvideo_debug = 1;
#define DPRINTF(l, x...) do { if ((l) <= uvideo_debug) printf(x); } while (0)
#else
#define DPRINTF(l, x...)
#endif

I added a #define UVIDEO_DEBUG above that block, recompiled the kernel and booted to it. Running the video utility makes the kernel print out a line, but it prints nothing from my Ruby code.

uvideo1: uvideo_s_fmt: requested width=640, height=480

Since the DPRINTF that prints that is pretty high in the function, I’m guessing it’s not even reaching uvideo, so let’s go back to the video driver.

	DPRINTF(3, "video_ioctl(%zu, '%c', %zu)\n",
	    IOCPARM_LEN(cmd), (int) IOCGROUP(cmd), cmd & 0xff);

	error = EOPNOTSUPP;
	switch (cmd) {
	...
	case VIDIOC_S_FMT:
		if (!(flags & FWRITE))
			return (EACCES);
		if (sc->hw_if->s_fmt)
			error = (sc->hw_if->s_fmt)(sc->hw_hdl,
			    (struct v4l2_format *)data);
		break;

Similarly, we can enable debugging printfs in the video driver by defining VIDEO_DEBUG at the top, compiling, and rebooting. Now when I run my Ruby code, I get:

video_ioctl(208, 'V', 5)

Looking at /usr/src/sys/sys/videoio.h, I can see that it does appear to be the correct ioctl in the group V, number 5, and a proper size of that v4l2_format struct:

#define VIDIOC_S_FMT            _IOWR('V',  5, struct v4l2_format)

The _IOWR macro is a series of macros that build up an unsigned long that corresponds to every possible ioctl in the kernel.

So the ioctl is reaching the video driver, appears to be recognized as VIDIOC_S_FMT, but is not actually reaching the uvideo driver’s s_fmt function by way of the big switch. Is it even getting into the case VIDIOC_S_FMT? Let’s add our own debugging printf to show that we reached that, and the path it’s taking inside of the function.

When running the video utility, it now prints:

video_ioctl(208, 'V', 5)
uvideo1: uvideo_s_fmt: requested width=640, height=480
uvideo1: uvideo_s_fmt: offered width=640, height=480
S_FMT: returned 0
videoioctl: final ret 0

But when running my Ruby code, it only prints the first line, indicating that it’s not even reaching the case VIDIOC_S_FMT. Since the switch is on the entire cmd, not just the number (5) or on the group (V), it must match all of the ioctl parameters. Let’s print out the whole thing:

video_ioctl(208, 'V', 5) (3234878981 vs s_fmt 3234878981)
uvideo1: uvideo_s_fmt: requested width=640, height=480
uvideo1: uvideo_s_fmt: offered width=640, height=480
S_FMT: returned 0
videoioctl: final ret 0

And from my Ruby code:

video_ioctl(208, 'V', 5) (18446744072649463301 vs s_fmt 3234878981)
videoioctl: final ret 25

The Bug

At this point I went back to the Ruby C code. How is it passing the ioctl request parameter?

static int xioctl(int fh, int request, void *arg)
{
  int r;

  do {
	r = ioctl(fh, request, arg);
  } while (r == -1 && EINTR == errno);

  return r;
}

Have you spotted it yet?

Let’s look at the OpenBSD ioctl man page:

SYNOPSIS
    #include <sys/ioctl.h>

    int
    ioctl(int d, unsigned long request, ...);

Of course, request must be an unsigned long, but this Ruby module’s xioctl wrapper is truncating it to a signed int before passing it to the kernel.

A quick fix to change the xioctl wrapper’s definition and the problem is solved.

The ioctl function on Linux also requires an unsigned long, so this fix is not specific to OpenBSD but presumably the V4L2 ioctl values on Linux don’t exceed the size of a 32-bit integer and so this never caused issues for anyone on Linux.


Questions or comments? E-mail me or tweet your reply @jcs.