Debugging an ioctl Problem on OpenBSD
I was trying to use a V4L2
Ruby module
on my OpenBSD laptop but ran into a problem where sending the V4L2 ioctl
s from
this module would fail, while other V4L2 programs on OpenBSD worked fine.
Since I got a few questions recently about kernel development and debugging, I thought I'd write up how I finally tracked it down and fixed it. (Spoiler: it was not an OpenBSD problem.)
The Issue
After first getting the module to compile on OpenBSD, I ran some simple Ruby code to set the parameters and start fetching data from my camera:
camera = Video4Linux2::Camera.open("/dev/video1")
camera.image_width = 640
camera.image_height = 480
camera.framerate = 10/1r
camera.format = :YUYV
camera.start
But this failed with an ENOTTY
error from ioctl
:
ioctl(VIDIOC_S_FMT): Inappropriate ioctl for device
The Ruby module's C code that does this looks pretty straightforward:
...
struct v4l2_format fmt;
...
BZERO(fmt);
fmt.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
fmt.fmt.pix.width = wd;
fmt.fmt.pix.height = ht;
fmt.fmt.pix.pixelformat = fcc;
fmt.fmt.pix.field = V4L2_FIELD_INTERLACED;
err = xioctl(fd, VIDIOC_S_FMT, &fmt);
if (err < 0) {
perror("ioctl(VIDIOC_S_FMT)");
...
The module is using a BZERO
macro that just calls bzero
, and an xioctl
wrapper function that retries ioctl
when it fails with errno == EINTR
.
If I look at the code for OpenBSD's
video(1)
utility, it
sets the
format
pretty similarly:
bzero(&fmt, sizeof(struct v4l2_format));
fmt.type = d->buf_type;
fmt.fmt.pix.width = vid->width;
fmt.fmt.pix.height = vid->height;
fmt.fmt.pix.pixelformat = encs[vid->enc].id;
fmt.fmt.pix.field = V4L2_FIELD_ANY;
if (ioctl(d->fd, VIDIOC_S_FMT, &fmt) < 0) {
warn("VIDIOC_S_FMT");
return 0;
}
I tried changing the field to V4L2_FIELD_ANY
to match the OpenBSD utility, but
it still failed with ENOTTY
.
I usually turn to
ktrace
to diagnose syscall issues, but in this case it just showed that it was calling
ioctl
on a filehandle 14, sending it VIDIOC_S_FMT
, and passing some data (a
v4l2_format
struct).
90052 ruby27 CALL ioctl(14,VIDIOC_S_FMT,0x7f7ffffd47c0)
90052 ruby27 RET ioctl -1 errno 25 Inappropriate ioctl for device
Enabling Debugging in the Kernel Driver
The ioctl
is sent through /dev/video1
, which arrives at the
video(4)
driver's videoioctl
function.
It then does some error checking and
passes it along
to the USB-specific
uvideo(4)
driver's uvideo_s_fmt
function.
There are already some debugging printf
s in uvideo_s_fmt
, which are done
through a DPRINTF
macro which actually prints only when UVIDEO_DEBUG
is
enabled:
#ifdef UVIDEO_DEBUG
int uvideo_debug = 1;
#define DPRINTF(l, x...) do { if ((l) <= uvideo_debug) printf(x); } while (0)
#else
#define DPRINTF(l, x...)
#endif
I added a #define UVIDEO_DEBUG
above that block, recompiled the kernel and
booted to it.
Running the video
utility makes the kernel print out a line, but it prints
nothing from my Ruby code.
uvideo1: uvideo_s_fmt: requested width=640, height=480
Since the DPRINTF
that prints that is pretty high in the function, I'm
guessing it's not even reaching uvideo
, so let's go back to the video
driver.
DPRINTF(3, "video_ioctl(%zu, '%c', %zu)\n",
IOCPARM_LEN(cmd), (int) IOCGROUP(cmd), cmd & 0xff);
error = EOPNOTSUPP;
switch (cmd) {
...
case VIDIOC_S_FMT:
if (!(flags & FWRITE))
return (EACCES);
if (sc->hw_if->s_fmt)
error = (sc->hw_if->s_fmt)(sc->hw_hdl,
(struct v4l2_format *)data);
break;
Similarly, we can enable debugging printf
s in the video
driver by defining
VIDEO_DEBUG
at the top, compiling, and rebooting.
Now when I run my Ruby code, I get:
video_ioctl(208, 'V', 5)
Looking at /usr/src/sys/sys/videoio.h
, I can see that it does appear to be the
correct ioctl
in the group V
, number 5, and a proper size of that
v4l2_format
struct:
#define VIDIOC_S_FMT _IOWR('V', 5, struct v4l2_format)
The _IOWR
macro is a
series of macros
that build up an unsigned long
that corresponds to every possible ioctl
in
the kernel.
So the ioctl
is reaching the video
driver, appears to be recognized as
VIDIOC_S_FMT
, but is not actually reaching the uvideo
driver's s_fmt
function by way of the big switch
.
Is it even getting into the case VIDIOC_S_FMT
?
Let's add our own debugging printf
to show that we reached that, and the path
it's taking inside of the function.
When running the video
utility, it now prints:
video_ioctl(208, 'V', 5)
uvideo1: uvideo_s_fmt: requested width=640, height=480
uvideo1: uvideo_s_fmt: offered width=640, height=480
S_FMT: returned 0
videoioctl: final ret 0
But when running my Ruby code, it only prints the first line, indicating that
it's not even reaching the case VIDIOC_S_FMT
.
Since the switch
is on the entire cmd
, not just the number (5) or on the
group (V
), it must match all of the ioctl
parameters.
Let's print out the whole thing:
video_ioctl(208, 'V', 5) (3234878981 vs s_fmt 3234878981)
uvideo1: uvideo_s_fmt: requested width=640, height=480
uvideo1: uvideo_s_fmt: offered width=640, height=480
S_FMT: returned 0
videoioctl: final ret 0
And from my Ruby code:
video_ioctl(208, 'V', 5) (18446744072649463301 vs s_fmt 3234878981)
videoioctl: final ret 25
The Bug
At this point I went back to the Ruby C code.
How is it passing the ioctl
request parameter?
static int xioctl(int fh, int request, void *arg)
{
int r;
do {
r = ioctl(fh, request, arg);
} while (r == -1 && EINTR == errno);
return r;
}
Have you spotted it yet?
Let's look at the OpenBSD
ioctl
man page:
SYNOPSIS
#include <sys/ioctl.h>
int
ioctl(int d, unsigned long request, ...);
Of course, request
must be an unsigned long
, but this Ruby module's xioctl
wrapper is truncating it to a signed int
before passing it to the kernel.
A
quick fix
to change the xioctl
wrapper's definition and the problem is solved.
The ioctl
function on Linux
also requires
an unsigned long
, so this fix is not specific to OpenBSD but presumably
the V4L2 ioctl
values on Linux don't exceed the size of a 32-bit integer and
so this never caused issues for anyone on Linux.