For two years I've been driving myself crazy trying to figure out the source of a driver problem on OpenBSD: interrupts never arrived for certain touchpad devices. While debugging an unrelated issue over the weekend, I finally solved it.
It's been a long journey and it's a technical tale, but here it is.
In 2015, I purchased a Samsung ATIV Book 9 laptop. Its touchpad was different than most other laptops used with OpenBSD previously, and would be a model for most touchpads to come after it: a Windows Precision Touchpad connected over I2C.
Most other laptops had a touchpad connected through the 8042 (PS/2) controller along with the keyboard, emulating the historical design of PCs having two PS/2 ports for an external mouse and keyboard. These touchpads from Synaptics, Elan, and ALPS each spoke a proprietary protocol and were rather bandwidth-constrained in terms of how much finger data they could communicate back to the OS, which became a problem when multi-touch gestures arrived in Windows.
For these devices, Microsoft produced its Windows Precision Touchpad specification and would handle the driver side of things, allowing vendors to ship touchpads that shared a common driver and worked in Windows out of the box, and allowing Microsoft to provide a better touchpad experience with gestures and palm rejection (though still not one that rivals what Apple does with the Broadcom touchpads on their MacBooks).
In 2016, I finished support for these touchpads, which required an I2C controller driver, a HID-over-I2C driver, a basic I2C-HID mouse driver, then a full transport-agnostic HID driver implementing the Windows Precision Touchpad spec, and finally an I2C touchpad driver to interface between the HID-over-I2C driver and that transport-agnostic HID driver.
Shortly after, some laptops started showing up with their keyboard connected over I2C as well, requiring an I2C keyboard driver too. In 2018, I wrote a driver to support USB-connected Windows Precision devices in use on some laptops.
While all of this worked fairly well and somewhat modernized OpenBSD's non-ThinkPad laptop support (many ThinkPads up until some 2019 models still used a PS/2-connected touchpad and TrackPoint), there was one aspect that didn't work: on Broadwell chipsets, the touchpad would not wake up after an S3 suspend/resume.
Later in 2016, I purchased a Chromebook Pixel and got OpenBSD running on it. The Pixel also had its touchpad and touchscreen connected over I2C, though being a Chromebook not running Windows, its touchpad did not conform to the Windows Precision Touchpad standard, which meant it needed a new driver.
The Chromebook Pixel was also a Broadwell chipset and this new driver had the same issue: communication with it failed after an S3 resume. Two different vendors of touchpads and two different drivers, but the same problem. The I2C controller (dwiic) worked fine after resume, but any time it tried to communicate with the touchpad device, everything would just time out.
After some months of debugging on Linux, I tracked down the fix to a single write to a register on the I2C controller device, found in Linux's Intel Low Power Subsystem (LPSS) driver for power gating.
Intel's LPSS is used for their I2C and SPI devices to limit power usage by quickly shutting off components when idle. The way this is implemented in Linux is kind of confusing, and even now looking at their main LPSS driver, I can't see where the 0x800 register comes from that OpenBSD's driver writes to the I2C controller in order to power up the I2C slave device. That Linux driver registers a clock (clk) device and the clock code handles the register writing itself rather than calling back to a function in the LPSS driver, which is why it took me so long to find it in 2016.
In 2017, I purchased a Huawei Matebook X with a Kaby Lake chipset which Intel refers to as the 100 Series. Intel's I2C controllers on this chipset now show up as actual PCI devices, which required changing the dwiic driver to handle both PCI and ACPI attachments.
The dwiic driver fetches ACPI resource information for I2C slave devices that are connected to it, like the touchpad. That resource includes the I2C slave address and the interrupt pin that it is connected to on the IOAPIC. ihidev then attaches and uses the standard methods in the OpenBSD kernel to ask the ioapic device to register a callback to ihidev whenever that pin receives an interrupt.
Despite all of that being set up with the proper address and pin (which matched what Linux did), the IOAPIC would never receive an interrupt on that pin and ihidev would never have its interrupt handler called when the touchpad was touched. It was being properly powered up and would respond to I2C HID commands, and if polled after touching, there was finger data available to read. It just never generated an interrupt.
As with the S3 resume issue, I spent months trying to figure out what was happening with these missing interrupts. I attended the OpenBSD t2k17 Hackathon and spent nearly a week straight in a room full of OpenBSD developers as I tried tearing apart the Linux I2C, LPSS, IOAPIC, and ACPI code with no luck.
As I heard reports from other users and developers with Intel 100 Series machines with the same interrupt problem, I started to assume it was specific to these newer chipsets. I went digging through Intel documentation and I2C implementations in other OSes (such as Coreboot and Google's Zircon kernel) to find anything related to this specific hardware.
Growing weary and admitting defeat, I added an adaptive polling mechanism to ihidev so the kernel would poll the device every 200ms until there was touch data available, then poll at 10ms until shortly after it stopped receiving data. This was enough to get touchpads working on these new laptops, but it was slow and wasted a bit of CPU time and battery power.
Unfortunately that "temporary" polling mechanism had to be used for the next two years as no one could fix (or was not interested in fixing) this problem.
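The adaptive polling idea can be sketched roughly as follows. This is only an illustration of the scheme described above, not the code in ihidev: the class name, the `on_poll` method, and the number of idle polls before falling back to the slow rate are all assumptions made for the example. Only the 200ms and 10ms intervals come from the text.

```python
SLOW_MS = 200  # idle polling interval (from the text)
FAST_MS = 10   # active polling interval (from the text)
IDLE_POLLS_BEFORE_SLOW = 10  # assumed value; "shortly after" is not quantified


class AdaptivePoller:
    """Hypothetical model of a slow/fast polling state machine."""

    def __init__(self):
        self.interval_ms = SLOW_MS
        self.idle_polls = 0

    def on_poll(self, got_data):
        """Record one poll result and return the delay until the next poll."""
        if got_data:
            # Touch data arrived: switch to (or stay in) fast polling.
            self.interval_ms = FAST_MS
            self.idle_polls = 0
        elif self.interval_ms == FAST_MS:
            # Fast polling but nothing to read: count down to slow mode.
            self.idle_polls += 1
            if self.idle_polls >= IDLE_POLLS_BEFORE_SLOW:
                self.interval_ms = SLOW_MS
                self.idle_polls = 0
        return self.interval_ms
```

Even in this toy form the cost is visible: an idle device is still woken five times a second, and the first touch can wait up to 200ms before it is noticed.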
ACPI Node Walking
A few weeks ago, I purchased the 7th generation ThinkPad X1 Carbon. Getting OpenBSD installed and working on it has been quite a feat, as there were multiple bugs to fix. The first showstopper was a kernel panic shortly into booting the installer due to an AML problem with OpenBSD's AML parser reporting "Not Integer" when executing a particular method.
For some quick background: Linux and most smaller operating systems use an ACPI interpreter called ACPICA which is written and maintained by Intel. OpenBSD and Windows each use their own custom-developed ACPI stacks. Presumably Microsoft has many engineers available to maintain their ACPI implementation (since they also wrote and maintain the official ACPI specification with Intel) and every other OS just re-imports the ACPICA code from Intel when it's updated. Unfortunately on OpenBSD, this means we have to fix bugs and implement new functionality required by the ACPI spec (now at version 6.3) when we encounter them on new hardware.
The "Not Integer" panic on the X1 was caused by this AML in an _INI method (ironically, on its touchpad device):
Method (_INI, 0, NotSerialized)  // _INI: Initialize
{
    GPDI = 0x64
    If ((OSYS < 0x07DC))
    {
        SRXO (GPDI, One)
    }
    INT1 = GNUM (GPDI)
    INT2 = INUM (GPDI)
    If ((TPDT == 0x05))
    {
        If ((^^^LPCB.NFCD == Zero))
        {
            _HID = "SYNA8005"
        }
        Else
        {
            _HID = "SYNA8004"
        }
        ADBG (Concatenate ("TPD0 _HID:", ToHexString (_HID)))
        HID2 = 0x20
        BADR = 0x2C
        ADBG (Concatenate ("TPD0 _INI:BADR=", ToHexString (BADR)))
    }
}
When ACPI is being initialized in ACPICA or OpenBSD's ACPI code, it walks the tree looking for any methods named _INI and executes them. This is how certain variables get initialized, interrupts get set up, and anything else the hardware vendor needs to do.
At this point you may be thinking: maybe there's just an _INI function that OpenBSD is not executing that is needed to fix the touchpad interrupt problem. I checked this a long time ago and listed out all of the _INI method calls that OpenBSD did and compared it to Linux. The results were similar enough that I didn't investigate further.
The ToHexString operator in that _INI function is one built in to ACPI and is supposed to convert string or integer data into a string of hexadecimal digits. The way it was implemented in OpenBSD's AML parser 11 years ago was to only accept integer arguments, so anything passed to it that wasn't an integer (such as the _HID string above) would cause a "Not Integer" panic. After reviewing the ACPI specification, the fix was just to allow passing other types to the conversion functions since the underlying OpenBSD implementations already handled converting non-integer types.
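As a rough illustration of the type handling involved, here is a minimal sketch of a ToHexString-like conversion that accepts more than integers. It is not OpenBSD's or ACPICA's implementation. The function name is made up, and the exact output format (uppercase digits, no "0x" prefix, comma-separated buffer bytes) is an assumption for the example; the point is only that a string argument is passed through rather than rejected.

```python
def to_hex_string(data):
    """Hypothetical model of an AML ToHexString-style conversion."""
    if isinstance(data, str):
        # Already a string (e.g. a _HID such as "SYNA8005"): nothing to do.
        return data
    if isinstance(data, int):
        # Integer: render as hexadecimal digits (format is an assumption).
        return format(data, "X")
    if isinstance(data, (bytes, bytearray)):
        # Buffer: render each byte as two hex digits, comma-separated.
        return ",".join(format(b, "02X") for b in data)
    raise TypeError("unsupported operand type: " + type(data).__name__)
```

With the integer-only check, the first branch is the one that was missing, which is why passing `_HID` was fatal while passing `BADR` was fine.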
However, while debugging that crash, I noticed something strange. The first conditional in that _INI method checks against OSYS, which is a global variable that most DSDTs compute according to which version of whichever operating system it's running on. There's a long history related to _OSI that I won't go into, but basically every OS claims to be Windows now, except on Apple hardware, where we all claim to be Darwin, because it's easier for other OSes to behave like Windows and macOS than for the hardware vendors to update their BIOS code when a driver issue in Linux is fixed.
--== Eval Method [\_SB_.PCI0.I2C1.TPD0._INI, 0 args] to t ==--
===== Stack \_SB_.PCI0.I2C1.TPD0._INI:Method
parsename: \GPDI 5
write 00 6fb1847a 0020 [\GNVS]
parsename: \OSYS 5
read 00 6fb18000 0010 [\GNVS]
aml_evalexpr: LLess 0 7dc = ffffffffffffffff
quick: 203a8 [LLess] alloc return integer = 0xffffffffffffffff
parse-if @ 203a6
parsename: \_SB_.SRXO 8
If ((OSYS < 0x07DC)) was being turned into a conditional LLess 0 7dc, but why was OSYS zero?
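To make the log line concrete, here is a small sketch of how that comparison evaluates. It assumes 64-bit AML integers, where a true comparison yields all ones (the ffffffffffffffff seen in the log) and a false one yields zero. The helper names `lless` and `calls_srxo` are made up for this example.

```python
ONES = 0xFFFFFFFFFFFFFFFF  # AML "true" with 64-bit integers, as in the log
ZERO = 0


def lless(a, b):
    """Model of the LLess operator: Ones if a < b, otherwise Zero."""
    return ONES if a < b else ZERO


def calls_srxo(osys):
    """Whether the If ((OSYS < 0x07DC)) branch in the touchpad _INI is taken."""
    return lless(osys, 0x07DC) != ZERO
```

With `osys` still at 0 the branch is taken and SRXO is called; with the 0x07DF value that appears elsewhere in the DSDT it would not be.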
Looking elsewhere in the DSDT, OSYS is initialized like so:
Method (_INI, 0, Serialized)  // _INI: Initialize
{
    TBPE = One
    OSYS = 0x03E8
    If (CondRefOf (\_OSI))
    {
        If (_OSI ("Windows 2001"))
        {
            WNTF = One
            WXPF = One
            WSPV = Zero
            OSYS = 0x07D1
        }
        If (_OSI ("Windows 2001 SP1"))
        {
            WSPV = One
            OSYS = 0x07D1
        }
        If (_OSI ("Windows 2015"))
        {
            WIN8 = One
            OSYS = 0x07DF
        }
    }
}
Basically for each newer version of Windows that the system reports it is compatible with (OpenBSD reports up to Windows 2015), OSYS is updated to a higher value. The OSYS variable is then used in various other DSDT methods related to setting up devices, basically to allow backwards compatibility if the machine is being used with older versions of Windows that may not be able to deal with a device set up in one particular way vs. another.
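The same logic can be modeled in a few lines. This sketch only covers the three _OSI strings visible in the excerpt above, and the function name and the `supported` parameter are assumptions for the example: `supported` stands for the set of interface strings the running OS answers "yes" to.

```python
# (_OSI string, resulting OSYS value), in the order the DSDT checks them.
# Only the entries shown in the excerpt are listed here.
OSI_LEVELS = [
    ("Windows 2001", 0x07D1),
    ("Windows 2001 SP1", 0x07D1),
    ("Windows 2015", 0x07DF),
]


def compute_osys(supported):
    """Return the OSYS value for a given set of supported _OSI strings."""
    osys = 0x03E8  # default before any _OSI query succeeds
    for name, value in OSI_LEVELS:
        if name in supported:
            osys = value  # later (newer) matches overwrite earlier ones
    return osys
```

The important detail for what follows is that OSYS never holds zero once this method has run: even with no _OSI matches it ends up as 0x03E8.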
If OSYS is being initialized in \_SB_.PCI0._INI, why is it zero when doing the conditional in the touchpad's _INI method?
Well, as it turns out, the way that OpenBSD's ACPI stack was walking the entire DSDT tree looking for _INI methods was slightly different than ACPICA's.
On OpenBSD, nodes were being walked in this order:
But in ACPICA, they were walked in this order:
That slight change in ordering was the entire cause of the interrupt problem.
Earlier I wrote that I checked the list of _INI calls in Linux vs. OpenBSD, but I didn't realize the order of them was so important and that they had interdependencies that weren't explicit.
Because \_SB_.PCI0.I2C1.TPL1._INI was executed first, OSYS was still zero, meaning that conditional mentioned earlier was returning true, executing SRXO (GPDI, One). Only then was \_SB_.PCI0._INI executed, properly initializing OSYS. When ihidev would attach later, it would call the touchpad device's _CRS method to retrieve information about the I2C slave address and interrupt information that was supposed to be set up earlier in its _INI method:
Method (_CRS, 0, NotSerialized) // _CRS: Current Resource Settings
If ((OSYS < 0x07DC))
Return (SBFI) /* \_SB_.PCI0.I2C1.TPD0.SBFI */
By this time, OSYS was properly set, and it would return resource information saying that its interrupts were routing through the IOAPIC on a particular pin, and OpenBSD would try to configure the IOAPIC accordingly. However, that didn't match what the firmware was actually doing earlier when _INI was executed, because it was being told to route its interrupt through some other mechanism or perhaps it never activated anything.
The fix for this was to change the node walk algorithm to match ACPICA and execute a matching child node (_INI) of a device before recursing through its child nodes. With that change in place, it now properly executes \_SB_.PCI0._INI before \_SB_.PCI0.I2C1.TPL1._INI, ensuring that OSYS is set before it's read.
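The effect of the ordering can be reproduced with a toy namespace. This is a simplified model, not the OpenBSD or ACPICA walker: the `Node` class, the `walk` function, and its `ini_first` flag are invented for the example. The "children first" mode is only a stand-in that reproduces the symptom described above (the touchpad's _INI running before \_SB_.PCI0._INI), not a claim about how the old code was actually structured.

```python
class Node:
    """One namespace node; `ini` stands in for the node's _INI method."""

    def __init__(self, name, ini=None, children=()):
        self.name = name
        self.ini = ini
        self.children = list(children)


def pci0_ini(state):
    # Stand-in for \_SB_.PCI0._INI: this is where OSYS gets its real value.
    state["OSYS"] = 0x07DF


def tpl1_ini(state):
    # Stand-in for the touchpad _INI: records whether SRXO would be called.
    state["srxo_called"] = state["OSYS"] < 0x07DC


def build_tree():
    tpl1 = Node("TPL1", ini=tpl1_ini)
    i2c1 = Node("I2C1", children=[tpl1])
    pci0 = Node("PCI0", ini=pci0_ini, children=[i2c1])
    return Node("\\_SB_", children=[pci0])


def walk(node, state, ini_first, path=""):
    """Run every _INI in the tree, before or after descending into children."""
    full = node.name if not path else path + "." + node.name

    def run_ini():
        if node.ini is not None:
            state["order"].append(full + "._INI")
            node.ini(state)

    if ini_first:
        run_ini()
    for child in node.children:
        walk(child, state, ini_first, full)
    if not ini_first:
        run_ini()


def run(ini_first):
    state = {"OSYS": 0, "order": [], "srxo_called": None}
    walk(build_tree(), state, ini_first)
    return state
```

Calling `run(ini_first=False)` executes the touchpad's _INI while OSYS is still 0, so `srxo_called` ends up true. Calling `run(ini_first=True)` runs \_SB_.PCI0._INI first, and the touchpad's _INI then sees 0x07DF and skips that branch, which is consistent with what its _CRS method later reports.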
With that fix in place, I was happy to finally
In the end, the bug had nothing to do with the devices being Intel 100 Series, and was most likely affecting all of them similarly because their vendors all used the same DSDT template from Intel, which uses OSYS in device methods without an explicit dependency on \_SB_.PCI0._INI to initialize it.
These fixes are now in the OpenBSD tree and have been in recent snapshots, so if this bug affected you and you want to try it out with proper interrupts, try the most recent snapshot.