KAN-MLP-Mixer: A comprehensive investigation of the usage of Kolmogorov-Arnold Networks (KANs) for improving IMU-based Human Activity Recognition

arXiv cs.AI 05/20/26, 04:00 AM Papers
Summary
This paper systematically explores hybrid KAN and MLP architectures for IMU-based human activity recognition, achieving a 5.33% average macro F1 improvement over pure MLP baselines.
arXiv:2605.19031v1 Announce Type: new
# A comprehensive investigation of the usage of Kolmogorov–Arnold Networks (KANs) for improving IMU-based Human Activity Recognition
Source: [https://arxiv.org/html/2605.19031](https://arxiv.org/html/2605.19031)

###### Abstract\.

Kolmogorov–Arnold Networks \(KANs\) have demonstrated an exceptional ability to learn complex functions on clean, low\-dimensional data but struggle to maintain performance on noisy and imperfect real\-world datasets\. In contrast, conventional multi\-layer perceptrons \(MLPs\) are far more tolerant to noise and computationally efficient\. Replacing all MLP components with KANs in human activity recognition \(HAR\) models often degrades accuracy and computational efficiency, highlighting an open challenge: how to combine KANs’ precision with MLPs’ noise robustness and efficiency\. To address this, we systematically explore various placements of KAN modules within deep HAR networks and propose a hybrid architecture that combines the strengths of both paradigms: it uses a KAN\-based input embedding layer, retains MLP layers for intermediate feature mixing, and introduces a specialized LarctanKAN module for final activity classification\. Across eight public HAR datasets, the hybrid KAN–MLP model achieves an average relative macro F1 improvement of 5\.33% over the pure\-MLP model, significantly outperforming standalone KAN and MLP baselines\. Furthermore, integrating this hybrid strategy into other state\-of\-the\-art HAR architectures consistently boosts their performance\. Our findings demonstrate that a carefully orchestrated combination of KAN, MLP, or other conventional neural components yields more robust and accurate HAR models for real\-world wearable sensing environments\.


## 1\.Introduction

Accurate Human Activity Recognition \(HAR\) from body\-worn sensors is a cornerstone of ubiquitous computing, enabling applications ranging from personalized health monitoring and fitness tracking on smartwatches\(Abbas et al\.,[2024](https://arxiv.org/html/2605.19031#bib.bib2); Yin et al\.,[2024](https://arxiv.org/html/2605.19031#bib.bib53)\)to context\-aware interactions in smart environments\(Abdel\-Salam et al\.,[2021](https://arxiv.org/html/2605.19031#bib.bib3); Bian et al\.,[2022](https://arxiv.org/html/2605.19031#bib.bib9)\)\. Inertial Measurement Units \(IMUs\), commonly embedded in wearables, provide rich motion data\. However, leveraging this data effectively is challenging; real\-world IMU signals are notoriously complex and plagued by noise, sensor drift, placement variations, and inter\-subject differences, hindering the development of robust, universally applicable HAR systems\(Tseng and Wen,[2023](https://arxiv.org/html/2605.19031#bib.bib49)\)\.

Deep learning models have become the standard for tackling HAR, automatically learning features from sensor data \(Chen et al\.,[2021](https://arxiv.org/html/2605.19031#bib.bib15)\)\. Among these, Multi\-Layer Perceptrons \(MLPs\) and their variants remain surprisingly effective \(Ojiako and Farrahi,[2023](https://arxiv.org/html/2605.19031#bib.bib37); Zhou et al\.,[2024](https://arxiv.org/html/2605.19031#bib.bib57)\)\. While perhaps less complex than CNNs or RNNs, MLPs are valued not only for their representational power but, crucially, for their robustness to noise and computational efficiency \(Le et al\.,[2024](https://arxiv.org/html/2605.19031#bib.bib29)\)\. This efficiency is paramount for deployment on resource\-constrained wearable devices where battery life and processing power are limited\. Indeed, recent work has shown that purely MLP\-based architectures, like MLP\-Mixers adapted for HAR and MLPHAR, can achieve state\-of\-the\-art or competitive performance with significantly fewer parameters than heavier models \(Ojiako and Farrahi,[2023](https://arxiv.org/html/2605.19031#bib.bib37); Zhou et al\.,[2024](https://arxiv.org/html/2605.19031#bib.bib57); Miyoshi et al\.,[2025](https://arxiv.org/html/2605.19031#bib.bib36)\), making them highly practical for the wearable and ubiquitous computing domain\.

Recently, Kolmogorov–Arnold Networks \(KANs\) emerged as a novel neural network paradigm offering remarkable theoretical potential\(Liu et al\.,[2024c](https://arxiv.org/html/2605.19031#bib.bib34)\)\. Inspired by the Kolmogorov\-Arnold representation theorem, KANs replace the fixed activations and linear weights of MLPs with learnable univariate functions on network edges\. This design grants them exceptional flexibility, enabling them to approximate complex functions with high fidelity, often surpassing traditional networks on clean, low\-dimensional mathematical or physics\-based tasks\(Liu et al\.,[2024c](https://arxiv.org/html/2605.19031#bib.bib34); Drokin,[2024](https://arxiv.org/html/2605.19031#bib.bib19); Poeta et al\.,[2024](https://arxiv.org/html/2605.19031#bib.bib40); Jamali et al\.,[2024](https://arxiv.org/html/2605.19031#bib.bib25)\),including new efficient architectures for quantum machine learning\(Werner et al\.,[2025](https://arxiv.org/html/2605.19031#bib.bib50); Ivashkov et al\.,[2026](https://arxiv.org/html/2605.19031#bib.bib24)\)\.

However, translating KANs’ theoretical promise to the messy reality of wearable sensor data presents a significant hurdle\. A growing body of evidence indicates that KANs exhibit considerable sensitivity to noise and data imperfections \(Shen et al\.,[2025](https://arxiv.org/html/2605.19031#bib.bib44); Cang et al\.,[2024](https://arxiv.org/html/2605.19031#bib.bib12); Ibrahum et al\.,[2024](https://arxiv.org/html/2605.19031#bib.bib23)\)\. Their intricate function\-fitting mechanism, while powerful on clean signals, appears vulnerable to the inherent variability and noise found in IMU data collected in the wild\. Moreover, KANs are based on the assumption that the target function is continuous; if this assumption is violated, the model may fail to approximate the function effectively \(Liu et al\.,[2024c](https://arxiv.org/html/2605.19031#bib.bib34)\), as shown in [Fig\. 1](https://arxiv.org/html/2605.19031#S1.F1)\. There, KANs fit smooth periodic signals better, whereas MLPs outperform them on discontinuous targets such as step functions, which resemble decision boundaries\. Directly substituting KANs for MLPs in established HAR pipelines often results in a substantial drop in accuracy and can negate the efficiency benefits sought in wearable applications due to complex spline computations and tuning requirements \(Le et al\.,[2024](https://arxiv.org/html/2605.19031#bib.bib29)\), a pattern also visible in our own preliminary findings in [Table 3](https://arxiv.org/html/2605.19031#S3.T3)\.

This creates a critical dilemma for HAR researchers aiming to leverage cutting\-edge architectures: How can we harness the function approximation power of KANs for complex activity patterns without sacrificing the robustness and computational efficiency essential for practical, real\-world wearable HAR systems? While initial explorations into hybrid KAN models exist \(Liu et al\.,[2024a](https://arxiv.org/html/2605.19031#bib.bib32); Yang and Wang,[2024](https://arxiv.org/html/2605.19031#bib.bib52); Bodner et al\.,[2024](https://arxiv.org/html/2605.19031#bib.bib10)\), a systematic investigation is still lacking into how best to integrate KAN components within HAR architectures designed to handle real\-world sensor streams effectively and efficiently\.

To bridge this gap, we first conduct a systematic empirical investigation into strategically integrating KANs and their variants within a strong, efficient, pure MLP\-based HAR architecture \(MLPHAR \(Zhou et al\.,[2024](https://arxiv.org/html/2605.19031#bib.bib57)\)\)\. Our analysis reveals that the placement of KAN modules is critical: they excel at initial data embedding but falter in intermediate feature mixing roles, while specific variants show promise for classification\. Based on these insights, we then propose KAN\-MLP\-Mixer, a novel hybrid architecture that combines:

- 1\. An EfficientKAN module for adaptive data embedding, chosen for its ability to capture input complexities while managing computational overhead\.
- 2\. Standard MLP layers for intermediate feature mixing, preserving the baseline’s proven robustness and efficiency\.
- 3\. A LarctanKAN module for classification, selected for its smooth, bounded activation, which potentially offers enhanced stability against noise in the final prediction stage\.

We evaluate KAN\-MLP\-Mixer across eight diverse wearable and ubiquitous HAR datasets\. Our results demonstrate that this targeted hybridization achieves an average relative macro F1\-score improvement of 5\.33% over the robust MLPHAR baseline and substantially outperforms naive full KAN replacement strategies\.

The main contributions of this work are:

- A systematic empirical analysis of different KAN integration strategies within an MLP\-based HAR framework, identifying performance trade\-offs on real\-world IMU sensor data\.
- The proposal of KAN\-MLP\-Mixer, a novel hybrid architecture specifically designed to balance KAN expressiveness with MLP robustness and efficiency for practical HAR\.
- A demonstration of improved performance, showing that KAN\-MLP\-Mixer outperforms the MLPHAR baseline across eight diverse public HAR datasets by an average relative macro F1\-score improvement of 5\.33%\.
- Evidence that strategic hybridization, rather than wholesale replacement, offers a viable pathway to leverage advanced architectures like KANs effectively in challenging real\-world sensor\-based applications common in ubiquitous computing\.
- An extension of the hybrid design strategy across diverse neural backbones, window sizes, and sensing modalities, demonstrating consistent performance improvements under varying conditions\.
- A set of practical design guidelines derived from comprehensive empirical findings to inform the effective application of KANs in sensor\-based HAR tasks\.

While comprehensive benchmarking against the broader HAR state\-of\-the\-art and detailed computational profiling for on\-device deployment remain important future steps, our findings provide strong evidence for the potential of carefully constructed KAN\-MLP hybrids\. This work paves the way for developing more accurate, robust, and ultimately practical HAR systems suitable for the demands of next\-generation wearable and mobile technologies\.

![Refer to caption](https://arxiv.org/html/2605.19031v1/x1.png) \(a\) Predictions on a step function\. LarctanKAN achieves the lowest error \(RMSE = 0\.018\) with only 593 parameters, outperforming both MLP and KAN; the latter exhibits overshooting due to its spline\-based formulation and uses 5440 parameters\. This demonstrates the advantage of LarctanKAN for modeling discontinuous decision boundaries\.
![Refer to caption](https://arxiv.org/html/2605.19031v1/x2.png) \(b\) Predictions on a smooth periodic function $f(x)=\sin(2\pi x)\cos(2\pi x)$\. KAN achieves the lowest RMSE \(0\.001\), capturing the waveform precisely\. MLP and LarctanKAN underfit due to limited expressivity, highlighting KAN’s strength in embedding smooth sensor signals\.

Figure 1\. Comparison of model predictions on synthetic functions representing typical characteristics of sensor data\. The step function emulates classification boundaries, while the periodic function simulates continuous IMU signal patterns\. Results highlight the motivation for our hybrid architecture: using KAN in the embedding layer and LarctanKAN in the classifier\.
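
The two synthetic targets of Figure 1 and the RMSE metric quoted in its captions can be sketched in a few lines. This is only an illustration: the step threshold, the evaluation grid, and the constant reference predictor are assumptions, since the text does not specify them. Note that $\sin(2\pi x)\cos(2\pi x)=\tfrac{1}{2}\sin(4\pi x)$, so the periodic target has amplitude 0\.5 and period 0\.5\.

```python
import math


def step_target(x, threshold=0.5):
    """Step function; the threshold location is an assumption for illustration."""
    return 1.0 if x >= threshold else 0.0


def periodic_target(x):
    """f(x) = sin(2*pi*x) * cos(2*pi*x), as given in the Figure 1(b) caption."""
    return math.sin(2 * math.pi * x) * math.cos(2 * math.pi * x)


def rmse(predictions, targets):
    """Root-mean-square error between two equal-length sequences."""
    n = len(targets)
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(predictions, targets)) / n)


# Uniform evaluation grid on [0, 1]; the grid itself is an assumption.
xs = [i / 200 for i in range(201)]
step_values = [step_target(x) for x in xs]
periodic_values = [periodic_target(x) for x in xs]

# A constant (mean) predictor gives a reference error that a fitted model should beat.
mean_step = sum(step_values) / len(step_values)
baseline_rmse_step = rmse([mean_step] * len(xs), step_values)
```

Any fitted model (MLP, KAN, or LarctanKAN) could be scored by passing its predictions on `xs` to `rmse` together with `step_values` or `periodic_values`\.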
## 2\.Related Work

Table 1\. Comparative summary of KAN variants

| KAN Type | Basis Function | Function Form | Major Feature |
| --- | --- | --- | --- |
| KAN \(Liu et al\.,[2024c](https://arxiv.org/html/2605.19031#bib.bib34)\) | B\-spline | $\phi(x)=w\left(\text{silu}(x)+\sum_{i}c_{i}B_{i}(x)\right)$ | High interpretability |
| EfficientKAN \(Cao,[2024](https://arxiv.org/html/2605.19031#bib.bib13)\) | B\-spline | $\phi(x)=w\left(\text{silu}(x)+\sum_{i}c_{i}B_{i}(x)\right)$ | GPU optimized, memory\-efficient |
| FastKAN \(Li,[2024](https://arxiv.org/html/2605.19031#bib.bib30)\) | Gaussian RBF | $\phi(x)=\sum_{i=1}^{K}c_{i}\exp\left(-\frac{(x-\mu_{i})^{2}}{2\sigma^{2}}\right)$ | Fast computation |
| WavKAN \(Bozorgasl and Chen,[\[n\. d\.\]](https://arxiv.org/html/2605.19031#bib.bib11)\) | Wavelet | $\phi(x)=\sum_{j=0}^{J}\sum_{k=0}^{2^{j}-1}w_{j,k}\psi_{j,k}(x)$ | Multiresolution analysis |
| FourierKAN \(Xu et al\.,[2024](https://arxiv.org/html/2605.19031#bib.bib51)\) | Sine and Cosine | $\phi(x)=\sum_{k=1}^{g}(a_{k}\cos(kx)+b_{k}\sin(kx))$ | Smooth and differentiable |
| LarctanKAN \(Chen and Zhang,[2024a](https://arxiv.org/html/2605.19031#bib.bib16)\) | Arctan | $\phi(x;k)=\arctan(kx)$ | Compact, efficient activation |
| MLP | – | $\phi(x)=f_{act}(Wx+b)$ | Simple, widely used |

This section reviews prior research relevant to our work, focusing on KANs, the role of MLPs in HAR, and the evolution of hybrid architectures in this domain\.

### 2\.1\.Kolmogorov\-Arnold Networks \(KANs\): Potential and Challenges

KANs, inspired by the Kolmogorov–Arnold representation theorem\(Kolmogorov,[1957](https://arxiv.org/html/2605.19031#bib.bib28); Arnold,[1963](https://arxiv.org/html/2605.19031#bib.bib5),[2009](https://arxiv.org/html/2605.19031#bib.bib6)\), represent a recent shift in neural network design\(Liu et al\.,[2024c](https://arxiv.org/html/2605.19031#bib.bib34)\)\. Instead of fixed activations at nodes, KANs employ learnable univariate functions \(often splines\) on network edges\. This grants them significant theoretical expressiveness, allowing them to approximate complex functions accurately, sometimes with fewer parameters than traditional MLPs, particularly on clean, well\-structured tasks like symbolic regression or physics modeling\(Liu et al\.,[2024c](https://arxiv.org/html/2605.19031#bib.bib34); Toscano et al\.,[2025](https://arxiv.org/html/2605.19031#bib.bib48); Liu et al\.,[2024b](https://arxiv.org/html/2605.19031#bib.bib33); Koenig et al\.,[2024](https://arxiv.org/html/2605.19031#bib.bib27); Genet and Inzirillo,[2024](https://arxiv.org/html/2605.19031#bib.bib22)\)\. This potential for capturing intricate relationships is intriguing for modeling complex human motion patterns from sensor data\.
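
The contrast between learnable edge functions and fixed node activations can be made concrete with a small NumPy sketch\. It is an illustration only, not the implementation used in this work: for brevity it uses the Gaussian RBF form listed for FastKAN in Table 1 instead of B\-splines, and the function names, sizes, `sigma`, and random coefficients are all assumptions\.

```python
import numpy as np


def rbf_basis(x, centers, sigma):
    """Gaussian RBF basis values; x: (n_in,), centers: (K,) -> (n_in, K)."""
    return np.exp(-((x[:, None] - centers[None, :]) ** 2) / (2.0 * sigma ** 2))


def kan_layer(x, coeffs, centers, sigma):
    """KAN-style layer: y_j = sum_i phi_ij(x_i), with phi_ij(x) = sum_k c_ijk * B_k(x).

    coeffs has shape (n_out, n_in, K): one learnable univariate function per edge.
    """
    basis = rbf_basis(x, centers, sigma)           # (n_in, K)
    return np.einsum("oik,ik->o", coeffs, basis)   # (n_out,)


def mlp_layer(x, weight, bias):
    """MLP layer for contrast: a linear map followed by a fixed activation."""
    return np.tanh(weight @ x + bias)


rng = np.random.default_rng(0)
n_in, n_out, K = 3, 2, 8                  # illustrative sizes
centers = np.linspace(-2.0, 2.0, K)       # RBF centres on a fixed grid
coeffs = rng.normal(size=(n_out, n_in, K))
x = rng.uniform(-1.0, 1.0, size=n_in)

y_kan = kan_layer(x, coeffs, centers, sigma=0.5)
y_mlp = mlp_layer(x, rng.normal(size=(n_out, n_in)), np.zeros(n_out))
```

In this sketch each of the `n_in * n_out` edges carries `K` coefficients, whereas the MLP layer has a single weight per edge, which hints at why basis\-function layers can cost more per connection\.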

However, the transition from theoretical promise to practical application in ubiquitous computing faces hurdles\. KANs’ reliance on fine\-grained, learnable activation functions makes them susceptible to noise and data irregularities \(Somvanshi et al\.,[2024](https://arxiv.org/html/2605.19031#bib.bib45); Shen et al\.,[2025](https://arxiv.org/html/2605.19031#bib.bib44); Cang et al\.,[2024](https://arxiv.org/html/2605.19031#bib.bib12)\)\. This sensitivity is a major concern for wearable sensor HAR, where IMU signals are inherently noisy and variable due to real\-world conditions \(Tseng and Wen,[2023](https://arxiv.org/html/2605.19031#bib.bib49)\)\. Furthermore, the computational overhead associated with training and evaluating spline\-based KANs can be substantial \(Le et al\.,[2024](https://arxiv.org/html/2605.19031#bib.bib29)\), potentially conflicting with the strict resource constraints \(power, memory, compute\) of wearable devices\. Ongoing research aims to mitigate these issues through architectural variants like EfficientKAN \(Cao,[2024](https://arxiv.org/html/2605.19031#bib.bib13)\), FastKAN \(Li,[2024](https://arxiv.org/html/2605.19031#bib.bib30)\), WavKAN \(Bozorgasl and Chen,[\[n\. d\.\]](https://arxiv.org/html/2605.19031#bib.bib11)\), FourierKAN \(Xu et al\.,[2024](https://arxiv.org/html/2605.19031#bib.bib51)\), and parameter\-efficient designs like LarctanKAN \(Chen and Zhang,[2024a](https://arxiv.org/html/2605.19031#bib.bib16),[b](https://arxiv.org/html/2605.19031#bib.bib17)\), as well as regularization techniques \(Altarabichi,[2024](https://arxiv.org/html/2605.19031#bib.bib4)\)\. A comparative summary of KAN variants is presented in [Table 1](https://arxiv.org/html/2605.19031#S2.T1)\. An initial study by Liu et al\. \(Liu et al\.,[2024a](https://arxiv.org/html/2605.19031#bib.bib32)\) explored KANs for feature extraction in HAR, hinting at their potential but not providing a systematic hybridization strategy for robust, practical use\.
Thus, while KANs offer a powerful new tool, their direct application to complex, sensor\-based HAR scenarios remains challenging\.

### 2\.2\.MLPs: The Robust and Efficient Workhorse for HAR

Despite the advent of newer architectures, MLPs remain a highly relevant and effective tool, particularly in practical HAR applications\. Long known as universal function approximators\(Pinkus,[1999](https://arxiv.org/html/2605.19031#bib.bib39)\), MLPs have seen renewed interest, partly inspired by MLP\-Mixer architectures\(Tolstikhin et al\.,[2021](https://arxiv.org/html/2605.19031#bib.bib47)\)demonstrating their capability even in complex domains previously dominated by CNNs or Transformers\.

Crucially for HAR on wearable devices, MLPs offer a compelling balance of performance, robustness, and computational efficiency \(Zhou et al\.,[2024](https://arxiv.org/html/2605.19031#bib.bib57); Ojiako and Farrahi,[2023](https://arxiv.org/html/2605.19031#bib.bib37); Miyoshi et al\.,[2025](https://arxiv.org/html/2605.19031#bib.bib36); Liu et al\.,[2021](https://arxiv.org/html/2605.19031#bib.bib31)\)\. Their simpler structure with fixed activation functions makes them generally easier to train, more robust to noise, and significantly more lightweight than many alternatives \(Le et al\.,[2024](https://arxiv.org/html/2605.19031#bib.bib29); Fan and Gao,[2021](https://arxiv.org/html/2605.19031#bib.bib21)\)\. Models like MLPHAR \(Zhou et al\.,[2024](https://arxiv.org/html/2605.19031#bib.bib57)\) achieve competitive accuracy on HAR benchmarks using only MLP layers, often with orders of magnitude fewer parameters than CNN or RNN counterparts like DeepConvLSTM \(Ordóñez and Roggen,[2016](https://arxiv.org/html/2605.19031#bib.bib38)\) or TinyHAR \(Zhou et al\.,[2022](https://arxiv.org/html/2605.19031#bib.bib58)\)\. This efficiency is vital for deployment on battery\-powered wearables\. Their established effectiveness and practicality make MLPs not only a strong baseline but also a valuable component for building robust and deployable HAR systems\.

### 2\.3\.Hybrid Architectures: Combining Strengths for HAR

Recognizing that different network components excel at different tasks, hybrid architectures have become common in HAR to leverage complementary strengths\. The seminal DeepConvLSTM\(Ordóñez and Roggen,[2016](https://arxiv.org/html/2605.19031#bib.bib38)\)combined CNNs for local spatial feature extraction from sensor windows with LSTMs for modeling temporal dependencies, significantly improving performance\. This principle has been extended to include attention mechanisms\(Khan and Ahmad,[2021](https://arxiv.org/html/2605.19031#bib.bib26)\)and Transformers\(Dirgová Luptáková et al\.,[2022](https://arxiv.org/html/2605.19031#bib.bib18); Shavit and Klein,[2021](https://arxiv.org/html/2605.19031#bib.bib43); Sui et al\.,[2024](https://arxiv.org/html/2605.19031#bib.bib46)\)combined with CNNs, RNNs, or MLPs\(Zhou et al\.,[2022](https://arxiv.org/html/2605.19031#bib.bib58); Zhang et al\.,[2022](https://arxiv.org/html/2605.19031#bib.bib56); Enokibori,[2024](https://arxiv.org/html/2605.19031#bib.bib20)\)to capture both local patterns and long\-range or global context within activity sequences\.

The emergence of KANs opens new possibilities for hybridization\. The core motivation is to combine KANs’ potential for high\-fidelity function approximation with the proven robustness and efficiency of established components like CNNs, RNNs, or, as explored in this work, MLPs\(Somvanshi et al\.,[2024](https://arxiv.org/html/2605.19031#bib.bib45); Liu et al\.,[2024a](https://arxiv.org/html/2605.19031#bib.bib32); Yang and Wang,[2024](https://arxiv.org/html/2605.19031#bib.bib52); Bodner et al\.,[2024](https://arxiv.org/html/2605.19031#bib.bib10)\)\. For instance, a KAN module might excel at learning complex initial transformations from raw sensor data, while an MLP or CNN handles subsequent feature aggregation more robustly and efficiently\. While the idea of KAN\-based HAR hybrids exists, and specific variants like Temporal\-KAN have been proposed\(Somvanshi et al\.,[2024](https://arxiv.org/html/2605.19031#bib.bib45)\), concrete implementations and systematic studies evaluating how best to integrate KANs into HAR pipelines, particularly considering the noise and efficiency constraints of wearable sensing, are currently lacking\.

## 3\.Empirical Study

We conducted a comprehensive empirical investigation to explore the impact of integrating various KAN variants into the established MLPHAR architecture \(Zhou et al\.,[2024](https://arxiv.org/html/2605.19031#bib.bib57)\)\. Specifically, we systematically replaced the original MLP layers in the MLPHAR baseline with different KAN variants, including the original KAN, EfficientKAN, FastKAN, WavKAN, FourierKAN, and LarctanKAN\. We evaluated these modifications on eight widely\-used HAR datasets\.

### 3\.1\.Testbed Model

To systematically evaluate the impact of integrating KANs into existing neural architectures, we selected MLPHAR \(Zhou et al\.,[2024](https://arxiv.org/html/2605.19031#bib.bib57)\) as our baseline testbed model\. MLPHAR is a purely MLP\-based neural architecture specifically tailored for sensor\-based HAR tasks\.

The MLPHAR model consists of three sequentially concatenated modules: the Data Embedding module, the Feature Mixer module, and the Classifier module\. Detailed information about the implementation can be found in the existing work\(Zhou et al\.,[2024](https://arxiv.org/html/2605.19031#bib.bib57)\)\. Formally, these modules are arranged in the following order:

$$\text{Data Embedding}\Rightarrow\text{Feature Mixer}\Rightarrow\text{Classifier}$$

The straightforward design and robust empirical performance of MLPHAR make it an ideal model to rigorously assess the benefits and limitations of various KAN implementations\. By systematically substituting its MLP layers with different KAN variants, we can precisely isolate and quantify the effects of KAN\-based components, free from confounding architectural complexities found in models employing convolutional or recurrent modules\.

Specifically, MLPHAR was chosen as the test model for several key reasons:

- Architectural Purity: MLPHAR exclusively employs fully connected layers, intentionally omitting convolutional and recurrent layers commonly found in other HAR architectures\. This design choice provides a controlled experimental environment to clearly identify how KAN replacements impact representational capacity and training dynamics compared to MLP\.
- Proven Effectiveness: Despite its simplicity, MLPHAR has consistently achieved competitive results across multiple HAR benchmark datasets, as shown in the previous work \(Zhou et al\.,[2024](https://arxiv.org/html/2605.19031#bib.bib57)\), often matching or exceeding the performance of more sophisticated models such as CNNs and LSTMs\. Thus, it provides a reliable baseline against which the performance implications of integrating KANs can be accurately measured\.
- Practical Efficiency: Due to its compact parameterization and computational efficiency, MLPHAR is particularly well\-suited for deployment on edge devices and wearable systems\. Evaluating KAN integrations within such an efficiency\-critical model directly addresses real\-world constraints, ensuring the practical applicability of our findings in ubiquitous computing environments\.

By employing MLPHAR as our foundational test architecture, this empirical study delivers a clear, consistent, and practically relevant evaluation of various KAN\-based architectural designs, providing actionable insights for future development in sensor\-based HAR applications\.

### 3\.2\.Datasets

We employed eight widely used benchmark datasets to comprehensively evaluate the impact of integrating various KAN variants into the MLPHAR model\. These datasets represent a broad spectrum of HAR scenarios, differing notably in sensor types, body sensor placements, data sampling frequencies, and the complexity of activity classification tasks\. The window sizes were chosen following existing work \(Zhou et al\.,[2022](https://arxiv.org/html/2605.19031#bib.bib58),[2024](https://arxiv.org/html/2605.19031#bib.bib57)\)\. The raw input data $X\in\mathbb{R}^{L\times C}$ \(with $L$ as the window length and $C$ as the number of sensor channels\) is split into $T$ intervals of length $\tau$, producing segments $X_{t}\in\mathbb{R}^{\tau\times T\times C}$ as the input of the MLPHAR model\. Detailed characteristics of each dataset are summarized in Table [2](https://arxiv.org/html/2605.19031#S3.T2)\.
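
The segmentation step described above can be sketched with a NumPy reshape\. This is a sketch under stated assumptions rather than the MLPHAR code: it assumes non\-overlapping, consecutive intervals with $L=T\cdot\tau$ and arranges the axes as $(\tau, T, C)$ to follow the notation in the text; the function name `segment_window` is hypothetical\. The example values follow the HAPT row of Table 2 \(50 Hz, 2\.56 s, 6 channels, $T=8$, $\tau=16$\)\.

```python
import numpy as np


def segment_window(window, num_intervals, interval_len):
    """Split a window of shape (L, C) into T intervals of length tau.

    Returns an array of shape (tau, T, C), matching the axis order in the text.
    """
    length, channels = window.shape
    if length != num_intervals * interval_len:
        raise ValueError("window length must equal T * tau")
    # Consecutive samples stay together: first reshape to (T, tau, C) ...
    segments = window.reshape(num_intervals, interval_len, channels)
    # ... then move the within-interval axis first to obtain (tau, T, C).
    return segments.transpose(1, 0, 2)


# HAPT-like settings: 50 Hz * 2.56 s = 128 samples, T = 8, tau = 16, C = 6.
window = np.arange(128 * 6, dtype=float).reshape(128, 6)
segments = segment_window(window, num_intervals=8, interval_len=16)
```

Under these assumptions, `segments[:, t, :]` holds the `t`\-th interval of the original window, so no samples are dropped or duplicated\.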

By leveraging these diverse datasets, our empirical study robustly evaluates the generalization and performance of various KAN\-integrated architectures under realistic and varying conditions characteristic of practical HAR deployments\.

Table 2\. Details of HAR datasets used in experiments\.

| Dataset | Sensors \(a\) | Position \(b\) | Freq\. \(Hz\) | Classes | Channels | Window \(s\) | $T$ \(c\) | $\tau$ \(c\) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| HAPT \(Reyes\-Ortiz et al\.,[2013](https://arxiv.org/html/2605.19031#bib.bib42)\) | Acc, Gyro | Waist | 50 | 12 | 6 | 2\.56 | 8 | 16 |
| OPPO \(Chavarriaga et al\.,[2013](https://arxiv.org/html/2605.19031#bib.bib14)\) | Acc, Gyro, Mag | Lower Arm | 30 | 18 | 9 | 1\.00 | 3 | 10 |
| DG \(Bachlin et al\.,[2009](https://arxiv.org/html/2605.19031#bib.bib7)\) | Acc | Leg | 64 | 2 | 3 | 1\.00 | 4 | 16 |
| PAMAP2 \(Reiss and Stricker,[2012](https://arxiv.org/html/2605.19031#bib.bib41)\) | Acc, Gyro | Right Hand | 33 | 12 | 6 | 3\.00 | 9 | 11 |
| Skodar \(Zappi et al\.,[2008](https://arxiv.org/html/2605.19031#bib.bib54)\) | Acc | Right Wrist | 30 | 10 | 3 | 3\.00 | 5 | 18 |
| DSADS \(Zhang et al\.,[2015](https://arxiv.org/html/2605.19031#bib.bib55)\) | Acc, Gyro, Mag | Arm | 25 | 19 | 9 | 5\.00 | 5 | 25 |
| MotionSense \(Malekzadeh et al\.,[2021](https://arxiv.org/html/2605.19031#bib.bib35)\) | Acc, Gyro | Waist | 50 | 6 | 6 | 2\.56 | 8 | 16 |
| MHEALTH \(Banos et al\.,[2014](https://arxiv.org/html/2605.19031#bib.bib8)\) | Acc, Gyro | Arm | 50 | 13 | 6 | 2\.56 | 8 | 16 |

- \(a\) Only the IMU sensor is selected\.
- \(b\) Only the sensor at one position is selected\.
- \(c\) The raw input data $X\in\mathbb{R}^{L\times C}$ \(with $L$ as the window length and $C$ as the number of sensor channels\) is split into $T$ intervals of length $\tau$, producing segments $X_{t}\in\mathbb{R}^{\tau\times T\times C}$ as the input of the MLPHAR model\.

### 3\.3\.Experiment Setup

All experiments were conducted on an NVIDIA A6000 GPU\. We trained each model for up to 200 epochs, employing early stopping with a patience of 7 epochs to prevent overfitting\. Model parameters were optimized using the Adam optimizer with an initial learning rate of 0\.001\. We adopted subject\-independent evaluation strategies appropriate for each dataset to ensure fair and realistic assessments\. Specifically, most datasets used a Leave\-One\-Subject\-Out \(LOSO\) cross\-validation scheme, in which data from each subject was held out in turn as the test set while data from all remaining subjects formed the training and validation sets\. The two exceptions were the MotionSense dataset, which employed Leave\-Group\-Out validation, and the Skodar dataset, which was evaluated using Leave\-One\-Session\-Out cross\-validation\. Model performance was assessed using the macro F1\-score, which weights all activity classes equally and is therefore suitable for class\-imbalanced HAR datasets\. To ensure reproducibility and consistent results, each model was initialized with five random seeds \(1\-5\), and the average is reported\. In the empirical study, only the data from the accelerometer at a single body position was used; we further extend the experiment to multiple sensors at multiple positions in [Section 7\.1](https://arxiv.org/html/2605.19031#S7.SS1)\.
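
The evaluation protocol above can be sketched in plain Python\. The helper names \(`loso_splits`, `macro_f1`, `EarlyStopping`\) are hypothetical, and two details are assumptions because the text does not state them: how the remaining subjects are divided into training and validation data, and which validation quantity early stopping monitors \(here, a score where higher is better\)\.

```python
from collections import defaultdict


def loso_splits(subject_ids):
    """Yield (train_val_indices, test_indices) with one subject held out per fold."""
    by_subject = defaultdict(list)
    for index, subject in enumerate(subject_ids):
        by_subject[subject].append(index)
    for held_out in sorted(by_subject):
        test_idx = by_subject[held_out]
        train_val_idx = [i for s in sorted(by_subject) if s != held_out
                         for i in by_subject[s]]
        yield train_val_idx, test_idx


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over the classes seen in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


class EarlyStopping:
    """Stop once the monitored score has not improved for `patience` epochs."""

    def __init__(self, patience=7):
        self.patience = patience
        self.best = float("-inf")
        self.stale_epochs = 0

    def should_stop(self, score):
        if score > self.best:
            self.best, self.stale_epochs = score, 0
        else:
            self.stale_epochs += 1
        return self.stale_epochs >= self.patience


# Toy example: six windows recorded from three subjects.
subjects = ["s1", "s1", "s2", "s3", "s3", "s3"]
folds = list(loso_splits(subjects))
```

Because every class contributes equally to `macro_f1`, a rare activity that is always misclassified lowers the score as much as a frequent one would, which is the property that motivates the metric for imbalanced HAR data\.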

This empirical study consists of two sequential parts, each designed to analyze the integration of KANs into the MLPHAR architecture from a different perspective\. In the first part, we replace all MLP \(or linear\) layers in the original MLPHAR model with KAN or one of its variants to construct fully KAN\-based models\. This setup allows us to assess the performance and limitations of pure KAN architectures in the context of HAR tasks, and to compare them directly against the original MLP\-based baseline\. In the second part, we investigate the effects of partial KAN integration by selectively inserting KAN components at different positions within the MLPHAR architecture, specifically, the data embedding, feature mixer, or classifier modules\. This staged integration enables us to precisely isolate and quantify the individual contributions of KAN modules at each stage of the network, offering deeper insights into where KANs are most effective within a modern MLP\-based HAR pipeline\.

To ensure fair comparison, we maintain consistent input/output dimensions and layer counts between MLP and KAN configurations, with the data embedding module set to a hidden dimension of 16 across all datasets\. KAN configuration details are as follows: KAN and EfficientKAN use a grid size of 5 and spline degree 3, with grid range $[-1,1]$; FastKAN uses grid size 8 and grid range $[-2,2]$; FourierKAN uses grid size 5; and WavKAN uses the Mexican hat wavelet\. Further analysis of the effects of hidden dimension \(in MLPHAR\) and grid size \(in EfficientKAN\) is discussed in [Section 7\.4](https://arxiv.org/html/2605.19031#S7.SS4)\.
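
The two\-part protocol and the settings stated in this subsection can be summarized as a small configuration sketch\. The dictionary keys and function names are illustrative, not taken from any released code, and the assumption that the partial study swaps exactly one module at a time for each variant is ours; the text only states that KAN components are inserted at different positions\.

```python
from itertools import product

KAN_VARIANTS = ["KAN", "EfficientKAN", "FastKAN", "WavKAN", "FourierKAN", "LarctanKAN"]
MODULES = ["embedding", "mixer", "classifier"]

# Hyperparameters stated in the text; the key names are illustrative.
KAN_SETTINGS = {
    "KAN": {"grid_size": 5, "spline_degree": 3, "grid_range": (-1.0, 1.0)},
    "EfficientKAN": {"grid_size": 5, "spline_degree": 3, "grid_range": (-1.0, 1.0)},
    "FastKAN": {"grid_size": 8, "grid_range": (-2.0, 2.0)},
    "FourierKAN": {"grid_size": 5},
    "WavKAN": {"wavelet": "mexican_hat"},
    "LarctanKAN": {},  # no settings are given in the text
}
EMBEDDING_HIDDEN_DIM = 16  # data embedding hidden dimension, all datasets


def full_replacement_configs():
    """Part one: every module uses the same KAN variant."""
    return [{module: variant for module in MODULES} for variant in KAN_VARIANTS]


def partial_replacement_configs():
    """Part two (assumed form): one module is swapped, the others stay MLP."""
    configs = []
    for variant, target in product(KAN_VARIANTS, MODULES):
        config = {module: "MLP" for module in MODULES}
        config[target] = variant
        configs.append(config)
    return configs


baseline = {module: "MLP" for module in MODULES}
# The hybrid described in the introduction replaces two modules at once.
hybrid = {"embedding": "EfficientKAN", "mixer": "MLP", "classifier": "LarctanKAN"}
```

Under these assumptions the grid contains 6 fully KAN\-based models and 18 single\-module substitutions, each of which would be evaluated on all eight datasets with five seeds\.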

### 3\.4\.Direct Replacement of all MLPs with KANs

Table 3\. Result summary of direct replacement of MLPs with KANs in the MLPHAR baseline \(Zhou et al\., [2024](https://arxiv.org/html/2605.19031#bib.bib57)\) across eight HAR datasets\. Each cell shows the macro F1 score \(mean ± std\) on a specific dataset\. The highest mean macro F1 score per dataset is highlighted in Gold, the second\-best in Silver and the third\-best in Bronze\. The last column reports the average relative improvement \(%\) of each model across all datasets over the baseline\.

| Model | DG | DSADS | PAMAP2 | OPPO | Skodar | HAPT | MotionSense | MHEALTH | Gain (%) |
|---|---|---|---|---|---|---|---|---|---|
| KAN | 0.591 ± 0.164 | 0.434 ± 0.228 | 0.322 ± 0.304 | 0.318 ± 0.198 | 0.396 ± 0.431 | 0.297 ± 0.339 | 0.420 ± 0.387 | 0.306 ± 0.373 | -36.66 |
| EfficientKAN | 0.592 ± 0.164 | 0.432 ± 0.228 | 0.322 ± 0.304 | 0.317 ± 0.197 | 0.395 ± 0.430 | 0.298 ± 0.340 | 0.421 ± 0.388 | 0.305 ± 0.372 | -36.74 |
| FastKAN | 0.543 ± 0.116 | 0.503 ± 0.078 | 0.474 ± 0.200 | 0.415 ± 0.124 | 0.478 ± 0.190 | 0.604 ± 0.126 | 0.753 ± 0.025 | 0.627 ± 0.161 | -12.37 |
| WavKAN | 0.533 ± 0.040 | 0.377 ± 0.074 | 0.435 ± 0.151 | 0.420 ± 0.086 | 0.087 ± 0.013 | 0.480 ± 0.037 | 0.654 ± 0.051 | 0.495 ± 0.049 | -28.20 |
| FourierKAN | 0.570 ± 0.218 | 0.015 ± 0.007 | 0.037 ± 0.061 | 0.114 ± 0.010 | 0.051 ± 0.010 | 0.263 ± 0.206 | 0.666 ± 0.111 | 0.064 ± 0.066 | -66.45 |
| LarctanKAN | 0.583 ± 0.187 | 0.480 ± 0.105 | 0.440 ± 0.269 | 0.185 ± 0.151 | 0.053 ± 0.156 | 0.547 ± 0.267 | 0.672 ± 0.310 | 0.592 ± 0.310 | -28.99 |
| Baseline (MLP) | 0.580 ± 0.101 | 0.504 ± 0.090 | 0.540 ± 0.186 | 0.411 ± 0.175 | 0.844 ± 0.017 | 0.685 ± 0.046 | 0.836 ± 0.026 | 0.747 ± 0.114 | 0.00 |

To evaluate the direct effectiveness of KANs in HAR tasks, we replaced all MLP \(or linear\) layers in the MLPHAR baseline model with KAN\-based alternatives\. The results are summarized in [Table 3](https://arxiv.org/html/2605.19031#S3.T3), which reports the macro F1 score on each of the eight benchmark HAR datasets\. As shown in [Table 3](https://arxiv.org/html/2605.19031#S3.T3), the baseline MLP model \(MLPHAR\) outperforms all KAN\-based variants on the majority of datasets\. Specifically, MLP achieves the best results on six out of eight datasets, with substantial margins in several cases such as PAMAP2, MotionSense, and MHEALTH\. The only exceptions are found in the DG dataset, where KAN, EfficientKAN, and LarctanKAN slightly exceed the baseline \(0\.591, 0\.592, and 0\.583 vs\. 0\.580\), and in OPPO, where WavKAN and FastKAN show marginal gains \(0\.420 and 0\.415 vs\. 0\.411\)\. These two cases, however, are isolated and do not indicate a general trend\.
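The Gain column can be checked by hand. The sketch below assumes that the reported gain is the mean of the per-dataset relative changes in macro F1 over the baseline; this reading is an assumption, but applied to the FastKAN row of Table 3 it reproduces the reported value to within rounding of the tabulated scores. The function name `mean_relative_gain` is hypothetical.

```python
# Mean macro F1 per dataset, copied from Table 3 in column order:
# DG, DSADS, PAMAP2, OPPO, Skodar, HAPT, MotionSense, MHEALTH.
BASELINE_MLP = [0.580, 0.504, 0.540, 0.411, 0.844, 0.685, 0.836, 0.747]
FASTKAN = [0.543, 0.503, 0.474, 0.415, 0.478, 0.604, 0.753, 0.627]


def mean_relative_gain(model_scores, baseline_scores):
    """Average of per-dataset relative improvements, in percent."""
    relative = [
        (m - b) / b * 100.0 for m, b in zip(model_scores, baseline_scores)
    ]
    return sum(relative) / len(relative)


fastkan_gain = mean_relative_gain(FASTKAN, BASELINE_MLP)
print(f"FastKAN gain: {fastkan_gain:.2f}%")  # close to the reported -12.37
```

Because each dataset contributes its own ratio, a collapse on a single dataset (such as Skodar for FastKAN) weighs heavily on the average even when the other datasets are close to the baseline.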

Among the KAN variants tested, FastKAN and LarctanKAN demonstrate relatively competitive performance\. FastKAN, in particular, shows the smallest drop in average macro F1 \(only \-12\.37% relative to the MLP baseline\), suggesting its potential when adapted properly\. In contrast, FourierKAN performs significantly worse across almost all datasets, with the steepest performance degradation \(\-66\.45%\), raising questions about the suitability of Fourier basis representations for HAR tasks in this form\.

The consistent performance gap observed between the MLP baseline and KAN\-based replacements suggests that direct substitution is not effective\. This may be attributed to several factors\. First, KANs introduce a different representational inductive bias compared to standard MLPs, which could disrupt feature hierarchies learned through dense layers\. Second, KANs rely on spline\-based function approximations that may require task\-specific tuning of hyperparameters, such as the number of grids or the interpolation strategy\. Lastly, their optimization behavior and sensitivity to initialization could lead to suboptimal convergence when used without tailored training routines\.

In short, while the theoretical flexibility of KANs makes them appealing, our results show that a naïve replacement of MLPs in HAR pipelines is insufficient\. More targeted design and careful tuning are required to unlock the full potential of KANs in this domain\.

### 3\.5\.Performance of Selective KAN Integration

[Table 6](https://arxiv.org/html/2605.19031#S5.T6) summarizes the experimental outcomes of the selective\-integration configurations across eight benchmark datasets\. The primary observations drawn from these results are:

- KAN in Data Embedding \(K\-M\-M\): Integrating KAN variants at the Data Embedding module yielded positive results for the spline\-based variants\. Notably, standard KAN and EfficientKAN in the embedding position resulted in average performance improvements of approximately 3\.38% and 3\.15%, respectively\. This indicates that KAN modules effectively capture initial nonlinear relationships and representations from raw sensor data\.
- KAN in Feature Mixer \(M\-K\-M\): Replacing the Feature Mixer module with KAN variants resulted in performance degradation for every variant \(ranging from approximately \-4% to \-26%\)\. Such consistent performance drops suggest that KAN modules might not be well\-suited for the role of intermediate feature transformation, possibly due to difficulties handling feature interactions at deeper network layers\.
- KAN in Classifier \(M\-M\-K\): The integration of KANs at the Classifier module produced mixed results\. While most variants showed neutral or negative impacts, LarctanKAN showed slight positive effects \(\+1\.36% improvement\), suggesting that certain KAN designs could offer advantages in classification scenarios, potentially due to their smooth activation functions aiding decision boundary formation\. This result aligns with existing work \(Chen and Zhang, [2024a](https://arxiv.org/html/2605.19031#bib.bib16)\)\.

Overall, these selective integration results demonstrate the importance of module\-specific placements when employing KAN\-based architectures in HAR\. Optimal performance gains from KANs are achieved by carefully identifying the modules within which their unique functional approximation capabilities can be most effectively leveraged\.

## 4\.Proposed Hybrid Architecture

![Refer to caption](https://arxiv.org/html/2605.19031v1/x3.png)

Figure 2\. The proposed hybrid network architecture KAN\-MLP\-Mixer based on an empirical study\. It consists of three modules: KAN\-based data embedding, MLP\-based feature mixer and LarctanKAN\-based classifier\. This proposed hybrid network is built based on the MLPHAR framework \(Zhou et al\., [2024](https://arxiv.org/html/2605.19031#bib.bib57)\)\. During implementation, we employed EfficientKAN in place of the original KAN, as it delivers nearly identical performance while offering significantly improved memory efficiency and training speed\. \(Split: The raw input data $X \in \mathbb{R}^{L \times C}$ \(with $L$ as window length and $C$ as the number of sensor channels\) is split into $T$ intervals of length $\tau$, producing segments $X_t \in \mathbb{R}^{\tau \times T \times C}$ as the input of the neural network\. FFT: Fast Fourier Transform\)

Based on the findings of our empirical study of KAN integration, we propose a new hybrid architecture, KAN\-MLP\-Mixer, strategically combining the strengths of both KANs and MLPs to improve the performance in HAR tasks\.
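The Split step named in the Figure 2 caption can be illustrated with plain Python lists. This is a minimal sketch under two assumptions that the caption does not state: the window length L is an exact multiple of τ, and the intervals are contiguous and non-overlapping. The function name `split_window` is hypothetical.

```python
def split_window(x, tau):
    """Split a window of L time steps into T = L // tau intervals.

    x is a list of L rows, each row holding C channel values.
    Returns a nested list indexed [step][interval][channel], i.e. a
    (tau, T, C) layout.
    """
    length = len(x)
    if length % tau != 0:
        raise ValueError("window length must be a multiple of tau")
    num_intervals = length // tau
    return [
        [list(x[t * tau + s]) for t in range(num_intervals)]
        for s in range(tau)
    ]


# Toy window: L = 6 time steps, C = 2 channels, split with tau = 3 (T = 2).
window = [[i, 10 + i] for i in range(6)]
segments = split_window(window, tau=3)
```

Entry `segments[s][t]` holds the sample at position `s` within interval `t`, so the second axis indexes the T intervals.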

### 4\.1\.Design Rationale

Our experimental results reveal that KANs perform best when applied to the input embedding stage, where they capture complex nonlinear relationships from raw sensor data\. In contrast, substituting KANs into the feature mixer significantly degrades performance, likely due to optimization difficulties and overfitting\. Finally, LarctanKAN demonstrates a unique advantage when used in the classifier, offering improved generalization and smooth decision boundaries\.

Based on these insights, we adopt a modular hybrid architecture as shown in[Fig\.2](https://arxiv.org/html/2605.19031#S4.F2)\. This design leverages the expressiveness of KANs at the input level, preserves the stability and scalability of MLPs for feature transformation, and enhances classification robustness via LarctanKAN’s bounded, smooth activation function\.

### 4\.2\.Architecture Details

- Data Embedding \(EfficientKAN\): The raw input signal is passed through a KAN\-based embedding module\. We adopt the EfficientKAN variant due to its lower memory cost and strong performance in the data embedding role, as observed in our experiments\. This module transforms raw sensor input into a high\-level feature representation using spline\-based function learning\.
- Feature Mixer \(MLP\): The transformed features are then passed through a standard MLP block, which remains unchanged from the original MLPHAR\. This module performs the core latent feature transformation and mixing\. Our results show that MLPs outperform all tested KANs in this role, providing better generalization and stability\.
- Classifier \(LarctanKAN\): The final layer is a LarctanKAN classifier, which replaces the original MLP\-based classification head\. The arctangent\-based activation offers smooth gradients and bounded outputs, which improve robustness and classification performance\.
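The boundedness and smoothness attributed to the arctangent activation follow from two elementary properties of the function itself. They are stated here only for $\arctan$; how LarctanKAN composes it inside a layer is not restated in this section.

$$
\arctan : \mathbb{R} \to \left(-\tfrac{\pi}{2}, \tfrac{\pi}{2}\right), \qquad \frac{d}{dx}\arctan(x) = \frac{1}{1 + x^{2}} \in (0, 1].
$$

The output therefore stays in a fixed range regardless of input magnitude, and the gradient is defined everywhere and never exceeds 1.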

### 4\.3\.Expected Benefits

This hybrid architecture is designed to: \(1\) enhance expressiveness in early feature extraction using KANs; \(2\) retain robustness and efficiency in mid\-layer processing via MLPs; and \(3\) improve generalization and stability in classification using LarctanKAN\.

In doing so, it addresses the performance degradation seen in fully KAN\-based models while still leveraging their theoretical strengths in function approximation\. We demonstrate in the following section that this architecture consistently outperforms both pure KAN and pure MLP baselines across multiple HAR benchmarks\.

## 5\.Evaluation of the Proposed Hybrid Architecture

To comprehensively evaluate our proposed hybrid architecture \(KAN\-MLP\-Mixer\), we benchmarked its performance across eight diverse and representative HAR datasets, comparing against both the original MLPHAR baseline and previously tested selective KAN configurations\. [Table 6](https://arxiv.org/html/2605.19031#S5.T6) summarizes the macro F1 scores obtained, clearly highlighting our proposed model’s effectiveness\.

### 5\.1\.Overall Performance

Our proposed KAN\-MLP\-Mixer architecture achieves the highest average gain among all evaluated configurations, with an average macro F1 improvement of \+5\.33% relative to the original MLPHAR baseline\. This gain underscores the complementary advantages of strategically integrating KAN modules into pure MLP\-based models, supporting our initial hypothesis that careful placement of KANs can substantially enhance HAR performance\.

Our hybrid model consistently outperformed baseline architectures, demonstrating substantial improvements in datasets such as OPPO \(\+5\.90%\), Skodar \(\+4\.38%\), MHEALTH \(\+5\.52%\), DSADS \(\+3\.10%\), and HAPT \(\+2\.00%\), showcasing the hybrid approach’s superior ability to handle modality fusion, noise resilience, and fine\-grained activity distinctions\. Even in simpler or structured scenarios \(DG, MotionSense, PAMAP2\), the hybrid design achieved notable improvements by strategically leveraging EfficientKAN’s expressive embedding and LarctanKAN’s robust classification capabilities, balanced by the generalization strength of MLP\-based mixers\. These results validate our selective integration of KAN modules, confirming their practical effectiveness and adaptability across diverse real\-world wearable HAR contexts\.

### 5\.2\.Comparative Insights and Module Contributions

Our results yield several key insights regarding module\-specific contributions within the KAN\-MLP\-Mixer architecture:

- KAN\-based Data Embedding \(EfficientKAN\): Consistently improves early\-stage feature extraction, particularly effective with noisy and raw sensor signals, confirming prior observations that expressive embedding functions significantly boost performance in HAR tasks\.
- MLP\-based Feature Mixer: Critical for maintaining high generalization performance\. Our empirical results reinforce earlier findings that MLP layers remain highly effective at deeper network levels, especially when complex feature interactions and stability are required\.
- LarctanKAN Classifier: Provides substantial advantages in classification accuracy, leveraging smooth and bounded activation functions that create well\-defined and stable decision boundaries, crucial in noisy HAR data contexts\.

This nuanced combination leverages each module’s specific strengths, resulting in the superior overall performance observed\.

### 5\.3\.Generalization and Practical Implications

Compared to previous approaches that exhibited significant performance variability across datasets, the proposed KAN\-MLP\-Mixer hybrid architecture demonstrates remarkable stability and generalization\. Its consistently superior performance across diverse sensor placements and varying classification complexities underscores its suitability for real\-world wearable and ubiquitous computing applications\. These empirical findings highlight the clear advantages of strategically integrating KAN modules into robust MLP\-based HAR frameworks, offering actionable guidance for researchers and practitioners seeking to build accurate, efficient, and reliable HAR systems on mobile and wearable devices\.

Table 4\. Comparison of Model Performance across Various Datasets\. This table provides the mean F1 score for different models evaluated on various datasets\. The top three models for each dataset are highlighted: first place in Gold, second place in Silver, and third place in Bronze\. \(macro\-F1 ± std\)\.

| Model | Placement | DG | DSADS | PAMAP2 | OPPO | Skodar | HAPT | MotionSense | MHEALTH | Gain (%) |
|---|---|---|---|---|---|---|---|---|---|---|
| KAN | K-M-M | 0.589 ± 0.093 | 0.509 ± 0.071 | 0.549 ± 0.192 | 0.478 ± 0.090 | 0.860 ± 0.020 | 0.697 ± 0.040 | 0.831 ± 0.024 | 0.772 ± 0.095 | 3.38 |
| KAN | M-K-M | 0.586 ± 0.175 | 0.416 ± 0.219 | 0.450 ± 0.274 | 0.256 ± 0.187 | 0.525 ± 0.419 | 0.487 ± 0.323 | 0.599 ± 0.357 | 0.457 ± 0.378 | -25.62 |
| KAN | M-M-K | 0.571 ± 0.080 | 0.517 ± 0.091 | 0.549 ± 0.183 | 0.412 ± 0.155 | 0.834 ± 0.022 | 0.697 ± 0.043 | 0.829 ± 0.027 | 0.755 ± 0.114 | 0.46 |
| EfficientKAN | K-M-M | 0.587 ± 0.094 | 0.507 ± 0.071 | 0.547 ± 0.192 | 0.477 ± 0.086 | 0.858 ± 0.017 | 0.698 ± 0.041 | 0.831 ± 0.025 | 0.769 ± 0.097 | 3.15 |
| EfficientKAN | M-K-M | 0.584 ± 0.175 | 0.419 ± 0.219 | 0.449 ± 0.274 | 0.252 ± 0.184 | 0.526 ± 0.420 | 0.486 ± 0.322 | 0.598 ± 0.356 | 0.457 ± 0.378 | -25.71 |
| EfficientKAN | M-M-K | 0.572 ± 0.081 | 0.517 ± 0.089 | 0.545 ± 0.183 | 0.411 ± 0.156 | 0.835 ± 0.016 | 0.696 ± 0.044 | 0.829 ± 0.025 | 0.757 ± 0.117 | 0.39 |
| FastKAN | K-M-M | 0.562 ± 0.080 | 0.484 ± 0.178 | 0.482 ± 0.242 | 0.388 ± 0.155 | 0.426 ± 0.337 | 0.672 ± 0.044 | 0.791 ± 0.028 | 0.739 ± 0.091 | -10.16 |
| FastKAN | M-K-M | 0.558 ± 0.081 | 0.451 ± 0.067 | 0.512 ± 0.175 | 0.434 ± 0.080 | 0.808 ± 0.028 | 0.657 ± 0.098 | 0.803 ± 0.022 | 0.721 ± 0.102 | -3.72 |
| FastKAN | M-M-K | 0.560 ± 0.080 | 0.486 ± 0.075 | 0.531 ± 0.184 | 0.335 ± 0.181 | 0.887 ± 0.018 | 0.673 ± 0.101 | 0.828 ± 0.021 | 0.746 ± 0.109 | -3.14 |
| WavKAN | K-M-M | 0.553 ± 0.066 | 0.523 ± 0.078 | 0.563 ± 0.188 | 0.464 ± 0.081 | 0.793 ± 0.026 | 0.610 ± 0.060 | 0.787 ± 0.029 | 0.735 ± 0.107 | -1.02 |
| WavKAN | M-K-M | 0.523 ± 0.052 | 0.429 ± 0.081 | 0.452 ± 0.158 | 0.398 ± 0.054 | 0.729 ± 0.041 | 0.562 ± 0.051 | 0.721 ± 0.027 | 0.575 ± 0.080 | -14.07 |
| WavKAN | M-M-K | 0.548 ± 0.079 | 0.363 ± 0.093 | 0.467 ± 0.175 | 0.353 ± 0.089 | 0.638 ± 0.128 | 0.609 ± 0.051 | 0.765 ± 0.053 | 0.622 ± 0.086 | -15.22 |
| FourierKAN | K-M-M | 0.543 ± 0.050 | 0.311 ± 0.180 | 0.518 ± 0.176 | 0.377 ± 0.147 | 0.714 ± 0.024 | 0.641 ± 0.043 | 0.712 ± 0.125 | 0.667 ± 0.091 | -13.02 |
| FourierKAN | M-K-M | 0.541 ± 0.072 | 0.438 ± 0.077 | 0.496 ± 0.160 | 0.431 ± 0.084 | 0.721 ± 0.040 | 0.643 ± 0.043 | 0.777 ± 0.021 | 0.666 ± 0.094 | -7.70 |
| FourierKAN | M-M-K | 0.547 ± 0.059 | 0.385 ± 0.073 | 0.458 ± 0.155 | 0.406 ± 0.060 | 0.528 ± 0.052 | 0.615 ± 0.036 | 0.783 ± 0.019 | 0.586 ± 0.067 | -15.18 |
| LarctanKAN | K-M-M | 0.568 ± 0.080 | 0.507 ± 0.084 | 0.552 ± 0.181 | 0.408 ± 0.175 | 0.810 ± 0.022 | 0.682 ± 0.040 | 0.830 ± 0.027 | 0.755 ± 0.098 | -0.54 |
| LarctanKAN | M-K-M | 0.579 ± 0.176 | 0.543 ± 0.095 | 0.440 ± 0.271 | 0.367 ± 0.206 | 0.426 ± 0.397 | 0.563 ± 0.275 | 0.724 ± 0.265 | 0.613 ± 0.288 | -15.07 |
| LarctanKAN | M-M-K | 0.584 ± 0.101 | 0.513 ± 0.078 | 0.559 ± 0.187 | 0.402 ± 0.168 | 0.834 ± 0.027 | 0.698 ± 0.037 | 0.837 ± 0.026 | 0.793 ± 0.109 | 1.36 |
| MLPHAR | | 0.580 ± 0.101 | 0.504 ± 0.090 | 0.540 ± 0.186 | 0.411 ± 0.175 | 0.844 ± 0.017 | 0.685 ± 0.046 | 0.836 ± 0.026 | 0.747 ± 0.114 | 0.00 |
| KAN-MLP-Mixer | | 0.598 ± 0.098 | 0.535 ± 0.081 | 0.560 ± 0.192 | 0.470 ± 0.080 | 0.881 ± 0.019 | 0.705 ± 0.040 | 0.840 ± 0.026 | 0.802 ± 0.102 | 5.33 |

Table 5\. \[New\] Comparison of Model Performance across Various Datasets\. This table provides the mean F1 score for different models evaluated on various datasets\. The top three models for each dataset are highlighted: first place in Gold, second place in Silver, and third place in Bronze\. \(macro\-F1 ± std\)\.

| Model | Placement | DG | DSADS | PAMAP2 | OPPO | Skodar | HAPT | MotionSense | MHEALTH |
|---|---|---|---|---|---|---|---|---|---|
| KAN | K-M-M | 69.072 ± 1.899 | 86.751 ± 1.339 | 80.651 ± 1.824 | 73.770 ± 1.553 | 89.240 ± 1.764 | 71.147 ± 2.050 | 88.419 ± 1.081 | 91.540 ± 2.021 |
| KAN | M-K-M | 58.037 ± 12.286 | 36.110 ± 44.791 | 34.811 ± 41.850 | 36.881 ± 32.109 | 37.218 ± 43.907 | 30.197 ± 34.664 | 39.395 ± 41.005 | 38.210 ± 46.092 |
| KAN | M-M-K | 68.311 ± 2.007 | 86.523 ± 1.099 | 79.961 ± 1.035 | 73.623 ± 1.423 | 89.285 ± 2.461 | 70.960 ± 2.274 | 87.748 ± 1.213 | 92.142 ± 1.521 |
| EfficientKAN | K-M-M | 71.418 ± 2.083 | 83.538 ± 19.579 | 81.447 ± 1.599 | 70.525 ± 13.924 | 90.208 ± 1.912 | 71.432 ± 1.813 | 89.090 ± 1.089 | 92.889 ± 1.571 |
| EfficientKAN | M-K-M | 56.644 ± 11.584 | 36.100 ± 44.773 | 28.329 ± 38.844 | 30.396 ± 29.819 | 36.542 ± 42.911 | 28.193 ± 34.626 | 39.377 ± 40.839 | 38.161 ± 46.112 |
| EfficientKAN | M-M-K | 67.879 ± 2.302 | 87.540 ± 1.128 | 79.939 ± 0.956 | 74.004 ± 1.670 | 88.020 ± 1.843 | 71.789 ± 1.899 | 88.079 ± 1.273 | 91.247 ± 1.547 |
| FastKAN | K-M-M | 73.453 ± 2.470 | 72.930 ± 24.820 | 77.292 ± 1.615 | 68.624 ± 2.231 | 82.017 ± 3.077 | 68.773 ± 2.537 | 86.181 ± 1.448 | 89.748 ± 1.392 |
| FastKAN | M-K-M | 68.345 ± 2.273 | 85.305 ± 1.310 | 74.959 ± 2.488 | 71.194 ± 1.649 | 81.894 ± 5.279 | 69.131 ± 1.675 | 86.321 ± 1.396 | 87.638 ± 1.973 |
| FastKAN | M-M-K | 62.895 ± 3.435 | 88.424 ± 1.401 | 78.489 ± 1.400 | 65.458 ± 18.537 | 73.971 ± 27.179 | 70.554 ± 1.841 | 87.953 ± 1.382 | 91.679 ± 1.575 |
| WavKAN | K-M-M | 69.327 ± 2.484 | 84.304 ± 2.242 | 78.520 ± 1.846 | 71.460 ± 1.796 | 86.280 ± 3.645 | 62.896 ± 2.957 | 85.222 ± 1.242 | 89.056 ± 2.424 |
| WavKAN | M-K-M | 55.558 ± 3.345 | 80.431 ± 1.998 | 62.996 ± 3.336 | 65.905 ± 2.471 | 50.267 ± 7.458 | 58.187 ± 2.915 | 79.256 ± 2.096 | 77.456 ± 4.059 |
| WavKAN | M-M-K | 60.347 ± 4.327 | 72.672 ± 7.919 | 64.746 ± 6.232 | 62.739 ± 5.313 | 63.378 ± 11.107 | 60.803 ± 4.433 | 79.861 ± 5.504 | 79.326 ± 5.873 |
| FourierKAN | K-M-M | 72.712 ± 2.210 | 84.612 ± 1.344 | 75.714 ± 1.743 | 71.658 ± 1.618 | 78.761 ± 2.950 | 68.276 ± 2.144 | 78.744 ± 19.723 | 85.093 ± 17.555 |
| FourierKAN | M-K-M | 64.348 ± 3.262 | 80.476 ± 2.469 | 70.372 ± 1.912 | 67.929 ± 3.387 | 60.654 ± 4.931 | 67.055 ± 2.337 | 82.085 ± 1.184 | 83.810 ± 3.874 |
| FourierKAN | M-M-K | 58.543 ± 2.138 | 80.280 ± 1.701 | 54.973 ± 3.691 | 67.064 ± 1.844 | 38.108 ± 8.514 | 59.777 ± 2.316 | 81.560 ± 1.452 | 69.575 ± 4.281 |
| LarctanKAN | K-M-M | 67.910 ± 2.219 | 85.214 ± 1.105 | 76.916 ± 1.782 | 72.085 ± 1.844 | 86.318 ± 2.478 | 70.103 ± 1.705 | 86.792 ± 1.236 | 89.801 ± 1.342 |
| LarctanKAN | M-K-M | 60.943 ± 11.485 | 51.243 ± 42.559 | 48.925 ± 39.882 | 49.073 ± 31.605 | 51.953 ± 41.320 | 44.280 ± 35.014 | 55.157 ± 40.364 | 54.897 ± 44.770 |
| LarctanKAN | M-M-K | 69.984 ± 2.169 | 86.537 ± 2.264 | 82.010 ± 1.249 | 74.572 ± 1.554 | 88.769 ± 2.414 | 70.860 ± 1.879 | 88.775 ± 1.023 | 91.566 ± 1.794 |
| MLPHAR | | 68.572 ± 2.271 | 86.243 ± 1.603 | 79.270 ± 1.355 | 72.898 ± 1.861 | 88.766 ± 2.722 | 70.481 ± 1.797 | 87.570 ± 1.037 | 90.907 ± 1.704 |
| EfficientKAN-LarctanKAN | | 71.967 ± 1.672 | 88.492 ± 1.072 | 83.930 ± 1.699 | 75.363 ± 1.433 | 90.235 ± 2.465 | 72.080 ± 2.446 | 89.697 ± 1.064 | 93.669 ± 1.235 |

Table 6\. Comparison of Model Performance across Various Datasets\. This table provides the mean F1 score for different models evaluated on various datasets\. The top three models for each dataset are highlighted: first place in Gold, second place in Silver, and third place in Bronze\. \(macro\-F1 ± std\)\.

| Model | Placement | DG | DSADS | PAMAP2 | OPPO | Skodar | HAPT | MotionSense | MHEALTH |
|---|---|---|---|---|---|---|---|---|---|
| KAN | K-M-M | 69.072 ± 1.899 | 86.751 ± 1.339 | 80.923 ± 1.847 | 73.770 ± 1.553 | 89.240 ± 1.764 | 71.147 ± 2.050 | 88.419 ± 1.081 | 91.540 ± 2.021 |
| KAN | M-K-M | 58.037 ± 12.286 | 36.110 ± 44.791 | 32.252 ± 40.951 | 36.881 ± 32.109 | 37.218 ± 43.907 | 30.197 ± 34.664 | 39.395 ± 41.005 | 38.210 ± 46.092 |
| KAN | M-M-K | 68.311 ± 2.007 | 86.523 ± 1.099 | 80.421 ± 1.271 | 73.623 ± 1.423 | 89.285 ± 2.461 | 70.960 ± 2.274 | 87.748 ± 1.213 | 92.142 ± 1.521 |
| EfficientKAN | K-M-M | 71.394 ± 2.209 | 83.663 ± 19.605 | 81.400 ± 1.372 | 70.846 ± 13.998 | 90.174 ± 1.946 | 71.023 ± 1.943 | 89.213 ± 1.206 | 92.866 ± 1.548 |
| EfficientKAN | M-K-M | 56.644 ± 11.584 | 36.100 ± 44.773 | 26.945 ± 38.016 | 30.396 ± 29.819 | 36.542 ± 42.911 | 28.193 ± 34.626 | 39.377 ± 40.839 | 38.161 ± 46.112 |
| EfficientKAN | M-M-K | 67.879 ± 2.302 | 87.540 ± 1.128 | 80.277 ± 1.133 | 74.004 ± 1.670 | 88.020 ± 1.843 | 71.789 ± 1.899 | 88.079 ± 1.273 | 91.247 ± 1.547 |
| FastKAN | K-M-M | 73.453 ± 2.470 | 72.930 ± 24.820 | 77.601 ± 1.622 | 68.624 ± 2.231 | 82.017 ± 3.077 | 68.773 ± 2.537 | 86.181 ± 1.448 | 89.748 ± 1.392 |
| FastKAN | M-K-M | 68.345 ± 2.273 | 85.305 ± 1.310 | 74.928 ± 2.377 | 71.194 ± 1.649 | 81.894 ± 5.279 | 69.131 ± 1.675 | 86.321 ± 1.396 | 87.638 ± 1.973 |
| FastKAN | M-M-K | 62.895 ± 3.435 | 88.424 ± 1.401 | 74.444 ± 17.605 | 65.458 ± 18.537 | 73.971 ± 27.179 | 70.554 ± 1.841 | 87.953 ± 1.382 | 91.679 ± 1.575 |
| WavKAN | K-M-M | 69.327 ± 2.484 | 84.304 ± 2.242 | 79.112 ± 2.009 | 71.460 ± 1.796 | 86.280 ± 3.645 | 62.896 ± 2.957 | 85.222 ± 1.242 | 89.056 ± 2.424 |
| WavKAN | M-K-M | 55.558 ± 3.345 | 80.431 ± 1.998 | 63.505 ± 3.514 | 65.905 ± 2.471 | 50.267 ± 7.458 | 58.187 ± 2.915 | 79.256 ± 2.096 | 77.456 ± 4.059 |
| WavKAN | M-M-K | 60.347 ± 4.327 | 72.672 ± 7.919 | 63.430 ± 8.285 | 62.739 ± 5.313 | 63.378 ± 11.107 | 60.803 ± 4.433 | 79.861 ± 5.504 | 79.326 ± 5.873 |
| FourierKAN | K-M-M | 72.712 ± 2.210 | 84.612 ± 1.344 | 75.946 ± 1.719 | 71.658 ± 1.618 | 78.761 ± 2.950 | 68.276 ± 2.144 | 78.744 ± 19.723 | 85.093 ± 17.555 |
| FourierKAN | M-K-M | 64.348 ± 3.262 | 80.476 ± 2.469 | 70.630 ± 1.845 | 67.929 ± 3.387 | 60.654 ± 4.931 | 67.055 ± 2.337 | 82.085 ± 1.184 | 83.810 ± 3.874 |
| FourierKAN | M-M-K | 58.543 ± 2.138 | 80.280 ± 1.701 | 54.391 ± 3.853 | 67.064 ± 1.844 | 38.108 ± 8.514 | 59.777 ± 2.316 | 81.560 ± 1.452 | 69.575 ± 4.281 |
| LarctanKAN | K-M-M | 67.910 ± 2.219 | 85.214 ± 1.105 | 77.193 ± 1.845 | 72.085 ± 1.844 | 86.318 ± 2.478 | 70.103 ± 1.705 | 86.792 ± 1.236 | 89.801 ± 1.342 |
| LarctanKAN | M-K-M | 60.943 ± 11.485 | 51.243 ± 42.559 | 51.711 ± 39.194 | 49.073 ± 31.605 | 51.953 ± 41.320 | 44.280 ± 35.014 | 55.157 ± 40.364 | 54.897 ± 44.770 |
| LarctanKAN | M-M-K | 69.984 ± 2.169 | 86.537 ± 2.264 | 82.212 ± 1.246 | 74.572 ± 1.554 | 88.769 ± 2.414 | 70.860 ± 1.879 | 88.775 ± 1.023 | 91.566 ± 1.794 |
| MLPHAR | | 63.965 ± 5.312 | 90.182 ± 4.892 | 81.175 ± 6.573 | 73.980 ± 6.498 | 86.099 ± 4.964 | 69.704 ± 5.457 | 86.558 ± 1.682 | 90.594 ± 7.658 |
| EfficientKAN-LarctanKAN | | 67.115 ± 5.689 | 92.079 ± 4.164 | 85.839 ± 4.583 | 75.699 ± 5.768 | 85.222 ± 7.343 | 69.940 ± 7.582 | 88.604 ± 1.884 | 91.971 ± 7.519 |

## 6\.Ablation Study: Module\-Level Analysis of Hybrid Design

To systematically examine the role and effectiveness of each component in the hybrid architecture inspired by the empirical study, we conducted a comprehensive ablation study across three core modules: the data embedding layer, feature mixer, and classifier\. For each module, we tested six alternatives, namely five KAN variants \(EfficientKAN, FastKAN, WavKAN, FourierKAN, LarctanKAN\) and a standard MLP, while keeping the other two components fixed\. The results, shown in [Fig\. 3](https://arxiv.org/html/2605.19031#S6.F3), report average performance improvements compared to the original MLPHAR model across eight HAR datasets\.
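The one\-module\-at\-a\-time protocol can be written out as a small enumeration. The sketch below is illustrative only: the fixed settings follow the Figure 3 caption (EfficientKAN embedding, MLP mixer, LarctanKAN classifier), while the dictionary keys and the function name `ablation_runs` are hypothetical.

```python
ALTERNATIVES = [
    "EfficientKAN", "FastKAN", "WavKAN", "FourierKAN", "LarctanKAN", "MLP",
]

# Final hybrid design; two of these stay fixed while the third is varied.
FINAL_DESIGN = {
    "embedding": "EfficientKAN",
    "mixer": "MLP",
    "classifier": "LarctanKAN",
}


def ablation_runs():
    """Return (varied_module, configuration) pairs for the ablation grid."""
    runs = []
    for module in FINAL_DESIGN:
        for alternative in ALTERNATIVES:
            config = dict(FINAL_DESIGN)
            config[module] = alternative
            runs.append((module, config))
    return runs


runs = ablation_runs()
unique_configs = {tuple(sorted(cfg.items())) for _, cfg in runs}
```

The grid has 3 x 6 = 18 entries, but the final design appears once in each panel, so only 16 distinct models are involved. This is also why the best bar in each panel of Figure 3 carries the same \+5\.33% value.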

![Refer to caption](https://arxiv.org/html/2605.19031v1/x4.png) \(a\) Data Embedding
![Refer to caption](https://arxiv.org/html/2605.19031v1/x5.png) \(b\) Feature Mixer
![Refer to caption](https://arxiv.org/html/2605.19031v1/x6.png) \(c\) Classifier

Figure 3\. Average performance improvement compared to the MLPHAR baseline across eight datasets using the hybrid model with different \(a\) data embedding, \(b\) feature mixer, and \(c\) classifier modules\. In \(a\), the feature mixer is fixed as MLP and the classifier as LarctanKAN\. In \(b\), the data embedding is fixed as EfficientKAN and the classifier as LarctanKAN\. In \(c\), the data embedding is EfficientKAN and the feature mixer is MLP\.

In [Fig\. 3\(a\)](https://arxiv.org/html/2605.19031#S6.F3.sf1), EfficientKAN achieves the highest improvement \(\+5\.33%\), validating its expressiveness and efficiency for modeling raw, smooth IMU signals\. LarctanKAN \(\+2\.14%\) and MLP \(\+1\.36%\) also deliver modest improvements, while WavKAN \(\+0\.47%\) shows limited gains\. In contrast, FastKAN \(\-2\.08%\) and FourierKAN \(\-8\.33%\) underperform significantly, indicating that not all spline\-based or basis\-function approaches are suitable for early\-stage processing in noisy or variable sensor conditions\.

[Fig\. 3\(b\)](https://arxiv.org/html/2605.19031#S6.F3.sf2) highlights the importance of choosing a stable and generalizable mixer module\. When EfficientKAN and LarctanKAN are fixed, MLP provides the best improvement \(\+5\.33%\), reaffirming its effectiveness in latent feature transformation\. FastKAN also performs well \(\+2\.93%\), but other KAN variants, particularly EfficientKAN \(\-44\.26%\), FourierKAN \(\-6\.18%\) and LarctanKAN \(\-31\.93%\), degrade performance sharply\. This suggests that KANs are more prone to instability and overfitting in deeper latent spaces, likely due to sensitivity to input distributions and higher\-order feature interactions\.

As shown in [Fig\. 3\(c\)](https://arxiv.org/html/2605.19031#S6.F3.sf3), LarctanKAN outperforms all other classifier options \(\+5\.33%\), confirming its ability to model complex decision boundaries without instability\. MLP \(\+3\.38%\), EfficientKAN \(\+3\.60%\), and FastKAN \(\+3\.49%\) provide moderate improvements\. However, WavKAN \(\-10\.14%\) and FourierKAN \(\-13\.33%\) again show severe performance drops, likely due to poorly shaped output manifolds or unbounded activation behaviors\.

These results strongly reinforce the modular strategy behind our final model:

- EfficientKAN is most suitable as a data embedding layer, effectively extracting rich features from smooth IMU signals\.
- MLP remains the most stable and generalizable feature mixer, ensuring robustness in high\-dimensional latent spaces\.
- LarctanKAN provides the most effective classifier module, offering smooth yet discriminative decision boundaries critical for HAR classification\.

In addition, this study also highlights that KAN variants are not interchangeable; their performance is highly sensitive to position within the network\. When misapplied, they can cause significant degradation, as seen with FourierKAN in both embedding and classifier roles, and EfficientKAN in the feature mixer\.

Thus, this ablation study provides both quantitative and architectural support for the hybrid design: EfficientKAN–MLP–LarctanKAN\.

## 7\.Discussion

In this section, we first evaluate the generalizability of the proposed hybrid design across multiple sensing modalities, diverse window sizes, and a variety of neural backbones\. We then evaluate parameter and computational efficiency of hybrid KAN designs\. Finally, we discuss the limitations of the current work, outline potential directions for future research, and offer practical design guidelines for applying KANs in sensor\-based HAR systems\.

### 7\.1\.Extending Hybrid Design Across Multiple Modalities

To evaluate the generalizability of the KAN\-MLP\-Mixer hybrid architecture across different sensing modalities and channels, we conducted experiments under three sensor configurations as shown in [Table 7](https://arxiv.org/html/2605.19031#S7.T7): \(1\) a single 3\-axis accelerometer \(ACC\) at a single body position, \(2\) a single IMU, and \(3\) multiple sensors\. We exclude the Skodar, DG, and HAPT datasets because only acceleration sensor data is available in the first two, and HAPT includes only 6\-channel IMU data\.

Table 7\. Sensor channel configuration and the performance improvement by KAN\-MLP\-Mixer\. The gain is the average macro F1 score improvement across the five datasets by the hybrid KAN\-MLP\-Mixer compared to the pure MLPHAR model\.

| Dataset | Channels for One Acc | Channels for Single IMU¹ | Channels for Multiple Sensors² |
|---|---|---|---|
| DSADS | 3 | 9 | 45 |
| PAMAP2 | 3 | 6 | 18 |
| OPPO | 3 | 9 | 77 |
| MotionSense | 3 | 6 | 12 |
| MHEALTH | 3 | 6 | 12 |
| Macro F1 Gain (%) | 6.43 | 4.68 | 2.17 |

- ¹ The 6\-channel configuration includes a 3\-axis accelerometer and a 3\-axis gyroscope\. The 9\-channel configuration additionally includes 3\-axis magnetic information\.
- ² The multiple\-sensor configuration includes multiple IMUs on different body parts, or a single IMU with additional feature channels such as rotation angle\.

![Refer to caption](https://arxiv.org/html/2605.19031v1/x7.png)

Figure 4\. Performance comparison for KAN\-MLP\-Mixer and MLPHAR models on five datasets under three sensor configurations \(single ACC, single IMU, multiple sensors\)\. Numerical annotations show the absolute performance difference between models for each condition; ▲ favors KAN\-MLP\-Mixer, ▼ favors MLPHAR\.

As shown in [Fig\. 4](https://arxiv.org/html/2605.19031#S7.F4) and [Table 7](https://arxiv.org/html/2605.19031#S7.T7), the hybrid KAN\-MLP\-Mixer outperformed the MLPHAR baseline on average under all three sensor configurations\. The KAN\-MLP\-Mixer model achieved an average macro F1 score improvement of 6\.43% with a single 3\-channel accelerometer input, 4\.68% with a single IMU sensor input, and 2\.17% with multiple sensors, compared to the MLPHAR baseline\. These results demonstrate that the benefits of introducing KAN\-based feature embeddings and the LarctanKAN module are not confined to a specific modality or sensor richness level\. The largest performance gains were observed under the single\-accelerometer setting, where sensor information is relatively sparse\. This finding suggests that KANs’ strong function approximation capabilities are particularly advantageous when handling limited and noisy sensor data\. In contrast, when multiple IMU sensors are available, the improvement margin diminishes\. Specifically, in the OPPO dataset, which includes 77\-channel inputs from IMUs placed on multiple body locations, the KAN\-MLP\-Mixer and MLPHAR models achieved comparable performance\. Nevertheless, substantial gains were consistently observed when only a single accelerometer or IMU sensor served as input\. A plausible explanation is that richer sensing mitigates the limitations of the MLPHAR model, while the complexity introduced by numerous input channels and diverse body placements reduces the effectiveness of the KAN\-based embedding\. Overall, these results validate the robustness of the proposed hybrid design across varying sensor configurations and underscore its practical value for real\-world HAR deployments with heterogeneous sensing conditions\.

### 7\.2\.Extending Hybrid Design Across Diverse Window Size

To further assess the flexibility of the proposed KAN\-MLP hybrid architecture, we investigated its performance under varying temporal window sizes\. Data from a single IMU was used in this experiment, with the sensors selected as listed in [Table 2](https://arxiv.org/html/2605.19031#S3.T2)\. Specifically, we compared three configurations: \(1\) activity\-oriented mixed window sizes from 1 to 5 seconds, as listed in [Table 2](https://arxiv.org/html/2605.19031#S3.T2), \(2\) uniform 5\-second windows, and \(3\) uniform 10\-second windows\. The evaluation was conducted across eight datasets, as illustrated in [Fig\. 5](https://arxiv.org/html/2605.19031#S7.F5)\.

![Refer to caption](https://arxiv.org/html/2605.19031v1/x8.png)

Figure 5\. Performance comparison for KAN\-MLP\-Mixer and MLPHAR models under three window size configurations\. Numerical annotations show the absolute performance difference between models for each condition; ▲ favors KAN\-MLP\-Mixer, ▼ favors MLPHAR\. Across eight datasets, the KAN\-MLP\-Mixer model achieved an average macro F1 score improvement of 4\.14% with a mixed window size, 4\.38% with a 5\-second window size, and 5\.39% with a 10\-second window size, compared to the MLPHAR baseline\.

Across all datasets, the KAN\-MLP\-Mixer consistently outperformed the MLPHAR baseline, regardless of the windowing strategy\. On average, the KAN\-MLP\-Mixer achieved a macro F1 score improvement of 4\.14% with a mixed window size, 4\.38% with a 5\-second window size, and 5\.39% with a 10\-second window size\. These results demonstrate that the hybrid model not only adapts well to activity\-driven heterogeneous windowing across different datasets but also benefits from longer uniform windows, which offer more temporal context\. The largest gains were observed under the 10\-second window condition, suggesting that the KAN\-MLP\-Mixer architecture is particularly effective at leveraging richer temporal information to improve classification accuracy\. The robustness of the model across different window sizes further highlights its practical applicability to real\-world scenarios, where sensing and segmentation strategies may vary depending on deployment constraints\. Overall, these results reinforce the versatility and strong generalization ability of the KAN\-MLP hybrid design across diverse temporal settings\.

### 7\.3\.Extending Hybrid Design Across Diverse Neural Backbones

To evaluate the generalizability of the proposed hybrid design, we further extended our modular approach, specifically the use of KAN\-based embedding layers and LarctanKAN classifiers, to a set of widely adopted neural network architectures that are not purely MLP\-based, including MCNN, DeepConvLSTM \(Ordóñez and Roggen, [2016](https://arxiv.org/html/2605.19031#bib.bib38)\), and the lightweight TinyHAR \(Zhou et al\., [2022](https://arxiv.org/html/2605.19031#bib.bib58)\) models\. The MCNN model comprises four convolutional layers for feature extraction, followed by two fully connected layers serving as the classifier\. DeepConvLSTM shares a similar feature extraction and classification structure with MCNN but incorporates an additional LSTM module between the feature extractor and the classifier to capture temporal dependencies\. TinyHAR features a more sophisticated and efficient design, integrating transformer\-based and MLP\-based cross\-channel feature interaction and fusion modules, in addition to convolutional layers for feature extraction, LSTM layers for temporal modeling, and MLP layers for final classification\. Together, these three models encompass a wide range of state\-of\-the\-art components commonly used in deep learning\-based HAR studies, making them suitable candidates for evaluating the generalizability of KAN\-based hybrid models\. In each model, we define the convolutional blocks as the feature extractor, the MLP layers at the end as the classifier, and the remaining modules between the feature extractor and classifier as the backbone\. Notably, the MCNN model, with its simpler architecture, does not contain a distinct backbone module\.

![Refer to caption](https://arxiv.org/html/2605.19031v1/x9.png)

Figure 6\. Extending the hybrid design across diverse neural backbones\. The original models use only convolutional layers in the feature extractor and MLP layers in the classifier\. We tested several combinations by replacing the first convolutional layer in the feature extractor or the MLPs in the classifier with KANs\. The combination shown above is denoted as K\-B\-K\.

In these experiments, we used the sensor configurations summarized in [Table 2](https://arxiv.org/html/2605.19031#S3.T2)\. Specifically, we replaced the original first convolutional layer in the feature extraction block with an EfficientKAN module, in which the linear kernel function of the convolutional layer is replaced by EfficientKAN, while preserving the backbone design of each model\. Similarly, we substituted their output classifier heads with LarctanKAN modules\. This modular insertion allowed us to assess whether our hybrid philosophy, originally optimized for pure\-MLP models, could transfer effectively to more complex backbones commonly deployed in HAR tasks\. [Fig\. 6](https://arxiv.org/html/2605.19031#S7.F6) illustrates the resulting hybrid architectures\. In the original model designs, there was typically a single convolutional branch; in our hybrid configuration, we split this branch into two parallel branches, introducing a KAN layer into one of them\. Each branch outputs half the number of channels compared to the original single\-branch configuration, thereby maintaining the same overall output dimensionality before feeding into the backbone\. We compare our hybrid architecture with a version where the KAN block is replaced by a vanilla CNN block, matching the other branch\.
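One way to read the two\-branch design: a standard 1\-D convolution sums `w_i * x_i` over the kernel taps, whereas a KAN\-style convolution sums univariate functions `phi_i(x_i)`, and each branch produces half of the output channels before concatenation\. The pure\-Python sketch below illustrates this under simplifying assumptions: a single input channel, no padding or stride, and fixed closed\-form functions standing in for EfficientKAN's learnable B\-splines\. All function names are hypothetical and this is not the authors' implementation\.

```python
import math
from typing import Callable, List, Sequence

Signal = Sequence[float]
EdgeFn = Callable[[float], float]


def linear_conv1d(x: Signal, weights: Sequence[float]) -> List[float]:
    """Standard 'valid' 1-D convolution (cross-correlation): sum_i w_i * x[t+i]."""
    k = len(weights)
    return [sum(w * x[t + i] for i, w in enumerate(weights))
            for t in range(len(x) - k + 1)]


def kan_conv1d(x: Signal, edge_fns: Sequence[EdgeFn]) -> List[float]:
    """KAN-style convolution: each kernel tap applies its own univariate function."""
    k = len(edge_fns)
    return [sum(f(x[t + i]) for i, f in enumerate(edge_fns))
            for t in range(len(x) - k + 1)]


def two_branch_extractor(
    x: Signal,
    conv_kernels: Sequence[Sequence[float]],
    kan_kernels: Sequence[Sequence[EdgeFn]],
) -> List[List[float]]:
    """Concatenate channels from a plain conv branch and a KAN-style branch."""
    conv_out = [linear_conv1d(x, w) for w in conv_kernels]
    kan_out = [kan_conv1d(x, fns) for fns in kan_kernels]
    return conv_out + kan_out


x = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0]
# A hypothetical original layer with 4 output channels becomes
# 2 conv channels + 2 KAN-style channels in the hybrid version.
conv_kernels = [[0.2, 0.5, 0.2], [-1.0, 0.0, 1.0]]
kan_kernels = [
    [math.tanh, math.sin, math.tanh],            # stand-ins for learned splines
    [lambda v: v * v, math.atan, lambda v: v],
]
features = two_branch_extractor(x, conv_kernels, kan_kernels)
```

When every `phi_i` is a plain scaling `v -> w_i * v`, `kan_conv1d` reduces to `linear_conv1d`, so the KAN\-style branch strictly generalizes the convolutional one\.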

We evaluated three variants based on different combinations:

- K\-B\-M: EfficientKAN in the first convolutional layer, original backbone, and MLP classifier\.
- C\-B\-K: Original CNN\-based feature extractor and backbone, LarctanKAN classifier\.
- K\-B\-K: EfficientKAN in the first convolutional layer, original backbone, and LarctanKAN classifier, as shown in [Fig\. 6](https://arxiv.org/html/2605.19031#S7.F6)\.

Table 8\. Performance of hybrid variants across diverse neural backbones on eight datasets \(macro\-F1 ± std\)\.

| Backbone | Model | DG | DSADS | PAMAP2 | OPPO | Skodar | HAPT | MotionSense | MHEALTH | Gain \(%\) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MCNN | K\-B\-M | 0\.654±0\.164 | 0\.657±0\.089 | 0\.615±0\.227 | 0\.544±0\.140 | 0\.964±0\.008 | 0\.819±0\.024 | 0\.914±0\.032 | 0\.642±0\.151 | \+0\.24 |
| MCNN | C\-B\-K | 0\.668±0\.148 | 0\.675±0\.090 | 0\.615±0\.204 | 0\.575±0\.135 | 0\.960±0\.014 | 0\.821±0\.027 | 0\.919±0\.034 | 0\.652±0\.143 | \+1\.77 |
| MCNN | K\-B\-K | 0\.672±0\.159 | 0\.686±0\.080 | 0\.616±0\.196 | 0\.564±0\.122 | 0\.968±0\.009 | 0\.825±0\.025 | 0\.919±0\.035 | 0\.659±0\.137 | \+2\.13 |
| MCNN | Original | 0\.659±0\.155 | 0\.657±0\.091 | 0\.607±0\.202 | 0\.567±0\.146 | 0\.952±0\.012 | 0\.817±0\.030 | 0\.921±0\.030 | 0\.615±0\.142 | 0\.00 |
| DeepConvLSTM | K\-B\-M | 0\.680±0\.147 | 0\.683±0\.098 | 0\.631±0\.216 | 0\.593±0\.156 | 0\.964±0\.013 | 0\.834±0\.029 | 0\.920±0\.024 | 0\.635±0\.153 | \+0\.22 |
| DeepConvLSTM | C\-B\-K | 0\.638±0\.158 | 0\.678±0\.100 | 0\.629±0\.216 | 0\.599±0\.152 | 0\.971±0\.005 | 0\.837±0\.025 | 0\.929±0\.029 | 0\.684±0\.157 | \+0\.61 |
| DeepConvLSTM | K\-B\-K | 0\.650±0\.155 | 0\.689±0\.082 | 0\.654±0\.214 | 0\.596±0\.157 | 0\.959±0\.016 | 0\.830±0\.032 | 0\.936±0\.021 | 0\.671±0\.179 | \+1\.09 |
| DeepConvLSTM | Original | 0\.669±0\.150 | 0\.677±0\.106 | 0\.614±0\.243 | 0\.586±0\.157 | 0\.966±0\.015 | 0\.838±0\.020 | 0\.925±0\.034 | 0\.656±0\.147 | 0\.00 |
| TinyHAR | K\-B\-M | 0\.665±0\.151 | 0\.670±0\.074 | 0\.624±0\.207 | 0\.589±0\.140 | 0\.955±0\.013 | 0\.810±0\.029 | 0\.928±0\.017 | 0\.580±0\.146 | \-0\.16 |
| TinyHAR | C\-B\-K | 0\.641±0\.176 | 0\.655±0\.141 | 0\.643±0\.213 | 0\.561±0\.140 | 0\.940±0\.011 | 0\.810±0\.041 | 0\.914±0\.021 | 0\.615±0\.134 | \-0\.82 |
| TinyHAR | K\-B\-K | 0\.631±0\.174 | 0\.636±0\.085 | 0\.634±0\.200 | 0\.590±0\.129 | 0\.931±0\.028 | 0\.814±0\.029 | 0\.910±0\.029 | 0\.689±0\.109 | \+0\.46 |
| TinyHAR | Original | 0\.658±0\.168 | 0\.659±0\.129 | 0\.623±0\.211 | 0\.589±0\.129 | 0\.930±0\.014 | 0\.806±0\.035 | 0\.913±0\.023 | 0\.636±0\.119 | 0\.00 |

[Table 8](https://arxiv.org/html/2605.19031#S7.T8) summarizes the performance of the hybrid variants across the three models\. For every backbone, the K\-B\-K variant achieves the highest average gain, outperforming both the original model and the other variants\. The average performance also reveals a clear trend: KAN\-based modules delivered the largest gain \(2\.13%\) in the simplest architecture \(MCNN\), with progressively smaller improvements for DeepConvLSTM \(1\.09%\) and especially for the more advanced TinyHAR \(0\.46%\)\. This indicates that the benefits of the EfficientKAN embedding and LarctanKAN classifier are inversely related to the capacity and sophistication of the host architecture\. In other words, the more complex and expressive the backbone, the less additional value the KAN components seem to provide\.
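The Gain column of Table 8 is not defined in this section\. One reading that approximately reproduces the reported values is the mean of the per\-dataset relative improvements over the original backbone\. The sketch below applies that assumed definition to the MCNN rows; the function name is hypothetical, and small deviations from the reported \+2\.13% are expected because the table values are rounded to three decimals\.

```python
from statistics import mean
from typing import Sequence

DATASETS = ["DG", "DSADS", "PAMAP2", "OPPO", "Skodar", "HAPT",
            "MotionSense", "MHEALTH"]

# Mean macro-F1 values copied from Table 8 (MCNN backbone).
mcnn_original = [0.659, 0.657, 0.607, 0.567, 0.952, 0.817, 0.921, 0.615]
mcnn_kbk = [0.672, 0.686, 0.616, 0.564, 0.968, 0.825, 0.919, 0.659]


def mean_relative_gain(variant: Sequence[float], baseline: Sequence[float]) -> float:
    """Average per-dataset relative improvement in percent.

    This definition is an assumption about how the Gain column is computed.
    """
    if len(variant) != len(baseline):
        raise ValueError("variant and baseline must cover the same datasets")
    return 100.0 * mean((v - b) / b for v, b in zip(variant, baseline))


gain = mean_relative_gain(mcnn_kbk, mcnn_original)
print(f"K-B-K vs. original MCNN: {gain:+.2f}%")
```

Under this assumption the K\-B\-K variant of MCNN comes out at roughly \+2\.1%, close to the tabulated figure\. Averaging relative rather than absolute differences gives low\-scoring datasets the same weight as high\-scoring ones\.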

One fundamental reason for the diminishing returns is the difference in model capacity across these backbones\. MCNN, being a relatively shallow CNN\-only model, has limited ability to capture complex, non\-linear relationships in sensor data\. It relies on local convolutional filters and static activation functions, so augmenting it with KAN modules significantly boosts its expressive power\. In contrast, DeepConvLSTM already possesses greater representational capacity by combining convolutional feature extractors with LSTM layers that capture temporal dynamics\. This means the baseline DeepConvLSTM can already model many of the patterns that KAN would help with in a simpler network\. Any additional complex mapping provided by the KAN embedding or classifier might partially duplicate what the CNN\+LSTM is learning \(a form of learning redundancy\), resulting in only moderate gains\. Finally, TinyHAR is an even more sophisticated model, incorporating not just convolution and recurrent layers but also attention\-based mechanisms \(transformer encoder blocks\) for cross\-channel feature interaction\. Such attention modules effectively perform a data\-dependent re\-weighting of features, which is conceptually similar to the kernel\-based weighting that KAN modules implement\. Thus, the TinyHAR baseline is already extremely expressive – it was designed to match or exceed DeepConvLSTM’s accuracy with a much smaller parameter count by cleverly maximizing model utilization\. There is simply less “unused capacity” for KAN to fill in TinyHAR\. The small improvement observed in TinyHAR with KAN likely comes from fine\-tuning the decision boundaries or subtle feature tweaks, but much of KAN’s powerful function\-approximation ability is redundant given TinyHAR’s existing components\. In essence, as the backbone’s capacity to approximate complex functions increases, the marginal utility of inserting KAN layers decreases\.

### 7\.4\.Evaluating Parameter and Computational Efficiency of Hybrid KAN Designs

We further assessed the practicality and efficiency of the hybrid KAN\-MLP\-Mixer architecture by analyzing its parameter and computational complexities relative to the baseline MLPHAR model\. As depicted in [Fig\. 7](https://arxiv.org/html/2605.19031#S7.F7), when the KAN\-MLP\-Mixer and MLPHAR models have identical layer configurations and input/output dimensions, the hybrid KAN design generally incurs a higher total parameter count\. This increase arises primarily because each KAN layer inherently introduces approximately G times more parameters than a corresponding MLP layer, where G is the number of grid points used in the B\-spline basis\. However, by strategically assigning KAN modules only to the data embedding and classifier components, and preserving MLP in the typically larger feature mixer module, the hybrid design achieves a balanced trade\-off between parameter efficiency and expressive capacity\.
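The multiplicative overhead can be made concrete with a back\-of\-the\-envelope count\. The sketch assumes a common B\-spline parameterization in which each input\-output edge stores G \+ k spline coefficients \(k being the spline order\) plus one base weight, so the multiplier is about G \+ k \+ 1 and grows linearly with G\. The exact constant depends on the KAN implementation, and the layer sizes and function names below are hypothetical\.

```python
def mlp_layer_params(d_in: int, d_out: int) -> int:
    """Weights plus biases of one fully connected layer."""
    return d_in * d_out + d_out


def kan_layer_params(d_in: int, d_out: int, grid_size: int,
                     spline_order: int = 3) -> int:
    """Approximate parameter count of a B-spline KAN layer.

    Assumed parameterization: each of the d_in * d_out edges carries
    (grid_size + spline_order) spline coefficients plus one base weight.
    """
    per_edge = grid_size + spline_order + 1
    return d_in * d_out * per_edge


# Hypothetical 64 -> 32 layer, swept over grid sizes 1..6.
for g in range(1, 7):
    ratio = kan_layer_params(64, 32, g) / mlp_layer_params(64, 32)
    print(f"G={g}: KAN/MLP parameter ratio ~ {ratio:.1f}")
```

Because the overhead scales with the number of edges, confining KAN layers to the comparatively small embedding and classifier modules keeps the total increase modest\.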

![Refer to caption](https://arxiv.org/html/2605.19031v1/x10.png)

Figure 7\. Number of parameters in different components of KAN\-MLP\-Mixer and MLPHAR models\. \(The LarctanKAN\-based classifier and MLP modules have the same number of trainable parameters; therefore, the parameter difference between the KAN\-MLP\-Mixer and the MLPHAR models originates entirely from the data embedding module\.\)

To investigate the effect of model size on performance, we also increased the size of the data embedding in MLPHAR models \(see [Fig\. 8](https://arxiv.org/html/2605.19031#S7.F8)\) to determine whether our performance gain can be matched by simply enlarging MLP\-only models\. This parameter\-efficiency comparison shows that, across multiple benchmark datasets, the KAN\-MLP\-Mixer consistently outperforms or matches the performance of the MLPHAR baseline while using fewer or comparable parameters\. Specifically, as the grid size G increased from 1 to 6 in the KAN\-based components and the hidden dimensions varied from 8 to 40 in the data embedding module of MLPHAR, the hybrid model demonstrated superior scalability\. In datasets such as DG, DSADS, and HAPT, the KAN\-MLP\-Mixer achieved notably higher macro F1 scores even at lower parameter counts compared to the MLPHAR baseline\. This suggests that the hybrid model’s strategically placed KAN layers leverage their powerful function approximation capabilities to learn richer representations, leading to improved accuracy without excessively increasing model size\.

![Refer to caption](https://arxiv.org/html/2605.19031v1/x11.png)

Figure 8\. Parameter efficiency when scaling KAN\-MLP\-Mixer and MLPHAR across eight benchmark datasets\. For KAN\-MLP\-Mixer, the grid size G was systematically increased from 1 to 6; for MLPHAR, the hidden dimension of the data embedding module was set to 8, 16, 24, 32, and 40\. These results illustrate the scalability of KAN\-hybrid models and their ability to achieve competitive or superior performance compared to MLP\-only models under varying parameter budgets\.

Computational efficiency analysis further corroborates the practicality of the hybrid approach, as shown in [Fig\. 9](https://arxiv.org/html/2605.19031#S7.F9)\. Although KAN layers typically demand higher computation due to the evaluation of spline\-based activation functions, the KAN\-MLP\-Mixer still maintains competitive or superior computational efficiency across the benchmark datasets\. Notably, in datasets like DG, DSADS, OPPO, and Skodar, the hybrid architecture achieves higher macro F1 scores at comparable or lower floating\-point operation \(FLOP\) budgets relative to MLPHAR\. This efficiency arises primarily because the KAN layers’ ability to accurately model complex input\-output relationships with fewer neurons and layers compensates for their inherently higher per\-layer computational demands\. Consequently, the hybrid architecture can provide enhanced performance within practical computational constraints, making it particularly suitable for deployment in resource\-constrained wearable and ubiquitous computing scenarios\.

![Refer to caption](https://arxiv.org/html/2605.19031v1/x12.png)

Figure 9\. Computational efficiency comparison between KAN\-MLP\-Mixer and MLPHAR across eight benchmark datasets\. For KAN\-MLP\-Mixer, the grid size G was increased from 1 to 6; for MLPHAR, the hidden dimension was set to 8, 16, 24, 32, and 40\. These results demonstrate how KAN\-based models maintain competitive or superior performance relative to MLP\-based models while offering favorable computational profiles under varying FLOP budgets\.

### 7\.5\.Limitations of the Current Study

Despite the promising results and comprehensive evaluation presented, this study is subject to several limitations that should be acknowledged:

Dataset Scope and Generalization\. Our evaluation relies primarily on publicly available benchmark datasets, which, while diverse in sensor modalities, positions, and activity complexity, may not encompass the full variety and unpredictability of real\-world HAR scenarios\. Many real\-world datasets include intermittent sensor noise, missing data, and user\-specific variations not fully captured by the controlled benchmark conditions\.

Sensor Modality and Placement Constraints\. The selected datasets primarily utilize accelerometers, gyroscopes, and occasionally magnetometers\. Other sensor modalities, such as physiological sensors \(e\.g\., heart rate monitors, electromyography\), environmental sensors \(e\.g\., ambient light or temperature\), or fusion with video and audio data, were not considered\. Consequently, the performance of the proposed architecture in multi\-modal sensing contexts remains unexplored\.

Computational and Energy Constraints\. While the proposed KAN\-MLP\-Mixer architecture significantly improves model performance, its practical deployment on resource\-constrained devices \(e\.g\., low\-powered wearables or IoT sensors\) has not been directly validated\. The trade\-off between improved accuracy and increased computational complexity or energy consumption, a common challenge in real\-world applications, requires further investigation\.

Architectural and Hyperparameter Optimization\. Our proposed model utilizes specific variants of KANs \(EfficientKAN and LarctanKAN\) selected based on empirical performance\. The sensitivity of our results to hyperparameters such as KAN spline complexity, grid size, or the number of layers and neurons remains to be fully quantified\. Additionally, automated methods for systematically selecting or optimizing these parameters have yet to be explored\.

### 7\.6\.Future Work Directions

Based on the insights gained from this study, several promising research directions emerge:

Extension to Multi\-modal HAR Systems\. Exploring the integration of additional sensor modalities beyond inertial sensors—such as physiological, audio, visual, and environmental data—offers an intriguing avenue for enhancing HAR performance and robustness\. Investigating how KAN\-based architectures perform within such multi\-modal data fusion frameworks could further validate their utility in practical HAR deployments\.

Resource\-aware Optimization for Edge Devices\. Evaluating and optimizing the proposed hybrid architecture explicitly for computational and energy efficiency on edge and wearable devices represents an important next step\. Future work should examine model compression techniques, efficient quantization, lightweight model structures, and energy\-aware training procedures to ensure the hybrid model remains practical for ubiquitous deployment\.

Automated Architecture Search and Hyperparameter Optimization\. Investigating automated techniques, such as Neural Architecture Search \(NAS\), to systematically identify optimal KAN placements, architectural parameters, and hyperparameter settings would further refine performance\. Additionally, rigorous sensitivity analyses of hyperparameters could provide clearer guidelines for deploying KAN\-based hybrid architectures in diverse scenarios\.

Interpretable and Explainable KAN\-based Models\. Given the inherent interpretability of KAN modules, future research should investigate interpretability\-focused evaluations, providing clearer insights into what sensor\-derived features and patterns contribute most significantly to accurate HAR predictions\. Enhanced interpretability could substantially increase user trust and acceptance in critical application domains such as healthcare or rehabilitation\.

### 7\.7\.Design Guidelines for Sensor\-Based HAR

Our comprehensive ablation results, supported by theoretical insights and synthetic function fitting experiments, offer actionable design principles for integrating KANs into modern neural architectures\. We distill these findings into three practical guidelines:

1. Use KANs selectively, not universally: While KANs demonstrate strong representational capacity, particularly for modeling smooth, continuous functions, our findings show that full\-scale replacement of MLPs with KANs leads to significant performance degradation, especially in deeper layers\. Instead of wholesale substitution, KANs should be applied strategically where their strengths \(e\.g\., spline\-based approximation, localized activation\) align with the structure of the input data\.
2. Match model component to signal characteristics: Our empirical results suggest a strong link between model layer type and the nature of the signal it processes:
   - Use KANs for data embedding when inputs are smooth and periodic, such as IMU sensor data in HAR\.
   - Use MLPs for latent feature mixing, where signal continuity is less pronounced and stability, generalization, and scalability are crucial\.
   - Use LarctanKANs for classification, where stable, bounded nonlinearities can model decision boundaries without inducing overshooting or noise amplification\.
3. Prioritize hybrid modularity over architectural novelty: The key to successful KAN integration lies not in the novelty of the model itself, but in the modular hybridization of architectures\. Each module should be chosen and positioned based on its demonstrated strengths\. Our EfficientKAN–MLP–LarctanKAN configuration provides a template for how to combine expressive, stable, and efficient modules into a unified architecture, one that outperforms both pure MLP and pure KAN\-based designs across diverse datasets\.

These guidelines are intended to help researchers and practitioners move beyond one\-size\-fits\-all architectures, encouraging principled design choices based on signal properties, module behavior, and empirical evidence\. We believe this modular mindset will be essential as the field continues to integrate emerging neural operators into scalable, interpretable, and domain\-adaptive machine learning systems\.
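The point about bounded nonlinearities in the second guideline can be illustrated numerically\. The sketch below contrasts an arctan\-shaped unit with an unbounded cubic when fed an outlier\. It assumes a unit of the form arctan\(k·x\) with a hypothetical scale k purely for illustration; the actual LarctanKAN parameterization is defined by Chen and Zhang \(2024a\) and may differ, and the cubic coefficients are arbitrary\.

```python
import math
from typing import Sequence


def arctan_unit(x: float, k: float = 1.0) -> float:
    """Bounded univariate function arctan(k * x); k is a hypothetical scale."""
    return math.atan(k * x)


def cubic_unit(x: float, coeffs: Sequence[float] = (0.0, 1.0, 0.0, 0.1)) -> float:
    """Unbounded cubic polynomial, used here only as a contrast."""
    c0, c1, c2, c3 = coeffs
    return c0 + c1 * x + c2 * x ** 2 + c3 * x ** 3


# A typical normalized reading versus a hypothetical sensor glitch.
clean, outlier = 0.8, 25.0
for name, fn in [("arctan", arctan_unit), ("cubic", cubic_unit)]:
    print(f"{name}: clean={fn(clean):.3f}  outlier={fn(outlier):.3f}")
```

The arctan output stays within ±π/2 regardless of input magnitude, whereas the cubic grows with the cube of the outlier, which is the kind of overshoot the guideline warns against in the classifier\.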

## 8\.Conclusion

This work systematically explored integrating KANs into robust and computationally efficient MLP\-based architectures for IMU\-based HAR\. Recognizing KANs’ high expressivity yet sensitivity to noisy, real\-world data, and MLPs’ robustness and efficiency, we introduced the KAN\-MLP\-Mixer hybrid architecture\. This hybrid design strategically employs EfficientKAN for adaptive input embedding, retains standard MLP layers for intermediate feature mixing, and incorporates a specialized LarctanKAN classifier\. Evaluations across eight diverse HAR datasets confirmed the effectiveness of this targeted integration, achieving an average relative macro F1 score improvement of 5\.33% over the pure\-MLP baseline \(MLPHAR\)\. Our results demonstrate that selective hybridization outperforms both standalone KAN and MLP models\. Furthermore, extending this hybrid strategy to other state\-of\-the\-art neural backbones improved their average performance when both the KAN\-based embedding and the LarctanKAN classifier were used, underscoring its broad applicability\. This work thus provides concrete guidelines for effectively leveraging KANs in practical wearable sensing scenarios, marking a promising step toward accurate HAR systems that harness the function approximation power of KANs\.

## References

- Abbas et al\. \(2024\) Sidra Abbas, Shtwai Alsubai, Muhammad Ibrar Ul Haque, Gabriel Avelino Sampedro, Ahmad Almadhor, Abdullah Al Hejaili, and Iryna Ivanochko\. 2024\. Active machine learning for heterogeneity activity recognition through smartwatch sensors\. *IEEE Access* 12 \(2024\), 22595–22607\.
- Abdel\-Salam et al\. \(2021\) Reem Abdel\-Salam, Rana Mostafa, and Mayada Hadhood\. 2021\. Human activity recognition using wearable sensors: review, challenges, evaluation benchmark\. In *International workshop on deep learning for human activity recognition*\. Springer, 1–15\.
- Altarabichi \(2024\) Mohammed Ghaith Altarabichi\. 2024\. Dropkan: Regularizing kans by masking post\-activations\. *arXiv preprint arXiv:2407\.13044* \(2024\)\.
- Arnold \(1963\) Vladimir I\. Arnold\. 1963\. On Functions of Three Variables\. *Doklady Akademii Nauk SSSR* 148 \(1963\), 9–12\.
- Arnold \(2009\) Vladimir I Arnold\. 2009\. On functions of three variables\. *Collected Works: Representations of Functions, Celestial Mechanics and KAM Theory, 1957–1965* \(2009\), 5–8\.
- Bachlin et al\. \(2009\) Marc Bachlin, Meir Plotnik, Daniel Roggen, Inbal Maidan, Jeffrey M Hausdorff, Nir Giladi, and Gerhard Troster\. 2009\. Wearable assistant for Parkinson’s disease patients with the freezing of gait symptom\. *IEEE Transactions on Information Technology in Biomedicine* 14, 2 \(2009\), 436–446\.
- Banos et al\. \(2014\) Oresti Banos, Rafael Garcia, Juan A Holgado\-Terriza, Miguel Damas, Hector Pomares, Ignacio Rojas, Alejandro Saez, and Claudia Villalonga\. 2014\. mHealthDroid: A novel framework for agile development of mobile health applications\. *Ambient Assisted Living and Daily Activities* 8868, 14 \(2014\), 91–98\.
- Bian et al\. \(2022\) Sizhen Bian, Mengxi Liu, Bo Zhou, and Paul Lukowicz\. 2022\. The state\-of\-the\-art sensing techniques in human activity recognition: A survey\. *Sensors* 22, 12 \(2022\), 4596\.
- Bodner et al\. \(2024\) Alexander Dylan Bodner, Antonio Santiago Tepsich, Jack Natan Spolski, and Santiago Pourteau\. 2024\. Convolutional kolmogorov\-arnold networks\. *arXiv preprint arXiv:2406\.13155* \(2024\)\.
- Bozorgasl and Chen \(\[n\. d\.\]\) Zavareh Bozorgasl and Hao Chen\. \[n\. d\.\]\. Wav\-kan: Wavelet kolmogorov\-arnold networks, 2024\. *arXiv preprint arXiv:2405\.12832* \(\[n\. d\.\]\)\.
- Cang et al\. \(2024\) Yueyang Cang, Li Shi, et al\. 2024\. Can KAN Work? Exploring the Potential of Kolmogorov\-Arnold Networks in Computer Vision\. *arXiv preprint arXiv:2411\.06727* \(2024\)\.
- Cao \(2024\) Blealtan Cao\. 2024\. An Efficient Implementation of Kolmogorov\-Arnold Network\. [https://github\.com/Blealtan/efficient\-kan](https://github.com/Blealtan/efficient-kan)\. Accessed: 2025\-04\-10\.
- Chavarriaga et al\. \(2013\) Ricardo Chavarriaga, Hesam Sagha, Alberto Calatroni, Sundara Tejaswi Digumarti, Gerhard Tröster, José del R Millán, and Daniel Roggen\. 2013\. The Opportunity challenge: A benchmark database for on\-body sensor\-based activity recognition\. *Pattern Recognition Letters* 34, 15 \(2013\), 2033–2042\.
- Chen et al\. \(2021\) Kaixuan Chen, Dalin Zhang, Lina Yao, Bin Guo, Zhiwen Yu, and Yunhao Liu\. 2021\. Deep learning for sensor\-based human activity recognition: Overview, challenges, and opportunities\. *ACM Computing Surveys \(CSUR\)* 54, 4 \(2021\), 1–40\.
- Chen and Zhang \(2024a\) Zhijie Chen and Xinglin Zhang\. 2024a\. Larctan\-skan: Simple and efficient single\-parameterized kolmogorov\-arnold networks using learnable trigonometric function\. *arXiv preprint arXiv:2410\.19360* \(2024\)\.
- Chen and Zhang \(2024b\) Zhijie Chen and Xinglin Zhang\. 2024b\. Lss\-skan: Efficient kolmogorov\-arnold networks based on single\-parameterized function\. *arXiv preprint arXiv:2410\.14951* \(2024\)\.
- Dirgová Luptáková et al\. \(2022\) Iveta Dirgová Luptáková, Martin Kubovčík, and Jiří Pospíchal\. 2022\. Wearable sensor\-based human activity recognition with transformer model\. *Sensors* 22, 5 \(2022\), 1911\.
- Drokin \(2024\) Ivan Drokin\. 2024\. Kolmogorov\-arnold convolutions: Design principles and empirical studies\. *arXiv preprint arXiv:2407\.01092* \(2024\)\.
- Enokibori \(2024\) Yu Enokibori\. 2024\. rTsfNet: a DNN model with Multi\-head 3D Rotation and Time Series Feature Extraction for IMU\-based Human Activity Recognition\. *Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies* 8, 4 \(2024\), 1–26\.
- Fan and Gao \(2021\) Changjun Fan and Fei Gao\. 2021\. Enhanced human activity recognition using wearable sensors via a hybrid feature selection method\. *Sensors* 21, 19 \(2021\), 6434\.
- Genet and Inzirillo \(2024\) Remi Genet and Hugo Inzirillo\. 2024\. Tkan: Temporal kolmogorov\-arnold networks\. *arXiv preprint arXiv:2405\.07344* \(2024\)\.
- Ibrahum et al\. \(2024\) Ahmed Dawod Mohammed Ibrahum, Zhengyu Shang, and Jang\-Eui Hong\. 2024\. How Resilient Are Kolmogorov–Arnold Networks in Classification Tasks? A Robustness Investigation\. *Applied Sciences* 14, 22 \(2024\), 10173\.
- Ivashkov et al\. \(2026\) Petr Ivashkov, Po\-Wei Huang, Kelvin Koor, Lirandë Pira, and Patrick Rebentrost\. 2026\. QKAN: quantum Kolmogorov\-Arnold networks with applications in machine learning and multivariate state preparation\. *npj Quantum Information* 12, 1 \(11 Mar 2026\), 73\. [doi:10\.1038/s41534\-026\-01202\-5](https://doi.org/10.1038/s41534-026-01202-5)
- Jamali et al\. \(2024\) Ali Jamali, Swalpa Kumar Roy, Danfeng Hong, Bing Lu, and Pedram Ghamisi\. 2024\. How to learn more? Exploring Kolmogorov–Arnold networks for hyperspectral image classification\. *Remote Sensing* 16, 21 \(2024\), 4015\.
- Khan and Ahmad \(2021\) Zanobya N Khan and Jamil Ahmad\. 2021\. Attention induced multi\-head convolutional neural network for human activity recognition\. *Applied soft computing* 110 \(2021\), 107671\.
- Koenig et al\. \(2024\) Benjamin C Koenig, Suyong Kim, and Sili Deng\. 2024\. KAN\-ODEs: Kolmogorov–Arnold network ordinary differential equations for learning dynamical systems and hidden physics\. *Computer Methods in Applied Mechanics and Engineering* 432 \(2024\), 117397\.
- Kolmogorov \(1957\) Andrei Nikolaevich Kolmogorov\. 1957\. On the representations of continuous functions of many variables by superposition of continuous functions of one variable and addition\. In *Dokl\. Akad\. Nauk USSR*, Vol\. 114\. 953–956\.
- Le et al\. \(2024\) Tran Xuan Hieu Le, Thi Diem Tran, Hoai Luan Pham, Vu Trung Duong Le, Tuan Hai Vu, Van Tinh Nguyen, Yasuhiko Nakashima, et al\. 2024\. Exploring the limitations of kolmogorov\-arnold networks in classification: Insights to software training and hardware implementation\. In *2024 Twelfth International Symposium on Computing and Networking Workshops \(CANDARW\)*\. IEEE, 110–116\.
- Li \(2024\) Ziyao Li\. 2024\. Kolmogorov\-Arnold Networks are Radial Basis Function Networks\. \(2024\)\. arXiv:2405\.06721 \[cs\.LG\]
- Liu et al\. \(2021\) Hanxiao Liu, Zihang Dai, David So, and Quoc V Le\. 2021\. Pay attention to mlps\. *Advances in neural information processing systems* 34 \(2021\), 9204–9215\.
- Liu et al\. \(2024a\) Mengxi Liu, Daniel Geißler, Dominique Nshimyimana, Sizhen Bian, Bo Zhou, and Paul Lukowicz\. 2024a\. Initial investigation of kolmogorov\-arnold networks \(kans\) as feature extractors for imu based human activity recognition\. In *Companion of the 2024 on ACM International Joint Conference on Pervasive and Ubiquitous Computing*\. 500–506\.
- Liu et al\. \(2024b\) Ziming Liu, Pingchuan Ma, Yixuan Wang, Wojciech Matusik, and Max Tegmark\. 2024b\. Kan 2\.0: Kolmogorov\-arnold networks meet science\. *arXiv preprint arXiv:2408\.10205* \(2024\)\.
- Liu et al\. \(2024c\) Ziming Liu, Yixuan Wang, Sachin Vaidya, Fabian Ruehle, James Halverson, Marin Soljačić, Thomas Y Hou, and Max Tegmark\. 2024c\. Kan: Kolmogorov\-arnold networks\. *arXiv preprint arXiv:2404\.19756* \(2024\)\.
- Malekzadeh et al\. \(2021\) Mohammad Malekzadeh, Richard Clegg, Andrea Cavallaro, and Hamed Haddadi\. 2021\. Dana: Dimension\-adaptive neural architecture for multivariate sensor data\. *Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies* 5, 3 \(2021\), 1–27\.
- Miyoshi et al\. \(2025\) Takeru Miyoshi, Makoto Koshino, and Hidetaka Nambo\. 2025\. Applying MLP\-Mixer and gMLP to Human Activity Recognition\. *Sensors* 25, 2 \(2025\), 311\.
- Ojiako and Farrahi \(2023\) Kamsiriochukwu Ojiako and Katayoun Farrahi\. 2023\. MLPs Are All You Need for Human Activity Recognition\. *Applied Sciences* 13, 20 \(2023\), 11154\.
- Ordóñez and Roggen \(2016\) Francisco J\. Ordóñez and Daniel Roggen\. 2016\. Deep Convolutional and LSTM Recurrent Neural Networks for Multimodal Wearable Activity Recognition\. *Sensors* 16, 1 \(2016\), 115\.
- Pinkus \(1999\) Allan Pinkus\. 1999\. Approximation theory of the MLP model in neural networks\. *Acta numerica* 8 \(1999\), 143–195\.
- Poeta et al\.\(2024\)Eleonora Poeta, Flavio Giobergia, Eliana Pastor, Tania Cerquitelli, and Elena Baralis\. 2024\.A benchmarking study of kolmogorov\-arnold networks on tabular data\. In*2024 IEEE 18th International Conference on Application of Information and Communication Technologies \(AICT\)*\. IEEE, 1–6\.
- Reiss and Stricker \(2012\)Attila Reiss and Didier Stricker\. 2012\.Introducing a new benchmarked dataset for activity monitoring\. In*2012 16th international symposium on wearable computers*\. IEEE, 108–109\.
- Reyes\-Ortiz et al\.\(2013\)Jorge L\. Reyes\-Ortiz, Davide Anguita, Alessandro Ghio, Luca Oneto, and Xavier Parra\. 2013\.*Human Activity Recognition Using Smartphones*\.[doi:10\.24432/C54S4K](https://doi.org/10.24432/C54S4K)
- Shavit and Klein \(2021\)Yoli Shavit and Itzik Klein\. 2021\.Boosting inertial\-based human activity recognition with transformers\.*IEEE Access*9 \(2021\), 53540–53547\.
- Shen et al\.\(2025\)Haoran Shen, Chen Zeng, Jiahui Wang, and Qiao Wang\. 2025\.Reduced effectiveness of kolmogorov\-arnold networks on functions with noise\. In*ICASSP 2025\-2025 IEEE International Conference on Acoustics, Speech and Signal Processing \(ICASSP\)*\. IEEE, 1–5\.
- Somvanshi et al\.\(2024\)Shriyank Somvanshi, Syed Aaqib Javed, Md Monzurul Islam, Diwas Pandit, and Subasish Das\. 2024\.A survey on kolmogorov\-arnold network\.*arXiv preprint arXiv:2411\.06078*\(2024\)\.
- Sui et al\.\(2024\)Yueyuan Sui, Minghui Zhao, Junxi Xia, Xiaofan Jiang, and Stephen Xia\. 2024\.Tramba: A hybrid transformer and mamba architecture for practical audio and bone conduction speech super resolution and enhancement on mobile and wearable platforms\.*Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies*8, 4 \(2024\), 1–29\.
- Tolstikhin et al\.\(2021\)Ilya O Tolstikhin, Neil Houlsby, Alexander Kolesnikov, Lucas Beyer, Xiaohua Zhai, Thomas Unterthiner, Jessica Yung, Andreas Steiner, Daniel Keysers, Jakob Uszkoreit, et al\.2021\.Mlp\-mixer: An all\-mlp architecture for vision\.*Advances in neural information processing systems*34 \(2021\), 24261–24272\.
- Toscano et al\.\(2025\)Juan Diego Toscano, Vivek Oommen, Alan John Varghese, Zongren Zou, Nazanin Ahmadi Daryakenari, Chenxi Wu, and George Em Karniadakis\. 2025\.From pinns to pikans: Recent advances in physics\-informed machine learning\.*Machine Learning for Computational Science and Engineering*1, 1 \(2025\), 1–43\.
- Tseng and Wen \(2023\) Yu\-Hsuan Tseng and Chih\-Yu Wen\. 2023\. Hybrid Learning Models for IMU\-Based HAR with Feature Analysis and Data Correction\. *Sensors* 23, 18 \(2023\), 7802\.
- Werner et al\. \(2025\) Yannick Werner, Akash Malemath, Mengxi Liu, Vitor Fortes Rey, Nikolaos Palaiodimopoulos, Paul Lukowicz, and Maximilian Kiefer\-Emmanouilidis\. 2025\. QuKAN: A Quantum Circuit Born Machine Approach to Quantum Kolmogorov Arnold Networks\. *Scientific Reports* 15, 1 \(09 Oct 2025\), 35239\. [doi:10\.1038/s41598\-025\-22705\-9](https://doi.org/10.1038/s41598-025-22705-9)
- Xu et al\. \(2024\) Jinfeng Xu, Zheyu Chen, Jinze Li, Shuo Yang, Wei Wang, Xiping Hu, and Edith C\-H Ngai\. 2024\. FourierKAN\-GCF: Fourier Kolmogorov\-Arnold Network–An Effective and Efficient Feature Transformation for Graph Collaborative Filtering\. *arXiv preprint arXiv:2406\.01034* \(2024\)\.
- Yang and Wang \(2024\) Xingyi Yang and Xinchao Wang\. 2024\. Kolmogorov\-Arnold transformer\. *arXiv preprint arXiv:2409\.10594* \(2024\)\.
- Yin et al\. \(2024\) Yafeng Yin, Lei Xie, Zhiwei Jiang, Fu Xiao, Jiannong Cao, and Sanglu Lu\. 2024\. A systematic review of human activity recognition based on mobile devices: overview, progress and trends\. *IEEE Communications Surveys & Tutorials* 26, 2 \(2024\), 890–929\.
- Zappi et al\. \(2008\) Piero Zappi, Clemens Lombriser, Thomas Stiefmeier, Elisabetta Farella, Daniel Roggen, Luca Benini, and Gerhard Tröster\. 2008\. Activity recognition from on\-body sensors: accuracy\-power trade\-off by dynamic sensor selection\. In *Wireless Sensor Networks: 5th European Conference, EWSN 2008, Bologna, Italy, January 30\-February 1, 2008\. Proceedings*\. Springer, 17–33\.
- Zhang et al\. \(2015\) Licheng Zhang, Xihong Wu, and Dingsheng Luo\. 2015\. Recognizing human activities from raw accelerometer data using deep neural networks\. In *2015 IEEE 14th International Conference on Machine Learning and Applications \(ICMLA\)*\. IEEE, 865–870\.
- Zhang et al\. \(2022\) Ye Zhang, Longguang Wang, Huiling Chen, Aosheng Tian, Shilin Zhou, and Yulan Guo\. 2022\. IF\-ConvTransformer: A framework for human activity recognition using IMU fusion and ConvTransformer\. *Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies* 6, 2 \(2022\), 1–26\.
- Zhou et al\. \(2024\) Yexu Zhou, Tobias King, Haibin Zhao, Yiran Huang, Till Riedel, and Michael Beigl\. 2024\. MLP\-HAR: Boosting Performance and Efficiency of HAR Models on Edge Devices with Purely Fully Connected Layers\. In *Proceedings of the 2024 ACM International Symposium on Wearable Computers*\. 133–139\.
- Zhou et al\. \(2022\) Yexu Zhou, Haibin Zhao, Yiran Huang, Till Riedel, Michael Hefenbrock, and Michael Beigl\. 2022\. TinyHAR: A lightweight deep learning model designed for human activity recognition\. In *Proceedings of the 2022 ACM International Symposium on Wearable Computers*\. 89–93\.