MOSAIC: Orchestrating Collaborative Knowledge Tracing with Hierarchical Semantic Alignment
Summary
MOSAIC is a novel framework that uses a frozen LLM to generate semantic embeddings and hierarchical prediction prompts for knowledge tracing, achieving state-of-the-art results on multiple benchmarks.
# MOSAIC: Orchestrating Collaborative Knowledge Tracing with Hierarchical Semantic Alignment
Source: [https://arxiv.org/html/2606.29049](https://arxiv.org/html/2606.29049)
Mengyue Wang†, Yuzhen Lin, Pengbin Feng, Ziqi Sha, Yeyang Zhou, Yu Ma\*

1. Columbia University, New York, USA \(li\.xinjin@columbia\.edu\)
2. University of California, Berkeley, Berkeley, USA \(cwang998@berkeley\.edu\)
3. School of Information Systems and Management, Carnegie Mellon University, Pittsburgh, PA, USA \(yuzhenl@alumni\.cmu\.edu\)
4. Department of Mathematics, University of Southern California, Los Angeles, USA \(pengbinf@alumni\.usc\.edu\)
5. University of Massachusetts Amherst, Amherst, USA \(ziqisha@umass\.edu\)
6. Computer Science Department, UC San Diego, La Jolla, USA \(yeyang\-zhou@ucsd\.edu\)
7. Carnegie Mellon Institute for Strategy & Technology, Carnegie Mellon University, Pittsburgh, USA \(yuma13926@gmail\.com\)
###### Abstract
Knowledge Tracing \(KT\) is important for personalized education but traditionally suffers from two key limitations: a reliance on shallow ID\-based representations that neglect semantic depth and a restriction to single\-granularity mastery estimation that overlooks hierarchical knowledge dependencies\. To address these challenges, we propose MOSAIC \(Multi\-granularity Online Semantic AI for Collaborative Knowledge\), a novel framework that orchestrates LLM\-driven semantic alignment with sequential modeling\. Unlike methods that use LLMs solely as predictors, MOSAIC leverages a frozen LLM to generate dynamic, context\-aware embeddings and hierarchical prediction prompts, explicitly capturing collaborative signals and peer interactions\. Furthermore, we introduce a cross\-granularity consistency objective that jointly regularizes mastery estimation across concept, topic\-cluster, and global proficiency levels\. Extensive experiments on ASSISTments, EdNet, and a newly collected large\-scale MOOC dataset demonstrate that MOSAIC establishes new state\-of\-the\-art results\. Specifically, our method achieves AUC improvements of up to 3\.4% and Accuracy gains of up to 2\.5% across all benchmarks\. Notably, MOSAIC exhibits superior robustness in collaboration\-rich environments and long\-sequence scenarios \(AUC 0\.862 on MOOC\), offering both high predictive precision and semantically grounded interpretability\.
\*Corresponding authors\. †Equal contribution\.

## 1 Introduction
Knowledge tracing \(KT\) is a fundamental problem in AI for education, aiming to model how a learner’s latent knowledge state evolves over time from sequential interaction data and to predict future performance for personalized intervention, tutoring, and curriculum adaptation\[[10](https://arxiv.org/html/2606.29049#bib.bib4),[55](https://arxiv.org/html/2606.29049#bib.bib1),[60](https://arxiv.org/html/2606.29049#bib.bib3)\]\. Classical and neural KT methods have established strong performance on benchmark datasets and have become core components in intelligent tutoring systems\. However, the learning environments faced by modern educational platforms are no longer limited to simple question–response logs\. In realistic settings, student learning unfolds through heterogeneous and semantically rich signals, including attempts, hints, time\-on\-task, instructor feedback, and peer discussion\. This shift raises a broader challenge for KT: beyond predicting the next response, a practical learner model should be able to track knowledge evolution under rich context and provide coherent estimates of mastery across different levels of abstraction\.
Despite substantial progress, existing KT methods still face two persistent limitations\. First, many models rely heavily on shallow ID\-based representations for questions, concepts, and interactions\. While effective for large\-scale prediction, such representations often ignore the semantic content of exercises, the meaning of knowledge components, and the contextual information carried by collaborative learning signals\. As a result, they remain limited in semantic expressiveness and are not well suited to realistic educational scenarios where textual and contextual evidence plays an important role\. Second, most KT models operate primarily at a single granularity, typically the skill or concept level\. This design overlooks the inherently hierarchical nature of learning, where fine\-grained concept mastery is related to broader topic understanding and overall proficiency\. Consequently, existing methods can achieve accurate local prediction while still lacking a coherent view of learner state across multiple knowledge levels\[[1](https://arxiv.org/html/2606.29049#bib.bib2),[3](https://arxiv.org/html/2606.29049#bib.bib13)\]\.
Prior work only partially addresses these challenges\. Probabilistic and psychometric formulations provide interpretable state\-transition mechanisms, but they rely on simplified assumptions that limit their ability to capture complex learning behaviors\[[10](https://arxiv.org/html/2606.29049#bib.bib4),[49](https://arxiv.org/html/2606.29049#bib.bib7)\]\. Deep sequential KT models improve temporal modeling through recurrent, memory\-based, and attention\-based architectures, yet many still depend on ID\-level embeddings and remain focused on a single granularity of mastery estimation\[[55](https://arxiv.org/html/2606.29049#bib.bib1),[79](https://arxiv.org/html/2606.29049#bib.bib8),[21](https://arxiv.org/html/2606.29049#bib.bib10),[38](https://arxiv.org/html/2606.29049#bib.bib11)\]\. Structure\-aware and graph\-based approaches further incorporate concept relations, but they often rely on predefined or weakly adaptive structures and do not naturally accommodate semantically rich collaborative context\[[46](https://arxiv.org/html/2606.29049#bib.bib14),[53](https://arxiv.org/html/2606.29049#bib.bib12)\]\. Meanwhile, large language models \(LLMs\) provide powerful semantic understanding and have shown strong potential in educational tasks such as feedback generation, tutoring dialogue, and explanation support\[[34](https://arxiv.org/html/2606.29049#bib.bib21),[2](https://arxiv.org/html/2606.29049#bib.bib23)\]\. Yet directly using LLMs as end\-to\-end predictors is not ideal for KT, where stable temporal state tracking and sequential inductive bias remain essential\. These observations suggest that the key opportunity is not to replace KT with an LLM, but to integrate semantic understanding and temporal learner modeling in a principled way\.
Motivated by this perspective, we propose MOSAIC \(Multi\-granularity Online Semantic AI for Collaborative Knowledge\), a unified framework for semantically grounded and hierarchically consistent knowledge tracing\. The central idea is simple: we preserve a sequential KT backbone for modeling temporal knowledge evolution, while using a frozen LLM as a semantic alignment module to encode heterogeneous learning context\. Specifically, MOSAIC leverages a frozen LLM to transform problem content, learning behaviors, and collaborative interaction text into context\-aware semantic representations, which are then consumed by a sequential model for learner state tracking\. On top of the shared latent state, MOSAIC jointly estimates mastery at three complementary levels: concept, topic\-cluster, and global proficiency\. To encourage coherent learner modeling across these levels, we further introduce a cross\-granularity consistency objective that regularizes agreement between fine\-grained and coarse\-grained mastery estimates\. In this way, MOSAIC combines the temporal robustness of conventional KT with the semantic richness of LLM\-based representations, while moving beyond single\-level prediction toward structured multi\-level learner modeling\.
We evaluate MOSAIC on ASSISTments, EdNet, and a large\-scale university MOOC dataset that reflects collaboration\-rich learning environments\. Across benchmarks, MOSAIC consistently outperforms strong KT baselines, with particularly clear gains in settings involving long interaction sequences and rich collaborative context\. These results suggest that combining semantic alignment with hierarchical inductive structure is a promising direction for next\-generation KT systems\. Our main contributions are summarized as follows:
- We identify a key limitation of existing KT methods: although effective at temporal prediction, they remain insufficient for semantically grounded and hierarchically consistent learner modeling in realistic collaborative environments\.
- We propose MOSAIC, a unified framework that couples frozen\-LLM\-based semantic alignment with sequential knowledge tracing to model learner mastery at the concept, topic\-cluster, and global levels\.
- We introduce a cross\-granularity consistency objective that regularizes multi\-level learner representations and show strong empirical gains on standard concept\-level prediction across multiple benchmarks, especially in collaboration\-rich and long\-horizon learning scenarios\.
## 2 Related Work
### 2\.1 Broader Impacts of LLMs and Representation Learning
Recent advances in Large Language Models \(LLMs\) and representation learning have profoundly influenced a wide spectrum of domains, establishing a foundational paradigm for multi\-modal and adaptive intelligence\. In embodied intelligence and autonomous systems, researchers have significantly enhanced dynamic environmental interaction and multimodal understanding through scene perception, continuous adaptation, metacognitive coordination, and safe skill execution\[[39](https://arxiv.org/html/2606.29049#bib.bib33),[4](https://arxiv.org/html/2606.29049#bib.bib34),[52](https://arxiv.org/html/2606.29049#bib.bib35),[7](https://arxiv.org/html/2606.29049#bib.bib36),[27](https://arxiv.org/html/2606.29049#bib.bib37),[20](https://arxiv.org/html/2606.29049#bib.bib38),[85](https://arxiv.org/html/2606.29049#bib.bib42),[43](https://arxiv.org/html/2606.29049#bib.bib46),[41](https://arxiv.org/html/2606.29049#bib.bib47),[42](https://arxiv.org/html/2606.29049#bib.bib48),[13](https://arxiv.org/html/2606.29049#bib.bib61),[80](https://arxiv.org/html/2606.29049#bib.bib62),[33](https://arxiv.org/html/2606.29049#bib.bib63),[29](https://arxiv.org/html/2606.29049#bib.bib65),[72](https://arxiv.org/html/2606.29049#bib.bib64),[28](https://arxiv.org/html/2606.29049#bib.bib66),[24](https://arxiv.org/html/2606.29049#bib.bib70),[12](https://arxiv.org/html/2606.29049#bib.bib77),[61](https://arxiv.org/html/2606.29049#bib.bib78),[51](https://arxiv.org/html/2606.29049#bib.bib88),[50](https://arxiv.org/html/2606.29049#bib.bib89),[65](https://arxiv.org/html/2606.29049#bib.bib83)\], while concurrent work in federated, continual, and personalized representation learning investigates heterogeneous adaptation, task\-conditioned parameter generation, and graph\-based evolution for user\-specific 
intelligence\[[77](https://arxiv.org/html/2606.29049#bib.bib39),[48](https://arxiv.org/html/2606.29049#bib.bib41),[69](https://arxiv.org/html/2606.29049#bib.bib43),[68](https://arxiv.org/html/2606.29049#bib.bib44),[30](https://arxiv.org/html/2606.29049#bib.bib45),[73](https://arxiv.org/html/2606.29049#bib.bib58),[5](https://arxiv.org/html/2606.29049#bib.bib59),[18](https://arxiv.org/html/2606.29049#bib.bib91),[16](https://arxiv.org/html/2606.29049#bib.bib92),[17](https://arxiv.org/html/2606.29049#bib.bib90)\]\. Concurrently, existing efforts have accelerated developments in trustworthy AI, model robustness, and explainability by mitigating hallucinations and establishing rigorous defense, counterfactual reasoning, reflection, and interpretation frameworks\[[70](https://arxiv.org/html/2606.29049#bib.bib49),[26](https://arxiv.org/html/2606.29049#bib.bib50),[71](https://arxiv.org/html/2606.29049#bib.bib51),[57](https://arxiv.org/html/2606.29049#bib.bib55),[59](https://arxiv.org/html/2606.29049#bib.bib56),[58](https://arxiv.org/html/2606.29049#bib.bib57),[74](https://arxiv.org/html/2606.29049#bib.bib67),[76](https://arxiv.org/html/2606.29049#bib.bib68),[75](https://arxiv.org/html/2606.29049#bib.bib69),[44](https://arxiv.org/html/2606.29049#bib.bib85),[23](https://arxiv.org/html/2606.29049#bib.bib93)\], while expanding into life sciences for multimodal genomic analysis and phenotype recognition\[[63](https://arxiv.org/html/2606.29049#bib.bib52),[82](https://arxiv.org/html/2606.29049#bib.bib53),[22](https://arxiv.org/html/2606.29049#bib.bib54)\]\. 
Furthermore, efficient reasoning pipelines and tool\-augmented ecosystems have emerged to optimize scalability through reasoning pruning, structured supervision, and reliable evidence calibration\[[47](https://arxiv.org/html/2606.29049#bib.bib40),[32](https://arxiv.org/html/2606.29049#bib.bib79),[31](https://arxiv.org/html/2606.29049#bib.bib80),[78](https://arxiv.org/html/2606.29049#bib.bib81),[6](https://arxiv.org/html/2606.29049#bib.bib82),[56](https://arxiv.org/html/2606.29049#bib.bib84)\], complemented by generative paradigms and multimodal fusion that advance graph modeling, personalization, and rich semantic understanding in recommender systems\[[84](https://arxiv.org/html/2606.29049#bib.bib60),[64](https://arxiv.org/html/2606.29049#bib.bib86),[66](https://arxiv.org/html/2606.29049#bib.bib87)\]and computer vision tasks\[[25](https://arxiv.org/html/2606.29049#bib.bib71),[17](https://arxiv.org/html/2606.29049#bib.bib90),[28](https://arxiv.org/html/2606.29049#bib.bib66)\]\. Finally, these foundation models have demonstrated substantial socioeconomic impact within finance and decision intelligence, driving innovations in financial question answering, sentiment forecasting, bankruptcy prediction, and interpretable market analysis beyond traditional benchmarks\[[11](https://arxiv.org/html/2606.29049#bib.bib72),[9](https://arxiv.org/html/2606.29049#bib.bib73),[36](https://arxiv.org/html/2606.29049#bib.bib74),[8](https://arxiv.org/html/2606.29049#bib.bib75),[83](https://arxiv.org/html/2606.29049#bib.bib76),[23](https://arxiv.org/html/2606.29049#bib.bib93),[81](https://arxiv.org/html/2606.29049#bib.bib94)\]\.
### 2\.2 The Evolution of Sequential Knowledge Tracing
Knowledge tracing has evolved from classical probabilistic student models to highly expressive deep sequential architectures\. Early methods, such as Bayesian Knowledge Tracing\[[10](https://arxiv.org/html/2606.29049#bib.bib4)\], provide interpretable state transitions but rely on overly simplified assumptions regarding learning dynamics, although related Bayesian adaptive\-control ideas have also been explored in sequential decision\-making settings\[[14](https://arxiv.org/html/2606.29049#bib.bib31)\]\. The introduction of Deep Knowledge Tracing\[[55](https://arxiv.org/html/2606.29049#bib.bib1)\]demonstrated that recurrent networks could effectively map raw interaction sequences to future performance\. Subsequent architectures have heavily refined this temporal modeling through memory and attention mechanisms, prominently including Dynamic Key\-Value Memory Networks\[[79](https://arxiv.org/html/2606.29049#bib.bib8)\], AKT\[[21](https://arxiv.org/html/2606.29049#bib.bib10)\], and MonaCoBERT\[[38](https://arxiv.org/html/2606.29049#bib.bib11)\]\. More recent iterations further constrain these sequences by incorporating question difficulty\[[45](https://arxiv.org/html/2606.29049#bib.bib15)\]or hierarchical session structures\[[35](https://arxiv.org/html/2606.29049#bib.bib17)\], with related work also exploring adaptive context\-length optimization in multi\-agent reinforcement learning settings\[[15](https://arxiv.org/html/2606.29049#bib.bib32)\]\. However, despite achieving strong predictive accuracy, these sequential baselines remain bottlenecked by their reliance on shallow, ID\-based representations and short behavioral features\. Because they fundamentally optimize for single\-granularity, next\-response prediction, they lack the semantic depth required to model collaboration\-rich environments or complex peer interactions\.
### 2\.3 Structure\- and Hierarchy\-Aware Knowledge Tracing
To move beyond the limitations of pure ID\-based modeling, a parallel line of research injects content and structural priors into the KT pipeline\. Models like EERNN\[[62](https://arxiv.org/html/2606.29049#bib.bib20)\]and EKT\[[46](https://arxiv.org/html/2606.29049#bib.bib14)\]exploit exercise text to improve question representations, while Graph\-based Knowledge Tracing\[[53](https://arxiv.org/html/2606.29049#bib.bib12)\]formalizes concept dependencies using graph neural networks\. More recently, Relation\-Aware KT\[[54](https://arxiv.org/html/2606.29049#bib.bib26)\]combined text\-derived exercise relations with forgetting behaviors to improve predictive robustness\. While these studies validate the importance of relational inductive biases, they rely on predefined, static schemas and restrict semantic enrichment to the individual question or concept level\. Consequently, they are inherently ill\-equipped to process unstructured, dynamic collaborative signals—such as open\-ended peer discussions—and rarely produce jointly constrained, multi\-level mastery estimates across fine\-grained concepts, topic clusters, and overall proficiency\.
### 2\.4 Large Language Models as Semantic Engines in KT
Recognizing the limitations of static representations, recent literature has begun integrating Large Language Models \(LLMs\) into educational modeling\. Current approaches generally fall into three paradigms: text\-aware representation learning \(e\.g\., LKT \[[37](https://arxiv.org/html/2606.29049#bib.bib27)\]\), LLM\-driven structural generation \(e\.g\., SINKT \[[19](https://arxiv.org/html/2606.29049#bib.bib28)\], which builds heterogeneous concept\-question graphs\), and LLM\-centered prediction or profile\-based forecasting \(e\.g\., LLM\-KT \[[67](https://arxiv.org/html/2606.29049#bib.bib29)\] and CIKT \[[40](https://arxiv.org/html/2606.29049#bib.bib30)\], which respectively use plug\-and\-play instructions and analyst\-predictor loops\)\. These works demonstrate the value of language\-model priors, but their emphasis is on representation quality, inductive or cold\-start generalization, or LLM\-generated predictions and profiles\. They do not address the challenge targeted in this work: orchestrating collaborative, multi\-granularity knowledge tracing\. MOSAIC addresses this gap by restricting the LLM to the role of a frozen semantic enhancer and prompt constructor\. This design injects semantic alignment and cross\-granularity consistency into the pipeline while preserving the efficiency and temporal stability of a dedicated sequential backbone\.
## 3 Method
Figure 1: The overall architecture of the MOSAIC framework\. The model utilizes a frozen LLM to function as a semantic enhancer and prompt constructor, processing heterogeneous inputs including collaborative text and exercise records\. The right panel illustrates the hierarchical multi\-granularity estimation, where concept\-level, topic\-cluster, and global proficiency states are jointly optimized via cross\-granularity consistency losses\.

### 3\.1 Problem Formulation and Overview
We consider the knowledge tracing problem in a collaborative learning environment\. For each student $u$, we observe a chronologically ordered interaction sequence

$$\mathcal{S}^{(u)}=\{(q_t,r_t,b_t,c_t)\}_{t=1}^{T_u},$$

where $q_t$ denotes the exercised question at time step $t$, $r_t\in\{0,1\}$ is the student’s response correctness, $b_t$ represents auxiliary learning behaviors \(e\.g\., attempt count or time\-on\-task\), and $c_t$ denotes the associated knowledge concept or skill tag\. In addition, each interaction may be accompanied by collaborative textual context $x_t$, such as peer discussion or instructor feedback, reflecting social learning signals\.
The goal of knowledge tracing is to estimate the student’s latent knowledge state over time and predict future performance\. Unlike conventional KT, which focuses on a single granularity, MOSAIC jointly models learner mastery at multiple knowledge levels\. Specifically, at each time step $t$, the model outputs mastery probabilities at the concept level $\hat{y}_t^{(c)}$, topic\-cluster level $\hat{y}_t^{(g)}$, and global proficiency level $\hat{y}_t^{(u)}$:

$$(\hat{y}_t^{(c)},\hat{y}_t^{(g)},\hat{y}_t^{(u)})=f_{\theta}(\mathcal{S}^{(u)}_{\leq t},x_{\leq t}),$$

where $f_{\theta}$ denotes the proposed MOSAIC model parameterized by $\theta$\. The topic\-cluster level corresponds to a higher\-level grouping of related concepts, while the global level reflects an overall estimate of the student’s learning proficiency\.
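To make the notation concrete, the following is a minimal sketch of how one interaction tuple and the prefix $\mathcal{S}_{\leq t}$ might be represented\. The class name `Interaction`, its field names, and the helper `prefix` are illustrative assumptions and do not come from the paper\.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Interaction:
    """One step (q_t, r_t, b_t, c_t) plus optional collaborative text x_t.

    All field names are hypothetical; the paper only defines the symbols.
    """
    question_id: int                   # q_t
    correct: int                       # r_t, either 0 or 1
    behaviors: dict                    # b_t, e.g. {"attempts": 2}
    concept: str                       # c_t
    collab_text: Optional[str] = None  # x_t, absent for many interactions


def prefix(seq: List[Interaction], t: int) -> List[Interaction]:
    """Return the first t interactions, i.e. S_{<=t} with 1-indexed t."""
    if not 0 <= t <= len(seq):
        raise ValueError("t must lie between 0 and the sequence length")
    return seq[:t]
```

Keeping `collab_text` optional reflects the statement that each interaction "may be" accompanied by collaborative context\.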
As illustrated in Figure[1](https://arxiv.org/html/2606.29049#S3.F1), MOSAIC follows a simple design principle\. A frozen LLM is used to provide semantically grounded, context\-aware representations from heterogeneous learning signals, while a lightweight sequential KT backbone models temporal knowledge evolution\. On top of the shared latent state, three prediction heads estimate mastery at different granularities, and a consistency objective encourages coherent knowledge estimation across fine\-grained and coarse\-grained levels\.
### 3\.2 Hierarchical Semantic Alignment with Frozen LLM
A key limitation of existing KT models is their reliance on discrete ID\-based embeddings for questions and concepts, which lack semantic expressiveness\. To address this issue, MOSAIC leverages a large language model \(LLM\) as a semantic enhancement module that transforms heterogeneous learning signals into context\-aware representations\.
For each time step $t$, we construct a textual description $\tau_t$ that summarizes the current learning context, including the exercised question $q_t$, its associated concept $c_t$, the student’s response $r_t$, auxiliary behaviors $b_t$, and available collaborative interaction text $x_t$\. The LLM encoder then maps this description to a dense semantic representation:

$$\mathbf{e}_t=\mathrm{LLM}_{\text{enc}}(\tau_t)\in\mathbb{R}^{d},$$

where $\mathbf{e}_t$ serves as a dynamic knowledge embedding that captures semantic relations among questions, concepts, and interactions\.
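The paper does not give the exact wording of $\tau_t$, so the following is only a sketch of one plausible way to serialize an interaction into text before encoding\. The function name `build_description` and the field labels \("Question", "Concept", and so on\) are assumptions made for illustration\.

```python
def build_description(question_text, concept, correct, behaviors, collab_text=None):
    """Serialize one interaction into a textual description tau_t.

    The template is hypothetical; the source only lists which signals
    (question, concept, response, behaviors, collaborative text) are included.
    """
    parts = [
        f"Question: {question_text}",
        f"Concept: {concept}",
        f"Response: {'correct' if correct else 'incorrect'}",
    ]
    if behaviors:
        # Sort keys so the same interaction always yields the same text.
        behavior_str = ", ".join(f"{k}={v}" for k, v in sorted(behaviors.items()))
        parts.append(f"Behaviors: {behavior_str}")
    if collab_text:
        parts.append(f"Discussion: {collab_text}")
    return "\n".join(parts)
```

A deterministic serialization matters if the encoder outputs are cached, because identical interactions then map to identical cache keys\.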
In addition to dynamic embeddings, MOSAIC uses the LLM to construct prediction prompts that explicitly encode hierarchical knowledge context and recent learning history\. Formally, we define a prompt template $\pi(\cdot)$ that conditions on the interaction prefix up to time $t$, producing a structured prompt $p_t=\pi(\mathcal{S}_{\leq t},x_{\leq t})$\. The LLM processes this prompt to generate a prompt\-aware representation:

$$\mathbf{z}_t=\mathrm{LLM}_{\text{prompt}}(p_t),$$

which provides high\-level semantic guidance for downstream knowledge state modeling\. Importantly, the LLM is not used to directly predict correctness; instead, the generated embeddings $\mathbf{e}_t$ and prompt representations $\mathbf{z}_t$ are treated as auxiliary semantic inputs\. This design decouples semantic understanding from sequential prediction, allowing MOSAIC to retain the robustness of conventional KT models while benefiting from the rich representational capacity of LLMs\.
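As an illustration of what a template $\pi(\cdot)$ could look like, the sketch below builds a prompt from the most recent interactions, annotating each with its topic cluster\. The function `build_prompt`, the window size `k`, the `concept_to_topic` mapping, and the prompt wording are all assumptions; the paper does not specify its template\.

```python
def build_prompt(history, target_concept, concept_to_topic, k=3):
    """Build a hierarchical prompt p_t from an interaction prefix.

    history: list of (concept, correct) pairs in chronological order.
    concept_to_topic: hypothetical mapping from concept to topic cluster.
    """
    recent = history[-k:] if k > 0 else []
    lines = ["Recent history (oldest first):"]
    for concept, correct in recent:
        topic = concept_to_topic.get(concept, "unknown")
        outcome = "correct" if correct else "incorrect"
        lines.append(f"- topic={topic} | concept={concept} | {outcome}")
    # A coarse global summary complements the concept- and topic-level lines.
    n_correct = sum(correct for _, correct in history)
    lines.append(f"Overall: {n_correct}/{len(history)} correct so far.")
    target_topic = concept_to_topic.get(target_concept, "unknown")
    lines.append(f"Next item: topic={target_topic} | concept={target_concept}")
    return "\n".join(lines)
```

The prompt text is only an input to the frozen LLM; under the paper's design, the resulting representation $\mathbf{z}_t$ feeds the sequential model rather than producing a correctness prediction directly\.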
### 3\.3 Sequential Knowledge State Modeling
Given the LLM\-generated semantic embedding $\mathbf{e}_t$ and prompt\-aware representation $\mathbf{z}_t$ at each time step, MOSAIC models the temporal evolution of a student’s knowledge state using a sequential encoder\. We first combine semantic and behavioral information through a fusion function $\phi(\cdot)$:

$$\mathbf{h}_t=\phi(\mathbf{e}_t,\mathbf{z}_t),$$

where $\mathbf{h}_t$ represents the enriched interaction representation at time $t$\. In practice, $\phi(\cdot)$ can be instantiated as concatenation followed by a linear projection\.
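The concatenate\-then\-project instantiation of $\phi(\cdot)$ can be sketched in plain Python as follows\. The helper names `make_linear` and `fuse`, the bias term, and the initialization range are assumptions for illustration; a real implementation would use a tensor library and learn the weights\.

```python
import random


def make_linear(in_dim, out_dim, seed=0):
    """Create a small random weight matrix (out_dim x in_dim) and a zero bias."""
    rng = random.Random(seed)
    weight = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weight, bias


def fuse(e_t, z_t, weight, bias):
    """Compute h_t = W [e_t ; z_t] + b, i.e. concatenation then projection."""
    concat = list(e_t) + list(z_t)
    if any(len(row) != len(concat) for row in weight):
        raise ValueError("weight rows must match len(e_t) + len(z_t)")
    return [sum(w * x for w, x in zip(row, concat)) + b for row, b in zip(weight, bias)]
```

With a weight matrix that selects the first and last concatenated entries, `fuse([1.0, 2.0], [3.0, 4.0], [[1, 0, 0, 0], [0, 0, 0, 1]], [0.5, -0.5])` returns `[1.5, 3.5]`\.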
Figure 2: The sequential processing workflow and multi\-granularity semantic alignment in MOSAIC\. The diagram illustrates how interaction history and collaborative text are transformed by the frozen LLM into dynamic latent states \($\mathbf{z}_t,\mathbf{e}_t$\) over time\. The right section highlights the consistency alignment mechanism, which enforces logical coherence between fine\-grained concept mastery, mid\-grained topic mastery, and coarse\-grained global proficiency during the sequential modeling process\.

The sequence $\{\mathbf{h}_t\}_{t=1}^{T_u}$ is then fed into a sequential model to capture temporal dependencies in the student’s learning trajectory:

$$\mathbf{s}_t=\mathrm{SeqModel}(\mathbf{h}_1,\ldots,\mathbf{h}_t),$$

where $\mathbf{s}_t$ denotes the latent knowledge state at time $t$\. The sequential model can be implemented using a recurrent neural network or a transformer\-based architecture, enabling MOSAIC to model both short\-term learning dynamics and long\-range knowledge accumulation\.
Compared with conventional KT models that rely solely on ID\-based inputs, the latent state $\mathbf{s}_t$ in MOSAIC is conditioned on semantically rich and context\-aware representations, allowing the model to better capture nuanced learning patterns and the influence of collaborative interactions over time\. This stage preserves the temporal inductive bias of knowledge tracing while enriching the state dynamics with semantic signals from the frozen LLM\.
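The defining property of $\mathrm{SeqModel}$ is causality: $\mathbf{s}_t$ may depend only on $\mathbf{h}_1,\ldots,\mathbf{h}_t$\. The sketch below uses a plain Elman\-style recurrence as a stand\-in for the recurrent option mentioned above\. It is not the transformer configuration used in the experiments, and the function name `rnn_states` and the weight layout are assumptions\.

```python
import math


def rnn_states(h_seq, w_in, w_rec):
    """Return [s_1, ..., s_T] with s_t = tanh(W_in h_t + W_rec s_{t-1}), s_0 = 0.

    w_in has shape (state_dim x input_dim); w_rec has shape (state_dim x state_dim).
    """
    state = [0.0] * len(w_rec)
    states = []
    for h_t in h_seq:
        # Each new state reads only the current input and the previous state,
        # so no information from future time steps can leak in.
        state = [
            math.tanh(
                sum(w * x for w, x in zip(w_in[i], h_t))
                + sum(w * s for w, s in zip(w_rec[i], state))
            )
            for i in range(len(w_rec))
        ]
        states.append(state)
    return states
```

Changing a later input leaves all earlier states untouched, which is the property a transformer encoder would need a causal attention mask to reproduce\.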
### 3\.4 Multi\-Granularity Mastery Estimation
Based on the latent knowledge state $\mathbf{s}_t$, MOSAIC jointly estimates learner mastery at multiple knowledge granularities\. Specifically, we employ three prediction heads to model concept\-level, topic\-cluster\-level, and global proficiency mastery\. For each time step $t$, the corresponding mastery probabilities are computed as

$$\hat{y}_t^{(c)}=\sigma(\mathbf{W}_c\mathbf{s}_t),\quad\hat{y}_t^{(g)}=\sigma(\mathbf{W}_g\mathbf{s}_t),\quad\hat{y}_t^{(u)}=\sigma(\mathbf{W}_u\mathbf{s}_t),$$

where $\sigma(\cdot)$ denotes the sigmoid function, and $\mathbf{W}_c$, $\mathbf{W}_g$, and $\mathbf{W}_u$ are learnable parameters for the concept\-, topic\-, and global\-level predictors, respectively\.
This design allows MOSAIC to produce fine\-grained and coarse\-grained mastery estimates from a shared latent state\. Rather than treating higher\-level signals as auxiliary outputs, MOSAIC explicitly models hierarchical knowledge states within a unified prediction framework\.
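The three heads can be sketched as bias\-free linear maps followed by a sigmoid, matching the form $\sigma(\mathbf{W}\mathbf{s}_t)$\. The output size of each head is not stated in the paper, so the shapes used here \(one row per output unit\) and the function names are assumptions\.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def head(weight, s_t):
    """Apply one prediction head: sigma(W s_t), one probability per row of W."""
    return [sigmoid(sum(w * s for w, s in zip(row, s_t))) for row in weight]


def predict_all(s_t, w_c, w_g, w_u):
    """Return concept-, topic-cluster-, and global-level mastery estimates
    computed from the same shared latent state s_t."""
    return head(w_c, s_t), head(w_g, s_t), head(w_u, s_t)
```

Because all three heads read the same $\mathbf{s}_t$, gradients from every level shape the shared state\.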
### 3\.5 Cross\-Granularity Consistency Optimization
The consistency alignment mechanism is illustrated in Figure [2](https://arxiv.org/html/2606.29049#S3.F2)\. The model is trained with a standard prediction loss $\mathcal{L}_{\text{pred}}$, defined as the binary cross\-entropy between the predicted mastery and observed student responses at the concept level\. To ensure coherent knowledge estimation across granularities, we further introduce a cross\-granularity consistency loss that regularizes the agreement between fine\-grained and coarse\-grained predictions:

$$\mathcal{L}_{\text{cons}}=\sum_{t}\left(\lVert\hat{y}_t^{(c)}-\hat{y}_t^{(g)}\rVert_2^2+\lVert\hat{y}_t^{(g)}-\hat{y}_t^{(u)}\rVert_2^2\right).$$

This objective encourages concept\-level mastery to align with its corresponding topic\-cluster estimate, and topic\-level mastery to remain consistent with global proficiency, while still allowing flexibility for local variations\.
The final training objective is given by

$$\mathcal{L}=\mathcal{L}_{\text{pred}}+\lambda\,\mathcal{L}_{\text{cons}},$$

where $\lambda$ controls the strength of cross\-granularity regularization\. This joint objective enables MOSAIC to produce stable and interpretable mastery estimates across multiple knowledge levels\.
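The joint objective can be sketched for a single student sequence as follows\. This sketch assumes each level contributes one scalar probability per time step \(the exercised concept, its topic cluster, and the global estimate\), averages the binary cross\-entropy over time, and uses a placeholder value for $\lambda$; none of these choices are specified in the paper\.

```python
import math


def bce(y_hat, r, eps=1e-7):
    """Binary cross-entropy for one prediction, clipped for numerical safety."""
    p = min(max(y_hat, eps), 1.0 - eps)
    return -(r * math.log(p) + (1 - r) * math.log(1.0 - p))


def mosaic_loss(y_c, y_g, y_u, responses, lam=0.1):
    """Return (total, L_pred, L_cons) for one sequence.

    y_c, y_g, y_u: per-step scalar mastery estimates at the three levels.
    responses: observed correctness r_t in {0, 1}.
    lam: hypothetical regularization strength; the paper does not report it.
    """
    # Prediction loss is computed at the concept level only (mean over time
    # is an assumption; the paper does not state the reduction).
    l_pred = sum(bce(p, r) for p, r in zip(y_c, responses)) / len(responses)
    # Consistency loss sums squared gaps between adjacent granularities.
    l_cons = sum((c - g) ** 2 + (g - u) ** 2 for c, g, u in zip(y_c, y_g, y_u))
    return l_pred + lam * l_cons, l_pred, l_cons
```

Note that only adjacent levels are tied together: there is no direct concept\-to\-global term, so the topic\-cluster estimate mediates between the two\.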
## 4 Experiments
### 4\.1 Experimental Setup
**Datasets\.** We evaluate MOSAIC on three educational datasets\. ASSISTments contains student–problem interaction logs with binary correctness labels and expert\-annotated skill tags, and is a standard benchmark for concept\-level knowledge tracing\. EdNet provides large\-scale student interaction sequences with rich temporal information and fine\-grained learning behaviors, making it suitable for evaluating knowledge tracing under long and dense learning trajectories\. In addition, we use a Chinese University MOOC dataset collected from public online courses, which includes problem\-solving records, detailed learning behaviors, expert\-provided knowledge tags, and collaborative interaction text such as discussion posts and peer comments\. This dataset is particularly useful for evaluating knowledge tracing in collaboration\-rich learning environments\.
**Task Definition and Evaluation Protocol\.** Our primary task is standard concept\-level next\-response prediction: given a student’s interaction history up to time step $t$, the model predicts the correctness of the next interaction\. For each dataset, student interaction sequences are ordered chronologically and split into training, validation, and test sets at the student level with a ratio of 8:1:1 to avoid information leakage\. Model selection and hyperparameter tuning are performed on the validation set\.
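A student\-level 8:1:1 split assigns every student, together with all of that student's interactions, to exactly one partition\. The sketch below shows one way to do this; the function name `split_students`, the shuffling step, and the seed are assumptions, since the paper does not describe how students were assigned\.

```python
import random


def split_students(student_ids, ratios=(0.8, 0.1, 0.1), seed=42):
    """Partition unique student IDs into train/validation/test lists.

    Splitting by student (not by interaction) keeps each learner's whole
    sequence inside a single partition, which avoids leakage across sets.
    """
    ids = sorted(set(student_ids))
    random.Random(seed).shuffle(ids)
    n_train = round(ratios[0] * len(ids))
    n_val = round(ratios[1] * len(ids))
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]
    return train, val, test
```

Interactions are then routed to a partition by looking up their student ID, so no student appears in more than one set\.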
**Baselines\.** We compare MOSAIC against representative knowledge tracing baselines spanning classical probabilistic modeling, neural sequential modeling, and recent attention\-based architectures, including BKT, DKT, DKVMN, AKT, and MonaCoBERT\. These baselines provide strong reference for evaluating whether the proposed semantic alignment and hierarchical design yield consistent improvements over existing KT approaches\.
**Evaluation Metrics\.** We report Area Under the ROC Curve \(AUC\) and Accuracy as the evaluation metrics\. AUC serves as the primary metric because it measures ranking quality independent of a fixed decision threshold, while Accuracy is reported as a complementary thresholded metric\.
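The difference between the two metrics can be seen in a small pure\-Python sketch\. AUC is computed here through its pairwise\-ranking definition \(the fraction of positive\-negative pairs ranked correctly, with ties counted as half\), and Accuracy uses a 0\.5 threshold, which is an assumed default rather than a value stated in the paper\.

```python
def auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of predictions that match the label after thresholding."""
    hits = sum(int(s >= threshold) == y for y, s in zip(labels, scores))
    return hits / len(labels)
```

Halving every score leaves the ranking, and therefore AUC, unchanged, while Accuracy at a fixed threshold can drop\. This quadratic\-time version is meant for illustration; rank\-based implementations are preferable on large test sets\.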
**Implementation Details\.** The sequential knowledge tracing component is implemented using a transformer\-based encoder with 2 layers, 4 attention heads, and a hidden dimension of $d=256$\. For semantic enhancement, we employ Qwen2\.5\-7B\-Instruct as a frozen LLM to generate dynamic knowledge embeddings and prediction prompts for both English \(ASSISTments, EdNet\) and Chinese \(MOOC\) inputs\. To ensure efficiency, all LLM outputs are extracted offline, cached, and reused during training and inference, with no gradient back\-propagation through the LLM\.
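The offline extract\-cache\-reuse pattern can be sketched with a small content\-addressed cache\. The class `EmbeddingCache` and the `encode_fn` callable are hypothetical: `encode_fn` stands in for whatever routine runs the frozen LLM and returns a vector, and the paper does not describe its caching mechanism\.

```python
import hashlib


class EmbeddingCache:
    """Memoize encoder outputs by a hash of the input text.

    encode_fn is a stand-in for the frozen LLM encoder; it is called at most
    once per distinct text, and results are reused for training and inference.
    """

    def __init__(self, encode_fn):
        self._encode = encode_fn
        self._store = {}

    def get(self, text):
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key not in self._store:
            self._store[key] = self._encode(text)
        return self._store[key]

    def __len__(self):
        return len(self._store)
```

Because the LLM is frozen, cached vectors never go stale during training, which is what makes this reuse valid\.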
### 4\.2 Main Results on Standard Knowledge Tracing
Table 1: Concept\-level next\-response prediction performance on ASSISTments and EdNet\.

Table 2: Concept\-level next\-response prediction performance on the Chinese University MOOC dataset\.

The gains over the strongest baselines are substantial rather than marginal, with AUC improvements of 3\.4% on ASSISTments and 3\.0% on EdNet\. This pattern suggests that the advantage does not come from temporal modeling alone, since AKT and MonaCoBERT already provide competitive sequence modeling capacity\. A more plausible explanation is that MOSAIC augments the sequential backbone with semantically grounded representations and prompt\-aware hierarchical guidance, allowing it to move beyond shallow ID\-based interaction encoding\. By contrast, existing KT baselines mainly optimize single\-granularity next\-response prediction and therefore have limited ability to exploit rich problem semantics and contextual learning signals\.
The larger margin on the MOOC benchmark is consistent with this interpretation\. Because MOOC contains richer behavioral context and collaborative interaction text, it better exposes the weakness of purely ID\-driven KT models and highlights the benefit of semantically aligned modeling\. This result suggests that MOSAIC is not only a stronger generic predictor, but is particularly effective when the learning environment contains heterogeneous and collaboration\-rich signals\.
### 4\.3Ablation Study
Table 3: Ablation results on ASSISTments\.

We conduct ablation studies on ASSISTments to quantify the contribution of each major component in MOSAIC\. All ablated variants share the same sequential backbone and training protocol, differing only in the removed component\.
Removing the LLM\-driven semantic embeddings causes the largest drop, reducing AUC from 0\.881 to 0\.852\. This suggests that the key limitation of prior KT models is not simply insufficient sequence modeling, but the lack of semantically expressive representations\. Without this component, the model falls back toward the shallow interaction encoding regime of conventional ID\-based KT methods\. Removing prompt\-based prediction also degrades performance, indicating that semantic embeddings alone are insufficient and that prompt\-aware hierarchical guidance provides additional structure for organizing learning history and supporting prediction\.
Excluding collaborative text further hurts performance, showing that the gain is not merely from adding an LLM\-derived feature pipeline, but specifically from MOSAIC’s ability to extract useful information from social learning signals\. Removing the cross\-granularity consistency loss also reduces both AUC and Accuracy, suggesting that consistency regularization improves optimization stability and helps preserve coherent hierarchical structure in the learned representations\. Taken together, the ablations explain why competing methods underperform: they may model temporal dependencies well, but they lack one or more components needed to align semantic context, collaborative evidence, and hierarchical learner\-state estimation within a unified framework\.
### 4\.4Analysis in Challenging Settings
Table 4: Effect of collaborative interaction modeling on the MOOC dataset\.

Table 5: Performance \(AUC\) under different sequence lengths on EdNet\.

We further analyze MOSAIC in two challenging yet realistic settings: collaboration\-rich learning environments and long interaction sequences\.
On the MOOC dataset, collaborative interaction modeling benefits both AKT and MOSAIC, but the gain is larger for MOSAIC\. This suggests that collaborative text is not automatically useful; it becomes more helpful only when the model can semantically align unstructured peer interactions with the student’s evolving knowledge state\. In this sense, MOSAIC converts collaboration from noisy side information into task\-relevant semantic evidence more effectively than standard KT models\.
A similar pattern appears in the long\-sequence analysis on EdNet\. As sequence length increases, baseline performance declines, whereas MOSAIC remains strong and achieves its largest advantage in the longest\-sequence regime\. This suggests that semantically enriched representations and prompt\-based guidance preserve informative context over extended learning histories, reducing the brittleness of purely ID\-driven sequential modeling when temporal dependencies become long and complex\. The fact that MOSAIC’s relative advantage grows with sequence length further supports the claim that its gains come from better context modeling rather than a narrow improvement in short\-horizon prediction\.
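A length\-stratified analysis of this kind can be sketched as below\. The bucket edges \(50 and 200 interactions\), the record layout, and the function names are assumptions for illustration; the actual sequence\-length bins used for Table 5 are not restated here\. Each prediction is assigned to a bucket by the length of the student's interaction history, and AUC is then computed separately per bucket\.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Sequence, Tuple


def pairwise_auc(labels: Sequence[int], scores: Sequence[float]) -> Optional[float]:
    """Probability that a positive outranks a negative; ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        return None  # AUC is undefined for a single-class bucket
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def bucket_of(length: int, edges: Sequence[int]) -> str:
    """Map a sequence length to a label such as '<=50', '51-200' or '>200'."""
    for i, edge in enumerate(edges):
        if length <= edge:
            return f"<={edge}" if i == 0 else f"{edges[i - 1] + 1}-{edge}"
    return f">{edges[-1]}"


def auc_by_length(
    records: Iterable[Tuple[int, int, float]], edges: Sequence[int]
) -> Dict[str, Optional[float]]:
    """records holds (sequence_length, label, score) for each prediction."""
    groups: Dict[str, Tuple[List[int], List[float]]] = defaultdict(lambda: ([], []))
    for length, label, score in records:
        labels, scores = groups[bucket_of(length, edges)]
        labels.append(label)
        scores.append(score)
    return {name: pairwise_auc(ls, ss) for name, (ls, ss) in groups.items()}
```

Running the same routine on the predictions of each model gives one AUC per length bucket, so the gap between models can be compared regime by regime rather than only in aggregate\.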
Overall, these results indicate that MOSAIC is particularly effective in realistic KT scenarios characterized by richer social context and longer temporal dependencies\. More importantly, the advantage appears precisely in the settings where the proposed semantic alignment and hierarchical inductive structure should matter most, strengthening the empirical support for the central design claim of the model\.
### 4\.5Qualitative Discussion on Interpretability
Beyond predictive performance, MOSAIC is designed to provide a more structured view of learner states through its multi\-granularity architecture and semantically informed representations\. Unlike conventional KT models that output only a single mastery score, MOSAIC produces concept\-level, topic\-cluster\-level, and global proficiency predictions from a shared latent state, which may facilitate a more organized analysis of learning progress\.
The cross\-granularity consistency objective is intended to encourage coherence across these levels of prediction, while the semantically enriched representations provide a principled way to connect model outputs with problem content, behavioral context, and collaborative interactions\. Compared with conventional single\-granularity KT models, this design offers a potentially more structured interface for analyzing learner dynamics and supporting downstream feedback\.
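To illustrate what coherence across levels could mean in practice, the sketch below scores how well three levels of prediction agree\. The precise form of the consistency objective is not restated in this section, so the mean\-pooling aggregation, the squared\-error penalty, and all names \(`consistency_penalty`, the concept and cluster labels\) are assumptions chosen for illustration rather than the paper's loss\.

```python
from typing import Dict, List


def consistency_penalty(
    concept_p: Dict[str, float],
    cluster_p: Dict[str, float],
    global_p: float,
    clusters: Dict[str, List[str]],
) -> float:
    """Squared disagreement between each level and the mean of the level below it.

    Assumed form: a cluster prediction should match the mean mastery of its
    member concepts, and the global prediction should match the mean of the
    cluster predictions.
    """
    cluster_terms = []
    for name, members in clusters.items():
        pooled = sum(concept_p[c] for c in members) / len(members)
        cluster_terms.append((cluster_p[name] - pooled) ** 2)
    cluster_loss = sum(cluster_terms) / len(cluster_terms)

    pooled_global = sum(cluster_p.values()) / len(cluster_p)
    global_loss = (global_p - pooled_global) ** 2
    return cluster_loss + global_loss


# Hypothetical learner state: two clusters over three concepts.
clusters = {"arithmetic": ["fractions", "decimals"], "algebra": ["linear_eq"]}
concept_p = {"fractions": 0.8, "decimals": 0.6, "linear_eq": 0.4}

coherent = consistency_penalty(concept_p, {"arithmetic": 0.7, "algebra": 0.4}, 0.55, clusters)
incoherent = consistency_penalty(concept_p, {"arithmetic": 0.3, "algebra": 0.4}, 0.55, clusters)
print(coherent, incoherent)  # the incoherent state receives the larger penalty
```

Under this assumed form, a learner state whose cluster\-level estimate contradicts its concept\-level estimates is penalized, which is the kind of cross\-level disagreement a consistency regularizer is meant to discourage\.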
Because MOSAIC conditions its representations on problem content, response outcomes, and collaborative interaction text, changes in the predicted learner state can in principle be associated with meaningful learning events, such as persistent difficulties on a specific concept or improvements following peer discussion\. In this sense, MOSAIC may provide a semantically grounded interface for interpreting learner dynamics in practical educational applications that require both prediction quality and structured feedback\.
## 5Conclusion
We addressed a central limitation of knowledge tracing: existing methods often struggle to jointly capture semantic richness, hierarchical knowledge structure, and collaborative learning signals\. To this end, we presented MOSAIC, a unified framework that combines frozen\-LLM\-based semantic alignment with a sequential knowledge tracing backbone, enabling student modeling that is both semantically informed and temporally grounded\.
Rather than treating the LLM as an end\-to\-end predictor, MOSAIC uses dynamic semantic embeddings and prompt\-aware representations to enhance sequential modeling, while jointly estimating concept\-level, topic\-cluster\-level, and global proficiency states under a cross\-granularity consistency objective\. This design preserves the temporal inductive bias of knowledge tracing while extending it beyond shallow ID\-based representations\. Empirically, MOSAIC achieves consistent improvements over strong baselines on ASSISTments, EdNet, and a large\-scale Chinese University MOOC dataset, with gains of up to 3\.4% AUC and particularly strong robustness in collaboration\-rich and long\-sequence settings\.
Overall, our results suggest that semantic alignment, together with hierarchical inductive structure, is a promising direction for advancing knowledge tracing in realistic educational environments\. Future work includes extending MOSAIC to settings with explicit knowledge hierarchies and open\-ended assessment, as well as developing more efficient strategies for integrating large language models into large\-scale educational systems\.
- \[85\]H\. Zhou, J\. Tang, Y\. Li, C\. Xiao, L\. Hou, Z\. Ke, and J\. Yao\(2026\)CoMem: compositional concept\-graph memory for vision–language adaptation\.InThe Fourteenth International Conference on Learning Representations \(ICLR\),Cited by:[§2\.1](https://arxiv.org/html/2606.29049#S2.SS1.p1.1)\.