Hyperparameter Learning for Latent Factorization of Tensors for Representation Learning to Large-scale Dynamic Weighted Directed Network
Summary
This paper proposes an automated hyperparameter optimization framework based on Differential Evolution for Latent Factorization of Tensors (LFT) to improve prediction accuracy on large-scale dynamic weighted directed networks, reducing the need for manual tuning.
# Hyperparameter Learning for Latent Factorization of Tensors for Representation Learning to Large-scale Dynamic Weighted Directed Network
Source: [https://arxiv.org/html/2606.09880](https://arxiv.org/html/2606.09880)
1st Yaqian Zhan, College of Computer and Information Science, Southwest University, Chongqing, China, zyq000451@email.swu.edu.cn

2nd Jialan He, College of Computer and Information Science, Southwest University, Chongqing, China, nihhu2020@email.swu.edu.cn

3rd Tianzhu Chen, College of Computer and Information Science, Southwest University, Chongqing, China, chen56@email.swu.edu.cn
###### Abstract
Large-scale dynamic weighted directed networks (DWDNs) are widely used to model time-varying interactions among nodes. Latent factorization of tensors (LFT) extracts target knowledge from DWDNs via low-rank embedding. However, similar to many machine learning models, the performance of LFT heavily depends on the selection of hyperparameters. In practice, these parameters are often tuned manually or through grid search, which requires significant computational resources and human effort. Motivated by this challenge, this paper proposes an automated hyperparameter optimization framework based on Differential Evolution (DE) for LFT (DE-LFT). The proposed method integrates DE into the training process of the LFT model to automatically learn optimal regularization parameters $\lambda_{1}$, $\lambda_{2}$, and $\lambda_{3}$. As a result, the model can adaptively search the hyperparameter space and improve prediction accuracy. Experimental results on four real-world datasets demonstrate that the proposed approach achieves lower MAE and RMSE compared with manually tuned baselines while reducing the need for extensive parameter tuning.
Keywords: Latent Factorization of Tensors, Differential Evolution, Hyperparameter Tuning, Representation Learning.
## 1 Introduction
In the big data era, large\-scale dynamic weighted directed networks \(DWDNs\) have been widely used in various practical applications, such as personalized recommendation systems\[[30](https://arxiv.org/html/2606.09880#bib.bib1),[32](https://arxiv.org/html/2606.09880#bib.bib7),[35](https://arxiv.org/html/2606.09880#bib.bib8),[47](https://arxiv.org/html/2606.09880#bib.bib10),[31](https://arxiv.org/html/2606.09880#bib.bib11),[38](https://arxiv.org/html/2606.09880#bib.bib15),[44](https://arxiv.org/html/2606.09880#bib.bib20),[12](https://arxiv.org/html/2606.09880#bib.bib38)\], intelligent transportation systems\[[26](https://arxiv.org/html/2606.09880#bib.bib2),[33](https://arxiv.org/html/2606.09880#bib.bib9),[49](https://arxiv.org/html/2606.09880#bib.bib13),[41](https://arxiv.org/html/2606.09880#bib.bib14),[45](https://arxiv.org/html/2606.09880#bib.bib16),[34](https://arxiv.org/html/2606.09880#bib.bib21)\], and electrical power grid infrastructures\[[37](https://arxiv.org/html/2606.09880#bib.bib3),[42](https://arxiv.org/html/2606.09880#bib.bib17),[40](https://arxiv.org/html/2606.09880#bib.bib18),[19](https://arxiv.org/html/2606.09880#bib.bib22),[33](https://arxiv.org/html/2606.09880#bib.bib9)\]\. The latent factorization of tensors \(LFT\) has shown strong capability in learning latent representations from DWDNs\[[17](https://arxiv.org/html/2606.09880#bib.bib4),[21](https://arxiv.org/html/2606.09880#bib.bib26),[6](https://arxiv.org/html/2606.09880#bib.bib28),[8](https://arxiv.org/html/2606.09880#bib.bib30),[28](https://arxiv.org/html/2606.09880#bib.bib31),[5](https://arxiv.org/html/2606.09880#bib.bib33),[9](https://arxiv.org/html/2606.09880#bib.bib48)\]\. 
It models the target as a high\-dimensional and incomplete \(HDI\) tensor and performs low\-rank approximation\[[18](https://arxiv.org/html/2606.09880#bib.bib5),[48](https://arxiv.org/html/2606.09880#bib.bib34),[14](https://arxiv.org/html/2606.09880#bib.bib35),[25](https://arxiv.org/html/2606.09880#bib.bib36),[29](https://arxiv.org/html/2606.09880#bib.bib37),[7](https://arxiv.org/html/2606.09880#bib.bib40),[27](https://arxiv.org/html/2606.09880#bib.bib45),[13](https://arxiv.org/html/2606.09880#bib.bib47)\]\.
However, similar to many machine learning models, the performance of LFT heavily depends on the selection of hyperparameters, particularly the regularization coefficients $\lambda_{1}$, $\lambda_{2}$, and $\lambda_{3}$ of the latent feature matrices\[[16](https://arxiv.org/html/2606.09880#bib.bib51),[39](https://arxiv.org/html/2606.09880#bib.bib52),[46](https://arxiv.org/html/2606.09880#bib.bib54),[3](https://arxiv.org/html/2606.09880#bib.bib61),[20](https://arxiv.org/html/2606.09880#bib.bib63)\]. In practice, these parameters are often determined manually or via grid search\[[1](https://arxiv.org/html/2606.09880#bib.bib88),[11](https://arxiv.org/html/2606.09880#bib.bib89),[22](https://arxiv.org/html/2606.09880#bib.bib90)\], which is computationally expensive and inefficient. To address this issue, this paper proposes a Differential Evolution (DE)\[[23](https://arxiv.org/html/2606.09880#bib.bib6),[10](https://arxiv.org/html/2606.09880#bib.bib85),[50](https://arxiv.org/html/2606.09880#bib.bib86),[24](https://arxiv.org/html/2606.09880#bib.bib87)\] based hyperparameter learning framework for LFT. The proposed method integrates DE into the training process to automatically learn the regularization coefficients and improve the model's prediction accuracy.
## 2 Main Work
LFT models DWDNs as a third-order tensor to capture evolving node interactions. It learns three low-rank latent factor matrices $U$, $S$, and $T$ that approximate observed entries via:

$$\hat{x}_{u,s,t}=\sum_{k=1}^{K}U_{u,k}S_{s,k}T_{t,k},$$

where $K$ is the dimensionality of the latent representation space. To optimize performance, we apply Differential Evolution (DE) for automatic hyperparameter tuning. DE is a population-based heuristic that iteratively evolves candidate solutions via mutation, crossover, and selection to find optimal regularization parameters ($\lambda_{1}$, $\lambda_{2}$, and $\lambda_{3}$) minimizing prediction error (MAE/RMSE). This approach adaptively explores the hyperparameter space, enhancing accuracy while reducing computational costs.
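The prediction rule and the DE search described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the paper does not state which DE variant, population size, or search bounds it uses, so the classic DE/rand/1/bin scheme and every name and constant here (`predict`, `differential_evolution`, `toy_validation_error`, `F`, `CR`, the `[0, 1]` bounds) are assumptions. In a real run, the objective would train an LFT model with the candidate regularization coefficients and return its validation error; a toy quadratic stands in for it so the sketch runs on its own.

```python
import random


def predict(U, S, T, u, s, t):
    """Rank-K prediction: sum over k of U[u][k] * S[s][k] * T[t][k]."""
    return sum(a * b * c for a, b, c in zip(U[u], S[s], T[t]))


def differential_evolution(objective, bounds, pop_size=15, F=0.5, CR=0.9,
                           generations=60, seed=0):
    """Minimize `objective` with DE/rand/1/bin; returns (best_vector, best_score)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [objective(x) for x in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Mutation: combine three distinct individuals, none equal to i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            mutant = [pop[a][d] + F * (pop[b][d] - pop[c][d]) for d in range(dim)]
            # Binomial crossover: at least one coordinate comes from the mutant.
            j_rand = rng.randrange(dim)
            trial = [mutant[d] if (rng.random() < CR or d == j_rand) else pop[i][d]
                     for d in range(dim)]
            # Keep the trial vector inside the search box.
            trial = [min(max(v, lo), hi) for v, (lo, hi) in zip(trial, bounds)]
            # Greedy selection: the trial replaces its parent if it is no worse.
            trial_score = objective(trial)
            if trial_score <= scores[i]:
                pop[i], scores[i] = trial, trial_score
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]


def toy_validation_error(lams):
    """Placeholder objective (assumption): squared distance to an arbitrary point.

    A real objective would train LFT with (lambda_1, lambda_2, lambda_3) = lams
    and return the resulting validation RMSE or MAE.
    """
    target = (0.05, 0.01, 0.10)  # arbitrary values chosen for the demo only
    return sum((lam - tgt) ** 2 for lam, tgt in zip(lams, target))


bounds = [(0.0, 1.0)] * 3  # assumed search range for each coefficient
best_lams, best_err = differential_evolution(toy_validation_error, bounds)
print("best lambdas:", best_lams, "objective:", best_err)
```

Because selection is greedy, the best score in the population never gets worse from one generation to the next, which is the property that lets DE replace a fixed grid over the three coefficients.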
## 3 Methods
To evaluate the effectiveness of DE-LFT, experiments are conducted on four benchmark DWDNs. Each dataset is divided into a training set and a test set at an 80%:20% ratio. The LFT model is trained using stochastic gradient descent (SGD), while the regularization parameters are optimized using the DE algorithm. Root mean square error (RMSE) and mean absolute error (MAE)\[[15](https://arxiv.org/html/2606.09880#bib.bib77),[4](https://arxiv.org/html/2606.09880#bib.bib78),[2](https://arxiv.org/html/2606.09880#bib.bib79),[43](https://arxiv.org/html/2606.09880#bib.bib83),[36](https://arxiv.org/html/2606.09880#bib.bib71)\] serve as evaluation metrics:
$$\mathrm{RMSE}=\sqrt{\frac{1}{|\psi|}\sum_{x_{ijk}\in\psi}(x_{ijk}-\hat{x}_{ijk})^{2}},\quad\mathrm{MAE}=\frac{1}{|\psi|}\sum_{x_{ijk}\in\psi}|x_{ijk}-\hat{x}_{ijk}|.$$
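The evaluation protocol above (80%:20% split, SGD training, RMSE and MAE) can be sketched in a few lines. The paper does not give its loss function or update rule, so this sketch assumes a squared-error loss with L2 regularization, with $\lambda_{1}$, $\lambda_{2}$, $\lambda_{3}$ applied to the three factor matrices in order. The function names, learning rate, rank, and the synthetic data are all hypothetical and only serve to make the example runnable.

```python
import math
import random


def split_80_20(entries, seed=0):
    """Shuffle observed (u, s, t, x) entries and split them 80%/20%."""
    rng = random.Random(seed)
    shuffled = entries[:]
    rng.shuffle(shuffled)
    cut = int(0.8 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def predict(U, S, T, u, s, t):
    """Rank-K prediction: sum over k of U[u][k] * S[s][k] * T[t][k]."""
    return sum(a * b * c for a, b, c in zip(U[u], S[s], T[t]))


def train_lft_sgd(train, dims, K=4, lams=(0.01, 0.01, 0.01), lr=0.02,
                  epochs=30, seed=0):
    """SGD on an assumed L2-regularized squared-error loss over observed entries."""
    rng = random.Random(seed)
    U = [[rng.uniform(0, 1) for _ in range(K)] for _ in range(dims[0])]
    S = [[rng.uniform(0, 1) for _ in range(K)] for _ in range(dims[1])]
    T = [[rng.uniform(0, 1) for _ in range(K)] for _ in range(dims[2])]
    l1, l2, l3 = lams
    for _ in range(epochs):
        for u, s, t, x in train:
            err = x - predict(U, S, T, u, s, t)
            for k in range(K):
                uk, sk, tk = U[u][k], S[s][k], T[t][k]
                # Gradient step on the error term plus the L2 penalty.
                U[u][k] += lr * (err * sk * tk - l1 * uk)
                S[s][k] += lr * (err * uk * tk - l2 * sk)
                T[t][k] += lr * (err * uk * sk - l3 * tk)
    return U, S, T


def rmse_mae(entries, U, S, T):
    """RMSE and MAE over a set of held-out (u, s, t, x) entries."""
    errs = [x - predict(U, S, T, u, s, t) for u, s, t, x in entries]
    rmse = math.sqrt(sum(e * e for e in errs) / len(errs))
    mae = sum(abs(e) for e in errs) / len(errs)
    return rmse, mae


# Synthetic rank-2 tensor standing in for a real DWDN (demo data only).
rng = random.Random(1)
dims = (6, 6, 4)
true = [[[rng.uniform(0.5, 1.5) for _ in range(2)] for _ in range(n)] for n in dims]
cells = [(u, s, t) for u in range(6) for s in range(6) for t in range(4)]
observed = [(u, s, t, predict(*true, u, s, t)) for u, s, t in rng.sample(cells, 100)]

train, test = split_80_20(observed)
U, S, T = train_lft_sgd(train, dims)
print("test RMSE, MAE:", rmse_mae(test, U, S, T))
```

A function that wraps `train_lft_sgd` and returns the held-out RMSE for a given `lams` tuple is the kind of objective a hyperparameter search would minimize.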
## 4 Results
The experimental results are shown in Table 1. Compared with manually tuned hyperparameters, grid search slightly improves prediction performance by exploring a larger hyperparameter space. The proposed DE-based hyperparameter learning method achieves the lowest MAE and RMSE on all four datasets, which indicates that Differential Evolution searches the hyperparameter space more effectively and identifies parameter combinations that yield lower prediction errors.
Table 1: The Comparison Result of Rating Prediction Accuracy
## 5 Conclusion
This paper proposes DE\-LFT, a Differential Evolution based hyperparameter learning framework for Latent Factorization of Tensors\. By automatically optimizing the regularization coefficients of latent feature matrices during training, the proposed method improves the prediction accuracy of LFT models\. Experimental results on four real\-world datasets demonstrate that DE\-LFT consistently achieves lower MAE and RMSE compared with manual tuning and grid search methods\.
## References
- \[1\] (2012). Random search for hyper-parameter optimization. Journal of Machine Learning Research 13(2), pp. 281–305.
- \[2\] J. Chen, Y. Yuan, X. Luo, and X. Gao (2025-Jul 22). An adaptive neighborhood-resonated graph convolution network for undirected weighted graph representation. IEEE Transactions on Neural Networks and Learning Systems.
- \[3\] J. Gao, D. Wu, J. Chen, M. Zhou, and X. Luo (2025-Oct 5). Federated deep latent factor model for privacy-preserving recommendation. In 2025 IEEE International Conference on Systems, Man, and Cybernetics (SMC), pp. 1689–1694.
- \[4\] M. Han, L. Wang, Y. Yuan, and X. Luo (2025-Aug 3). Sgd-dyg: self-reliant global dependency apprehending on dynamic graphs. In Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Vol. 2, pp. 802–813.
- \[5\] N. Han, S. Lu, Z. Lin, B. Li, N. Wang, and X. Luo (2026-Feb 24). TraceHG: an unsupervised dual-view framework for microservice anomaly detection. IEEE Transactions on Services Computing.
- \[6\] T. He, Z. Duan, and X. Luo (2026). Modularized graph convolutional network. IEEE/CAA Journal of Automatica Sinica 13(3), pp. 737–739.
- \[7\] Y. He and X. Luo (2026-Jan 30). Tensor low-rank orthogonal compression for convolutional neural networks. IEEE/CAA Journal of Automatica Sinica 13(1), pp. 227–229.
- \[8\] Y. He, H. Wu, W. Liu, and X. Luo (2026-Mar 25). A survey of latent factorization of tensor-based model compression: algorithms, toolboxes and future directions. Neurocomputing, pp. 133455.
- \[9\] Z. Hu, Z. Peng, Z. Bi, Q. Shen, Z. Liu, J. Lou, and X. Luo (2025-Dec 31). Advancing healthcare with large language models: techniques and application. IEEE/CAA Journal of Automatica Sinica 12(12), pp. 2371–2398.
- \[10\] Z. Hu, X. Xu, Q. Su, H. Zhu, and J. Guo (2020). Grey prediction evolution algorithm for global optimization. Applied Mathematical Modelling 79, pp. 145–160.
- \[11\] F. Hutter, L. Kotthoff, and J. Vanschoren (2019). Automated machine learning: methods, systems, challenges. Springer.
- \[12\] D. Jankar and S. L. Badjate (2026). Federated learning and collaborative ai models in neuroscience research. In AI-driven Healthcare Innovations: Applications in Neurology and Medicine, pp. 261–277.
- \[13\] L. Lan, H. Li, Z. Xia, J. Zhou, X. Zhu, Y. Li, Y. Zhang, and X. Luo (2026). CM-cgns: cross-modal clustering-guided negative sampling for self-supervised joint learning from medical images and reports. External Links: 6110595.
- \[14\] Z. Li, P. Hu, X. Deng, L. Hu, S. Li, and X. Luo (2026). A novel l1-and-l2-norm-integrated parameter identification model for robot calibration. IEEE Transactions on Industrial Electronics.
- \[15\] X. Liao, H. Wu, and X. Luo (2025-Aug 4). A novel tensor causal convolution network model for highly-accurate representation to spatio-temporal data. IEEE Transactions on Automation Science and Engineering.
- \[16\] Z. Liu, Z. Zhang, X. Luo, C. Pan, L. Wang, H. Tang, and L. He (2025-Dec 20). An adaptive recognition method for reliable collaboration of manufacturing services based on edge-aggregated graph convolutional network. International Journal of Production Research, pp. 1–28.
- \[17\] X. Luo, M. Chen, H. Wu, Z. Liu, H. Yuan, and M. Zhou (2021). Adjusting learning depth in nonnegative latent factorization of tensors for accurately modeling temporal patterns in dynamic qos data. IEEE Transactions on Automation Science and Engineering 18(4), pp. 2142–2155.
- \[18\] X. Luo, H. Wu, H. Yuan, and M. Zhou (2020). Temporal pattern-aware qos prediction via biased non-negative latent factorization of tensors. IEEE Transactions on Cybernetics 50(5), pp. 1798–1809.
- \[19\] X. Luo et al. (2025). A novel multi-agent reinforcement learning framework for robust exception handling of manufacturing service collaboration based on asymmetric information. Journal of Manufacturing Systems 79, pp. 364–382.
- \[20\] Q. Ma, D. Wu, and X. Luo (2025). A review of deep learning-based power load forecasting methods. International Journal of Network Dynamics and Intelligence 4(4), pp. 100027.
- \[21\] W. Qin, Y. Ding, and X. Luo (2026). A robust approach to electricity theft detection via tensor representation-driven contrastive distillation. IEEE Transactions on Industrial Informatics.
- \[22\] J. Snoek, H. Larochelle, and R. P. Adams (2012). Practical bayesian optimization of machine learning algorithms. In Advances in Neural Information Processing Systems, Vol. 25.
- \[23\] R. Storn and K. Price (1997). Differential evolution – a simple and efficient heuristic for global optimization over continuous spaces. Journal of Global Optimization 11(4), pp. 341–359.
- \[24\] W. Tong, D. Liu, Z. Hu, and Q. Su (2023). Hybridizing genetic algorithm with grey prediction evolution algorithm for solving unit commitment problem. Applied Intelligence 53(17), pp. 5927–5943.
- \[25\] D. V. D. Valiki (2026). Secure multi-organization healthcare data analysis using federated ai architectures. American Data Science Journal for Advanced Computations (ADSJAC) 4(01).
- \[26\] C. Wang, J. Wu, X. Zheng, B. Pei, X. Zhang, D. Yu, and J. Tang (2021). Leveraging icn with network sensing for intelligent transportation systems: a dynamic naming approach. IEEE Sensors Journal 21(14), pp. 15875–15884.
- \[27\] J. Wang, W. Li, Y. Zhong, and X. Luo (2024-Feb 19). Mini-hes: a parallelizable second-order latent factor analysis model. External Links: 2402.11948.
- \[28\] L. Wang, Y. Yuan, and X. Luo (2026). Advanced high-order graph convolutional networks with assorted time-frequency transforms. IEEE/CAA Journal of Automatica Sinica 13(2), pp. 394–408.
- \[29\] L. Wang, Y. Yuan, and X. Luo (2026). Graph tensor convolutional network. IEEE Transactions on Systems, Man, and Cybernetics: Systems.
- \[30\] D. Wu, M. Shang, X. Luo, and Z. Wang (2022). An l1-and-l2-norm-oriented latent factor model for recommender systems. IEEE Transactions on Neural Networks and Learning Systems 33(10), pp. 5775–5788.
- \[31\] D. Wu, Y. Hu, K. Liu, J. Li, X. Wang, S. Deng, N. Zheng, and X. Luo (2025-04). An outlier-resilient autoencoder for representing high-dimensional and incomplete data. IEEE Transactions on Emerging Topics in Computational Intelligence 9(2), pp. 1379–1391.
- \[32\] D. Wu, S. Li, Y. He, X. Luo, and X. Gao (2026-05). Non-gradient hash factor learning for high-dimensional and incomplete data representation learning. IEEE Transactions on Pattern Analysis and Machine Intelligence 48(5), pp. 5811–5826.
- \[33\] D. Wu, C. Liang, Y. He, Y. Qiao, and X. Luo (2026-03). Multimetric autoencoder for representing high-dimensional and incomplete data. IEEE Transactions on Systems, Man, and Cybernetics: Systems 56(3), pp. 1533–1546.
- \[34\] D. Wu, Z. Tang, Y. He, and X. Luo (2026-02). SchemaRAG: a schema-aware retrieval-augmented generation framework for text-to-sql. Proceedings of the ACM on Management of Data 4(1), pp. 82.
- \[35\] D. Wu, S. Zhong, Y. He, X. Luo, and X. Gao (2026). Federated latent factorization of tensors for privacy-preserving representation learning to large-scale dynamic weighted directed network. IEEE Transactions on Dependable and Secure Computing.
- \[36\] H. Wu, Q. Wang, X. Luo, and Z. Wang (2025-Oct 3). Learning accurate representation to nonstandard tensors via a mode-aware tucker network. IEEE Transactions on Knowledge and Data Engineering.
- \[37\] L. Xiang, P. Wang, F. Chen, and G. Chen (2020). Controllability of directed networked mimo systems with heterogeneous dynamics. IEEE Transactions on Control of Network Systems 7(2), pp. 807–817.
- \[38\] R. Xu, D. Wu, and X. Luo (2025-Aug.). Recursion-and-fuzziness reinforced online sparse streaming feature selection. IEEE Transactions on Fuzzy Systems 33(8), pp. 2574–2586.
- \[39\] X. Xu, M. Lin, Z. Xu, and X. Luo (2025-Dec 19). A sampling-neighborhood-regularized latent factorization of tensor for dynamic qos estimation. IEEE Transactions on Network and Service Management 23, pp. 1707–1722.
- \[40\] J. Yang, J. Kuang, G. Wang, Q. Zhang, Y. Liu, Q. Liu, D. Xia, S. Li, X. Wang, and D. Wu (2024-09). Adaptive three-way knn classifier using density-based granular balls. Information Sciences 678, pp. 120858.
- \[41\] J. Yang, X. Lan, G. Wang, Z. Chen, Y. Chen, and D. Wu (2025-Nov.-Dec.). A hybrid ensemble end-to-end neural network for accurate protein-protein interactions prediction. IEEE/ACM Transactions on Computational Biology and Bioinformatics 22(6), pp. 2540–2553.
- \[42\] J. Yang, Y. Li, G. Wang, Z. Chen, and D. Wu (2024-Nov.-Dec.). An end-to-end knowledge graph fused graph neural network for accurate protein-protein interactions prediction. IEEE/ACM Transactions on Computational Biology and Bioinformatics 21(6), pp. 2518–2530.
- \[43\] Y. Yang, L. Hu, G. Li, D. Li, P. Hu, and X. Luo (2025-Jun 30). FMvPCI: a multiview fusion neural network for identifying protein complex via fuzzy clustering. IEEE Transactions on Systems, Man, and Cybernetics: Systems.
- \[44\] Z. Yang, D. Wu, J. Chen, and X. Luo (2025-Jun 30). Concept factorization via self-representation and adaptive graph structure learning. In 2025 International Joint Conference on Neural Networks (IJCNN), pp. 1–8.
- \[45\] D. You, H. Yan, J. Xiao, Z. Chen, D. Wu, L. Shen, and X. Wu (2024-Sept.). Online learning for data streams with incomplete features and labels. IEEE Transactions on Knowledge and Data Engineering 36(9), pp. 4820–4834.
- \[46\] C. Yu, D. Wu, J. Chen, M. Zhou, and X. Luo (2025-Dec 14). Multi-indicator latent factorization of tensors for spatio-temporal signal recovery. In 2025 IEEE 31st International Conference on Parallel and Distributed Systems (ICPADS), pp. 1–8.
- \[47\] C. Yu, D. Wu, Y. He, J. Chen, and X. Luo (2026-April 13–17). Federated latent factor learning for privacy-preserving spatio-temporal signal recovery. In Proceedings of the ACM Web Conference 2026 (WWW ’26), Dubai, United Arab Emirates, pp. 1–12.
- \[48\] D. Zeng, C. Pan, K. A. Feng, and X. Luo (2025). A novel magnetite ore refined sorting method based on magnetic induction and cnn-sk-bilstm network. Gospodarka Surowcami Mineralnymi 41.
- \[49\] L. Zhang, G. Lu, X. Yan, P. Xia, Z. Chen, and D. Wu (2025). A differential evolution optimized hybrid xgboost for accurate carbon emission prediction. Environmental Modelling & Software 193, pp. 106627.
- \[50\] R. Zhang, S. Mao, and Y. Kang (2023). A novel traffic flow prediction model: variable order fractional grey model based on an improved grey evolution algorithm. Expert Systems with Applications 224, pp. 119943.