A Survey on Data-Driven Models for Soil Moisture Regression and Classification
Summary
A structured survey of AI-based models for soil moisture estimation and classification, covering statistical time-series, geostatistical, classical ML, deep learning, and probabilistic/Bayesian methods.
# A Survey on Data-Driven Models for Soil Moisture Regression and Classification
Source: [https://arxiv.org/html/2606.18316](https://arxiv.org/html/2606.18316)
Robotics and AI Team
Department of Computer Science, Space and Electrical Engineering
Luleå University of Technology
Luleå, Sweden

###### Abstract
Soil Moisture \(SM\) modelling constitutes a complex spatio\-temporal learning problem characterised by nonlinear environmental interactions, heterogeneous data sources, and limited ground observations\. Physics\-based approaches, such as water balance models, rely on explicit hydrological equations and high\-quality inputs, but their computational cost and scalability limitations restrict large\-scale deployment\. Data\-driven artificial intelligence \(AI\) methods have emerged as flexible alternatives, enabling the extraction of empirical relationships between soil moisture and environmental variables with reduced modelling assumptions\. This work presents a structured survey of AI\-based models for soil moisture estimation and classification\. Existing approaches are organised into five categories: \(a\) statistical time\-series models, \(b\) geostatistical methods, \(c\) classical machine learning \(ML\) models, \(d\) deep learning \(DL\) models, and \(e\) probabilistic/Bayesian methods\. These models leverage historical soil moisture records, meteorological variables, vegetation indices, topography, soil characteristics, and geolocation data to perform regression or classification tasks\.
## 1 Introduction
Soil Moisture \(SM\) is a crucial variable in the hydrological cycle, and knowledge of it is vital for flood and drought forecasting\. Furthermore, SM influences agriculture and water\-resources management; therefore, accurate measurement and prediction of SM are of great importance\[[21](https://arxiv.org/html/2606.18316#bib.bib1)\],\[[28](https://arxiv.org/html/2606.18316#bib.bib7)\],\[[45](https://arxiv.org/html/2606.18316#bib.bib8)\]\. SM changes dynamically in space and time, and SM fields show high spatial heterogeneity, as shown in Fig\.[1](https://arxiv.org/html/2606.18316#S1.F1), even within small catchments\[[61](https://arxiv.org/html/2606.18316#bib.bib35)\],\[[23](https://arxiv.org/html/2606.18316#bib.bib10)\]\.
Figure 1: Spatial variation of soil moisture over a grass\-covered field in Luleå, Sweden\. The map was produced by Ordinary Kriging interpolation, and the colour gradient shows the soil moisture variation\.

Knowing the spatio\-temporal variability is important for understanding and modelling hydrological processes\[[34](https://arxiv.org/html/2606.18316#bib.bib33)\]\. The relationship between spatial and temporal variability has also been examined, revealing that the temporal stability of SM is associated with the temporal persistence of its spatial distribution patterns\[[27](https://arxiv.org/html/2606.18316#bib.bib34)\]\.
SM dynamics are affected by nonlinear interactions with environmental and physical factors, such as climate, soil, vegetation and topography\[[2](https://arxiv.org/html/2606.18316#bib.bib36)\],\[[23](https://arxiv.org/html/2606.18316#bib.bib10)\],\[[40](https://arxiv.org/html/2606.18316#bib.bib11)\]\. Recent work\[[39](https://arxiv.org/html/2606.18316#bib.bib37)\] shows that Artificial Neural Networks \(ANNs\) and tree ensembles such as Random Forests \(RFs\) outperform linear models because they capture complex nonlinear interactions among hydro\-topographic and climate predictors\. The nonlinear dependence of SM on multiple features is further supported by studies in which RFs and ANNs significantly improve estimation compared with simpler relationships\[[32](https://arxiv.org/html/2606.18316#bib.bib38)\],\[[2](https://arxiv.org/html/2606.18316#bib.bib36)\]\.
Another challenging issue is the mismatch between spatial scales in multisource assimilation\. The study by\[[49](https://arxiv.org/html/2606.18316#bib.bib39)\] directly addresses the spatial scale mismatch between satellite and field scales by using inverse Hydroblocks\-RTM to downscale SMAP to 30 m\. Furthermore, data assimilation of in\-situ and remote sensing SM improves the model states, as reported by\[[20](https://arxiv.org/html/2606.18316#bib.bib40)\], who also highlight that remote sensing is limited to the topsoil \(approximately 5 cm depth\) and that downscaling and error modelling are needed to address scale mismatches\.
Traditional physics\-based process models describe the hydrological processes governing SM transfer using physical equations and compute the relevant explanatory variables as part of land\-surface data assimilation techniques\[[12](https://arxiv.org/html/2606.18316#bib.bib14)\]\. These models are mostly numerical and rely heavily on highly accurate input data\. In addition, these mechanistic models incur substantial computational costs, which restricts their broader application at large scales\[[58](https://arxiv.org/html/2606.18316#bib.bib15)\]\. Statistical methods were then introduced to improve model adaptability\. Data\-driven approaches allow researchers to derive empirical relationships between SM and environmental parameters, thereby reducing reliance on physical parameters\[[58](https://arxiv.org/html/2606.18316#bib.bib15)\],\[[54](https://arxiv.org/html/2606.18316#bib.bib16)\]\.
The objective of this study is to survey data\-driven models for SM regression and classification tasks and to compare or distinguish between them\. The focus will be on categorising the models by learning paradigm, and on highlighting current challenges and research gaps\.
## 2 Soil Moisture Modelling Tasks
Data\-driven SM modelling is commonly formulated as either a regression or a classification problem, depending on the target variable and application context\. Regression predicts continuous SM values, while classification assigns samples to discrete moisture states such as dry, moderate, or wet\[[54](https://arxiv.org/html/2606.18316#bib.bib16)\],\[[46](https://arxiv.org/html/2606.18316#bib.bib43)\],\[[59](https://arxiv.org/html/2606.18316#bib.bib2)\]\. In addition, some studies address spatial downscaling, where coarse\-resolution SM products are refined using higher\-resolution auxiliary data\[[49](https://arxiv.org/html/2606.18316#bib.bib39)\],\[[38](https://arxiv.org/html/2606.18316#bib.bib76)\], which falls under the umbrella of regression tasks\. Despite their inherent differences, all of these tasks make assumptions and use the provided input\-output data following the same steps, shown in Fig\.[2](https://arxiv.org/html/2606.18316#S2.F2)\.
### 2\.1 Input Variables and Data Sources
In all data\-driven approaches, the GIGO \(Garbage In, Garbage Out\) aphorism holds\. The appropriate selection of input variables is therefore of paramount importance in any of these tasks\. In SM modelling, the inputs are meteorological, remote sensing, soil, topographic, and temporal predictors, and combinations of them, including, but not limited to, precipitation, temperature, solar radiation, vegetation indices, SAR backscatter, soil texture, elevation, slope, and antecedent soil moisture\[[2](https://arxiv.org/html/2606.18316#bib.bib36)\],\[[23](https://arxiv.org/html/2606.18316#bib.bib10)\],\[[31](https://arxiv.org/html/2606.18316#bib.bib68)\],\[[32](https://arxiv.org/html/2606.18316#bib.bib38)\],\[[30](https://arxiv.org/html/2606.18316#bib.bib59)\]\. In\-situ measurements from the International Soil Moisture Network \(ISMN\), passive microwave products \(e\.g\. SMAP and SMOS\), Sentinel\-1 SAR observations, ERA5\-Land reanalysis, and auxiliary products \(e\.g\. MODIS, DEMs, and SoilGrids\) are among the most common data sources\[[26](https://arxiv.org/html/2606.18316#bib.bib32)\],\[[44](https://arxiv.org/html/2606.18316#bib.bib62)\],\[[49](https://arxiv.org/html/2606.18316#bib.bib39)\],\[[48](https://arxiv.org/html/2606.18316#bib.bib72)\]\.
Figure 2: Pipeline for data\-driven soil moisture modelling, from data acquisition and preprocessing to feature construction, model selection, outputs, and validation\.
### 2\.2 Regression Tasks
Regression is the most commonly employed tool in SM research, given the fact that SM measurements "live" in a continuous space\. To put it formally, the regression paradigm tries to estimate or forecast moisture given a set of environmental and geospatial predictors, either static or in a chronologically ordered setting\[[54](https://arxiv.org/html/2606.18316#bib.bib16)\]\. This way, the problem is formulated as a standard supervised setting, trying to create a mapping between a group of predictors and a moisture value, either recorded on the surface or underground\.
Regression can be applied in different settings depending on the input and, more importantly, the output space: a\) point estimation, with the goal of predicting SM at a specific location or depth from contemporaneous predictors\[[54](https://arxiv.org/html/2606.18316#bib.bib16)\],\[[26](https://arxiv.org/html/2606.18316#bib.bib32)\]; b\) temporal forecasting, with the goal of predicting future SM values using historical observations and exogenous inputs \(e\.g\. precipitation, temperature, evapotranspiration, etc\.\)\[[1](https://arxiv.org/html/2606.18316#bib.bib18)\],\[[5](https://arxiv.org/html/2606.18316#bib.bib20)\],\[[14](https://arxiv.org/html/2606.18316#bib.bib52)\]; c\) multi\-depth prediction, which differs from the first category in that models predict SM values across several soil layers\[[2](https://arxiv.org/html/2606.18316#bib.bib36)\],\[[48](https://arxiv.org/html/2606.18316#bib.bib72)\]; and d\) gap\-filling or record reconstruction, where the goal is to "impute" SM observations in sparse or incomplete datasets\[[57](https://arxiv.org/html/2606.18316#bib.bib69)\],\[[44](https://arxiv.org/html/2606.18316#bib.bib62)\]\. Last but not least, a special case of regression modelling that deserves separate mention is spatial downscaling, where SM measurements are refined to finer spatial resolutions using more densely sampled auxiliary variables \(e\.g\. vegetation maps, land surface temperature, topography, etc\.\)\[[49](https://arxiv.org/html/2606.18316#bib.bib39)\],\[[38](https://arxiv.org/html/2606.18316#bib.bib76)\],\[[41](https://arxiv.org/html/2606.18316#bib.bib77)\]\. Although downscaling is sometimes discussed separately because of its practical importance, it is best interpreted as a specialised regression problem in which the target remains continuous SM but the emphasis shifts to recovering finer spatial detail\.
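The temporal forecasting setting can be sketched as a supervised problem by turning aligned time series into lagged samples. The following is a minimal illustration and is not taken from any of the cited studies: the function name `make_lagged_samples`, the choice of precipitation as the only exogenous input, and the toy values are all assumptions.

```python
def make_lagged_samples(sm, precip, n_lags=3, horizon=1):
    """Turn aligned SM and precipitation series into (features, target) pairs.

    Each feature row holds the last `n_lags` SM values followed by the last
    `n_lags` precipitation values; the target is SM `horizon` steps ahead.
    """
    if len(sm) != len(precip):
        raise ValueError("series must be aligned and of equal length")
    X, y = [], []
    for t in range(n_lags, len(sm) - horizon + 1):
        X.append(list(sm[t - n_lags:t]) + list(precip[t - n_lags:t]))
        y.append(sm[t + horizon - 1])
    return X, y


# Toy daily series (hypothetical values): SM in m3/m3, precipitation in mm.
sm = [0.30, 0.28, 0.27, 0.35, 0.33]
precip = [0.0, 0.0, 12.0, 0.0, 0.0]
X, y = make_lagged_samples(sm, precip, n_lags=2)
```

Any regressor can then be fitted on `X` and `y`; point estimation corresponds to dropping the lagged SM columns and keeping only contemporaneous predictors.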
### 2\.3 Classification Tasks
Classification modelling relaxes the need to predict a continuous variable by instead assigning SM to predefined categories/classes\[[37](https://arxiv.org/html/2606.18316#bib.bib50)\],\[[45](https://arxiv.org/html/2606.18316#bib.bib8)\],\[[19](https://arxiv.org/html/2606.18316#bib.bib71)\],\[[59](https://arxiv.org/html/2606.18316#bib.bib2)\]\. The classes can be defined either from expert knowledge or by binning continuous values\[[46](https://arxiv.org/html/2606.18316#bib.bib43)\]\. Although this loss of precision may seem strange at first glance, it is useful when too little data is available to formulate a regression problem, when the categories suffice as part of a broader system, or when they are easier to interpret \(e\.g\. in applications such as drought monitoring and irrigation decision support\)\.
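Class formation by binning can be illustrated with a short sketch. The thresholds and label names below are hypothetical placeholders; in practice they would come from expert knowledge or from the distribution of the data.

```python
from bisect import bisect_right

# Hypothetical volumetric SM thresholds (m3/m3); real studies choose their own.
THRESHOLDS = (0.15, 0.30)
LABELS = ("dry", "moderate", "wet")


def classify_sm(value, thresholds=THRESHOLDS, labels=LABELS):
    """Map a continuous SM value to a class label by binning.

    A value equal to a threshold is assigned to the upper class.
    """
    return labels[bisect_right(thresholds, value)]


classes = [classify_sm(v) for v in (0.10, 0.15, 0.22, 0.41)]
```

The resulting labels can serve directly as targets for a classifier trained on the same predictors used for regression.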
### 2\.4 Evaluation and Validation
As with every data\-driven model, utility is evaluated using specific metrics that compare the model output\(s\) against a set of target values \(e\.g\. $R^2$, RMSE, MAE, ubRMSE, KGE, etc\., for regression, and accuracy, F1\-score, kappa, precision, recall, AUC, etc\., for classification\)\[[52](https://arxiv.org/html/2606.18316#bib.bib3)\],\[[37](https://arxiv.org/html/2606.18316#bib.bib50)\]\. The way the metric is estimated \(the design of the training/validation/testing sets\) is also critical\. Random cross\-validation should probably be avoided in favour of temporal holdout, spatial blocking, or leave\-one\-station\-out validation\[[26](https://arxiv.org/html/2606.18316#bib.bib32)\],\[[36](https://arxiv.org/html/2606.18316#bib.bib78)\], because SM samples that are close in time or space are correlated and random splits can leak information between training and test sets\. The overall design of the experimental set\-up is crucial for the proposed modelling approach to generalise to new/unseen data\.
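As a rough illustration of the regression metrics and of leave\-one\-station\-out splitting, the sketch below uses only the Python standard library. The function names and the toy station identifiers are assumptions; the formulas follow the usual definitions, with ubRMSE computed as the RMSE after removing the mean bias and KGE computed from correlation, variability ratio and bias ratio.

```python
import math
from statistics import mean, pstdev


def rmse(obs, sim):
    return math.sqrt(mean((s - o) ** 2 for o, s in zip(obs, sim)))


def ubrmse(obs, sim):
    """Unbiased RMSE: RMSE with the mean bias removed."""
    bias = mean(sim) - mean(obs)
    return math.sqrt(max(rmse(obs, sim) ** 2 - bias ** 2, 0.0))


def pearson_r(obs, sim):
    mo, ms = mean(obs), mean(sim)
    cov = mean((o - mo) * (s - ms) for o, s in zip(obs, sim))
    return cov / (pstdev(obs) * pstdev(sim))


def kge(obs, sim):
    """Kling-Gupta efficiency; assumes non-constant series and non-zero mean."""
    r = pearson_r(obs, sim)
    alpha = pstdev(sim) / pstdev(obs)  # variability ratio
    beta = mean(sim) / mean(obs)       # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


def leave_one_station_out(station_ids):
    """Yield (held_out_station, train_indices, test_indices) per station."""
    for station in sorted(set(station_ids)):
        test = [i for i, s in enumerate(station_ids) if s == station]
        train = [i for i, s in enumerate(station_ids) if s != station]
        yield station, train, test


obs = [0.20, 0.30, 0.40]
sim = [0.25, 0.35, 0.45]  # constant +0.05 bias
splits = list(leave_one_station_out(["A", "A", "B", "C", "B"]))
```

With a constant bias, RMSE equals the bias while ubRMSE is close to zero, which is why the two are often reported together for satellite SM products.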
## 3 Data\-Driven Model Taxonomy
This section reviews representative data\-driven approaches for SM regression and classification, starting with the more established ones and moving to more recent trends\. The taxonomy of the data\-driven models covered in this survey is summarised in Fig\.[3](https://arxiv.org/html/2606.18316#S3.F3)\.
Figure 3: Categorisation of the data\-driven models covered in this survey\.

### 3\.1 Statistical Time\-Series Models
A statistical time\-series model is a data\-driven model that assumes the observed data are generated by a random process and describes the relationship between dependent and independent variables by solving an optimisation problem\[[10](https://arxiv.org/html/2606.18316#bib.bib45)\]\. The main advantages of statistical modelling include interpretability, robustness under limited data availability, and the ability to provide confidence intervals for predictions; these come at the cost of assumptions about the distribution of the data and the generative process\[[43](https://arxiv.org/html/2606.18316#bib.bib46)\],\[[50](https://arxiv.org/html/2606.18316#bib.bib47)\],\[[15](https://arxiv.org/html/2606.18316#bib.bib48)\]\. Most of these models have been around for over half a century\.
The Autoregressive Integrated Moving Average \(ARIMA\) model uses past observations and past forecast errors as inputs to forecast future values\. Variations of the basic model have found extensive use in SM modelling\. Huang\[[17](https://arxiv.org/html/2606.18316#bib.bib56)\] used a plain ARIMA\(12,1,0\) to predict precipitation and soil evaporation\. An extension of the basic technique is the Seasonal ARIMA \(SARIMA\), designed to handle seasonal time series\[[1](https://arxiv.org/html/2606.18316#bib.bib18)\]\. In\[[5](https://arxiv.org/html/2606.18316#bib.bib20)\], the SARIMA model uses a historical SM time series at a specific depth to predict SM, while in\[[4](https://arxiv.org/html/2606.18316#bib.bib53)\] the SARIMA model is used to forecast moisture content in precision agriculture\. Giorgio et al\.\[[14](https://arxiv.org/html/2606.18316#bib.bib52)\] developed a SARIMA\-based time\-series model for 48\-h forecasts of soil water content and salinity in the context of irrigation with reclaimed water, using soil water content and salinity data from 50 cm beneath the soil surface at a time resolution of 15 min, hourly atmospheric data, and daily irrigation amounts\. In\[[51](https://arxiv.org/html/2606.18316#bib.bib54)\], the Cluster\-ARIMA model is used to predict soil respiration: the soil respiration data are partitioned into clusters using clustering techniques, and SARIMA is then employed to determine the cluster to which the data to be forecast belong\. The reported prediction accuracy was 98\.3%\. The Vector Autoregression VAR\-ARIMA model was used by\[[55](https://arxiv.org/html/2606.18316#bib.bib55)\] to predict SM, treating precipitation, SM, and evapotranspiration as independent inputs\.
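To make the ARIMA idea concrete, the sketch below implements a bare ARIMA\(1,1,0\)\-style forecast: the series is differenced once, an AR\(1\) coefficient is fitted to the differences by least squares, and the forecast differences are summed back onto the last observation. This is an illustrative simplification with no moving\-average or seasonal terms and no intercept; it is not the configuration used in any cited study, and the function names and toy values are assumptions. In practice a dedicated time\-series library would be used.

```python
def fit_ar1(diffs):
    """Least-squares AR(1) coefficient for a differenced series (no intercept)."""
    num = sum(a * b for a, b in zip(diffs[1:], diffs[:-1]))
    den = sum(b * b for b in diffs[:-1])
    return num / den if den else 0.0


def forecast_arima_110(series, steps=3):
    """Forecast `steps` values with a differenced AR(1) model."""
    if len(series) < 3:
        raise ValueError("need at least three observations")
    diffs = [b - a for a, b in zip(series, series[1:])]  # d = 1
    phi = fit_ar1(diffs)                                 # p = 1, q = 0
    level, last_diff = series[-1], diffs[-1]
    forecasts = []
    for _ in range(steps):
        last_diff = phi * last_diff  # AR(1) step on the differences
        level += last_diff           # integrate back to the SM level
        forecasts.append(level)
    return phi, forecasts


# Hypothetical dry-down: SM decays towards a residual level.
phi, forecasts = forecast_arima_110([0.40, 0.32, 0.28, 0.26, 0.25])
```

For this toy dry\-down the fitted coefficient is about 0.5, so each forecast step shrinks the previous daily change by half.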
### 3\.2 Geostatistical Methods
Geostatistical models are spatial modelling approaches that treat SM as a realisation of a spatial stochastic process\[[11](https://arxiv.org/html/2606.18316#bib.bib61)\],\[[9](https://arxiv.org/html/2606.18316#bib.bib60)\]\. Unlike classical regression models that assume independent observations, geostatistical methods model the dependence between measurements as a function of spatial distance\[[9](https://arxiv.org/html/2606.18316#bib.bib60)\]\. Kriging is the most widely used spatial predictor; it spatially interpolates the collected data and generates a digital soil map of the variation in the predicted variable\[[16](https://arxiv.org/html/2606.18316#bib.bib63)\]\. In\[[44](https://arxiv.org/html/2606.18316#bib.bib62)\], Ordinary Kriging \(OK\) used the available Soil Moisture Active Passive \(SMAP\) L3 SM pixels to fill gaps and interpolate complete SM data\. The cross\-validation results showed a high correlation with the official SMAP SM and a high coefficient of determination\.
In\[[35](https://arxiv.org/html/2606.18316#bib.bib64)\], researchers used Inverse Distance Weighting \(IDW\) and Geographically Weighted Regression \(GWR\) to estimate the spatial distribution of soil properties and compaction characteristics, with the Modified Normalised Difference Water Index \(MNDWI\) from a Landsat 8 satellite image as the independent variable\. An assessment of OK and Ordinary Co\-Kriging \(OCK\) for spatial interpolation of precipitation by month and quarter is presented in\[[47](https://arxiv.org/html/2606.18316#bib.bib65)\]\. The input data are precipitation data collected over 4\-5 years from rain gauges; for OCK, spatiotemporal SM data from the Soil Moisture and Ocean Salinity \(SMOS\) satellite were used as an ancillary variable in addition to the precipitation data\. OCK showed the highest predictive performance for spatial precipitation distribution in both quarterly \($R^2$ = 0\.944–0\.992\) and monthly analyses, clearly outperforming OK and IDW, which exhibited substantially lower $R^2$ values\.
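Of the interpolators mentioned in this subsection, IDW is the simplest to write down: the estimate at a target location is a weighted mean of the samples, with weights proportional to an inverse power of distance. The sketch below is a minimal version under assumed planar coordinates; the function name, the default power of 2, and the toy samples are illustrative choices. Kriging differs in that its weights are derived from a fitted variogram rather than fixed in advance.

```python
import math


def idw(points, target, power=2.0):
    """Inverse Distance Weighting estimate at `target`.

    points: iterable of (x, y, value) samples; target: (x, y) location.
    """
    num = den = 0.0
    for x, y, value in points:
        d = math.hypot(x - target[0], y - target[1])
        if d == 0.0:
            return value  # target coincides with a sample location
        w = 1.0 / d ** power
        num += w * value
        den += w
    if den == 0.0:
        raise ValueError("no sample points given")
    return num / den


# Two hypothetical SM samples (m3/m3) two metres apart.
samples = [(0.0, 0.0, 0.20), (2.0, 0.0, 0.40)]
midpoint = idw(samples, (1.0, 0.0))
at_sample = idw(samples, (0.0, 0.0))
```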
### 3\.3 Classical Machine Learning Models
ML models automatically learn functional relationships between input variables and outputs by optimizing model parameters from input\-output pairs without the need to exploit expert knowledge\[[62](https://arxiv.org/html/2606.18316#bib.bib66)\]\.
RF is a purely data\-driven ML model \(belonging to the family of ensemble methods\) that can capture nonlinear relationships and perform well on heterogeneous data\[[6](https://arxiv.org/html/2606.18316#bib.bib31)\]\. For example, in\[[26](https://arxiv.org/html/2606.18316#bib.bib32)\] the SMRFR \(Soil Moisture RF Regression\) model took as inputs a combination of several variables, such as in\-situ SM data from the ISMN \(International Soil Moisture Network\), ERA5\-Land reanalysis multi\-source predictors, MODIS vegetation indices, soil characteristics and topographic attributes, to estimate SM at five depth layers daily and globally at 9 km spatial resolution for the period 2000\-2023\. In\[[45](https://arxiv.org/html/2606.18316#bib.bib8)\], RFs used precipitation, land use, altitude, and potential evapotranspiration to estimate Soil Water Holding Capacity\. Additionally, the RF approach was compared with direct estimation using pedotransfer functions from soil maps\. The results showed that the RF approach was more robust, especially for low soil moisture values\. Another study\[[7](https://arxiv.org/html/2606.18316#bib.bib67)\] examined RFs for interpolating and extrapolating Root Zone Soil Moisture \(RZSM\) estimates at the daily timescale\. The input comprised in\-situ measurements from an agricultural catchment collected over two years\. Compared with simulations from process\-based models, the RF predictions of RZSM showed higher accuracy\.
For high\-resolution SM prediction, in\[[31](https://arxiv.org/html/2606.18316#bib.bib68)\], Extreme Gradient Boosting Regression \(XGBR\) \(another member of the family of ensemble methods\) was tested in combination with a Genetic Algorithm \(GA\), using data fusion from Sentinel\-1, Sentinel\-2, and the ALOS Digital Surface Model \(DSM\)\. The results showed that XGBR\-GA achieved the highest performance compared with RFs, Support Vector Machines, and CatBoost GB regression, with $R^2$ = 0\.891 and RMSE = 0\.879%\. An ANN was proposed in\[[57](https://arxiv.org/html/2606.18316#bib.bib69)\] for reconstructing missing daily surface SM records from the ESA CCI\. The considered variables were soil texture, geographic and topographic features, and vegetation conditions, among others\. The ANN model outperformed an OK model, especially in regions with sparse vegetation\. Another study\[[22](https://arxiv.org/html/2606.18316#bib.bib70)\] evaluated an ANN model combined with other ML\-based models, such as RFs and Support Vector Regression \(SVR\), for producing improved daily SM data\. Data from satellites and the Land Surface Model \(LSM\) were used, and the evaluation against the International Soil Moisture Network \(ISMN\) showed the ML\-based ensemble’s robustness in topographically complex areas with high\-density vegetation\.
Support Vector Machines \(SVMs\) are kernel\-based supervised learning models used for both regression and classification and are attractive because they can perform well under limited or noisy data\. In\[[18](https://arxiv.org/html/2606.18316#bib.bib58)\], the SVM model was enhanced using the Bald Eagle Search \(BES\) algorithm for SM prediction, outperforming conventional SVMs\. Navidi et al\.\[[30](https://arxiv.org/html/2606.18316#bib.bib59)\] compared ANNs, the Adaptive Neuro\-Fuzzy Inference System \(ANFIS\), SVMs, and SVMs optimised with the firefly and particle swarm metaheuristic algorithms \(named SVM\-FFA and SVM\-PSA\) for predicting soil water content from the geometric mean diameter of soil particles, bulk density, organic carbon, NDVI and NSMI; the metaheuristically optimised methods outperformed the conventional ML approaches\. In\[[29](https://arxiv.org/html/2606.18316#bib.bib81)\], an SVM, an RF, a GB, and an LR model were tested for predicting drought events using average temperature, specific humidity, soil moisture and dew point\. The SVM model outperformed the other ML models, achieving an AUC of 0\.8166\.
### 3\.4 Deep Learning Models
DL models are methods based on deep \(multilayer\) NNs \(i\.e\. stacks of nonlinear layers\)\. What makes them attractive is their ability to learn hierarchical feature representations directly from raw data, eliminating in most cases the need for handcrafted features\[[3](https://arxiv.org/html/2606.18316#bib.bib73)\]\. In\[[8](https://arxiv.org/html/2606.18316#bib.bib79)\], a Long Short\-Term Memory \(LSTM\) model acting on satellite data, climate time series, topographic and soil\-type data forecast daily SM values, achieving an $R^2$ of 0\.87 and an RMSE of 0\.046\. In\[[33](https://arxiv.org/html/2606.18316#bib.bib80)\], a combination of Graph Convolutional LSTM \(GConvLSTM\) and Convolutional LSTM \(ConvLSTM\), named GCCL, used historical time series from SMAP L4 SM spatiotemporal grids to forecast SM for the next 7 days, achieving RMSE values of 0\.018–0\.038 m³/m³ for 1–7 day forecasts and reducing error by up to 14\.3% compared with ConvLSTM\.
A Convolutional Neural Network \(CNN\), another example of ANNs, was used to forecast SM levels as part of an AI irrigation framework\[[19](https://arxiv.org/html/2606.18316#bib.bib71)\], with results \($R^2$ = 0\.87%\) significantly exceeding those of linear regression and other baseline models\. In\[[48](https://arxiv.org/html/2606.18316#bib.bib72)\], a 1D\-CNN\-ANN model is presented to estimate SM at different depths using raw amplitude data from Ground Penetrating Radar \(GPR\)\. Its performance in terms of $R^2$ and RMSE exceeded that of GBMs and SVMs at 10, 20 and 30 cm depth, and the model remained robust at 40 cm depth\. For the classification of SM at 6\-24 inches depth, a 1D\-CNN was developed in\[[60](https://arxiv.org/html/2606.18316#bib.bib44)\] that used UGV\-acquired hyperspectral data and spectral preprocessing methods\. The results showed a testing accuracy of 0\.67 and a strong discriminative ability, with an AUC of 0\.85\. Mois\-EfficientNet is proposed in\[[24](https://arxiv.org/html/2606.18316#bib.bib42)\] for classification and regression of root\-zone soil moisture using images derived from GPR data\. Initially, Mois\-EfficientNet classifies GPR images of tree roots into distinct moisture\-content levels, and then this trained classification network is used to predict soil moisture\.
With the emergence of the Transformer architecture, studies in SM modelling have shown that Transformers can be used for long prediction horizons and deep\-soil estimation tasks\[[53](https://arxiv.org/html/2606.18316#bib.bib4)\],\[[52](https://arxiv.org/html/2606.18316#bib.bib3)\]\. In\[[52](https://arxiv.org/html/2606.18316#bib.bib3)\], where seven DL models were tested, Transformers did not outperform LSTMs but were competitive, especially for longer\-horizon and deeper\-soil prediction\. However, in Wang and Zha\[[53](https://arxiv.org/html/2606.18316#bib.bib4)\], the Transformer model outperformed LSTM on average across all time lags, whereas LSTM\-Transformer outperformed both for longer time lags\. One of the latest trends in data\-driven methods is physics\-guided DL, where physically meaningful constraints are incorporated into the training objective\. Geng et al\.\[[13](https://arxiv.org/html/2606.18316#bib.bib5)\] proposed physically\-guided LSTM models \(PHY\-LSTM\) for next\-day soil moisture forecasting using the global ERA5\-Land dataset, and reported that PHY\-LSTM increased $R^2$ by 20\.7% and reduced RMSE by 8\.2% compared with a conventional LSTM baseline\.
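The general idea of a physics\-guided training objective can be sketched as a data\-misfit term plus a weighted penalty on physically implausible predictions. The example below is a generic illustration and not the formulation of Geng et al.: the function name, the simple bucket water\-balance bound \(which ignores runoff, drainage and capillary rise\), and the default values for `depth_mm`, `sm_max` and `weight` are all assumptions.

```python
def physics_guided_loss(pred, obs, prev_sm, precip, et,
                        depth_mm=50.0, sm_max=0.6, weight=0.1):
    """Data misfit (MSE) plus a weighted penalty for physically implausible SM.

    pred, obs, prev_sm: volumetric SM (m3/m3); precip, et: water depths (mm).
    """
    n = len(pred)
    mse = sum((p - o) ** 2 for p, o in zip(pred, obs)) / n
    penalty = 0.0
    for p, s0, pr, e in zip(pred, prev_sm, precip, et):
        # Assumed bucket bound: storage cannot rise by more than net water input.
        upper = min(sm_max, s0 + max(pr - e, 0.0) / depth_mm)
        penalty += max(p - upper, 0.0) ** 2  # too wet for the water balance
        penalty += max(-p, 0.0) ** 2         # negative moisture is impossible
    return mse + weight * penalty / n


# Same data fit, but the second prediction exceeds the water-balance bound.
ok = physics_guided_loss([0.30], [0.30], [0.25], [5.0], [0.0])
bad = physics_guided_loss([0.40], [0.40], [0.25], [5.0], [0.0])
```

In a DL framework the same expression would be written with differentiable tensor operations so that the penalty contributes gradients during training.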
### 3\.5 Probabilistic/Bayesian Models
Probabilistic and Bayesian models have been around for quite some time; however, only recently, with advances in computational power, have they become relevant for data\-intensive applications\. A Bayesian inverse model was presented by\[[42](https://arxiv.org/html/2606.18316#bib.bib49)\], which used sparse data to predict soil Available Water Capacity \(AWC\)\. The inputs were measurements of upper and lower drained limits across the tested area, and the output was the spatial prediction of AWC on a 90\-m grid\. Salakpi et al\. \(2022\)\[[37](https://arxiv.org/html/2606.18316#bib.bib50)\] compared a hierarchical Bayesian model with a regular Bayesian autoregression model to improve the accuracy and precision of agricultural drought forecasting across different regions\. The hierarchical Bayesian model produced more accurate and precise drought probability forecasts and had a lower probability of false alarms\. A prediction method based on Gaussian Process Regression \(GPR\), incorporating the Radially Uniform \(RU\) design algorithm to reduce computational cost, was proposed by\[[25](https://arxiv.org/html/2606.18316#bib.bib51)\]\. The experimental results show that the GPR model with the RU design algorithm outperforms the generic GPR model, achieving lower deterministic and probabilistic forecasting errors and reduced training time\. Zhang et al\.\[[56](https://arxiv.org/html/2606.18316#bib.bib57)\] introduced Stacking GPR \(SGPR\), an ensemble\-learning\-based method for soil moisture estimation, which notably outperformed existing models\.
## 4 Discussion \- Conclusions
Table 1: Summary of Data\-Driven Models for Soil Moisture\. R: Regression task; C: Classification task\.
The reviewed literature, summarised in Table [1](https://arxiv.org/html/2606.18316#S4.T1) together with the input variables and predicted output of each approach, shows that several data\-driven techniques have been used for SM modelling\. Statistical and geostatistical approaches remain relevant; however, ML and DL are gaining popularity due to their ability to deal with heterogeneous and nonlinear inputs\. The main issue in comparing the proposed methods is the variability in experimental design \(e\.g\., various inputs and outputs\)\.
The majority of studies focus on regression because SM is inherently continuous, although a few studies use classification\. Future work should place greater emphasis on uncertainty quantification and benchmarking across regions, depths, and sensing modalities\.
Overall, data\-driven methods offer strong predictive capability for complex SM regression and classification problems\. However, their effectiveness is often limited by data availability and reduced interpretability\.
## References
- \[1\]R\. Adhikari and R\. K\. Agrawal\(2013\)An introductory study on time series modeling and forecasting\.arXiv preprint arXiv:1302\.6613\.Cited by:[§2\.2](https://arxiv.org/html/2606.18316#S2.SS2.p2.1),[§3\.1](https://arxiv.org/html/2606.18316#S3.SS1.p2.1),[Table 1](https://arxiv.org/html/2606.18316#S4.T1.1.2.1.1.1.1),[Table 1](https://arxiv.org/html/2606.18316#S4.T1.1.5.4.1.1.1)\.
- \[2\]T\. Alahmad, M\. Neményi, A\. Széles, N\. Ali, O\. Hijazi, and A\. Nyéki\(2025\)Spatiotemporal prediction of soil moisture content at various depths in three soil types using machine learning algorithms\.Frontiers in Soil Science5,pp\. 1612908\.Cited by:[§1](https://arxiv.org/html/2606.18316#S1.p3.1),[§2\.1](https://arxiv.org/html/2606.18316#S2.SS1.p1.1),[§2\.2](https://arxiv.org/html/2606.18316#S2.SS2.p2.1)\.
- \[3\]M\. A\. Amanullah, R\. A\. A\. Habeeb, F\. H\. Nasaruddin, A\. Gani, E\. Ahmed, A\. S\. M\. Nainar, N\. M\. Akim, and M\. Imran\(2020\)Deep learning and big data technologies for iot security\.Computer Communications151,pp\. 495–517\.Cited by:[§3\.4](https://arxiv.org/html/2606.18316#S3.SS4.p1.1)\.
- \[4\]S\. Balasooriya, C\. Nguyen, I\. Kavalchuk, and L\. Yasakethu\(2022\)Forecasting model comparison for soil moisture to obtain optimal plant growth\.In2022 IEEE International IOT, Electronics and Mechatronics Conference \(IEMTRONICS\),pp\. 1–7\.Cited by:[§3\.1](https://arxiv.org/html/2606.18316#S3.SS1.p2.1)\.
- \[5\]G\. E\. Box, G\. M\. Jenkins, G\. C\. Reinsel, and G\. M\. Ljung\(2015\)Time series analysis: forecasting and control\.John Wiley & Sons\.Cited by:[§2\.2](https://arxiv.org/html/2606.18316#S2.SS2.p2.1),[§3\.1](https://arxiv.org/html/2606.18316#S3.SS1.p2.1)\.
- \[6\]L\. Breiman\(2001\)Random forests\.Machine learning45\(1\),pp\. 5–32\.Cited by:[§3\.3](https://arxiv.org/html/2606.18316#S3.SS3.p2.1)\.
- \[7\]C\. Carranza, C\. Nolet, M\. Pezij, and M\. van der Ploeg\(2021\)Root zone soil moisture estimation with random forest\.Journal of hydrology593,pp\. 125840\.Cited by:[§3\.3](https://arxiv.org/html/2606.18316#S3.SS3.p2.1),[Table 1](https://arxiv.org/html/2606.18316#S4.T1.1.11.10.1.1.1)\.
- \[8\]M\. F\. Celik, M\. S\. Isik, O\. Yuzugullu, N\. Fajraoui, and E\. Erten\(2022\)Soil moisture prediction from remote sensing images coupled with climate, soil texture and topography via deep learning\.Remote sensing14\(21\),pp\. 5584\.Cited by:[§3\.4](https://arxiv.org/html/2606.18316#S3.SS4.p1.1),[Table 1](https://arxiv.org/html/2606.18316#S4.T1.1.18.17.1.1.1)\.
- \[9\]N\. Cressie\(2015\)Statistics for spatial data\.John Wiley & Sons\.Cited by:[§3\.2](https://arxiv.org/html/2606.18316#S3.SS2.p1.1)\.
- \[10\]A\. C\. Davison\(2003\)Statistical models\.Vol\.11,Cambridge university press\.Cited by:[§3\.1](https://arxiv.org/html/2606.18316#S3.SS1.p1.1)\.
- \[11\]P\. J\. Diggle, P\. J\. Ribeiro Jr, and O\. F\. Christensen\(2003\)An introduction to model\-based geostatistics\.InSpatial statistics and computational methods,pp\. 43–86\.Cited by:[§3\.2](https://arxiv.org/html/2606.18316#S3.SS2.p1.1)\.
- \[12\]R\. Fu, L\. Xie, T\. Liu, B\. Zheng, Y\. Zhang, and S\. Hu\(2023\)A soil moisture prediction model, based on depth and water balance equation: a case study of the xilingol league grassland\.International Journal of Environmental Research and Public Health20\(2\),pp\. 1374\.Cited by:[§1](https://arxiv.org/html/2606.18316#S1.p5.1)\.
- \[13\]X\. Geng et al\.\(2024\)Enhancing data\-driven soil moisture modeling with physically\-guided lstm networks\.Frontiers in Forests and Global Change7,pp\. 1353011\.External Links:[Document](https://dx.doi.org/10.3389/ffgc.2024.1353011)Cited by:[§3\.4](https://arxiv.org/html/2606.18316#S3.SS4.p3.1),[Table 1](https://arxiv.org/html/2606.18316#S4.T1.1.25.24.1.1.1)\.
- \[14\]A\. Giorgio, N\. Del Buono, M\. Berardi, M\. Vurro, and G\. A\. Vivaldi\(2022\)Soil moisture sensor information enhanced by statistical methods in a reclaimed water irrigation framework\.Sensors22\(20\),pp\. 8062\.Cited by:[§2\.2](https://arxiv.org/html/2606.18316#S2.SS2.p2.1),[§3\.1](https://arxiv.org/html/2606.18316#S3.SS1.p2.1)\.
- \[15\]L\. Hagar and A\. J\. Martin\(2025\)An efficient framework for robust sample size determination\.arXiv preprint arXiv:2512\.16231\.Cited by:[§3\.1](https://arxiv.org/html/2606.18316#S3.SS1.p1.1)\.
- \[16\]A\. Hilal, S\. A\. Bangroo, N\. A\. Kirmani, J\. A\. Wani, A\. Biswas, M\. I\. Bhat, K\. Farooq, O\. Bashir, and T\. I\. Shah\(2024\)Geostatistical modeling—a tool for predictive soil mapping\.InRemote Sensing in Precision Agriculture,pp\. 389–418\.Cited by:[§3\.2](https://arxiv.org/html/2606.18316#S3.SS2.p1.1)\.
- \[17\]X\. Huang\(2023\)Research on soil moisture prediction based on mechanism analysis and arima model\.In3rd International Conference on Applied Mathematics, Modelling, and Intelligent Computing \(CAMMIC 2023\),Vol\.12756,pp\. 39–44\.Cited by:[§3\.1](https://arxiv.org/html/2606.18316#S3.SS1.p2.1)\.
- \[18\]Y\. Huang\(2023\)Improved svm\-based soil\-moisture\-content prediction model for tea plantation\.Plants12\(12\),pp\. 2309\.Cited by:[§3\.3](https://arxiv.org/html/2606.18316#S3.SS3.p4.1),[Table 1](https://arxiv.org/html/2606.18316#S4.T1.1.15.14.1.1.1)\.
- \[19\]A\. Ismail, O\. Meqbel, M\. Yousef, M\. Tayfor, A\. Askri, R\. Errouissi, and A\. El Moutaouakil\(2025\)A cnn\-based predictive control framework for autonomous soil moisture management\.In2025 IEEE 19th International Conference on Application of Information and Communication Technologies \(AICT\),pp\. 1–4\.Cited by:[§2\.3](https://arxiv.org/html/2606.18316#S2.SS3.p1.1),[§3\.4](https://arxiv.org/html/2606.18316#S3.SS4.p2.2),[Table 1](https://arxiv.org/html/2606.18316#S4.T1.1.20.19.1.1.1)\.
- \[20\]M\. Kivi, N\. Vergopolan, and H\. Dokoohaki\(2023\)A comprehensive assessment of in situ and remote sensing soil moisture data assimilation in the apsim model for improving agricultural forecasting across the us midwest\.Hydrology and Earth System Sciences27\(5\),pp\. 1173–1199\.Cited by:[§1](https://arxiv.org/html/2606.18316#S1.p4.1)\.
- \[21\]M\. Lamichhane, S\. Mehan, and K\. R\. Mankin\(2025\)Soil moisture prediction using remote sensing and machine learning algorithms: a review on progress, challenges, and opportunities\.Remote Sensing17\(14\),pp\. 2397\.Cited by:[§1](https://arxiv.org/html/2606.18316#S1.p1.1)\.
- \[22\]J\. Lee, S\. Park, J\. Im, C\. Yoo, and E\. Seo\(2022\)Improved soil moisture estimation: synergistic use of satellite observations and land surface models over conus based on machine learning\.Journal of hydrology609,pp\. 127749\.Cited by:[§3\.3](https://arxiv.org/html/2606.18316#S3.SS3.p3.1),[Table 1](https://arxiv.org/html/2606.18316#S4.T1.1.14.13.1.1.1)\.
- \[23\]H\. Liu, S\. Hui, Y\. Zhao, Y\. Jiang, Y\. Qi, E\. W\. Boyer, C\. R\. Mello, L\. Guo, and H\. Li\(2025\)Spatiotemporal variability of soil moisture and its influencing factors in a forested catchment with complex terrain\.Catena256,pp\. 109079\.Cited by:[§1](https://arxiv.org/html/2606.18316#S1.p1.1),[§1](https://arxiv.org/html/2606.18316#S1.p3.1),[§2\.1](https://arxiv.org/html/2606.18316#S2.SS1.p1.1)\.
- \[24\]K\. Liu, Y\. Zhao, Q\. Lu, Q\. Liu, and J\. Sun\(2026\)Mois\-efficientnet: deep learning\-based 3d mapping of root zone soil moisture from gpr data\.Journal of Geophysics and Engineering,pp\. gxag007\.Cited by:[§3\.4](https://arxiv.org/html/2606.18316#S3.SS4.p2.2),[Table 1](https://arxiv.org/html/2606.18316#S4.T1.1.23.22.1.1.1)\.
- \[25\]M\. Liu, C\. Huang, L\. Wang, Y\. Zhang, and X\. Luo\(2020\)Short\-term soil moisture forecasting via gaussian process regression with sample selection\.Water12\(11\),pp\. 3085\.Cited by:[§3\.5](https://arxiv.org/html/2606.18316#S3.SS5.p1.1),[Table 1](https://arxiv.org/html/2606.18316#S4.T1.1.28.27.1.1.1)\.
- \[26\]Y\. Liu, Y\. Zha, G\. Ran, Y\. Zhang, and L\. Shi\(2025\)SMRFR: a global multilayer soil moisture dataset generated using random forest from multi\-source data\.Scientific Data12\(1\),pp\. 1170\.Cited by:[§2\.1](https://arxiv.org/html/2606.18316#S2.SS1.p1.1),[§2\.2](https://arxiv.org/html/2606.18316#S2.SS2.p2.1),[§2\.4](https://arxiv.org/html/2606.18316#S2.SS4.p1.1),[§3\.3](https://arxiv.org/html/2606.18316#S3.SS3.p2.1),[Table 1](https://arxiv.org/html/2606.18316#S4.T1.1.9.8.1.1.1)\.
- \[27\]J\. Martínez\-Fernández and A\. Ceballos\(2003\)Temporal stability of soil moisture in a large\-field experiment in spain\.Soil Science Society of America Journal67\(6\),pp\. 1647–1656\.Cited by:[§1](https://arxiv.org/html/2606.18316#S1.p2.1)\.
- \[28\]S\. Mondal and A\. Mishra\(2024\)Quantifying the precipitation, evapotranspiration, and soil moisture network’s interaction over global land surface hydrological cycle\.Water Resources Research60\(2\),pp\. e2023WR034861\.Cited by:[§1](https://arxiv.org/html/2606.18316#S1.p1.1)\.
- \[29\]I\. Mumtaz, R\. Niaz, Z\. Sajid, A\. Q\. Alameri, Z\. Ali, and K\. A\. Gepreel\(2025\)Utilising machine learning classification models for meteorological drought monitoring and analysis\.All Earth37\(1\),pp\. 1–21\.Cited by:[§3\.3](https://arxiv.org/html/2606.18316#S3.SS3.p4.1),[Table 1](https://arxiv.org/html/2606.18316#S4.T1.1.17.16.1.1.1)\.
- \[30\]M\. N\. Navidi, J\. Seyedmohammadi, and S\. A\. Seyed Jalali\(2022\)Predicting soil water content using support vector machines improved by meta\-heuristic algorithms and remotely sensed data\.Geomechanics and Geoengineering17\(3\),pp\. 712–726\.Cited by:[§2\.1](https://arxiv.org/html/2606.18316#S2.SS1.p1.1),[§3\.3](https://arxiv.org/html/2606.18316#S3.SS3.p4.1),[Table 1](https://arxiv.org/html/2606.18316#S4.T1.1.16.15.1.1.1)\.
- \[31\]T\. T\. Nguyen, H\. H\. Ngo, W\. Guo, S\. W\. Chang, D\. D\. Nguyen, C\. T\. Nguyen, J\. Zhang, S\. Liang, X\. T\. Bui, and N\. B\. Hoang\(2022\)A low\-cost approach for soil moisture prediction using multi\-sensor data and machine learning algorithm\.Science of the Total Environment833,pp\. 155066\.Cited by:[§2\.1](https://arxiv.org/html/2606.18316#S2.SS1.p1.1),[§3\.3](https://arxiv.org/html/2606.18316#S3.SS3.p3.1),[Table 1](https://arxiv.org/html/2606.18316#S4.T1.1.12.11.1.1.1)\.
- \[32\]J\. Ondieki, G\. Laneve, M\. Marsella, and C\. Mito\(2023\)Enhancing surface soil moisture estimation through integration of artificial neural networks machine learning and fusion of meteorological, sentinel\-1a and sentinel\-2a satellite data\.Advances in Remote Sensing12\(04\),pp\. 99–122\.Cited by:[§1](https://arxiv.org/html/2606.18316#S1.p3.1),[§2\.1](https://arxiv.org/html/2606.18316#S2.SS1.p1.1)\.
- \[33\]Z\. Pan, L\. Xu, and N\. Chen\(2025\)Combining graph neural network and convolutional lstm network for multistep soil moisture spatiotemporal prediction\.Journal of Hydrology651,pp\. 132572\.Cited by:[§3\.4](https://arxiv.org/html/2606.18316#S3.SS4.p1.1),[Table 1](https://arxiv.org/html/2606.18316#S4.T1.1.19.18.1.1.1)\.
- \[34\]V\. Pandey and P\. K\. Pandey\(2010\)Spatial and temporal variability of soil moisture\.International Journal of Geosciences1\(2\),pp\. 87\.Cited by:[§1](https://arxiv.org/html/2606.18316#S1.p2.1)\.
- \[35\]A\. J\. H\. Rash, L\. Khodakarami, D\. A\. Muhedin, M\. I\. Hamakareem, and H\. F\. H\. Ali\(2024\)Spatial modeling of geotechnical soil parameters: integrating ground\-based data, rs technique, spatial statistics and gwr model\.Journal of Engineering Research12\(1\),pp\. 75–85\.Cited by:[§3\.2](https://arxiv.org/html/2606.18316#S3.SS2.p2.2),[Table 1](https://arxiv.org/html/2606.18316#S4.T1.1.7.6.1.1.1)\.
- \[36\]D\. R\. Roberts et al\.\(2017\)Cross\-validation strategies for data with temporal, spatial, hierarchical, or phylogenetic structure\.Ecography40\(8\),pp\. 913–929\.Cited by:[§2\.4](https://arxiv.org/html/2606.18316#S2.SS4.p1.1)\.
- \[37\]E\. E\. Salakpi, P\. D\. Hurley, J\. M\. Muthoka, A\. Bowell, S\. Oliver, and P\. Rowhani\(2022\)A dynamic hierarchical bayesian approach for forecasting vegetation condition\.Natural Hazards and Earth System Sciences22\(8\),pp\. 2725–2749\.Cited by:[§2\.3](https://arxiv.org/html/2606.18316#S2.SS3.p1.1),[§2\.4](https://arxiv.org/html/2606.18316#S2.SS4.p1.1),[§3\.5](https://arxiv.org/html/2606.18316#S3.SS5.p1.1),[Table 1](https://arxiv.org/html/2606.18316#S4.T1.1.27.26.1.1.1)\.
- \[38\]I\. P\. Senanayake et al\.\(2024\)Spatial downscaling of satellite\-based soil moisture products using machine learning techniques: a review\.Remote Sensing16\(12\),pp\. 2067\.Cited by:[§2\.2](https://arxiv.org/html/2606.18316#S2.SS2.p2.1),[§2](https://arxiv.org/html/2606.18316#S2.p1.1)\.
- \[39\]P\. Settu and M\. Ramaiah\(2025\)A data driven comparison of hybrid machine learning techniques for soil moisture modeling using remote sensing imagery\.Scientific Reports15\(1\),pp\. 43170\.Cited by:[§1](https://arxiv.org/html/2606.18316#S1.p3.1)\.
- \[40\]A\. Singh, K\. Gaurav, G\. K\. Sonkar, and C\. Lee\(2023\)Strategies to measure soil moisture using traditional methods, automated sensors, remote sensing, and machine learning techniques: review, bibliometric analysis, applications, research findings, and future directions\.Ieee Access11,pp\. 13605–13635\.Cited by:[§1](https://arxiv.org/html/2606.18316#S1.p3.1)\.
- \[41\]K\. Soltani et al\.\(2024\)Enhancing spatial resolution of satellite soil moisture data through stacking ensemble learning techniques\.Scientific Reports14\.Cited by:[§2\.2](https://arxiv.org/html/2606.18316#S2.SS2.p2.1)\.
- \[42\]P\. Somarathna, R\. Searle, and D\. W\. Gladish\(2021\)Mapping available soil water capacity in new south wales, australia using sparse data\-an inverse bayesian approach\.Geoderma Regional25,pp\. e00396\.Cited by:[§3\.5](https://arxiv.org/html/2606.18316#S3.SS5.p1.1),[Table 1](https://arxiv.org/html/2606.18316#S4.T1.1.26.25.1.1.1)\.
- \[43\]H\. Storm, T\. Heckelei, and K\. Baylis\(2024\)Probabilistic programming for embedding theory and quantifying uncertainty in econometric analysis\.European Review of Agricultural Economics51\(3\),pp\. 589–616\.Cited by:[§3\.1](https://arxiv.org/html/2606.18316#S3.SS1.p1.1)\.
- \[44\]C\. Tong, H\. Wang, R\. Magagi, K\. Goïta, and K\. Wang\(2021\)Spatial gap\-filling of smap soil moisture pixels over tibetan plateau via machine learning versus geostatistics\.IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing14,pp\. 9899–9912\.Cited by:[§2\.1](https://arxiv.org/html/2606.18316#S2.SS1.p1.1),[§2\.2](https://arxiv.org/html/2606.18316#S2.SS2.p2.1),[§3\.2](https://arxiv.org/html/2606.18316#S3.SS2.p1.1),[Table 1](https://arxiv.org/html/2606.18316#S4.T1.1.6.5.1.1.1)\.
- \[45\]Y\. Tramblay and P\. Quintana Seguí\(2022\)Estimating soil moisture conditions for drought monitoring with random forests and a simple soil moisture accounting scheme\.Natural Hazards and Earth System Sciences22\(4\),pp\. 1325–1334\.Cited by:[§1](https://arxiv.org/html/2606.18316#S1.p1.1),[§2\.3](https://arxiv.org/html/2606.18316#S2.SS3.p1.1),[§3\.3](https://arxiv.org/html/2606.18316#S3.SS3.p2.1),[Table 1](https://arxiv.org/html/2606.18316#S4.T1.1.10.9.1.1.1)\.
- \[46\]T\. Tunçay, P\. Alaboz, O\. Dengiz, and O\. Başkan\(2023\)Application of regression kriging and machine learning methods to estimate soil moisture constants in a semi\-arid terrestrial area\.Computers and Electronics in Agriculture212,pp\. 108118\.Cited by:[§2\.3](https://arxiv.org/html/2606.18316#S2.SS3.p1.1),[§2](https://arxiv.org/html/2606.18316#S2.p1.1)\.
- \[47\]B\. Usowicz, J\. Lipiec, M\. Łukowski, and J\. Słomiński\(2021\)Improvement of spatial interpolation of precipitation distribution using cokriging incorporating rain\-gauge and satellite \(smos\) soil moisture data\.Remote Sensing13\(5\),pp\. 1039\.Cited by:[§3\.2](https://arxiv.org/html/2606.18316#S3.SS2.p2.2),[Table 1](https://arxiv.org/html/2606.18316#S4.T1.1.8.7.1.1.1)\.
- \[48\]M\. Vahidi, S\. Shafian, and W\. H\. Frame\(2025\)Multi\-depth soil moisture estimation via 1d convolutional neural networks from drone\-mounted ground penetrating radar data\.Computers and Electronics in Agriculture232,pp\. 110104\.Cited by:[§2\.1](https://arxiv.org/html/2606.18316#S2.SS1.p1.1),[§2\.2](https://arxiv.org/html/2606.18316#S2.SS2.p2.1),[§3\.4](https://arxiv.org/html/2606.18316#S3.SS4.p2.2),[Table 1](https://arxiv.org/html/2606.18316#S4.T1.1.21.20.1.1.1)\.
- \[49\]N\. Vergopolan, N\. W\. Chaney, M\. Pan, J\. Sheffield, H\. E\. Beck, C\. R\. Ferguson, L\. Torres\-Rojas, S\. Sadri, and E\. F\. Wood\(2021\)SMAP\-hydroblocks, a 30\-m satellite\-based soil moisture dataset for the conterminous us\.Scientific data8\(1\),pp\. 264\.Cited by:[§1](https://arxiv.org/html/2606.18316#S1.p4.1),[§2\.1](https://arxiv.org/html/2606.18316#S2.SS1.p1.1),[§2\.2](https://arxiv.org/html/2606.18316#S2.SS2.p2.1),[§2](https://arxiv.org/html/2606.18316#S2.p1.1)\.
- \[50\]V\. Volodina and P\. Challenor\(2021\)The importance of uncertainty quantification in model reproducibility\.Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences379\(2197\)\.Cited by:[§3\.1](https://arxiv.org/html/2606.18316#S3.SS1.p1.1)\.
- \[51\]G\. Wang, H\. Su, L\. Mo, X\. Yi, and P\. Wu\(2024\)Forecasting of soil respiration time series via clustered arima\.Computers and Electronics in Agriculture225,pp\. 109315\.Cited by:[§3\.1](https://arxiv.org/html/2606.18316#S3.SS1.p2.1),[Table 1](https://arxiv.org/html/2606.18316#S4.T1.1.3.2.1.1.1)\.
- \[52\]Y\. Wang, L\. Shi, Y\. Hu, X\. Hu, W\. Song, and L\. Wang\(2024\)A comprehensive study of deep learning for soil moisture prediction\.Hydrology and Earth System Sciences28,pp\. 917–943\.External Links:[Document](https://dx.doi.org/10.5194/hess-28-917-2024)Cited by:[§2\.4](https://arxiv.org/html/2606.18316#S2.SS4.p1.1),[§3\.4](https://arxiv.org/html/2606.18316#S3.SS4.p3.1)\.
- \[53\]Y\. Wang and Y\. Zha\(2024\)Comparison of transformer, lstm and coupled algorithms for soil moisture prediction in shallow\-groundwater\-level areas with interpretability analysis\.Agricultural Water Management300,pp\. 109120\.External Links:[Document](https://dx.doi.org/10.1016/j.agwat.2024.109120)Cited by:[§3\.4](https://arxiv.org/html/2606.18316#S3.SS4.p3.1),[Table 1](https://arxiv.org/html/2606.18316#S4.T1.1.24.23.1.1.1)\.
- \[54\]Y\. Wang, L\. Shi, Y\. Hu, X\. Hu, W\. Song, and L\. Wang\(2023\)A comprehensive study of deep learning for soil moisture prediction\.Hydrology and Earth System Sciences Discussions2023,pp\. 1–38\.Cited by:[§1](https://arxiv.org/html/2606.18316#S1.p5.1),[§2\.2](https://arxiv.org/html/2606.18316#S2.SS2.p1.1),[§2\.2](https://arxiv.org/html/2606.18316#S2.SS2.p2.1),[§2](https://arxiv.org/html/2606.18316#S2.p1.1)\.
- \[55\]X\. Wen, J\. Wei, J\. Zhang, and J\. Yue\(2023\)Research on soil moisture prediction based on var\-arima model\.Highlights in Science, Engineering and Technology\.Cited by:[§3\.1](https://arxiv.org/html/2606.18316#S3.SS1.p2.1),[Table 1](https://arxiv.org/html/2606.18316#S4.T1.1.4.3.1.1.1)\.
- \[56\]Z\. Xue, Y\. Zhang, L\. Zhang, and H\. Li\(2022\)Ensemble learning embedded with gaussian process regression for soil moisture estimation: a case study of the continental us\.IEEE Transactions on Geoscience and Remote Sensing60,pp\. 1–17\.Cited by:[§3\.5](https://arxiv.org/html/2606.18316#S3.SS5.p1.1),[Table 1](https://arxiv.org/html/2606.18316#S4.T1.1.29.28.1.1.1)\.
- \[57\]L\. Zhang, Y\. Liu, L\. Ren, A\. J\. Teuling, X\. Zhang, S\. Jiang, X\. Yang, L\. Wei, F\. Zhong, and L\. Zheng\(2021\)Reconstruction of esa cci satellite\-derived soil moisture using an artificial neural network technology\.Science of the Total Environment782,pp\. 146602\.Cited by:[§2\.2](https://arxiv.org/html/2606.18316#S2.SS2.p2.1),[§3\.3](https://arxiv.org/html/2606.18316#S3.SS3.p3.1),[Table 1](https://arxiv.org/html/2606.18316#S4.T1.1.13.12.1.1.1)\.
- \[58\]M\. Zhang, D\. Zhang, Y\. Jin, X\. Wan, and Y\. Ge\(2025\)Evolution of soil moisture mapping from statistical models to integrated mechanistic and geoscience\-aware approaches\.Information Geography,pp\. 100005\.Cited by:[§1](https://arxiv.org/html/2606.18316#S1.p5.1)\.
- \[59\]X\. Zhang, B\. Ram, N\. Vullaganti, W\. Aderholdt, P\. Overby, and X\. Sun\(2025\)Soil moisture classification using hyperspectral imaging and deep learning models on ground robot vehicles\.Smart Agricultural Technology,pp\. 101413\.Cited by:[§2\.3](https://arxiv.org/html/2606.18316#S2.SS3.p1.1),[§2](https://arxiv.org/html/2606.18316#S2.p1.1)\.
- \[60\]X\. Zhang, B\. Ram, N\. Vullaganti, W\. Aderholdt, P\. Overby, and X\. Sun\(2025\)Soil moisture classification using hyperspectral imaging and deep learning models on ground robot vehicles\.Smart Agricultural Technology,pp\. 101413\.Cited by:[§3\.4](https://arxiv.org/html/2606.18316#S3.SS4.p2.2),[Table 1](https://arxiv.org/html/2606.18316#S4.T1.1.22.21.1.1.1)\.
- \[61\]X\. Zhou, H\. Lin, and Q\. Zhu\(2007\)Temporal stability of soil moisture spatial variability at two scales and its implication for optimal field monitoring\.Hydrology and Earth System Sciences Discussions4\(3\),pp\. 1185–1214\.Cited by:[§1](https://arxiv.org/html/2606.18316#S1.p1.1)\.
- \[62\]Z\. Zhou\(2021\)Machine learning\.Springer nature\.Cited by:[§3\.3](https://arxiv.org/html/2606.18316#S3.SS3.p1.1)\.