Spatial and correlation analyses, combining forest loss data with beehive placement data, were used to evaluate the relationship between forest degradation and beekeeping activity.
To identify optimal zones for beehive placement, Random Forest regression combined with SHAP (SHapley Additive exPlanations) was applied to determine the relative importance of the variables influencing honey yield. Grid-based spatial analysis highlighted unexplored regions suitable for hive expansion and for onboarding new farmers. Climate effects on honey production were forecast using SARIMAX time series models with 73–90% accuracy. Regularized regression models (Ridge and Lasso) identified significant climate variables and early warning signs for adaptive management, achieving ~90% accuracy. Demographic and transactional data were analyzed using spline regression and clustering models, which predicted honey yields with ~60% test accuracy and segmented smallholder farmers into four key profiles.
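
As an illustration of the grid-based step, the following is a minimal sketch, not the implementation used in this work. It assumes hive positions are available as (longitude, latitude) pairs and treats any grid cell containing no hives as "unexplored". The function names, the cell size, and the sample coordinates are all hypothetical, and suitability criteria (forest cover, climate, access) are not modeled here.

```python
from collections import Counter
from math import floor


def grid_cell(lon, lat, origin, cell_size):
    """Map a coordinate to integer (col, row) indices of a square grid.

    `origin` is the (lon, lat) of the grid's lower-left corner.
    """
    col = floor((lon - origin[0]) / cell_size)
    row = floor((lat - origin[1]) / cell_size)
    return (col, row)


def unexplored_cells(hives, origin, n_cols, n_rows, cell_size):
    """Return the grid cells that contain no hives, in row-major order.

    Assumption: "unexplored" simply means a hive count of zero.
    """
    counts = Counter(
        grid_cell(lon, lat, origin, cell_size) for lon, lat in hives
    )
    return [
        (col, row)
        for row in range(n_rows)
        for col in range(n_cols)
        if counts[(col, row)] == 0
    ]


# Hypothetical hive coordinates on a 2 x 2 grid of unit cells.
hives = [(0.2, 0.3), (0.7, 0.1), (1.4, 0.6), (0.1, 1.8)]
empty = unexplored_cells(hives, origin=(0.0, 0.0), n_cols=2, n_rows=2, cell_size=1.0)
print(empty)  # [(1, 1)]
```

In practice, the empty cells would then be filtered by whatever suitability layers are available before being proposed for hive expansion.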