Ahn, D., Lee, K., Park, S., et al. (2021). Will we trust what we don’t understand? Impact of model interpretability and outcome feedback on trust in AI. arXiv preprint, arXiv:2111.08222, pp. 1-10.
Devlin, J., Chang, M.-W., Lee, K., and Toutanova, K. (2019). BERT: Pre-training of deep bidirectional transformers for language understanding. Proceedings of NAACL-HLT 2019, pp. 4171-4186.
Dosovitskiy, A., Beyer, L., Kolesnikov, A., et al. (2021). An image is worth 16×16 words: Transformers for image recognition at scale. Proceedings of the International Conference on Learning Representations (ICLR), pp. 1-21.
Hori, C., Wichern, G., Chen, R., et al. (2017). Attention-based multimodal fusion for video description. Proceedings of the IEEE International Conference on Computer Vision (ICCV), pp. 4193-4202.
Huang, Y., Yang, H., and Wang, X. (2024). Rethinking attention weights as bidirectional coefficients. Journal of Data Science, 22(4), pp. 571-588.
Jain, S., and Wallace, B. C. (2019). Attention is not explanation. Proceedings of NAACL-HLT 2019, pp. 3543-3556.
Khaki, S., Wang, L., and Archontoulis, S. V. (2019). A CNN-RNN framework for crop yield prediction. Frontiers in Plant Science, 10, 1750.
Kong, W., Yin, L., Li, X., et al. (2025). Phenology description is all you need: Mapping unknown crop types by early-season phenological signature. ISPRS Journal of Photogrammetry and Remote Sensing, 184, pp. 441-457.
Li, Z., Gao, J., Zhu, C., and Deng, F. (2025). Short-term power load forecasting based on spatial-temporal dynamic graph and multi-scale Transformer. Journal of Computational Design and Engineering, 12(2), pp. 92-111.
Lin, F., Zhu, Y., Chen, Y., et al. (2023). MMST-ViT: Climate change-aware crop yield prediction via multi-modal spatial-temporal vision transformer. Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp. 5774-5784.
Maimaitijiang, M., Sagan, V., Sidike, P., Hartling, S., Esposito, F., and Fritschi, F. B. (2020). Soybean yield prediction from UAV using multimodal data fusion and deep learning. Remote Sensing of Environment, 237, 111599.
Park, K.-B., and Lee, J. Y. (2022). SwinE-Net: hybrid deep learning approach to novel polyp segmentation using convolutional neural network and Swin Transformer. Journal of Computational Design and Engineering, 9(2), pp. 616-632.
Rudrakar, S., and Rughani, P. (2023). The role of XAI in building trustworthy agricultural algorithms. International Journal of Precision Agricultural Aviation, 6(2), pp. 49-56.
Wang, W., Xie, E., Li, X., et al. (2021a). Pyramid Vision Transformer: A versatile backbone for dense prediction without convolutions. Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp. 568-578.
Wang, W., Xie, E., Li, X., et al. (2021b). Spatial-Reduction Attention (SRA) for high-resolution feature learning. Technical Report, pp. 1-12.
Yi, Z., Jia, L., and Chen, Q. (2020). Crop classification using multi-temporal Sentinel-2 data in the Shiyang River Basin of China. Remote Sensing, 12(24), 4052.
Zhang, J., Zhou, T., Lin, Z., et al. (2025). Demystifying the accuracy-interpretability trade-off. arXiv preprint, arXiv:2503.07914, pp. 1-12.