2025年度 / Apr. 2025 〜 Mar. 2026 の研究に関わる業績(研究発表・論文・外部資金・受賞等)です。
This page lists research-related achievements (presentations, papers, external grants, awards, etc.) for the 2025 academic year (Apr. 2025 – Mar. 2026).

  • 修士論文 / Master’s Thesis
  • 卒業論文 / Bachelor’s Thesis
  • 雑誌論文 (Journal articles): 4 papers
      • Chee Siang Leow, Tomoki Kitagawa, Hideaki Yajima, Hiromitsu Nishizaki, “Handwritten Character Image Generation for Effective Data Augmentation,” IEICE Transactions on Information and Systems, Vol. E108-D, No. 8, pp.1-10, Aug. 2025, DOI: 10.1587/transinf.2024EDP7201, IF: 0.6
        Abstract: This study introduces data augmentation techniques to enhance training datasets for a Japanese handwritten character classification model, addressing the high cost of collecting extensive handwritten character data. A novel method is proposed to automatically generate a large-scale dataset of handwritten characters from a smaller dataset, utilizing a style transformation approach, particularly Adaptive Instance Normalization (AdaIN). Additionally, the study presents an innovative technique to improve character structural information by integrating features from the Contrastive Language-Image Pre-training (CLIP) text encoder. This approach enables the creation of diverse handwritten character images, including Kanji, by merging content and style elements. The effectiveness of our approach is demonstrated by evaluating a handwritten character classification model using an expanded dataset, which includes Japanese hiragana, katakana, and Kanji from the ETL Character Database. The character classification model’s macro F1 score improved from 0.9733 with the original dataset to 0.9861 with the dataset augmented by the proposed approach. This result indicates that the proposed character generation model can generate new character images not included in the original dataset and that these images effectively contribute to training the handwritten character classification model.
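        As a rough illustration of the AdaIN operation at the core of the proposed style transformation, the sketch below aligns content features to style statistics (a minimal PyTorch sketch, not the authors’ implementation; tensor shapes and the usage example are assumptions):
        ```python
        import torch

        def adain(content: torch.Tensor, style: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
            """Adaptive Instance Normalization: shift the channel-wise mean/std
            of the content features to match those of the style features.
            Both tensors are assumed to be (batch, channels, height, width)."""
            c_mean = content.mean(dim=(2, 3), keepdim=True)
            c_std = content.std(dim=(2, 3), keepdim=True) + eps  # eps avoids division by zero
            s_mean = style.mean(dim=(2, 3), keepdim=True)
            s_std = style.std(dim=(2, 3), keepdim=True)
            return s_std * (content - c_mean) / c_std + s_mean

        # Hypothetical usage: content features from a character image,
        # style features from a handwriting sample.
        content = torch.randn(1, 512, 16, 16)
        style = torch.randn(1, 512, 16, 16)
        stylized = adain(content, style)  # same shape as the content features
        ```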
      • Ryosuke Shimazu, Chee Siang Leow, Prawit Buayai, Xiaoyang Mao, Wan-Young Chung, Hiromitsu Nishizaki, “Non-invasive estimation of Shine Muscat grape color and sensory evaluation from standard camera images,” The Visual Computer, pp.1-16, May 2025, DOI: 10.1007/s00371-025-03925-6, IF: 3.0, (international co-authored paper)
        Abstract: This study proposes a non-invasive method to estimate both color and sensory attributes of Shine Muscat grapes from standard camera images. First, we focus on color estimation by integrating a Vision Transformer (ViT) feature extractor with interquartile range (IQR)-based outlier removal. Experimental results show that our approach achieves 97.2% accuracy, significantly outperforming Convolutional Neural Network (CNN) models. This improvement underscores the importance of capturing global contextual information to differentiate subtle color variations in grape ripeness. Second, we address human sensory evaluation by collecting questionnaire responses on 13 attributes (e.g., “Sweetness,” “Overall taste rating”), each rated on a five-point scale. Because these ratings tend to cluster around midrange values (labels “2,” “3,” and “4”), we initially limit the dataset to the extreme labels “1” (“lowest grade”) and “5” (“highest grade”) for binary classification. Three attributes—“Overall color,” “Sweetness,” and “Overall taste rating”—exhibit relatively high classification accuracies of 79.9%, 75.1%, and 75.7%, respectively. By contrast, the other 10 attributes reach only 50%–66%, suggesting that subjective variations and limited visual cues pose significant challenges. Overall, the proposed approach demonstrates the feasibility of an image-based system that integrates color estimation and sensory evaluation to support more objective, data-driven harvest timing decisions for Shine Muscat grapes.
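        The IQR-based outlier removal mentioned above follows the standard Tukey-fence rule; below is a minimal NumPy sketch (the function name and the per-patch-score setting are assumptions, not the paper’s code):
        ```python
        import numpy as np

        def iqr_filter(values: np.ndarray, k: float = 1.5) -> np.ndarray:
            """Keep only values inside [Q1 - k*IQR, Q3 + k*IQR] (Tukey fences)."""
            q1, q3 = np.percentile(values, [25, 75])
            iqr = q3 - q1
            lo, hi = q1 - k * iqr, q3 + k * iqr
            return values[(values >= lo) & (values <= hi)]

        # Hypothetical usage: average the surviving per-patch color scores
        # to obtain a robust image-level estimate.
        scores = np.array([3.1, 3.0, 2.9, 3.2, 9.7, 3.1])  # 9.7 is an outlier
        robust_mean = iqr_filter(scores).mean()
        print(robust_mean)
        ```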
      • Taoqi Bao, Jiangnan Ye, Zhankong Bao, Chee Siang Leow, Haoji Hu, Jianfeng Lu, Issei Fujishiro, Jiayi Xu, “L2H-NeRF: low- to high-frequency-guided NeRF for 3D reconstruction with a few input scenes,” The Visual Computer, pp.1-12, May 2025, DOI: 10.1007/s00371-025-03974-x, IF: 3.0, (international co-authored paper)
        Abstract: Nowadays, three-dimensional (3D) reconstruction techniques are becoming increasingly important in the fields of architecture, game development, movie production, and more. Due to common issues in the reconstruction process, such as perspective distortion and occlusion, traditional 3D reconstruction methods face significant challenges in achieving high-precision results, even when dense data are used as inputs. With the advent of neural radiance field (NeRF) technology, high-fidelity 3D reconstruction results are now possible. However, NeRF computation usually requires high computational resources, and ensuring the highest quality from only a few input views remains difficult. In this paper, we propose an innovative low- to high-frequency-guided NeRF (L2H-NeRF) framework that decomposes scene reconstruction into coarse and fine stages. In the first stage, a low-frequency enhancement network based on a vision transformer is proposed, which recovers the low-frequency, globally coherent geometric structure and restores dense depth via depth completion. In the second stage, a high-frequency enhancement network is incorporated, which compensates for high-frequency detail through robust feature alignment across adjacent views using a plug-and-play feature extraction and matching module. Experiments demonstrate that the proposed L2H-NeRF outperforms state-of-the-art methods in both the accuracy of the geometric structure and feature detail.
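        The coarse-to-fine design rests on separating low-frequency structure from high-frequency detail; the sketch below shows a generic Gaussian-based decomposition of that kind (not the paper’s enhancement networks; the filter width is an assumption):
        ```python
        import numpy as np
        from scipy.ndimage import gaussian_filter

        def frequency_split(image: np.ndarray, sigma: float = 3.0):
            """Split a grayscale image into a low-frequency part (blurred,
            coarse structure) and a high-frequency residual (fine detail).
            By construction, low + high reconstructs the input exactly."""
            low = gaussian_filter(image, sigma=sigma)
            high = image - low
            return low, high

        # Hypothetical usage on a random "image":
        img = np.random.rand(64, 64)
        low, high = frequency_split(img)
        assert np.allclose(low + high, img)
        ```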
      • Ziying Li, Haifeng Zhao, Hiromitsu Nishizaki, Chee Siang Leow, Xingfa Shen, “Chinese Character Recognition based on Swin Transformer-Encoder,” Digital Signal Processing, Vol. 161, No. C, Article 105080, pp.1-10, May 2025, DOI: 10.1016/j.dsp.2025.105080, IF: 2.9, (international co-authored paper)
        Abstract: Optical Character Recognition (OCR) technology, which converts printed or handwritten text into machine-readable text, holds significant application and research value in document digitization, information automation, and multilingual support. However, existing methods predominantly focus on English text recognition and often struggle with addressing the complexities of Chinese characters. This study proposes a Chinese text recognition model based on the Swin Transformer encoder, demonstrating its remarkable adaptability to Chinese character recognition. In the image preprocessing stage, we introduced an overlapping segmentation technique that enables the encoder to effectively capture the complex structural relationships between individual strokes in lengthy Chinese texts. Additionally, by incorporating a mapping layer between the encoder and decoder, we enhanced the Swin Transformer’s adaptability to small image scenarios, thereby improving its feasibility for Chinese text recognition tasks. Experimental results indicate that this model outperforms classical models such as CRNN and ASTER on handwritten and web-based datasets, validating its robustness and reliability.
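        The overlapping segmentation described in the preprocessing stage can be pictured as a sliding window whose stride is smaller than its width, so strokes that straddle a cut survive intact in a neighboring slice (a generic sketch; window and stride sizes are assumptions, not the paper’s settings):
        ```python
        import numpy as np

        def overlapping_slices(line_image: np.ndarray, width: int = 32, stride: int = 16):
            """Cut a text-line image of shape (height, total_width) into
            fixed-width vertical slices whose windows overlap (stride < width)."""
            _, total = line_image.shape
            starts = range(0, max(total - width, 0) + 1, stride)
            return [line_image[:, s:s + width] for s in starts]

        # Hypothetical usage: a 32x128 line image yields 7 overlapping 32x32 slices.
        line = np.random.rand(32, 128)
        slices = overlapping_slices(line)
        print(len(slices), slices[0].shape)  # 7 (32, 32)
        ```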
  • 国際会議論文 (Peer-reviewed international conference papers): 12 papers
  • 口頭発表 (Domestic conferences, non-refereed): 6 papers
  • 外部資金新規採択分 (External grants, newly awarded only)
      • 「人工知能技術を活用した言語バリアフリー授業の実現」 (Realization of Language Barrier-Free Classes Using Artificial Intelligence Technology), JSPS KAKENHI Grant-in-Aid for Scientific Research (A), Grant Number 25H00566, Apr. 2025 – Mar. 2029, Hiromitsu Nishizaki (Principal Investigator)
  • 書籍・雑誌記事 (Books and Magazine Articles)
  • 学外授業・セミナー (Off-campus Lectures and Seminars)
  • 表彰・報道等 (Awards and Media Coverage)