This list shows the papers published, grants received, and awards won in the 2023-2024 academic year.

FY2023 / Apr. 2023 to Mar. 2024

  • 博士論文 / Doctoral Thesis
    • Tasneem Binti Sofri: Intelligent Hybrid Multi-Stage Feature Selection and Assessment for 5G Base Station Antenna Health Effect Detection ※ Dual Degree Student with Universiti Malaysia Perlis
    • Leow Chee Siang:Studies on Text Detection and Character Image Generation for Advanced Text Recognition
      Abstract: This doctoral thesis focuses on improving the accuracy of Optical Character Recognition (OCR) using deep learning techniques. The research is divided into three main topics: 1) enhancing OCR accuracy through data augmentation using deep learning-based generative models, 2) improving the recognition of narrow multi-line text, and 3) developing methods for multi-line text recognition. The thesis proposes a novel Y-Autoencoder (Y-AE) model for generating diverse character images and introduces a post-processing method to improve character recognition rates in existing deep learning models, particularly for characters with narrow line spacing. Additionally, the research addresses the limitations of conventional TrOCR systems by proposing a pre-processing technique for multi-line character recognition within TrOCR’s fixed-size input constraints. The thesis contributes to advancing text recognition, text detection, and multi-line text recognition using deep learning techniques.
    • 王 宇:A Study on Real-Time Automatic Speech Recognition System on Edge Devices
      Abstract: This doctoral thesis aims to develop speech recognition systems for deployment on edge devices and cloud platforms. For cloud-based systems, a toolkit named ExKaldi-RT was created to facilitate the integration of deep learning models with the Kaldi decoder, enabling high-accuracy real-time speech recognition. For edge devices, a lightweight end-to-end speech recognition model based on convolutional neural networks and an optimized beam search decoder were proposed, significantly reducing memory usage while maintaining high recognition accuracy. Additionally, a noise-robust voice activity detection (VAD) model was developed for edge devices, enhancing the overall performance of the speech recognition system in noisy environments. The research contributes to the advancement of speech recognition technology by providing tailored solutions for both cloud and edge devices, enabling efficient deployment and high-quality performance.
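    • 修士論文 / Master's Thesis
      • 何 英浩: Face Image Data Augmentation Method for Improving Fisheye Face Recognition
      • 北川智樹: Handwritten Character Generation Using Generative Models for Training Character Recognizers ※ Outstanding Presentation Award
      • 土橋晃弘: End-to-End Multilingual Speech Recognition Model Using Contrastive Learning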
      • Zhao Haifeng:Chinese Character Recognition Based on Swin Transformer-Encoder ※ Dual Degree Student with HDU
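    • 卒業論文 / Graduation Thesis
      • 白鳥雅也: Impression Data Collection and Analysis Toward Automatic Evaluation of Lecture Speech
      • 市川琢也: Grape Cultivation Management Application for Human-Robot Collaboration
      • 島津亮輔: Color Estimation of Shine Muscat Using a Deep Learning Model with a Self-Attention Mechanism ※ Outstanding Presentation Award
        Abstract: This study proposes using a Vision Transformer (ViT) to estimate grape ripeness from color. To verify the effectiveness of ViT, it was compared experimentally with a CNN, and ViT achieved higher accuracy. The effect of color-space conversion on accuracy was also examined. The results suggest that ViT is promising for grape ripeness estimation.
      • 藤本 蓮: Development of an Autonomous Navigation System for a Grape Cultivation Support Robot ※ Outstanding Presentation Award
      • 渡辺 蒼: Customer Service Response Generation Using Multiple Large Language Models for Customer Service Training ※ Outstanding Presentation Award
        Abstract: This study proposes ReConcile, a reasoning system that uses large language models (LLMs). ReConcile performs reasoning through discussion and consensus building among multiple LLMs. When the performance of ReConcile was compared with that of the individual LLMs GPT-4, Claude2, and Bard, ReConcile achieved higher accuracy than any single LLM. The results indicate that ReConcile is a promising method for making effective use of multiple LLMs.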
    • 雑誌論文 (Journal papers)
      • 西崎博光, “The Evolution So Far and Future Prospects of Generative AI,” IEICE Communications Society Magazine, No.68, Spring 2024 issue, pp.281-284, 1/Mar/2024. DOI:10.1587/bplus.17.281
      • Hui Fern Soon, Amiza Amir, Hiromitsu Nishizaki, Nik Adilah Hanin Zahri, Latifah Munirah Kamarudin and Saidatul Norlyana Azemi, “Evaluating Tree-based Ensemble Strategies for Imbalanced Network Attack Classification,” International Journal of Advanced Computer Science and Applications (IJACSA), Vol.15, No.1, Jan/2024. DOI:10.14569/IJACSA.2024.01501111, IF:1.162
        Abstract: With the continual evolution of cybersecurity threats, the development of effective intrusion detection systems is increasingly crucial and challenging. This study tackles these challenges by exploring imbalanced multiclass classification, a common situation in network intrusion datasets mirroring real-world scenarios. The paper aims to empirically assess the performance of diverse classification algorithms in managing imbalanced class distributions. Experiments were conducted using the UNSW-NB15 network intrusion detection benchmark dataset, comprising ten highly imbalanced classes. The evaluation includes basic, traditional algorithms like the Decision Tree, K-Nearest Neighbor, and Gaussian Naive Bayes, as well as advanced ensemble methods such as Gradient Boosted Decision Trees (GraBoost) and AdaBoost. Our findings reveal that the Decision Tree surpassed the Multi-Layer Perceptron, K-Nearest Neighbor, and Naive Bayes in terms of overall F1-score. Furthermore, thorough evaluations of nine tree-based ensemble algorithms were performed, showcasing their varying efficacy. Bagging, Random Forest, ExtraTrees, and XGBoost achieved the highest F1-scores. However, in individual class analysis, XGBoost demonstrated exceptional performance relative to the other algorithms. This is confirmed by achieving the highest F1-scores in eight out of the ten classes within the dataset. These results establish XGBoost as a predominant method for handling multiclass imbalance classification with Bagging being the closest feasible alternative, as Bagging gains an almost similar accuracy and F1-score as XGBoost.
      • Chee Siang Leow, Hideaki Yajima, Tomoki Kitagawa and Hiromitsu Nishizaki, “Single-line Text Detection in Multi-line Text with Narrow Spacing for Line-based Character Recognition,” IEICE Transactions on Information and Systems, Vol.E106-D, No.12, pp.2097-2106, 2023. DOI:10.1587/transinf.2023EDP7070, IF:0.834
        Abstract: Text detection is a crucial pre-processing step in optical character recognition (OCR) for the accurate recognition of text, including both fonts and handwritten characters, in documents. While current deep learning-based text detection tools can detect text regions with high accuracy, they often treat multiple lines of text as a single region. To perform line-based character recognition, it is necessary to divide the text into individual lines, which requires a line detection technique. This paper focuses on the development of a new approach to single-line detection in OCR that is based on the existing Character Region Awareness For Text detection (CRAFT) model and incorporates a deep neural network specialized in line segmentation. However, this new method may still detect multiple lines as a single text region when multi-line text with narrow spacing is present. To address this, we also introduce a post-processing algorithm to detect single text regions using the output of the single-line segmentation. Our proposed method successfully detects single lines, even in multi-line text with narrow line spacing, and hence improves the accuracy of OCR.
      • Yan San Woo, Prawit Buayai, Hiromitsu Nishizaki, Koji Makino, Latifah Munirah Kamarudin, Xiaoyang Mao, “End-to-end lightweight berry number prediction for supporting table grape cultivation”, Computers and Electronics in Agriculture, Volume 213, 108203, pp.1-15, 2023. DOI:10.1016/j.compag.2023.108203, IF:8.045
      • K. A. Azuddin, A. K. Junoh, A. Zakaria, M. T. A. Rahman, N. M. I. M. Nor, H. Nishizaki, Z. Latiffah, N. F. Azuddin, M. Z. Abdullah, T. P. Terna, “Supervised segmentation on fusarium macroconidia spore in microscopic images via analytical approaches,” Multimedia Tools and Applications, pp.1–16, Oct./2023, Springer, ISSN:1573-7721, DOI:10.1007/s11042-023-17008-y, IF:3.6
      • 西崎香苗, 田中博之, 深澤貴裕, 西崎博光, 池上仁志, 出江紳一, “Development of a Non-contact Motion Evaluation Tool Using Pose Estimation Technology,” 整形・災害外科, Vol.66, No.10, pp.1219-1226 (published Sept. 2023). DOI:10.18888/se.0000002708
      • Prawit Buayai, Kabin Yok-In, Daisuke Inoue, Hiromitsu Nishizaki, Koji Makino, Xiaoyang Mao, “Supporting table grape berry thinning with deep neural network and augmented reality technologies”, Computers and Electronics in Agriculture, Volume 213, 2023. DOI:10.1016/j.compag.2023.108194, IF:8.045
      • Takashi Minato, Ryuichiro Higashinaka, Kurima Sakai, Tomo Funayama, Hiromitsu Nishizaki, and Takayuki Nagai, “Design of a competition specifically for spoken dialogue with a humanoid robot”, Advanced Robotics, vol.37, no.21, pp.1349-1363, 2023. Taylor & Francis, DOI:10.1080/01691864.2023.2249530, IF:2.202
      • Yu Wang, Hiromitsu Nishizaki, “A Lightweight End-to-End Speech Recognition System on Embedded Devices,” IEICE Transactions on Information and Systems, Vol.E106-D, No.7, pp.1230-1239, 1st/July/2023. DOI:10.1587/transinf.2022EDP7221, IF:0.834
      • Tasneem Sofri, Hasliza A Rahim, Allan Melvin Andrew, Ping Jack Soh, Latifah Munirah Kamarudin, Hiromitsu Nishizaki, “Data Normalization Methods of Hybridized Multi-Stage Feature Selection Classification for 5G Base Station Antenna Health Effect Detection,” Journal of Advanced Research in Applied Sciences and Engineering Technology, vol.30, no.2, pp.133-140, 19/Apr/2023. DOI:10.37934/araset.30.2.133140
    • 国際会議論文 (Reviewed conference papers)
      • Chee Siang Leow, Ryosuke Shimazu, Tomoki Kitagawa, Hideaki Yajima, Prawit Buayai, Koji Makino, Xiaoyang Mao, Hiromitsu Nishizaki, “Estimation of Non-Invasive Grape Ripeness and Sweetness From Images Captured by a General-Purpose Camera,” Proceedings of the 2023 IEEE International Workshop on Metrology for Agriculture and Forestry, pp.295-300, 7th/Nov/2023, DOI:10.1109/MetroAgriFor58484.2023.10424087, Presented in Pisa, Italy
      • Shunsuke Fujisawa, Muhammad Faris Kamarudzaman, Prawit Buayai, Koji Makino, Hiromitsu Nishizaki, Xiaoyang Mao, “Image-Based Measurement of Grape Inflorescence Length for Automatic Inflorescence Trimming,” Proceedings of the 2023 IEEE International Workshop on Metrology for Agriculture and Forestry, pp.289-294, 7th/Nov/2023, DOI:10.1109/MetroAgriFor58484.2023.10424126, Presented in Pisa, Italy
      • Prawit Buayai, Yin Suan Tan, Muhammad Faris Bin Kamarudzaman, Koji Makino, Hiromitsu Nishizaki, Xiaoyang Mao, “Automating Grape Thinning: Predicting Robotic Arm End-Effector Positions Using Depth Sensing Technology and Neural Networks,” Proceedings of the 2023 IEEE International Workshop on Metrology for Agriculture and Forestry, pp.76-80, 6th/Nov/2023, DOI:10.1109/MetroAgriFor58484.2023.10424399, Presented in Pisa, Italy
      • Shuto Nakagomi, Yutaka Suzuki, Masayuki Morisawa, Hiromitsu Nishizaki, Takao Kubo, “Proposal of a Method for Evaluating Biological Responses During Swallowing Using the LF/HF Change Rate”, Proceedings of the 2023 IEEE 12th Global Conference on Consumer Electronics (GCCE 2023), pp. 341-342, DOI:10.1109/GCCE59613.2023.10315533, Presented on 11/Oct/2023, Nara, Japan.
      • Akihiro Dobashi, Chee Siang Leow, Hiromitsu Nishizaki, “Metric Learning Approach for End-To-End Multilingual Automatic Speech Recognition Model”, Proceedings of the 2023 IEEE 12th Global Conference on Consumer Electronics (GCCE 2023), pp. 446-450, DOI:10.1109/GCCE59613.2023.10315608, Presented on 11/Oct/2023, Nara, Japan.
      • Yinghao He, Chee Siang Leow, Hiromitsu Nishizaki, “Image Remapping Data Augmentation Approach for Improving Fisheye Face Recognition”, Proceedings of the 2023 IEEE 12th Global Conference on Consumer Electronics (GCCE 2023), pp. 742-746, DOI:10.1109/GCCE59613.2023.10315437, Presented on 11/Oct/2023, Nara, Japan.
      • Chee Siang Leow, Tomoki Kitagawa, Hideaki Yajima, Hiromitsu Nishizaki, “Data Augmentation With Automatically Generated Images for Character Classifier Model Training”, Proceedings of the 2023 IEEE 12th Global Conference on Consumer Electronics (GCCE 2023), pp. 845-849, DOI:10.1109/GCCE59613.2023.10315447, Presented on 12/Oct/2023, Nara, Japan.
      • Toki Sugiura and Hiromitsu Nishizaki, “Automatic Exploration of Optimal Data Processing Operations for Sound Data Augmentation Using Improved Differentiable Automatic Data Augmentation”, Proceedings of the 24th INTERSPEECH 2023, pp.5411-5415, 2023, DOI:10.21437/Interspeech.2023-202, Presented on 24/Aug/2023, Dublin, Ireland.
      • Soya Tsushima, Soichiro Iida, Hiromitsu Nishizaki, Takehito Utsuro and Junichi Hoshino, “Scenario-based Customer Service Training System with Honorific Exercise”, Proceedings of NICOGRAPH International 2023, pp.82-82, 2023, DOI:10.1109/NICOINT59725.2023.00022, Presented on 10th/June/2023, Hokkaido
      • Muhammad Husaini, Latifah Munirah Kamarudin, Hiromitsu Nishizaki, Intan Kartika Kamarudin, Muhammad Amin Ibrahim, Ammar Zakaria, Masahiro Toyoura, Xiaoyang Mao, “Non-contact breathing signal classification using Attention based CNN and XGBoost hybrid model”, Proceedings of the ERS/ESRS International Sleep and Breathing Conference 2023, DOI:10.1183/23120541.sleepandbreathing-2023.56, Presented on 21/Apr/2023, Prague, Czech Republic.
    • 口頭発表 (Domestic conference, not reviewed)
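      • レオ チーシャン, 北川智樹, 矢島英明, 西崎博光, “A Study on Improving Character Recognition Accuracy for Single-line and Multi-line Text Using Generated Character Images,” Proceedings of the 86th National Convention of IPSJ, no.2, 4U-07, pp.703-704, presented on 16/Mar/2024 (Kanagawa University, Yokohama) ※ Student Encouragement Award
      • 矢島英明, レオ チーシャン, 北川智樹, 西崎博光, “A Study on Text Region Detection in Images Using a Transformer Decoder,” Proceedings of the 86th National Convention of IPSJ, no.2, 4U-08, pp.705-706, presented on 16/Mar/2024 (Kanagawa University, Yokohama)
      • 北川智樹, レオ チーシャン, 矢島英明, 西崎博光, “Handwritten Character Generation Using Style Transfer for Training Character Recognition Models,” Proceedings of the 86th National Convention of IPSJ, no.2, 5T-09, pp.623-624, presented on 16/Mar/2024 (Kanagawa University, Yokohama) ※ Student Encouragement Award
      • Bong Tze Yaw, Leow Chee Siang, 丹沢勉, 牧野浩二, 西崎博光, “Suspicious Sound Detection System Running on a Small Microcontroller for a Fruit Theft Reporting Device,” Proceedings of the 24th SICE System Integration Division Annual Conference (SI2023), 1C4-09, pp., presented on 14/Dec/2023 (Toki Messe, Niigata) ※ Outstanding Lecture Award
      • 牧野浩二, 西崎博光, 茅暁陽, “Development of a Mobile Robot with a Shared Work Area for Supporting Shine Muscat Cultivation Using an Omnidirectional Camera and AI,” Proceedings of the 24th SICE System Integration Division Annual Conference (SI2023), 3C2-05, pp., presented on 16/Dec/2023 (Toki Messe, Niigata) ※ Outstanding Lecture Award
      • 牧野浩二, 高橋洋翔, Song Ziwei, Prawit Buayai, 西崎博光, 石田和義, 茅暁陽, “Development of an AI-based Onion Sorting Machine Designed for Use in Real Environments,” Proceedings of the 2024 IEICE General Conference, D-22-04, presented on 8/Mar/2024 (Hiroshima University, Higashi-Hiroshima)
      • 土橋晃弘, レオ チーシャン, 西崎博光, “Effect of Metric Learning in Training End-to-End Multilingual Speech Recognition Models,” Proceedings of the 2023 Autumn Meeting of the Acoustical Society of Japan, 3-Q-3, pp.xx-xx, presented on 29/Sept/2023 (Nagoya Institute of Technology)
      • 西崎博光, 雨宮達佳, レオ チーシャン, ブアヤイ プラウィット, 牧野浩二, 茅 暁陽, “Optimal Harvest Time Determination System Using a Color Estimation Model for a Shine Muscat Cultivation Support Robot,” Proceedings of the Robotics and Mechatronics Conference, paper no. 2A1-B03, pp. 2A1-B03(1)-(4), 30/Jun/2023 (Nagoya Congress Center)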
    • 外部資金 (Grant)
    • 書籍・雑誌記事 (Books, Magazines)
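      • 牧野浩二, 西崎博光, Leow Chee Siang, Prawit Buayai, 茅暁陽: “Determining the Harvest Time of Shine Muscat with AI,” Interface, pp. 218-219, January 2024 issue (on sale 25/Nov/2023), CQ Publishing
      • 西崎博光: “The Future of Natural Language Processing,” in “Upcoming Computer Technologies 555” (Part 1, Chapter 2: Artificial Intelligence), Interface, pp.52-56, September 2023 issue, CQ Publishing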
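    • 学外授業・セミナー (Outside lectures and seminars)
      • [Seminar instructor] 西崎博光, 牧野浩二: “AI and Data Utilization Specialist Training Course: Applied Python Programming,” 山梨県情報通信業協会 (Yamanashi Information and Communications Industry Association), 24/Nov/2023, 18:30-20:30, held at the University of Yamanashi
      • [Seminar instructor] 西崎博光, 牧野浩二: “AI and Data Utilization Specialist Training Course: Python Programming Basics,” 山梨県情報通信業協会 (Yamanashi Information and Communications Industry Association), 16/Nov/2023, 18:30-20:30, held at the University of Yamanashi
      • [Lecture] 西崎博光: tutorial lecture “Deep Learning Tutorial: From Basics to Applications,” IEICE Cross-field Research Association of Super-Intelligent Networking (RISING), RISING2023, 30/Oct/2023
      • [Seminar instructor] 西崎博光: “Practical AI Course: First Steps in Deep Learning with Python,” 20-21/Jul/2023, 10:30-16:30 (two-day event), Robot Innovation Week 2023, Nagoya Congress Center
      • [Seminar instructor] 西崎博光: IEICE Technical Committee on Network Systems Simulation School, “Deep Learning Hands-on,” Saturday 13/May/2023, 9:00-16:00, held online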
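    • 表彰・報道等 (Awards and media coverage)
      • 西崎博光, University of Yamanashi Outstanding Faculty Award for FY2022, 26/Sept/2023
      • Coverage of the “Akita Prefecture Onion Production Area Formation Consortium” project (21/Apr/2023): Akita Television, NHK, The Nikkei, The Japan Agricultural News, and others.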