Last update: Jan. 8, 2025.
MUKUNOKI Daichi
- Name: MUKUNOKI Daichi (椋木 大地)
- Contact: daichi.mukunoki _at_ gmail.com
- ORCID: https://orcid.org/0000-0002-0051-6811
- researchmap: https://researchmap.jp/mukunoki
- ResearchGate: https://www.researchgate.net/profile/Daichi_Mukunoki
- Google Scholar: https://scholar.google.co.jp/citations?user=TnysP90AAAAJ
- LinkedIn: https://www.linkedin.com/in/daichi-mukunoki
Work Experience
- December 1, 2024 - present: Assistant Professor, Information Technology Center, Nagoya University
- April 1, 2024 - November 30, 2024: Temporary Technical Staff, Shibaura Institute of Technology
- November 1, 2023 - February 29, 2024: Sr. Software Engineer, Sony Interactive Entertainment
- November 1, 2021 - March 31, 2023: Visiting Researcher, Information Technology Center, University of Tokyo
- April 1, 2019 - October 31, 2023: Research Scientist, Large-scale Parallel Numerical Computing Technology Research Team, Research Division, RIKEN Center for Computational Science
- April 1, 2019 - March 31, 2021: Research Scientist, Architecture Development Team, Flagship 2020 Project, RIKEN Center for Computational Science
- April 1, 2018 - March 31, 2019: Visiting Researcher, Architecture Development Team, Flagship 2020 Project, RIKEN Center for Computational Science
- April 1, 2018 - March 31, 2019: Visiting Researcher, Large-scale Parallel Numerical Computing Technology Research Team, Research Division, RIKEN Center for Computational Science
- October 1, 2017 - March 31, 2019: Postdoctoral Research Fellow, Graduate School of Science, Tokyo Woman's Christian University
- October 1, 2017 - March 31, 2018: Visiting Researcher, Architecture Development Team, Flagship 2020 Project, RIKEN Advanced Institute of Computational Science
- October 1, 2017 - March 31, 2018: Visiting Researcher, Large-scale Parallel Numerical Computing Technology Research Team, Research Division, RIKEN Advanced Institute of Computational Science
- April 1, 2017 - September 30, 2017: Postdoctoral Researcher, Architecture Development Team, Flagship 2020 Project, RIKEN Advanced Institute of Computational Science
- April 1, 2016 - March 30, 2017: Postdoctoral Researcher, Co-design Team, Flagship 2020 Project, RIKEN Advanced Institute of Computational Science
- May 1, 2015 - March 31, 2016: Postdoctoral Researcher, Co-design Team, Exascale Supercomputer Project, RIKEN Advanced Institute of Computational Science
- June 1, 2014 - September 30, 2017: Postdoctoral Researcher, Large-scale Parallel Numerical Computing Technology Research Team, Research Division, RIKEN Advanced Institute of Computational Science
- December 1, 2013 - May 31, 2014: Research Fellow (PD), Japan Society for the Promotion of Science (at University of Tsukuba)
- April 1, 2013 - November 30, 2013: Research Fellow (DC2), Japan Society for the Promotion of Science (at University of Tsukuba)
Education
- April 2011 - November 2013, Graduate School of Systems and Information Engineering, University of Tsukuba (Doctor of Philosophy in Engineering, November 2013)
- April 2009 - March 2011, Graduate School of Systems and Information Engineering, University of Tsukuba (Master of Engineering, March 2011)
- April 2006 - March 2009, School of Library and Information Science, University of Tsukuba (Bachelor of Library and Information Science, March 2009)
- April 2001 - March 2006, Gifu National College of Technology (Associate's degree in Engineering, March 2006)
Research Interests
- High performance computing (HPC), parallel computing, GPGPU, computer arithmetic, extended-/reduced-precision, accurate computation, reproducible computation, performance optimization, auto-tuning
Computer Skills
- C/C++, CUDA, MPI, OpenMP, Python, LaTeX, HTML
Grants
- April 2022 - October 2023: Japan Society for the Promotion of Science (JSPS), Fund for the Promotion of Joint International Research (Fostering Joint International Research (A)), as the principal investigator
- April 2019 - March 2023: Japan Society for the Promotion of Science (JSPS), Grant-in-Aid for Young Scientists, as the principal investigator
- April 2019 - March 2023: Japan Society for the Promotion of Science (JSPS), Grant-in-Aid for Scientific Research (B), as a collaborator (the principal investigator is Toshiyuki Imamura)
- April 2016 - March 2018: Japan Society for the Promotion of Science (JSPS), Grant-in-Aid for Young Scientists (B), as the principal investigator
- April 2015 - March 2018: Japan Society for the Promotion of Science (JSPS), Grant-in-Aid for Scientific Research (B), as a collaborator (the principal investigator is Toshiyuki Imamura)
- April 2013 - March 2015: Japan Society for the Promotion of Science (JSPS), Grant-in-Aid for JSPS Fellows, as the principal investigator
Awards
- Best Paper Award, 16th IEEE International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC 2023), Dec. 2023 (Daichi Mukunoki, Masatoshi Kawai, Toshiyuki Imamura, Sparse Matrix-Vector Multiplication with Reduced-Precision Memory Accessor)
- Research Poster Award 2nd Place Winner, ISC High Performance 2022, Jun. 2022 (Daichi Mukunoki, Katsuhisa Ozaki, Takeshi Ogita, Toshiyuki Imamura, A Fast Infinite Precision Inner Product using Ozaki Scheme and Dot2, and Its Application to Reproducible Conjugate Gradient Solvers)
- RIKEN 2022 Research Incentive Award (Ohbu Award)
- Research Poster Award, ISC High Performance 2021, Jun. 2021 (Daichi Mukunoki, Katsuhisa Ozaki, Takeshi Ogita, Toshiyuki Imamura: Accurate Matrix Multiplication on Binary128 using Ozaki Scheme).
- Best Research Poster Award, Russian Supercomputing Days 2019 (RuSCDays 2019), Sep. 2019 (Daichi Mukunoki, Takeshi Ogita, Katsuhisa Ozaki: Accurate and Reproducible Linear Algebra Operations for Many-core Architectures).
- PRACE-ISC Research Poster Award 2017, ISC High Performance 2017, 2017 (Daichi Mukunoki, Toshiyuki Imamura, "Implementation & Evaluation of 2.5D Matrix Multiplication on K Computer").
- IPSJ Yamashita SIG Research Award, Information Processing Society of Japan, 2016.
- IPSJ Computer Science Research Award for Young Scientists, Information Processing Society of Japan, 2013.
- IPSJ SIGARC Young Researcher Award, IPSJ SIG-ARC, 2013.
Journal Papers
- Katsuhisa Ozaki, Daichi Mukunoki, Takeshi Ogita, Extension of accurate numerical algorithms for matrix multiplication based on error-free transformation, Japan Journal of Industrial and Applied Mathematics, Oct. 29, 2024.
- Kensuke Aihara, Katsuhisa Ozaki, Daichi Mukunoki, Mixed-precision conjugate gradient algorithm using the groupwise update strategy, Japan Journal of Industrial and Applied Mathematics, Volume 41, pp. 837-855, Feb. 6, 2024.
- Daichi Mukunoki, Takeshi Ogita, Performance and Energy Consumption of Accurate and Mixed-precision Linear Algebra Kernels on GPUs, Journal of Computational and Applied Mathematics, Vol. 372, p. 112701, Jul., 2020.
- 椋木大地, 高橋大介, GPUにおける3倍・4倍精度浮動小数点演算の実現と性能評価, 情報処理学会論文誌 コンピューティングシステム, Vol. 6, No. 1, pp. 66-77, 2013年1月31日 (in Japanese).
Peer-reviewed Conference Papers
- Ryunosuke Matsuzaki, Daichi Mukunoki, Takaaki Miyajima, Performance evaluation and modelling of single-precision matrix multiplication on Cerebras CS-2, 14th Workshop on Irregular Applications: Architectures and Algorithms, 2024 (short paper) (accepted).
- Stef Graillat, Fabienne Jézéquel, Théo Mary, Roméo Molina, Daichi Mukunoki, Reduced-Precision and Reduced-Exponent Formats for Adaptive-Precision Sparse Matrix-Vector Product, Proc. 30th International European Conference on Parallel and Distributed Computing (Euro-Par 2024), Lecture Notes in Computer Science, Vol. 14803, pp. 17-30, Aug. 26, 2024.
- Daichi Mukunoki, Masatoshi Kawai, Toshiyuki Imamura, Sparse Matrix-Vector Multiplication with Reduced-Precision Memory Accessor, Proc. 2023 IEEE 16th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC 2023), pp. 608-615, 2023.
- Daichi Mukunoki, Katsuhisa Ozaki, Takeshi Ogita, Toshiyuki Imamura, Infinite-precision Inner Product and Sparse Matrix Vector Multiplication using Ozaki Scheme with Dot2 on Many-core Processors, Proc. 14th International Conference on Parallel Processing and Applied Mathematics (PPAM 2022), Lecture Notes in Computer Science, Vol. 13826, pp. 40-54, 2023.
- Daichi Mukunoki, Yusuke Hirota, Toshiyuki Imamura, Task Scheduling Strategies for Batched Basic Linear Algebra Subprograms on Many-core CPUs, Proc. 2021 IEEE 14th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC 2021), pp. 234-241, 2021.
- Daichi Mukunoki, Katsuhisa Ozaki, Takeshi Ogita, Toshiyuki Imamura, Accurate Matrix Multiplication on Binary128 Format Accelerated by Ozaki Scheme, Proc. The 50th International Conference on Parallel Processing (ICPP-2021), No. 78, pp. 1-11, Aug. 9, 2021.
- Takeyuki Harayama, Shuhei Kudo, Daichi Mukunoki, Toshiyuki Imamura, Daisuke Takahashi, A rapid Euclidean norm calculation algorithm that reduces overflow and underflow, Proc. The 2021 International Conference on Computational Science and Its Applications (ICCSA 2021), Lecture Notes in Computer Science, Vol. 12949, pp. 95-110, Sep. 9, 2021.
- Jens Domke, Emil Vatai, Aleksandr Drozd, Peng Chen, Yosuke Oyama, Lingqi Zhang, Shweta Salaria, Daichi Mukunoki, Artur Podobas, Mohamed Wahib, Satoshi Matsuoka, Matrix Engines for High Performance Computing: A Paragon of Performance or Grasping at Straws?, Proc. 35th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2021), pp. 1056-1065, Jun. 28, 2021 (preprint: arXiv:2010.14373)
- Katsuhisa Ozaki, Takeshi Ogita, Daichi Mukunoki, Interval Matrix Multiplication using Fast Low-Precision Arithmetic on GPU, Proc. 9th International Workshop on Reliable Engineering Computing (REC2021), pp. 419-434, May 2021.
- Daichi Mukunoki, Katsuhisa Ozaki, Takeshi Ogita, Roman Iakymchuk, Conjugate Gradient Solvers with High Accuracy and Bit-wise Reproducibility between CPU and GPU using Ozaki scheme, Proc. The International Conference on High Performance Computing in Asia-Pacific Region (HPCAsia 2021), pp. 100-109, 2021 (preprint is also available: hal-02986873).
- Fabienne Jézéquel, Stef Graillat, Daichi Mukunoki, Toshiyuki Imamura, Roman Iakymchuk, Can we avoid rounding-error estimation in HPC codes and still get trustful results?, Proc. 13th International Workshop on Numerical Software Verification 2020 (NSV 20), Lecture Notes in Computer Science, Vol. 12549, pp. 163-177, Dec. 2020 (preprint: hal-02486753).
- Daichi Mukunoki, Katsuhisa Ozaki, Takeshi Ogita, Toshiyuki Imamura, DGEMM using Tensor Cores, and Its Accurate and Reproducible Versions, Proc. ISC High Performance 2020, Lecture Notes in Computer Science, Vol. 12151, pp. 230-248, Jun. 2020.
- Yiyu Tan, Toshiyuki Imamura, Daichi Mukunoki, Design of an FPGA-based Matrix Multiplier with Task Parallelism, Proc. International Conference on Parallel Computing (ParCo2019), Parallel Computing: Technology Trends, Vol. 36, pp. 241-250, 2020.
- Daichi Mukunoki, Takeshi Ogita, Katsuhisa Ozaki, Reproducible BLAS Routines with Tunable Accuracy Using Ozaki Scheme for Many-core Architectures, Proc. 13th International Conference on Parallel Processing and Applied Mathematics (PPAM 2019), Lecture Notes in Computer Science, Vol. 12043, pp. 516-527, Mar. 2020.
- Daichi Mukunoki, Toshiyuki Imamura, Performance Analysis of 2D-compatible 2.5D-PDGEMM on Knights Landing Cluster, Proc. International Conference on Computational Science (ICCS 2018), Lecture Notes in Computer Science, Vol. 10862, pp. 853-858, Jun. 2018.
- Daichi Mukunoki, Toshiyuki Imamura, Implementation and Performance Analysis of 2.5D-PDGEMM on the K Computer, Proc. 12th International Conference on Parallel Processing and Applied Mathematics (PPAM 2017), Lecture Notes in Computer Science, Vol. 10777, pp. 348-358, Mar. 2018.
- Toshiyuki Imamura, Daichi Mukunoki, Yusuke Hirota, Susumu Yamada, Masahiko Machida, Design Towards Modern High Performance LA Library Enabling Heterogeneity and Flexible Data Formats, Parallel Computing is Everywhere, Proc. International Conference on Parallel Computing (ParCo2017), Advances in Parallel Computing, pp. 97-106, Sep. 2017.
- Daichi Mukunoki, Toshiyuki Imamura, Daisuke Takahashi, Automatic Thread-Block Size Adjustment for Memory-Bound BLAS Kernels on GPUs, Proc. IEEE 10th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC-16), pp. 377-384, Sep. 2016.
- Daichi Mukunoki, Toshiyuki Imamura, Reduced-Precision Floating-Point Formats on GPUs for High Performance and Energy Efficient Computation, Proc. IEEE International Conference on Cluster Computing (Cluster 2016), pp. 144-145, Sep. 13, 2016 (extended abstract for poster presentation).
- Daichi Mukunoki, Toshiyuki Imamura, Daisuke Takahashi, Fast Implementation of General Matrix-Vector Multiplication (GEMV) on Kepler GPUs, Proc. 23rd Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP 2015), pp. 642-650, Mar. 2015.
- Daichi Mukunoki, Daisuke Takahashi, Using Quadruple Precision Arithmetic to Accelerate Krylov Subspace Methods on GPUs, Proc. 10th International Conference on Parallel Processing and Applied Mathematics (PPAM 2013), Part I, Workshop on Numerical Algorithms on Hybrid Architectures, Lecture Notes in Computer Science, Vol. 8384, pp. 632-642, May 2014.
- Daichi Mukunoki, Daisuke Takahashi, Optimization of Sparse Matrix-vector Multiplication for CRS Format on NVIDIA Kepler Architecture GPUs, Proc. 13th International Conference on Computational Science and Its Applications (ICCSA 2013), Part V, Lecture Notes in Computer Science, Vol. 7975, pp. 211-223, Jun. 2013.
- Daichi Mukunoki, Daisuke Takahashi, Implementation and Evaluation of Triple Precision BLAS Subroutines on GPUs, Proc. 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW 2012), The 13th Workshop on Parallel and Distributed Scientific and Engineering Computing (PDSEC-12), pp. 1378-1386, May 2012.
- Daichi Mukunoki, Daisuke Takahashi, Implementation and Evaluation of Quadruple Precision BLAS Functions on GPUs, Proc. 10th International Conference on Applied Parallel and Scientific Computing (PARA 2010), Part I, Lecture Notes in Computer Science, Vol. 7133, pp. 249-259, 2012.
- 椋木大地, 高橋大介, GPUによる4倍・8倍精度BLASの実装と評価, 2011年ハイパフォーマンスコンピューティングと計算科学シンポジウムHPCS2011論文集, pp. 148-156, 2011年1月 (in Japanese).
Non-peer-reviewed Papers
- 椋木大地,尾崎克久, Quasi Triple-Word Arithmeticによる6倍精度演算の疎行列反復解法への応用, 情報処理学会研究報告: ハイパフォーマンスコンピューティング, Vol. 2024-HPC-197, 2024-ARC-259, No. 11, pp. 1-7, 2024年12月 (in Japanese).
- Stef Graillat, Fabienne Jézéquel, Théo Mary, Roméo Molina, Daichi Mukunoki, Performance Evaluation of Adaptive-Precision SpMV with Reduced-Precision Formats, HAL, hal-04261073, Oct. 2023.
- 椋木大地, 尾崎克久, 荻田武史, 今村俊幸, 尾崎スキームによる無限精度内積と再現可能疎行列反復ソルバーへの応用, 日本応用数理学会2022年度年会講演予稿集, Sep. 10, 2022 (in Japanese).
- 椋木大地, 廣田悠輔, 今村俊幸, CPUにおけるbatched BLASのためのタスクスケジューリング戦略, 日本応用数理学会2021年度年会講演予稿集, Sep. 7, 2021 (in Japanese).
- 原山赳幸, 工藤周平, 椋木大地, 今村俊幸, 高橋大介, オーバー・アンダーフローを抑えた高精度かつ高速な2ノルム計算手法, 情報処理学会研究報告: ハイパフォーマンスコンピューティング, Vol. 2020-HPC-177, No. 8, pp. 1-9, 2020年12月 (in Japanese).
- 椋木大地, 尾崎克久, 荻田武史, 尾崎スキームを用いたbinary128による4倍精度行列積, 日本応用数理学会2020年度年会講演予稿集, Sep. 10, 2020 (in Japanese).
- Roman Iakymchuk, Daichi Mukunoki, Artur Podobas, Fabienne Jézéquel, Toshiyuki Imamura, Norihisa Fujita, Jens Huthmann, Shuhei Kudo, Yiyu Tan, Jens Domke, Kai Torben Ohlhus, Takeshi Fukaya, Takeo Hoshi, Yuki Murakami, Maho Nakata, Takeshi Ogita, Kentaro Sano, Taisuke Boku, White Paper from Workshop on Large-scale Parallel Numerical Computing Technology (LSPANC 2020): HPC and Computer Arithmetic toward Minimal-Precision Computing, arXiv:2004.04628, hal-02536316, Apr. 2020.
- Toshiyuki Imamura, Daichi Mukunoki, Fabienne Jézéquel, Stef Graillat, Roman Iakymchuk, Numerical Reproducibility based on Minimal-Precision Validation, Computational Reproducibility at Exascale Workshop (CRE2019), in cooperation with SC19, Nov. 17, 2019.
- 椋木大地, 荻田武史, 尾崎克久, 今村俊幸, 尾崎スキームによる高精度かつ再現性のあるBLAS実装, 日本応用数理学会2019年年会講演予稿集, pp. 402-403, 2019年9月 (in Japanese).
- 椋木大地, 荻田武史, 尾崎克久, Level-3 BLASに基づく高精度行列積計算法による高精度かつ再現性のあるBLASルーチンの実装とその最適化, 情報処理学会研究報告: ハイパフォーマンスコンピューティング, Vol. 2018-HPC-166, No. 9, pp. 1-8, 2018年9月 (in Japanese).
- 椋木大地, 今村俊幸, 2.5次元アルゴリズムを用いた高性能PDGEMMの開発, 東京大学情報基盤センター スーパーコンピューティングニュース, Vol. 20, No. 4, pp. 31-36, 2018年7月 (in Japanese).
- 椋木大地, 今村俊幸, 京コンピュータにおける2.5次元アルゴリズムを用いた分散並列行列積の実装と評価, 情報処理学会研究報告: ハイパフォーマンスコンピューティング, Vol. 2017-HPC-159, No. 1, pp. 1-6, 2017年4月 (in Japanese).
- 森倉悠介, 椋木大地, 深谷猛, 山中脩也, 大石進一, 大規模並列計算機における連立1次方程式の精度保証付き数値計算に対する性能評価, 情報処理学会研究報告: ハイパフォーマンスコンピューティング, Vol. 2016-HPC-157, No. 1, pp. 1-7, 2016年12月 (in Japanese).
- 今村俊幸, 椋木大地, コンシューマレンジGPUに最適化した固有値ソルバーの実装と評価, 情報処理学会研究報告: ハイパフォーマンスコンピューティング, Vol. 2016-HPC-157, No. 7, pp. 1-9, 2016年12月 (in Japanese).
- 椋木大地, 今村俊幸, 短尺浮動小数点形式の検討, 情報処理学会研究報告: ハイパフォーマンスコンピューティング, Vol. 2015-HPC-152, No. 4, pp. 1-10, 2015年12月 (in Japanese).
- 佐々木信一, 菱沼利彰, 藤井昭宏, 田中輝雄, 椋木大地, 今村俊幸, 京・FX10における倍々精度演算の高速化, 情報処理学会研究報告, Vol. 2015-HPC-151, No. 15, pp. 1-7, 2015年9月 (in Japanese).
- 今村俊幸, 椋木大地, 山田進, 町田昌彦, SYMV・GEMVルーチン群のマルチGPU化とその評価, 情報処理学会研究報告: ハイパフォーマンスコンピューティング, Vol. 2015-HPC-151, No. 13, pp. 1-8, 2015年9月 (in Japanese).
- 佐々成正, 山田進, 町田昌彦, 椋木大地, 今村俊幸, FFTを使った時間発展問題における累積誤差, 応用数理学会2015年度年会講演論文集, 2015年9月 (in Japanese).
- 椋木大地, 今村俊幸, 高橋大介, NVIDIA GPUにおけるメモリ律速なBLASカーネルのスレッド数自動選択手法, 情報処理学会研究報告: ハイパフォーマンスコンピューティング, Vol. 2015-HPC-150, No. 13, pp. 1-13, 2015年7月 (in Japanese).
- 椋木大地, 今村俊幸, 高橋大介, NVIDIA GPUにおけるGEMVカーネルの自動チューニング, 計算工学講演会論文集, Vol. 20, E-2-1, 2015年6月 (in Japanese).
- 今村俊幸, 椋木大地, 山田進, 町田昌彦, CUDA-BLAS等の選択による最速GPU固有値ソルバーの性能評価, 情報処理学会研究報告: ハイパフォーマンスコンピューティング, Vol. 2015-HPC-148, No. 4, pp. 1-9, 2015年2月 (in Japanese).
- 椋木大地, 今村俊幸, MaxwellアーキテクチャGPUにおける疑似倍精度演算を用いたDGEMMの実装と評価, 情報処理学会研究報告: ハイパフォーマンスコンピューティング, Vol. 2014-HPC-147, No. 26, pp. 1-6, 2014年12月 (in Japanese).
- 今村俊幸, 椋木大地, 山田進, 町田昌彦, CUDA-xSYMVの実装と評価, 情報処理学会研究報告: ハイパフォーマンスコンピューティング, Vol. 2014-HPC-146, No. 14, pp. 1-12, 2014年10月 (in Japanese).
- 椋木大地, 高橋大介, GPUにおける4倍精度浮動小数点演算を用いたクリロフ部分空間法の高速化, 情報処理学会研究報告: ハイパフォーマンスコンピューティング, Vol. 2013-HPC-140, No. 35, pp. 1-7, 2013年7月 (in Japanese).
- 椋木大地, 高橋大介, GPUにおける高速なCRS形式疎行列ベクトル積の実装, 情報処理学会研究報告: ハイパフォーマンスコンピューティング, Vol. 2013-HPC-138, No. 5, pp. 1-7, 2013年2月 (in Japanese).
- 椋木大地, 高橋大介, GPUにおける4倍精度演算を用いた疎行列反復解法の実装と評価, 情報処理学会研究報告: ハイパフォーマンスコンピューティング, Vol. 2012-HPC-137 (2012-ARC-202), No. 37, pp. 1-8, 2012年12月 (in Japanese).
- 椋木大地, 高橋大介, GPUによる3倍精度浮動小数点演算の検討, 情報処理学会研究報告: ハイパフォーマンスコンピューティング, Vol. 2011-HPC-132 (2011-ARC-197), No. 23, pp. 1-9, 2011年11月 (in Japanese).
- 椋木大地, 高橋大介, GPUによる4倍精度BLASの実装と評価, 計算工学講演会論文集, Vol. 15, No. 2, pp. 891-894, 2010年5月 (in Japanese).
- 椋木大地, 高橋大介, GPUによる4倍精度BLASの実装と評価, 情報処理学会研究報告: ハイパフォーマンスコンピューティング, Vol. 2009-HPC-123 (2009-ARC-186), No. 13, pp. 1-6, 2009年11月 (in Japanese).
Peer-reviewed Poster Presentations
- Atsushi Suzuki, Daichi Mukunoki, Toshiyuki Imamura, tmBLAS: a Mixed Precision BLAS by C++ Template, ISC High Performance (ISC 2023), research poster session, May, 2023.
- Daichi Mukunoki, Katsuhisa Ozaki, Takeshi Ogita, Toshiyuki Imamura, A Fast Infinite Precision Inner Product using Ozaki Scheme and Dot2, and Its Application to Reproducible Conjugate Gradient Solvers, ISC High Performance (ISC 2022), research poster session, Jun. 1, 2022.
- Daichi Mukunoki, Katsuhisa Ozaki, Takeshi Ogita, Toshiyuki Imamura, Accurate Matrix Multiplication on Binary128 using Ozaki Scheme, ISC High Performance (ISC 2021), research poster session, Jun. 29, 2021.
- Roman Iakymchuk, Daichi Mukunoki, Takeshi Ogita, Katsuhisa Ozaki, Stef Graillat, Accurate and Reproducible Conjugate Gradient in Hybrid Parallel Environments, ISC High Performance (ISC 2021), Jun. 29, 2021.
- Daichi Mukunoki, Toshiyuki Imamura, Yiyu Tan, Atsushi Koshiba, Jens Huthmann, Kentaro Sano, Fabienne Jézéquel, Stef Graillat, Roman Iakymchuk, Norihisa Fujita, Taisuke Boku, Minimal-Precision Computing for High-Performance, Energy-Efficient, and Reliable Computations, SC19 research poster session, Nov. 19-21, 2019.
- Yusuke Hirota, Daichi Mukunoki, Toshiyuki Imamura, Automatic Generation of Full-Set Batched BLAS, ISC High Performance (ISC 2018), research poster session, Jun. 26, 2018.
- Daichi Mukunoki, Toshiyuki Imamura, Implementation and Evaluation of 2.5D Matrix Multiplication on K Computer, ISC High Performance (ISC 2017), research poster session, Jun. 20, 2017.
Poster Presentations
- Daichi Mukunoki, Atsushi Suzuki, Toshiyuki Imamura, Multiple and Mixed Precision BLAS with C++ Template, 5th R-CCS International Symposium, Feb. 6, 2023.
- Daichi Mukunoki, Katsuhisa Ozaki, Takeshi Ogita, Toshiyuki Imamura, Accurate Matrix Computations using Ozaki Scheme on CPUs and GPUs, The 30th Anniversary Symposium of the Center for Computational Sciences at the University of Tsukuba, Oct. 14, 2022.
- Daichi Mukunoki, Roman Iakymchuk, Fabienne Jézéquel, Katsuhisa Ozaki, Takeshi Ogita, Toshiyuki Imamura, Remedies for Reproducibility Issue in Conjugate Gradient Solvers, SparseDays2022, poster session, Jun. 20-22, 2022.
- Daichi Mukunoki, Katsuhisa Ozaki, Takeshi Ogita, Toshiyuki Imamura, Roman Iakymchuk, High-Precision, Accurate, and Reproducible Linear Algebra Operations using Ozaki Scheme, 3rd R-CCS International Symposium, Feb. 15, 2021.
- Toshiyuki Imamura, Daichi Mukunoki, Yiyu Tan, Atsushi Koshiba, Jens Huthmann, Kentaro Sano, Fabienne Jézéquel, Stef Graillat, Roman Iakymchuk, Norihisa Fujita, Taisuke Boku, Minimal-Precision Computing for High-Performance, Energy-Efficient, and Reliable Computations, 2nd R-CCS International Symposium, Feb. 17, 2020.
- Yiyu Tan, Toshiyuki Imamura, Daichi Mukunoki, An FPGA-based Matrix Multiplier with Task Parallelism, 2nd R-CCS International Symposium, Feb. 17, 2020.
- Daichi Mukunoki, Katsuhisa Ozaki, Takeshi Ogita, Toshiyuki Imamura, Accurate DGEMM using Tensor Cores, The International Conference on High Performance Computing in Asia-Pacific Region (HPCAsia 2020), Jan. 15-17, 2020.
- Roman Iakymchuk, Fabienne Jézéquel, Stef Graillat, Daichi Mukunoki, Toshiyuki Imamura, Yiyu Tan, Atsushi Koshiba, Jens Huthmann, Kentaro Sano, Norihisa Fujita, Taisuke Boku, Optimizing Precision for High-Performance, Robust, and Energy-Efficient Computations, The International Conference on High Performance Computing in Asia-Pacific Region (HPCAsia 2020), Jan. 15-17, 2020.
- Daichi Mukunoki, Toshiyuki Imamura, Yiyu Tan, Atsushi Koshiba, Jens Huthmann, Kentaro Sano, Fabienne Jézéquel, Stef Graillat, Roman Iakymchuk, Norihisa Fujita, Taisuke Boku, Minimal-Precision Computing for High-Performance, Energy-Efficient, and Reliable Computations, France-Japan-Germany trilateral workshop: Convergence of HPC and Data Science for Future Extreme Scale Intelligent Applications, Nov. 7, 2019.
- Yiyu Tan, Daichi Mukunoki, Toshiyuki Imamura, Norihisa Fujita, Taisuke Boku, Reduced and Extended-Precision Computations on FPGAs and GPUs, The 11th symposium on Discovery, Fusion, Creation of New Knowledge by Multidisciplinary Computational Sciences, University of Tsukuba, Oct. 15, 2019.
- Daichi Mukunoki, Takeshi Ogita, Katsuhisa Ozaki, Accurate and Reproducible Linear Algebra Operations for Many-core Architectures, Russian Supercomputing Days 2019 (RuSCDays 2019), Sep. 23 - 24, 2019.
- Daichi Mukunoki, Takeshi Ogita, Katsuhisa Ozaki, OzBLAS: Accurate and Reproducible BLAS Based on Ozaki Scheme, GPU Technology Conference (GTC 2019), Mar. 17-21, 2019.
- Toshiyuki Imamura, Yusuke Hirota, Daichi Mukunoki, Shuhei Kudo, Akiyoshi Kuroda, Naoki Sueyasu, Development of Scientific Numerical Libraries on post-K computer, 1st R-CCS International Symposium, Feb. 18-19, 2019.
- 荻田武史, 椋木大地, 尾崎克久, HPC分野における精度保証付き数値計算学の展開, 第3回CDMSI(ポスト「京」重点課題(7))シンポジウム, 2017年12月5日 (in Japanese).
- 椋木大地, 今村俊幸, 高橋大介, PascalアーキテクチャGPUにおける線形計算カーネルの実装技術の検討, GTC Japan 2016, 2016年10月5日 (in Japanese).
- 大井祥栄, 廣田悠輔, 椋木大地, 今村俊幸, KMATHLIB -High Performance and Scalable Numerical Library for the K Computer-, 応用数理学会2016年度年会, 2016年9月13日 (in Japanese).
- Daichi Mukunoki, Toshiyuki Imamura, Daisuke Takahashi, Introduction of Research Activities for GPU Computing at Large-scale Parallel Numerical Computing Technology Research Team on AICS, The 6th AICS International Symposium, Feb. 22, 2016.
- Yusuke Morikura, Daichi Mukunoki, Takeshi Fukaya, Naoya Yamanaka, Shin'ichi Oishi, Performance Evaluation of Verified Computation for Linear Systems on Parallel Computers, 2nd Annual Meeting on Advanced Computing System and Infrastructure (ACSI2016), Jan. 19, 2016.
- 大井祥栄, 廣田悠輔, 椋木大地, 今村俊幸, 京コンピュータ向け数値計算ライブラリ群KMATHLIBの実装, 応用数理学会2015年度年会, 2015年9月9日 (in Japanese).
- 椋木大地, 今村俊幸, 高橋大介, GPUにおけるスレッド数自動選択機能を持ったメモリ律速な線形計算カーネル群「MUBLAS」の実装と評価, GTC Japan 2015, 2015年9月18日 (in Japanese).
- Daichi Mukunoki, Toshiyuki Imamura, Daisuke Takahashi, High-Performance GEMV and SYMV with Auto-Tuning for Performance Stabilization on Multiple GPU Generations, GPU Technology Conference (GTC 2015), Mar. 17, 2015.
- 椋木大地, 今村俊幸, 高橋大介, Kepler・MaxwellアーキテクチャGPUにおける性能が行列形状に依存しない高速なGEMVの実装, Annual Meeting on Advanced Computing System and Infrastructure (ACSI) 2015論文集, 2015年1月26日 (extended abstract in conference proceedings) (in Japanese).
- 佐々木信一, 藤井昭宏, 田中輝雄, 椋木大地, 今村俊幸, スーパコンピュータ京における倍々精度演算の高速化, Annual Meeting on Advanced Computing System and Infrastructure (ACSI) 2015論文集, 2015年1月26日 (extended abstract in conference proceedings) (in Japanese).
- 今村俊幸, 椋木大地, 佐々成正, 山田進, 町田昌彦, 疑似四倍精度拡張数学パッケージQP-Pack, Annual Meeting on Advanced Computing System and Infrastructure (ACSI) 2015論文集, 2015年1月26日 (extended abstract in conference proceedings) (in Japanese).
- 椋木大地, 今村俊幸, 高橋大介, KeplerアーキテクチャGPUにおける高速なSGEMVの実装, GTC Japan 2014, 2014年7月16日 (in Japanese).
- Daichi Mukunoki, Daisuke Takahashi, Linear Algebra Operations using Quadruple-precision Arithmetic on GPU, GPU Technology Conference (GTC2014), Mar. 24, 2014.
- Daichi Mukunoki, Daisuke Takahashi, Performance Comparison of Double, Triple and Quadruple Precision Real and Complex BLAS Subroutines on GPUs, Proc. ATIP/A*CRC Workshop on Accelerator Technologies for High-Performance Computing: Does Asia Lead the Way? (ATIP/A*CRC Workshop '12), pp. 788-790, May 7, 2012 (extended abstract in conference proceedings).
Oral Presentations
- Daichi Mukunoki, Masatoshi Kawai, Toshiyuki Imamura, Reduced-Precision Data Representation on Sparse Matrix-Vector Multiplications, 10th International Congress on Industrial and Applied Mathematics (ICIAM 2023), Aug. 21, 2023.
- Toshiyuki Imamura, Daichi Mukunoki, Atsushi Suzuki, Multiple- and Mixed-Precision BLAS with C++ Template, 10th International Congress on Industrial and Applied Mathematics (ICIAM 2023), Aug. 24, 2023.
- 椋木大地, 河合直聡, 疎行列ベクトル積における低精度データ表現の導入について, 第14回 自動チューニング技術の現状と応用に関するシンポジウム(ATTA2022), Dec. 23, 2022 (in Japanese).
- Kensuke Aihara, Katsuhisa Ozaki, Daichi Mukunoki, A mixed-precision algorithm of the CG method using the group-wise update strategy, The 41st JSST Annual International Conference on Simulation Technology (JSST2022), online, Aug. 31-Sep. 2, 2022.
- Daichi Mukunoki, Katsuhisa Ozaki, Takeshi Ogita, Toshiyuki Imamura, Roman Iakymchuk, Impact and Contribution of Ozaki scheme in High Performance Computing, International Workshop on Reliable Computing and Computer-Assisted Proofs (ReCAP 2022), online, Mar. 15, 2022.
- 相原研輔, 尾崎克久, 椋木大地, Flying restart付きCG法に対する混合精度演算による近似解精度の向上, 日本応用数理学会第18回研究部会連合発表会, online, Mar. 9, 2022 (in Japanese).
- 尾崎克久, 椋木大地, 荻田武史, 行列積に対する試行型エラーフリー変換に対する誤差の対処法とその応用, 日本応用数理学会第18回研究部会連合発表会, online, Mar. 8, 2022 (in Japanese).
- Daichi Mukunoki, Yusuke Hirota, Toshiyuki Imamura, Performance Evaluation of Batched BLAS on A64FX, 4th R-CCS International Symposium (lightning talk), online, Feb. 7, 2022.
- 椋木大地, 精度自動チューニングに向けた基盤技術の検討, 第13回自動チューニング技術の現状と応用に関するシンポジウム (ATTA2021), online, Dec. 13, 2021 (in Japanese).
- Daichi Mukunoki, Katsuhisa Ozaki, Takeshi Ogita, Toshiyuki Imamura, DGEMM using Tensor Cores, SIAM Conference on Computational Science and Engineering (CSE21), online, Mar. 4, 2021.
- Fabienne Jézéquel, Stef Graillat, Daichi Mukunoki, Toshiyuki Imamura, Roman Iakymchuk, Fast rounding error estimation for compute-intensive operations using standard floating-point arithmetic, Rencontres Arithmétiques de l'Informatique Mathématique (RAIM), Paris, May 2021.
- 椋木大地, 尾崎克久, 荻田武史, binary128 に対する尾崎スキーム行列積, 第4回精度保証付き数値計算の実問題への応用研究集会 (NVR 2020), online, Nov. 28-28, 2020 (in Japanese).
- Roman Iakymchuk, Daichi Mukunoki, Conjugate Gradient Solvers with Accuracy and Reproducibility Guarantees in Hybrid Parallel Environments, Sparse Days Cerfacs, online, Nov. 24, 2020.
- Daichi Mukunoki, DGEMM using Tensor Cores and OzBLAS, 11th Joint Laboratory for Extreme Scale Computing (JLESC) Workshop, online, Sep. 8, 2020.
- Daichi Mukunoki, Minimal-Precision Computing for High-Performance, Energy-Efficient, and Reliable Computations, SIAM Conference on Parallel Processing for Scientific Computing (PP20), Seattle, Feb. 15, 2020.
- Daichi Mukunoki, Accurate BLAS implementations: OzBLAS and BLAS-DOT2, Workshop on Large-scale Parallel Numerical Computing Technology (LSPANC 2020 January), RIKEN R-CCS, Kobe, Jan. 30, 2020.
- Daichi Mukunoki, Minimal-Precision Computing for High-Performance, Energy-Efficient, and Reliable Computations, Sapporo Winter HPC Seminar 2020, Information Initiative Center, Hokkaido University, Jan. 24, 2020.
- Daichi Mukunoki, Takeshi Ogita, High-performance Implementations of Accurate Linear Algebra Kernels on GPUs, 3rd International Conference on Modern Mathematical Methods and High Performance Computing in Science & Technology (M3HPCST), Jan. 9-11, 2020.
- 椋木大地, 荻田武史, 尾崎克久, 尾崎スキームによる高精度BLAS実装「OzBLAS」とその応用, 第3回 精度保証付き数値計算の実問題への応用研究集会 (NVR 2019), 高松市, 2019年12月1日 (in Japanese).
- Daichi Mukunoki, Takeshi Ogita, Katsuhisa Ozaki, Accurate and Reproducible CG Method on GPUs, European Numerical Mathematics and Advanced Applications Conference 2019 (ENUMATH2019), Egmond aan Zee, Oct. 1, 2019.
- Daichi Mukunoki, High-Performance Implementations of Accurate and Reproducible BLAS Routines on GPUs, Workshop on Large-scale Parallel Numerical Computing Technology (LSPANC 2019 June), RIKEN R-CCS, Kobe, Jun. 7, 2019.
- 椋木大地, 尾崎スキームに基づく高精度かつ再現性のあるBLASルーチンの実装と自動チューニングの適用, 第22回AT研究会オープンアカデミックセッション(ATOS22), 東京大学情報基盤センター, 東京都, May 13, 2019 (in Japanese).
- 椋木大地, 荻田武史, 尾崎克久, 尾崎スキームによる高精度かつ再現性のあるBLASルーチンの実装と評価, 第2回 精度保証付き数値計算の実問題への応用研究集会 (NVR 2018), 広島市, Dec. 2, 2018 (in Japanese).
- Daichi Mukunoki, Takeshi Ogita, Katsuhisa Ozaki, High Performance Implementation of Reproducible BLAS Routines with Tunable Accuracy Using Ozaki Scheme, Computational Reproducibility at Exascale 2018 (CRE2018), in cooperation with SC18, Dallas, Nov. 11, 2018.
- Daichi Mukunoki, Takeshi Ogita, High Performance Implementation of Accurate Matrix Multiplications on GPUs, The 18th International Symposium on Scientific Computing, Computer Arithmetic, and Verified Numerical Computations (SCAN2018), The International Conference Center at Waseda University, Tokyo, Sep. 11, 2018.
- Roman Iakymchuk, Pedro Valero-Lara, Daichi Mukunoki, Accurate and cost-efficient triangular solve, The 18th International Symposium on Scientific Computing, Computer Arithmetic, and Verified Numerical Computations (SCAN2018), The International Conference Center at Waseda University, Tokyo, Sep. 11, 2018.
- Daichi Mukunoki, Roman Iakymchuk, Stef Graillat, Takeshi Ogita, High-performance implementations of reproducible and accurate matrix-multiplication, 10th International Workshop on Parallel Matrix Algorithms and Applications (PMAA18), ETH Zurich, Zurich, June 27, 2018.
- Daichi Mukunoki, Toshiyuki Imamura, Performance Analysis of 2.5D-PDGEMM on the K Computer, SIAM Conference on Parallel Processing for Scientific Computing (PP18), Waseda University, Tokyo, Mar. 8, 2018.
- 椋木大地, 次世代計算機のための数値計算ライブラリの実装技術, 日本応用数理学会三部会連携「応用数理セミナー」, 早稲田大学西早稲田キャンパス, 東京都, 2017年12月26日 (in Japanese).
- 椋木大地, 今村俊幸, Reduced-/Extended-precision BLASの実装方法の検討, Fifth Workshop on Large-scale Parallel Numerical Computing Technology (LSPANC 2017), RIKEN AICS, 神戸市, 2017年3月27日 (in Japanese).
- Daichi Mukunoki, Toshiyuki Imamura, Daisuke Takahashi, Implementation Techniques for High Performance BLAS Kernels on Modern GPUs, SIAM Conference on Computational Science and Engineering (CSE17), Hilton Atlanta, Atlanta, Feb. 28, 2017.
- Yusuke Morikura, Daichi Mukunoki, Takeshi Fukaya, Naoya Yamanaka, Shin’ichi Oishi, Performance Evaluation of Verified Computation for Linear Systems on Supercomputer, SIAM: East Asian Section Conference (EASIAM 2016), University of Macau, Macau, Jun. 20-22, 2016.
- Daichi Mukunoki, Toshiyuki Imamura, Daisuke Takahashi, Automatic Thread-Block Size Adjustment for Dense Matrix-Vector Multiplication on CUDA, 2016 Conference on Advanced Topics and Auto Tuning in High-Performance Scientific Computing (ATAT2016), Mathematics Research Center, National Taiwan University, Taipei, Feb. 19, 2016 (Invited).
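- Daichi Mukunoki, Daisuke Takahashi, Triple-Precision Operations and Quadruple-Precision Sparse Iterative Solvers on GPUs, The 3rd Multiple-Precision Computation Forum, Kogakuin University, Tokyo, Mar. 8, 2013 (in Japanese).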
- Daichi Mukunoki, Daisuke Takahashi, Iterative Method for Sparse Linear Systems using Quadruple Precision Operations on GPUs, SIAM Conference on Computational Science and Engineering (CSE13), The Westin Boston Waterfront, Boston, Massachusetts, Feb. 28, 2013.
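- Daichi Mukunoki, Daisuke Takahashi, Quadruple-Precision Matrix Computation on GPUs, 2011 "Kagoshima" Summer Workshop on Parallel/Distributed/Cooperative Processing (SWoPP Kagoshima 2011), Kagoshima Citizens Exchange Center, Kagoshima, Jul. 27, 2011 (in Japanese).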
Software
- OzBLAS: Accurate and Reproducible BLAS based on Ozaki scheme, http://www.math.twcu.ac.jp/ogita/post-k/results.html
- MUBLAS (as a demonstration of automatic thread-block size determination on CUDA kernels), https://www.r-ccs.riken.jp/labs/lpnctrt/en/projects/mublas/
- BLAS-DOT2: Higher-precision BLAS based on Dot2, http://www.math.twcu.ac.jp/ogita/post-k/results.html
- GEMM-TC: S/DGEMM using Tensor Cores, http://www.math.twcu.ac.jp/ogita/post-k/results.html
- Semi-ScaLAPACK-Compatible 2.5D-PxGEMM based on SUMMA (SC-SUMMA-25D), https://www.r-ccs.riken.jp/labs/lpnctrt/projects/25dpdgemm/
- Batched BLAS Generator, https://www.r-ccs.riken.jp/labs/lpnctrt/projects/batchedblas/
- RpFp (reduced precision memory accessor), https://www.r-ccs.riken.jp/labs/lpnctrt/projects/rpfp/
- etc.
Professional Activities
- Program Committee Member, The 15th International Conference on Parallel Processing & Applied Mathematics (PPAM 2024), 2024.
- Program Chair, Special Session: Performance Optimization and Auto-Tuning of Software on Multicore/Manycore Systems (POAT 2023) (in conjunction with MCSoC-2023), 2023.
- Program Committee Member, 2023 IEEE 16th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC-2023), 2023.
- Program Committee Member, The 24th IEEE International Workshop on Parallel and Distributed Scientific and Engineering Computing (PDSEC 2023) (in conjunction with IPDPS 2023), 2023.
- Mini-Symposium Organizer, Mini Symposium: Exploring Arithmetic and Data Representation Beyond the Standard in HPC (at ICIAM 2023), 2023.
- Program Committee Member, The 22nd International Conference on Computational Science (ICCS 2022), 2022.
- Program Chair, Special Session: Auto-Tuning for Multicore and GPU (ATMG2022) (in conjunction with MCSoC-2022), 2022.
- Program Committee Member, The 14th International Conference on Parallel Processing & Applied Mathematics (PPAM 2022), 2022.
- Program Committee Member (Algorithm track), 36th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2022), 2022.
- Publicity Chair, The International Conference on High Performance Computing in Asia-Pacific Region (HPC Asia 2022), 2022.
- Program Committee Member, IEEE 22nd International Workshop on Parallel and Distributed Scientific and Engineering Computing (PDSEC 2021) (in conjunction with IPDPS 2021), 2021.
- Secretary (Exchange Promotion Committee), Auto-Tuning Research Group (自動チューニング研究会), 2021-2023.
- Research Poster Committee Member, The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC20), 2020.
- Program Committee Member, The 21st IEEE International Workshop on Parallel and Distributed Scientific and Engineering Computing (PDSEC 2020) (in conjunction with IPDPS 2020), 2020.
- Program Committee Member, Workshop on Large-scale Parallel Numerical Computing Technology (LSPANC 2020 January), 2020.
- Editorial Committee Member, IPSJ Transactions on Advanced Computing Systems (情報処理学会論文誌コンピューティングシステム), 2020-2024.
- Program Committee Member, 2019 IEEE 13th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC-2019), 2019.
- Program Committee Member, The 20th IEEE International Workshop on Parallel and Distributed Scientific and Engineering Computing (PDSEC 2019) (in conjunction with IPDPS 2019), 2019.
- Program Committee Member, The 4th International Workshop on GPU Computing and AI (GCA'19) (in conjunction with CANDAR'19), 2019.
- Program Committee Member, The Fourteenth International Workshop on Automatic Performance Tuning (iWAPT2019) (in conjunction with IPDPS 2019), 2019.
- Program Committee Member, 2018 IEEE 12th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC-2018), 2018.
- Program Committee Member, The Third International Workshop on GPU Computing and AI (GCA'18) (in conjunction with CANDAR'18), 2018.
- Program Committee Member, The 19th IEEE International Workshop on Parallel and Distributed Scientific and Engineering Computing (PDSEC 2018) (in conjunction with IPDPS 2018), 2018.
- Program Committee Member, The Thirteenth International Workshop on Automatic Performance Tuning (iWAPT2018) (in conjunction with IPDPS 2018), 2018.
- Program Committee Member, Special Session: Auto-Tuning for Multicore and GPU (ATMG 2018) (in conjunction with MCSoC-2018), 2018.
- Mini-Symposium Organizer, Mini Symposium: Development of Numerical Computing Software on Emerging Computing Platforms (at SIAM PP 18), 2018.
- Program Committee Member, The 18th IEEE International Workshop on Parallel and Distributed Scientific and Engineering Computing (PDSEC 2017) (in conjunction with IPDPS 2017), 2017.
- Program Committee Member, The Second International Workshop on GPU Computing and AI (GCA'17) (in conjunction with CANDAR'17), 2017.
- Program Committee Member, The Twelfth International Workshop on Automatic Performance Tuning (iWAPT2017) (in conjunction with IPDPS 2017), 2017.
- Program Committee Member, Special Session: Auto-Tuning for Multicore and GPU (ATMG 2017) (in conjunction with MCSoC-17), 2017.
- Program Committee Member, The 17th IEEE International Workshop on Parallel and Distributed Scientific and Engineering Computing (PDSEC 2016) (in conjunction with IPDPS 2016), 2016.
- Program Committee Member, The First International Workshop on GPU Computing and Applications (GCA'16) (in conjunction with CANDAR'16), 2016.
- Program Committee Member, The 16th IEEE International Workshop on Parallel and Distributed Scientific and Engineering Computing (PDSEC 2015) (in conjunction with IPDPS 2015), 2015.
- Program Committee Member, The 15th IEEE International Workshop on Parallel and Distributed Scientific and Engineering Computing (PDSEC 2014) (in conjunction with IPDPS 2014), 2014.