Multicore-optimized wavefront diamond blocking for optimizing stencil updates T Malas, G Hager, H Ltaief, H Stengel, G Wellein, D Keyes SIAM Journal on Scientific Computing 37 (4), C439-C464, 2015 | 98 | 2015 |
Deep learning at 15PF: supervised and semi-supervised classification for scientific data T Kurth, J Zhang, N Satish, E Racah, I Mitliagkas, MMA Patwary, T Malas, ... Proceedings of the International Conference for High Performance Computing …, 2017 | 97 | 2017 |
Applying the roofline performance model to the Intel Xeon Phi Knights Landing processor D Doerfler, J Deslippe, S Williams, L Oliker, B Cook, T Kurth, M Lobet, ... High Performance Computing: ISC High Performance 2016 International …, 2016 | 85 | 2016 |
Multidimensional intratile parallelization for memory-starved stencil computations TM Malas, G Hager, H Ltaief, DE Keyes ACM Transactions on Parallel Computing (TOPC) 4 (3), 1-32, 2017 | 51 | 2017 |
Evaluating and optimizing the NERSC workload on Knights Landing T Barnes, B Cook, J Deslippe, D Doerfler, B Friesen, Y He, T Kurth, ... 2016 7th International Workshop on Performance Modeling, Benchmarking and …, 2016 | 48 | 2016 |
Feature selection for recognizing handwritten Arabic letters GA Abandah, TM Malas Dirasat Engineering Sciences Journal 37 (2), 2010 | 32 | 2010 |
Toward optimal Arabic keyboard layout using genetic algorithm TM Malas, SS Taifour, GA Abandah Proc. 9th Int’l Middle Eastern Multiconf. on Simulation and Modeling (MESM …, 2008 | 28 | 2008 |
Optimization of an electromagnetics code with multicore wavefront diamond blocking and multi-dimensional intra-tile parallelization TM Malas, J Hornich, G Hager, H Ltaief, C Pflaum, DE Keyes 2016 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2016 | 18 | 2016 |
High-performance seismic modeling with finite-difference using spatial and temporal cache blocking V Etienne, T Tonellot, T Malas, H Ltaief, S Kortas, P Thierry, D Keyes Third EAGE Workshop on High Performance Computing for Upstream 2017 (1), 1-5, 2017 | 13 | 2017 |
Optimization of the sparse matrix-vector products of an IDR Krylov iterative solver in EMGeo for the Intel KNL manycore processor T Malas, T Kurth, J Deslippe International Conference on High Performance Computing, 378-389, 2016 | 12 | 2016 |
Towards energy efficiency and maximum computational intensity for stencil algorithms using wavefront diamond temporal blocking T Malas, G Hager, H Ltaief, D Keyes arXiv preprint arXiv:1410.5561, 2014 | 11 | 2014 |
Optimizing the performance of streaming numerical kernels on the IBM Blue Gene/P PowerPC 450 processor T Malas, AJ Ahmadia, J Brown, JA Gunnels, DE Keyes The International Journal of High Performance Computing Applications 27 (2 …, 2013 | 10 | 2013 |
Deep learning at 15PF: Supervised and semi-supervised classification for scientific data T Kurth, J Zhang, N Satish, E Racah, I Mitliagkas, MMA Patwary, T Malas, ... Proc. International Conference for High Performance Computing, Networking …, 2017 | 7 | 2017 |
Analyzing performance of selected NESAP applications on the Cori HPC system T Kurth, W Arndt, T Barnes, B Cook, J Deslippe, D Doerfler, B Friesen, ... High Performance Computing: ISC High Performance 2017 International …, 2017 | 3 | 2017 |
Tiling and asynchronous communication optimizations for stencil computations TMY Malas | 3 | 2015 |
Towards energy efficiency and maximum computational intensity for stencil algorithms using wavefront diamond temporal blocking TM Malas, G Hager, H Ltaief, DE Keyes arXiv preprint arXiv:1410.5561, 2014 | 3 | 2014 |
Optimization of finite-difference kernels on multi-core architectures for seismic applications V Etienne, T Tonellot, K Akbudak, H Ltaief, S Kortas, T Malas, P Thierry, ... Intel eXtreme Performance Users Group, 2018 | 2 | 2018 |
Optimizing Science Applications for the Cori, Knights Landing, System at NERSC J Deslippe, D Doerfler, B Cook, T Malas, S Williams, S Dosanjh New Frontiers in High Performance Computing and Big Data, 235-252, 2017 | | 2017 |
Towards Fast Reverse Time Migration Kernels using Multi-threaded Wavefront Diamond Tiling T Malas, G Hager, H Ltaief, D Keyes Second EAGE Workshop on High Performance Computing for Upstream 2015 (1), 1-5, 2015 | | 2015 |
Optimizing Stencil Computations: Multicore-optimized wavefront diamond blocking on Shared and Distributed Memory Systems T Malas, G Hager, H Ltaief, H Stengel, G Wellein, D Keyes | | 2014 |