A Guide for Achieving High Performance with Very Small Matrices on GPU: A Case Study of Batched LU and Cholesky Factorizations

Summary

Authors: Azzam Haidar, Ahmad Abdelfattah, Mawussi Zounon, Stanimire Tomov, Jack Dongarra

Journal title: IEEE Transactions on Parallel and Distributed Systems

Journal number: 29/5

Journal publisher: Institute of Electrical and Electronics Engineers

Published year: 2018

Published pages: 973-984

DOI identifier: 10.1109/TPDS.2017.2783929

ISSN: 1045-9219