Tamkang University Institutional Repository: Item 987654321/59898
    Please use this permanent URL to cite or link to this item: https://tkuir.lib.tku.edu.tw/dspace/handle/987654321/59898


    Title: Efficient Address Generation for Affine Subscripts in Data-Parallel Programs
    Authors: Shih, Kuei-ping; 石貴平; Sheu, Jang-ping; Chang, Chih-yung
    Contributor: Department of Computer Science and Information Engineering, Tamkang University
    Keywords: address generation; affine subscripts; data distribution; distributed-memory; multicomputers; data-parallel languages; multiple induction variables (MIVs); single program multiple data (SPMD)
    Date: 2000-09-01
    Uploaded: 2011-10-05 22:25:05 (UTC+8)
    Abstract: Address generation is an important and necessary phase of a parallelizing compiler that translates programs written in High Performance Fortran (HPF) into executable SPMD code. This paper presents an efficient compilation technique for generating the local memory access sequences of block-cyclically distributed array references with affine subscripts in data-parallel programs. When an array reference with an affine subscript appears within a two-nested loop, its memory accesses exhibit repetitive patterns at both the outer and the inner loop. We record the memory accesses of these repetitive patterns in tables. Based on these tables, we propose a new start-computation algorithm that computes the starting elements on a processor for each outer loop iteration. The complexity of constructing the tables is O(k+s2), where k is the distribution block size and s2 is the access stride of the inner loop. Once the tables are constructed, the starting element for each outer loop iteration can be generated in O(1) time. We also show that the outer loop pattern repeats every Pk/gcd(Pk, s1) iterations, where P is the number of processors and s1 is the access stride of the outer loop. Therefore, the total complexity of generating the local memory access sequences for a block-cyclically distributed array with an affine subscript in a two-nested loop is O(Pk/gcd(Pk, s1)+k+s2).
    Relation: The Journal of Supercomputing 17(2), pp.205-227
    DOI: 10.1023/A:1008190606079
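    Appears in Collections: [Department and Graduate Institute of Computer Science and Information Engineering] Journal Articles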
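
The quantities named in the abstract can be illustrated with a small brute-force sketch. It is not the table-based algorithm of the paper, which this record does not reproduce. It assumes a 0-based array distributed cyclic(k) over P processors, so that global index g lives on processor (g div k) mod P, and a subscript of the form offset + s1*i + s2*j with non-negative values. All function names and the `offset` parameter are hypothetical and chosen only for illustration.

```python
from math import gcd


def owner_and_local(g, P, k):
    """Map global index g to (processor, local address).

    Assumes a 0-based array distributed cyclic(k) over P processors.
    """
    course, pos_in_course = divmod(g, P * k)    # which P*k-wide cycle g falls in
    proc, pos_in_block = divmod(pos_in_course, k)
    return proc, course * k + pos_in_block


def outer_period(P, k, s1):
    """Number of outer iterations after which (s1 * i) mod (P*k) repeats."""
    return (P * k) // gcd(P * k, s1)


def start_positions(P, k, s1, offset, n_outer):
    """Position inside a P*k cycle of each outer iteration's first element."""
    return [(offset + s1 * i) % (P * k) for i in range(n_outer)]


def local_accesses(p, P, k, s1, s2, offset, n_outer, n_inner):
    """Brute force: local addresses touched on processor p, one list per outer iteration."""
    rows = []
    for i in range(n_outer):
        row = []
        for j in range(n_inner):
            proc, local = owner_and_local(offset + s1 * i + s2 * j, P, k)
            if proc == p:
                row.append(local)
        rows.append(row)
    return rows


if __name__ == "__main__":
    P, k, s1, s2 = 4, 4, 6, 3
    print("outer period:", outer_period(P, k, s1))
    print("start positions:", start_positions(P, k, s1, 0, 16))
    print("processor 1, first outer iteration:",
          local_accesses(1, P, k, s1, s2, 0, 1, 8))
```

With P = 4, k = 4 and s1 = 6, the start positions within a cycle repeat after 16/gcd(16, 6) = 8 outer iterations, which matches the Pk/gcd(Pk, s1) term in the abstract. The brute-force enumeration costs time proportional to the full iteration space and is useful only as a reference for checking a faster method.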

    Files in This Item:

    There are no files associated with this item.

    All items in the institutional repository are protected by original copyright.
