Institutional Repository
Technical University of Crete

FPGA implementation of a configurable cache/scratchpad memory with virtualized user-level RDMA capability

Pnevmatikatos Dionysios, Vassilis Papaefstathiou, George Nikiforos, George Kalokerinos, Xiaojun Yang

Simple Record


URI: http://purl.tuc.gr/dl/dias/73F41371-2CA9-4267-A400-CC209CC96951
Identifier: https://doi.org/10.1109/ICSAMOS.2009.5289226
Language: en
Extent: 7 pages
Title: FPGA implementation of a configurable cache/scratchpad memory with virtualized user-level RDMA capability
Creator: Pnevmatikatos Dionysios
Creator: Πνευματικατος Διονυσιος
Creator: Vassilis Papaefstathiou
Creator: George Nikiforos
Creator: George Kalokerinos
Creator: Xiaojun Yang
Publisher: Institute of Electrical and Electronics Engineers
Abstract: We report on the hardware implementation of a local memory system for individual processors inside future chip multiprocessors (CMP). It is intended to support both implicit communication, via caches, and explicit communication, via directly accessible local ("scratchpad") memories and remote DMA (RDMA). We provide run-time configurability of the SRAM blocks near each processor, so that part of them operates as 2nd-level (local) cache, while the rest operates as scratchpad. We also strive to merge the communication subsystems required by the cache and scratchpad into one integrated Network Interface (NI) and Cache Controller (CC), in order to economize on circuits. The processor communicates with the NI at user level, through virtualized command areas in scratchpad; through a similar mechanism, the NI also provides efficient support for synchronization, using two hardware primitives: counters and queues. We describe the block diagram, the hardware cost, and the latencies of our FPGA-based prototype implementation, which integrates four MicroBlaze processors, each with 64 KBytes of local SRAM, a crossbar NoC, and a DRAM controller on a Xilinx Virtex-5 FPGA. One-way, end-to-end, user-level communication completes within about 30 clock cycles for short transfer sizes.
Type: Conference Full Paper
License: http://creativecommons.org/licenses/by/4.0/
Date: 2015-10-19
Date of Publication: 2009
Subject: Network on chip technology
Subject: NoC technology
Subject: Networks on a chip
Bibliographic Citation: G. Kalokerinos, V. Papaefstathiou, G. Nikiforos, S. Kavadias, M. Katevenis, D. Pnevmatikatos, X. Yang, "FPGA implementation of a configurable cache/scratchpad memory with virtualized user-level RDMA capability," in 2009 Intern. Conf. on Embedded Comp. Systems: Architectures, Modeling and Simulation, pp. 149-156. doi:10.1109/ICSAMOS.2009.5289226