L1 Cache and TLB Enhancements to the RAMpage Memory Hierarchy




Philip Machanick¹ and Zunaid Patel²

¹ School of ITEE, University of Queensland

Brisbane, Qld 4072, Australia

philip@itee.uq.edu.au

² School of Computer Science, University of the Witwatersrand,

Johannesburg, Private Bag 3, 2050 Wits, South Africa

zunaid@cs.wits.ac.za

Abstract. The RAMpage hierarchy moves main memory up a level to replace the lowest-level cache by an equivalent-sized SRAM main memory, with a TLB caching page translations for that main memory. This paper illustrates how more aggressive components higher in the hierarchy increase the fraction of total execution time spent waiting for DRAM. For an instruction issue rate of 1 GHz, the simulated standard hierarchy waited for DRAM 10% of the time, increasing to 40% at an instruction issue rate of 8 GHz. For a larger L1 cache, the fraction of time waiting for DRAM was even higher. RAMpage with context switches on misses was able to hide almost all DRAM latency. A larger TLB was shown to increase the viable range of RAMpage SRAM page sizes.

1 Introduction

The RAMpage memory hierarchy moves main memory up a level to replace the lowest-level cache with an SRAM main memory, while DRAM becomes a first-level paging device. Previous work has shown that RAMpage represents an alternative, viable design in terms of hardware-software trade-offs [22] and that it scales better as the CPU-DRAM speed gap grows, particularly by virtue of being able to take context switches on misses [21].
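To make the effect of a growing CPU-DRAM speed gap concrete, the following is a minimal sketch of a toy model, not the simulation used in this paper. It assumes (hypothetically) that DRAM stall time is fixed in absolute terms while all other execution time shrinks in proportion to the issue rate, and it is calibrated only to the 10% figure at 1 GHz quoted in the abstract. The function name and its parameters are illustrative.

```python
def dram_wait_fraction(issue_rate_ghz, base_fraction=0.10, base_rate_ghz=1.0):
    """Fraction of run time spent waiting for DRAM in a toy model.

    Assumption (not from the paper): DRAM stall time stays constant in
    absolute terms, while all other time scales inversely with issue rate.
    """
    # Normalise total run time at the base issue rate to 1.0.
    dram_time = base_fraction
    other_time = (1.0 - base_fraction) * base_rate_ghz / issue_rate_ghz
    return dram_time / (dram_time + other_time)


if __name__ == "__main__":
    for rate in (1, 2, 4, 8):
        print(f"{rate} GHz: {dram_wait_fraction(rate):.0%} of time waiting for DRAM")
```

Under these assumptions the fraction rises steadily with issue rate and reaches 8/17 (roughly 47%) at 8 GHz. The simulated standard hierarchy reported in the abstract reaches 40%, so the toy model is only indicative of the trend, not a substitute for the measurements.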

In previous work, it was hypothesized that RAMpage would be more competitive across a wider range of SRAM page sizes (equivalent to line size of the lowest-level cache) with a more aggressive TLB. Secondly, it was hypothesized that a more aggressive L1 cache would emphasize differences in lower levels of the hierarchy. In this paper, we report on investigation of both hypotheses as separate issues. Improving the TLB and L1 has different effects on performance. The intent in presenting both in the same paper is to add several data points to our case for RAMpage.

In some studies, TLB misses have accounted for as much as 40% of run time [13], with figures in the region of 20–30% common [6,23]. RAMpage has the potential to reduce the significance of the TLB on performance for two reasons. Firstly, unless the reference which causes a TLB miss would also miss in the