Proactive Recovery for Long-Lasting Non-Volatile Memories

Author:
Morgul, Muhammed Ceylan, Electrical Engineering - School of Engineering and Applied Science, University of Virginia (ORCID: orcid.org/0000-0003-1484-6333)
Advisor:
Stan, Mircea, Electrical and Computer Engineering Department, University of Virginia
Abstract:

This research outlines an integrative approach to significantly enhance the endurance and sustainability of non-volatile memories, with a primary focus on flash memory and Solid-State Drives (SSDs). It is driven by the need to meet the stringent requirements of emerging applications, including processing-in-memory, remote IoT devices, and data centers, particularly for managing hot (frequently updated) data. To meet these requirements, we develop and apply novel device- and system-level techniques.

Our work has demonstrated that operating flash memory in high-temperature environments notably improves its reliability and extends its endurance, suggesting an innovative pathway for processing-intensive applications. This research intends to explore a dual-temperature SSD architecture that optimizes flash memory for both processing and storage, leveraging temperature differentials to maximize longevity.

Additionally, the implementation of Proactive Recovery, inspired by biological Circadian Rhythms, has shown promise in mitigating wear-out and extending memory lifespan. This method has also proven effective against Bias Temperature Instability (BTI) and Electromigration (EM) wear-out in electronics. It is especially beneficial for applications with high demand and durability requirements, such as data centers, as well as those needing sustainable memory solutions, such as remote IoT devices, and it leads to significant improvements in flash memory reliability and sustainability. We further improve sustainability by developing the Page Isolation technique, which enables the continued use of blocks that have passed their end-of-life and would otherwise be deemed unusable ("bad" blocks).

Building on these insights, the research will refine stress and recovery models through real-time simulation to accurately capture the timing and interdependence of Proactive Recovery processes; the refined models will then be integrated into SSD simulators. This effort seeks to align theoretical models with experimental data, offering a precise depiction of flash memory behavior under various conditions.

A central aspect of this research is the development of an advanced Flash Translation Layer (FTL) algorithm that integrates the Proactive Recovery and Page Isolation techniques to enhance the efficiency and endurance of SSD controllers. We have tested the system-level implications of these integrated approaches through simulation and benchmarking with the state-of-the-art SSD simulator MQSim, with a focus on reducing byte error rates, extending device longevity, and improving environmental sustainability.
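As a rough illustration of how an allocator might combine the two ideas, the following minimal sketch assumes that a block is rested for a fixed duration after every few erases (standing in for Proactive Recovery) and that individual failed pages are skipped instead of retiring the whole block (standing in for Page Isolation). The class names, parameters, and scheduling policy are hypothetical; this is not the FTL algorithm developed in the thesis, and it does not use the MQSim API.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    """Hypothetical per-block bookkeeping for a toy allocator."""
    block_id: int
    pages_per_block: int
    erase_count: int = 0
    resting_until: float = 0.0                    # block is unavailable before this time
    bad_pages: set = field(default_factory=set)   # pages isolated individually
    next_page: int = 0


class ToyAllocator:
    """Toy write allocator: rests blocks periodically and skips isolated pages."""

    def __init__(self, num_blocks, pages_per_block, rest_every, rest_duration):
        self.blocks = [Block(i, pages_per_block) for i in range(num_blocks)]
        self.rest_every = rest_every        # assumed policy: rest after this many erases
        self.rest_duration = rest_duration  # assumed fixed recovery time

    def isolate_page(self, block_id, page):
        # Mark one page unusable while keeping the rest of the block in service.
        self.blocks[block_id].bad_pages.add(page)

    def erase(self, block_id, now):
        b = self.blocks[block_id]
        b.erase_count += 1
        b.next_page = 0
        if b.erase_count % self.rest_every == 0:
            # Schedule a recovery period during which the block receives no writes.
            b.resting_until = now + self.rest_duration

    def allocate(self, now):
        """Return (block_id, page) for the next write, or None if nothing is free."""
        awake = [b for b in self.blocks if b.resting_until <= now]
        awake.sort(key=lambda b: b.erase_count)   # simple wear leveling
        for b in awake:
            while b.next_page < b.pages_per_block and b.next_page in b.bad_pages:
                b.next_page += 1                  # skip isolated pages
            if b.next_page < b.pages_per_block:
                page = b.next_page
                b.next_page += 1
                return (b.block_id, page)
        return None
```

In this sketch a resting block is simply invisible to the allocator until its recovery period ends, and an isolated page stays isolated across erases, so a worn block keeps contributing its remaining good pages.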

Furthermore, we investigate the feasibility of applying Proactive Recovery to other non-volatile memory technologies, such as ferroelectric capacitors used as memory elements, to demonstrate its broader applicability. The anticipated outcomes include a substantial extension of device endurance, improved environmental sustainability, and the establishment of a new paradigm for "everlasting" memory and storage technologies.

Degree:
PhD (Doctor of Philosophy)
Keywords:
Reliability, Sustainability, Flash Memory, Non-Volatile Memory, Solid State Drive, Proactive Recovery, Page Isolation
Sponsoring Agency:
Semiconductor Research Corporation (SRC) under the Center for Research on Intelligent Storage and Processing-in-memory (CRISP)
Language:
English
Rights:
All rights reserved (no additional license for public reuse)
Issued Date:
2025/04/23