Power Management in Ultra-low Power Systems

The evolving vision of the Internet-of-Things (IoT) will revolutionize applications such as remote health monitoring, home automation and remote surveillance. It has been projected that by 2025, there will be 1 trillion IoT devices influencing our daily lives. This will generate an enormous amount of data, which will have to be stored, processed and transmitted efficiently and reliably. Although advancements in Integrated Circuit (IC) design and the availability of various Ultra-low Power (ULP) circuit components have helped us to envision an ecosystem of numerous internet-connected devices, overall system integration will become a major challenge. A System-on-Chip (SoC) catering to IoT applications is expected to contain many different circuit components, such as sensors and Analog-Front-Ends (AFEs) for real-time signal acquisition, analog-to-digital converters, digital signal processors, memories and wireless transceivers. All these components have different supply voltage requirements and power profiles. Hence, power delivery to such components in an SoC will play an important role in the overall system architecture. Although battery-powered systems have traditionally worked well in portable electronics, the cost of battery replacement in a trillion-sensor-node IoT network will be enormous. In many applications, such as remote surveillance, systems require a long operational lifetime. Moreover, system deployment should be unobtrusive, and hence such systems should have small form factors. These requirements are hard to meet using conventional battery-powered systems. Hence, in most IoT SoCs, there is a strong motivation for an integrated Power Management Unit (PMU) with energy harvesting capability for near-perpetual, battery-less operation, which can provide a range of supply voltage rails to satisfy the electrical specifications of the different functional units.
This dissertation will address the design challenges related to energy autonomy and power delivery in a wireless sensor node. We propose a fully integrated energy harvesting platform capable of harvesting from multiple sources of energy, such as indoor solar cells and thermoelectric generators (TEGs). We also propose a power-efficient supply regulation scheme to meet the electrical specifications of the various components of a self-powered, battery-less SoC. Finally, we demonstrate several ULP digital and mixed-signal circuit components, which can be leveraged in an energy-autonomous system. The proposed solutions to power delivery will enhance the operational lifetime, reduce the overall form factor and contribute towards attaining energy autonomy to facilitate a wide range of applications related to the IoT.


Introduction

1.1 Motivation for energy harvesting and power management
Technology scaling and shrinking device dimensions have helped circuit designers to implement battery-operated computing systems, such as laptops and smartphones, with a high level of system integration. However, applications such as surveillance and remote health monitoring require non-invasive, unobtrusive systems with extremely small form factors and a longer shelf life. If such systems are designed to be battery-operated, they will be limited by the size of the battery. Moreover, battery replacement will be expensive if it is needed in large numbers or in applications such as remote surveillance. In such scenarios, energy harvesting from ambient sources, such as solar, thermal and vibration energy, provides a viable solution. The harvested energy can be stored in an energy reservoir, such as a supercapacitor, and used by the system when required. Thus, harvesting energy from ambient sources can theoretically provide a near-perpetual system lifetime and enable further shrinking of system form factors. However, an energy-harvesting, self-powered system comes with the following design challenges, which will be addressed in this work:

§ System sustainability: Energy harvesting systems that can harvest from only one ambient source must address a major limitation: how the system will operate when the source is unavailable. In such a scenario, a system can limit the range of applications and supported features in order to prevent the complete discharge of the energy buffer or reservoir. Another approach is to harvest from multiple sources of ambient energy. If a system can determine the dominant source of energy and use that source for harvesting and storing energy in an energy reservoir, then that system can operate reliably under varying environmental conditions. In this work, we will investigate multi-modal harvesting and explore circuits or methods to determine which source provides peak power and to harvest energy from that source.

§ System start-up: In a self-powered system, a self-starting mechanism is necessary to generate a power-on-reset (POR) signal and kick-start system operation. The control logic of the energy harvester, as well as other circuit components, which are usually implemented in CMOS technology, can only operate above a certain minimum voltage. The start-up mechanism needs to be designed for the worst-case scenario, in which the storage reservoir is completely empty. If the energy harvester can cold-start at a low input voltage, then the overall system can be more autonomous. In this work, we will investigate ultra-low voltage start-up circuits to enable energy harvesting at ultra-low voltage levels.

§ Power-efficiency: Achieving high end-to-end power efficiency is a key requirement, especially when the ambient source of energy, such as thermal energy, can provide only tens of µW of power. It is essential to minimize the power loss in the energy harvester and use nearly all of the available power either to store energy in the energy reservoir or to power system operation. Hence, the powertrain architecture and the control circuits of the energy harvester need to be designed to minimize power loss. Another method to maximize power efficiency is to track the maximum power point of the harvester for a given environmental condition using a Maximum-Power-Point-Tracking (MPPT) scheme. In this work, apart from designing the powertrain architecture and control circuits to be power-efficient, we will explore adaptive MPPT schemes that can work for multiple sources of ambient energy.

Motivation for integrated supply voltage regulation
Supply regulation plays a major role in delivering power to various hardware components in an SoC, such as microprocessor cores, memories, I/O interfaces, wireless transceivers and other analog and mixed-signal circuits. Since a modern SoC involves a high degree of system integration, integrated voltage regulators have become common and, with technology scaling, the efficiency and performance of such integrated regulators have improved. However, designing an integrated voltage regulator, especially for an ultra-low-power IoT SoC, involves the following key challenges:

Goals
The overall goal of this work is to evaluate, design and demonstrate a complete power management system encompassing energy harvesting from multiple sources and supply voltage regulation, along with Ultra-low Power (ULP) circuit components such as comparators and ULP processors, implemented using novel circuit topologies or in new process technologies. The individual goals of this work include:
§ A model and framework to understand various loss mechanisms in different power converter topologies
§ A hybrid Maximum-Power-Point-Tracking (MPPT) circuit, which can autonomously adapt to changing environmental conditions as well as support different ambient sources such as solar and thermal energy
§ A mechanism for low-voltage start-up
§ A low-power voltage reference circuit
§ A complete energy harvesting and supply regulation platform
§ A low-power supply voltage droop measurement scheme using all-digital circuits
§ An evaluation of latch and register-based circuits under power supply variation
§ A design flow for dynamic IR-drop analysis and decoupling capacitor insertion
§ An ultra-low power comparator design with low input-referred offset and threshold control
§ An evaluation of new process technologies for energy-efficient implementations of an MSP430 digital processor and an FIR filter

Energy Harvesting from multiple ambient sources

2.1 Motivation:
Advancements in integrated circuit design have led to the development of Ultra-low Power (ULP) electronics, such as wireless sensor nodes for surveillance, health monitoring and home automation applications. This new generation of smart electronic sensors and devices needs to have small form factors, especially in biomedical applications, where the devices must be non-invasive. Today, batteries represent the dominant source of energy in electronic systems, but they largely dictate the overall size, making the system bulky and not scalable. In surveillance applications, such systems need to be deployed in large numbers and in remote locations. Hence, the cost of battery replacement is high. Such systems therefore need a compact, low-cost, lightweight and near-perpetual energy source for a long operational lifetime. Energy harvesting from ambient sources, such as solar and thermoelectric energy, vibration/motion and RF, provides a viable alternative to battery-powered systems. The reliability and the operational lifetime can be further improved if the system has the ability and the necessary electronics to harvest from multiple harvesting modalities.

Background and Prior Art:
The primary goal of any energy scavenging system is to harvest energy from ambient sources, such as light, motion and thermoelectric energy, and store it in a storage device or an energy buffer, such as a supercapacitor. Another approach is to use the harvested energy to charge a rechargeable battery. However, most state-of-the-art high energy-density rechargeable batteries have limited charge-discharge cycles, making battery replacement unavoidable and thus restricting the system lifetime. A good supercapacitor can support more than 10000 charge-discharge cycles [1] and thus can be leveraged in energy-autonomous systems, provided that the supercapacitor has low leakage and a small form factor to meet the size restrictions of a wireless sensing node. Energy storage in a wireless sensor node is necessary because the peak currents needed during wireless transmission cannot be supported directly by an energy harvester. Hence, in such scenarios, a storage device such as a supercapacitor or a rechargeable battery acts as a buffer to support the peak current requirements of the system. Based on the overall powertrain architecture, integrated energy harvesters can be broadly classified into two categories:
1. Inductor-based Boost or Boost-Buck converters.
2. Voltage multipliers or charge-pumps based on switched-capacitor topologies.
Inductor-based topologies have been found to be more power-efficient in systems where a wide range of input voltage is available from Thermoelectric Generators (TEGs), solar cells etc. Inductor-based topologies also provide better efficiency than a charge-pump over a wide range of load currents. However, inductor-based switching converters need off-chip passives, such as high-Q inductors, and extra package pins, which increases the cost. Charge-pump circuits can be fully integrated and hence can be incorporated in systems requiring smaller form factors.

Sources of energy harvesting in micropower systems
Most self-powered wireless sensing systems are designed to harvest energy broadly from three different ambient sources: thermal energy, indoor/outdoor light energy and energy from vibration/motion/RF. In this work, we will focus on energy harvesting from solar and thermoelectric energy, and hence we will only discuss the physics and the operating principles of thermoelectric generators and photovoltaic cells.

Thermoelectric energy/Thermoelectric generators (TEG)
Thermal energy harvesters are based on the Seebeck effect: when two junctions, made of two different conductors, are kept at different temperatures, an open-circuit voltage develops between them. Fig. 2.1(a) shows a diagram of a thermocouple, which is the most basic voltage generator based on the Seebeck effect. The two pillars, or legs, are made of two different materials and connected by a metallic interconnect. When a temperature differential, ΔT, is established between the bottom and the top of the pillars, a voltage, V, develops between the points A and B. This voltage is given by:

$$V = S \, \Delta T$$

where S is the overall Seebeck coefficient.
The primary component inside a TEG is a thermopile (shown in Fig. 2.1(b)), which is constructed by connecting a large number of thermocouples electrically in series, such that the contribution of each thermocouple to the voltage adds up. Other components of a TEG may include a radiator or a heat sink for efficient heat dissipation into the ambient, or structures such as thermal shunts that direct the absorbed heat into the legs of a thermocouple for higher efficiency. Fig. 2.2 shows the equivalent electrical model of a TEG. The electrical resistance R_EL of the thermopile is proportional to the resistivity ρ of the thermoelectric material and to the number of thermocouples. Hence, with two legs in series per thermocouple,

$$R_{EL} = n \, \frac{2 \rho h}{a^2}$$

where n is the number of thermocouples connected in series, h is the height of the legs and a is the lateral dimension of the pillars. The maximum available output power on a matched load (Z_LOAD = R_EL) is thus given by:

$$P_{MAX} = \frac{V^2}{4 R_{EL}}$$

Light or solar power panels provide an inexhaustible source of energy, especially in outdoor conditions. The principle of photovoltaic energy harvesters is based on the photoelectric effect, which is the ability of photovoltaic materials, such as crystalline and amorphous silicon, to emit electrons after absorbing light. The number of photons depends on the light intensity, and if a sufficient number of photons is incident on a photovoltaic material, electricity can be obtained. Hence, the power that can be harvested from a solar cell depends on the light intensity. The main disadvantage of a photovoltaic source is its reduced output power in indoor light or in conditions where the light intensity is not consistent. Table 2.1 compares photovoltaic, thermoelectric and wind/motion power sources in outdoor/industrial and indoor conditions.
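As a numerical illustration of the thermoelectric relations earlier in this section, the following minimal Python sketch evaluates the open-circuit voltage of a TEG and the power delivered to a resistive load from its Thevenin model. The function names and the numeric values used with them are hypothetical and chosen only for illustration; they do not describe a specific device.

```python
def teg_matched_power(seebeck_v_per_k, delta_t_k, r_el_ohm):
    """Return (open-circuit voltage in V, matched-load power in W) of a TEG.

    Assumes the first-order model: a voltage source V = S * dT in series
    with the thermopile resistance R_EL.
    """
    v_oc = seebeck_v_per_k * delta_t_k        # V = S * dT
    p_max = v_oc ** 2 / (4.0 * r_el_ohm)      # matched load, Z_LOAD = R_EL
    return v_oc, p_max


def teg_load_power(v_oc, r_el_ohm, r_load_ohm):
    """Power delivered to an arbitrary resistive load by the same model."""
    current = v_oc / (r_el_ohm + r_load_ohm)
    return current * current * r_load_ohm


if __name__ == "__main__":
    # Hypothetical example: S = 25 mV/K, dT = 2 K, R_EL = 5 ohm
    v_oc, p_max = teg_matched_power(0.025, 2.0, 5.0)
    print(f"V_OC = {v_oc * 1e3:.1f} mV, P_MAX = {p_max * 1e6:.1f} uW")
```

Evaluating `teg_load_power` for load resistances above and below R_EL shows that the delivered power falls off on either side of the matched condition, which is the property that a maximum-power-point tracker exploits.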

Table 2.1: Comparison of solar, TEG and other power harvesting sources in indoor and outdoor conditions [2]
The power efficiency of photovoltaic cells drops drastically in indoor conditions. Connecting multiple solar cells electrically in series can increase the power generated from a solar panel, but it also increases the output impedance and limits the total available power. Fig. 2.3 shows the equivalent electrical model of a solar cell [2]. The current source, I_L, models the generated photoelectric current, which depends on the light intensity. I_D denotes the current due to recombination of carriers. The shunt (R_SH) and series (R_S) resistances account for the solar cell non-idealities and second-order effects, such as leakage currents around the edge of the cell, contact resistance and resistance of the material. I_PV is the equivalent photovoltaic current and V_PV is the equivalent output open-circuit voltage. Hence, the available power from a solar cell is given by:

$$P_{PV} = V_{PV} \, I_{PV}$$

Source                      Indoor condition                Outdoor condition
Solar panel                 100 µW/cm² @ 10 W/cm²           10 mW/cm² @ STC
Wind turbine generator      35 µW/cm² @ <1 m/s              3.5 mW/cm² @ 8.4 m/s
Thermoelectric generator    100 µW/cm² @ 5 °C gradient      3.5 mW/cm² @ 30 °C gradient
Electromagnetic generator   4 µW/cm³ @ human motion (Hz)    800 µW/cm³ @ machine (kHz)

current ramps up, thereby storing energy in the inductor. As a first-order approximation, neglecting the parasitic DC resistance of the inductor and assuming that M_LS has a negligible voltage drop, we have:

$$V_{IN} = \frac{L \, I_{PEAK}}{T_L}$$

where L = L_BOOST is the boost inductance, I_PEAK is the peak inductor current and T_L is the ON-time of the LS power transistor, which is governed by the pulse width of the LS pulse.
In the HS phase, the inductor current ramps down from its peak to zero and the energy stored in the inductor is delivered to the load through M_HS by synchronous rectification. In this phase, it is important that M_HS turns off when the inductor current reaches zero. If M_HS turns off late, after the inductor current has changed direction, then V_STORE is discharged due to reverse conduction. If M_HS turns off early, then the node V_X goes high, turning on the p-n junction diode of M_HS, and the remaining energy is dumped across the diode. In either case, there is a loss in efficiency, as some energy is either lost during reverse conduction or wasted across the diode. Ignoring the parasitic DC resistance of the inductor and assuming a negligible voltage drop across M_HS, we have:

$$V_{STORE} - V_{IN} = \frac{L \, I_{PEAK}}{T_H}$$

Thus, in an ideal case, the boost conversion factor is given by:

$$\frac{V_{STORE}}{V_{IN}} = 1 + \frac{T_L}{T_H} \qquad (1)$$

where T_L is the ON-time of M_LS, which is governed by the pulse width of the LS pulse, and T_H is the ON-time of M_HS, which is governed by the pulse width of the HS pulse. Hence, by modulating T_L and T_H, the required voltage gain can be achieved. However, (1) does not account for the conduction losses in the inductor and power transistors or for the switching losses. For the triangular inductor current assumed above, the total conduction loss per switching cycle during the LS phase, E_{C,L}, follows [8]:

$$E_{C,L} = \frac{I_{PEAK}^2}{3} \, R_L \, T_L$$

where R_L is the total resistance, including the parasitic resistance of the inductor and the ON resistance of M_LS. Similarly, for the HS phase, the total conduction loss, E_{C,H}, is given by:

$$E_{C,H} = \frac{I_{PEAK}^2}{3} \, R_H \, T_H$$

where R_H is the total resistance, including the parasitic resistance of the inductor and the ON resistance of M_HS. The switching loss, E_SW, and the leakage loss, E_LKG, are constant for a given control scheme and depend on the dimensions of M_LS and M_HS [8]. Hence, the total loss is given by:

$$E_{LOSS} = E_{C,L} + E_{C,H} + E_{SW} + E_{LKG}$$

In DCM mode, the sources of energy loss are the conduction loss in the inductor and power FETs, as well as the switching loss due to charging and discharging of the gate capacitance and gate-drive circuits of the power FETs.
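The first-order DCM relations described above can be sketched numerically. The Python snippet below is only an illustration under stated assumptions: an ideal triangular inductor current, so that the resistive energy lost in each phase is I_PEAK² · R · T / 3, and hypothetical component values. The function name and returned keys are ours and are not taken from [8].

```python
def dcm_boost_cycle(v_in, v_store, l_boost, t_l, r_l, r_h):
    """First-order model of one DCM switching cycle of a boost converter.

    v_in, v_store : input and storage voltages (V), with v_store > v_in
    l_boost       : boost inductance (H)
    t_l           : ON-time of the low-side switch (s)
    r_l, r_h      : total series resistance in the LS and HS phases (ohm)
    """
    i_peak = v_in * t_l / l_boost                  # V_IN = L * I_PEAK / T_L
    t_h = l_boost * i_peak / (v_store - v_in)      # HS ON-time for zero-current turn-off
    gain = 1.0 + t_l / t_h                         # ideal conversion factor
    e_transfer = 0.5 * v_store * i_peak * t_h      # energy delivered to V_STORE per cycle
    e_cond_l = i_peak ** 2 * r_l * t_l / 3.0       # assumed triangular-current loss (LS)
    e_cond_h = i_peak ** 2 * r_h * t_h / 3.0       # assumed triangular-current loss (HS)
    return {
        "i_peak": i_peak,
        "t_h": t_h,
        "gain": gain,
        "e_transfer": e_transfer,
        "e_conduction": e_cond_l + e_cond_h,
    }


if __name__ == "__main__":
    # Hypothetical operating point: 50 mV input boosted to 1 V
    result = dcm_boost_cycle(v_in=0.05, v_store=1.0, l_boost=10e-6,
                             t_l=1e-6, r_l=1.0, r_h=1.0)
    for name, value in result.items():
        print(f"{name} = {value:.4g}")
```

Because T_H is computed from the zero-current condition, the returned gain always equals V_STORE / V_IN; in a real converter, T_H is set by a zero-crossing detector rather than computed in advance.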
Subthreshold leakage also contributes significantly to the loss, especially when the load currents are extremely small. For ultra-light load systems, such as [5], the leakage and switching losses are more dominant than the conduction loss. Hence, in [5], a charge-pump based voltage doubler circuit is proposed in the control scheme to super cut-off the power FETs, resulting in 53% efficiency at a 1.2nW load with 544pW of quiescent power consumed by the converter. In [6], the boost converter, operating in DCM, can harvest energy from a TEG with an open-circuit voltage as low as 20mV. To achieve zero-crossing detection, a comparator is used to monitor the V_X node and a counter keeps track of the ON-time of the HS power FET. In [7], a multi-modal energy harvesting scheme is proposed, which can harvest from TEG, solar or piezoelectric energy harvesting modalities by using a shared-inductor scheme. The inductor is multiplexed among multiple harvesting modalities and a dual-path approach is implemented in the powertrain architecture to support a wide range of load currents. In [8], a peak inductor current control scheme is implemented to optimize conduction and switching loss. A fast zero-crossing detector with offset compensation is implemented for synchronous rectification. The boost converter in [8] can harvest from a TEG with an open-circuit voltage as low as 10mV and achieves a peak efficiency of 83%.
Another important requirement in self-powered energy harvesting systems is that the system needs to be self-starting. Since the output voltage provided by a TEG is below 100mV under ambient conditions, a start-up scheme is necessary to power the control circuits and enable energy harvesting. The start-up scheme does not need to have a very high efficiency, as it is only needed to start the control circuits. Several start-up techniques are discussed in the literature, which leverage technology, process, external kick-start and ambient RF energy to enable start-up. In [8], an on-chip cold-start circuit and an external RF-kickstart mechanism are leveraged to power the control circuits during start-up. The cold-start circuit consists of a ring oscillator and a voltage doubler to generate the control signals for the boost converter to start energy harvesting. The RF-kickstart circuit consists of an RF switch and a broadband rectifier implemented using the Dickson topology and operates in the subthreshold regime. A similar RF-kickstart mechanism is described in [5]. In [14], a mechanically assisted switch is used in an auxiliary boost converter topology to begin energy harvesting and charge a storage capacitor. The auxiliary boost converter with the mechanically assisted switch is disabled when the voltage on the storage capacitor is high enough to power the control circuits of the primary boost converter. In [15], an external transformer and a low-V_T NMOS transistor are connected to incorporate positive feedback, such that device noise is able to start oscillations, which are used to transfer and build up energy on a storage capacitor. In [13], an LC tank oscillator is used for low-voltage DC-to-AC conversion, followed by a voltage multiplier to boost and rectify the AC signal to a higher DC voltage for start-up.

Charge Pumps and Switched-Capacitor topologies:
Charge pumps and switched-capacitor based architectures provide a fully integrated solution. An arrangement of CMOS switches, controlled by clock signals (which are mostly out-of-phase but can be poly-phase), along with charge storage and transfer capacitors, forms a network known as a switched-capacitor network (SCN). One of the key goals is to optimize the overall output impedance of a switched-capacitor based converter. Fig. 2.6 shows a simple first-order model of a switched-capacitor converter with a DC voltage gain of N. The voltage drop across the output impedance, R_O, models all the conversion losses. The resistive output impedance accounts for the switching and conduction losses. Additional losses due to gate drive in the control scheme, short-circuit currents due to overlapping control signals and bottom-plate parasitic capacitances can be incorporated into this model. There are two asymptotic limits to the output impedance, based on the switching frequency of the control signals. The slow switching limit (SSL) impedance is calculated under the assumption that the switch and interconnect resistances are negligible, and accounts for the loss due to charge transfer through the capacitors. The fast switching limit (FSL) impedance accounts for the conduction loss through the switches and other resistive components. The SCN topology plays a major role in both the SSL and FSL impedance estimations. The conduction losses due to the SSL and FSL impedances [10] are denoted by:

$$P_{SSL} = I_{LOAD}^2 \, \frac{M_{CAP}}{C_{FLY} \, F_{SW}}$$

$$P_{FSL} = I_{LOAD}^2 \, \frac{R_{ON}}{W_{SW}} \, M_{SW}$$

where I_LOAD is the load current; F_SW denotes the switching frequency of the control signals; C_FLY is the total flying (charge-transfer) capacitance; M_CAP and M_SW are constants determined by the topology; R_ON is the ON resistance density, measured in Ω·m; and W_SW denotes the total width (in m) of all switches.
Apart from conduction loss in the switches and transfer capacitors, there are shunt losses due to switching of the bottom-plate parasitic capacitance associated with the flying capacitors. Generally, metal-insulator-metal (MiM) capacitors have lower bottom-plate parasitics than capacitors based on the gate capacitance of devices. The loss due to the bottom-plate capacitance (P_BOTT) is given by [10]:

$$P_{BOTT} = M_{BOTT} \, V_O^2 \, C_{BOTT} \, F_{SW}$$

where M_BOTT is determined from the topology, V_O is the voltage swing across the bottom-plate parasitic capacitor and C_BOTT is the total bottom-plate parasitic capacitance. There are also switching losses (P_GATE) associated with the gate capacitance of the switches driven by the clocked control circuit [10], which generates out-of-phase non-overlapping clocks for charge transfer. These losses are expressed by:

$$P_{GATE} = V_{SW}^2 \, W_{SW} \, C_{GATE} \, F_{SW}$$

where V_SW denotes the voltage swing and C_GATE is the gate capacitance density (F/m). Thus, the total loss (P_LOSS) that needs to be minimized in any switched-capacitor based converter is given by:

$$P_{LOSS} = P_{SSL} + P_{FSL} + P_{BOTT} + P_{GATE}$$

where P_SSL and P_FSL are the conduction losses due to the SSL and FSL impedances introduced above. Thus, for a given input voltage (V_IN), load current (I_LOAD), output ripple and desired conversion ratio, it is important to select an appropriate topology, switching frequency and number of clock phases for maximum efficiency. The area allocation for the switches and capacitors, along with parameters such as the bottom-plate parasitic capacitance and the switch resistance per unit width, plays an important role in realizing the peak efficiency of a switched-capacitor power converter. In [11], an integrated charge pump with a variable number of stages and a constant switching frequency per stage is used to obtain a peak efficiency of 70% and support a wide range of input power levels, from 10 to 1000µW. In [12], the authors propose a fully integrated, self-oscillating, switched-capacitor based energy harvester with 9X-23X configurable voltage conversion ratios, built from cascaded voltage doublers. The clock generation and level-shifting functions of the control scheme within each doubler are implemented using a self-oscillating architecture, eliminating the need for power-hungry ring oscillators and clock generation circuits. A leakage-based delay element allows frequency control for a wide range of loads, from 5nW to 5µW, with 40% efficiency and less than 3nW static power consumption.
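To illustrate how the loss terms of a switched-capacitor converter trade off against the switching frequency, the following Python sketch evaluates a first-order loss model in the form commonly associated with [10]. It is a sketch under assumptions: the SSL term is taken as I_LOAD² · M_CAP / (C_FLY · F_SW), with `c_fly` denoting the total flying capacitance, the FSL term as I_LOAD² · (R_ON / W_SW) · M_SW, and all numeric values in the usage example are hypothetical.

```python
import math


def sc_converter_losses(i_load, f_sw, c_fly, w_sw, m_cap, m_sw, m_bott,
                        r_on, c_bott, c_gate, v_o, v_sw):
    """First-order loss terms (W) of a switched-capacitor converter."""
    p_ssl = i_load ** 2 * m_cap / (c_fly * f_sw)     # charge-transfer (SSL) loss
    p_fsl = i_load ** 2 * (r_on / w_sw) * m_sw       # switch conduction (FSL) loss
    p_bott = m_bott * v_o ** 2 * c_bott * f_sw       # bottom-plate parasitic loss
    p_gate = v_sw ** 2 * w_sw * c_gate * f_sw        # gate-drive switching loss
    return {"ssl": p_ssl, "fsl": p_fsl, "bott": p_bott, "gate": p_gate,
            "total": p_ssl + p_fsl + p_bott + p_gate}


def optimal_switching_frequency(i_load, c_fly, w_sw, m_cap, m_bott,
                                c_bott, c_gate, v_o, v_sw):
    """Frequency minimising the total loss, which has the form A/f + B*f + const."""
    a = i_load ** 2 * m_cap / c_fly                                 # 1/f coefficient
    b = m_bott * v_o ** 2 * c_bott + v_sw ** 2 * w_sw * c_gate      # f coefficient
    return math.sqrt(a / b)


if __name__ == "__main__":
    # Hypothetical design point (SI units; r_on in ohm*m, c_gate in F/m)
    design = dict(i_load=10e-6, c_fly=1e-9, w_sw=1e-3, m_cap=0.25, m_sw=2.0,
                  m_bott=0.5, r_on=1e-3, c_bott=20e-12, c_gate=1e-9,
                  v_o=0.5, v_sw=1.0)
    freq_keys = ("i_load", "c_fly", "w_sw", "m_cap", "m_bott",
                 "c_bott", "c_gate", "v_o", "v_sw")
    f_opt = optimal_switching_frequency(**{k: design[k] for k in freq_keys})
    print(f"f_opt = {f_opt / 1e3:.1f} kHz")
    print(sc_converter_losses(f_sw=f_opt, **design))
```

In this model only the SSL term falls with frequency while the bottom-plate and gate-drive terms rise with it, so the optimum follows from setting the derivative of A/f + B·f to zero. The FSL term is independent of frequency here and is instead traded off through the switch width W_SW.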

Maximum-Power-Point-Tracking
High end-to-end power efficiency across a wide range of environmental conditions is a necessity in self-powered systems. Since the maximum power available from TEGs and solar cells varies significantly with environmental conditions, a built-in method that keeps track of the Maximum Power Point (MPP) under changing conditions is extremely useful. By keeping track of the MPP, which is roughly around 50% of the open-circuit voltage [3] of a TEG or around 73-80% [3] of the open-circuit voltage of an indoor solar cell, the system can extract the maximum power available in any condition. A Maximum Power Point Tracking (MPPT) scheme is even more useful if the system needs the flexibility to choose and harvest energy from multiple modalities, such as TEG, solar and piezo. The basic idea behind MPPT is that a boost converter or a charge-pump needs to present an optimal input impedance, such that the source operates at its MPP under different environmental conditions. If the system needs the flexibility to choose and harvest from multiple modalities at the same time, the range of input impedance required for MPP operation varies significantly. For instance, in [2], the output impedance of a solar cell at different MPPs, subject to varying degrees of illumination, varies between 27 and 68kΩ, whereas the output impedance of a TEG at MPP is roughly fixed at 82kΩ. Thus, if the system needs the capability to harvest maximum power from diverse harvesting modalities, the MPPT circuit needs to tune the input impedance of the boost converter or the charge pump across a wide range. Several techniques to implement MPPT have been discussed in the literature. We will discuss the theory behind some of the more common methods that are implemented in ULP systems.

Hill Climbing/Perturb and Observe
Hill Climbing (HC) involves perturbing the duty cycle of a power converter (for instance, T_H and T_L for the boost converter in Section 2.2.2), while Perturb and Observe (P&O) involves perturbing the voltage provided by a TEG or a solar cell. As the output voltage of the TEG or solar cell is increased or decreased, the output power is monitored using a voltage and/or a current sensor. If the power increases with an increase in voltage, then the perturbation is continued in the same direction (the voltage is increased by a finite step); if the power decreases, then the direction of perturbation is reversed (the voltage is decreased by a finite step). This process is repeated iteratively, such that the final operating point oscillates around the MPP. The degree of oscillation can be reduced by using a smaller step size, but this usually results in a longer response time to achieve MPP operation. Moreover, under sudden changes in environmental conditions, especially when conditions change before the MPPT circuit responds, the HC/P&O methods do not provide an optimal solution.
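The P&O iteration described above can be sketched in a few lines of Python. The sketch assumes a TEG-like source modelled as a Thevenin equivalent (open-circuit voltage `v_oc` in series with `r_src`), for which the MPP lies at half the open-circuit voltage; the function names, the source model and the numeric values are illustrative assumptions rather than a description of a particular circuit.

```python
def source_power(v, v_oc, r_src):
    """Power drawn from a Thevenin source when its terminal voltage is v."""
    return v * (v_oc - v) / r_src


def perturb_and_observe(v_oc, r_src, v_start, step, iterations):
    """Return the sequence of operating voltages visited by P&O."""
    v = v_start
    p_prev = source_power(v, v_oc, r_src)
    direction = 1                      # +1 raises the voltage, -1 lowers it
    history = []
    for _ in range(iterations):
        v += direction * step          # perturb
        p = source_power(v, v_oc, r_src)
        if p < p_prev:                 # observe: power dropped, so reverse
            direction = -direction
        p_prev = p
        history.append(v)
    return history


if __name__ == "__main__":
    # Hypothetical source: 100 mV open-circuit voltage, 5 ohm resistance
    trace = perturb_and_observe(v_oc=0.1, r_src=5.0, v_start=0.01,
                                step=0.002, iterations=60)
    print(f"final operating voltage: {trace[-1] * 1e3:.1f} mV")
```

With a fixed step, the operating point never settles exactly; it keeps toggling within about one step of the MPP, which is the oscillation discussed above. Halving `step` halves this ripple but doubles the number of iterations needed to reach the MPP from a distant starting point.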

Incremental Conductance
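The theory behind the incremental conductance method is that the slope of the P-V curve of a TEG or a solar cell is zero at the MPP. The slope is positive to the left of the MPP and negative to the right of it:

$$\frac{dP}{dV} = \frac{d(IV)}{dV} = I + V \frac{dI}{dV} \approx I + V \frac{\Delta I}{\Delta V}$$

$$\frac{\Delta I}{\Delta V} = -\frac{I}{V} \ \text{(at the MPP)}, \qquad \frac{\Delta I}{\Delta V} > -\frac{I}{V} \ \text{(left of the MPP)}, \qquad \frac{\Delta I}{\Delta V} < -\frac{I}{V} \ \text{(right of the MPP)}$$

where V and I are the instantaneous output voltage and current and P is the instantaneous power from a TEG or solar cell. ΔP and ΔI represent the change in instantaneous power and current subject to an instantaneous change in output voltage, ΔV. Hence, by keeping track of the instantaneous conductance, I/V, and the incremental conductance, ΔI/ΔV, MPP operation can be achieved.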

Fractional Open-Circuit Voltage
From the P-V curves in Fig. 2.3 (Section 2.2.1), it is evident that the output voltage at MPP (V_MPP) is a fraction of the open-circuit voltage (V_OC) of a solar cell. This fraction varies roughly between 0.71 and 0.78 [3] with varying solar irradiance, since V_OC and the output power change with light intensity. For a TEG, with varying degrees of temperature differential (ΔT), V_MPP is roughly 50% of V_OC [2]. Hence, for MPP operation,

$$V_{MPP} = k \, V_{OC}$$

where k varies from 0.71 to 0.78 in the case of solar cells, while it is approximately 0.5 for TEGs. Thus, k needs to be determined empirically by characterizing a TEG or a solar cell under varying environmental conditions. Once k is known, V_MPP can be computed, and the output voltage of the TEG or solar cell can be compared with V_MPP using an on-chip comparator to determine whether the system operates at MPP. Although this method provides a low-cost, low-power solution, it is not accurate under changing environmental conditions. For instance, in solar energy harvesting, k varies significantly with environmental conditions, such that the system operates near the MPP but not at the actual MPP. Additionally, if the system needs the capability to harvest from multiple harvesting modalities, different values of k are necessary and need to be adjusted dynamically, resulting in a more complicated implementation, which might consume more power.
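The empirical determination of k can be mimicked numerically by sweeping a source model and locating its maximum power point. The Python sketch below does this for two assumed models: an ideal single-diode solar cell (ignoring R_S and R_SH) and a Thevenin TEG. The diode parameters and the function names are hypothetical and serve only to show how k = V_MPP / V_OC differs between the two kinds of source.

```python
import math


def pv_current(v, i_l, i_0, n_vt):
    """Ideal single-diode solar cell current (series/shunt resistance ignored)."""
    return i_l - i_0 * (math.exp(v / n_vt) - 1.0)


def teg_current(v, v_oc, r_src):
    """Thevenin model of a TEG."""
    return (v_oc - v) / r_src


def mpp_fraction(current_fn, v_oc, points=2000):
    """Sweep 0..v_oc and return k = V_MPP / V_OC for the given I(V) function."""
    best_v, best_p = 0.0, 0.0
    for j in range(points + 1):
        v = v_oc * j / points
        p = v * current_fn(v)
        if p > best_p:
            best_v, best_p = v, p
    return best_v / v_oc


if __name__ == "__main__":
    # Hypothetical indoor cell: 50 uA photocurrent, 1 nA saturation current
    i_l, i_0, n_vt = 50e-6, 1e-9, 1.5 * 0.02585
    pv_v_oc = n_vt * math.log(i_l / i_0 + 1.0)
    k_pv = mpp_fraction(lambda v: pv_current(v, i_l, i_0, n_vt), pv_v_oc)
    k_teg = mpp_fraction(lambda v: teg_current(v, 0.1, 5.0), 0.1)
    print(f"k (solar model) = {k_pv:.2f}, k (TEG model) = {k_teg:.2f}")
```

Repeating the sweep for several values of the photocurrent `i_l` shows k drifting with illumination for the solar model, whereas it stays at 0.5 for the linear TEG model regardless of `v_oc`.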

Fractional Short-Circuit Current
This method is similar to the fractional open-circuit voltage method, but it leverages the short-circuit current (I_SC) instead of the open-circuit voltage (V_OC) to estimate the MPP. As in the fractional open-circuit voltage method, the current supplied at the MPP (I_MPP) is a fraction of I_SC, and this fraction needs to be evaluated empirically. However, measuring I_SC during operation can be difficult, because a separate control scheme is needed to periodically short-circuit the harvester and a current sensor is needed to measure I_SC, which increases the number of components and the cost.

Most of the above techniques have been implemented in ULP energy harvesting systems. The MPPT circuit in [8] uses a fractional open-circuit voltage method and assumes that the MPP of a TEG is at 50% of the open-circuit voltage [2], while the MPP of a solar cell is at 73-80% of the open-circuit voltage [3]. The MPPT circuit in [8] uses an external resistive divider to sample the MPP voltage (V_MPP). When the boost converter is functional, the energy source is loaded and its output voltage, V_IN, goes down. An on-chip comparator monitors V_IN and compares it with V_MPP. As soon as V_IN falls below V_MPP, the comparator issues a signal to disable the boost converter, such that the energy source is unloaded again and V_IN rises. When V_IN exceeds V_MPP again, the comparator issues a pulse to engage the boost converter, and the cycle is repeated. A similar method for MPPT is proposed in [18]. In [14], the switching frequency is tuned using digital circuits to modulate the input impedance of the boost converter. The disadvantage of this method is that the frequency range is limited. Hence, the range over which the input impedance of the converter can be tuned is also limited.
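The comparator-based enable/disable behaviour described for the fractional open-circuit voltage scheme can be illustrated with a simple discrete-time simulation. The Python sketch below is not the circuit of [8]; it assumes a Thevenin source, an input capacitor `c_in` at the V_IN node and a converter that draws a constant current `i_conv` whenever it is enabled. All names and values are hypothetical.

```python
def simulate_comparator_mppt(v_oc, r_src, k, c_in, i_conv, dt, steps):
    """Simulate V_IN regulation around V_MPP = k * V_OC.

    The converter is enabled while V_IN is above V_MPP and disabled otherwise,
    mimicking a comparator that gates the boost converter.
    """
    v_mpp = k * v_oc
    v_in = v_oc                              # source starts unloaded
    trace = []
    for _ in range(steps):
        enabled = v_in > v_mpp               # comparator decision
        i_src = (v_oc - v_in) / r_src        # current supplied by the source
        i_out = i_conv if enabled else 0.0   # current drawn by the converter
        v_in += (i_src - i_out) * dt / c_in  # input capacitor integrates the difference
        trace.append(v_in)
    return v_mpp, trace


if __name__ == "__main__":
    v_mpp, trace = simulate_comparator_mppt(v_oc=0.1, r_src=5.0, k=0.5,
                                            c_in=10e-6, i_conv=15e-3,
                                            dt=1e-7, steps=20000)
    tail = trace[len(trace) // 2:]
    v_avg = sum(tail) / len(tail)
    print(f"V_MPP = {v_mpp * 1e3:.1f} mV, average V_IN = {v_avg * 1e3:.2f} mV")
```

For the loop to regulate, the assumed converter current must exceed the source current at the MPP, otherwise V_IN never falls to V_MPP. In this sketch the ripple around V_MPP is set by the time step, whereas in a real circuit it is set by the comparator delay and hysteresis.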

Research Questions:
§ Assuming the system needs the capability to harvest from both solar and thermal energy, what kind of baseline energy harvesting architecture delivers the peak power efficiency? What factors will affect this decision, and to what degree?
§ On what factors will the multiplexing scheme for choosing between solar and thermal energy harvesting depend, and how will they affect the implementation?
§ What kind of Maximum-Power-Point-Tracking scheme will work in a hybrid energy harvesting system? Will one global MPPT scheme work for both solar and thermal energy harvesting, or will separate independent schemes be needed?
§ What kind of start-up schemes would work best in a hybrid energy harvesting system? What will be the architecture for start-up, and what factors will contribute to this decision?

Approach:
In this work, we will attempt to answer the research questions by focusing on three major components of an energy-harvesting system: powertrain architecture, Maximum-Power-Point-Tracking (MPPT) and start-up techniques.

§ Energy delivery or powertrain architecture
  § Power Delivery Modeling: To achieve peak efficiency in a power delivery system, it is important to investigate the sources of power loss in that system. Since the various loss mechanisms, such as conduction loss, switching loss and leakage, depend heavily on variables such as load current, output voltage, input voltage and biasing of the control circuits, it is important to evaluate the performance trends with respect to these variables. To achieve this goal, we will develop first-order mathematical models of the loss mechanisms of the powertrain topologies described in Section 2.2. Based on the design specifications (such as load current and input voltage), the model will tell the designer which kind of power loss is dominant. The model will also compare, to first order, the performance of multiple converter topologies (such as an inductor-based boost converter or a switched-capacitor charge pump). The goal of the model is not to achieve SPICE-level accuracy for a circuit topology but to aid the designer in design-space exploration.
  § Control Scheme: Once we have a fair estimate of which topology to implement, we will focus on designing the control system architecture for the selected powertrain topology. The control system will include the following circuits and components:
    • Power-efficient, ULP comparators for making decisions.
    • Oscillators, either current-starved ring oscillators or ULP relaxation oscillators, for providing control either to the clocked comparators or to the switches in the powertrain.
    • Level converters for signals crossing voltage domains or for providing sufficient gate drive to the switches.
    • Digital control logic. For instance, assuming a boost converter topology, if a single inductor needs to be shared across different harvesters, resource multiplexing might be necessary.
    • We will also design and evaluate efficient power-on-reset (POR) schemes for start-up.
§ Maximum Power Point Tracking (MPPT) Algorithms: In a system that needs to harvest energy from two or more sources, the MPPT circuit needs to be flexible and adaptive. To investigate this, we will incorporate the following methods:
  § Characterizing state-of-the-art energy harvesters: To design an energy harvesting system, a fundamental understanding of the output characteristics of an energy source (such as open-circuit voltage, short-circuit current and output impedance) is important. Moreover, it is important to study how these characteristics change with environmental conditions. This will enable us to estimate the design specifications of the energy harvester. We will evaluate several commercially available TEGs and solar cells and study their output power vs. output voltage characteristics under different environmental conditions, such as temperature and light intensity. We will also study the output power vs. output impedance characteristics of TEGs and solar cells.
  § Design Exploration for MPPT algorithms: Based on the output characteristics of TEGs and solar cells, we will evaluate some of the MPPT algorithms described in Section 2.2.4. Depending on the range of output impedances at the maximum power point for both TEGs and solar cells, a hybrid approach to MPPT might be possible. Hill climbing and incremental conductance methods provide better accuracy at the cost of higher design complexity. To investigate lower-power hardware implementations of such algorithms, we will explore the following components and techniques:
    • Current sensors: Although analog circuit techniques for sensing currents have traditionally consumed significant power, we will investigate low-power current sensing and understand the trade-offs between performance and accuracy.
    • Digital circuit techniques: We will also investigate digital circuit techniques to quantify the input power as a digital equivalent. For instance, by measuring how fast a capacitor is charged by a TEG or solar cell, an estimate of the input power can be made. A comparator, a relaxation or ring oscillator, and simple digital circuits such as counters can be used to determine the charging time of the capacitor.
§ Start-Up Techniques: Power efficiency is not a critical factor during start-up; correct functionality is what matters. We will investigate the following techniques for system start-up:
    • Oscillator and Voltage Multiplier: We will further investigate the work in [13] and [8]. The start-up circuit can be an LC-tank oscillator followed by a Dickson multiplier, or a conventional current-starved ring oscillator followed by a charge-pump-based voltage doubler.
    • RF Kickstart: We can use RF as a one-time source and use a rectifier to charge an input capacitor and power ring oscillators, which can generate control signals for an auxiliary boost converter to begin energy harvesting.
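The capacitor-charging idea mentioned under the digital circuit techniques can be illustrated with a short calculation. This is a minimal sketch under stated assumptions: a counter clocked by an oscillator of known frequency counts the cycles taken to charge a capacitor between two comparator thresholds, and the average input power follows from the stored energy difference. The function name, thresholds and values are hypothetical.

```python
def estimate_input_power(c_farads, v_low, v_high, cycles_counted, f_osc_hz):
    """Average power delivered to a capacitor charged from v_low to v_high.

    cycles_counted is the counter value accumulated while the capacitor
    voltage travels between the two comparator thresholds.
    """
    if cycles_counted <= 0:
        raise ValueError("counter value must be positive")
    energy = 0.5 * c_farads * (v_high ** 2 - v_low ** 2)   # joules stored between thresholds
    charge_time = cycles_counted / f_osc_hz                # seconds
    return energy / charge_time


# Hypothetical case: 1 uF capacitor, thresholds at 0.3 V and 0.5 V, 1 kHz oscillator.
p_weak = estimate_input_power(1e-6, 0.3, 0.5, cycles_counted=160, f_osc_hz=1e3)
p_strong = estimate_input_power(1e-6, 0.3, 0.5, cycles_counted=40, f_osc_hz=1e3)
print(f"weak source: {p_weak * 1e6:.2f} uW, strong source: {p_strong * 1e6:.2f} uW")
```

Because the energy between the thresholds is fixed, the counter value alone is enough to rank two harvesters: the smaller count corresponds to the larger input power.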

Evaluation Metrics:
A self-powered system that harvests from multiple sources of energy is an active area of research. Very few works in the literature demonstrate a self-sustaining system that can intelligently choose between different sources of energy for harvesting. We will evaluate the proposed multi-modal energy-harvesting system based on the following metrics:
§ Efficiency: We will measure the overall power efficiency across a range of input voltages and load currents. We will then test the functionality of the system with a TEG and an indoor solar cell and demonstrate energy harvesting under changing environmental conditions as a proof of concept for multi-modal energy harvesting.
§ Minimum input voltage: We will evaluate the minimum input voltage required from a TEG or a solar cell to enable energy harvesting and compare it with the state of the art. If the system can harvest energy from a low-voltage source, its overall lifetime can increase. As discussed earlier, a lower cold-start voltage will ensure that the system is functional across a wider range of environmental conditions and can restart after a total loss of energy in the storage capacitor.
§ Maximum output voltage: We will monitor the maximum output voltage while harvesting simultaneously from thermal and solar energy under different environmental conditions.
§ Area: A fully integrated energy harvester without any external passives will provide a significant advantage in overall system size. We will assess the feasibility of a fully integrated on-chip energy harvester.

Anticipated Results:
Using the techniques and evaluation metrics discussed, we hope to extend the state of the art in ULP energy-harvesting systems such as [5][7][8]. The proposed scheme should demonstrate energy harvesting from both TEGs and solar cells. With an adaptive MPPT approach, the system should intelligently decide which harvesting modality to use for scavenging energy. Biasing the comparators in the subthreshold region, along with circuit techniques to reduce standby leakage, should reduce power loss in the control circuits. A multi-modal, fully integrated energy-harvesting system with a low-voltage start-up scheme has not been demonstrated before in the literature. We hope to reduce the minimum voltage for cold start compared with [8][13].

Contributions:
The contributions from this chapter will be the following:
§ A first-order model that will help in design-space exploration for various inductor-based and switched-capacitor-based powertrain topologies.
§ A hybrid MPPT control scheme that will assist in achieving peak power efficiency for both solar and thermal energy harvesting.
§ A start-up circuit/architecture for enabling low-voltage system start-up.
§ A fully integrated energy harvesting system with cold start and MPPT for thermal and solar energy harvesting.

Supply regulation plays an important role in delivering power to general-purpose microprocessors and chipsets deployed in smartphones, tablets and laptops, as well as to self-powered systems such as wireless and body sensor nodes. Each system has an application-specific power profile. For instance, high-performance systems, such as personal computers, consume hundreds of mW, depending on the type of application being executed by the operating system. Battery-powered systems, such as smartphones, need to conserve energy to ensure the longevity of the battery and thus operate at much lower power levels, on the order of 100s of µW. Self-powered systems, which operate from energy harvested from ambient sources, have a much more stringent power budget. Various components of a system might have entirely different voltage-level specifications. For instance, most analog and mixed-signal components need sufficient voltage headroom for stable operation, whereas digital circuits leverage Dynamic Voltage and Frequency Scaling (DVFS) for energy-efficient operation. Hence, an integrated solution for supply regulation and power management is essential for delivering power to the various analog and digital components in high-performance as well as battery-operated or self-powered systems.

Background and Prior Art:
Technology scaling has allowed the integration of power delivery circuits, resulting in fully integrated voltage regulators with higher power efficiencies than off-chip regulators. Power delivery circuits can be broadly classified into two major categories: energy harvesting circuits and voltage regulators. While voltage regulation is needed in almost all systems to provide a stable power supply and to support variations in load current, integrated energy harvesting circuits are application-specific.
Voltage regulation is typically achieved by regulating the battery voltage in battery-powered systems, or the voltage on a storage capacitor in energy-harvesting systems. Typically, supply regulation involves down-conversion of the battery voltage using buck converters. In some cases, a buck-boost topology is required if the voltage on the storage capacitor or the battery is lower than the desired regulated voltage levels of the system. Buck regulators can be implemented using linear regulators, such as low-drop-out (LDO) regulators, or using switched-capacitor or inductor-based switching regulators. Buck-boost topologies can be implemented using switching regulators (inductor-based or switched-capacitor topologies).

Low Drop-Out (LDO) Regulators

Fig 3.1: LDO topology
An LDO is a type of linear regulator that provides a regulated DC supply from an input voltage higher than or nearly equal to the required regulated output. The main advantages of an LDO over a switching regulator are that it does not inject switching noise onto the supply line and does not require off-chip passives for regulation.
Hence, an LDO can be fully integrated on-chip and occupies a smaller area than switching regulators that require external passives and greater silicon real estate. Fig. 3.1 shows the topology of an LDO. It consists of an error amplifier (EA), a voltage reference circuit whose output is shown as V_REF, a pass transistor (M_LDO) and a feedback network formed by resistors R1 and R2. Conventionally, an LDO is mostly used as the output stage of a switching regulator to reduce the ripple and switching noise that the switching regulator injects onto the supply line. A low-power bandgap reference circuit, such as [20], or a voltage reference circuit based on the ΔV_T of two CMOS transistors [21] can be used to generate V_REF. Ideally, V_REF should have low sensitivity to supply voltage (V_STORE) and temperature variations. A fraction of the regulated output voltage, V_OUT, is fed back to the error amplifier by the resistive feedback network consisting of R1 and R2.
The error amplifier modulates the ON resistance of the pass transistor, M_LDO, to maintain a regulated V_OUT under changes in the load current, I_LOAD. The response time depends on the bandwidth of the error amplifier, which can be improved by employing compensation techniques such as dominant-pole or lead-lag compensation. The efficiency (η) of an LDO is given by:

$$\eta = \frac{V_{OUT} \cdot I_{LOAD}}{V_{STORE} \cdot (I_{LOAD} + I_{CONTROL})}$$

where I_CONTROL represents the total current consumed by the control circuits, including the voltage reference, the error amplifier, leakage and the current drawn by the feedback network.
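To make the role of I_CONTROL concrete, the sketch below evaluates the standard LDO efficiency expression, η = (V_OUT · I_LOAD) / (V_STORE · (I_LOAD + I_CONTROL)), at two load currents. The function name and the numerical values are hypothetical and chosen only to show the trend at ultra-low load currents.

```python
def ldo_efficiency(v_store, v_out, i_load, i_control):
    """Efficiency of an LDO: output power over power drawn from the input rail."""
    if v_out > v_store:
        raise ValueError("an LDO cannot regulate above its input voltage")
    p_out = v_out * i_load
    p_in = v_store * (i_load + i_control)   # load and control currents both come from V_STORE
    return p_out / p_in


# Hypothetical operating points: 1.2 V input, 1.0 V output, 1 uA of control current.
eta_10ua = ldo_efficiency(v_store=1.2, v_out=1.0, i_load=10e-6, i_control=1e-6)
eta_1ua = ldo_efficiency(v_store=1.2, v_out=1.0, i_load=1e-6, i_control=1e-6)
print(f"eta at 10 uA load: {eta_10ua:.3f}, eta at 1 uA load: {eta_1ua:.3f}")
```

Even with a small dropout, the efficiency is bounded by V_OUT / V_STORE and falls quickly once I_LOAD becomes comparable to I_CONTROL.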

Inductor-based Voltage Regulators

Fig 3.2: Inductor-based switching regulator topology
The advantage of a switching regulator over an LDO is that it can provide a wider range of voltage conversion ratios across a wider range of load currents with higher power efficiency. An inductor-based buck converter is a switching regulator that uses an inductor as an intermediate storage element to transfer power to the load. Its disadvantage is that it needs an off-chip inductor with a high quality factor (Q) to achieve low conduction loss in the inductor. Although the DC-DC converter proposed in [22] implements an integrated on-chip inductor, it is difficult to achieve high power efficiency for high voltage conversion ratios. Moreover, on-chip inductors are not area-efficient. Hence, most inductor-based switching regulators in the literature use off-chip high-Q inductors [23][24][25][26].

Fig. 3.2 shows the powertrain topology of an inductor-based buck converter. It consists of two power transistors, M_HS and M_LS, which transfer power to the load through the inductor, L_BUCK, and regulate the output voltage, V_OUT, at the desired conversion ratio. Depending on the architecture of the control scheme, either V_OUT [26] or the inductor current [24] is sensed through a feedback network (not shown in Fig. 3.2) to generate pulse-width-modulated (PWM) or pulse-frequency-modulated (PFM) non-overlapping gate control signals for M_HS and M_LS. During the High-Side (HS) phase, the gate control signals ensure that M_HS is ON while M_LS is OFF, and the inductor is charged from V_STORE through M_HS. In the Low-Side (LS) phase, the energy stored in the inductor is transferred to the load through M_LS, and the inductor current ramps down to zero. Assuming Discontinuous Conduction Mode (DCM) operation, the inductor current remains at zero until the next switching cycle. It is important that M_LS turns OFF when the inductor current crosses zero. Thus, in the ideal case, the voltage conversion ratio is given by:

$$\frac{V_{OUT}}{V_{STORE}} = \frac{T_H}{T_H + T_L} \qquad (1)$$

where T_L is the ON-time of M_LS, which is governed by the pulse width of the LS pulse, and T_H is the ON-time of M_HS, which is governed by the pulse width of the HS pulse.

Hence, by modulating T_L and T_H, the desired conversion ratio can be achieved. The HS and LS pulses need to be non-overlapping so that there is no short-circuit current through M_HS and M_LS. A dead-time controller such as [23][24] can be implemented in the control scheme to guarantee this. In (1), the conduction losses in the inductor and power transistors, as well as the switching loss, are not considered. For the triangular DCM inductor current with peak value I_PEAK, the total conduction loss during the HS cycle, P_{C,H}, is given by:

$$P_{C,H} = \frac{I_{PEAK}^{2}}{3} \, R_H \, T_H \, f_{SW}$$

Similarly, for the LS cycle, the total conduction loss, P_{C,L}, is given by:

$$P_{C,L} = \frac{I_{PEAK}^{2}}{3} \, R_L \, T_L \, f_{SW}$$

where R_H is the total resistance including the parasitic resistance of the inductor and the ON resistance of M_HS, R_L is the total resistance including the parasitic resistance of the inductor and the ON resistance of M_LS, and f_SW is the switching frequency.
The switching loss, P_SW, and the leakage, P_LKG, are constant for a given control scheme and depend on the dimensions of M_LS and M_HS. Hence, the total loss, P_LOSS, is given by:

$$P_{LOSS} = P_{C,H} + P_{C,L} + P_{SW} + P_{LKG}$$

Thus, for a given conversion ratio, minimizing P_LOSS requires tuning the peak inductor current, I_PEAK, or modulating the ON resistance of M_LS and M_HS through an appropriate gate-drive control scheme.
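The relationship between T_H, T_L, I_PEAK and the loss terms can be explored numerically. The sketch below is a first-order model under stated assumptions: the inductor current is an ideal triangle in DCM, the conduction loss of each phase is the I²R loss of that triangular current averaged over the switching period (I_PEAK²/3 · R · T_ON · f_SW), and the switching and leakage losses are lumped constants. The function names and all component values are hypothetical.

```python
def dcm_buck_cycle(v_store, v_out, l_buck, t_h):
    """Ideal DCM cycle: returns (peak inductor current, LS on-time, conversion ratio)."""
    i_peak = (v_store - v_out) * t_h / l_buck   # current ramps up during the HS phase
    t_l = i_peak * l_buck / v_out               # time for the current to ramp back to zero
    ratio = t_h / (t_h + t_l)                   # ideal conversion ratio from volt-second balance
    return i_peak, t_l, ratio


def conduction_loss(i_peak, r_total, t_on, f_sw):
    """Average I^2*R loss of a triangular current pulse of height i_peak and width t_on."""
    return (i_peak ** 2 / 3.0) * r_total * t_on * f_sw


def total_loss(i_peak, r_h, r_l, t_h, t_l, f_sw, p_sw, p_lkg):
    """Sum of HS and LS conduction loss, switching loss and leakage."""
    return (conduction_loss(i_peak, r_h, t_h, f_sw)
            + conduction_loss(i_peak, r_l, t_l, f_sw)
            + p_sw + p_lkg)


# Hypothetical design point: 1.5 V input, 0.5 V output, 47 uH inductor, 1 us HS pulse.
V_STORE, V_OUT = 1.5, 0.5
i_peak, t_l, ratio = dcm_buck_cycle(V_STORE, V_OUT, l_buck=47e-6, t_h=1e-6)
p_loss = total_loss(i_peak, r_h=2.0, r_l=1.5, t_h=1e-6, t_l=t_l,
                    f_sw=10e3, p_sw=20e-9, p_lkg=5e-9)
print(f"I_PEAK = {i_peak * 1e3:.2f} mA, T_L = {t_l * 1e6:.2f} us, "
      f"ratio = {ratio:.3f}, P_LOSS = {p_loss * 1e9:.1f} nW")
```

Sweeping t_h (and hence I_PEAK) in such a model shows where conduction loss overtakes the fixed switching and leakage terms for a given load.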

Switched-Capacitor Voltage Regulators
Switched-capacitor DC-DC converters are a class of switching regulators that offer a fully integrated solution to voltage regulation. The arrangement of CMOS switches and transfer capacitors can be reconfigured on-chip to achieve the desired conversion ratios. Fig. 3.3 shows the topology of a simple 2:1 switched-capacitor buck converter.

Fig 3.3: Switched capacitor 2:1 buck regulator topology
The main disadvantage of a switched-capacitor architecture is that the regulator can target only a limited range of conversion ratios and load currents compared with inductor-based DC-DC converters. Moreover, precise control signals are required for the switches to prevent undesirable short-circuit or contention currents, which can lower the power efficiency. The sources of power loss are the conduction loss in the switches and transfer capacitors, the switching loss in the control circuits and the parasitic bottom-plate capacitance of the transfer capacitors. Depending on the load current and output voltage specifications, such as ripple and switching frequency, some sources of power loss may dominate. At lower load currents, switching loss and bottom-plate parasitic loss dominate over conduction loss in the switches. The sources of power loss in switched-capacitor power converters are described in more detail in Chapter 2, Section 2.2.3. Existing work in the literature, such as [27], implements a reconfigurable switched-capacitor topology and combines interleaved clocking and level shifting in the gate-drive circuits. In [28], a hybrid architecture consisting of switched-capacitor regulators and LDOs is implemented. In [29], a capacitance modulation scheme is implemented using digital circuits, which controls the amount of transfer capacitance involved as the load current varies.

Research Questions:
§ Will a hybrid architecture (for instance, switching regulator + LDO) provide higher power efficiency, or is a single switching regulator sufficient?
§ In a hybrid topology, what will be the multiplexing scheme for selecting different powertrain architectures? What factors will govern this multiplexing scheme?
§ If a switching regulator is implemented, how much dead-time is sufficient for peak power efficiency? What are the trade-offs between dead-time, switching loss and line regulation?
§ How much ripple or power supply variation can be tolerated at the output? In ultra-low power systems, is it necessary to achieve strict line and load regulation?
§ What are the trade-offs between achieving strong line regulation and power efficiency?
§ For a voltage reference circuit, how much voltage/temperature sensitivity is acceptable? Does it need to have a high degree of tolerance to power supply variation and temperature?

Approach:
In order to design a power-efficient voltage regulator, it is important to understand the power profile of the load circuits. We propose the following methods to answer some of the research questions:
§ Power analysis of different functional units
It is important to understand the power and voltage specifications of each constituent block before designing the power delivery and supply regulation framework. Based on a pre-defined power budget, which is expected to be 1 µW or less, we will analyze the operating conditions of each block that lead to minimum power consumption and assess their feasibility. Apart from power consumption, tolerance to supply voltage variation is important for assessing how much line regulation is required. Generally, analog and mixed-signal circuits have less tolerance to ripple and supply variations than digital blocks. Hence, the power supply rejection ratio (PSRR) of each mixed-signal functional unit will be evaluated. We will also assess the worst-case load transient of each block that needs to be supported by the voltage regulator.
§ Modeling output voltage variation and its impact on regulator power efficiency
We hypothesize that a lightly regulated supply rail at lower load currents will provide higher power efficiency. Aggressive line regulation would theoretically require a greater number of comparisons between the output voltage and the reference, resulting in more switching and higher quiescent current loss. A more relaxed line regulation will reduce the extent of switching, and a lower load current will reduce conduction loss, improving overall power efficiency. To better understand and validate this hypothesis, we will model ripple in different converter topologies as a function of switching frequency and load current and assess its impact on overall power efficiency.
§ Voltage reference
A stable voltage reference with high PSRR and temperature stability is needed for all voltage regulators. A low-power voltage reference is needed to achieve high power efficiency at ultra-low load currents. A bandgap reference (BGR) architecture is used in most voltage references, but it consumes higher power and operates at a higher supply voltage. Moreover, it needs a start-up circuit. Ultra-low power BGR circuits [20] and voltage references based on the threshold voltage difference (ΔV_T) of two CMOS transistors [21], which are suitable for supply regulation in self-powered systems, have been proposed. Ultra-low-frequency timers and clock sources based on the gate leakage of transistors have also been proposed, and these provide a stable clock reference for ultra-low power applications [32]. We will evaluate a gate-leakage-based voltage reference and assess its performance subject to temperature and supply voltage variations. We will explore compensation techniques to reduce its sensitivity to temperature and voltage variations and evaluate the trade-offs with power consumption.
§ Architecture implementation
Based on the power analysis and load current requirements of each block, we will implement the powertrain architecture of the ultra-low-power system. While the feasibility analysis is still in progress, we expect to have separate voltage domains for digital and analog components, because analog circuits typically need more voltage headroom and are more susceptible to power supply noise. Based on the load current and output voltage specifications, a hybrid approach, such as a single switched-capacitor converter with multiple outputs combined with an LDO, might be incorporated for the different voltage domains.
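The hypothesis that relaxed regulation improves efficiency at low load currents can be illustrated with a toy model. This is a sketch under loudly stated assumptions, not the model proposed in this work: the output capacitor is recharged once per comparison, so the ripple is roughly I_LOAD / (C_OUT · f); each comparison costs a fixed energy; a fixed quiescent power is always drawn; and conduction loss is ignored. All names and values are hypothetical.

```python
def ripple_and_efficiency(v_out, i_load, c_out, f_compare, e_compare, p_quiescent):
    """Return (peak-to-peak ripple in volts, efficiency) for a comparison rate f_compare."""
    ripple = i_load / (c_out * f_compare)             # droop of C_OUT between two recharges
    p_load = v_out * i_load
    p_control = e_compare * f_compare + p_quiescent   # control overhead grows with the rate
    return ripple, p_load / (p_load + p_control)


# Hypothetical load: 0.5 V, 1 uA, 100 nF output capacitor, 1 pJ per comparison, 5 nW quiescent.
common = dict(v_out=0.5, i_load=1e-6, c_out=100e-9, e_compare=1e-12, p_quiescent=5e-9)
ripple_tight, eta_tight = ripple_and_efficiency(f_compare=10e3, **common)
ripple_relaxed, eta_relaxed = ripple_and_efficiency(f_compare=1e3, **common)
print(f"10 kHz: ripple {ripple_tight * 1e3:.1f} mV, efficiency {eta_tight:.3f}")
print(f" 1 kHz: ripple {ripple_relaxed * 1e3:.1f} mV, efficiency {eta_relaxed:.3f}")
```

In this toy model, lowering the comparison rate trades a larger ripple for a smaller control overhead, which is the trade-off the modeling work is meant to quantify for real converter topologies.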

Evaluation Metrics:
We will assess our approach and methods using the following figures of merit.

§ Power Efficiency
The proposed converter topology should match or improve on the power efficiency of state-of-the-art power converters at load currents on the order of 10 µA or less. We will also assess the power efficiency at different unregulated input voltage levels and load currents.
§ Line/Load Regulation and Settling time
We will assess the line and load regulation metrics and their impact on overall power efficiency. The transient response of the converter will be evaluated under changes in load current and supply voltage. Although a regulator with a faster response would generally use compensation techniques in the converter to improve the overall bandwidth, such schemes also consume power and area.
§ Operating range
We will evaluate the operating range of the proposed converter. The input voltage and maximum load current range will be explored.

Anticipated Results:
An ultra-low power SoC that provides features such as signal acquisition, filtering, analog-to-digital conversion, digital processing, storage and wireless communication comprises multiple analog and digital macros with different power and voltage specifications. Depending on the circuit architecture, different blocks might need a better transient response, immunity to supply noise and so on. We envision a hybrid power architecture with dedicated supply rails for analog components and digital macros. The extent to which the powertrain architecture can be shared depends on the specifications of the different load circuits. At ultra-low load currents, cross regulation and conduction loss in the powertrain should not pose a major concern, although we plan to evaluate the different loss mechanisms in a hybrid power architecture. The hypothesis that regulating the outputs only when required improves overall power efficiency will be further assessed through system modeling, simulations and measurements.

Contributions:
§ An architecture for supply regulation, targeted at achieving high power efficiency at 1-10 µW.

Variation in the on-chip power supply continues to be a major challenge in modern CMOS processes, because technology scaling results in increasing device densities and operating currents. Since the length of global wires such as power and ground lines does not scale at the same rate as device dimensions, IR drop continues to increase in deep-sub-micron processes. Since most modern microprocessors operate at clock frequencies in the GHz regime [33], such systems are most susceptible to Ldi/dt events, which result in power supply overshoots and undershoots. While supply overshoots can cause reliability issues such as gate-oxide breakdown and hot-carrier injection (HCI), supply undershoots can result in timing violations such as setup-time and hold-time failures. Thus, power supply droops can limit the maximum operating frequency (F_MAX) of a modern microprocessor. In self-powered ultra-low-power (ULP) systems, the magnitude of load current transients is negligible except when the system is in a mode where it needs to acquire physical data or send data over a radio link. Hence, it is hypothesized that line and load regulation requirements for powering digital circuits in ULP systems can be relaxed to some degree. However, variation in the power supply can result in timing errors in low-voltage circuits as well [38]. Additionally, analog and mixed-signal components such as the radio or the analog front-end need tight line and load regulation even in ULP systems. Hence, there is a need for a low-cost, low-power method to monitor voltage variation even in ULP systems, to account for the trade-off between relaxed voltage regulation and the susceptibility of digital circuits to timing failures.

Background and Prior Art:
An Ldi/dt event occurs when there is a sudden change in current consumption, especially when the microprocessor switches from one operating mode to another, resulting in high-frequency overshoot or undershoot noise. Resonant supply noise in the mid-frequency range is another source of power supply noise, which results mainly from the resonance of the package inductance with the decoupling capacitors [37]. During dynamic voltage scaling (DVS), the slow transient response of voltage regulators can result in low-frequency droops. Fig. 4.1 describes the two major sources of power supply fluctuations. High-frequency noise is generally induced on the supply by Ldi/dt events and influences timing in local circuit paths. Noise due to package resonance and low-frequency droops takes time to recover, and thus is present for multiple clock cycles and impacts performance globally across the chip.

Existing work in the literature, such as [34], has proposed on-die dynamic voltage monitoring and adaptive clock distribution schemes to enable tolerance to power supply variations across a wide operating range. In [35], techniques for timing error detection and correction are proposed to reduce metastability caused by dynamic power supply and temperature variations. Analog techniques have been employed in [36], where on-die sensors are distributed to monitor peak overshoots and undershoots. Adding decoupling capacitors can reduce dynamic IR drop, and active decoupling capacitors can compensate noise in the low to mid-frequency range [37]. However, adding decoupling capacitors increases gate leakage, and analog droop monitors [36] and metastability detectors [35] consume higher quiescent currents. Hence, such techniques cannot be applied directly in subthreshold processors such as [4][39], which are used in energy-constrained systems such as wireless sensor nodes and other applications related to the IoT.

Research Questions:
§ What are the limits on droop resolution, and how do they change with the power-supply noise frequency? What resolution is acceptable for a subthreshold processor?
§ What will be the calibration scheme for measuring power supply noise?

Approach:
We propose the following methods to explore and address the impact of power supply variation in ultra-low power systems: § Latch-Based implementation for digital circuits We hypothesize that latch-based circuits can provide better immunity to power supply variation.A latchbased pipeline stage typically allows the designer to achieve higher performance than a register-based implementation owing to time-borrowing and allowing greater setup-time margin as compared to a flipflop.Assuming no clock skew, the setup-time constraint for a latch is: where,  !"#$ = transparency window of a latch Similarly for a flip-flop based design, where,  !"#_!"#$%&_!! = clock period of a flip-flop based stage  !"#!! = clock to flip-flop output delay Hence,  !"#_!"#$%&_!"#$! <  !"#_!"#$%&_!! which means that a latch-based pipeline stage can operate at a higher clock frequency than a flip-flop-based stage.Moreover, since a latch-based pipeline provides an additional transparency window, the incoming data has an additional setup-time margin equivalent to  !"#$ , which aids in resolving metastability issues arising due to power supply variations and lowfrequency supply noise.Short-paths in a latch-based design can be avoided as long as, where,  !!"#_!"#$ = hold time constraint Although a flip-flop based timing path has greater hold-time margin as compared to a latch-based path, employing out-of-phase non-overlapping clock signals can offset this limitation.To demonstrate the circuit robustness of a latch-based implementation to power supply variation, we analyzed the impact of lowfrequency power supply droops on both register-based and latch-based implementations of a 32-tap Finite Impulse Response (FIR) filter across a wide range of supply voltages.FIR filters play an important role in most low-power as well as high-performance DSP applications [41].We investigate the circuit robustness to power supply variation for both latch-based and register-based versions of the FIR filter by 
measuring the energy-delay (ED) trends. We use ED curves as a metric to evaluate the resiliency of a synthesized digital circuit (in this case an FIR filter) to power supply variations. We implement a low-power technique using digital circuits to measure the low-frequency droop present in the power supply. Fig. 4.2 describes the block diagram of the system designed for analyzing and comparing the impact of power-supply variation on latch-based and register-based versions of the FIR filter. We implement a 16-bit, 32-tap FIR filter using both flip-flops and latches. For the latch-based implementation, we incorporate a dual-phase non-overlapping clock architecture to reduce the probability of hold-time failures. Both the latch-based and register-based FIR filters have dedicated ENABLE signals and supply rails, while they share a common reset and ground rail. A global block-select signal selects the 32-bit output from each FIR filter.

Fig. 4.2 also describes the proposed droop measurement scheme. The core of the droop measurement circuit is an on-chip 13-stage current-starved ring oscillator (RO) operating from the supply rail that contains the voltage droops, VDD_DROOP. The ring oscillator is biased in subthreshold by an external bias signal, VBIAS, which can be generated by an ultra-low-power bandgap reference such as [20]. An 8-bit digital counter and comparator are powered by a clean, well-regulated supply without ripple, VDD_CLEAN. This counter and comparator logic compares the number of clock cycles with a programmable 8-bit user-defined threshold, THR, and generates an enable/disable signal that gates the count of RO clock cycles, denoted by DROOP. The number of RO clock cycles varies with the magnitude of the droop present. The difference between DROOP and THR provides an 8-bit digital proxy measurement for the amount of supply droop present. At the system level, VDD_CLEAN can be obtained from a voltage regulator such as the buck-boost regulator proposed in [4]. An on-chip voltage regulator needs to provide high conversion efficiency for a target load current range. For a fixed regulator conversion efficiency, a lower-power droop monitoring circuit reduces the overhead on the limited power budget of an energy-constrained system.

§ Decoupling capacitors
Adding decoupling capacitors is part of the standard physical design flow to resolve issues related to dynamic IR drop. However, as discussed in Section 4.2, adding a large number of decoupling capacitors increases gate leakage. Thus, the designer needs to be more prudent when adding decoupling capacitors in ULP systems. We propose the following approaches to the design flow:

§ Methodology for adding decoupling capacitors during Physical Design
Designers mostly rely on prior experience to justify the amount of decoupling capacitance in an SoC. This amount depends on the technology, the package parasitics, the load current profile in different operating modes and the sensitivity of custom macros to power supply variation. We propose to establish a vector-based dynamic-IR analysis methodology to optimize the amount of decoupling capacitance required. The flow will enable the designer to address power supply variation in a ULP system without compromising on leakage. The flow will also allow the designer to incorporate different circuit topologies of decoupling capacitors.

§ Active Decoupling Capacitor design
The basic concept behind active decoupling capacitors is to switch a pair of parallel decoupling capacitors into a series combination to give a local voltage boost in the presence of power supply droops. The control schemes for these switches have been implemented using power-hungry comparators [42] and op-amps [37], which will not meet the power constraints of ULP systems. Using circuit techniques and biasing the comparators/op-amps in subthreshold, we can achieve lower quiescent currents.
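As a rough illustration of the ring-oscillator droop measurement scheme of Fig. 4.2, the following Python sketch models how an 8-bit count of RO cycles can serve as a digital proxy for supply droop. It is a behavioral sketch only: the linearized frequency sensitivity, the nominal frequency, the counting window, the sign convention (THR minus DROOP) and all function names are assumptions made for illustration and are not taken from the fabricated circuit.

```python
# Behavioral sketch of the RO-based droop proxy. All numeric values are
# illustrative assumptions, not measured silicon data.

F_NOMINAL_HZ = 100_000    # assumed RO frequency at the nominal supply
VDD_NOMINAL_MV = 500      # assumed nominal VDD_DROOP in millivolts
SENS_HZ_PER_MV = 400      # assumed linearized frequency sensitivity
WINDOW_US = 1_000         # assumed counting window in microseconds
COUNTER_MAX = 2**8 - 1    # an 8-bit counter saturates at 255


def ro_frequency_hz(vdd_mv: int) -> int:
    """Linearized RO frequency around the nominal supply (assumption)."""
    return max(0, F_NOMINAL_HZ + SENS_HZ_PER_MV * (vdd_mv - VDD_NOMINAL_MV))


def droop_count(vdd_mv: int) -> int:
    """RO cycles counted in one window, saturating like an 8-bit counter."""
    cycles = ro_frequency_hz(vdd_mv) * WINDOW_US // 1_000_000
    return min(cycles, COUNTER_MAX)


def droop_proxy(vdd_mv: int, thr: int) -> int:
    """Digital proxy for droop: difference between THR and the DROOP count."""
    return thr - droop_count(vdd_mv)


if __name__ == "__main__":
    thr = droop_count(VDD_NOMINAL_MV)  # calibrate THR at the nominal supply
    for vdd_mv in (500, 480, 450):
        print(vdd_mv, "mV ->", droop_proxy(vdd_mv, thr))
```

In this sketch THR is calibrated at the nominal supply, so a larger droop slows the oscillator, lowers the count and yields a larger proxy value. A real current-starved RO in subthreshold would have a nonlinear frequency characteristic, so a measured calibration curve would replace the linear model.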

Evaluation Metrics:
We will evaluate the proposed methods using the following metrics:

§ Power consumption
For any droop measurement or compensation circuit that needs to be implemented in a ULP system, quiescent power consumption is an important factor. The power consumption of the proposed ring oscillator-based droop measurement scheme discussed in Section 4.4 was reported to be 0.9µW [43]. The proposed op-amp or comparator-based control scheme of the active decoupling capacitor should consume sub-µW power to fit into the overall power budget of a ULP system.

§ Average Droop/IR drop
The magnitude of the average power supply droop should be lower after compensating with active decoupling capacitors or after employing the proposed design methodology to counter dynamic IR drop.

§ Resolution and Sampling Rate
The droop measurement schemes in the literature sample the noisy supply rails to record a digital equivalent of the power supply droop. For a low-power system dominated by power supply noise in the low-to-mid band frequencies, power consumption must be traded off against sampling rate and resolution. However, the sampling rate and resolution should be high enough to capture the noise amplitude and behavior correctly.

§ Area
Although most of the proposed techniques tend to trade off power consumption against area, we will evaluate the overall area footprint of the proposed circuits for comparison with the state of the art.

Results
We have explored latch-based digital circuit implementation to analyze circuit robustness to power supply noise. For the latch insertion in the FIR filter, we mapped a 16-bit, 32-tap FIR filter to logic gates using commercial synthesis tools. To control time-borrowing and allow latch insertion only when necessary, custom scripts were used to replace each register with a pair of master and slave latches clocked by out-of-phase non-overlapping clocks [40]. After inserting latches, timing optimization and logic restructuring were performed to balance all pipeline stages and achieve timing closure at 200kHz and 0.5V. Fabricated in a 130nm CMOS process, the test chip was packaged in a 64-pin PGA package for testing convenience. A Link Instruments IO3200 pattern-generator/logic-analyzer module was used to provide input patterns and off-chip clock signals to both latch-based and register-based FIR filters. Current measurements were performed using a Keithley 2401 sourcemeter. External droop was added to the supply with a function generator: a 1 kHz saw-tooth waveform of varying peak-to-peak amplitude was coupled to the power supply. External noise and supply droop can be injected off-chip by coupling a fast-rising ramp signal to the power supply through a large coupling capacitor on the order of 47µF or higher.

The droop measurement circuit consumes less than 1.5µW across supply voltages from 0.5V to 0.8V and can be leveraged in ULP systems such as wireless sensor nodes. Fig. 4.4 shows the energy-delay trends of both the latch-based and register-based FIR filters, with and without externally injected power supply noise. Fig. 4.4 also shows that the latch-based implementation provides 25-37% improvement in energy-efficiency below 0.6V in the presence of a 1kHz power supply droop ranging from 44mV to 120mV. At higher voltages and operating frequencies, the register-based implementation provides better energy-efficiency. This is because active energy dominates at higher voltages and the latch-based implementation has a higher switching capacitance owing to its dual-phase clocking scheme.

Anticipated Results
The proposed approach and the techniques discussed in Section 4.4 will allow us to understand the impact of power supply droop on circuit robustness in ULP systems. We hope to design a low-cost solution to measure the magnitude of power supply droop without a significant power overhead. The proposed dynamic IR-drop flow, driven by vectors from actual design test cases, will allow the designer to achieve better immunity to supply variation with minimum leakage power overhead. The low-power active decoupling capacitor implementation will provide supply noise compensation in the low-to-mid band frequency range.

The design of ULP systems for IoT applications such as health monitoring, surveillance and home automation involves a high degree of system integration and a variety of circuit components, such as ULP processors, subthreshold DSP accelerators and wake-up radios. While power delivery to such components plays a major role in defining the overall system-level power budget and electrical specifications, it is also important to use circuit or architectural techniques to design and optimize such components for lower power. Before an energy harvesting or voltage regulation scheme can be designed, it is imperative to understand the power and energy characteristics of such macros and to analyze their sensitivity to power supply variations. Technology also plays a major role, not only in the design of high-efficiency DC-DC converters but also in lowering the energy or power consumption of circuit components. In this chapter, we will present an energy-efficient MSP430 processor designed in an FD-SOI process optimized for subthreshold operation. We will evaluate the energy-delay and leakage power characteristics of a 32-tap FIR filter in a 55nm Deeply-Depleted-Channel (DDC) technology. Then we will discuss the need for a ULP comparator with a low input-referred offset in a 10nW wake-up radio for ULP applications.

Background and Prior Art:
Wearable sensors, portable biomedical electronics such as ECG monitors, and self-sustaining surveillance systems need to achieve energy-efficiency and ultra-low standby power. In this section, we will discuss the circuit architecture and implementation of two major components that play an integral role in such systems.

Subthreshold processors and accelerators
The restrictions in size and the need for a longer operational lifetime render self-powered systems severely energy-constrained. Within the limited energy budget, such systems need to run application-specific programs and sub-routines such as ECG monitoring [4][46]. Hence, energy-efficient processing at the circuit and system level is essential to minimize the energy per operation in such systems. Existing work in the literature has reported systems or processor implementations consuming nW to µW power levels by operating the system near the threshold voltage (V_th) of a transistor [44][45][46][47]. Operating a digital circuit in the subthreshold regime makes transistor leakage a dominant source of energy consumption, because the exponentially larger delays give leakage current more time to flow in each cycle. Prior work such as [45] has proposed digital logic styles to suppress the subthreshold leakage of conventional bulk devices. Hence, optimizing the leakage characteristics of a device can result in significant benefits at the overall system level. However, low-voltage transistor operation presents four key challenges:
§ Minimize the subthreshold swing and achieve maximum ON current below V_th
§ Minimize static leakage current
§ Minimize V_th variation
§ Minimize device capacitances
Thus, if the process technology provides CMOS transistors optimized for lower subthreshold leakage, with reduced V_th variation and minimal degradation in performance, then energy-efficient and reliable digital processors and circuits can be implemented for ULP applications.

To conserve energy, self-powered systems such as wireless sensor nodes spend most of their time in standby mode and perform active operation only when required. A wake-up radio (WRX) provides a viable way to synchronize with the base station and bring the system out of standby mode. Since a WRX is always active and listens for an incoming RF signal or pattern, the active power of a WRX needs to be lower than the overall standby power of the system, which tends to be in the nW range for digital components. Reducing the power consumption of a WRX comes at the cost of reduced sensitivity to the incoming RF signal. Existing work in the literature, such as [48][49], implements a WRX architecture similar to Fig. 5.1, where the incoming RF signal is rectified and the output DC voltage from the rectifier is sampled using a low-power comparator. A ULP baseband correlator processes the sampled output from the comparator, compares the samples with an expected code word and issues a wake-up signal. The size of the correlator and the sampling frequency are determined by the overall receiver sensitivity and power budget, which is typically in the nW range. Since the input RF signal power is typically limited, the rectified output voltage is limited to at most a few tens of millivolts. As a result, the comparator needs to have a very low input-referred offset. Moreover, the threshold of the comparator should be controllable to avoid false system wake-ups in the presence of noise or interference. Thus, the comparator needs a mechanism for offset control. Since a WRX is severely power-constrained, the comparator should consume a very low quiescent current (typically less than 10nA). The clocked comparator used in [48] uses a current DAC for setting the bias currents in both the pre-amplifier and the regenerative feedback circuit, with the input common mode referenced to ground. The dynamic comparator in [50] consumes very low static current and uses a combination of high-V_T and standard-V_T devices to reduce leakage with a reduced performance penalty. A dual-rail clocked-comparator architecture is proposed in [51] to provide greater resilience to kickback noise.
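The baseband correlator step described above (comparing the comparator's sampled bits with an expected code word and issuing a wake-up signal) can be sketched behaviorally in Python as a sliding-window match. The 8-bit code length follows the correlator size mentioned later in this chapter; the code word value, the error tolerance `max_errors` and the function names are hypothetical and chosen only for illustration.

```python
# Behavioral sketch of a sliding-window wake-up correlator.
# The code word and tolerance are illustrative assumptions.

def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def wakeup_index(samples, code_word, code_bits=8, max_errors=0):
    """Return the sample index at which a wake-up is issued, or None.

    Each comparator decision (0 or 1) is shifted into a code_bits-wide
    window. A wake-up is issued once the window differs from code_word
    in at most max_errors bit positions.
    """
    mask = (1 << code_bits) - 1
    window = 0
    for i, bit in enumerate(samples):
        window = ((window << 1) | (bit & 1)) & mask
        if i + 1 >= code_bits and hamming_distance(window, code_word) <= max_errors:
            return i
    return None


if __name__ == "__main__":
    code = 0b10110010                      # hypothetical 8-bit wake-up code
    stream = [0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0]
    print(wakeup_index(stream, code))
```

Allowing a nonzero `max_errors` trades a higher false wake-up rate for tolerance to bit errors at low input power, which mirrors the sensitivity versus power trade-off discussed above.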

Research Questions:
§ How can we quantify the benefit due to process technology in the power and performance of digital circuits, such as processors and digital filters?
§ What other circuit topologies and logic families can be used to implement ULP digital circuits, apart from static CMOS? What are the trade-offs between performance and reliability?
§ What is the optimal resolution for the comparator threshold in a ULP wake-up receiver? How does adding more resolution bits influence the overall power consumption and receiver sensitivity?
§ What are the preferred architectures to realize a ULP comparator with low input-referred offset? Can a ULP comparator in the nW range be realized using a continuous-time comparator or some other topology?
§ How does the input-referred offset vary across different comparator topologies, supply voltages and power levels?

Approach:
We adopted the following methods to answer some of the research questions.

Use of Technology to optimize performance and energy consumption in low-voltage digital circuits
To evaluate the advantages of process technology and devices optimized for subthreshold operation, we implemented an MSP430 processor in a 90nm FD-SOI process and also demonstrated a 16-bit, 32-tap FIR filter in a 55nm DDC technology. Owing to the better I_on/I_off ratios and lower V_T variation of the devices supported by these technologies, we implemented a 1.3µW MSP430 processor operating at 0.4V and 250kHz [52]. It consumes 67% less energy than [53], which demonstrates a similar processor implementation using conventional bulk devices. The FIR filter, implemented in a low-leakage 55nm process, consumes 5x lower energy than a similarly sized FIR filter implemented in a conventional bulk technology [54]. Substrate biasing in the DDC technology offers a further 39.4% savings in energy due to reduced leakage.

Ultra-low-power Comparator topologies
We will explore different continuous-time and clocked comparator architectures implemented in the literature [49][50][51] and evaluate the advantages and disadvantages of each topology in a power-constrained system such as a wake-up radio. We will explore different techniques for offset compensation and threshold control. For instance, in a clocked comparator, one phase of the clock can be used for offset compensation and the other phase can be used for comparator operation. We will explore the impact of the input common-mode level on the comparator offset. We will explore the impact of device noise, such as thermal or flicker noise, on each comparator topology. We will also evaluate kickback noise, which limits the functionality and performance of clocked-comparator topologies.

Ultra-low-power circuit families and logic styles
Static-CMOS logic families have been conventionally used in digital circuits as they offer higher Static-Noise-Margin (SNM), more reliability and lower power. However, in power-constrained systems such as wake-up radios, new logic styles are needed to further reduce the power consumption. A buck regulator can provide a lower operating voltage but can have a very low power efficiency at ultra-low loads (on the order of 10nW or less). Hence, we will explore new logic styles that consume very low power at a higher voltage level, such as [45], which utilizes the leakage current of a transistor for circuit operation at ultra-low frequencies. Stacked circuit topologies can provide another solution but are limited by signal swing and level shifting between different stages.

Evaluation Metrics:
We will evaluate the proposed methods and approach using the following metrics:

Power
We will measure the power consumption of the ULP comparators and the overall low-power wake-up receiver system. The power consumption of always-active components such as wake-up receivers and clock sources is important in a wireless sensor node for estimating the total operational lifetime. We will also measure the standby and active power of the MSP430 processor and FIR filter to evaluate the benefits provided by the low-power, low-leakage 90nm FDSOI and 55nm DDC technologies. The power profile of the components discussed in this chapter will help the designer to estimate the specifications of the voltage regulator and power delivery circuits.

Energy
Energy per cycle or energy per instruction is another metric that will help the designer to evaluate the energy-efficiency of the system. Voltage scaling to subthreshold levels reduces the active energy but also increases circuit delays exponentially, which increases the leakage energy spent per cycle. Thus, it is important for the system to operate at the energy-optimal point. We will compare the energy per cycle vs. delay for the proposed MSP430 processor and FIR filter implementations.
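The existence of an energy-optimal point can be illustrated with a first-order model, sketched below in Python. The sketch assumes that active energy scales as C·V² and that leakage energy per cycle (leakage current times supply voltage times cycle time) scales as K·V²·exp(-V/(n·v_T)), because the subthreshold cycle time grows exponentially as the supply is lowered. The model form and every parameter value (`C_EFF_F`, `K_LEAK_F`, `N_VT_V`, the sweep range) are assumptions for illustration and are not fitted to the measured processor or FIR filter.

```python
import math

# First-order subthreshold energy model. All parameters are illustrative
# assumptions, not values extracted from the fabricated designs.
C_EFF_F = 1e-12       # assumed effective switched capacitance per cycle
K_LEAK_F = 3000e-12   # assumed lumped leakage coefficient
N_VT_V = 0.039        # assumed slope factor times thermal voltage (n * v_T)


def energy_per_cycle_j(vdd: float) -> float:
    """Active plus leakage energy per cycle at supply voltage vdd (volts)."""
    active = C_EFF_F * vdd**2
    leakage = K_LEAK_F * vdd**2 * math.exp(-vdd / N_VT_V)
    return active + leakage


def minimum_energy_point(vdd_mv_range=range(200, 801, 10)):
    """Sweep the supply in millivolt steps and return (vdd, energy) at the minimum."""
    best_mv = min(vdd_mv_range, key=lambda mv: energy_per_cycle_j(mv / 1000))
    vdd = best_mv / 1000
    return vdd, energy_per_cycle_j(vdd)


if __name__ == "__main__":
    vdd_opt, e_opt = minimum_energy_point()
    print(f"minimum energy at {vdd_opt:.2f} V: {e_opt * 1e12:.3f} pJ/cycle")
```

Below the optimum the leakage term dominates and energy rises as the supply is lowered; above it the active term dominates. The sweep is restricted to a range in which the circuit is assumed to remain functional, since the model does not capture functional failure at very low voltages.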

Operating voltage
The optimal operating voltage range for minimum power or energy needs to be evaluated for the proposed components. Along with power consumption, the voltage range will provide a design specification for voltage regulation and power delivery circuits.

Operating frequency
We will evaluate the performance of the proposed FIR filter and MSP430 processor in the subthreshold regime by determining the maximum operating frequency. By comparing the maximum operating frequency at a fixed voltage, we can estimate the performance benefits due to technologies such as FDSOI and DDC.

Input-referred offset and noise in comparators
As discussed, the comparators used in wake-up receivers need a very low input-referred offset for detecting an ultra-low-power RF signal. Hence, it is important to select a comparator topology with compensation for process variation and mismatch. In the presence of interference signals and noise, it is important to control the offset and set the comparator threshold. We will design comparators with low input-referred offset and an offset compensation scheme to control the comparator threshold. Input-referred noise due to thermal and flicker noise will be used as another metric for the comparator design.

Noise and Noise Margins
The SNM will help in evaluating circuit stability. Power supply noise or common-mode noise can influence circuit functionality, especially in comparators. Within the available power budget, a high PSRR and CMRR are necessary for rejecting power supply noise and achieving a high overall signal-to-noise ratio (SNR).

Results: Subthreshold processors and digital FIR filters
We implemented a 16-bit MSP430 processor and a 16-bit, 32-tap FIR filter, designed for subthreshold operation in a 90nm Extremely Low Power (xLP) FDSOI technology and in a 55nm DDC technology respectively, using logic synthesis and auto-place-and-route (APR) tools. For the MSP430 processor, a library of logic gates and sequential circuits such as flip-flops and latches was characterized to operate at 0.36V, and timing closure was achieved at 200 kHz using static-timing-analysis (STA) tools. Silicon measurements show that the processor consumes 5pJ per cycle at 0.4V and 250 kHz while running a QRS peak detection algorithm on ECG data. If higher performance is needed for overall ECG detection at the system level, the processor can operate at 1MHz at 0.6V, consuming 6.7pJ per cycle. Hence, a 4x performance improvement can be achieved by sacrificing 34% in energy. Measured results show a 55% reduction in V_T variation of the fabricated devices in the xLP FDSOI process as compared to a standard FDSOI process. The measured minimum energy across 8 functional dies shows a σ/µ of 0.0405. Fig. 5.2 shows the measured energy-delay trends of the processor and I_ds-V_gs measurements of 46 PMOS transistors across two wafers. The 3σ variation in V_T was found to be 8mV for a device with channel length L_g = 180nm and V_ds = 0.3V. The reduced variation in V_T was achieved due to reduced V_T sensitivity to silicon thickness. The absence of random dopant fluctuations and reduced channel length sensitivity to source-drain anneal variations further minimize V_T variation. Fig. 5.3 shows the energy vs. delay and leakage power vs. supply voltage for the FIR filter with and without Reverse Body Biasing (RBB) applied to the transistors. The minimum energy per cycle for the FIR filter (at 0.36V) is ~5X lower than [54], and applying an RBB of 0.25V gives a further 39.4% reduction due to lower leakage energy.
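The processor operating points quoted above can be cross-checked with a short calculation, sketched here in Python. The energy, frequency and voltage values are taken from the text; the dictionary layout and function names are illustrative choices.

```python
# Cross-check of the reported MSP430 operating points.
# Numbers come from the text; names and structure are illustrative.

LOW = {"vdd_v": 0.4, "freq_hz": 250e3, "energy_j": 5e-12}
HIGH = {"vdd_v": 0.6, "freq_hz": 1e6, "energy_j": 6.7e-12}


def average_power_w(point: dict) -> float:
    """Average power = energy per cycle x clock frequency."""
    return point["energy_j"] * point["freq_hz"]


def energy_penalty_percent(low: dict, high: dict) -> float:
    """Extra energy per cycle at the faster point, relative to the slower one."""
    return 100.0 * (high["energy_j"] / low["energy_j"] - 1.0)


def speedup(low: dict, high: dict) -> float:
    """Ratio of clock frequencies between the two operating points."""
    return high["freq_hz"] / low["freq_hz"]


if __name__ == "__main__":
    print(f"power at 0.4 V: {average_power_w(LOW) * 1e6:.2f} uW")
    print(f"power at 0.6 V: {average_power_w(HIGH) * 1e6:.2f} uW")
    print(f"energy penalty: {energy_penalty_percent(LOW, HIGH):.0f} %")
    print(f"speedup: {speedup(LOW, HIGH):.0f}x")
```

The 0.4V point works out to about 1.25µW, which agrees with the 1.3µW figure quoted for the processor, and the 0.6V point trades 34% more energy per cycle for a 4x higher clock frequency.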

Anticipated Results: Ultra-low-power Comparator topologies for a ULP wakeup radio
We have implemented several topologies of clocked, dynamic and continuous-time comparators for a wake-up receiver system in a 130nm CMOS technology. The specifications of the wake-up radio system require it to consume 10nW of total power and to be sensitive to a -60dBm input RF signal. For the comparators, we hope to achieve input-referred offset voltages on the order of 100s of µV and power consumption in the range of ~3-5nW.
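To put these specifications side by side, the short Python sketch below converts the -60dBm sensitivity target to watts and expresses the targeted comparator power as a fraction of the 10nW receiver budget. The helper names are illustrative; only the standard dBm definition (decibels relative to 1mW) is used.

```python
# Relating the wake-up radio specifications. Helper names are illustrative.

def dbm_to_watts(dbm: float) -> float:
    """Convert a power level in dBm (dB relative to 1 mW) to watts."""
    return 1e-3 * 10 ** (dbm / 10)


def budget_share(part_w: float, total_w: float) -> float:
    """Fraction of the total power budget consumed by one block."""
    return part_w / total_w


if __name__ == "__main__":
    total_budget_w = 10e-9                # 10 nW receiver budget from the text
    print(f"input power at -60 dBm: {dbm_to_watts(-60) * 1e9:.2f} nW")
    for comparator_w in (3e-9, 5e-9):     # targeted comparator power range
        share = budget_share(comparator_w, total_budget_w)
        print(f"comparator at {comparator_w * 1e9:.0f} nW: {share:.0%} of budget")
```

A -60dBm input corresponds to 1nW of RF power, and a 3-5nW comparator would take roughly a third to a half of the receiver's total budget, which is why the comparator topology is a central design choice.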

Ultra-low-power circuit families and logic styles
Alternate logic styles and circuit families can potentially reduce the power consumption, especially at battery voltages on the order of 1V, so that an 8-bit correlator designed using such logic styles consumes 1nW or less. Lower power can be achieved by trading off operating frequency (which might be reduced to 1kHz or less), SNM or layout area.

Fig 2.2 Electrical model of a TEG

Research Questions: § Which kind of regulation scheme would provide the peak end-to-end efficiency at sub-µW load power?

Fig 4.3: Variation in RO frequency with supply and measured digital equivalent of injected droop

Fig 4.4: Measured power of droop measurement unit and measured Energy-Delay trends of latch-based and flip-flop based FIR filters