Abstract

The wireless market has experienced a remarkable development and growth since the introduction of the first mobile phone systems, with a steady increase in the number of subscribers, new application areas, and higher data rates. As mobile phones and wireless connectivity have become consumer mass markets, a prime goal of the IC manufacturers is to provide low-cost solutions.

The power amplifier (PA) is a key building block in all RF transmitters. To lower the costs and allow full integration of a complete radio System-on-Chip (SoC), it is desirable to integrate the entire transceiver and the PA in a single CMOS chip. While digital circuits benefit from the technology scaling, it is becoming significantly harder to meet the stringent requirements on linearity, output power, and power efficiency of PAs at lower supply voltages. This has recently triggered extensive studies to investigate the impact of different circuit techniques, design methodologies, and design trade-offs on functionality of PAs in nanometer CMOS technologies.

This thesis addresses the potential of integrating linear and highly efficient PAs and PA architectures in nanometer CMOS technologies at GHz frequencies. In total four PAs have been designed, two linear PAs and two switched PAs. Two PAs have been designed in a 65nm CMOS technology, targeting the 802.11n WLAN standard operating in the 2.4-2.5GHz frequency band with stringent requirements on linearity. The first linear PA is a two-stage amplifier with LC-based input and interstage matching networks, and the second linear PA is a two-stage PA with transformer-based input and interstage matching networks. Both designs were evaluated for a 72.2Mbit/s, 64-QAM 802.11n OFDM signal with a PAPR of 9.1dB. Both PAs fulfilled the toughest EVM requirement of the standard at average output power levels of 9.4dBm and 11.6dBm, respectively. Matching techniques in both PAs are discussed as well.

Two Class-E PAs have been designed in 130nm CMOS and operated at low ‘digital’ supply voltages. The first PA is intended for DECT, while the second is intended for Bluetooth. At 1.5V supply voltage and 1.85GHz, the DECT PA delivered +26.4dBm of output power with a drain efficiency (DE) and power-added efficiency (PAE) of 41% and 30%, respectively. The Bluetooth PA had an output power of +22.7dBm at 1.0V with a DE and PAE of 48% and 36%, respectively, at 2.45GHz. The Class-E amplifier stage is also suitable for employment in different linearization techniques like Polar Modulation and Outphasing, where a highly efficient Class-E PA is crucial for a successful implementation.
Preface

This licentiate thesis presents my research during the period February 2007 through August 2009 at the Electronic Devices group, Department of Electrical Engineering, Linköping University, Sweden. The following papers are included in the thesis:


- **Paper 4 – Jonas Fritzin** and Atila Alvandpour, “Low Voltage Class-E Power Amplifiers for DECT and Bluetooth in 130nm CMOS,” in
My research has also included involvement in projects that have generated the following papers, which fall outside the scope of this thesis:


### Abbreviations

<table>
<thead>
<tr>
<th>Abbreviation</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>ACPR</td>
<td>Adjacent Channel Power Ratio</td>
</tr>
<tr>
<td>ADC</td>
<td>Analog-to-Digital Converter</td>
</tr>
<tr>
<td>BJT</td>
<td>Bipolar Junction Transistor</td>
</tr>
<tr>
<td>CMOS</td>
<td>Complementary Metal-Oxide-Semiconductor</td>
</tr>
<tr>
<td>CF</td>
<td>Crest Factor</td>
</tr>
<tr>
<td>DAC</td>
<td>Digital-to-Analog Converter</td>
</tr>
<tr>
<td>DC</td>
<td>Direct Current</td>
</tr>
<tr>
<td>DE</td>
<td>Drain Efficiency</td>
</tr>
<tr>
<td>DECT</td>
<td>Digital Enhanced Cordless Telecommunications</td>
</tr>
<tr>
<td>EVM</td>
<td>Error Vector Magnitude</td>
</tr>
<tr>
<td>FET</td>
<td>Field-Effect Transistor</td>
</tr>
<tr>
<td>GaAs</td>
<td>Gallium-Arsenide</td>
</tr>
<tr>
<td>GSM</td>
<td>Global System for Mobile communications</td>
</tr>
<tr>
<td>HBT</td>
<td>Heterojunction Bipolar Transistor</td>
</tr>
<tr>
<td>IC</td>
<td>Integrated Circuit</td>
</tr>
<tr>
<td>IEEE</td>
<td>The Institute of Electrical and Electronics Engineers</td>
</tr>
<tr>
<td>ITRS</td>
<td>International Technology Roadmap for Semiconductors</td>
</tr>
<tr>
<td>LO</td>
<td>Local Oscillator</td>
</tr>
<tr>
<td>LC</td>
<td>Inductance-Capacitance</td>
</tr>
<tr>
<td>LNA</td>
<td>Low-Noise Amplifier</td>
</tr>
<tr>
<td>MMIC</td>
<td>Monolithic Microwave Integrated Circuit</td>
</tr>
<tr>
<td>MOS</td>
<td>Metal-Oxide-Semiconductor</td>
</tr>
<tr>
<td>MOSFET</td>
<td>Metal-Oxide-Semiconductor Field Effect Transistor</td>
</tr>
<tr>
<td>NMOS</td>
<td>N-channel Metal-Oxide-Semiconductor</td>
</tr>
<tr>
<td>PA</td>
<td>Power Amplifier</td>
</tr>
<tr>
<td>PAE</td>
<td>Power-Added Efficiency</td>
</tr>
<tr>
<td>PAPR</td>
<td>Peak-to-Average Power Ratio</td>
</tr>
<tr>
<td>PCB</td>
<td>Printed Circuit Board</td>
</tr>
<tr>
<td>PMOS</td>
<td>P-channel Metal-Oxide-Semiconductor</td>
</tr>
<tr>
<td>RF</td>
<td>Radio-Frequency</td>
</tr>
<tr>
<td>RMS</td>
<td>Root-Mean-Square</td>
</tr>
<tr>
<td>VCO</td>
<td>Voltage-Controlled Oscillator</td>
</tr>
<tr>
<td>WLAN</td>
<td>Wireless Local Area Network</td>
</tr>
</tbody>
</table>
Acknowledgments

There are several people who deserve credit for making it possible for me to write this thesis. I would like to thank the following people and organizations:

- My supervisor and advisor Professor Atila Alvandpour, for your guidance, patience, and support. Thanks for giving me the opportunity to pursue a career as a Ph.D. student.

- Professor Christer Svensson for interesting discussions, giving valuable comments and sharing his experience.

- Our secretary Anna Folkesson for taking care of all administrative issues, and Arta Alvandpour for solving all computer-related issues.

- I want to thank Dr. Martin “Word” Hansson for being an excellent colleague and friend, for his assistance with Cadence, for providing the Word template for this thesis, for pulling me to the gym in the morning, and for proofreading this thesis.

- Dr. Henrik Fredriksson deserves a great deal of thanks for all help and useful discussions about all kinds of things, both work and non-work related, and for proofreading this thesis and contributing many useful suggestions for improvement.

- I want to thank M.Sc. Timmy Sundström for excellent collaboration during student labs and graduate courses, and for being a great friend.

- I want to thank Adj. Prof. Ted Johansson for being my supervisor during my internship at Infineon in the initial phase of my Ph.D. studies. I also appreciate his help in reviewing numerous manuscripts whenever needed, as well as all discussions regarding power amplifier design.

- Infineon Technologies Nordic AB, Sweden, and Infineon Technologies AG, Germany, deserve a great deal of thanks for sponsoring the chip tape-outs in their CMOS technologies. Intel Corporation, USA, is also acknowledged for sponsoring the research projects.

- Dr. Rashad Ramzan, Dr. Naveed Ahsan, and M.Sc. Shakeel Ahmad for all discussions regarding RF circuits and measurements.

- I greatly appreciate the generous support from Rohde & Schwarz, Stockholm, in lending us equipment, and the assistance of Henrik Karlström, Johan Brobäck, and Anders Sundberg. I would also like to thank Ronny Peschel and Thomas Göransson at Agilent Technologies, Kista, for kindly letting us borrow equipment whenever needed.

- All my friends for enriching my out-of-work life.

- My sweet brothers Joakim and Johan for all discussions about things not related to science and technology.

- Last, but not least, my wonderful parents Jörn and Berit Fritzin for always encouraging and supporting me in whatever I do.

To those of you whom I have forgotten and who feel that you deserve thanks: I thank you.

Jonas Fritzin

Linköping, August 2009
Contents

Abstract
Preface
Abbreviations
Acknowledgments
Contents
List of Figures

Part I Background

Chapter 1 Introduction
  1.1 Motivation and Scope of this Thesis
  1.2 Organization of this Thesis
  1.3 Brief History of RF Technology
  1.4 History of Transistors and Integrated Circuits
  1.5 The Wireless Evolution
  1.6 Future Possibilities and Challenges
  1.7 Semiconductor Materials
    1.7.1 Scaling Trend of CMOS
    1.7.2 Comparison of CMOS and Other Semiconductors
  1.8 References

Chapter 2
  2.1 Introduction
  2.2 The MOS Device
    2.2.1 Structure
    2.2.2 I/V Characteristics of the MOS transistor
    2.2.3 Small-Signal Model
  2.3 References

Chapter 3
  3.1 Introduction
  3.2 Power Amplifier Fundamentals
    3.2.1 Output Power
    3.2.2 Gain and Efficiency
    3.2.3 Peak Output Power, Crest Factor, and Peak to Average Power Ratio
    3.2.4 Power Amplifier Drain Efficiency for Modulated Signals
    3.2.5 Linearity
  3.3 Power Amplifier Topologies
    3.3.1 Class-A
    3.3.2 Class-B and AB
    3.3.3 Class-C
    3.3.4 Class-D
    3.3.5 Class-E
    3.3.6 Class-F
  3.4 Linearization of Non-Linear Power Amplifiers
  3.5 References

Chapter 4
  4.1 Introduction
  4.2 Conjugate and Power Match
  4.3 Load-pull
  4.4 Matching Network Design
    4.4.1 L-Match
    4.4.2 Balun
  4.5 Input and Interstage Matching
    4.5.1 LC-Based Matching Network

  8.2 Design and Implementation of the Power Amplifiers
  8.3 Experimental Results
  8.4 Summary
  8.5 Acknowledgement
  8.6 References
List of Figures

<table>
<thead>
<tr>
<th>Figure</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>1.1</td>
<td>Application spectrum and semiconductors likely to be used today [18]</td>
</tr>
<tr>
<td>2.1</td>
<td>Schematic and cross section views of an NMOS transistor</td>
</tr>
<tr>
<td>2.2</td>
<td>Small-signal model of MOS transistor [3]</td>
</tr>
<tr>
<td>2.3</td>
<td>Extrinsic capacitances in the MOS transistor</td>
</tr>
<tr>
<td>2.4</td>
<td>Extrinsic elements added to the small-signal model in Figure 2.2</td>
</tr>
<tr>
<td>2.5</td>
<td>Vertical gate resistance</td>
</tr>
<tr>
<td>2.6</td>
<td>Circuit to estimate $\omega_T$</td>
</tr>
<tr>
<td>3.1</td>
<td>Block diagram of a direct-conversion transmitter</td>
</tr>
<tr>
<td>3.2</td>
<td>Power amplifier (PA) with two driver stages, A1 and A2, connected to an antenna</td>
</tr>
<tr>
<td>3.3</td>
<td>DE for normalized output amplitude in a Class-A amplifier</td>
</tr>
<tr>
<td>3.4</td>
<td>(a) Intermodulation spectrum of two-tone test (b) Gain compression curve</td>
</tr>
<tr>
<td>3.5</td>
<td>(a) Spectral mask for transformer-based PA [12] (b) Vector definitions in EVM</td>
</tr>
<tr>
<td>3.6</td>
<td>Generic single-stage power amplifier</td>
</tr>
<tr>
<td>3.7</td>
<td>Drain voltage and current waveforms in an ideal Class-A</td>
</tr>
<tr>
<td>3.8</td>
<td>Drain voltage and current waveforms in an ideal Class-B</td>
</tr>
<tr>
<td>3.9</td>
<td>Class-D amplifier</td>
</tr>
<tr>
<td>3.10</td>
<td>Class-D amplifier waveforms</td>
</tr>
<tr>
<td>3.11</td>
<td>Schematic of CMOS inverter including dynamic currents</td>
</tr>
<tr>
<td>3.12</td>
<td>Class-E amplifier</td>
</tr>
<tr>
<td>3.13</td>
<td>Normalized drain voltage and current waveforms in an ideal Class-E [27]</td>
</tr>
<tr>
<td>3.14</td>
<td>Simulation results of drain voltage ($v_{DS}$), driver signal ($v_{DRIVE}$), and output voltage ($v_{OUT}$)</td>
</tr>
<tr>
<td>3.15</td>
<td>Simulation current waveforms: drain current ($i_D$), current through shunt capacitance ($i_C$), output current ($i_{OUT}$)</td>
</tr>
<tr>
<td>3.16</td>
<td>Class-F amplifier</td>
</tr>
<tr>
<td>3.17</td>
<td>Class-F amplifier waveforms</td>
</tr>
</tbody>
</table>
Figure 6.6: Simplified schematic of output matching network.
Figure 6.7: RF performance (Pin, Pout, Gain).
Figure 6.8: RF performance (Pin av, Pout av, EVM).
Figure 6.9: Spectral mask and measured peak spectrum at an average output power of 17dBm with an EVM of 13.1%.
Figure 7.1: Matching networks for a differential two-stage PA.
Figure 7.2: Simplified schematic of the LC-based PA.
Figure 7.3: Simplified schematic of the transformer-based PA.
Figure 7.4: Ideal transformer model.
Figure 7.5: Chip photos of the LC-based (a) and transformer-based (b) PAs.
Figure 7.6: Output matching network.
Figure 7.7: RF performance (Pin, Pout, EVM) for the LC-based PA.
Figure 7.8: RF performance (Pin, Pout, EVM) for the transformer-based PA.
Figure 7.9: Spectral mask and measured peak output spectrum for the LC-based (a) and transformer-based (b) PAs.
Figure 8.1: Simplified schematic of the PAs (single-ended section).
Figure 8.2: Chip photos: DECT (a) PA and Bluetooth (b) PA.
Figure 8.3: DECT PA: Pout, DE, and PAE: VDD_1 = VDD_2 at 1.85GHz.
Figure 8.4: DECT PA: Pout, DE, and PAE: VDD_1 = VDD_2 = 1.5V.
Figure 8.5: BT PA: Pout, DE, and PAE at 2.45GHz.
Figure 8.6: BT PA: Pout, DE, and PAE: VDD_1 = 0.75, VDD_2 = VDD_3 = 1V.
Figure 8.7: (a) Spectral measurement DECT PA for +26.4dBm. (b) Spectral measurement Bluetooth PA for +22.7dBm.
Part I

Background
Chapter 1

Introduction

1.1 Motivation and Scope of this Thesis

The wireless market has experienced a remarkable development and growth since the introduction of the first mobile phone systems, with a steady increase in the number of subscribers, new application areas, and higher data rates. As mobile phones and the integration of wireless connectivity have become consumer mass markets, a prime goal of the IC manufacturers is to provide low-cost solutions.

CMOS has long been the technology of choice for digital integrated circuits due to its high level of integration, low cost, and constant enhancements in performance. RF circuits have predominantly been designed in GaAs [1] and silicon bipolar technologies, due to their better performance at radio frequencies. However, due to the significant scaling of the MOS transistors, the transition frequency has been pushed beyond one hundred GHz. Along with these enhancements in speed, MOS transistors have seen increasing use in RF applications. The digital baseband circuits have successfully been integrated in CMOS, as well as most radio building blocks, but there is still one missing piece to be efficiently integrated in CMOS, and that is the power amplifier (PA). To lower the cost and to achieve full integration of a radio System-on-Chip (SoC), it is desirable to integrate the entire transceiver and the PA in a single CMOS chip. Since the PA is the most power-hungry component in the transmitter, it is important to minimize its power consumption to achieve a highly power-efficient and low-cost radio SoC.

However, with the scaling of CMOS transistors, restrictions on the supply voltage must be enforced due to the scaling of the gate oxide, which poses further challenges in terms of reliability, efficiency, linearity, and output power. Not only are the transistors scaled; the interconnects are scaled as well, which introduces more losses.

This thesis addresses the potential of integrating linear and highly efficient PA architectures in nanometer CMOS technologies at GHz frequencies. In total four PAs have been designed, two linear PAs and two switched PAs. Two PAs have been designed in a 65nm CMOS technology, targeting the 802.11n WLAN standard operating in the 2.4-2.5GHz frequency band with stringent requirements on linearity. The PA described in Paper 1 is a two-stage amplifier with LC-based input and interstage matching networks, and the second PA described in Paper 2 is a two-stage PA with transformer-based input and interstage matching networks. Both designs were evaluated for a 72.2Mbit/s, 64-QAM 802.11n OFDM signal with a PAPR of 9.1dB. Both PAs fulfilled the toughest EVM requirement of the standard at average output power levels of 9.4dBm and 11.6dBm, respectively. Matching techniques in both PAs are discussed in Paper 3.

In Paper 4, two Class-E PAs have been designed in 130nm CMOS and operated at low ‘digital’ supply voltages. The first PA is intended for DECT, while the second is intended for Bluetooth. At 1.5V supply voltage and 1.85GHz, the DECT PA delivered +26.4dBm of output power with a DE and PAE of 41% and 30%, respectively. The Bluetooth PA provided an output power of +22.7dBm at 1.0V and 2.45GHz with a DE and PAE of 48% and 36%, respectively. The Class-E amplifier stage is also suitable for use in linearization techniques such as Polar Modulation and Outphasing, where a highly efficient Class-E PA is crucial for a successful implementation. These two architectures are further described in the thesis.
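
The power figures quoted above can be related to each other with a few lines of arithmetic. The following is a minimal sketch in Python and is not part of the original work: the helper names are hypothetical, the peak power assumes that the PAPR is simply added to the average power in dB, and the DC power estimate assumes the common definition DE = P_out / P_DC for the stage to which the quoted DE refers.

```python
def dbm_to_mw(p_dbm: float) -> float:
    """Convert a power level in dBm to milliwatts."""
    return 10.0 ** (p_dbm / 10.0)


def peak_power_dbm(p_avg_dbm: float, papr_db: float) -> float:
    """Peak envelope power implied by an average power and a PAPR (both in dB units)."""
    return p_avg_dbm + papr_db


def dc_power_mw(p_out_dbm: float, drain_efficiency: float) -> float:
    """DC power in mW, assuming DE = P_out / P_DC (drain_efficiency given as a fraction)."""
    return dbm_to_mw(p_out_dbm) / drain_efficiency


if __name__ == "__main__":
    # Linear 802.11n PAs: average output power and the 9.1 dB PAPR of the test signal.
    for name, p_avg in (("LC-based", 9.4), ("transformer-based", 11.6)):
        print(f"{name}: peak power = {peak_power_dbm(p_avg, 9.1):.1f} dBm")

    # Class-E PAs: output power and drain efficiency quoted in the text.
    for name, p_out, de in (("DECT", 26.4, 0.41), ("Bluetooth", 22.7, 0.48)):
        print(f"{name}: P_out = {dbm_to_mw(p_out):.0f} mW, "
              f"estimated P_DC = {dc_power_mw(p_out, de):.0f} mW")
```

Under these assumptions, +26.4dBm corresponds to roughly 0.44W and +22.7dBm to roughly 0.19W, while a 9.1dB PAPR places the signal peaks at about 18.5dBm and 20.7dBm for the two linear PAs.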
1.2 Organization of this Thesis

This thesis is organized into two parts:

- Part I - Background
- Part II - Papers

Part I provides the background for the concepts used in the papers. The remainder of Chapter 1 discusses the background of RF technology, the history of integrated circuits, and future challenges in RF CMOS circuit design with emphasis on PA design. Chapter 2 treats the fundamental operation of the transistor, as well as the impact of scaling on the performance of the MOS device. Chapter 3 introduces many concepts and definitions used in power amplifiers and describes the fundamental operation of several amplifier classes. Chapter 4 describes the matching techniques used in the amplifiers implemented in Paper 1, Paper 2, and Paper 4. Because the devices are very large, layout parasitics need to be extracted, and the chapter covers this topic as well. Finally, Part II presents the papers included in this thesis in full.

1.3 Brief History of RF Technology

The advances seen today in wireless communication technology and integrated circuit design were made possible by a series of inventions and discoveries during the last two hundred years. This section aims to briefly highlight the main contributions and milestones that have made the electronics and wireless revolution possible and a natural part of our lives. Furthermore, semiconductor technologies and their current performance are compared, along with their potential performance in the future.

Static electricity was discovered as early as 600 BC [2], but no one was able to demonstrate a persistent current until Alessandro Volta demonstrated the battery in 1800. Twenty years later, Hans Ørsted discovered the relationship between current and magnetism. The description of electromagnetic phenomena was further refined and developed by James Clerk Maxwell and eventually formulated as “Maxwell’s equations” in 1864. Meanwhile, the communication era had taken its first steps, as the first electric telegraph was developed in 1837. About forty years later, in 1876, the first telephone was patented by A. G. Bell. The birth of wireless communication can be dated back to 1895, when Guglielmo Marconi managed to transmit a radio signal over more than a kilometer with a spark-gap transmitter, which was followed by the first transatlantic radio transmission in 1902. The first analog mobile phone system in Scandinavia, the Nordic Mobile Telephone System (NMT), appeared in the early 1980s. The NMT system was succeeded by the GSM system (Global System for Mobile communications; originally Groupe Spécial Mobile) in the ’90s, which has been followed by several new standards for long- and short-distance communications. This exciting development of the wireless area would not have been possible without the invention of the transistor.

1.4 History of Transistors and Integrated Circuits

Before the invention of the transistor, the vacuum tube was used. The vacuum tube could also amplify electrical signals and operate as a switch, but was limited by its lifetime, its fragility, and the standby power it required.

The initial step towards solid-state devices was taken in 1874, as Ferdinand
Braun discovered the metal-semiconductor contact, but it took another 51 years
(1925) until the Field-Effect Transistor (FET) was patented by the physicist
Julius Edgar Lilienfeld. In 1947, at Bell Labs in the US, a bipolar transistor
device was developed by John Bardeen, Walter Brattain, and William Shockley,
who received the Nobel Prize for their invention in 1956.

The first integrated circuit (IC) was developed in 1958 by Jack Kilby, working at Texas Instruments, and consisted of a transistor, a capacitor, and resistors on a piece of germanium [3], [4]. The following year, Robert Noyce [5] invented the first IC with planar interconnects, using photolithography and etching techniques still used today. However, it took another few years (1963) until Frank Wanlass, at Fairchild Semiconductor, developed the Complementary-MOS (CMOS) process, which enabled the integration of both NMOS and PMOS transistors on the same chip. The first demonstration circuit, the inverter, reduced the power consumption to one-sixth of that of the equivalent bipolar and PMOS gates [6].

In 1965 Gordon Moore, one of Intel’s co-founders, predicted that the number of devices per chip would double every twelve months [7]. The prediction was modified in 1975 [8], so that complexity was expected to double every two years rather than every year, and became known as Moore’s law. In some people’s opinion, this prediction became a self-fulfilling prophecy that has emerged as one of the driving principles of the semiconductor industry, as engineers and researchers have been challenged to deliver annual breakthroughs to comply with the “law”.

Since the ’70s, progress in several areas has made it feasible to keep up the pace of electronics development and to deliver more reliable, complex, and high-performance integrated circuits. Recently, Intel announced its latest contribution to the microprocessor arena, the first microprocessor (codenamed Tukwila) with more than 2 billion transistors on a single die in a 65nm process [9]. This would not have been possible without the tremendous scaling of CMOS transistors.

1.5 The Wireless Evolution

As mass-consumer products require a low manufacturing cost, CMOS has been the preferred semiconductor technology, as it has been possible to integrate more and more functionality along with a constant increase in performance. Without the scaling of both the transistors and their manufacturing cost, high-technology innovations like portable computers and mobile phones would probably not have been realized [10]. Since the breakthrough of wireless technology and mobile phones in the ’80s, the development of mobile connectivity has been driven by a number of key factors such as new applications, flexibility, integration, and, not least, cost.

The evolution from the GSM system in the ’90s, with raw data rates of some kbps, to today’s high-speed WLAN 802.11n, with data rates of several hundred Mbps, has made it feasible not only to transmit voice data but also to transmit and receive pictures and movies. The significant increase in data rates has been made possible by several enhancements, not only at the device level but also through the development of more complex modulation schemes. The modulation schemes have evolved from the Gaussian Minimum-Shift Keying (GMSK) modulation used in GSM to amplitude and phase modulations with large PAPR, as in the WLAN systems, to support higher data rates, which in turn requires highly linear transmitters.

From the early days of the spark-gap transmitters, the radio architectures have evolved into architectures called transceivers, including both the transmitter and receiver sections. The digital baseband (DB) circuits, the LO, the mixer, the LNAs [11], the ADCs, and the DACs have successfully been implemented in CMOS and BiCMOS technologies [12]. However, there is still one missing piece to be efficiently integrated in CMOS together with the DB and radio building blocks, and that is the PA. It has predominantly been designed in other technologies that offer higher efficiency [1], such as GaAs HBTs [13] and Metal-Semiconductor Field Effect Transistors (MESFET) [14], Si BJT, or SiGe HBT [15] for mobile handsets. One of the first CMOS RF PAs capable of delivering 1W of output power was presented in 1997 [16]; it was implemented in a 0.8µm technology and operated at 824-849MHz. In the following year, a PA [17] implemented in a 0.35µm technology was presented, operating at 2GHz with an output power of 1W.

1.6 Future Possibilities and Challenges

Since the early 1990s, silicon devices have had good enough performance for transceiver design [18]. Their combination of low cost and integration capability will make CMOS/BiCMOS technologies the choice for RF transceivers with fully integrated PAs, as long as RF and system design goals can be achieved. Figure 1.1 [18] shows an application spectrum and which semiconductors are likely to be used in certain frequency ranges. The frequency spectrum is limited to 94GHz, but both Indium Phosphide (InP) High Electron Mobility Transistors (HEMT) and Gallium Arsenide (GaAs) Metamorphic High Electron Mobility Transistors (MHEMT) have shown acceptable performance in the THz regime and can be expected to continue to dominate extremely high frequency applications.

The main drivers of wireless communications systems today are cost, frequency bands, power consumption, functionality, size, volume of production, and standards. As wireless functionality has been integrated into more and more applications and entered mass-consumer markets, silicon and silicon-germanium have steadily replaced the traditional III-V semiconductors wherever acceptable RF performance has been met. In Figure 1.1 we can see that silicon has conquered the frequencies up to 28GHz; however, several PAs at 60GHz [19] and even up to 150GHz [20] have already been presented. This demonstrates the rapid development of silicon technologies, and how Figure 1.1 may lose its significance as silicon targets frequencies previously completely dominated by III-V semiconductors. Rather, the discussion will concern output power, PAE, integration, and linearity. Currently, the market for WLAN transceivers is dominated by CMOS, where fully integrated solutions, including the PA, have been presented [21], [22]. Silicon-based technologies will be the choice for high-volume and cost-sensitive markets, but are not expected to be the choice when the key demands are very high gain, very high output power, and extremely low noise.

1.7 Semiconductor Materials

1.7.1 Scaling Trend of CMOS

As shown in Figure 1.1, a number of semiconductor materials exist that are suitable for RF circuit design. Considering the scaling trend of MOS devices in Table 1-1, we can foresee a reduction of the gate oxide thickness by almost a factor of two

<table>
<thead>
<tr>
<th>Year of production</th>
<th>2010</th>
<th>2013</th>
<th>2016</th>
<th>2019</th>
<th>2022</th>
</tr>
</thead>
<tbody>
<tr>
<td>Technology node [nm]</td>
<td>45</td>
<td>32</td>
<td>22</td>
<td>16</td>
<td>11</td>
</tr>
<tr>
<td>Thin oxide device</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>- Nominal VDD [V]</td>
<td>1.0</td>
<td>1.0</td>
<td>0.8</td>
<td>0.8</td>
<td>0.7</td>
</tr>
<tr>
<td>- $t_{ox}$ [nm]</td>
<td>1.5</td>
<td>1.2</td>
<td>1.1</td>
<td>1.0</td>
<td>0.8</td>
</tr>
<tr>
<td>- Peak $f_T$ (GHz)</td>
<td>280</td>
<td>400</td>
<td>550</td>
<td>730</td>
<td>870</td>
</tr>
<tr>
<td>- Peak $f_{max}$ (GHz)</td>
<td>340</td>
<td>510</td>
<td>710</td>
<td>960</td>
<td>1160</td>
</tr>
<tr>
<td>- $I_d$ (µA/µm): min L</td>
<td>8</td>
<td>6</td>
<td>4</td>
<td>3</td>
<td>2</td>
</tr>
<tr>
<td>Thick oxide device</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>- Nominal VDD [V]</td>
<td>1.8</td>
<td>1.8</td>
<td>1.8</td>
<td>1.5</td>
<td>1.5</td>
</tr>
<tr>
<td>- $t_{ox}$ [nm]</td>
<td>3</td>
<td>3</td>
<td>3</td>
<td>2.6</td>
<td>2.6</td>
</tr>
<tr>
<td>- Peak $f_T$ (GHz)</td>
<td>50</td>
<td>50</td>
<td>50</td>
<td>70</td>
<td>70</td>
</tr>
<tr>
<td>- Peak $f_{max}$ (GHz)</td>
<td>90</td>
<td>90</td>
<td>90</td>
<td>120</td>
<td>120</td>
</tr>
<tr>
<td>Passives: Power Amplifier</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>- Inductor Q (1GHz, 5nH)</td>
<td>14</td>
<td>18</td>
<td>18</td>
<td>18</td>
<td>18</td>
</tr>
<tr>
<td>- Capacitor Q (1 GHz, 10pF)</td>
<td>&gt;100</td>
<td>&gt;100</td>
<td>&gt;100</td>
<td>&gt;100</td>
<td>&gt;100</td>
</tr>
<tr>
<td>- RF cap. density (fF/µm²)</td>
<td>5</td>
<td>7</td>
<td>10</td>
<td>10</td>
<td>12</td>
</tr>
</tbody>
</table>
and a reduction of the gate length by a factor of four for the thin oxide devices [18] in the next ten years, leading to potentially extreme $f_T$. Due to the very thin oxide and the low supply voltages of these devices, PA designs are more likely to use the thick oxide (I/O) devices or a combination of both device types. The expected trend for the thick oxide devices is not as extreme as for the thin oxide devices, as they are expected to have an oxide thickness of 2.6nm in ten years, comparable to existing thick oxide devices today.
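
The scaling factors quoted in the text can be checked directly against the thin oxide rows of Table 1-1. The short Python sketch below is illustrative only: the list and function names are hypothetical, the values are transcribed from the table, and the technology node is used as a stand-in for the gate length, which is an assumption rather than something the table states.

```python
# Thin oxide device data transcribed from Table 1-1 (2010 through 2022).
YEARS = [2010, 2013, 2016, 2019, 2022]
NODE_NM = [45, 32, 22, 16, 11]            # technology node, used as a proxy for gate length
T_OX_NM = [1.5, 1.2, 1.1, 1.0, 0.8]       # gate oxide thickness
PEAK_FT_GHZ = [280, 400, 550, 730, 870]   # peak transition frequency


def reduction_factor(values):
    """How many times smaller the last entry is than the first."""
    return values[0] / values[-1]


def growth_factor(values):
    """How many times larger the last entry is than the first."""
    return values[-1] / values[0]


if __name__ == "__main__":
    span = YEARS[-1] - YEARS[0]
    print(f"Over {span} years:")
    print(f"  oxide thickness shrinks by {reduction_factor(T_OX_NM):.2f}x")
    print(f"  technology node shrinks by {reduction_factor(NODE_NM):.2f}x")
    print(f"  peak f_T grows by {growth_factor(PEAK_FT_GHZ):.2f}x")
```

With the table values, the oxide thickness shrinks by a factor of about 1.9 and the node by about 4.1, consistent with the factors of "almost two" and "four" given in the text, while the peak $f_T$ grows by roughly a factor of 3.1.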

1.7.2 Comparison of CMOS and Other Semiconductors
As GaAs was one of the first semiconductors used in RF design and is still used
in most terminal PAs, a short comparison of the specific properties of III-V
compounds and CMOS is given here. Table 1-2 shows the key characteristics of
the basic materials of the most common MMIC technologies.

When a low to moderate electric field [12] is applied to the device, the
carrier velocity is higher for electrons than for holes. The difference

<table>
<thead>
<tr>
<th></th>
<th>Silicon</th>
<th>SiC</th>
<th>InP</th>
<th>GaAs</th>
<th>GaN</th>
</tr>
</thead>
<tbody>
<tr>
<td>Electron mobility at 300K [cm$^2$/Vs]</td>
<td>1500</td>
<td>700</td>
<td>5400</td>
<td>8500</td>
<td>1000-2000</td>
</tr>
<tr>
<td>Hole mobility at 300K [cm$^2$/Vs]</td>
<td>450</td>
<td>n.a.</td>
<td>150</td>
<td>400</td>
<td>n.a.</td>
</tr>
<tr>
<td>Peak/saturated electron velocity [10$^7$ cm/s]</td>
<td>1.0/1.0</td>
<td>2.0</td>
<td>2.0</td>
<td>2.1/n.a.</td>
<td>2.1/1.3</td>
</tr>
<tr>
<td>Peak/saturated hole velocity [10$^7$ cm/s]</td>
<td>1.0/1.0</td>
<td>n.a.</td>
<td>n.a.</td>
<td>n.a.</td>
<td>n.a.</td>
</tr>
<tr>
<td>Bandgap [eV]</td>
<td>1.1</td>
<td>3.26</td>
<td>1.35</td>
<td>1.42</td>
<td>3.49</td>
</tr>
<tr>
<td>Critical breakdown field [MV/cm]</td>
<td>0.3</td>
<td>3.0</td>
<td>0.5</td>
<td>0.4</td>
<td>3.0</td>
</tr>
<tr>
<td>Thermal conductivity [W/(cm K)]</td>
<td>1.5</td>
<td>4.5</td>
<td>0.7</td>
<td>0.5</td>
<td>&gt;1.5</td>
</tr>
<tr>
<td>$\varepsilon_r$ constant</td>
<td>11.8</td>
<td>10.0</td>
<td>12.5</td>
<td>12.8</td>
<td>9</td>
</tr>
<tr>
<td>Substrate resistance [Ωcm]</td>
<td>1-20</td>
<td>1-20</td>
<td>&gt;1000</td>
<td>&gt;1000</td>
<td>&gt;1000</td>
</tr>
<tr>
<td>Number of transistors in IC</td>
<td>&gt;1 billion</td>
<td>&lt;200</td>
<td>&lt;500</td>
<td>&lt;1000</td>
<td>&lt;50</td>
</tr>
<tr>
<td>Transistors</td>
<td>MOSFET,</td>
<td>MESFET,</td>
<td>MESFET,</td>
<td>MESFET,</td>
<td>MESFET,</td>
</tr>
<tr>
<td></td>
<td>Bipolar,</td>
<td>HEMT,</td>
<td>HEMT,</td>
<td>HEMT,</td>
<td>HEMT,</td>
</tr>
<tr>
<td></td>
<td>HBT</td>
<td>HBT</td>
<td>HBT</td>
<td>HBT</td>
<td>HEMT</td>
</tr>
<tr>
<td>Costs prototype, mass</td>
<td>High,</td>
<td>Very high,</td>
<td>High,</td>
<td>Low,</td>
<td>Very high,</td>
</tr>
<tr>
<td>fabrication</td>
<td>low</td>
<td>n.a.</td>
<td>very high</td>
<td>high</td>
<td>n.a.</td>
</tr>
</tbody>
</table>

between electrons and holes is much larger in III-V devices (e.g. GaAs) than in silicon devices, and the electron velocity is lower in silicon. However, because of the large difference in carrier velocity between complementary III-V devices, and the lower hole velocity in GaAs, silicon technologies are better suited for high-speed complementary logic.

Another important speed metric is the mobility ($\mu$), defined in (1.1), which describes how quickly the carrier velocity, and thus the associated current, changes with the applied electrical field ($E$). The mobility therefore determines the time it takes to approach the maximum velocity [12]. This property is very important in switches and RF circuits, where currents must be controlled quickly to operate at high speeds. The electron mobility of GaAs is higher than that of silicon. However, the hole mobility is higher in silicon than in GaAs, and the gap between electron and hole mobility is smaller in silicon than in GaAs. Consequently, n-type GaAs devices are preferred for high-speed circuits as long as no complementary devices are needed.

\[ \mu = \frac{dv}{dE} \tag{1.1} \]

Another important parameter when considering complementary logic is the thermal conductivity, listed in Table 1-2. A low value makes it difficult to remove heat from the chip. Considering the two-billion-transistor processor [9], a good thermal conductivity of the substrate material is clearly necessary to make sure that the chip is not “vaporized”. The comparable integration level in GaAs is typically limited to around 1000 transistors [12].

One parameter that does not favor silicon is the substrate resistivity, which is relatively low compared to that of the III-V semiconductors and degrades the quality factor of integrated passives [24]. Table 1-1 gives the predicted quality factors at 1GHz of inductors and capacitors suitable for integration; the inductors will clearly continue to be the limiting factor in on-chip matching networks.

A common argument for using silicon and CMOS is cost and, as previously discussed, the relative speed of the electron- and hole-based devices makes silicon a preferable choice for complementary logic. To lower the cost even further, the PA can be integrated with the CMOS transceiver. BiCMOS solves the integration of the PA, but has an approximately 20% higher mask count [1] and therefore also a higher price for the same technology node. When comparing the manufacturing costs of MPW runs in GaAs and CMOS, several parameters have to be considered. Typically, GaAs has lower mask costs since fewer processing steps are needed, but when yield aspects are considered, the CMOS processes can use larger wafers, which makes CMOS processes favorable in mass fabrication [12].

The historical trend of CMOS scaling has enabled high-speed CMOS devices and, as seen in Table 1-1, the trend is expected to continue, but at the expense of continuously lower supply voltages. Due to their inherently better performance at RF frequencies, the III-V based devices do not have to be scaled as aggressively as the silicon devices, and thus the supply voltage and the associated RF output power of III-V technologies are larger [12], [15]. Therefore, III-V-based devices have dominated the market for terminal PAs. For higher output power, SiC, GaN, and also LDMOS have superior performance over the other devices, due to their much higher breakdown field and thermal conductivity.

Even though high-performance CMOS-based GSM/GPRS PAs [25] and fully-integrated WLAN CMOS transceivers with integrated front-ends [21], [22] have been reported recently, significant research is still needed in the field of CMOS power amplifiers. The challenges that lie ahead are the supply voltage reduction, the scaling of interconnects with increased losses as a result, and the scaling of the gate oxide, all of which pose further challenges in terms of efficiency, reliability, linearity, and output power.

1.8 References



[13]. SKY77328 iPAC™ PAM for Quad-Band GSM/GPRS, Skyworks.


Chapter 2

Background to CMOS Technology

2.1 Introduction

The electronic revolution would not have been feasible without the invention of the CMOS devices in the 1960s and the remarkable progress towards smaller and faster devices, leading to today’s fine-line nanometer CMOS technologies. When designing digital and analog integrated circuits, it is important to understand the possibilities and limitations of CMOS devices. At the same time, as the CMOS technologies are scaled down to nanometer dimensions, new phenomena arise and limit the predicted scaling [1], [2]. This chapter describes the fundamental operation of the MOS device and highlights the main limitations that appear when CMOS technologies are scaled.

2.2 The MOS Device

2.2.1 Structure

This section will describe the operation of an n-channel MOS device (NMOS), and since the main operation principles of the p-channel MOS device (PMOS) are the same, the reader is referred to [3] for further investigation.
Figure 2.1 shows a simplified structure of an NMOS, consisting of two heavily doped n-type regions in the substrate called the Source (S) and the Drain (D). Between the substrate and the Gate (G) there is an insulating layer made of silicon dioxide (SiO$_2$). The device is located in a p-substrate, referred to as the Bulk (B) or Body, which is typically connected to the lowest potential in the system in order to keep the source/drain junction diodes reverse-biased.

The region located between the drain and source, beneath the gate, is called the channel (with length $L$), even though a channel between the drain and the source only exists under certain biasing conditions at the four terminals. Furthermore, the extension of the source and drain regions perpendicular to the channel length is denoted as the width ($W$). The thickness of the oxide layer separating the channel and the gate is called $t_{ox}$ and is typically ~1.5-5.0nm.

### 2.2.2 I/V Characteristics of the MOS transistor

#### 2.2.2.1 Threshold Voltage

To get a basic understanding of the operation of the MOS transistor in Figure 2.1 a few assumptions are made. The bulk and the source are connected to the lowest potential in the circuit, i.e. ground, which means that $V_{SB} = 0$. The voltages gate-source ($V_{GS}$) and drain-source ($V_{DS}$) are assumed to be larger than zero.

The gate and the channel region, separated by the oxide, form a capacitor. As the gate voltage increases (from 0V), the electric field increases and the holes in the substrate are repelled from the channel region below the gate. At the same time, negative ions are left behind and electrons are attracted towards the surface of the channel.

Continuing to increase the gate voltage eventually leads to a channel region filled with negative charges. The channel is then in a state called “strong

![Figure 2.1: Schematic and cross section views of an NMOS transistor](image)

inversion” [4], and for $V_{DS} > 0$ there is a current flow between the drain and source. The gate voltage required for the channel region to reach this state is called the threshold voltage ($V_{th}$) of the MOS device. However, the bulk and source potential, which were previously assumed to be grounded, may modulate the threshold voltage according to (2.1).

$$V_{th} = V_{th0} + \gamma \left( \sqrt{V_{SB} + \phi_0} - \sqrt{\phi_0} \right) \tag{2.1}$$

In (2.1) $\phi_0$ represents the channel surface potential [4], and $\gamma$ is called the body-effect coefficient, which depends on a number of device parameters, like doping concentration and oxide thickness.
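The body effect in (2.1) can be illustrated with a minimal Python sketch. The default values for `v_th0`, `gamma`, and `phi_0` are illustrative assumptions only and are not taken from any particular process.

```python
import math

def threshold_voltage(v_sb, v_th0=0.40, gamma=0.40, phi_0=0.80):
    """Threshold voltage per (2.1); all default values are assumptions.

    v_sb   source-bulk voltage in V (phi_0 + v_sb must not be negative)
    v_th0  threshold voltage at v_sb = 0 in V
    gamma  body-effect coefficient in sqrt(V)
    phi_0  channel surface potential in V
    """
    return v_th0 + gamma * (math.sqrt(v_sb + phi_0) - math.sqrt(phi_0))

for v_sb in (0.0, 0.5, 1.0):
    print(f"V_SB = {v_sb:.1f} V -> V_th = {threshold_voltage(v_sb):.3f} V")
```

With these assumed values, the threshold voltage equals `v_th0` for a grounded source and bulk, and rises as the source potential is lifted above the bulk.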

The transition from being in the “off” state, to being “on” and conducting current, is not perfectly “switch-like”. Even if $V_{GS}$ is approximately equal to $V_{th}$, or even lower than $V_{th}$, a “weak inversion” layer may still exist and a small current between drain and source still flows for $V_{GS} < V_{th}$. Consequently, a too low threshold is not desirable as it leads to large power dissipation, but on the other hand a high $V_{th}$ leads to a slower device.

The difficulty of turning the transistor fully on and off has imposed constraints on the supply voltage and the threshold voltage, which therefore deviate from the ideal scaling theory [1] and the scaling principle addressed by Dennard in 1974 [2]. The scaling principle [2] aimed at scaling all physical dimensions and voltages of the transistor by a factor $\alpha$, in order to keep all internal fields constant.

2.2.2.2 Drain Current and Channel-Length Modulation

As the current in the channel originates from applying voltages that create electric fields and attract charge carriers under the gate, a natural conclusion is that the charge density varies along the channel region depending on the voltages applied at the terminals of the MOS device. By considering the charge distribution, the velocity of the charge carriers (electrons) in the channel, and the electric field along the channel, the current that flows between the drain and source can be defined according to (2.2) in a first-order model [5]. Setting the derivative of this current with respect to $V_{DS}$ to zero shows that the maximum current ($I_{DS,max}$) is reached when the drain-source voltage equals the gate-source voltage minus the threshold voltage, which gives (2.3). Together, (2.2) and (2.3) represent an I/V characteristic denoted as the long-channel model, where no effects due to the shrinking dimensions of the MOS devices are taken into account.

\[ I_{DS} = \mu_{n,0} C_{ox} \frac{W}{L} \left[ (V_{GS} - V_{th}) V_{DS} - \frac{1}{2} V_{DS}^2 \right], \quad V_{DS} < V_{GS} - V_{th} \tag{2.2} \]

\[ I_{DS,max} = \frac{1}{2} \mu_{n,0} C_{ox} \frac{W}{L} \left( V_{GS} - V_{th} \right)^2, \quad V_{DS} > V_{GS} - V_{th} \tag{2.3} \]

The point where \( V_{DS} \) becomes larger than \( V_{GS} - V_{th} \) also divides the transistor operation into two major regions, called the linear region (2.2) and the saturation region (2.3). Ideally, increasing the drain-source voltage in (2.3) would not increase the drain current. However, as the drain voltage is increased, the charge density vanishes at some point along the channel, as shown in Figure 2.1. At this point, where the inversion layer stops, the channel is said to be “pinched off”, and for increasing drain voltages this point moves closer to the source. The drain current in saturation (2.3) therefore changes according to the channel-length modulation coefficient (\( \lambda \)), as in (2.4).

\[ I_{DS} = \frac{1}{2} \mu_{n,0} C_{ox} \frac{W}{L} \left( V_{GS} - V_{th} \right)^2 \left[ 1 + \lambda V_{DS} \right] \quad (2.4) \]
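The long-channel model in (2.2)-(2.4) can be sketched as a small Python function. The lumped factor `k` (standing for $\mu_{n,0} C_{ox} W/L$) and the default values of `v_th` and `lam` are hypothetical and chosen only for illustration.

```python
def drain_current(v_gs, v_ds, v_th=0.4, k=2e-3, lam=0.1):
    """First-order NMOS drain current per (2.2)-(2.4).

    k = mu_n0 * C_ox * W / L in A/V^2; k, v_th and lam are assumed values.
    """
    v_ov = v_gs - v_th              # overdrive voltage V_GS - V_th
    if v_ov <= 0 or v_ds <= 0:
        return 0.0                  # weak inversion is ignored in this model
    if v_ds < v_ov:                 # linear region, (2.2)
        return k * (v_ov * v_ds - 0.5 * v_ds ** 2)
    # saturation region, (2.4); reduces to (2.3) when lam = 0
    return 0.5 * k * v_ov ** 2 * (1.0 + lam * v_ds)

for v_ds in (0.2, 0.6, 1.2):
    print(f"V_DS = {v_ds:.1f} V -> I_DS = {drain_current(1.0, v_ds) * 1e3:.3f} mA")
```

Note that, as written in (2.2) and (2.4), the two expressions only meet exactly at $V_{DS} = V_{GS} - V_{th}$ when $\lambda = 0$; the sketch keeps this small step rather than smoothing it.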

### 2.2.2.3 Mobility and Velocity Saturation

In (2.2)-(2.4), \( C_{ox} (= \varepsilon_r \varepsilon_0 / t_{ox}) \) represents the gate oxide capacitance per unit area, and \( \mu_{n,0} \) represents the mobility of the electrons in the channel. In the previous expressions for the current, it has been assumed that the longitudinal [3] electrical field between drain and source (\( \varepsilon \)) is relatively low, so that the carrier velocity is proportional to the field. However, as the channel length of the devices shrinks, this assumption no longer holds, and the charge carriers reach a “saturated velocity” \( (|V_d|_{\text{max}}) \). The corresponding critical field (\( \varepsilon_c \)) is defined in (2.5), and a correction factor that takes the effect into account is defined in (2.6) [3].

\[ \varepsilon_c = \frac{|V_d|_{\text{max}}}{\mu} \quad (2.5) \]

\[ I_{DS, \text{velocity saturation}} = \frac{I_{DS, \text{no velocity saturation}}}{1 + V_{DS} / (L \varepsilon_c)} \quad (2.6) \]
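As a rough numerical illustration of (2.5) and (2.6), the sketch below shows how the correction factor grows in importance as the channel length shrinks. The saturated velocity and mobility are assumed, order-of-magnitude values in SI units, not data from a specific technology.

```python
def critical_field(v_sat, mu):
    """Critical field per (2.5): eps_c = |v_d|max / mu."""
    return v_sat / mu

def velocity_saturated_current(i_ds_long, v_ds, length, eps_c):
    """Scale a long-channel current by the factor in (2.6)."""
    return i_ds_long / (1.0 + v_ds / (length * eps_c))

# Illustrative numbers (assumptions): v_sat = 1e5 m/s, mu = 0.04 m^2/Vs
eps_c = critical_field(1e5, 0.04)
for length in (1e-6, 130e-9, 65e-9):
    ratio = velocity_saturated_current(1.0, 1.0, length, eps_c)
    print(f"L = {length * 1e9:6.0f} nm -> fraction of long-channel current: {ratio:.2f}")
```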

As the longitudinal field increases, another effect comes into play, namely “hot carriers” [3]. As the electrons travel through the channel, they acquire a high kinetic energy due to the high electrical field. While the average velocity saturates at high electrical fields, the kinetic energy continues to increase for shorter gate lengths [5]. In the proximity of the drain, the electrons may have acquired significant amounts of energy and hit the silicon atoms at high speed. When the “hot” carriers hit the silicon, new electrons and holes are created; the holes are pulled towards the substrate, and the electrons move towards the drain. Some carriers may also acquire sufficient energy to be injected into the oxide and move to the gate, leading to an increase in gate current. As these carriers travel through the gate oxide, they may get trapped in it and damage it, which results in device degradation [6], [7].

Another field that increases as the transistors are scaled is the field between the gate and the channel, which at large gate voltages confines the electrons to a thinner region below the oxide [5] and leads to more carrier scattering and lower mobility (2.7). Besides degrading the current capability, mobility degradation affects the harmonics generated in the drain current. Supposing the term $\theta(V_{GS} - V_{th})$ in the denominator of (2.7) is small, an approximate expression for the drain current is given in (2.8).

\[ \mu_{n,\text{eff}} = \frac{\mu_{n,0}}{1 + \theta(V_{GS} - V_{th})} \tag{2.7} \]

\[
\begin{aligned}
I_{DS} &= \frac{1}{2} \frac{\mu_{n,0}}{1 + \theta(V_{GS} - V_{th})} C_{ox} \frac{W}{L} (V_{GS} - V_{th})^2 \left[ 1 + \lambda V_{DS} \right] \\
&\approx \frac{1}{2} \mu_{n,0} C_{ox} \frac{W}{L} \left[ 1 - \theta(V_{GS} - V_{th}) \right] (V_{GS} - V_{th})^2 \left[ 1 + \lambda V_{DS} \right] \\
&= \frac{1}{2} \mu_{n,0} C_{ox} \frac{W}{L} \left[ (V_{GS} - V_{th})^2 - \theta (V_{GS} - V_{th})^3 \right] \left[ 1 + \lambda V_{DS} \right]
\end{aligned}
\tag{2.8}
\]

From (2.8) it is clear that, for sinusoidal input signals, the drain current does not only contain even harmonics, as predicted by the square law, but odd harmonics as well [5], [8], which is a source of distortion in power amplifiers.
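The appearance of a third harmonic can be checked numerically with a minimal sketch. It drives the last expression in (2.8) with a sinusoidal overdrive voltage and extracts individual harmonics with a direct Fourier sum. The prefactor is normalized to one, $\lambda$ is set to zero, and the bias, amplitude, and $\theta$ values are assumptions chosen only for illustration.

```python
import math

def degraded_drain_current(v_ov, theta, k=1.0):
    """Last line of (2.8) with lambda = 0; k = 0.5*mu_n0*C_ox*W/L (assumed 1)."""
    return k * (v_ov ** 2 - theta * v_ov ** 3)

def harmonic_amplitude(theta, order, v_bias=0.3, amp=0.1, n=64):
    """Amplitude of one harmonic of I_DS for V_GS - V_th = v_bias + amp*cos(wt)."""
    re = im = 0.0
    for i in range(n):
        phase = 2.0 * math.pi * i / n
        i_ds = degraded_drain_current(v_bias + amp * math.cos(phase), theta)
        re += i_ds * math.cos(order * phase)
        im += i_ds * math.sin(order * phase)
    return 2.0 * math.hypot(re, im) / n

for theta in (0.0, 0.5):
    h2 = harmonic_amplitude(theta, 2)
    h3 = harmonic_amplitude(theta, 3)
    print(f"theta = {theta}: 2nd harmonic = {h2:.2e}, 3rd harmonic = {h3:.2e}")
```

For $\theta = 0$ only the second harmonic remains (apart from rounding noise), while for $\theta > 0$ the cubic term contributes a third harmonic of amplitude $k\,\theta\,a^3/4$, where $a$ is the drive amplitude.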

2.2.3 Small-Signal Model

The previous section described the large-signal DC operation of the transistor. However, in analog circuit design small-signal models have found widespread use, as they describe the linearized operation of the transistor at a specific DC bias point. The small-signal model presented here is based on [3], which takes into account all intrinsic capacitances, and also describes which extrinsic capacitances and resistances should be included.

2.2.3.1 Intrinsic Capacitances

The intrinsic model is obtained by independently applying small signal changes at the terminals of the device and identifying the changes in charges and currents in the device. By applying very small changes of the bias voltages at the device terminals, one at a time, and studying the effect on the drain current, an expression for the overall small change on the drain current can be expressed as

in (2.9). Note that the intrinsic modeling does not include the extension of the drain and source, or the overlay capacitance between the gate, drain, and source shown in Figure 2.3. In (2.9) the derivatives were replaced by a number of transconductances and the output conductance as defined in (2.10).

\[ i_{ds} \approx g_m v_{gs} + g_{mb} v_{bs} + g_{sd} v_{ds} \tag{2.9} \]

\[ g_m = \frac{\partial I_{DS}}{\partial V_{GS}} \bigg|_{V_{bs}, V_{ds}}, \quad g_{mb} = \frac{\partial I_{DS}}{\partial V_{BS}} \bigg|_{V_{gs}, V_{ds}}, \quad g_{sd} = \frac{\partial I_{DS}}{\partial V_{DS}} \bigg|_{V_{gs}, V_{bs}} \tag{2.10} \]

\( g_m \) represents the gate transconductance, usually just called the transconductance. \( g_{mb} \) represents the substrate transconductance, and \( g_{sd} \) the output conductance. Depending on how the transistor is biased, the transistor operates in different regions, and consequently the computation of the parameters depends on in what region the transistor operates. Assuming that the transistor operates in the saturation region, the transconductances and the output conductance can be computed according to (2.11)-(2.13).

\[ g_m \approx \frac{2I_{DS}}{V_{GS} - V_{th}} \approx \sqrt{\frac{2\mu C_{ox} W I_{DS}}{L}} \tag{2.11} \]

\[ g_{sd} \approx \frac{\mu C_{ox} W}{2L} (V_{GS} - V_{th})^2 \lambda \approx \lambda I_{DS} \tag{2.12} \]

\[ g_{mb} \approx \frac{g_m \gamma}{2\sqrt{\phi_0 + V_{SB}}} \tag{2.13} \]
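The saturation-region expressions (2.11)-(2.13) can be evaluated with a short Python sketch. The bias point and the values of `lam`, `gamma`, and `phi_0` are hypothetical and only meant to give a feeling for the relative magnitudes.

```python
import math

def small_signal_parameters(i_ds, v_ov, lam=0.1, gamma=0.4, phi_0=0.8, v_sb=0.0):
    """Saturation-region g_m, g_sd and g_mb per (2.11)-(2.13).

    i_ds is the bias current in A and v_ov = V_GS - V_th in V; lam, gamma and
    phi_0 are illustrative assumptions, not values from the text.
    """
    g_m = 2.0 * i_ds / v_ov                                # (2.11)
    g_sd = lam * i_ds                                      # (2.12)
    g_mb = g_m * gamma / (2.0 * math.sqrt(phi_0 + v_sb))   # (2.13)
    return g_m, g_sd, g_mb

g_m, g_sd, g_mb = small_signal_parameters(i_ds=10e-3, v_ov=0.2)
print(f"g_m = {g_m * 1e3:.1f} mS, g_sd = {g_sd * 1e3:.2f} mS, g_mb = {g_mb * 1e3:.1f} mS")
print(f"g_m / g_sd = {g_m / g_sd:.0f}")
```

The ratio $g_m / g_{sd}$ printed at the end is the intrinsic voltage gain of the device at this assumed bias point.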

Combining the transconductances and the intrinsic capacitances of the device, a small-signal model can be drawn as in Figure 2.2, where the output conductance is replaced by a resistor [3]. Regarding the transconductances, we can conclude that the substrate transconductance (\( g_{mb} \)) only comes into play when there is a difference in potential between the source and the substrate.

With increasing \( V_{DS} \), the output conductance degrades [3], which in turn leads to a higher output impedance of the device. However, as \( V_{DS} \) is further increased, the depletion region associated with the drain extends further into the substrate and affects the source depletion region. Due to this interaction, the difference in potential between drain and source is lowered, and results in a lower threshold voltage [9], [10]. This effect is called drain-induced barrier lowering (DIBL) [11] and counteracts the impact on the output impedance, due

Further increase of $V_{DS}$ leads to impact ionization, which lowers the output impedance [5].

These phenomena are important in analog circuit design, as the output conductance is directly related to intrinsic voltage gain of the transistor. In a typical power amplifier circuit, the voltage swings are large (especially in the

---

**Figure 2.2:** Small-signal model of MOS transistor [3]

**Figure 2.3:** Extrinsic capacitances in the MOS transistor
output stage), and therefore this phenomenon has an impact on the output impedance of the device. The same considerations apply to the intrinsic capacitors of the device, and have to be taken into account in the transistor model in order to achieve reliable simulation results.

2.2.3.2 Extrinsic Components

The extrinsic capacitances exist between all terminals and model effects like overlay capacitances ($C_{ov}$), fringing capacitances ($C_{fringe}$), capacitances related to the extension of the drain and source ($C_{bottom}$), and sidewall capacitances ($C_{sidewall}$), as in Figure 2.3. The capacitances are then added in parallel to the intrinsic small-signal model, as in Figure 2.4. However, the capacitance between drain and source is small and therefore neglected.

To achieve a complete model and to accurately predict power gain, input and output impedance, and phase delay between the current and the gate voltage, a number of resistive components should also be included at the drain, source, gate, and substrate. The resistive components at the drain and source typically depend on the resistivity in the regions and how the regions are contacted. The

---

**Figure 2.4: Extrinsic elements added to the small-signal model in Figure 2.2**
substrate resistance can be modeled by a single resistor up to frequencies of 10GHz [12], and as a parallel RC circuit for higher frequencies [13]. In order to reduce the resistance to ground, guard-rings are typically used.

The gate has predominantly been made of silicided poly-silicon with a resistance of up to 10Ω [14] per square ($R_{sq}$), and the lateral gate resistance ($R_{g,lateral}$) can be computed according to (2.14). If the gate is connected at one side, $\alpha$ becomes 1/3, and if it is connected at both sides, $\alpha$ is reduced to 1/12 [14].

\[
R_{g,lateral} = \alpha \frac{W}{L} R_{sq} \tag{2.14}
\]
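To give a feeling for the numbers in (2.14), the following sketch computes the lateral gate resistance of a device that is split into a number of identical parallel fingers. The finger model (each finger carrying an equal share of the total width, with all finger resistances in parallel) is a simplifying assumption, and the device dimensions are hypothetical.

```python
def lateral_gate_resistance(width, length, r_sq, fingers=1, both_sides=False):
    """Lateral gate resistance per (2.14) for a device split into parallel fingers.

    width is the total gate width; each finger has width / fingers and the
    finger resistances appear in parallel. alpha is 1/3 for a gate contacted
    at one side and 1/12 when contacted at both sides.
    """
    alpha = 1.0 / 12.0 if both_sides else 1.0 / 3.0
    r_finger = alpha * (width / fingers) / length * r_sq
    return r_finger / fingers

# Illustrative device (assumed values): W = 100 um, L = 65 nm, R_sq = 10 ohm
for fingers in (1, 10, 50):
    r_g = lateral_gate_resistance(100e-6, 65e-9, 10.0, fingers)
    print(f"{fingers:3d} fingers -> R_g,lateral = {r_g:8.2f} ohm")
```

Under this assumption the resistance falls with the square of the number of fingers, which is why wide RF devices are normally laid out as many short fingers.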

Another gate resistance component, which has not always been taken into account in transistor models, is the contact resistance [15] between the silicide and the poly-silicon in the MOS transistor gate, denoted as $R_{g,vertical}$ in Figure 2.5. Assuming the contact resistivity is $r_C$, the additional contact resistance can be computed according to (2.15). Since this contact resistance may be as large as the resistance in (2.14), and is expected [15] to be the dominant factor for technologies with gate lengths smaller than 0.35µm, it is important to include it in order to accurately predict transistor performance at higher frequencies. Both resistive contributions are usually represented by a single resistor computed as the sum [16] of (2.14) and (2.15), which has been considered in the amplifiers in Paper 1 and Paper 2. However, the gate resistance can be significantly reduced by using silicided gates, multiple contacts, and splitting the device into several parallel devices.

The silicided poly-silicon gate has, however, been replaced by metal gates in some recently developed 45nm CMOS processes [17]-[19]. The high-k metal gate makes it possible to fabricate gates with physically thicker oxides, but still with improved electrical properties. In [19] a reduction in gate leakage of more than 25X has been demonstrated. The reduced gate leakage is very important, as leakage power has grown to become a large portion of the total power consumption [9], [10] in microprocessors [20].

Two common figures of merit (FoM) of the transistor are the transition frequency ($f_T$) and the maximum oscillation frequency ($f_{max}$). The transition frequency is defined as the frequency at which the small-signal current gain equals unity when a DC source is connected between drain and source [3] (while neglecting the small current through $C_{gd}$). An estimate of $f_T$ (2.17) can be made by using the simplified circuit in Figure 2.6. The total capacitance seen from the gate to ground is defined as $C_g$ (2.16), including both intrinsic and extrinsic capacitances.

\[
C_g = C_{gs} + C_{gb} + C_{gd} \tag{2.16}
\]

\[
\left| \frac{i_D}{i_I} \right|_{v_{ds}=0} = \frac{g_m}{2\pi f_T C_g} = 1 \quad \Rightarrow \quad f_T = \frac{1}{2\pi} \frac{g_m}{C_g} = \frac{1}{2\pi} \frac{g_m}{C_{gs} + C_{gb} + C_{gd}} \tag{2.17}
\]
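The estimate in (2.16) and (2.17) is easily evaluated numerically. The transconductance and capacitance values in the sketch below are assumptions chosen only to give a plausible order of magnitude.

```python
import math

def transition_frequency(g_m, c_gs, c_gb, c_gd):
    """f_T per (2.16)-(2.17): g_m over 2*pi times the total gate capacitance."""
    c_g = c_gs + c_gb + c_gd          # (2.16)
    return g_m / (2.0 * math.pi * c_g)

# Illustrative values (assumptions): g_m in siemens, capacitances in farads
f_t = transition_frequency(g_m=0.1, c_gs=150e-15, c_gb=10e-15, c_gd=40e-15)
print(f"f_T = {f_t / 1e9:.1f} GHz")
```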

$f_{max}$ is also called the unity power gain frequency. When computing $f_{max}$, it is assumed that the transistor is conjugately matched at the input and output in order to compute the unilateral power gain [3], [21], and $f_{max}$ is defined as the frequency at which the power gain drops to unity. The relationship [3] between $f_T$ and $f_{max}$ is then

Figure 2.6: Circuit to estimate \(\omega_T\)
found in (2.18).

From (2.18), we can see how $f_{\text{max}}$ depends on the effective gate resistance ($R'_{ge}$) and how this resistance limits the usefulness of the device. However, as concluded in [12], $f_{\text{max}}$ is a small-signal parameter that presumes a conjugately matched output, which is not likely to be the case in the output stage of a power amplifier [8]. It can therefore only be used as a rough guide, mainly for the driver stages where the signal levels are smaller. The gate resistance can be reduced by several layout techniques, as previously described. Furthermore, impedance matching techniques are treated in Chapter 4.

2.3 References


[7]. C.D. Presti, F. Carrara, A. Scuderi, S. Lombardo, G. Palmisano, “Degradation Mechanisms in CMOS Power Amplifiers Subject to Radio-




Chapter 3

The RF Power Amplifier

3.1 Introduction

This chapter provides the fundamental knowledge and aspects of RF power amplifier design. Initially, the PA is considered in the context of a radio transmitter. Then several performance metrics of a PA, like output power, drain efficiency, and power-added efficiency, are introduced, followed by a description of the PA classes and how they operate. The PA classes covered are the transconductance amplifiers, like class A, AB, B, and C, and the switched-mode classes D, E, and F. Finally, the theory behind the linearization techniques Polar Modulation and Outphasing is described.

3.2 Power Amplifier Fundamentals

A typical transmitter includes a Digital Baseband (DB), DAC, Mixers (X), two phase-shifted LO signals, followed by the PA, and a matching network including filters. Typical transmitter configurations are two-step transmitters [1], direct-modulation transmitters, and direct-conversion transmitters as in Figure 3.1.
The signal to be transmitted \( (x_{BB}(t)) \), is initially processed by the DB and split into the in-phase (I) and quadrature (Q) channels, which are then upconverted to the RF carrier by the quadrature modulator, usually implemented by two mixers and two LO signals with a phase difference of 90 degrees.

\[
x_{BB}(t) = I(t) + jQ(t) \tag{3.1}
\]

\[
x(t) = r(t) \cos(\omega_c t + \phi(t)) \tag{3.2}
\]

\[
r(t) = \sqrt{I^2(t) + Q^2(t)} \tag{3.3}
\]

\[
\phi(t) = \arctan\left(\frac{Q(t)}{I(t)}\right) \tag{3.4}
\]
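The relations (3.2)-(3.4) between the I/Q representation and the envelope and phase of the RF signal can be sketched in Python as follows. The function names are hypothetical, and `math.atan2` is used in place of a plain arctangent as an implementation choice, since it keeps the correct quadrant and handles $I = 0$.

```python
import math

def iq_to_polar(i, q):
    """Envelope r and phase phi of one I/Q sample, following (3.3) and (3.4).

    math.atan2 is used instead of a plain arctan(Q/I) so that the quadrant is
    kept and I = 0 does not divide by zero (an implementation choice).
    """
    return math.hypot(i, q), math.atan2(q, i)

def rf_sample(i, q, f_c, t):
    """RF signal x(t) per (3.2) for a constant I/Q pair and carrier frequency f_c."""
    r, phi = iq_to_polar(i, q)
    return r * math.cos(2.0 * math.pi * f_c * t + phi)

r, phi = iq_to_polar(1.0, 1.0)
print(f"r = {r:.4f}, phi = {math.degrees(phi):.1f} degrees")
```

The same envelope and phase pair is the starting point for the Polar Modulation technique mentioned in the introduction to this chapter.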

Since the power of the output signal \( (x(t)) \), from the quadrature modulator usually is too low for radio transmission, the signal is amplified by the PA before being sent to the antenna. In the most ideal of worlds, the output of the PA is just an amplified version of the quadrature modulator output. A detailed description of transceiver design will not be covered by this chapter, but has been extensively discussed by a number of authors [2], [3].

### 3.2.1 Output Power

To define the output power \( (P_{out}) \) we consider the basic circuit in Figure 3.2, which shows a PA with two driver stages connected to an antenna. The output power is defined as the active power delivered to the load (antenna) at the fundamental frequency. Assuming that the load is purely resistive at the frequencies of interest, we can represent the antenna as a resistor \( (R_L) \), usually with a value of 50Ω. The matching network can, however, transform the load impedance to a higher or lower value, possibly with an imaginary part. This matter is discussed in the next chapter; for the moment we assume that the load can be represented by a resistance.

![Figure 3.1: Block diagram of a direct-conversion transmitter](image)

**Figure 3.2: Power amplifier (PA) with two driver stages, A1 and A2, connected to an antenna**

Based on the circuit in Figure 3.2, we can define the instantaneous output power at any particular moment as $P_{\text{out, inst}}$ (3.5) with an average power of $P_{\text{out, av}}$ (3.6). The PA also generates power at frequencies other than at the intended one, but these are neglected at the moment. We define $V_{\text{out, max}}$ as the sinusoidal amplitude (A) of the signal at the fundamental frequency and $V_{\text{out, rms}}$ as the corresponding rms value. Henceforth, the power generated at the fundamental frequency will be called $P_{\text{out}}$ (3.7).

$$P_{\text{out, inst}} = v_{\text{OUT}}(t)i_{\text{OUT}}(t) \quad (3.5)$$

$$P_{\text{out, av}} = \lim_{T \to \infty} \frac{1}{T} \int_{-T/2}^{T/2} P_{\text{out, inst}}(t) \, dt \quad (3.6)$$

$$P_{\text{out}} = \frac{A^2}{2R_L} = \frac{V_{\text{out, max}}^2}{2R_L} = \frac{V_{\text{out, rms}}^2}{R_L} \quad (3.7)$$
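As a quick numerical check of (3.7), the sketch below converts a sinusoidal output amplitude into output power and expresses it in dBm, the unit in which PA output power is usually quoted. The amplitudes are arbitrary example values and the 50Ω load is the default assumed in the text.

```python
import math

def output_power(v_out_max, r_load=50.0):
    """P_out in watts per (3.7) for a sinusoid with amplitude v_out_max."""
    return v_out_max ** 2 / (2.0 * r_load)

def watt_to_dbm(p_watt):
    """Power relative to 1 mW, expressed in dB."""
    return 10.0 * math.log10(p_watt / 1e-3)

for amplitude in (1.0, 3.3):
    p = output_power(amplitude)
    print(f"A = {amplitude} V -> P_out = {p * 1e3:.1f} mW = {watt_to_dbm(p):.1f} dBm")
```

The quadratic dependence on amplitude illustrates why a low supply voltage, which limits the available voltage swing, makes it hard to reach high output power directly into 50Ω.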

### 3.2.2 Gain and Efficiency

Considering the circuit in Figure 3.2 again, we introduce the input RF power ($P_{\text{in}}$) driving the whole amplifier chain. The Gain (G) is then defined as the ratio of the output power ($P_{\text{out}}$) to the input power ($P_{\text{in}}$), and is usually expressed in dB (3.8).

$$G_{\text{dB}} = 10 \log_{10} \left( \frac{P_{\text{out}}}{P_{\text{in}}} \right) \quad (3.8)$$

An important measure of the PA is the efficiency, as it directly affects the talk-time in handheld devices and has a significant impact on the electricity bill for base station PAs. One of the efficiency measures is the Drain Efficiency (DE) (3.9), which is defined as the ratio between the average output power at the fundamental \( P_{\text{out}} \) and the DC power consumption \( P_{\text{DC,drain}} \) of the very last stage in the amplifier chain of the PA. When the input power \( P_{\text{in}} \) needed to drive the amplifier chain is taken into account, we can define another efficiency metric called Power-Added Efficiency (PAE) (3.10), as the difference between the output power and the input power, divided by the total DC power consumption \( P_{\text{DC,tot}} \). The total DC power consumption includes the DC power consumed at the drain of the last stage and the DC power consumed by all other amplifier stages (A1 and A2).

\[
DE = \frac{P_{\text{out}}}{P_{\text{DC,drain}}} \tag{3.9}
\]

\[
PAE = \frac{P_{\text{out}} - P_{\text{in}}}{P_{\text{DC,tot}}} = \frac{P_{\text{out}} - P_{\text{in}}}{P_{\text{DC,drain}} + P_{\text{DC,A}}} = \frac{P_{\text{out}} - P_{\text{in}}}{P_{\text{DC,drain}} + \sum_{k=1}^{n} P_{\text{DC},A_k}} \tag{3.10}
\]
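The three metrics in (3.8)-(3.10) can be computed together as in the following sketch. The operating point (output power, input power, and DC power per stage) is a made-up example and does not correspond to any of the amplifiers in this thesis.

```python
import math

def gain_db(p_out, p_in):
    """Power gain in dB per (3.8)."""
    return 10.0 * math.log10(p_out / p_in)

def drain_efficiency(p_out, p_dc_drain):
    """DE per (3.9): output power over the DC power of the last stage."""
    return p_out / p_dc_drain

def power_added_efficiency(p_out, p_in, p_dc_drain, p_dc_drivers=()):
    """PAE per (3.10); p_dc_drivers lists the DC power of each driver stage."""
    return (p_out - p_in) / (p_dc_drain + sum(p_dc_drivers))

# Illustrative operating point (assumed values, all in watts)
p_out, p_in = 0.200, 0.002
p_dc_drain, p_dc_drivers = 0.400, (0.030, 0.070)
print(f"G   = {gain_db(p_out, p_in):.1f} dB")
print(f"DE  = {100 * drain_efficiency(p_out, p_dc_drain):.1f} %")
print(f"PAE = {100 * power_added_efficiency(p_out, p_in, p_dc_drain, p_dc_drivers):.1f} %")
```

Because the PAE subtracts the input power and includes the driver stages in the DC power, it can never exceed the DE for the same operating point.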

### 3.2.3 Peak Output Power, Crest Factor, and Peak to Average Power Ratio

Due to the development of modulation schemes utilizing both amplitude and phase modulation, e.g. WLAN, we need to introduce two new measures called Crest Factor (CF) and Peak-to-Average Power Ratio (PAPR), the latter sometimes also denoted as Peak-to-Average Ratio (PAR). For a signal with envelope profile \( A(t) \), the average output power \( P_{\text{out,av}} \) and Peak Envelope Power (PEP) can be defined according to (3.11) and (3.12) [4], [5], [6], respectively.

\[
P_{\text{out,av}} = \frac{\overline{A^2(t)}}{2R_L} = \frac{A^2_{\text{rms}}}{2R_L} \tag{3.11}
\]

\[
PEP = \frac{\left(\max\{A(t)\}\right)^2}{2R_L} = \frac{A^2_{\text{max}}}{2R_L} \tag{3.12}
\]

The CF is defined as the ratio of the peak amplitude to the rms value (3.13), while PAPR refers to the ratio of the peak output power to the average output power (3.14) and is usually expressed in dB.

\[
CF = \frac{A_{\text{max}}}{A_{\text{rms}}} \tag{3.13}
\]

\[
PAPR_{\text{dB}} = 10\log_{10}(CF^2) = 20\log_{10}\left(\frac{A_{\text{max}}}{A_{\text{rms}}}\right) \tag{3.14}
\]
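A minimal sketch of (3.13) and (3.14) for a sampled envelope is given below. The sample values are invented for illustration; a real OFDM envelope would contain many more samples.

```python
import math

def crest_factor(envelope):
    """CF per (3.13): peak envelope amplitude over its rms value."""
    a_max = max(abs(a) for a in envelope)
    a_rms = math.sqrt(sum(a * a for a in envelope) / len(envelope))
    return a_max / a_rms

def papr_db(envelope):
    """PAPR in dB per (3.14)."""
    return 20.0 * math.log10(crest_factor(envelope))

constant = [1.0] * 8                       # constant-envelope signal
varying = [0.2, 0.5, 1.0, 0.3, 0.1, 0.4]   # assumed envelope samples
print(f"constant envelope: PAPR = {papr_db(constant):.2f} dB")
print(f"varying envelope:  PAPR = {papr_db(varying):.2f} dB")
```

A constant-envelope signal has a CF of one and thus a PAPR of 0dB, whereas any amplitude variation raises the PAPR.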

Signals with high PAPR are especially troublesome to transmit from an efficiency point of view, as they require significant headroom so that the peak envelope amplitudes are transmitted without too much distortion. Therefore, in conventional transconductance amplifiers there is significant power dissipation, as the output stage is biased to handle the large power peaks even though the output power is, for most of the time, relatively low compared to the peak power level.

### 3.2.4 Power Amplifier Drain Efficiency for Modulated Signals

The efficiency concept can be further explored for modulated signals [5]. In (3.7) we defined the efficiency for a signal with constant amplitude \( A \), so the instantaneous DE can be calculated for a specific output voltage amplitude. If the amplitude instead changes over time, \( A(t) \), the DE will also vary over time. The average efficiency can then be calculated as (3.15).

\[
\overline{DE(A(t))} = \lim_{T \to \infty} \frac{1}{T} \int_{-T/2}^{T/2} DE(A(t)) \, dt
\] (3.15)

Consider an ideal Class-A amplifier [7], [8], as in section 3.3.1, with an output RF voltage amplitude of \( A \) and a constant DC power \( P_{DC,drain} \) consumed by the output stage. The DE can then be computed for a specific output amplitude as in (3.16), which is also plotted in Figure 3.3 for an output amplitude normalized to \( V_{DD} \). The curve is quadratic in the output amplitude and reaches a maximum efficiency of 50%. Furthermore, consider the transmission of a signal with a PAPR of 10dB, equivalent to a CF of ~3.2. Since the DC power is constant, the average efficiency equals the peak efficiency divided by the PAPR when the signal peaks reach full swing, and consequently it drops to only 5% [9] in the ideal case. This simple example illustrates the need for PAs that combine high linearity with high efficiency during power back-off.

![Efficiency of Class A amplifier](image)

Figure 3.3: DE for normalized output amplitude in a Class-A amplifier

\[
DE(A) = \frac{P_{\text{out}}(A)}{P_{\text{DC, drain}}(A)} = \frac{A^2/2R_L}{V_{DD}I_{DC}} = \frac{A^2/2R_L}{V_{DD}(V_{DD}/R_L)} = \frac{1}{2} \left( \frac{A}{V_{DD}} \right)^2
\] (3.16)
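The 5% figure can be reproduced with a short sketch that averages (3.16) over a sampled envelope, which is a discrete version of (3.15). It assumes uniformly spaced samples, an envelope normalized to \( V_{DD} \), and signal peaks that just reach full swing; the function names and the test envelope are hypothetical.

```python
def class_a_de(a_norm):
    """Instantaneous Class-A DE as in (3.16) for A normalized to V_DD."""
    return 0.5 * a_norm ** 2


def average_class_a_de(envelope_norm):
    """Discrete version of (3.15): mean of DE(A) over uniformly spaced samples."""
    return sum(class_a_de(a) for a in envelope_norm) / len(envelope_norm)


def average_de_from_papr(papr_db):
    """Average Class-A DE when the peaks reach V_DD: 50% divided by the PAPR."""
    return 0.5 / 10 ** (papr_db / 10)


# Assumed toy envelope with a PAPR of 10 dB: one full-swing sample in ten.
envelope_norm = [1.0] + [0.0] * 9

print(f"Average DE from samples: {average_class_a_de(envelope_norm):.1%}")
print(f"Average DE from PAPR:    {average_de_from_papr(10.0):.1%}")
```

Both routes give the same result because the DC power of the ideal Class-A stage does not depend on the output amplitude.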

### 3.2.5 Linearity

As previously discussed, several wireless communication standards employ modulation schemes with non-constant envelopes, which need to be amplified by PAs capable of linear amplification. To quantify the level of linearity, or rather the level of non-linearity, several measures exist. Initially, a number of fundamental non-linearity concepts [1] will be introduced, followed by a number of application-related non-linearity measures like Spectral Mask, Error Vector Magnitude (EVM), and Adjacent Channel Power Ratio (ACPR).

#### 3.2.5.1 Gain Compression, Harmonics, and Intermodulation

We return to Figure 3.2, but now to investigate the gain characteristics of the PA. The analysis is limited to memoryless systems, which can be approximated by a polynomial (3.17) up to the fifth order, neglecting higher-order nonlinearities. Let the input signal \(x(t)\) be transmitted by a differential PA, such that the output signal \(y(t)\) contains additional components proportional to the third and fifth powers of the input signal. Recall that even-order nonlinearities do not generate in-band distortion if the amplifier is fully differential. Assuming that a sinusoidal signal (3.18) is applied to the non-linear system, the resultant signal (3.19) contains power not only at the fundamental frequency, but also at the third and fifth harmonics. The first term, the in-band component, has its amplitude distorted by the nonlinearities of the PA, while its phase is unchanged, which explains why nonlinear PAs can be used for constant-amplitude modulation [10].

\[
y(t) \approx \alpha_1 x(t) + \alpha_3 x^3(t) + \alpha_5 x^5(t)...
\] (3.17)

\[
x(t) = A(t) \cos(\omega t + \varphi(t))
\] (3.18)

\[
\begin{aligned}
y(t) = {} & \left[ \alpha_1 A(t) + \frac{3}{4} \alpha_3 A^3(t) + \frac{5}{8} \alpha_5 A^5(t) \right] \cos(\omega t + \varphi(t)) \\
& + \left[ \frac{1}{4} \alpha_3 A^3(t) + \frac{5}{16} \alpha_5 A^5(t) \right] \cos(3\omega t + 3\varphi(t)) \\
& + \left[ \frac{1}{16} \alpha_5 A^5(t) \right] \cos(5\omega t + 5\varphi(t))
\end{aligned}
\] (3.19)
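The coefficients in (3.19) can be checked numerically by passing one period of a cosine through the polynomial (3.17) and projecting the output onto the harmonics. This is a minimal sketch; the polynomial coefficients, the amplitude, and the helper `harmonic_amplitude` are assumed values and names used only for the check.

```python
import math


def harmonic_amplitude(samples, k):
    """Amplitude of the cos(k*theta) component of one sampled period."""
    n = len(samples)
    return 2.0 / n * sum(
        y * math.cos(2 * math.pi * k * i / n) for i, y in enumerate(samples)
    )


# Assumed polynomial coefficients and input amplitude (illustrative only).
a1, a3, a5 = 10.0, -2.0, 0.5
A = 0.8
N = 64  # samples per period, well above twice the highest harmonic (5)

y = []
for i in range(N):
    x = A * math.cos(2 * math.pi * i / N)
    y.append(a1 * x + a3 * x ** 3 + a5 * x ** 5)  # polynomial (3.17)

# Closed-form amplitudes taken from (3.19).
expected = {
    1: a1 * A + 0.75 * a3 * A ** 3 + 0.625 * a5 * A ** 5,
    3: 0.25 * a3 * A ** 3 + 5 / 16 * a5 * A ** 5,
    5: a5 * A ** 5 / 16,
}

for k, value in expected.items():
    print(k, harmonic_amplitude(y, k), value)
```

With a negative third-order coefficient, as assumed here, the fundamental amplitude is smaller than the linear term alone, which is the gain compression discussed next.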

If the input amplitude (or power) is increased even further, the gain of the PA begins to decline. The point where the gain is 1dB lower than the small-signal gain is defined as the 1dB Compression Point \( P_{1dB} \), as in Figure 3.4a. When the output power no longer increases with higher input power, the PA is said to be saturated: it cannot deliver more power regardless of the input power to the PA.

Another distortion phenomenon in PAs, and amplifiers in general, is intermodulation, which appears when two closely spaced frequencies are transmitted through the PA at the same time (3.20). This effect can also be evaluated with the polynomial in (3.17), but for simplicity only first-, second-, and third-order nonlinearities are included (3.21). For an input signal (3.20), the generated intermodulation products are found in (3.22) [1], together with some DC terms and harmonics that are not shown. Of particular interest are the components generated at \( 2\omega_1-\omega_2 \) and \( 2\omega_2-\omega_1 \), as these appear very close to the frequency components of the signal, \( \omega_1 \) and \( \omega_2 \), and increase proportionally to \( A^3 \) (for \( A_1=A_2=A \)). When the input amplitude is increased even further, the extrapolated lines of the fundamental term and the third-order terms (\( 2\omega_1-\omega_2 \) and \( 2\omega_2-\omega_1 \)) will ideally cross each other, as in Figure 3.4b. This point is called the third-order intercept point (IP3). Graphically, the input-referred intercept point (IIP3) can be calculated according to (3.23), where \( \Delta P \) is the difference in dB between the output power of the fundamental and that of the third-order product at the input power \( P_{in} \).

Figure 3.4: (a) Intermodulation spectrum of two-tone test
(b) Gain compression curve

\[
x(t) = A_1 \cos(\omega_1 t) + A_2 \cos(\omega_2 t)
\] (3.20)

\[
y(t) \approx \alpha_1 x(t) + \alpha_2 x^2(t) + \alpha_3 x^3(t) + \dots
\] (3.21)

\[
\begin{aligned}
\omega_1 \pm \omega_2 &: \alpha_2 A_1 A_2 \cos((\omega_1 + \omega_2) t) + \alpha_2 A_1 A_2 \cos((\omega_1 - \omega_2) t) \\
2\omega_1 \pm \omega_2 &: \frac{3\alpha_3 A_1^2 A_2}{4} \cos((2\omega_1 + \omega_2) t) + \frac{3\alpha_3 A_1^2 A_2}{4} \cos((2\omega_1 - \omega_2) t) \\
2\omega_2 \pm \omega_1 &: \frac{3\alpha_3 A_2^2 A_1}{4} \cos((2\omega_2 + \omega_1) t) + \frac{3\alpha_3 A_2^2 A_1}{4} \cos((2\omega_2 - \omega_1) t) \\
\omega_1, \omega_2 &: \left(\alpha_1 A_1 + \frac{3\alpha_3 A_1^3}{4} + \frac{3\alpha_3 A_1 A_2^2}{2}\right) \cos(\omega_1 t) + \left(\alpha_1 A_2 + \frac{3\alpha_3 A_2^3}{4} + \frac{3\alpha_3 A_2 A_1^2}{2}\right) \cos(\omega_2 t)
\end{aligned}
\] (3.22)

\[
\text{IIP3}_{\text{dBm}} = \frac{\Delta P_{\text{dB}}}{2} + P_{\text{in,dBm}}
\] (3.23)
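The graphical rule for IIP3 can be cross-checked against the third-order polynomial model. The sketch below assumes equal tone amplitudes, a small input level where only the linear term of the fundamental matters, and a 50 Ω reference impedance for converting amplitudes to dBm; the coefficients and function names are hypothetical.

```python
import math

R_REF = 50.0  # assumed reference impedance in ohms


def amp_to_dbm(a):
    """Power in dBm of a sinusoid with amplitude a (volts) across R_REF."""
    return 10 * math.log10(a * a / (2 * R_REF) / 1e-3)


def iip3_from_two_tone(alpha1, alpha3, a):
    """IIP3 in dBm from the dB gap between fundamental and IM3 at amplitude a."""
    fund = abs(alpha1) * a               # linear term of the fundamental
    im3 = 0.75 * abs(alpha3) * a ** 3    # product at 2*w1 - w2 for A1 = A2 = a
    delta_p_db = 20 * math.log10(fund / im3)
    return amp_to_dbm(a) + delta_p_db / 2


# Assumed polynomial coefficients and a small test amplitude.
alpha1, alpha3 = 10.0, -2.0
a_test = 0.01

iip3_graphical = iip3_from_two_tone(alpha1, alpha3, a_test)

# Analytical intercept: amplitude where the two extrapolated lines are equal.
a_iip3 = math.sqrt(4 * abs(alpha1) / (3 * abs(alpha3)))
iip3_analytical = amp_to_dbm(a_iip3)

print(f"IIP3 from two-tone rule: {iip3_graphical:.2f} dBm")
print(f"IIP3 from coefficients:  {iip3_analytical:.2f} dBm")
```

The two values agree because the fundamental rises by 1dB and the third-order product by 3dB per dB of input power, so the gap between them closes by 2dB per dB.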

Other measures to describe the nonlinearities of the PA are AM-AM (amplitude modulation to amplitude modulation) and AM-PM (amplitude modulation to phase modulation) distortion. AM-AM is defined as the relationship between the amplitude of the input signal and the amplitude of the output signal, similar to the relationship between the output power and input power of the PA and the gain compression of the system. AM-PM describes how an increase in input power causes an additional phase shift in the output signal [8].

#### 3.2.5.2 Spectral Mask and Adjacent Channel Power Ratio

As a radio transmission has a frequency band (channel) allocated around the carrier, within which the transmission may be conducted, any power falling outside this band will disturb neighboring channels and the transmissions therein. The frequency boundaries of the transmission are usually specified by a spectral mask, where the power around the carrier is specified in dBc (decibels relative to the carrier) or as absolute power levels in dBm in a specified bandwidth at certain offsets. The Adjacent Channel Power Ratio (ACPR) is defined [11] as the ratio of the power in a bandwidth away from the main signal to the power in a bandwidth within the main signal, where the bandwidths and acceptable ratios are determined by the standard being employed.

As described in the previous section, the distortion power levels grow with the input amplitude due to gain compression, such that even a linear PA will cause spectral distortion and leak power into adjacent channels when it is forced into saturation. This is exemplified in Figure 3.5, which shows the measured spectrum and spectral mask (solid line) for a linear WLAN PA [12], where the peak amplitudes of the signal are compressed, creating distortion such that the signal is about to violate the spectral mask.

### 3.2.5.3 Error Vector Magnitude

Another signal quality measure is the Error Vector Magnitude (EVM), which is computed on I and Q data (or amplitude and phase) measured around the carrier. In the I versus Q plane (Figure 3.5) each location encodes a specific data symbol, which carries a certain number of bits depending on the complexity of the modulation scheme used. At any point in time, the magnitude and phase of the signal can be measured and compared with an ideal reference signal based on the transmitted data stream, clock timing, filtering parameters, etc. [13], [14]. The difference between the measured signal and the ideal reference signal defines the error vector, and the EVM is generally defined as the rms value of the error vector over time [15]. Sometimes the peak value is also used.

![Figure 3.5: (a) Spectral mask for transformer-based PA [12] (b) Vector definitions in EVM](image-url)

The EVM provides a direct measure of the signal quality and of the transmitter and receiver/demodulation accuracy, and the result captures several signal impairments like AM-AM distortion, AM-PM distortion, phase noise, and random noise.
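An rms EVM computation on complex I/Q samples could be sketched as below. The normalization to the rms magnitude of the reference symbols is an assumption, since standards differ in the exact reference level, and the constellation points and the gain error are hypothetical.

```python
import math


def evm_rms(measured, reference):
    """RMS error-vector magnitude, normalized to the rms reference magnitude."""
    error_power = sum(abs(m - r) ** 2 for m, r in zip(measured, reference))
    reference_power = sum(abs(r) ** 2 for r in reference)
    return math.sqrt(error_power / reference_power)


# Assumed ideal QPSK-like reference symbols (I + jQ).
reference = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]

# Hypothetical impairment: every symbol is received with a 5% gain error.
measured = [r * 1.05 for r in reference]

evm = evm_rms(measured, reference)
print(f"EVM = {evm:.1%} ({20 * math.log10(evm):.1f} dB)")
```

A pure gain error maps directly onto the EVM, while in a real measurement the error vector combines amplitude, phase, and noise contributions.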

### 3.3 Power Amplifier Topologies

As the foundation of efficiency and linearity has been covered in the previous sections, the fundamental PA topologies will here be described with the trade-offs between linearity and power efficiency emphasized. Initially, the transconductance amplifiers are described. These amplifiers use the device as a voltage-controlled current-source, where the input voltage controls the output current. The following section describes the switching amplifiers, utilizing the transistor as a switch to modulate the signal.

#### 3.3.1 Class-A

The Class-A PA is the most “classical” PA, with a transistor biased so that it never turns off, which means a conduction angle of 360 degrees (Figure 3.6). The conduction angle is defined as the portion of the input signal period during which the transistor conducts. The typical drain voltage and drain current waveforms are shown in Figure 3.7, which assume a highly linear relationship between the signal drain current and the input voltage (3.24). Because the drain current is never interrupted, the linearity of the amplifier is high, but the efficiency is low for the same reason. In reality, however, the relationship is not perfectly linear [8], but the ideal model is used since it is very tractable from an analytical perspective.

\[
i_D = k (v_{in} - V_{th}) \tag{3.24}
\]

\[
i_D = I_{DC} + i_{rf} \sin \omega t \tag{3.25}
\]

The derivation of the efficiency has been done by several authors before [7], [8], [16], but a short review will be given here. The basic circuit considered is shown in Figure 3.6, with a transistor biased at a certain voltage level with a certain bias current \(I_{DC}\), and the signal component of the drain current \(i_{rf}\) [7].

The output voltage is simply the signal current multiplied by the load resistance (3.26). Due to the large supply inductor, only DC current flows through the inductor, and consequently the current in the load is just the signal component of the drain current. The drain voltage is the sum of the signal voltage and the DC voltage, and as the inductor is a short circuit at DC, the DC drain voltage equals the supply voltage.

\[
v_D = V_{DD} + v_{out} = V_{DD} - i_{rf} R_L \sin \omega t
\]

This means that the peak drain voltage is $2V_{DD}$, with a peak drain current of $2V_{DD}/R_L$. From the assumptions above, the output power can now be stated according to (3.27), and the dissipated DC power according to (3.28), which is independent of the output RF signal. The maximum efficiency of 50% then follows from (3.29). Assuming a lower output swing with amplitude $A$ (3.30), the efficiency drops significantly and more power is dissipated across the device. One should also note that the efficiency of 50% in Class-A PAs is the absolute maximum, assuming that the full voltage swing is attainable, that there are no losses in the matching network, and that no amplitude modulation is present.

Figure 3.6: Generic single-stage power amplifier

Figure 3.7: Drain voltage and current waveforms in an ideal Class-A

\[ v_{out} = -i_{rf} R_L \sin \omega t \] (3.26)

\[ P_{out} = P_{rf} = \frac{i_{rf}^2 R_L}{2} = \frac{V_{DD}^2}{2R_L} \] (3.27)

\[ P_{DC} = V_{DD} I_{DC} = V_{DD} i_{rf} \] (3.28)

\[ DE = \frac{P_{rf}}{P_{DC}} = \frac{i_{rf}^2 (R_L / 2)}{i_{rf} V_{DD}} = \frac{i_{rf} R_L}{2 V_{DD}} = \frac{V_{DD}}{2 V_{DD}} = \frac{1}{2} \] (3.29)

\[ DE(A) = \frac{P_{out}(A)}{P_{DC,drain}(A)} = \frac{A^2 / 2R_L}{V_{DD} I_{DC}} = \frac{A^2 / 2R_L}{V_{DD} (V_{DD} / R_L)} = \frac{1}{2} \left( \frac{A}{V_{DD}} \right)^2 \] (3.30)

In **Paper 1** and **Paper 2**, two linear CMOS PAs are designed in a 65nm CMOS technology to operate in the 2.4-2.5GHz band. The PAs utilize thick oxide (5.2nm) transistors, but use different input and interstage matching networks. A 72.2Mbit/s, 64-QAM 802.11n OFDM signal was applied to both PAs. The design presented in **Paper 1** achieved an average and peak output power of 9.4dBm and 17.4dBm, respectively, while an EVM of 3.8% was measured. The design presented in **Paper 2** achieved an average and peak output power of 11.6dBm and 19.6dBm, respectively, while an EVM of 3.8% was measured. The WLAN performance is comparable to several state-of-the-art WLAN PAs, as described in **Paper 2**.

### 3.3.2 Class-B and AB

The ‘sister’ PA of the Class-A is the Class-B, which has the same basic circuitry as Class-A but is biased differently. In Class-B, the gate is biased at the threshold voltage, such that the transistor only conducts current during half of the RF cycle and the conduction angle is 180 degrees. Consequently, with intermittent operation of the transistor, we can expect more distortion in the output voltage, and a high-Q tank is needed at the output to recover a fairly sinusoidal signal. As for the Class-A amplifier, we can analyze the drain voltage and current waveforms, shown in Figure 3.8, where it is assumed that the drain current is sinusoidal for the part of the period when the transistor is conducting, which is a quite crude approximation as the change is abrupt. The fundamental component of the drain current can be computed according to (3.31), based on Fourier coefficients. As the maximum output voltage is \( V_{DD} \), the peak drain current \( i_{rf} \) is equal to the peak drain current in Class-A amplifiers, i.e. \( 2V_{DD}/R_L \). As the maximum output voltage is the same as for Class-A amplifiers, the maximum output power is equal to (3.27). The DC supply current can also be found through Fourier coefficients (3.32), and eventually the maximum DE (3.33) can be found for the Class-B amplifier.

\[
i_{D,\text{fund}} = \frac{2}{T} \int_{0}^{T/2} i_{rf} \sin(\omega t) \sin(\omega t) \, dt = \frac{i_{rf}}{2}
\] (3.31)

\[
I_{DC} = \frac{1}{T} \int_{0}^{T/2} \frac{2V_{DD}}{R_L} \sin(\omega t) \, dt = \frac{2V_{DD}}{\pi R_L}
\] (3.32)

\[
DE = \frac{P_{\text{out, max}}}{P_{DC}} = \frac{V_{DD}^2/2R_L}{2V_{DD}^2/\pi R_L} = \frac{\pi}{4} \approx 0.785
\] (3.33)
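The Fourier integrals behind the Class-B efficiency can also be evaluated numerically. The sketch below integrates an ideal half-sine drain current over one period with a midpoint rule; the normalized values of \( V_{DD} \) and \( R_L \), the number of integration steps, and the function name are assumptions for illustration.

```python
import math


def class_b_drain_efficiency(v_dd=1.0, r_l=1.0, n=20000):
    """Numerically evaluate the ideal Class-B DE from a half-sine drain current."""
    i_peak = 2 * v_dd / r_l  # peak drain current at full output swing
    i_dc = 0.0
    i_fund = 0.0
    for k in range(n):
        theta = 2 * math.pi * (k + 0.5) / n  # midpoint of each integration step
        # The transistor conducts only during the first half of the period.
        i_d = i_peak * math.sin(theta) if theta < math.pi else 0.0
        i_dc += i_d / n                            # average (DC) current
        i_fund += 2 * i_d * math.sin(theta) / n    # fundamental Fourier coefficient
    p_out = i_fund ** 2 * r_l / 2
    p_dc = v_dd * i_dc
    return p_out / p_dc


de_class_b = class_b_drain_efficiency()
print(f"Class-B DE = {de_class_b:.4f}, pi/4 = {math.pi / 4:.4f}")
```

The result is independent of the chosen \( V_{DD} \) and \( R_L \), since both the output power and the DC power scale with \( V_{DD}^2/R_L \).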

Clearly, Class-B amplifiers achieve a significantly higher efficiency than Class-A amplifiers, but at the expense of more distortion. Additionally, in theory the gain is reduced by 6dB compared to Class-A [8]. Therefore, many practical PAs are a mix of Class-A and Class-B, called Class-AB, with a conduction angle between 180 and 360 degrees and an acceptable trade-off between linearity, gain, and efficiency.

Figure 3.8: Drain voltage and current waveforms in an ideal Class-B

### 3.3.3 Class-C

Reducing the conduction angle even further leads to a situation where the transistor is turned off for more of the period than it is turned on. Mathematical derivations show an efficiency of 100% as the conduction angle is reduced to 0. However, since the gain and output power go to zero simultaneously, this type of amplifier is not very frequently used in RF applications at GHz frequencies, even though successful implementations at 900MHz do exist [17]. The analysis of the transistor operating in Class-C mode can be carried out in a similar way as for Class-B, and a full derivation is given in [7] and [16].

### 3.3.4 Class-D

The amplifier circuits so far have focused on providing an acceptable trade-off between linearity and efficiency. By taking advantage of the complementary MOS devices, a highly efficient amplifier can be designed by using inverters, as in Figure 3.9. Instead of being used as current sources, the transistors are now used as switches, such that the output voltage toggles between two voltage extremes, i.e. ground and $V_{DD}$. The basic principle behind the high efficiency can be found by looking at the I-V characteristics in Figure 3.10: the voltage across the transistors is zero while current flows through the switches (transistors), and similarly, when there is voltage across a switch, the current is zero. However, the drawback of this topology is that there is no linear relationship between the amplitudes of the input and output signals, but by modulating the duty cycle of the driving signal, the amplitude of the output voltage can be controlled.

Assuming a driving signal with 50% duty cycle, the output signal would have the same duty cycle. Analyzing the Fourier coefficients of the square output voltage waveform reveals that power will be lost in the harmonics (3.34). To force as much power as possible into the fundamental tone, a filter is needed at the output before the load. The filter (ideally) provides a short at the fundamental frequency and infinite impedance at all other frequencies, which means that all power is forced to the fundamental frequency.

![Figure 3.9: Class-D amplifier](image-url)

Further utilizing (3.34), we can compute the fundamental component of the output voltage across the load resistance and the current through the load, eventually leading to the output power at the fundamental (3.35). The average (DC) current from $V_{DD}$, is the average of the current passing through the PMOS, a half wave sinusoidal signal (3.36) with amplitude $i_{OUT,1}$.

With all relationships established, the DE is computed to be 100% according to (3.37). However, such ideal and lossless filter characteristics are hard to realize. Therefore, it may be more instructive to consider the power available in the fundamental component compared to the total power in the square wave, as in (3.38). It is clear that a significant amount of power is wasted in the harmonics unless an ideal filter is provided, which is not realistic in real implementations. Therefore, the efficiency can be expected to be lower than 100%.
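The share of the square-wave power that resides in the fundamental, i.e. the ratio in (3.38), can be estimated by summing the squared Fourier coefficients. In the sketch below the infinite sum is truncated at a large, arbitrarily chosen number of terms, and the function names are hypothetical; the ratio converges towards \( 8/\pi^2 \), about 0.81.

```python
import math


def square_wave_coefficient(n, v_dd=1.0):
    """Fourier coefficient b_n of the 0..V_DD square wave, as used in (3.38)."""
    return v_dd / (n * math.pi) * (1 - (-1) ** n)


def fundamental_power_fraction(n_terms):
    """Power in the fundamental relative to the first n_terms harmonics."""
    b1_squared = square_wave_coefficient(1) ** 2
    total = sum(square_wave_coefficient(n) ** 2 for n in range(1, n_terms + 1))
    return b1_squared / total


fraction = fundamental_power_fraction(100000)
print(f"Fundamental fraction = {fraction:.4f}, 8/pi^2 = {8 / math.pi ** 2:.4f}")
```

Truncating the sum earlier leaves out some harmonic power and therefore gives a slightly higher ratio than the limit value.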

\[ P_{\text{out,1}} = \frac{v_{\text{OUT,1}} i_{\text{OUT,1}}}{2} = \left( \frac{2V_{\text{DD}}}{\pi} \right) \left( \frac{2V_{\text{DD}}}{\pi R_L} \right) / 2 = \frac{2V_{\text{DD}}^2}{\pi^2 R_L} \] (3.35)

\[ I_{\text{DC}} = \frac{i_{\text{OUT,1}}}{\pi} = \frac{2V_{\text{DD}}}{\pi^2 R_L} \] (3.36)

\[ DE = \frac{P_{\text{out,1}}}{P_{\text{DC}}} = \frac{2V_{\text{DD}}^2 / \pi^2 R_L}{V_{\text{DD}} I_{\text{DC}}} = \frac{2V_{\text{DD}}^2 / \pi^2 R_L}{V_{\text{DD}} \left( \frac{2V_{\text{DD}}}{\pi^2 R_L} \right)} = 1 \] (3.37)

\[ \frac{b_1^2}{\sum_{n=1}^{\infty} b_n^2} = \frac{\left( \frac{2V_{\text{DD}}}{\pi} \right)^2}{\sum_{n=1}^{\infty} \left( \frac{V_{\text{DD}}}{n\pi} \left[ 1 - (-1)^n \right] \right)^2} = \frac{8}{\pi^2} \approx 0.81 \] (3.38)

Analyzing the amplifier output stage, the inverter in Figure 3.11, the power dissipation can be divided into dynamic power and static power. The dynamic power originates from switching power and short-circuit power, while the static power is related to leakage current, as previously discussed in Chapter 2. The switching power relates to the charging and discharging of the capacitive load, which includes the drain capacitance of the inverter, and may dissipate significant power when large transistors are used.

Assuming the input voltage is zero the PMOS will turn on and start to charge the total load capacitance \( C_D \) requiring energy of \( C_D V_{\text{DD}}^2 \). When the input toggles to \( V_{\text{DD}} \), the PMOS is turned off, and the NMOS is turned on. The charge stored in \( C_D \) is then dumped to ground through the NMOS, while no additional energy is pulled from \( V_{\text{DD}} \). Moreover, the switching power can be expressed as in (3.39), where \( f \) is the clock frequency of the input signal, and \( \alpha \) is the switching activity ratio, which determines how frequently the output switches from low-to-high per clock cycle [18].

However, the input signal has a finite rise time, so that for a short moment both the NMOS and PMOS transistors are turned on, creating a direct path between \( V_{\text{DD}} \) and ground. The power dissipated due to this direct path is denoted short-circuit power and can be calculated according to (3.40) [19], where \( \beta \) is the gain factor of the transistors, and \( \tau \) is the input rise/fall time. The short-circuit power can be kept below 10% of the switching component in a properly designed circuit [20].

Due to the significant scaling of the MOS transistors, new power dissipation mechanisms are introduced. The leakage power has become a major contributor

\[ P_{\text{switching}} = \alpha f C_D V_{\text{DD}}^2 \]

(3.39)

\[ P_{\text{short-circuit}} = \frac{\beta}{12} (V_{\text{DD}} - 2V_{\text{th}})^3 f\tau \]

(3.40)
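To get a feeling for the magnitudes involved, (3.39) and (3.40) can be evaluated for one set of parameters. All numerical values below (capacitance, gain factor, threshold voltage, rise time) are hypothetical and do not describe any device in this work.

```python
def switching_power(alpha, f, c_d, v_dd):
    """Switching power as in (3.39)."""
    return alpha * f * c_d * v_dd ** 2


def short_circuit_power(beta, v_dd, v_th, f, tau):
    """Short-circuit power as in (3.40)."""
    return beta / 12 * (v_dd - 2 * v_th) ** 3 * f * tau


# Assumed example values, for illustration only.
f = 2.45e9      # Hz, switching frequency
c_d = 1e-12     # F, total load capacitance
v_dd = 1.0      # V, supply voltage
v_th = 0.3      # V, threshold voltage
beta = 0.1      # A/V^2, transistor gain factor
tau = 50e-12    # s, input rise/fall time

p_sw = switching_power(alpha=1.0, f=f, c_d=c_d, v_dd=v_dd)
p_sc = short_circuit_power(beta=beta, v_dd=v_dd, v_th=v_th, f=f, tau=tau)
print(f"P_switching = {p_sw * 1e3:.3f} mW")
print(f"P_short-circuit = {p_sc * 1e3:.3f} mW ({p_sc / p_sw:.1%} of switching)")
```

Both contributions scale linearly with frequency, while the short-circuit term depends strongly on how far the supply voltage exceeds twice the threshold voltage.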

### 3.3.5 Class-E

The Class-D amplifier showed the potential of switching amplifiers to achieve higher efficiencies than “classical” transconductance amplifier topologies, up to 100%, by operating the transistors as switches. However, it was also clear that the topology suffers from loss mechanisms due to the finite rise and fall times of the driver signals and the parasitic capacitances of the output stage. The Class-E topology, Figure 3.12, aims at solving these issues by shaping the drain voltage with a reactive load impedance, such that the drain voltage has decreased to zero when the switch turns on at \( t_1 \) (3.41) [25]. A second condition (3.42) concerns the slope of the drain voltage, which should be zero at turn-on in order to allow for component mismatches without causing significant power loss [26], [27]. These conditions on the drain voltage result in an ideal efficiency of 100%, elimination of the losses associated with charging the drain capacitance as in Class-D, reduction of switching losses, and good tolerance to component variation [28]. Ideally, the resulting drain voltage and current waveforms look like the waveforms in Figure 3.13.

\[
v_{DS}(t = t_1) = 0 \quad (3.41)
\]

\[
\left. \frac{\partial v_{DS}}{\partial t} \right|_{t = t_1} = 0 \quad (3.42)
\]

Figure 3.11: Schematic of CMOS inverter including dynamic currents

A major drawback of Class-E implementations is the very high peak drain voltage, which occurs while the device is turned off. This is troublesome in nanometer CMOS technologies with thin gate oxides, posing potentially severe reliability issues. To ensure reliable operation at RF, a safe approach is to make sure that safe operation is met even for DC conditions, which translates into never exceeding the critical oxide field of $\sim 1 \text{V/nm}$ of gate oxide [29] at the drain of the transistor.

At first sight, the conditions in (3.41) and (3.42) may seem trivial to solve analytically, but one soon realizes the complex dependencies between all circuit components. A number of authors have already derived the design equations [16], [30], [31], and they will not be repeated here. Instead, a more intuitive description of the idealized operation of the Class-E amplifier in Figure 3.12 is given. Similarly to Class-D amplifiers, Class-E amplifiers do not inherently feature linear amplification, but are good candidates for use in polar-modulated amplifiers [32], [33], pulse-width modulation [34], and outphasing.

The device (M₁) is assumed to be driven by a square wave with 50% duty cycle [30]. The RF choke (L₁) is assumed to only carry DC current. The Q of the series-tuned circuit (L₂ and C₂) is high enough so the output can be assumed to be a sinusoidal signal, and the reactance jωLₐ applies only at the fundamental. For all other frequencies the reactance is assumed to be infinitely large. The switching operation of the device is lossless, except for any charges stored in C₁, which are discharged to ground at device turn on, and the transition of the device is assumed to be instantaneous. Moreover, the device is assumed to have zero on-resistance, and an infinitely large off-resistance.

When the switch is closed, the DC current through inductor L₁ flows through the switch. As the switch opens, the difference between the DC current and the sinusoidal output current charges the capacitor C₁ and the parasitic capacitances of the device for a non-negligible part of the period. Simultaneously, the voltage across the device increases and eventually rises to a peak level of ~3.56V_DD [27], after which the charge stored in the capacitors is delivered to the load. The utilization of the capacitor (C₁) is a major benefit of Class-E compared to Class-D, where the parasitic drain capacitance always discharges to ground. Beneficially, this capacitor can be made up entirely of the parasitic drain capacitance of the device, eliminating the need for an additional capacitor and allowing a wide device with low on-resistance, and hence more output power.

A simulation of a Class-E amplifier in a 130nm CMOS process is performed with the parameters and performance given in Table 3-1. All components are ideal, except for the transistor and its on-resistance. The voltage and current waveforms are plotted in Figure 3.14 and Figure 3.15, respectively, and are defined as in Figure 3.12. The current through inductor L₁ is not perfect DC, but is always positive, and therefore it is not plotted. Considering the voltage waveforms, the peak of $v_{\text{DS}}$ does not reach 3.56V_DD, and when the transistor is turned on by $v_{\text{DRIVE}}$, $v_{\text{DS}}$ is not zero.

<table>
<thead>
<tr>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
</tr>
</thead>
<tbody>
<tr>
<td>230</td>
<td>0.431</td>
<td>6</td>
<td>1.05</td>
<td>35.5</td>
<td>5.27</td>
<td>50</td>
<td>1</td>
<td>88</td>
<td>9</td>
<td>2</td>
</tr>
</tbody>
</table>

Due to the nonzero $v_{\text{DS}}$, we can clearly see from the current waveforms how the charges stored on $C_1$ are dumped to ground through the transistor at $t_1 = \pi$. As long as the transistor is turned on (between $\pi$ and $2\pi$), the current ($i_D$) through the transistor increases. At transistor turn off, the current through the capacitor increases rapidly, due to the current ($i_L$) through $L_1$ and the reversed output current ($i_{\text{out}}$). After some time, the output current becomes positive, and shortly, the capacitors are discharged. At the time the currents through the capacitors are zero, $v_{\text{DS}}$ reaches its highest peak.

As previously stated, the dependencies between the component values are not trivial from an analytical point of view. In [26] the effect of various component values is elaborated, and it is found that variations of the shunt susceptance ($B = \omega C$) still result in high efficiency over a large range of values. However, for small values, $C_1$ will experience both high and low voltage peaks.

For large values, the rise time of $v_{\text{DS}}$ will be long, and thus the peak voltage is reduced. A similar behavior is found for the load angle ($\tan^{-1}(X/R)$, where $X$ is the series reactance at the fundamental) as it is varied from negative to positive values. Further discussion on circuit parameters is also found in [3].

One important component is obviously the transistor itself. Intuitively, a larger device can conduct more current when turned on than a smaller device, since its on-resistance is smaller. As derived in [35], the DE can be approximated by (3.43). With the continuous scaling of the transistors, it becomes increasingly challenging to design Class-E amplifiers, due to the high voltage peaks generated. This forces the designer to use a lower supply voltage, but at the same time, the load resistance required for a given output power decreases with the square of the supply voltage (3.44).

![Figure 3.14: Simulation results of drain voltage ($v_{\text{DS}}$), driver signal ($v_{\text{DRIVE}}$), and output voltage ($v_{\text{OUT}}$)](image-url)

\[ DE \approx \frac{1}{1 + 1.4r_{on} / R_L} \] (3.43)

\[ P_{out} = 0.577 \frac{V_{DD}^2}{R_L} \] (3.44)
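The interplay between supply voltage, load resistance, and on-resistance in (3.43) and (3.44) can be illustrated with a few lines of code. The target output power, the supply voltages, and the on-resistance below are assumed values, and the sketch ignores all losses other than the switch on-resistance.

```python
def dbm_to_watt(p_dbm):
    """Convert a power level in dBm to watts."""
    return 10 ** (p_dbm / 10) / 1000


def class_e_load_resistance(p_out, v_dd):
    """Load resistance required for a given output power, from (3.44)."""
    return 0.577 * v_dd ** 2 / p_out


def class_e_drain_efficiency(r_on, r_l):
    """Approximate DE for a finite switch on-resistance, from (3.43)."""
    return 1 / (1 + 1.4 * r_on / r_l)


# Assumed design targets, for illustration only.
p_target = dbm_to_watt(20.0)  # W
r_on = 0.5                    # ohm, hypothetical switch on-resistance

for v_dd in (2.5, 1.5, 1.0):
    r_l = class_e_load_resistance(p_target, v_dd)
    de = class_e_drain_efficiency(r_on, r_l)
    print(f"V_DD = {v_dd:.1f} V: R_L = {r_l:.2f} ohm, DE = {de:.1%}")
```

For a fixed on-resistance, lowering the supply voltage reduces the required load resistance quadratically, so the ratio \( r_{on}/R_L \) grows and the achievable efficiency drops.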

In Paper 4 [36], two Class-E CMOS PAs in 130nm CMOS are operated at low supply voltages. The first PA is intended for DECT, while the second is intended for Bluetooth. Both use inverters as driver stages. At 1.5V supply voltage, the DECT PA delivers +26.4dBm of output power with a DE and PAE of 41% and 30%, respectively. The Bluetooth PA has an output power of +22.7dBm at 1.0V with a DE and PAE of 48% and 36%, respectively. Recent fully-integrated Class-E amplifiers in 65nm CMOS have provided +28.5dBm of output power with 29% PAE at 2.5V supply voltage [34], and +30dBm with 60% at 5V using extended-drain thick oxide devices [37].

### 3.3.6 Class-F

Similar to Class-E, the Class-F amplifier employs drain voltage waveform shaping to achieve a high efficiency. Figure 3.16 shows a Class-F amplifier with a transmission line at the drain [7], and a high-Q tank in parallel with the load resistor. The length of the transmission line is $\lambda/4$ at the fundamental frequency, and the Q is considered high enough to short-circuit all frequencies outside the desired bandwidth. Due to the transmission line, the load impedance seen at the drain can be computed through (3.45).

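\[
Z_{in} = Z_0 \frac{R_L + jZ_0 \tan(2\pi \ell / \lambda)}{Z_0 + jR_L \tan(2\pi \ell / \lambda)}
\] (3.45)

where $\ell$ is the physical length of the transmission line and $\lambda$ is the wavelength at the frequency considered.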

Figure 3.16: Class-F amplifier

Figure 3.17: Class-F amplifier waveforms

From (3.45), we can conclude that the impedance seen at the drain at the fundamental is $Z_0^2/R_L$, which is simply the load resistance when $Z_0 = R_L$. At all even harmonics, the line is a multiple of a half wavelength long and the impedance seen is simply the load impedance, which is zero at all harmonics since the tank short-circuits them. At all odd harmonics, the line is an odd multiple of a quarter wavelength long and transforms the zero load impedance into an infinitely large impedance. Furthermore, assuming a square drive voltage with 50% duty cycle, which contains only odd harmonics, a square wave also appears at the drain, while the load current is purely sinusoidal at the fundamental frequency. Figure 3.17 also reveals that the Class-F amplifier is ideally capable of providing 100% efficiency. Based on the basic topology presented, other circuit combinations with the same characteristics have been presented, such as inverse Class-F and Class-E/F amplifiers [38].
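The harmonic terminations described above can be verified with the lossless transmission-line equation. The sketch below uses the sine/cosine form, which is (3.45) with numerator and denominator multiplied by the cosine of the electrical length, to avoid evaluating an infinite tangent. The impedance values, the function name, and the idealization that the tank is a perfect short at all harmonics are assumptions.

```python
import math


def line_input_impedance(z_load, z0, harmonic):
    """Input impedance of a lossless line that is lambda/4 long at the fundamental."""
    theta = harmonic * math.pi / 2  # electrical length at this harmonic
    numerator = z_load * math.cos(theta) + 1j * z0 * math.sin(theta)
    denominator = z0 * math.cos(theta) + 1j * z_load * math.sin(theta)
    if denominator == 0:
        return complex(math.inf, 0.0)  # ideal open circuit
    return z0 * numerator / denominator


Z0 = 25.0   # ohm, assumed characteristic impedance
R_L = 50.0  # ohm, assumed load resistance

# Fundamental: the tank is an open circuit, so the line is loaded by R_L.
z_fund = line_input_impedance(R_L, Z0, 1)

# Harmonics: the ideal tank short-circuits the load (z_load = 0).
z_second = line_input_impedance(0.0, Z0, 2)
z_third = line_input_impedance(0.0, Z0, 3)

print(f"|Z| at fundamental:  {abs(z_fund):.2f} ohm (Z0^2/R_L = {Z0**2 / R_L:.2f})")
print(f"|Z| at 2nd harmonic: {abs(z_second):.2e} ohm")
print(f"|Z| at 3rd harmonic: {abs(z_third):.2e} ohm")
```

In floating-point arithmetic the odd-harmonic impedance evaluates to a very large finite number rather than infinity, which is sufficient to show the open-circuit behavior.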

### 3.4 Linearization of Non-Linear Power Amplifiers

From the description of the PA topologies, it was clear that the switching amplifiers can achieve a high efficiency by operating the devices as switches, but at the expense of losing control of the amplitude of the output signal. There exist two major linearization techniques where highly efficient non-linear PA topologies can be used in order to achieve an overall linear PA. These two techniques are called Polar Modulation and Outphasing. Other linearization techniques include feedback and analog and digital predistortion [7], but these are not covered in this thesis.

3.4.1 Polar Modulation

The basic principle in Polar Modulation, Figure 3.18, is to combine a highly efficient non-linear RF PA with a highly efficient envelope amplifier (ENV AMP) to achieve a highly efficient linear PA. The idea of Polar Modulation originates from the Envelope Elimination and Restoration (EER) technique developed by Kahn in 1952 [39], and therefore it is also denoted the Kahn Technique Transmitter.

A suitable non-linear RF PA to be used is the Class-E PA, where the drain efficiency does not depend on the supply voltage, and consequently the output stage in the linear PA can operate at a high efficiency even when backed-off from peak output power. This feature should be compared with the transconductance PAs (Class-A, AB, B), where the optimum efficiency is found when the PA is operated at maximum output power and drops when backed-off.

Figure 3.18 demonstrates the concept of Polar Modulation, where $\phi(t)$ and $A(t)$ contain the phase and amplitude information of the RF signal, respectively. The RF output signal is proportional to the supply voltage as in (3.46), where $\alpha$ represents the ratio of the output amplitude to the supply voltage. Furthermore, the supply voltage is modulated according to the amplitude signal ($A(t)$) as in (3.47), where $\beta$ represents the ratio of the supply voltage to the amplitude signal. The modulated RF output signal can then be expressed as in (3.48), and it is clear that it contains both amplitude and phase modulation.

\[
v_{\text{OUT}}(t) = \alpha V_{DD}(t) \cos(\omega t + \phi(t)) \quad (3.46)
\]

\[
V_{DD}(t) = \beta A(t) \quad (3.47)
\]

\[
v_{\text{OUT}}(t) = \alpha \beta A(t) \cos(\omega t + \phi(t)) \sim A(t) \cos(\omega t + \phi(t)) \quad (3.48)
\]
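As an illustration of (3.46)-(3.48), the sketch below splits a complex baseband sample into its amplitude and phase, applies them through the supply and RF paths, and compares the result with ideal linear amplification. The constants `ALPHA`, `BETA`, and `F_C` and the sample value are assumptions made for this example only.

```python
import cmath
import math

ALPHA = 0.8    # assumed ratio of output amplitude to supply voltage
BETA = 1.5     # assumed ratio of supply voltage to amplitude signal
F_C = 2.45e9   # assumed carrier frequency in Hz

def polar_output(s, t):
    """RF output per (3.46)-(3.48) for a complex baseband sample s at time t."""
    amplitude = abs(s)          # A(t), drives the envelope amplifier
    phase = cmath.phase(s)      # phi(t), drives the RF input of the PA
    v_dd = BETA * amplitude     # modulated supply voltage, (3.47)
    return ALPHA * v_dd * math.cos(2 * math.pi * F_C * t + phase)  # (3.46)

def linear_output(s, t):
    """Reference: ideal linear amplification of the same sample."""
    return ALPHA * BETA * (s * cmath.exp(2j * math.pi * F_C * t)).real

sample = complex(0.3, -0.4)   # hypothetical I/Q sample
t0 = 0.123e-9                 # arbitrary time instant in seconds
print(polar_output(sample, t0), linear_output(sample, t0))
```

Both functions return the same value, which is the statement of (3.48): the amplitude information removed from the RF path is restored through the supply.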

3.4.2 Outphasing

The basic principle in the Outphasing concept, Figure 3.19, is that an amplitude- and phase-modulated signal, $s(t)$ in (3.49), is decomposed into two constant amplitude signals, $S_1(t)$ and $S_2(t)$ as in (3.50) [40], [41], based on the original signal and the quadrature signal ($e(t)$).

The two constant envelope signals are applied to two highly efficient and highly non-linear PAs, whose outputs are summed in a power combiner. In the power combiner, the quadrature signals cancel each other, and the output signal is an amplified version of the amplitude- and phase-modulated signal ($s(t)$). The perfect cancellation of the quadrature signal occurs when the matching of the PAs is well-balanced. Any imbalance in gain or phase leads to incomplete cancellation of the quadrature signal, whose spectrum extends into adjacent channels [41], and therefore will cause adjacent channel interference.

Figure 3.19: Outphasing

\[ s(t) = r(t) e^{j\theta(t)}; \quad 0 \leq r(t) \leq r_{\text{max}} \]

\[ S_1(t) = s(t) - e(t) \]

\[ S_2(t) = s(t) + e(t) \]

\[ e(t) = js(t) \sqrt{\frac{r_{\text{max}}^2}{r^2(t)} - 1} \]
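The decomposition can be verified numerically. The following minimal sketch uses the expressions for $S_1(t)$, $S_2(t)$, and $e(t)$ given above; the sample value and `R_MAX` are assumptions, and the handling of $r(t) = 0$ (where the phase of $e(t)$ is arbitrary) is a choice made here for illustration.

```python
import cmath
import math

def outphase(s, r_max):
    """Split s into two constant-envelope components S1 = s - e, S2 = s + e."""
    r = abs(s)
    if r > r_max:
        raise ValueError("|s| must not exceed r_max")
    if r == 0:
        e = 1j * r_max   # limit of e(t) for r -> 0, phase chosen arbitrarily
    else:
        e = 1j * s * math.sqrt(r_max**2 / r**2 - 1)
    return s - e, s + e

R_MAX = 1.0                        # assumed maximum envelope
s = 0.6 * cmath.exp(1j * 0.7)      # hypothetical amplitude- and phase-modulated sample
s1, s2 = outphase(s, R_MAX)

print(abs(s1), abs(s2))            # both equal r_max (constant envelope)
print(s1 + s2, 2 * s)              # the quadrature signal cancels in the sum
```

The sum of the two components is $2s(t)$ regardless of $r(t)$, which is why any gain or phase imbalance between the two branches shows up directly as a residue of $e(t)$ at the output.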

When a conventional matched combiner is used for summing the signals, much of the efficiency inherent in Outphasing is lost [40], [42]. If the output voltages of the individual PAs can be represented as ideal voltage sources ($V_1$ and $V_2$) and a non-isolated combiner is used, the efficiency can be preserved, as the DC power consumption scales with the load impedance, so that a high efficiency can be achieved independently of the phase difference between $V_1$ and $V_2$ [42]. However, driving the load resistance differentially presents a variable reactive load to the PAs [42]. To cancel the reactive contribution and achieve a high efficiency at a certain phase difference between $V_1$ and $V_2$, two compensating Chireix elements ($C_X$ and $L_X$) can be used [43]. This means that a high efficiency can be achieved even at a high level of power back-off.

Suitable PAs for Outphasing are typically Class-D, owing to its low sensitivity to load variations, and Class-E, owing to its high efficiency and absorption of the drain capacitance. A SiGe 0.18\( \mu \)m Class-D Outphasing PA with transmission lines and Chireix compensation elements is presented in [44]. Recently, a Class-D Outphasing PA in 45nm CMOS was successfully used to amplify WiFi/WiMAX signals with average output power levels around 20dBm [45]. In [46] a differential Class-E CMOS PA is used together with an external wideband balun transformer. In order to compensate for the varying load impedance, parallel capacitors are placed next to the supply inductors and are dynamically switched during operation of the PA.

Figure 3.20: Reactive compensation

3.5 References



Chapter 4

Matching Techniques

4.1 Introduction

In the previous chapter it was assumed that the load connected to the PA could take any desired value. However, the antenna impedance usually has to be transformed to a lower value to achieve a sufficiently high output power. As CMOS technologies continue to scale, the available voltage headroom is reduced significantly, and the risk of destroying the thin gate oxides increases. A simple example demonstrates the necessity of using matching networks. Assume there is an available voltage swing of 1V. This would generate as little as 10mW across a 50Ω resistor, which is not sufficient for many applications. In the previous chapter a great portion of the material dealt with the efficiency of the different classes, but no effort was spent on investigating the load itself. In this chapter we will see how the design of the matching network has a significant impact on its own efficiency, and consequently also on the total efficiency of the transmitter. Moreover, the input and interstage matching in a multi-stage amplifier will be discussed.
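The 10mW figure follows from $P = V^2/(2R)$ for a sinusoid with a peak amplitude of 1V. The short sketch below reproduces it and, as a hypothetical example, computes the load resistance that would be needed for a 200mW target with the same swing; the target power is an assumption chosen for illustration.

```python
def sine_power(v_peak, r_load):
    """Average power of a sinusoid with peak amplitude v_peak across r_load."""
    return v_peak**2 / (2 * r_load)

def load_for_power(v_peak, p_out):
    """Load resistance needed to reach p_out with peak amplitude v_peak."""
    return v_peak**2 / (2 * p_out)

p_50 = sine_power(1.0, 50.0)          # power into 50 ohm with 1 V swing
r_needed = load_for_power(1.0, 0.2)   # hypothetical 200 mW target

print(f"P into 50 ohm: {p_50 * 1e3:.1f} mW")
print(f"R for 200 mW:  {r_needed:.2f} ohm")
```

The required resistance is far below 50Ω, which is the down-transformation the matching network has to provide.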
4.2 Conjugate and Power Match

Figure 4.1a shows a single transistor amplifier stage with internal output impedance comprised by a parallel resistor (neglecting the drain capacitance). Considering the maximum power theorem [1], we would choose a load resistor equivalent to the real part of the generator’s impedance [2] in order to achieve maximum output power.

However, a quick analysis reveals that 50% of the power would be lost in the internal resistor, making this choice of load resistor unattractive [3], [4]. Additionally, the maximum power theorem does not pay attention to the physical limitations of the device, such as the maximum allowed drain voltage before gate oxide breakdown or how much current the device can deliver. Figure 4.1b shows that, ideally, the maximum allowed drain voltage is quickly reached when the load resistor ($R_L$) is chosen equal to the internal output resistor ($R_{out}$), while the current is significantly lower than the physical maximum $I_{max}$. Choosing instead a load resistor such that the total resistance at the drain is approximately $V_{max}/I_{max}$, as in (4.1) with $R_{out} \gg R_L$, utilizes the transistor capacity in a better way.

$$\frac{R_{out} R_L}{R_{out} + R_L} = \frac{V_{max}}{I_{max}} \quad (4.1)$$

Figure 4.1: (a) Transistor with output resistance ($R_{out}$) and load resistor ($R_L$) (b) Conjugate match ($R_L=R_{out}$) and loadline match ($R_L=V_{max}/I_{max}$); $R_{out} \gg R_L$. 
Consequently, choosing a load resistor other than $R_{\text{out}}$ leads to higher output power, higher efficiency, and improved utilization of the transistor. This means that the maximum power theorem is not useful when designing the output stage of a PA, as we are matching for optimum power characteristics to squeeze out the maximum power available from the transistor.
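The difference between the two choices of load can be illustrated with an idealized model. The sketch below assumes a Class-A device whose drain voltage may swing over $0 \ldots V_{max}$ and whose drain current may swing over $0 \ldots I_{max}$, with a purely resistive $R_{out}$; the numerical values are assumptions and do not correspond to a specific transistor.

```python
def parallel(r1, r2):
    """Resistance of r1 in parallel with r2."""
    return r1 * r2 / (r1 + r2)

def load_power(r_load, r_out, v_max, i_max):
    """Power delivered to r_load when the swing is limited by v_max or i_max."""
    r_total = parallel(r_out, r_load)
    # The voltage amplitude is clipped either by V_max/2 or by (I_max/2)*R_total.
    v_amp = min(v_max / 2, (i_max / 2) * r_total)
    return v_amp**2 / (2 * r_load)

V_MAX, I_MAX, R_OUT = 2.0, 0.5, 100.0       # assumed device limits

r_conj = R_OUT                              # conjugate match, R_L = R_out
r_line = 1 / (I_MAX / V_MAX - 1 / R_OUT)    # loadline match, solves (4.1) for R_L

p_conj = load_power(r_conj, R_OUT, V_MAX, I_MAX)
p_line = load_power(r_line, R_OUT, V_MAX, I_MAX)
print(f"conjugate: {p_conj * 1e3:.1f} mW, loadline: {p_line * 1e3:.1f} mW")
```

With the conjugate match the voltage limit is reached at a small fraction of the available current, whereas the loadline match uses the full voltage and current swing simultaneously.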

### 4.3 Load-pull

The effect of using a loadline match instead of a conjugate match is also evident in Figure 4.2, where a Class-A amplifier has been matched in two different ways. The solid line represents the power characteristics for a conjugate match at low input drive levels, while the dashed line represents a power match (loadline). The figure shows that the conjugate matched amplifier has a higher gain at low input drive levels than the power matched amplifier. However, the conjugate matched amplifier has a lower 1dB compression point and lower saturated output power than the power matched case. In linear amplifier design, the 1dB compression point is a key parameter to evaluate the linear performance of a PA, and a slightly lower gain is usually acceptable if a higher 1dB compression point can be achieved. Typically, a power matched amplifier can push the compression parameters to 1-2dB higher levels [2], [4].

**Figure 4.2:** Compression characteristics for conjugate (c) and power match (p) with markers at maximum linear points ($A_c$, $A_p$), and at the 1dB compression points ($B_c$, $B_p$)

Obviously, the IV characteristic of the transistors in Figure 4.1 was very ideal. A more realistic characteristic is shown in Figure 4.3 for a MOS device, where the characteristic of a typical power device (e.g. GaAs) is also drawn. Apparently, the loadline concept works better for the power device than for the MOS device, which has a relatively soft transition from the linear to the saturation region. For the power transistor, a suitable choice of load resistor would be [2]:

\[ R_L = \frac{V_{\text{max}} - V_{\text{knee}}}{I_{\text{max}}} \quad (4.2) \]

This soft transition is very pronounced in deep-submicron CMOS technologies, where the knee voltage may be as high as 50% of the supply voltage [2], whereas the knee voltage in typical power transistors is about 10-15% of the supply voltage. Due to the high knee voltage, the loadline concept may not be very useful when determining the optimum load resistor for the MOS device, as the capability of the transistor would not be fully utilized. Therefore, a better approach is to use the load-pull technique to determine the optimum load impedance. In the load-pull technique, a calibrated load capable of covering the Smith chart is varied at the output of the PA. In the absence of calibrated mechanical tuners [4] or in an early design stage, software packages can perform virtual load-pull simulations with an accuracy mainly determined by the transistor models.

Figure 4.3: Loadline for a typical power transistor and a CMOS transistor in Class-A biasing

The typical output from load-pull simulations is a set of power contours representing the boundaries between specific output power levels. Typically, the maximum power level and the 1dB and 2dB power contours are of interest, as the compression points are directly related to the linearity of an amplifier providing linear amplification. From the power contours, an optimum or suitable load impedance can be chosen in order to allow for component mismatches. Figure 4.4 shows the result of a load-pull simulation for a 1000μm device in a 130nm CMOS technology with a 1.5V supply voltage, where the maximum output power was found to be about 23dBm with an optimum load of approximately 5+0.5jΩ. Obviously, a load-pull simulation based on existing CMOS transistor models includes the drain capacitance previously neglected, which for large devices may approach several pF, leading to a significant impact on the impedance levels at the drain of the transistor [4]. Due to the large-signal variations in PAs, the motive for using load-pull simulations is further strengthened, since the definition of output impedance becomes vague due to the varying output resistance and capacitances.

4.4 Matching Network Design

Up to now, the load impedance has been considered to be purely resistive, but as shown in the load-pull simulations, the load impedance may contain imaginary parts. Furthermore, it was also clear, as indicated in the introduction of this chapter, that the resistive portion of the load impedance may be significantly smaller than the common 50Ω in order to achieve sufficient output power. To provide a down-transformation of the 50Ω load impedance, typically an L-match network is used.

### 4.4.1 L-Match

Here the load transformation only transforms the 50Ω load to a lower resistive value, to demonstrate the efficiency of such a matching network. The L-match typically consists of a capacitor ($C$) in parallel with the 50Ω load impedance ($R_{out}$), as in Figure 4.5. The parallel combination of the capacitor and the load impedance creates a negative reactance ($-j\overline{X_C}$). In order to compensate for the negative reactance, an inductor is placed in series with the parallel combination, as in Figure 4.5. In an ideal matching network without any losses, all the power is dissipated in the load ($R_{out}$). But since the matching components have finite Q, and due to the frequent use of this type of matching network to achieve sufficient output power, a derivation is given. From equations (4.3), (4.5), and Figure 4.5, it is clear that for infinite quality factors of the inductor and capacitor, the Power Enhancement Ratio (E) [5] as defined in (4.6) is $1+Q_m^2$.

\[
Q_m = \frac{R_{out} \parallel X_C Q_C}{X_C} = \frac{R_{out} Q_C}{R_{out} + X_C Q_C} \quad (4.3)
\]

\[
\overline{X_C} = X_C \left( \frac{Q_m^2}{1 + Q_m^2} \right) \quad (4.4)
\]

\[
\overline{R_{out}} = \frac{1}{1 + Q_m^2} \left( \frac{X_C R_{out} Q_C}{R_{out} + X_C Q_C} \right) = \frac{X_C Q_m}{1 + Q_m^2} \quad (4.5)
\]

\[
E = \frac{P_{out \ with \ matching \ network}}{P_{out \ without \ matching \ network}} \quad (4.6)
\]

Denote the maximum voltage across the load as $V_{max}$ when no matching network is used. The E for a lossless matching network is then just $1+Q_m^2$, or the ratio of the load resistance ($R_{out}$) to the ideal input resistance ($R_{in}$). However, due to the finite quality factors, there are losses in the signal path, which are defined as the Insertion Loss (IL) [6]. Moreover, the IL can be divided into two parts, one for the inductor loss ($IL_1$) (4.9) and one for the capacitor loss ($IL_2$) (4.10). Consequently, E can be rewritten (4.8) as a function of the impedance transformation ($r$) and the insertion losses ($IL_1$ and $IL_2$), and eventually it can be defined as a function of the quality factors.

Consequently, if the quality factors and the desired output power are known, the efficiency can be calculated by solving the equations in a given order. Determining E from (4.6) leads to the computation of $Q_m$ from (4.13), so that $X_C$ can be computed from (4.3). As the inductor must cancel the reactance created by the capacitor, $X_L$ must be equal to $\overline{X_C}$ in (4.4). Moreover, the input resistance, $R_{in}$, and the total insertion loss (IL) can be evaluated from (4.12) and (4.11), respectively [7].

$$IL = \frac{\text{Power received by load}}{\text{Power received by load} + \text{Power loss}} \quad (4.7)$$

$$E = \frac{V_{max}^2/(2R_{in}) \, IL}{V_{max}^2/(2R_{out})} = \frac{R_{out}}{R_{in}} IL = r \, IL \quad (4.8)$$

$$IL_1 = \frac{i_{IN}^2 \overline{R_{out}}}{i_{IN}^2 \left(\overline{R_{out}} + \frac{X_L}{Q_L}\right)} = \frac{1}{1 + \frac{Q_m}{Q_L}} \quad (4.9)$$

$$IL_2 = \frac{V_o^2/(2R_{out})}{V_o^2/(2(R_{out} \parallel X_C Q_C))} = \frac{X_C Q_C}{X_C Q_C + R_{out}} = 1 - \frac{Q_m}{Q_C} \quad (4.10)$$

$$IL = IL_1 IL_2 = \frac{1 - \frac{Q_m}{Q_C}}{1 + \frac{Q_m}{Q_L}} \quad (4.11)$$

When implementing the matching networks on PCBs, the quality factors can be quite high. However, when on-chip matching networks are used, they tend to be quite lossy, due to the low Q of the inductors. To evaluate the performance of on-chip L-match network, the derived equations can be used.

As the Q of the on-chip capacitors is much higher than that of the inductors, $Q_C$ is assumed to be infinitely large in the equations (4.3)-(4.13). In Figure 4.6, the desired E is swept for a variety of inductor quality factors ($Q_L = Q$), where the impact of a low inductor quality factor is apparent. This is especially troublesome in low-voltage CMOS technologies, where significant enhancement ratios are needed to achieve sufficient output power. One should also recall that the efficiency plotted only corresponds to the L-matching network itself. Losses associated with the amplifier are not included, and therefore the efficiency of a complete transmitter is lower than the efficiencies plotted in Figure 4.6.

\[
R_{in} = \overline{R_{out}} + \frac{X_L}{Q_L} = \frac{R_{out}}{1 + Q_m^2} \left( 1 - \frac{Q_m}{Q_C} \right) \left( 1 + \frac{Q_m}{Q_L} \right) \quad (4.12)
\]

\[
E = \frac{1 + Q_m^2}{\left( 1 + \frac{Q_m}{Q_L} \right)^2} \quad (4.13)
\]
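The sweep of the enhancement ratio can be sketched as follows, assuming $Q_C \to \infty$ as stated above. Solving (4.13) for $Q_m$ gives a quadratic equation; the closed-form root used here is derived for this example (it is not quoted from the references) and is valid for $1 < E < Q_L^2$. The function names are hypothetical.

```python
import math

def q_m_for_enhancement(e, q_l):
    """Solve (4.13) for Q_m; valid for 1 < E < Q_L**2."""
    if not 1 < e < q_l**2:
        raise ValueError("E must lie between 1 and Q_L**2")
    return (e / q_l + math.sqrt(e - 1 + e / q_l**2)) / (1 - e / q_l**2)

def enhancement(q_m, q_l):
    """Power Enhancement Ratio per (4.13)."""
    return (1 + q_m**2) / (1 + q_m / q_l)**2

def efficiency(q_m, q_l):
    """Insertion loss per (4.11) with Q_C taken as infinite."""
    return 1 / (1 + q_m / q_l)

for q_l in (5.0, 10.0, 20.0):            # assumed inductor quality factors
    for e in (2.0, 5.0, 10.0, 20.0):     # assumed desired enhancement ratios
        q_m = q_m_for_enhancement(e, q_l)
        print(f"Q_L={q_l:4.0f}  E={e:4.0f}  Q_m={q_m:6.2f}  "
              f"efficiency={efficiency(q_m, q_l):.2f}")
```

The printed values show the trend discussed in the text: for a fixed $Q_L$ the efficiency drops as E grows, and a higher $Q_L$ improves the efficiency at any given E.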

Figure 4.6: Efficiency [8] of L-matching network for inductor quality factor ($Q_L = Q$) and Power Enhancement Ratio (E)
4.4.2 Balun

From Figure 4.6 it was clear that the efficiency of the L-match network dropped significantly for large power enhancement ratios. Therefore, in order to obtain a higher efficiency, two amplifiers could be operated in parallel and their output power combined. A convenient way of combining the power from two amplifiers is to use a balun [4], [9] as in Figure 4.7. By operating the amplifiers differentially, twice the voltage swing is available, which means that the differential impedance the overall amplifier has to drive is four times as large as if a single-ended amplifier had been used. As each amplifier sees a higher impedance than if a single L-match had been used, each amplifier can utilize a lower power enhancement ratio with a higher efficiency. Applying nodal analysis to the differential amplifier connected to a balun in Figure 4.7, the input impedance of the balun can be found as in (4.14) (neglecting the parasitic resistances in the inductors).

\[
Z_{in} = \frac{\frac{4L}{C} R_{out} + \left( \omega L - \frac{1}{\omega C} \right)^2 R_{out} + j \left( 2R_{out}^2 - 2 \frac{L}{C} \right) \left( \omega L - \frac{1}{\omega C} \right)}{4R_{out}^2 + \left( \omega L - \frac{1}{\omega C} \right)^2} \tag{4.14}
\]

![Figure 4.7: Lattice-type balun inside the box with dashed lines driven by a differential signal (+\(v_{IN}\) and –\(v_{IN}\))](image)

If the input impedance is supposed to be resistive (4.15), the component values can be calculated according to (4.16).

\[
Z_{in} = \left\{ \omega L = \frac{1}{\omega C} \right\} = \frac{L}{R_{out}C} \tag{4.15}
\]

\[
\sqrt{Z_{in}R_{out}} = \omega L = \frac{1}{\omega C} \tag{4.16}
\]
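The design relations for the lattice-type balun can be sketched numerically. The code below computes $L$ and $C$ from (4.16) and evaluates the input impedance with the numerator and denominator of (4.14), assuming lossless components. The target impedance, load, and frequency are assumptions for illustration, and the function names are hypothetical.

```python
import math

def balun_components(z_in, r_out, freq):
    """L and C per (4.16) for a resistive input impedance z_in."""
    x = math.sqrt(z_in * r_out)        # omega*L = 1/(omega*C)
    omega = 2 * math.pi * freq
    return x / omega, 1 / (omega * x)

def balun_input_impedance(l, c, r_out, freq):
    """Input impedance per (4.14), parasitic resistances neglected."""
    omega = 2 * math.pi * freq
    x = omega * l - 1 / (omega * c)
    num = (4 * l / c * r_out + x**2 * r_out
           + 1j * (2 * r_out**2 - 2 * l / c) * x)
    return num / (4 * r_out**2 + x**2)

F0 = 2.45e9      # assumed operating frequency in Hz
R_OUT = 50.0     # load resistance in ohm
Z_TARGET = 10.0  # assumed differential input impedance in ohm

l_b, c_b = balun_components(Z_TARGET, R_OUT, F0)
z_at_f0 = balun_input_impedance(l_b, c_b, R_OUT, F0)
z_off = balun_input_impedance(l_b, c_b, R_OUT, 2.0e9)

print(f"L = {l_b * 1e9:.2f} nH, C = {c_b * 1e12:.2f} pF")
print(f"Z_in at f0: {z_at_f0:.3f}, at 2.0 GHz: {z_off:.3f}")
```

At the design frequency the reactive term vanishes and the input impedance reduces to $L/(R_{out}C)$ as in (4.15); away from it, a reactive part appears.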

Suppose the balun is implemented on-chip. As for the L-matching network we assume the capacitors are lossless. The nodal analysis shows that the output voltage across the load can be written as [8]:

\[
v_{OUT} = v_{IN} \frac{R_{out} \left( 1 + \omega^2 CL - j \omega CR_L \right)}{R_{out} + R_L - \omega^2 R_{out} CL + j \omega R_{out} CR_L + j \omega L} \tag{4.17}
\]

The nodal analysis also makes it possible to calculate the currents through the inductors and the corresponding losses, such that an overall efficiency can be computed. In a similar way as for the L-match network, the power enhancement ratio is swept and the efficiency can be evaluated. Figure 4.8 shows the efficiency of an L-match and a balun for different power enhancement ratios. For enhancement ratios larger than two, the balun outperforms the L-match, and as the ratio is approaching one, the matching network can be left out completely.

Due to these properties, the balun was used in the implemented amplifiers presented in Paper 1, Paper 2, and Paper 4, but with off-chip components. The quality factors of the off-chip matching networks are significantly higher, but the efficiency shows the same trends as in Figure 4.6 and Figure 4.8. Since the balun cannot be directly connected to the output of the transistors, bondwires and transmission lines were used at the immediate output of the differential amplifiers as in Figure 4.9.

![Figure 4.8: Efficiency of L-match and balun for $Q_L=10$](image-url)

For simplicity we assume that the bondwire inductance and the interconnections from the PA to the balun can be represented by $L_P$, and that the balun makes an impedance transformation from a resistive value $R_2$ to a resistive $R_{out}$. A pre-matching capacitor ($C_P$) was used before the balun to compensate for the bondwire inductance and the interconnection lines from the PA to the balun [10], and to transform the optimum load impedance ($\text{Re}\{Z_1\}$) to a higher level ($R_2$). Analyzing the matching network in Figure 4.9, the relationships between the parasitic inductance $L_P$, the virtual resistance $R_2$ of the balun, the balun components ($L_{Bx}$ and $C_{Bx}$), and the pre-matching capacitor are found in (4.18)-(4.20).

$$C_P = \frac{2L_P\omega - \text{Im}[Z_1]}{\omega(\text{Re}[Z_1]^2 + 4L_P^2\omega^2 - 4L_P\omega \text{Im}[Z_1] + \text{Im}[Z_1]^2)} \quad (4.18)$$

$$R_2 = \frac{\text{Re}[Z_1]^2 + 4L_P^2\omega^2 - 4L_P\omega \text{Im}[Z_1] + \text{Im}[Z_1]^2}{\text{Re}[Z_1]} \quad (4.19)$$

$$\sqrt{R_2 R_{out}} = \omega L_{B1} = \omega L_{B2} = \frac{1}{\omega C_{B1}} = \frac{1}{\omega C_{B2}} \quad (4.20)$$
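The pre-matching relations can be exercised with a short sketch. It assumes, as an interpretation of (4.18)-(4.19) rather than something read from Figure 4.9, that the PA sees a total series inductance of $2L_P$ followed by $C_P$ in parallel with $R_2$, and it uses the squared form $4L_P^2\omega^2$ required for dimensional consistency. The values of $Z_1$, $L_P$, and the frequency are hypothetical.

```python
import math

def prematch(z1, l_p, freq):
    """C_P and R_2 per (4.18)-(4.19) for optimum load z1 and inductance l_p."""
    omega = 2 * math.pi * freq
    x = 2 * omega * l_p - z1.imag
    denom = z1.real**2 + x**2      # expands to the common term in (4.18)-(4.19)
    return x / (omega * denom), denom / z1.real

def balun_reactance(r_2, r_out):
    """omega*L_B = 1/(omega*C_B) per (4.20)."""
    return math.sqrt(r_2 * r_out)

def seen_by_pa(c_p, r_2, l_p, freq):
    """Assumed topology: 2*L_P in series with (R_2 parallel C_P)."""
    omega = 2 * math.pi * freq
    return 2j * omega * l_p + r_2 / (1 + 1j * omega * c_p * r_2)

Z1 = complex(10.0, 2.0)   # hypothetical optimum load from load-pull, in ohm
L_P = 0.5e-9              # hypothetical bondwire/interconnect inductance in H
F0 = 2.45e9               # assumed operating frequency in Hz

c_p, r_2 = prematch(Z1, L_P, F0)
x_b = balun_reactance(r_2, 50.0)
print(f"C_P = {c_p * 1e12:.2f} pF, R_2 = {r_2:.1f} ohm, X_B = {x_b:.1f} ohm")
print("check:", seen_by_pa(c_p, r_2, L_P, F0))
```

The check at the end reconstructs $Z_1$ from the computed $C_P$ and $R_2$, which confirms that the two expressions are mutually consistent under the assumed topology.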

### 4.5 Input and Interstage Matching

#### 4.5.1 LC-Based Matching Network

The matching issues discussed so far mainly target the last stage of the power amplifier. However, the amplifier typically consists of more than a single stage to achieve sufficient gain. The amplifier in Paper 1 [11], Figure 4.10, uses input and interstage matching networks based on inductors and capacitors. As the power amplifier targeted a high level of linearity, a high level of linearity must be maintained through the amplifier chain to the output stage. A simplified schematic of the interstage matching between the first and second stage is considered in Figure 4.11. For simplicity $M_1$ and $M_2$ are assumed to have an output impedance comprising a parallel resistor ($R_{out}$) and drain capacitance ($C_D$). Further, $C_D$ is assumed to be much smaller than the equivalent input capacitance ($C_{in}'$), as defined in (4.22) and Figure 4.11. $R_{out}$ is assumed to be sufficiently large to be neglected.

![Figure 4.10: Power amplifier with LC-based input and interstage matching networks [11]](image-url)

The inductor ($L_2$) is used to tune out the equivalent capacitance ($C_{in}'$) of $C_2$ ($C_2 = k C_{in}$) and $C_{in}$ (gate capacitance) according to (4.21) at the operating frequency $\omega_0$. The capacitor $aC_2$ in parallel with $C_{in}$ represents the parasitic capacitance of $C_2$ to the substrate. The combination of $C_2$ and $C_{in}$ creates a voltage divider, leaving a signal swing at the gate of $M_3$ as in (4.23).

$$\omega_0 = \frac{1}{\sqrt{L_2 C_{in}'}} \quad (4.21)$$

$$C_{in}' = \frac{C_2 (a C_2 + C_{in})}{C_2 + a C_2 + C_{in}} = \frac{k (ka + 1)}{k + ka + 1} C_{in} = \frac{1}{b} C_{in} \quad (4.22)$$

$$v_G = v_D \frac{C_2}{C_2 + a C_2 + C_{in}} = v_D \frac{k}{k + ak + 1} \quad (4.23)$$

To maximize the voltage swing at the gate of $M_3$, the capacitor $C_2$ should ideally be made infinitely large. However, the parasitic capacitance to the substrate and the large gate capacitance would then require unreasonably small inductance values. Additionally, since the target was to amplify WLAN signals [11], which requires significant linearity, the design is forced to use a limited voltage swing in order not to introduce significant distortion by driving $M_3$ with too large signals. Considering (4.22), the effective input capacitance is a factor $b$ lower than $C_{in}$ [7]. Thus, the inductance and its parallel resistor can be increased by a factor of $b$ (assuming constant $Q$). Additionally, the voltage gain is a factor of $b$ larger at $v_D$. By knowing the maximum available voltage swing at $v_D$ and the needed drive signal at $M_3$, the ratio between $C_2$ and $C_{in}$ can be determined. $C_2$ also separates the drain of $M_{1,2}$ from the gate of $M_3$, which makes it possible to bias $M_3$ via a large resistor, $R_2$.

![Figure 4.11: Interstage matching between first and second stage](image)
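The trade-off between gate swing and inductance can be explored with a small sketch of (4.21)-(4.23). The gate capacitance, parasitic ratio $a$, and operating frequency below are assumed values for illustration and are not taken from Paper 1.

```python
import math

def interstage(k, a, c_in, f0):
    """Return (C_in', b, v_G/v_D, L_2) per (4.21)-(4.23)."""
    c_eq = k * (k * a + 1) / (k + k * a + 1) * c_in   # (4.22)
    b = c_in / c_eq                                   # capacitance reduction factor
    divider = k / (k + a * k + 1)                     # v_G / v_D, (4.23)
    l_2 = 1 / ((2 * math.pi * f0)**2 * c_eq)          # resonance condition (4.21)
    return c_eq, b, divider, l_2

C_IN = 1e-12   # assumed gate capacitance of M3 in F
A = 0.1        # assumed substrate parasitic of C_2 as a fraction of C_2
F0 = 2.45e9    # assumed operating frequency in Hz

for k in (0.5, 1.0, 2.0, 5.0):   # C_2 = k * C_in
    c_eq, b, divider, l_2 = interstage(k, A, C_IN, F0)
    print(f"k={k:3.1f}  C_in'={c_eq * 1e12:.3f} pF  b={b:.2f}  "
          f"v_G/v_D={divider:.2f}  L_2={l_2 * 1e9:.2f} nH")
```

Increasing $k$ raises the fraction of the drain swing that reaches the gate of $M_3$ but also raises $C_{in}'$, so the required $L_2$ shrinks, which is the compromise described above.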

### 4.5.2 Transformer-Based Matching Network

Instead of using integrated inductors, integrated transformers have received a lot of attention lately [12]-[16] as they have proven to give satisfactory performance over a wide range of frequencies, and due to their capability of signal combining and impedance transformation. The basic principle is shown in Figure 4.13 [12].

A current passes through the primary inductor ($L_P$). The magnetic flux created by the current in the primary winding ($I_1$) induces a current ($I_2$) in the secondary winding ($L_S$), which produces a voltage ($V_2$) across the load ($Z_S$) connected between the secondary terminals. The impedance seen at the primary side of the transformer is $Z_P$, and the transformation of the voltages and currents in an ideal transformer are related to the turns ratio as in equation (4.24). The strength of the magnetic coupling between the two windings is defined as the magnetic coupling coefficient, $k$ (=1 for an ideal transformer), by equation (4.25), where M is the mutual inductance between the two windings.

The integrated transformers are, however, in CMOS technologies implemented as coupled integrated inductors, which means that the winding resistance, and thereby the losses in the transformer, can be reduced by using several metal layers on top of each other. The magnetic coupling between the windings is mainly determined by the width and spacing of the traces [12], [17]. The magnetic coupling between the windings can be maximized by letting two adjacent conductors belong to two different windings, as the mutual inductance increases. Two conductors of the same winding only contribute to the self-inductance, and lower the coupling coefficient (4.25). Moreover, the coupling factor was found to be higher for narrow traces than for wide traces [12], but at the same time the winding resistance, and thus the loss, increases, leading to a compromise between coupling and loss.

\[ n = \frac{V_2}{V_1} = \frac{I_1}{I_2} = \sqrt{\frac{L_S}{L_P}} = \sqrt{\frac{Z_S}{Z_P}} \quad (4.24) \]

\[ k = \frac{M}{\sqrt{L_P L_S}} \quad (4.25) \]

To evaluate the performance of a transformer, a full 3D electromagnetic simulation would be preferable, but due to the time-consuming simulations, lumped models representing the physical operation of the transformer are tractable in the early design phase. The planar square transformers used in the amplifier in Paper 2 [18] are based on a lumped transformer model described in [19], where the circuit component values were computed using FastHenry [20] and FastCap [21]. Figure 4.14 shows the lumped model, which includes coupling to the substrate, inductances, the coupling between the primary and secondary windings, capacitive coupling, and the series resistance in the windings. As the quality factor is important for the integrated inductors, so are the quality factors of the primary and secondary windings of the integrated transformer, as well as the coupling factor \( k \). Consequently, when a suitable transformer has been found using the lumped transformer models, an electromagnetic simulation should be performed in order to improve the simulation model of the amplifier. A common figure-of-merit used to characterize transformers has been the maximum available gain \( (G_{\text{max}}) \) [22], [23], [24], defined in terms of S-parameters [6] for any termination impedances as in equations (4.26) and (4.27). The maximum available gain \( (G_{\text{max}}) \) is a measure of the gain of the system when the source and load reflection coefficients are conjugately matched to \( S_{11} \) and \( S_{22} \) [22], and puts a number on how efficient the transformer can be when transferring power from the input to the output under optimal conditions.

![Figure 4.14: Lumped transformer model](image-url)

Instead of expressing \( G_{\text{max}} \) in terms of S-parameters, the equivalent circuit parameters of the transformer T model [6] can be used [24]. For the T model shown in Figure 4.15 [5], the efficiency was derived as in (4.28), while using tuning capacitors and optimum choice of \( L_P \). This model was further used in the development of a fully-integrated GSM/GPRS PA [25].

\[ G_{\text{max}} = \frac{|S_{21}|}{|S_{12}|} \left( k_s - \sqrt{k_s^2 - 1} \right) \quad (4.26) \]

\[ k_s = \frac{1 - \left| S_{11} \right|^2 - \left| S_{22} \right|^2 + \left| S_{11} S_{22} - S_{12} S_{21} \right|^2}{2 \left| S_{12} \right| \left| S_{21} \right|} \quad (4.27) \]
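A minimal sketch of how (4.26) and (4.27) can be evaluated from a set of two-port S-parameters is given below. The S-parameter values are hypothetical test cases (an ideal through connection and a matched attenuator), chosen because their maximum available gain is known in advance; they do not represent the transformers of Paper 2.

```python
import math

def stability_factor(s11, s12, s21, s22):
    """k_s per (4.27)."""
    delta = s11 * s22 - s12 * s21
    return ((1 - abs(s11)**2 - abs(s22)**2 + abs(delta)**2)
            / (2 * abs(s12) * abs(s21)))

def g_max(s11, s12, s21, s22):
    """Maximum available gain per (4.26); requires k_s >= 1."""
    k_s = stability_factor(s11, s12, s21, s22)
    if k_s < 1 - 1e-12:
        raise ValueError("k_s < 1: simultaneous conjugate match not possible")
    root = math.sqrt(max(k_s**2 - 1, 0.0))
    return abs(s21) / abs(s12) * (k_s - root)

# Hypothetical passive, reciprocal two-ports
lossless = g_max(0j, 1 + 0j, 1 + 0j, 0j)          # ideal through connection
attenuator = g_max(0j, 0.5 + 0j, 0.5 + 0j, 0j)    # matched 6 dB attenuator

print(f"lossless: {lossless:.3f}, attenuator: {attenuator:.3f}")
```

For a passive reciprocal network such as a transformer, $|S_{21}| = |S_{12}|$ and $G_{\text{max}}$ is at most one, so it can be read directly as the best-case power transfer efficiency.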

In (4.28), $Q_P$ and $Q_S$ are the quality factors of the primary and secondary windings, respectively. From (4.28), we can see that the efficiency can be maximized by using a coupling factor as close as possible to unity and making the Q of the primary and secondary windings as large as possible. However, the number of turns and the inductances are limited by the parasitic capacitances from the traces to the substrate, which limit the usable frequency range, as well as by the layout constraints of the process chosen, making it challenging to find an optimum transformer design. Moreover, in PA design the inductances of the transformers are limited by the large capacitances of the transistors when the transformers are used for interstage matching as in [18].

$$\eta = \frac{1}{1 + 2 \sqrt{\left(1 + \frac{1}{Q_P Q_S k^2}\right) \frac{1}{Q_P Q_S k^2}} + \frac{2}{Q_P Q_S k^2}} \quad (4.28)$$
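The dependence of the efficiency on $Q_P$, $Q_S$, and $k$ can be tabulated with the sketch below. It reads (4.28) with the term $2/(Q_P Q_S k^2)$ outside the square root, which is an assumption about the intended grouping; with that grouping the expression has the same form as (4.26) with $k_s = 1 + 2/(Q_P Q_S k^2)$. The quality factors and coupling factors used are illustrative values only.

```python
import math

def transformer_efficiency(q_p, q_s, k):
    """Efficiency per (4.28), with 2/(Q_P*Q_S*k^2) taken outside the root
    (assumed grouping)."""
    x = 1 / (q_p * q_s * k**2)
    return 1 / (1 + 2 * math.sqrt((1 + x) * x) + 2 * x)

for q in (5.0, 10.0, 20.0):        # assumed Q_P = Q_S = q
    for k in (0.6, 0.8, 0.9):      # assumed coupling factors
        eta = transformer_efficiency(q, q, k)
        print(f"Q={q:4.0f}  k={k:.1f}  efficiency={eta:.3f}")
```

The table reproduces the qualitative conclusion of the text: the efficiency rises monotonically with both the winding quality factors and the coupling factor.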

Figure 4.16 shows the amplifier presented in [18], which uses integrated transformers for the input and interstage matching. At the primary windings of both transformers ($T_1$ and $T_2$) tuning capacitors ($C_1$ and $C_2$) are located in order to reduce the losses [12], while the gate capacitance is put in series with the secondary windings (and its inductance) of the transformer. Moreover, since the primary and secondary windings of the implemented transformers are galvanically isolated, the center taps can be used for either biasing of the amplifying transistor, as in the first stage, or power supply of the amplification stage, as in the second stage. The input tuning capacitors ($C_1$) at the input of the first stage were replaced by a single off-chip component. The input power to the amplifiers was applied differentially with an external 50-to-100Ω balun connected to the signal source. The input impedance of the PAs was designed to present 100Ω differentially.

![Figure 4.15: Transformer T-model](image)

4.5.3 Cascode Stage

Both amplifiers in Paper 1 [11] and Paper 2 [18] utilize a cascode configuration in the amplifier stages, which is a combination of a common-source and a common-gate stage. The major reason is that in transconductance power amplifiers with an RF-choke, the drain voltage may reach levels approaching $2V_{DD}$ and therefore cause destructive oxide breakdown. To prevent oxide breakdown from occurring, usually another thick gate oxide transistor is added as a cascode transistor ($M_2$ and $M_4$) in order to protect the input transistors ($M_1$ and $M_3$) and split the voltage stress during normal operation. To provide the highest protection for the transistors, the gates of the cascode transistors should be biased at $V_{DD}$, but a lower bias level can provide better performance [26], as it reduces the smallest drain-source voltage for which the cascode transistor operates in the saturation region.

However, the cascode stage is also used to enable a higher output impedance and a higher supply voltage than if a single common-source stage had been used [27]. One has to make sure, though, that the cascode transistor is wide enough not to degrade the linearity, but not so wide that it significantly affects, e.g., the interstage matching ($C_D$ in Figure 4.11) [28] or the output matching [4]. The input transistors were also chosen as thick oxide transistors, due to the low breakdown voltage of the thin oxide low-voltage transistors. Consequently, large transistors were chosen to achieve sufficient gain, which results in a low input impedance of the device.
4.6 EM-Simulated Parasitics

Besides the many transistor parameters and parasitics that can be included in the transistor model, there are also parasitics that are not directly related to the MOS device itself. If the transistor is used in power amplifier applications, the current flowing through the transistors may reach several hundred milliamperes or even amperes. Consequently, not only the transistor has to withstand the large currents, but also the interconnects around the device. As the current flows between the drain and source, one solution is to stack several metal layers on top of each other at the drain and source to meet the current density limitations of the metal traces. This reduces both electromigration [29] and the resistive drop across the interconnects, but it also introduces more capacitive coupling between gate, drain, and source, as seen in Figure 4.17. Since not all metal layers are included in the transistor model, the additional capacitances and dielectric losses need to be taken into account and added to the existing transistor model. The parasitics can typically be represented as either a π or a T equivalent circuit [30].

![Diagram](image_url)

Figure 4.17: Parasitic capacitances between gate, drain, and source

Considering the amplifiers in Figure 4.10 and Figure 4.16, the cascode amplifier stage in Figure 4.18 would have layout parasitics associated in a similar way as in Figure 4.17. However, instead of inserting π or T equivalent circuits between every two nodes in the simulation model of the amplifier, the parasitic connections were approximated with series connections of capacitance and resistance between gate, drain, and source at the frequency of operation. This means that two components, one real and one imaginary, were used instead of six. The values of the parasitic components were estimated through electromagnetic simulations and, depending on how the signals were applied between a pair of terminals, two different extraction formulas were used. Equation (4.29) was used for differential signals, and (4.30) for computation of the input impedance at port 1 [31]. In a similar way, the signal traces include series inductance and series resistance, and for long signal traces [32] the parasitic capacitance to the substrate should be included for better accuracy.

\[
Z_{dd} = Z_{11} - Z_{12} - Z_{21} + Z_{22} \tag{4.29}
\]

\[
Z_{se} = Z_{11} - Z_{12} Z_{21} / Z_{22} \tag{4.30}
\]
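As an illustration of how (4.29) and (4.30) can be applied, the following minimal Python sketch evaluates both expressions for a set of Z-parameters and interprets the result as a series resistance and capacitance at the frequency of operation. The function names and all component values are hypothetical and chosen only for illustration; the T-network used to generate the Z-parameters is an assumed test case, not data from the designs.

```python
import math

def z_differential(z11, z12, z21, z22):
    """Differential impedance between ports 1 and 2, as in (4.29)."""
    return z11 - z12 - z21 + z22

def z_single_ended(z11, z12, z21, z22):
    """Input impedance at port 1 with port 2 shorted, as in (4.30)."""
    return z11 - z12 * z21 / z22

def series_rc(z, freq_hz):
    """Interpret a capacitive impedance z = R - j/(wC) as a series R-C pair."""
    if z.imag >= 0:
        raise ValueError("impedance is not capacitive at this frequency")
    omega = 2 * math.pi * freq_hz
    return z.real, -1.0 / (omega * z.imag)

# Hypothetical T-network: series arms za, zb and shunt arm zc (values assumed).
f0 = 2.45e9
omega = 2 * math.pi * f0
za = complex(1.0, -1.0 / (omega * 200e-15))  # 1 ohm in series with 200 fF
zb = complex(1.0, -1.0 / (omega * 200e-15))  # 1 ohm in series with 200 fF
zc = complex(5.0, -1.0 / (omega * 50e-15))   # 5 ohm in series with 50 fF

# Z-parameters of a T-network
z11, z12, z21, z22 = za + zc, zc, zc, zb + zc

zdd = z_differential(z11, z12, z21, z22)
zse = z_single_ended(z11, z12, z21, z22)
r_dd, c_dd = series_rc(zdd, f0)
r_se, c_se = series_rc(zse, f0)
print(f"differential: R = {r_dd:.2f} ohm, C = {c_dd * 1e15:.1f} fF")
print(f"single-ended: R = {r_se:.2f} ohm, C = {c_se * 1e15:.1f} fF")
```

For the T-network, the differential expression reduces to the sum of the two series arms, while the single-ended expression gives the first series arm plus the parallel combination of the other two arms, which is a convenient sanity check on the formulas.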

Considering the accuracy of the estimated parasitic impedances, the single-ended impedance is exact, since only one terminal is excited with a signal. In the differential case, the current going through the impedance depends on the amplitude of the differential signals, as well as the phase difference between them. For a 190 degree phase shift (instead of 180 degrees) and an amplitude ratio of two between the two differential signals (instead of one), the error in current between the approximate network and a π or T equivalent circuit representation is kept below 10%. For an amplitude ratio of 10 the maximum error stays below 20%. However, the parasitic impedance is placed in parallel with the small input impedance of the large devices, and the error introduced by photo-lithography effects can be as large as 20% in RC extraction from design to fabrication [33]. Consequently, there is no need for exaggerated optimization of the parasitic components, although using a larger number of components generally gives higher accuracy.

4.7 Class-E PA: Simulation Results and Measured Performance

The Class-E PA in Figure 4.19 was presented in Paper 4 and used a matching network as presented in Figure 4.9. A full simulation model was developed based on the proposed methodology to extract parasitics, PCB transmission lines, S-parameters of lumped components, and transistor models of the 130nm CMOS process. Figure 4.20 shows the measured and simulated output power, as well as the corresponding drain efficiencies at 2.45GHz. As seen in the figure, the difference between measured and simulated output power is negligible. At 1V supply voltage, the difference is ~0.2dB, and at 0.3V, the difference is less than 0.45dB.

![Class-E Amplifier in Paper 4](image_url)
The DE, however, shows a larger difference, which can have several causes. Obvious candidates are the PCB modeling of closely located traces at the output of the PA, different quality factors of the S-parameter models and the components on the PCB, soldering of PCB components, the accuracy of the transistor models, and the simulation model of the buffered driver stages. In Chapter 2 it became obvious that the accuracy of the transistor model was largely dependent on the parameters included in the model itself. Most characterized transistors are significantly smaller than the device used in this design (4000µm), and consequently scaling effects may exist. However, we should also keep in mind that the close match in output power can itself be the result of these inaccuracies. Nonetheless, by using a detailed simulation model that includes the major layout parasitics, it is shown that a very accurate prediction of output power and performance can be achieved.

Figure 4.20: Simulated (dashed line) and measured (solid line) output power and drain efficiency at 2.45GHz

4.8 References


Part II

Papers
Chapter 5

Paper 1

A 72.2Mbit/s LC-Based Power Amplifier in 65nm CMOS for 2.4GHz 802.11n WLAN

Jonas Fritzin\textsuperscript{(1)}, Ted Johansson\textsuperscript{(2)}, and Atila Alvandpour\textsuperscript{(1)}

\textsuperscript{(1)} Electronic Devices, Department of Electrical Engineering
Linköping University, SE-581 83 Linköping, Sweden
\{fritzin, atila\}@isy.liu.se
\textsuperscript{(2)} Infineon Technologies Nordic AB
Isafjordsgatan 16, SE-164 81 Kista, Sweden
ted.johansson@ieee.org

Abstract

This paper describes the design and evaluation of a power amplifier (PA) for WLAN 802.11n in 65nm CMOS technology. The PA utilizes 3.3V thick-gate oxide (5.2nm) transistors and a two-stage differential configuration with two integrated inductors for input and interstage matching. For a 72.2Mbit/s, 64-QAM 802.11n OFDM signal at an average and peak output power of 9.4dBm and 17.4dBm, respectively, the measured EVM is 3.8%. The PA meets the spectral mask up to an average output power of 14dBm.

5.1 Introduction

The power amplifier (PA) is a key building block in all RF transmitters. To lower the costs and allow full integration of a complete radio system-on-chip, it is highly desirable to integrate the entire transceiver and the PA in a single CMOS chip. However, integration of RF power amplifiers in low-cost CMOS technologies proves to be a challenging task [1].

While digital circuits benefit from the technology scaling, it is becoming significantly harder to meet the stringent requirements on linearity, output power, and power efficiency of PAs at lower supply voltages and in the presence of large on-chip parasitics [2]. This has recently triggered extensive studies to investigate the impact of different circuit techniques, design methodologies, and
5.2 Design and Implementation of the Power Amplifier

The PA in this work is realized as a differential PA with two amplification stages using integrated inductors (L₁ and L₂) for input and interstage matching, as shown in Figure 5.1. The differential structure enables a lower impedance transformation ratio for the off-chip output matching network, which leads to lower output currents and lower losses in the matching network compared to single-ended PAs. In Figure 5.1 the differential inductor (L₁) forms the input matching network together with the capacitors (C₁) for the first amplifier stage. The capacitors (C₁) are used to block the DC level of the input signal and for tuning to achieve resonance in the matching network at the operation frequency. C₁ forms a voltage divider with the gate capacitance and layout parasitics, and in order to minimize the losses large capacitors (C₁) may be needed.

In the interstage matching network, the capacitors (C₂) form a voltage divider with the large gate capacitance and parasitic capacitances of the second amplifier stage. As for C₁, the losses are minimized for large values of C₂. However, the large gate capacitance and parasitic capacitances would then require unreasonably small inductance values. Therefore it is advantageous to choose a smaller C₂ and a larger L₂, and permit some losses over C₂, to achieve a high input impedance of the second amplifier stage and thereby increase the voltage gain.
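The trade-off between the series capacitor and the inductor value can be illustrated with a simplified, single-ended sketch. It assumes that the series capacitor and the gate capacitance form an ideal capacitive divider, and that the inductor resonates with the series combination of the two capacitances. The function names, the gate capacitance, and the capacitor values below are hypothetical and are not taken from the design.

```python
import math

def divider_ratio(c_series, c_gate):
    """Fraction of the signal voltage reaching the gate in a capacitive divider."""
    return c_series / (c_series + c_gate)

def resonant_inductance(c_series, c_gate, freq_hz):
    """Inductance resonating with c_series in series with c_gate (assumed model)."""
    c_eff = c_series * c_gate / (c_series + c_gate)
    omega = 2 * math.pi * freq_hz
    return 1.0 / (omega ** 2 * c_eff)

F0 = 2.45e9       # operating frequency
C_GATE = 6e-12    # hypothetical gate plus parasitic capacitance

for c2 in (3e-12, 12e-12, 60e-12):  # hypothetical series capacitor values
    ratio = divider_ratio(c2, C_GATE)
    l_res = resonant_inductance(c2, C_GATE, F0)
    print(f"C2 = {c2 * 1e12:5.1f} pF: divider = {ratio:.2f}, L = {l_res * 1e9:.2f} nH")
```

Under these assumptions, a larger series capacitor passes more of the signal to the gate but pushes the required inductance down, whereas a smaller capacitor allows a larger inductor at the cost of some voltage division.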

To reduce the resistive losses in the inductors, the two upper layers in the seven-layer metal stack are connected to form one conductor. The thicknesses of the top aluminum and second top copper layers are 1.3µm and 0.6µm, respectively. At 2.45GHz the Q-values of the differential input and interstage inductors are approximately 8.5 (L₁) and 6.7 (L₂), with inductor values of 4.1nH (L₁) and 1.8nH (L₂) designed in Cadence Virtuoso Passive Component Modeller (VPCM) using the quasi-static EM solver. Typically, quasi-static EM solvers do not take field radiation into account and consider L and C as frequency independent.

The PA utilizes thick-gate oxide (5.2nm) transistors with a gate length of 0.6µm using a supply voltage of 3.3V. At full output power, the drain voltage of the PA can reach up to two times VDD. To ensure reliable operation and protect the transistors from hot electrons and breakdown due to high voltage peaks, each amplifying stage uses a pair of transistors in a cascode configuration. To provide the highest protection for the transistors, the gates of the cascode transistors are connected to VDD. The widths of the transistors in the first (M₁ and M₂) and second (M₃ and M₄) amplification stages are 0.4mm and 6mm, respectively.

5.3 Parasitics Extraction

Extraction of interconnect parasitics and inherited losses is used to predict the frequency behaviour and gain. Our simulation model of signal traces includes series inductance and series resistance extracted through electromagnetic (EM) simulations. The simulated S-parameters are converted into Z-parameters, and by applying the following two formulas for the impedances, the parasitic component values can be calculated. For differential signals, (5.1) is applied to calculate the differential impedance [7]. For single-ended signals, (5.2) is applied for Z-parameters in a T-type connection for a two-port network [8] with one port shorted.

\[
Z_{dd} = Z_{11} - Z_{12} - Z_{21} + Z_{22} \tag{5.1}
\]

\[
Z_{se} = Z_{11} - Z_{12}Z_{21}/Z_{22} \tag{5.2}
\]

To meet the current density limitations, and to reduce the losses in the drain and source connections at the output transistors, several metal layers are stacked on top of each other in the structure, as shown in Figure 5.2.

For such a structure, the capacitive coupling between gate, source, and drain is increased. Since the metal layers are not included in the existing transistor model, we need to add the parasitic capacitances and resistances from the metal layers into our extended model. The values were extracted through EM simulations and added in the simulation model as series connections of capacitance and resistance between the gate, drain, and source, as seen in Figure 5.3.

To accurately model the gate resistance at high frequencies, we added a vertical gate resistance [9], which due to dopant segregation during silicidation can be relatively high, and thus will influence \( f_{\text{max}} \) and gain of the PA. In Figure 5.3, this resistance is denoted as \( R_{\text{vgr}} \), inserted in series with the lateral gate resistance, \( R_{\text{lateral}} \), calculated from the gate sheet resistivity and layout geometry. The sum of both resistance values is approximately 3.2\( \Omega \) for the first amplifier stage and 250m\( \Omega \) for the second. The vertical gate resistance represents approximately 20% of the total gate resistance in our design. The estimated impact of these resistances is a gain reduction of 1.1dB.

The design of the PA is based on the described parasitics extraction approach. For evaluation of the PA design, the schematic from Cadence was simulated in the ADS WLAN 802.11g testbench together with the layout of the fabricated testboard using RFIC Dynamic Link. In such a setup the influence of parasitics can be evaluated and corrected.

![Figure 5.2: Parasitic capacitances between gate, drain and source.](image)

5.4 Experimental Results

Figure 5.4 shows a photograph of the 1x1 mm$^2$ fabricated chip. The chip was directly bonded on the testboard, which was a two-layer 0.5mm thick FR4 PCB with $\varepsilon_r$ of 4.2 and tan $\delta$ of 0.035.

The input power to the testboard is applied differentially with an external 50-to-100 Ohm balun connected to the signal source. The PA utilizes an off-chip lumped element balun [10] (Figure 5.5) for differential-to-single-ended conversion and load impedance transformation. A pre-matching network (L_p and C_p) was used to compensate for the bond wire inductance and interconnection lines from the PA to the balun (part of L_p). For a differential-to-single-ended impedance conversion, from R_2 to R_L, the following equations apply at the operating frequency:

\[
\sqrt{R_2 R_L} = \omega L_{B1} = \omega L_{B2} = \frac{1}{\omega C_{B1}} = \frac{1}{\omega C_{B2}} \tag{5.3}
\]
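As a numerical illustration of (5.3), the sketch below computes the balun inductance and capacitance for a given pair of resistances. The differential resistance of 20 Ohm and the use of the 2.45GHz target frequency are assumptions made only for this example, and the function name is hypothetical; the actual component values on the testboard are not stated here.

```python
import math

def lattice_balun_values(r_diff, r_load, freq_hz):
    """Return (L, C) satisfying sqrt(R2*RL) = w*L = 1/(w*C), as in (5.3)."""
    z_c = math.sqrt(r_diff * r_load)  # characteristic impedance of the balun
    omega = 2 * math.pi * freq_hz
    return z_c / omega, 1.0 / (omega * z_c)

R2 = 20.0    # hypothetical differential resistance seen by the PA, in ohm
RL = 50.0    # single-ended load resistance, in ohm
F0 = 2.45e9  # assumed operating frequency

l_b, c_b = lattice_balun_values(R2, RL, F0)
print(f"L_B = {l_b * 1e9:.2f} nH, C_B = {c_b * 1e12:.2f} pF")
```

Both inductors take the value L_B and both capacitors the value C_B, so a single pair of values defines the ideal balun at the operating frequency.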

5.5 Measurement Results

The target frequency of the PA was set to 2.45GHz in the design work. After tuning of the output matching network, the best performance in terms of power gain was found at 2.57GHz, a difference of 4.8% from the target frequency. At 2.57GHz the input and output 1dB compression points (P1dB) were found to be 5.5dBm and 17dBm, respectively, with a power-added efficiency (PAE) of 3.3%. The Pout and gain curves are found in Figure 5.6.
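The quoted compression-point figures can be related to each other with the usual definition of power-added efficiency, PAE = (Pout − Pin)/PDC. The sketch below converts the reported power levels from dBm to mW and computes the power gain and the DC power implied by the reported PAE. The implied DC power is only an estimate derived from the numbers above, not a measured value reported in the paper, and the function names are hypothetical.

```python
import math

def dbm_to_mw(p_dbm):
    """Convert a power level from dBm to mW."""
    return 10 ** (p_dbm / 10)

def pae(p_out_mw, p_in_mw, p_dc_mw):
    """Power-added efficiency: (Pout - Pin) / Pdc."""
    return (p_out_mw - p_in_mw) / p_dc_mw

def implied_dc_power_mw(p_out_dbm, p_in_dbm, pae_fraction):
    """DC power consistent with a given PAE (derived estimate, not measured)."""
    return (dbm_to_mw(p_out_dbm) - dbm_to_mw(p_in_dbm)) / pae_fraction

# Values quoted in the text at the 1dB compression point
P_IN_DBM, P_OUT_DBM, PAE_AT_P1DB = 5.5, 17.0, 0.033

gain_db = P_OUT_DBM - P_IN_DBM
p_dc_mw = implied_dc_power_mw(P_OUT_DBM, P_IN_DBM, PAE_AT_P1DB)
print(f"power gain at P1dB: {gain_db:.1f} dB")
print(f"implied DC power:   {p_dc_mw:.0f} mW")
```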

Compared to 802.11g, the data rate of 802.11n is increased from 54Mbit/s to 72.2Mbit/s by the use of more subcarriers, short guard interval of 400ns, and 5/6 coding rate [11]. For an input signal with a Peak-to-Average Power Ratio (PAPR) of 9.1dB, the measured average and peak output power were 9.4dBm and 17.4dBm, respectively, with an EVM of 3.8%, meeting the most demanding EVM requirement of 802.11n [11]. At an average and peak output power of 11.3dBm and 18.9dBm, respectively, the measured EVM and PAE are 5.4% and 1%, respectively. Due to the high linearity requirements of OFDM modulation, the PA was biased in class-A, which leads to large bias currents and low PAE [5].

A plot of the measured average output power and EVM is provided in Figure 5.7 for a 72.2Mbit/s, 64-QAM 802.11n OFDM signal under nominal supply voltage of 3.3V at 2.57GHz. The measured peak output spectrum of a WLAN 72.2Mbit/s, 64-QAM 802.11n OFDM signal for a 20MHz channel is plotted in Figure 5.8. The PA showed an average output power of 14dBm with an EVM of 10.5%. As seen in the figure, the PA meets the spectral requirements of the 802.11n draft 2.05 [11].

In Table 5.1 the performance of the implemented PA and some recently presented WLAN PAs is listed. The PA presented in this paper has a lower average output power than recently presented WLAN PAs. However, the PA is capable of running at a lower supply voltage than [5], maintaining a low EVM for an 802.11n OFDM signal with more subcarriers than 802.11g [4], [5], and is implemented in a more advanced technology with increased resistances in the back-end [2].

Figure 5.7: Measured average output power and EVM for a 72.2Mbit/s, 64-QAM 802.11n OFDM signal.

Figure 5.8: Spectral mask and measured peak output spectrum at an average output power of 14dBm with an EVM of 10.5%.

Table 5.1: Performance comparison of WLAN PAs.

<table>
<thead>
<tr>
<th>Reference</th>
<th>Technology</th>
<th>VDD [V]</th>
<th>Data rate [Mbit/s]</th>
<th>Pout [dBm]</th>
<th>EVM [%] @ Pout</th>
</tr>
</thead>
<tbody>
<tr>
<td>[4]</td>
<td>180nm CMOS</td>
<td>3.3</td>
<td>54</td>
<td>17.9</td>
<td>3</td>
</tr>
<tr>
<td>[5]</td>
<td>180nm CMOS</td>
<td>3.5</td>
<td>54</td>
<td>11.6</td>
<td>2.8</td>
</tr>
<tr>
<td>[6]*</td>
<td>90nm CMOS</td>
<td>3.3</td>
<td>54</td>
<td>11.6</td>
<td>3.6</td>
</tr>
<tr>
<td>This work</td>
<td>65nm CMOS</td>
<td>3.3</td>
<td>72.2</td>
<td>9.4</td>
<td>3.8</td>
</tr>
</tbody>
</table>

* without applying the implemented digital pre-distortion algorithm

EM simulations show that VPCM overestimates the inductances by approximately 10% when using the quasi-static EM solver. Since the input and interstage matching networks are sensitive to deviations from the matched values, the power gain of the whole PA drops and the frequency of highest gain is shifted. Additional parasitic simulations show that the power gain could be increased by about 1dB with an improved layout of the ground connections at the output transistors, and by another 2dB by more accurate modelling of the inductors and interconnect parasitics. These improvements would result in a higher P1dB and improved WLAN performance as well as PAE.
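The direction and rough size of the frequency shift can be checked with a simple LC resonance argument. The sketch below assumes that the matching networks resonate at 1/(2π√(LC)), that the capacitances are unchanged, and that a 10% overestimation means the actual inductance equals the modelled value divided by 1.1. These are simplifying assumptions for illustration only, and the function name is hypothetical.

```python
import math

def shifted_resonance(f_design_hz, overestimate):
    """Resonance frequency when the actual L is L_model / (1 + overestimate).

    With C fixed, f is proportional to 1/sqrt(L), so a smaller actual
    inductance moves the resonance upwards by sqrt(1 + overestimate).
    """
    return f_design_hz * math.sqrt(1.0 + overestimate)

F_TARGET = 2.45e9  # design target frequency
f_shifted = shifted_resonance(F_TARGET, 0.10)
print(f"estimated frequency of highest gain: {f_shifted / 1e9:.3f} GHz")
```

Under these assumptions the estimate lands at about 2.57GHz, which agrees with the measured frequency of best power gain.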

5.6 Summary

In this paper we have presented a CMOS PA for WLAN 802.11n, fabricated in 65nm CMOS technology. For a 72.2Mbit/s, 64-QAM 802.11n OFDM signal, the measured average and peak output power were 9.4dBm and 17.4dBm, respectively, with an EVM of 3.8%, meeting the most demanding EVM requirement of 802.11n.

5.7 Acknowledgment

The authors would like to thank the RF design team at Infineon Technologies Nordic AB, Sweden, and Dr. Ronald Thüringer and Dr. Stefan van Waasen at Infineon Technologies AG, Germany, for the technical and financial support of this project. The authors would also like to thank Henrik Karlström at Rohde & Schwarz, Sweden, for providing the measurement equipment.
5.8 The Authors

Jonas Fritzin and Atila Alvandpour are with the Department of Electrical Engineering, Linköping University, Sweden.
E-mail: fritzin@isy.liu.se / atila@isy.liu.se
Ted Johansson was with Infineon Technologies Nordic AB, Sweden. He is now with Huawei Technologies Sweden AB, Sweden.

5.9 References


Chapter 6

Paper 2

A 72.2Mbit/s Transformer-Based Power Amplifier in 65nm CMOS for 2.4GHz 802.11n WLAN

Jonas Fritzin and Atila Alvandpour
Division of Electronic Devices, Department of Electrical Engineering
Linköping University, SE-581 83 Linköping, Sweden
E-mail: {fritzin, atila} @isy.liu.se

Proceedings of 26th IEEE NORCHIP Conference, Tallinn, Estonia,
November 17 – 18, 2008

Abstract
This paper describes the design of a power amplifier (PA) for WLAN 802.11n fabricated in 65nm CMOS technology. The PA utilizes 3.3V thick-gate oxide (5.2nm) transistors and a two-stage differential configuration with two integrated transformers for input and interstage matching. For a 72.2Mbit/s, 64-QAM, 802.11n OFDM signal at an average and peak output power of 11.6dBm and 19.6dBm, respectively, the measured EVM is 3.8%. The PA meets the spectral mask up to an average output power of 17dBm.

6.1 Introduction
The power amplifier (PA) is a key building block in all RF transmitters. To lower the costs and allow full integration of a complete radio system-on-chip, it is highly desirable to integrate the entire transceiver and the PA in a single CMOS chip. However, integration of RF power amplifiers in low-cost CMOS technologies proves to be a challenging task [1].

While digital circuits benefit from the technology scaling, it is becoming significantly harder to meet the stringent requirements on linearity, output power, and power efficiency of PAs at lower supply voltages and in the presence of large on-chip parasitics [2]. This has recently triggered extensive studies to investigate the impact of different circuit techniques, design methodologies, and design trade-offs on functionality of PAs in deep-submicron CMOS technologies [3]. Particularly, the demand for higher data rates in wireless communication has led to an increased interest in both phase and envelope modulations, necessitating a special focus on design issues for linear CMOS PAs.

Previously, several high-performance PAs for WLAN have been fabricated in 180nm [4], [5] and 90nm [6] CMOS technologies. In this paper, we present the design and evaluation of a linear 2.4GHz WLAN PA in 65nm CMOS supporting the draft of the IEEE 802.11n standard. The PA utilizes 3.3V thick-gate oxide CMOS transistors and integrated transformers for input and interstage matching. The output matching network is located off-chip on a PCB. The paper discusses the design and implementation of the PA including the circuit architecture, modeling and design of the transformers, and extraction of transistor/interconnect parasitics, followed by the experimental results.

6.2 Design and Implementation of the Power Amplifier

The PA utilizes 3.3V thick-gate oxide CMOS transistors with a gate length of 0.6µm and integrated transformers (T₁ and T₂) for input and interstage matching (Figure 6.1). The PA is differential and uses two amplifier stages. Transformers have not been commonly used in integrated PAs, but it has been shown that they can provide sufficient performance for impedance matching purposes [7]. The transformers are also used for biasing of the first and second amplifier stages, which is possible due to the galvanic isolation between the primary and secondary sides.

To ensure reliable operation and protect the transistors from hot electrons and breakdown due to high voltage peaks, each amplifying stage uses a pair of transistors in a cascode configuration. To provide highest protection for the transistors, the gates of the cascode transistors are connected to VDD.

The width of the transistors in the first (M₁ and M₂) and second (M₃ and M₄) amplification stages is 0.8mm and 6mm, respectively. The differential structure enables a lower impedance transformation ratio for the off-chip output matching network, which leads to lower output currents and lower losses in the matching network compared to single-ended PAs [3].

A. Transformer Model and Losses

The planar square transformers used in this design are based on a model described in [8]. The model includes coupling to the substrate, inductances, and the coupling between the primary and secondary sides, parasitic capacitances, and the series resistance in the windings, as seen in Figure 6.2.

To reduce the resistive losses at the primary and secondary sides, the two upper layers in the seven-layer metal stack are connected to form one conductor. The thicknesses of the aluminum and copper layers are 1.3µm and 0.6µm, respectively. The winding ratios of the transformers in Figure 6.1 are 2:3 ($T_1$) and 3:2 ($T_2$), with a coupling factor of approximately 0.7 for both transformers. Estimates of the power losses in the transformers can be obtained from the maximum available gain, $G_{ma}$, calculated from S-parameters for any termination impedances [9] according to (6.1) and (6.2).

![Figure 6.2: Transformer model.](image-url)

\[
G_{ma} = \left| \frac{S_{21}}{S_{12}} \right| \left( k_s - \sqrt{k_s^2 - 1} \right) \tag{6.1}
\]

where \( k_s \) is the stability factor defined as:

\[
k_s = \frac{1 - |S_{11}|^2 - |S_{22}|^2 + |S_{11}S_{22} - S_{12}S_{21}|^2}{2|S_{12}||S_{21}|} \tag{6.2}
\]

The maximum available gain is a measure of the gain of the system when the source and load reflection coefficients are conjugately matched to \( S_{11} \) and \( S_{22} \). The simulated maximum available gain, \( G_{ma} \), for the input and interstage transformers is approximately -2.15dB for both transformers at the target operating frequency of 2.45GHz.
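Expressions (6.1) and (6.2) are straightforward to evaluate numerically. The following minimal sketch computes the stability factor and the maximum available gain from a set of two-port S-parameters. The function names are hypothetical, and the example S-parameters describe an ideal matched attenuator chosen as a simple test case, not the transformers of this design.

```python
import math

def stability_factor(s11, s12, s21, s22):
    """Stability factor k_s as defined in (6.2)."""
    delta = s11 * s22 - s12 * s21
    numerator = 1 - abs(s11) ** 2 - abs(s22) ** 2 + abs(delta) ** 2
    return numerator / (2 * abs(s12) * abs(s21))

def max_available_gain(s11, s12, s21, s22):
    """Maximum available gain G_ma as in (6.1); requires k_s >= 1."""
    k_s = stability_factor(s11, s12, s21, s22)
    if k_s < 1:
        raise ValueError("G_ma is undefined for k_s < 1")
    return abs(s21 / s12) * (k_s - math.sqrt(k_s ** 2 - 1))

def to_db(power_ratio):
    """Convert a power ratio to decibels."""
    return 10 * math.log10(power_ratio)

# Test case (assumed): ideal matched attenuator with |S21| = |S12| = 0.5
s11, s12, s21, s22 = 0.0, 0.5, 0.5, 0.0
g_ma = max_available_gain(s11, s12, s21, s22)
print(f"k_s = {stability_factor(s11, s12, s21, s22):.3f}")
print(f"G_ma = {g_ma:.3f} ({to_db(g_ma):.2f} dB)")
```

For this passive, reciprocal, perfectly matched test case, the maximum available gain reduces to |S21|², which provides a quick check that the expressions are implemented consistently.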

B. Parasitics Extraction

Extraction of interconnect parasitics and inherited losses is used to predict the frequency behaviour and gain. Our simulation model of the signal traces includes series inductance and series resistance, which are extracted through electromagnetic (EM) simulations.

To meet the current density limitations, and to reduce the losses in the drain and source connections at the output transistors, several metal layers are stacked on top of each other in the structure, as shown in Figure 6.3. For such a structure, the capacitive coupling between gate, source, and drain is increased. Since the metal layers are not included in the existing transistor model, we need to add the parasitic capacitances, while taking into account the associated dielectric losses [10], into our extended model. The values were extracted through EM simulations and added in the simulation model as series connections of capacitance and resistance between the gate, drain, and source, as seen in Figure 6.4. Additionally, there will be an interconnect resistance between the drain (\( M_{drive} \)) and source (\( M_{casc} \)), but by making the transistors wide with multiple fingers and using several metal layers as in Figure 6.3, the resistive drop across this resistance is reduced to a few mV and therefore this resistance is omitted in Figure 6.4. The large transistors were split into gate widths of 20µm, resulting in 40 and 300 fingers for the transistors in the first and second amplifier stages, respectively.

Figure 6.3: Parasitic capacitances between gate, drain, and source.

The S-parameters of the reciprocal network [10] were converted to Z-parameters [12] in a T-type connection, and by applying expressions (6.3) and (6.4), the parasitic component values can be calculated at the operating frequency. For differential signals, (6.3) was applied to calculate the differential impedance \( Z_{dd} \). For single-ended excitation, (6.4) was applied to calculate the input impedance at port 1, \( Z_{se} \) [11], [12].

\[
Z_{dd} = Z_{11} - Z_{12} - Z_{21} + Z_{22} \tag{6.3}
\]

\[
Z_{se} = Z_{11} - Z_{12}Z_{21}/Z_{22} \tag{6.4}
\]

To accurately model the gate resistance at high frequencies, we added a vertical gate resistance [13], which due to dopant segregation during silicidation can be relatively high, and thus will influence the gain of the PA and \( f_{\text{max}} \) [14]. In Figure 6.4, this resistance is denoted as \( R_{vgr} \), inserted in series with the lateral gate resistance, \( R_{\text{lateral}} \), calculated from the gate sheet resistivity and layout geometry. The sum of both resistance values is approximately 1.6Ω for the first amplifier stage and 250mΩ for the second. The vertical gate resistance represents ~20% of the total gate resistance in our design. The estimated impact of these resistances is a gain reduction of 1.6dB. For

![Figure 6.4: Model of cascode stage with parasitics.](image-url)
6.3 Experimental Results

Figure 6.5 shows the photograph of the PA. The size of the chip is $1\times1$ mm$^2$. The chip was directly bonded on the testboard, which was a two-layer 0.5mm thick FR4 PCB with $\varepsilon_r$ of 4.2 and $\tan\delta$ of 0.035. The input power to the testboard was applied differentially with an external 50-to-100 Ohm balun connected to the signal source. The PA utilizes an off-chip lumped element balun [15] (Figure 6.6) for differential-to-single-ended conversion and load impedance transformation. A pre-matching network ($L_P$ and $C_P$) was used for load impedance tuning, and for compensation of the bond-wire inductance and interconnection lines from the PA to the balun (part of $L_P$).

A. Measurement Results

The target frequency of the PA was set to 2.45GHz in the design work. After tuning of the output matching network, the best performance in terms of power gain was found at 2.48GHz. For a frequency offset of $\pm30$MHz around 2.48GHz, the drop in gain was measured to be 0.4dB (2.45GHz) and 0.2dB (2.51GHz), respectively.
The input and output 1dB compression points (P1dB) at 2.48GHz were found to be 2.6dBm and 19.6dBm, with a power-added efficiency (PAE) of 5.8%. Plots of the measured average output power, EVM, and gain are provided in Figure 6.7 and Figure 6.8 for a 72.2Mbit/s, 64-QAM, 802.11n OFDM signal for a supply voltage of 3.3V. Compared to 802.11g, the data rate of 802.11n is increased from 54Mbit/s to 72.2Mbit/s by the use of more subcarriers, short guard interval, and 5/6 coding rate [16].

For an input signal with a Peak-to-Average Power Ratio (PAPR) of 9.1dB, the PA gave an EVM of 3.8% at an average output power of 11.6dBm, and 5.4% at 13.4dBm average output power with a PAE of 1.4%. Due to the high linearity requirements of OFDM modulation, the PA was biased in class-A, which leads to large bias currents and low PAE [5].

![Figure 6.6: Simplified schematic of output matching network.](image)

Figure 6.6: Simplified schematic of output matching network.

![Figure 6.7: RF performance (Pin, Pout, Gain).](image)

Figure 6.7: RF performance (Pin, Pout, Gain).

The measured output spectrum of a WLAN 72.2Mbit/s, 64-QAM, 802.11n OFDM signal for a 20MHz channel is plotted in Figure 6.9. As seen in the figure, the PA meets the spectral requirements of the 802.11n draft 2.05 [16]. The PA showed an average output power of 17dBm with an EVM of 13.1%. In Table 6.1 the performance of the implemented PA and some recently presented WLAN PAs is listed.

![Figure 6.8: RF performance (Pin av, Pout av, EVM).](image)

Figure 6.8: RF performance (Pin av, Pout av, EVM).

![Figure 6.9: Spectral mask and measured peak spectrum at an average output power of 17dBm with an EVM of 13.1%.](image)

Figure 6.9: Spectral mask and measured peak spectrum at an average output power of 17dBm with an EVM of 13.1%.

The transformer-based PA in this work has an average output power similar to that of recently presented WLAN PAs. However, our PA is capable of running at a lower supply voltage than [5], maintaining a low EVM for an 802.11n OFDM signal with more subcarriers than 802.11g [4], [5], and is implemented in a more advanced technology with increased resistances in the back-end [2].

<table>
<thead>
<tr>
<th>Reference</th>
<th>Technology</th>
<th>VDD [V]</th>
<th>Data rate [Mbit/s]</th>
<th>Pout [dBm]</th>
<th>EVM [%] @ Pout</th>
</tr>
</thead>
<tbody>
<tr>
<td>[4]</td>
<td>180nm CMOS</td>
<td>3.3</td>
<td>54</td>
<td>17.9</td>
<td>3</td>
</tr>
<tr>
<td>[5]</td>
<td>180nm CMOS</td>
<td>3.5</td>
<td>54</td>
<td>11.6</td>
<td>2.8</td>
</tr>
<tr>
<td>[6]*</td>
<td>90nm CMOS</td>
<td>3.3</td>
<td>54</td>
<td>11.6</td>
<td>3.6</td>
</tr>
<tr>
<td>This work</td>
<td>65nm CMOS</td>
<td>3.3</td>
<td>72.2</td>
<td>11.6</td>
<td>3.8</td>
</tr>
</tbody>
</table>

* without applying the implemented digital pre-distortion algorithm

Table 6.1: Performance comparison of WLAN PAs.

6.4 Summary

The paper has presented a transformer-based CMOS PA for WLAN 802.11n, fabricated in 65nm CMOS. The PA meets the EVM and spectral requirements for a 72.2Mbit/s, 64-QAM, 802.11n OFDM signal, at an average output power of 11.6dBm, with an EVM of 3.8%.

6.5 Acknowledgement

The authors would like to thank Dr. Ted Johansson and the RF design team at Infineon Technologies Nordic AB, Sweden, for valuable technical discussions. The authors would also like to thank Dr. Ronald Thüringer and Dr. Stefan van Waasen at Infineon Technologies AG, Germany, for the technical and financial support of this project, and Henrik Karlström at Rohde & Schwarz, Sweden, for providing the measurement equipment.
6.6 References


Chapter 7

Paper 3

Impedance Matching Techniques in 65nm CMOS Power Amplifiers for 2.4GHz 802.11n

Jonas Fritzin\(^{(1)}\), Ted Johansson\(^{(2)}\), and Atila Alvandpour\(^{(1)}\)

(1) Electronic Devices, Department of Electrical Engineering
Linköping University, SE-581 83 Linköping, Sweden
{fritzin, atila}@isy.liu.se

(2) Infineon Technologies Nordic AB
Isafjordsgatan 16, SE-164 81 Kista, Sweden
ted.johansson@ieee.org

Proceedings of the 38\(^{th}\) IEEE European Microwave Conference (EuMC),

Abstract

This paper describes the design of two power amplifiers (PA) for WLAN 802.11n fabricated in 65nm CMOS technology. Both PAs utilize 3.3V thick-gate oxide (5.2nm) transistors and employ a two-stage differential structure, but the input and interstage matching networks are realized differently. The first PA uses LC matching networks for matching, while the second PA uses on-chip transformers. The impedance matching techniques applied for the matching networks will be described. EVM, output power levels, and spectral masks are obtained for a 72.2Mbit/s, 64-QAM, 802.11n, OFDM signal.

7.1 Introduction

The power amplifier (PA) is a key building block in all RF transmitters. To lower the costs and allow full integration of a complete radio system-on-chip, it is highly desirable to integrate the entire transceiver and the PA in a single CMOS chip. However, integration of RF power amplifiers in low-cost CMOS technologies proves to be a challenging task [1].

While digital circuits benefit from the technology scaling, it is becoming significantly harder to meet the stringent requirements on linearity, output power, and power efficiency of PAs at lower supply voltages and in the presence of large on-chip parasitics [2]. This has recently triggered extensive studies to investigate the impact of different circuit techniques, design methodologies, and design trade-offs on functionality of PAs in deep-submicron CMOS technologies [3]. Particularly, the demand for higher data rates in wireless communication has led to an increased interest in both phase and envelope modulations, necessitating a special focus on design issues for linear CMOS power amplifiers.

Previously, several high-performance PAs for WLAN have been fabricated in 180nm [4], [5] and 90nm [6] CMOS technologies. In this paper, we present the design and evaluation of two linear 2.4GHz WLAN PAs in 65nm CMOS supporting the draft of the IEEE 802.11n standard. The first PA uses LC matching networks for input and interstage matching, while the second PA uses on-chip transformers. The paper describes the impedance matching techniques applied and the sources of losses in the matching networks, followed by the experimental results of both designs.

7.2 Impedance Matching

The PAs in this work are realized as differential two-stage amplifiers with on-chip input (INP) and interstage (IM) matching, as shown in Figure 7.1. After the second amplifier stage (PA₂), an off-chip output matching network (OUT) is used for differential-to-single-ended impedance conversion.

To ensure that the signal power is amplified and eventually delivered to the antenna, it is important to minimize the voltage reflections due to impedance mismatches, which cause power losses.

The following section describes the design of the PAs and the impedance transformation techniques used for the input and interstage matching networks in the PAs. The output matching network will be discussed in the context of experimental results.
7.3 Design and Implementation of the Power Amplifiers

The PAs utilize 3.3V thick-gate oxide CMOS transistors with a gate length of 0.6µm, LC matching networks (Figure 7.2) and integrated transformers (Figure 7.3) for input and interstage matching. Both PAs are differential and use two amplifier stages.

To ensure reliable operation and protect the transistors from hot electrons and breakdown due to high voltage peaks, each amplifying stage uses a pair of transistors in a cascode configuration. To provide highest protection for the transistors, the gates of the cascode transistors are connected to VDD.

The width of the transistors in the second amplification stage (M3 and M4) is 6mm in both designs, while the transistors in the first amplification stage (M1 and M2) have a width of 0.4mm in the PA with LC matching networks and 0.8mm in the PA with integrated transformers. The differential structure enables a lower impedance transformation ratio for the off-chip output matching network, which leads to lower output currents and lower losses in the matching network compared to single-ended PAs [3].

A. PA with LC matching networks

In Figure 7.2 the differential inductor (L1) forms the input matching network together with the capacitors (C1) for the first amplifier stage. The capacitors (C1) are used to block the DC level of the input signal and for tuning to achieve resonance in the matching network at the operation frequency. C1 forms a voltage divider with the gate capacitance and layout parasitics, and in order to minimize the losses large capacitors \( (C_1) \) may be needed. If the impedance of the matching network matches the impedance of the signal generator at resonance, the power transferred to the amplifier stage is maximized.

![Figure 7.2: Simplified schematic of the LC-based PA.](image-url)

In the interstage matching networks, the capacitors \( (C_2) \) form a voltage divider with the large gate capacitance and parasitic capacitances of the second amplifier stage. As for \( C_1 \), the losses are minimized for large values of \( C_2 \). However, the large gate capacitance and parasitic capacitances would then require unreasonably small inductance values. Therefore it is advantageous to choose a smaller \( C_2 \) and a larger \( L_2 \), and permit some losses over \( C_2 \), to achieve a high input impedance of the second amplifier stage and thereby increase the voltage gain.
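The trade-off between divider loss and inductor size can be illustrated numerically. The sketch below is not taken from the design: it assumes an ideal capacitive divider between a series coupling capacitor and a single lumped gate capacitance, and an ideal LC resonance. The function names and the capacitance values are hypothetical and chosen only for illustration.

```python
import math

def divider_ratio(c_series: float, c_gate: float) -> float:
    """Fraction of the signal voltage that reaches the gate through an ideal capacitive divider."""
    return c_series / (c_series + c_gate)

def resonant_capacitance(inductance: float, freq_hz: float) -> float:
    """Capacitance that resonates with the given inductance at freq_hz (ideal LC tank)."""
    omega = 2 * math.pi * freq_hz
    return 1.0 / (omega ** 2 * inductance)

# Hypothetical gate capacitance and coupling capacitor values, for illustration only.
c_gate = 2e-12
for c_series in (1e-12, 2e-12, 8e-12):
    ratio = divider_ratio(c_series, c_gate)
    print(f"C2 = {c_series * 1e12:.0f} pF -> gate voltage fraction {ratio:.2f}")

# Total capacitance that an ideal 1.8 nH inductor would resonate with at 2.45 GHz.
print(f"{resonant_capacitance(1.8e-9, 2.45e9) * 1e12:.2f} pF")
```

A larger coupling capacitor passes more of the signal voltage to the gate, but it also raises the capacitance the inductor must resonate with, which pushes the required inductance down.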

In the inductors, the two upper layers in the seven-metal stack are connected
to form one conductor to reduce the resistive losses. At 2.45GHz the Q-values of
the differential input and interstage inductors are approximately 8.5 \( (L_1) \) and 6.7
\( (L_2) \), with inductor values of 4.1nH \( (L_1) \) and 1.8nH \( (L_2) \).

B. PA with transformers in matching networks

Figure 7.3 shows the input section of the first amplifier stage of the transformer-
based PA. One important feature is the galvanic isolation between primary and
secondary side of the transformers \( (T_1 \text{ and } T_2) \), which eliminates the need for a
DC blocking capacitor. The galvanic isolation also enables independent biasing
of the primary and secondary sides through the center taps of the transformers.

Due to the relationships between voltages and currents [7], [8] in the primary
and secondary windings, a transformer (Figure 7.4) can transform the input
impedance to a desirable value according to the turns ratio, $n$, defined in Eq. (7.1).

$$n = \sqrt{\frac{L_S}{L_P}} = \frac{V_2}{V_1} = \frac{I_1}{I_2} = \sqrt{\frac{Z_S}{Z_P}}$$ (7.1)
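As a rough illustration of the voltage, current, and impedance relations in Eq. (7.1), the sketch below applies them to an ideal, lossless transformer with perfect coupling. The function name and the 100 Ohm primary impedance are assumptions made for the example; a real on-chip transformer with a coupling factor below one will deviate from these values.

```python
def ideal_transformer(v_primary: float, i_primary: float, n: float):
    """Secondary voltage, current, and impedance ratio of an ideal transformer with turns ratio n."""
    v_secondary = n * v_primary          # V2 = n * V1
    i_secondary = i_primary / n          # I2 = I1 / n
    impedance_ratio = n ** 2             # Z_S / Z_P = n^2
    return v_secondary, i_secondary, impedance_ratio

# A 2:3 winding ratio (primary:secondary) gives n = 1.5.
n = 3 / 2
z_primary = 100.0                        # hypothetical primary-side impedance in ohms
_, _, ratio = ideal_transformer(1.0, 0.01, n)
print(z_primary * ratio)                 # 225.0
```

Power is conserved in this ideal model, since the voltage is scaled up by the same factor by which the current is scaled down.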

To reduce the losses between the primary and secondary sides, resonant tuning has to be performed on both sides [8]. Creating cross-coupled dual resonance circuits [9] and achieving desired impedance levels makes it challenging to find an optimum transformer design.

As for LC matching networks, it is equally important to consider losses in the transformers. The power losses in a transformer can be estimated from the maximum available gain, which is calculated from the S-parameters [10]. The maximum available gain, $G_{ma}$, is a measure of the gain of the system when the source and load reflection coefficients are conjugately matched to $S_{11}$ and $S_{22}$.

$$G_{ma} = \frac{|S_{21}|}{|S_{12}|} \left( k_s - \sqrt{k_s^2 - 1} \right)$$ (7.2)

where $k_s$ is the stability factor defined as:

$$k_s = \frac{1-|S_{11}|^2 - |S_{22}|^2 + |S_{11}S_{22} - S_{12}S_{21}|^2}{2|S_{12}||S_{21}|}$$ (7.3)
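Eq. (7.2) and (7.3) translate directly into a few lines of code. The following is a minimal sketch, assuming the two-port S-parameters are available as (possibly complex) numbers at a single frequency; the function names are illustrative, and the example two-port is a hypothetical matched 6 dB attenuator rather than one of the transformers in this work.

```python
import math

def stability_factor(s11, s12, s21, s22) -> float:
    """Stability factor k_s of a two-port, Eq. (7.3)."""
    delta = s11 * s22 - s12 * s21
    numerator = 1 - abs(s11) ** 2 - abs(s22) ** 2 + abs(delta) ** 2
    return numerator / (2 * abs(s12) * abs(s21))

def max_available_gain(s11, s12, s21, s22) -> float:
    """Maximum available gain G_ma as a linear power ratio, Eq. (7.2)."""
    k = stability_factor(s11, s12, s21, s22)
    if k < 1:
        raise ValueError("G_ma is only defined for k_s >= 1")
    return abs(s21) / abs(s12) * (k - math.sqrt(k * k - 1))

def to_db(power_ratio: float) -> float:
    """Convert a linear power ratio to decibels."""
    return 10 * math.log10(power_ratio)

# Hypothetical matched attenuator: S11 = S22 = 0 and S21 = S12 = 0.5.
g = max_available_gain(0.0, 0.5, 0.5, 0.0)
print(g, to_db(g))
```

For a perfectly matched reciprocal two-port the result reduces to $|S_{21}|^2$, which gives a quick sanity check of the implementation.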

To reduce the resistive losses and improve the coupling between the primary and secondary sides, the two top layers in the seven-metal stack were connected to form one conductor. The thicknesses of the top aluminum and second top copper layers are 1.3µm and 0.6µm, respectively. The winding ratios of the transformers in Figure 7.3 are 2:3 ($T_1$) and 3:2 ($T_2$), with a coupling factor of approximately 0.7 for both transformers. The planar square transformers used in this design are based on a model described in [11]. Based on Eq. (7.2) and (7.3), the simulated maximum available gain, $G_{ma}$, is approximately -2.15dB for both the input and the interstage transformer.

![Figure 7.4: Ideal transformer model.](image-url)

Since the transformer can provide balun functionality, single-ended excitation of the PA is possible by shorting one of the differential input signals to ground. However, single-ended excitation would also narrow the available bandwidth due to different interaction with the substrate parasitics than for differential signals [8]. The transformer can also work as an ESD protecting device for stand-alone PAs. Recent LNA implementations show that protection up to 5kV is achievable [12].

### 7.4 Experimental Results

Figure 7.5 shows the photographs of the LC-based and transformer-based PAs. The size of both chips is 1x1 mm$^2$. The chips were directly bonded on the testboard, which is a two-layer 0.5mm thick FR4 PCB with $\varepsilon_r$ of 4.2 and tan δ of 0.035.

The input power to the testboard is applied differentially with an external 50-to-100 Ohm balun connected to the signal source. The PA utilizes an off-chip lumped element balun [13] (Figure 7.6) for differential-to-single-ended conversion and load impedance transformation. A pre-matching network ($L_p$ and $C_p$) was used for load impedance tuning, and compensation of the bond wire inductance and interconnection lines from the PA to the balun (part of $L_p$). For a differential-to-single-ended impedance conversion, from $R_2$ to $R_L$, the following equations apply at the operating frequency:

$$\sqrt{R_2 R_L} = \omega L_{B1} = \omega L_{B2} = \frac{1}{\omega C_{B1}} = \frac{1}{\omega C_{B2}}$$ (7.4)
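Eq. (7.4) fixes the balun inductors and capacitors once the two resistance levels and the operating frequency are chosen. The sketch below solves it for ideal, lossless components; the function name is illustrative and the 20 Ohm differential resistance is a hypothetical value, since the actual $R_2$ of the design is not stated here.

```python
import math

def lumped_balun_values(r_diff: float, r_load: float, freq_hz: float):
    """Return (L, C) of an ideal lumped-element balun that satisfies Eq. (7.4)."""
    reactance = math.sqrt(r_diff * r_load)   # sqrt(R2 * RL)
    omega = 2 * math.pi * freq_hz
    inductance = reactance / omega           # omega * L = sqrt(R2 * RL)
    capacitance = 1.0 / (omega * reactance)  # 1 / (omega * C) = sqrt(R2 * RL)
    return inductance, capacitance

# Hypothetical example: R2 = 20 ohm, RL = 50 ohm, 2.45 GHz.
l_b, c_b = lumped_balun_values(20.0, 50.0, 2.45e9)
print(f"L_B = {l_b * 1e9:.2f} nH, C_B = {c_b * 1e12:.2f} pF")
```

In practice the bond wires and interconnect contribute inductance as well, which is why the text above introduces a separate pre-matching network.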

![Figure 7.5: Chip photos of the LC-based (a) and transformer-based (b) PAs.](image)
A. Measurement Results

The input and output 1dB compression points (P1dB) at 2.57GHz were found to be 5.5dBm and 17dBm for the LC-based PA, with a power-added efficiency (PAE) of 3.3% and small-signal power gain of 12.5dB. For the transformer-based PA the corresponding compression points at 2.48GHz were found to be 2.6dBm and 19.6dBm, with a PAE of 5.8% and small-signal power gain of 18dB. Plots of the measured average output power and EVM are provided in Figure 7.7 and Figure 7.8 for a 72.2Mbit/s, 64-QAM, 802.11n OFDM signal under nominal supply voltage of 3.3V.

![Output matching network diagram](image)

Figure 7.6: Output matching network.

![RF performance graph](image)

Figure 7.7: RF performance (Pin, Pout, EVM) for the LC-based PA.

For an input signal with Peak-to-Average Power Ratio (PAPR) of 9.1dB, the LC-based PA gave an EVM of 3.8% at an average output power of 9.4dBm, and 5.4% at 11.3dBm output power with a PAE of 1%. The transformer-based PA gave an EVM of 3.8% at an average output power of 11.6dBm, and 5.4% at 13.4dBm output power with a PAE of 1.4%. Due to the high linearity requirements of OFDM modulation, the PAs were biased in class-A which leads to large bias currents and low PAE [5].
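One way to read these numbers is as output back-off from the 1dB compression point. The short sketch below computes that back-off from the values reported in this section. The helper name is illustrative, and the comparison is only approximate because the P1dB values were measured at 2.57GHz and 2.48GHz with a single-tone input rather than with the modulated signal.

```python
def output_backoff_db(p1db_out_dbm: float, p_avg_dbm: float) -> float:
    """Back-off of the average output power from the output 1dB compression point."""
    return p1db_out_dbm - p_avg_dbm

# (output P1dB, average output power at 3.8% EVM), both in dBm, from the measurements above.
measurements = {
    "LC-based": (17.0, 9.4),
    "transformer-based": (19.6, 11.6),
}
papr_db = 9.1

for name, (p1db, p_avg) in measurements.items():
    backoff = output_backoff_db(p1db, p_avg)
    print(f"{name}: back-off {backoff:.1f} dB, PAPR minus back-off {papr_db - backoff:.1f} dB")
```

Both PAs operate at a back-off of roughly 8dB at the 3.8% EVM point, slightly less than the 9.1dB PAPR of the signal.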

The measured output spectrum of a WLAN 72.2Mbit/s, 64-QAM, 802.11n OFDM signal for a 20MHz channel is plotted in Figure 7.9 for the PAs. As seen in Figure 7.9, the PAs meet the spectral requirements of the 802.11n draft 2.05 [14]. The LC-based PA showed an average output power of 14dBm with an EVM of 10.5%. The transformer-based PA showed an average output power of 17dBm with an EVM of 13.1%.

The measurements showed a higher gain and a higher P1dB for the transformer-based PA than for the LC-based PA. Therefore the transformer-based PA can achieve a higher average output power level with a low EVM.

In Table 7.1 the performance of the implemented PAs and some recently presented WLAN PAs is listed. Comparing the performance of the implemented PAs, we can conclude that the LC-based PA has lower average output power, and the transformer-based PA has similar average output power compared to recently presented WLAN PAs. However, the implemented PAs are capable of running at a lower supply voltage than [5], maintaining a low EVM for an 802.11n OFDM signal with more subcarriers than 802.11g [4], [5], and are implemented in a more advanced technology with increased resistances in the back-end [2].

<table>
<thead>
<tr>
<th>Reference</th>
<th>Technology</th>
<th>VDD [V]</th>
<th>Data rate [Mbit/s]</th>
<th>Pout [dBm]</th>
<th>EVM [%] @ Pout</th>
</tr>
</thead>
<tbody>
<tr>
<td>[4]</td>
<td>180nm CMOS</td>
<td>3.3</td>
<td>54</td>
<td>17.9</td>
<td>3</td>
</tr>
<tr>
<td>[5]</td>
<td>180nm CMOS</td>
<td>3.5</td>
<td>54</td>
<td>11.6</td>
<td>2.8</td>
</tr>
<tr>
<td>[6]*</td>
<td>90nm CMOS</td>
<td>3.3</td>
<td>54</td>
<td>11.6</td>
<td>3.6</td>
</tr>
<tr>
<td>Transformer PA</td>
<td>65nm CMOS</td>
<td>3.3</td>
<td>72.2</td>
<td>11.6</td>
<td>3.8</td>
</tr>
<tr>
<td>LC PA</td>
<td>65nm CMOS</td>
<td>3.3</td>
<td>72.2</td>
<td>9.4</td>
<td>3.8</td>
</tr>
</tbody>
</table>

* without applying the implemented digital pre-distortion algorithm

**Table 7.1: Performance comparison of WLAN PAs.**
7.5 Summary

The paper has presented two CMOS PAs for WLAN 802.11n, fabricated in 65nm CMOS. Both PAs meet the EVM and spectral requirements for a 72.2Mbit/s, 64-QAM, 802.11n OFDM signal, at average output powers of 9.4dBm and 11.6dBm, with an EVM of 3.8%. Impedance matching techniques for input, interstage, and output matching of the PAs have been described.

7.6 Acknowledgments

The authors would like to thank the RF design team at Infineon Technologies Nordic AB, Sweden, and Dr. Ronald Thüringer and Dr. Stefan van Waasen at Infineon Technologies AG, Germany, for the technical and financial support of this project. The authors would also like to thank Henrik Karlström at Rohde & Schwarz, Sweden, for providing the measurement equipment.

7.7 References


Low Voltage Class-E Power Amplifiers for DECT and Bluetooth in 130nm CMOS

Jonas Fritzin and Atila Alvandpour
Division of Electronic Devices, Department of Electrical Engineering
Linköping University, SE-581 83 Linköping, Sweden
E-mail: {fritzin, atila} @isy.liu.se

Proceedings of 9th IEEE Topical Meeting on Silicon Monolithic Integrated Circuits in RF Systems (SiRF), San Diego, CA, USA, January 19 – 21, 2009

Abstract
This paper presents the design of two low-voltage differential class-E power amplifiers (PA) for DECT and Bluetooth fabricated in 130nm CMOS. In order to minimize the on-chip losses and to achieve a high efficiency at low supply voltages, the PAs do not use on-chip output matching networks. At 1.5V supply voltage, the DECT PA delivers +26.4dBm of output power with a drain efficiency (DE) and power-added efficiency (PAE) of 41% and 30%, respectively. The Bluetooth PA delivers +22.7dBm at 1V with a DE and PAE of 48% and 36%, respectively. A continuous long-term test of 100 hours proves the reliability of the design.

8.1 Introduction
The power amplifier (PA) is a key building block in all RF transmitters. Today, most radio frequency building blocks have been successfully integrated into CMOS processes, while the power amplifier is usually designed in a different technology. To lower the costs, by reducing board space and the number of components, it is highly desirable to integrate the entire transceiver and the PA in a single CMOS chip operated with a single low ‘digital’ supply voltage. Therefore, there is a need for highly efficient CMOS power amplifiers using a low supply voltage to achieve the goal of single-chip radio systems.

In this paper, we present two class-E CMOS PAs operating at a low ‘digital’ supply voltage. To achieve high efficiency and minimize on-chip losses, all output matching network components are put off-chip. Hence, low-Q on-chip inductors are avoided. Additionally, the integrated inductors commonly used for matching between the different amplifier stages are removed, which results in a significantly reduced area required for the PA.

A major obstacle in the design of class-E CMOS PAs is the high peak drain voltage generated, and the low breakdown voltage of the MOS device, making it challenging to design a reliable class-E CMOS PA. To evaluate the reliability of the design, a continuous long-term test of 100 hours has been performed. The paper discusses the design and implementation of the DECT and Bluetooth (BT) PAs including the circuit architecture, the experimental results, and eventually a comparison with other published low-voltage CMOS PA designs. The comparison shows a clear power-efficiency trade-off between the utilization of on-chip and off-chip output matching networks.

### 8.2 Design and Implementation of the Power Amplifiers

Both PAs utilize a differential structure (Figure 8.1 shows a single-ended section) with buffers, and drivers based on 1.5V thin gate oxide transistors, with a physical gate oxide thickness ($t_{ox}$) of 2.2nm, and gate length of 0.12µm, which are also used in the output stage ($T_1$) of the BT PA. The DECT PA utilizes 3.3V thick gate oxide ($t_{ox}=5.2$nm) transistors ($T_1$) with a gate length of 0.4µm.

Due to the relationship between DE, switch on-resistance ($r_{on}$), and load resistance ($R_L$) in (8.1), it is important to reduce the on-resistance for high output power and high power efficiency [1]. Moreover, in order to achieve the same output power for a reduced power supply voltage [2], the load resistance needs to scale quadratically as in (8.2). Since the on-resistance does not reduce as fast as $R_L$ when technology scales, a low-voltage high-efficiency PA requires wider transistors in deep-submicron CMOS technologies [1].

\[
DE \propto \frac{1}{1 + 1.4 \frac{r_{on}}{R_L}} \tag{8.1}
\]

\[
P_{out} = 0.577 \frac{V_{DD}^2}{R_L} \tag{8.2}
\]
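To make the scaling argument concrete, the sketch below evaluates (8.2) for the load resistance and the proportionality factor in (8.1). It assumes an ideal single-ended class-E stage without any matching network losses; the function names, the 0.4W target, and the on-resistance value are hypothetical and used only for illustration.

```python
def dbm_to_watt(p_dbm: float) -> float:
    """Convert a power level in dBm to watts."""
    return 10 ** (p_dbm / 10) / 1000

def class_e_load_resistance(vdd: float, p_out_w: float) -> float:
    """Load resistance required by (8.2) for a given supply voltage and output power."""
    return 0.577 * vdd ** 2 / p_out_w

def efficiency_factor(r_on: float, r_load: float) -> float:
    """Proportionality factor of the drain efficiency in (8.1)."""
    return 1.0 / (1.0 + 1.4 * r_on / r_load)

# Halving the supply voltage at constant output power divides the load resistance by four.
p_out = 0.4  # hypothetical target output power in watts
for vdd in (3.0, 1.5):
    r_load = class_e_load_resistance(vdd, p_out)
    # Hypothetical, fixed switch on-resistance of 0.3 ohm.
    print(f"VDD = {vdd} V: R_L = {r_load:.2f} ohm, DE factor = {efficiency_factor(0.3, r_load):.2f}")
```

With a fixed on-resistance the efficiency factor drops as the load resistance shrinks, which is the reason wider switch transistors are needed at low supply voltages.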

To benefit from the low on-resistance of the wider transistors, the parasitic inductance and resistive losses in the ground plane must be minimized. This is done by maximizing the number of metal layers used as ground, especially around the output stage transistors. However, as the size of the output stage transistor becomes larger, the capacitive loading of the buffer increases, and therefore a buffer with high driving capability is required.

The signal driving the PA is buffered with a buffer chain consisting of regular inverters optimized to achieve a high overall efficiency. This is a trade-off between providing a good edge rate on the gate of the following inverter, which minimizes the short-circuit current, and not consuming too much power. Figure 8.1 shows the tapered buffer with a load capacitance \( C_L \) representing the gate capacitance of the output stage transistor and drain capacitance of the previous inverter stage. For a four-stage buffer with a tapering factor of three, the last driver theoretically accounts for two thirds of the total power consumption of the driver stages [3]. In order to achieve a good edge rate at the output, the gate resistance is minimized by using a small finger gate width of 10\( \mu \)m for the BT PA. To achieve the required output power for DECT a wider transistor was needed, but to get reasonable layout proportions of the output stage, a finger width of 30\( \mu \)m was used in the output stage.
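The two-thirds figure for the last driver can be checked with a simple model. The sketch below assumes that the power of each stage is proportional to its size, so that the stage powers form a geometric series in the tapering factor; this proportionality and the function name are assumptions for illustration, not a statement of how [3] derives the result.

```python
def last_stage_power_share(stages: int, taper: float) -> float:
    """Share of the total buffer power consumed by the last stage, assuming power scales with stage size."""
    sizes = [taper ** i for i in range(stages)]  # relative stage sizes 1, f, f^2, ...
    return sizes[-1] / sum(sizes)

# Four stages with a tapering factor of three: 27 / (1 + 3 + 9 + 27).
print(last_stage_power_share(4, 3))
```

Under this assumption the last stage takes 27/40 of the total, which is close to the two thirds quoted above.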

As the transistors become large, the drain capacitance increases and can be incorporated in the required capacitance \( C_1 \) for class-E operation. However, it is important to minimize the voltage across the capacitance as the transistor is turned on, in order to minimize the energy losses.

In the implemented DECT and BT PAs the output power level can be adjusted by controlling the supply voltages of the output stage and driver stages of the PAs. Therefore, a voltage modulator will be needed for power control.

In Figure 8.1, the NMOS transistor widths (in \( \mu \)m) of the three inverter driver stages and the output stage (\( T_1 \) in Figure 8.1) of the DECT PA are 100, 400, 1200, and 8400. The NMOS transistor widths of the four inverter driver stages and the output stage of the BT PA are 100, 200, 600, 1300, and 4000. The PMOS transistors were a factor of 2.2 larger.
8.3 Experimental Results

Figure 8.2 shows the photographs of the fabricated PAs, with the output stage at the top in the photos. The chips measure 0.7x1.2 mm\(^2\) and were directly bonded on the PCB (FR4, \(\varepsilon_r = 4.2\), \(\tan \delta = 0.035\)). The input power was applied differentially with an external balun connected to the signal source. The PA utilizes an off-chip lumped element balun [4] for differential-to-single-ended conversion and load impedance transformation.

A. Measurement Results – Output Power

Figure 8.3 shows how the output power of the DECT PA varies as the supply voltages (VDD\(_1\) and VDD\(_2\)) are swept from 0.8V to 1.5V. The highest output power is +26.4dBm at 1.5V with DE and PAE of 41% and 30%, respectively. As seen in Figure 8.4, the performance of the PA varies over frequency, with an optimum performance at 1.85GHz when the supply voltage is 1.5V.

![Figure 8.3: DECT PA: Pout, DE, and PAE: VDD\(_1\) = VDD\(_2\) at 1.85GHz.](image)

Figure 8.5 shows how the output power of the BT PA varies as the supply voltage ($V_{DD1}$) is swept from 0.1V to 1.1V, with a maximum output power of +23.5dBm at 1.1V. At 1V the output power is +22.7dBm with DE and PAE of 48% and 36%, respectively. The buffers and the drivers use a 1V supply voltage. As seen in Figure 8.6 the performance of the PA varies over frequency, but has an optimum performance at 2.45GHz for a 0.75V supply voltage. The estimated power needed for the buffers to excite the drivers of the output stage is ~4dBm (~2.5mW) for both PAs, based on simulations and measurements, and is used in PAE calculations. This results in approximate maximum gains of 22.5dB and 19dB of the DECT and BT PAs.

Figure 8.4: DECT PA: Pout, DE, and PAE: $V_{DD1} = V_{DD2} = 1.5V$.

Figure 8.5: BT PA: Pout, DE, and PAE at 2.45GHz.
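The conversion between dBm and milliwatts and the resulting gain figures can be reproduced with a few lines. This is only a sketch of the arithmetic, with illustrative helper names, using the estimated 4dBm input power and the maximum output powers reported above.

```python
def dbm_to_mw(p_dbm: float) -> float:
    """Convert a power level in dBm to milliwatts."""
    return 10 ** (p_dbm / 10)

def gain_db(p_out_dbm: float, p_in_dbm: float) -> float:
    """Power gain in dB from output and input power levels in dBm."""
    return p_out_dbm - p_in_dbm

p_in_dbm = 4.0  # estimated drive power, from the text above
print(f"Input power: {dbm_to_mw(p_in_dbm):.2f} mW")
print(f"DECT PA gain at +26.4 dBm: {gain_db(26.4, p_in_dbm):.1f} dB")
print(f"BT PA gain at +23.5 dBm: {gain_db(23.5, p_in_dbm):.1f} dB")
```

The results agree with the approximate gains quoted in the text to within about half a decibel.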

B. Measurement Results – Spectral Requirements

Figure 8.7 shows the measured output spectra of the GFSK-modulated signals for the DECT and BT PAs. Based on the measured output spectrum and ACP calculations, the DECT PA meets the ACP requirements for channel offsets of ±1, ±2, ±3, and any other channel (±4). The Bluetooth PA meets the spectral mask requirement of -20dBc at an offset of ±500kHz, and the ACP requirements of -20dBm and -40dBm for channel offsets of ±2 and ±3, respectively.

C. Measurement Results – Reliability

Since the drain voltage of a class-E PA ideally can reach levels up to 3.56xVDD [2], a low supply voltage is needed to minimize the stress on the output stage transistors and to not exceed the critical gate oxide field of ~1V/nm [2] for DC conditions. The damages in PAs are mainly due to channel hot carrier (HC) stress [6] or Fowler-Nordheim (F-N) gate oxide wearout [7]. Typically in a class-E PA, the drain voltage is high when the drain current is zero, and therefore the HC stress is minimized and the transistor wearout will be dominated by the F-N gate oxide wearout.

![Figure 8.6: BT PA: Pout, DE, and PAE: VDD1 = 0.75, VDD2 = VDD3 = 1V.](image-url)

Simulations of both PAs indicate that the peak drain voltage reaches levels close to 3xVDD. As the DECT PA uses thick gate oxide transistors ($t_{\text{ox}}=5.2\text{nm}$) and a maximum supply voltage of 1.5V, the peak drain voltage is $\sim 4.5\text{V}$ ($<1\text{V/nm}$). In the technology used, the thick gate oxide transistor can withstand voltages of $\sim 8.5\text{V}$ for zero current (typical Class-E behavior), and a reasonable lifetime of the device can be expected.

In order not to exceed the critical gate oxide field in the BT PA, a supply voltage of 0.75V has to be used for the thin gate oxide transistors ($t_{\text{ox}}=2.2\text{nm}$) in the output stage. However, the thin gate oxide NMOS transistor of the technology used can withstand voltages of $\sim 4.5\text{V}$ for zero current, indicating that a higher supply voltage can be used. To estimate the lifetime of the PA, one device was operated at 1V with an output power of 22.7dBm. The device showed no output power level degradation after 100 hours of operation with 100% duty-cycle. However, a minor increase of drain current was observed, similar to [8].
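The supply voltage limits discussed above follow from a simple field estimate. The sketch below divides the simulated peak drain voltage of about 3xVDD by the oxide thickness. It assumes that the full peak drain voltage appears across the gate oxide, which is a simplification made for this example, and the function name is illustrative.

```python
def peak_oxide_field(vdd: float, t_ox_nm: float, peak_factor: float = 3.0) -> float:
    """Approximate peak gate oxide field in V/nm for a peak drain voltage of peak_factor * VDD."""
    return peak_factor * vdd / t_ox_nm

cases = [
    ("DECT PA, thick oxide, 1.5 V", 1.5, 5.2),
    ("BT PA, thin oxide, 0.75 V", 0.75, 2.2),
    ("BT PA, thin oxide, 1.0 V", 1.0, 2.2),
]
for label, vdd, t_ox in cases:
    print(f"{label}: {peak_oxide_field(vdd, t_ox):.2f} V/nm")
```

At 0.75V the thin oxide device sits at roughly the ~1V/nm DC limit, while operation at 1V exceeds it, which is why the 100-hour test was carried out at 1V.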

D. Measurement Results – Performance Comparison

Table 8.1 shows the performance of the DECT PA and two recently presented DECT PAs [9], which are designed in the same CMOS technology and also use off-chip output matching networks; however, both of those designs feature linear amplification. The PA in this work shows a slightly lower output power and efficiency, while delivering a sufficiently high output power to leave some margin for output matching network losses to have +24dBm [9] at the antenna. Compared to [9], the supply voltage is reduced by 40% and the area by approximately 91-93%, as seen in Table 8.1.
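The quoted reductions can be reproduced from the supply voltages and area values listed in Table 8.1. The helper name below is illustrative; the numbers are taken from the table.

```python
def reduction_percent(reference: float, new: float) -> float:
    """Relative reduction of `new` with respect to `reference`, in percent."""
    return 100.0 * (1.0 - new / reference)

print(f"Supply voltage: {reduction_percent(2.5, 1.5):.0f}%")
print(f"Area vs LC-based PA in [9]: {reduction_percent(121280, 8026):.1f}%")
print(f"Area vs transformer-based PA in [9]: {reduction_percent(88200, 8026):.1f}%")
```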

<table>
<thead>
<tr>
<th>Reference</th>
<th>Frequency</th>
<th>Technology</th>
<th>Pout [dBm]</th>
<th>DE [%]</th>
<th>PAE [%]</th>
<th>VDD [V]</th>
<th>Fully integrated</th>
<th>Differential</th>
<th>Single-ended</th>
<th>Area* [µm²]</th>
</tr>
</thead>
<tbody>
<tr>
<td>[9] LC-based</td>
<td>1.9GHz</td>
<td>0.13µm CMOS</td>
<td>27.3</td>
<td>-</td>
<td>37</td>
<td>2.5</td>
<td>X</td>
<td>-</td>
<td></td>
<td>121280</td>
</tr>
<tr>
<td>[9] Trafo-based</td>
<td>1.9GHz</td>
<td>0.13µm CMOS</td>
<td>27.4</td>
<td>-</td>
<td>34</td>
<td>2.5</td>
<td>X</td>
<td></td>
<td></td>
<td>88200</td>
</tr>
<tr>
<td><strong>This work:</strong> DECT PA</td>
<td><strong>1.85GHz</strong></td>
<td><strong>0.13µm CMOS</strong></td>
<td><strong>26.4</strong></td>
<td><strong>41</strong></td>
<td><strong>30</strong></td>
<td><strong>1.5</strong></td>
<td><strong>X</strong></td>
<td>-</td>
<td></td>
<td><strong>8026</strong></td>
</tr>
<tr>
<td>[5]</td>
<td>2.45GHz</td>
<td>0.13µm CMOS</td>
<td>23</td>
<td>35</td>
<td>29</td>
<td>1.5</td>
<td>X</td>
<td>X</td>
<td>-</td>
<td></td>
</tr>
<tr>
<td>[6]</td>
<td>2.4GHz</td>
<td>0.25µm CMOS</td>
<td>24</td>
<td>-</td>
<td>48</td>
<td>2.5</td>
<td>X</td>
<td>-</td>
<td></td>
<td></td>
</tr>
<tr>
<td>[10]</td>
<td>2.4GHz</td>
<td>0.18µm CMOS</td>
<td>23</td>
<td>-</td>
<td>42</td>
<td>2.4</td>
<td>X</td>
<td>-</td>
<td></td>
<td></td>
</tr>
<tr>
<td>[11]</td>
<td>2.45GHz</td>
<td>0.25µm CMOS</td>
<td>21.4</td>
<td>38</td>
<td>26</td>
<td>2.6</td>
<td>X</td>
<td>X</td>
<td>-</td>
<td></td>
</tr>
<tr>
<td>[12]</td>
<td>2.4GHz</td>
<td>0.35µm CMOS</td>
<td>23</td>
<td>-</td>
<td>37</td>
<td>1.5</td>
<td>X</td>
<td>-</td>
<td></td>
<td></td>
</tr>
<tr>
<td>[13]</td>
<td>2.4GHz</td>
<td>0.13µm CMOS</td>
<td>27</td>
<td>32</td>
<td>-</td>
<td>1.2</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td></td>
</tr>
<tr>
<td>[14]</td>
<td>5.8GHz</td>
<td>90nm CMOS</td>
<td>24.3</td>
<td>27</td>
<td>-</td>
<td>1</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td></td>
</tr>
<tr>
<td><strong>This work:</strong> BT PA</td>
<td><strong>2.45GHz</strong></td>
<td><strong>0.13µm CMOS</strong></td>
<td><strong>22.7</strong></td>
<td><strong>48</strong></td>
<td><strong>36</strong></td>
<td><strong>1</strong></td>
<td><strong>X</strong></td>
<td>-</td>
<td>-</td>
<td></td>
</tr>
</tbody>
</table>

* Including inductors/transformers, tuning capacitors (not decoupling), and transistors (WxL).

Table 8.1: Performance comparison of Bluetooth and DECT PAs.

Our BT PA is the only PA among [5], [6], [10]-[12] in Table 8.1 that achieves +20.4dBm of output power from a supply voltage as low as 0.75V. However, unlike [5] and [11], it does not have an on-chip output matching network, and it needs a supply voltage modulator for linear power amplification. At 1V and +22.7dBm output power, the PA has a similar [12] or higher [5], [11] efficiency, even though the supply voltage is reduced by ~33% [5], [12] or by ~60% [11]. Compared to the state-of-the-art 1-1.5V CMOS PAs [5], [13], [14], which use on-chip power combiners, the BT PA achieves almost double the DE of [13], [14] at output power levels of 23-24dBm, and a higher overall efficiency than [5] at a reduced supply voltage.

8.4 Summary

Two low-voltage class-E power amplifiers in 130nm CMOS intended for DECT and Bluetooth have been presented. At 1.5V supply voltage, the DECT PA delivers +26.4dBm with a DE of 41% and PAE of 30%. The Bluetooth PA shows reliable operation at a supply voltage of 1V at an output power of +22.7dBm with DE and PAE of 48% and 36%, respectively. In comparison with other low-voltage CMOS PAs, the design shows a clear power-efficiency trade-off between the utilization of on-chip and off-chip output matching networks. The Bluetooth power amplifier device was operated for 100 hours with no output power level degradation.

### 8.5 Acknowledgement

The authors would like to thank Dr. Ted Johansson for useful technical discussions, and Intel Corporation, USA, for the financial support of this project. The authors would also like to thank Infineon Technologies AG, Germany, for providing the silicon, and Henrik Karlström and Anders Sundberg at Rohde & Schwarz, Sweden, for providing the measurement equipment.

### 8.6 References


