Initial commit

2023-02-26 21:12:51 +01:00
commit d2ec37332f
69 changed files with 58831 additions and 0 deletions

chap/background.tex Normal file

@ -0,0 +1,427 @@
\chapter{Background}
\label{ch:background}
In this chapter, important domain-specific concepts are explained to create the foundation
necessary to follow \autoref{ch:implementation}. Important terms that are present in the glossary
are marked in \textbf{bold} on their first occurrence.
\clearpage
\section{Handling of External Events}
\label{sec:eventhandling}
There are two different strategies to ``listen'' for external events: ``Polling'' and
``Interrupts''.
The first strategy works by periodically \textit{polling} a device to check for changes. This could
mean reading a register of the keyboard every 50 ms to determine if a key was pressed, or reading a
sensor output even when the value remains unchanged. Because every device would have to be polled
constantly, no matter if any change actually occurred, this method is computationally inefficient
(although easy to implement without extra hardware).
The second strategy uses \textbf{\glspl{interrupt}}. Instead of the CPU actively ``looking'' for
changes, the devices signal the events to the CPU themselves. Every time the CPU is notified of an
external event, it pauses the current code execution and calls a function designated to handle this
specific event.
This approach is much more efficient than the polling strategy, because the CPU does not have to
waste its processing power to check for events when none are occurring.
\section{Fundamental Concepts}
\label{sec:fundamentals}
\subsection{Interrupt}
\label{subsec:interrupt}
When a device signals an external event to the CPU, the current code execution gets
\textit{interrupted} to handle the event. Thus, the process of signaling and handling an external
event is called an ``interrupt''.
Interrupts can be caused by all sorts of events, like key presses on a keyboard or packets received
via a network card. These interrupts from external hardware devices are called
\textbf{\glspl{external interrupt}}.
Other types of interrupts mentioned in this thesis are \textbf{\glspl{ipi}}, \textbf{\glspl{msi}}
and \textbf{\glspl{local interrupt}}:
\begin{itemize}
\item IPIs: Interrupts sent between different processor cores in multiprocessor systems, for example to
initialize different cores on system startup.
\item MSIs: Interrupts sent in-band\footnote{In in-band signaling some control information (like an IRQ)
is sent over the same channel as the data. This stands in contrast to out-of-band signaling, which
uses a dedicated control line (like an interrupt line).}, for example over a PCI-bus\footnote{PCI
supports MSIs since PCI 2.2~\cite[sec.~6.8]{pci22}.}.
\item Local Interrupts: Some specific CPU-internal interrupts.
\end{itemize}
Some interrupts are essential for continuous CPU operation. These \textbf{\glspl{nmi}} always have
to be handled and cannot be ignored (usually\footnote{An example where NMIs should be disabled is
the startup of additional processors in multiprocessor systems.}). Hardware error interrupts are a
typical example of NMIs.
The existence of NMIs hints that it is possible to ignore regular interrupts. Marking an interrupt
as ``ignored'' is called \textbf{\gls{masking}} an interrupt.
\subsection{Interrupt Controller}
\label{subsec:controller}
A computer system has to process interrupts from many sources, so it is practical to have a
designated \textbf{\gls{interrupt controller}} that receives interrupts from different devices and
forwards them to the CPU. The masking functionality is integrated into the interrupt controller, so
masked interrupts do not reach the CPU at all.
The interrupt controller receives interrupts through signals over physical connections (interrupt
lines) to different devices. These signals can be represented in multiple ways through the level on
the interrupt lines:
\textbf{\Gls{trigger mode}}:
\begin{itemize}
\item Edge-triggered signal: A signal is received when the interrupt line level changes
\item Level-triggered signal: A signal is received when the interrupt line reaches a certain level
\end{itemize}
\textbf{\Gls{pin polarity}}:
\begin{itemize}
\item High: A signal is received when the interrupt line level changes from either low to high (in
edge-triggered mode) or reaches a level above a threshold (in level-triggered mode)
\item Low: A signal is received when the interrupt line level changes from either high to low (in
edge-triggered mode) or reaches a level below a threshold (in level-triggered mode)
\end{itemize}
A signal is represented through a combination of trigger mode and pin polarity, so there are a
total of 4 combinations.
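
The four combinations can be illustrated with a minimal C++ sketch. The types \code{TriggerMode}
and \code{PinPolarity} and the function \code{signalReceived} are hypothetical names chosen for
this illustration; the line level is simplified to a boolean (true means ``high'').

```cpp
#include <cassert>

// Hypothetical types for illustration; they are not taken from hhuOS.
enum class TriggerMode { Edge, Level };
enum class PinPolarity { High, Low };

// Decides whether the interrupt controller registers a signal, given the
// previous and the current logical level of the interrupt line (true = high).
bool signalReceived(TriggerMode mode, PinPolarity polarity, bool previous, bool current) {
    // The level that counts as "asserted" depends on the pin polarity.
    const bool active = (polarity == PinPolarity::High);
    if (mode == TriggerMode::Edge) {
        // Edge-triggered: only the change from the inactive to the active level counts.
        return previous != active && current == active;
    }
    // Level-triggered: the signal is present for as long as the line is at the active level.
    return current == active;
}

// Usage: a line that stays high keeps signaling only in level-triggered mode.
void triggerModeExample() {
    assert(signalReceived(TriggerMode::Level, PinPolarity::High, true, true));
    assert(!signalReceived(TriggerMode::Edge, PinPolarity::High, true, true));
}
```
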
When the interrupt controller receives an interrupt signal, it requests the CPU to handle the
specific event: The interrupt controller sends an \textbf{\gls{irq}} to the CPU. Because there are
multiple devices that can signal interrupts, an IRQ is usually indexed by its hardware pin on the
interrupt controller: If a keyboard is connected to the interrupt controller on pin 1, the CPU
receives an IRQ1.
To signal that an interrupt has been handled by the software, the interrupt controller receives an
\textbf{\gls{eoi}} notice from the OS\@.
Information on specific interrupt controllers follows in \autoref{sec:intelcontrollers}.
\subsection{Interrupt Handler}
\label{subsec:handler}
When the CPU receives an IRQ, it pauses its current code execution to handle the interrupt. This is
done by executing the \textbf{\gls{interrupt handler}} function that is registered for the specific
interrupt.
During interrupt servicing (execution of the interrupt handler), other interrupts can occur. When
the execution of an interrupt handler is paused to handle another interrupt, this is called a
\textbf{\gls{cascaded interrupt}}.
Interrupt handlers have to be registered to different IRQs, so the CPU can locate and execute the
specific code that is designated to handle a certain interrupt. The interrupt handler addresses are
stored in an \textit{interrupt vector table}: Each address corresponds to an \textbf{\gls{interrupt
vector}} number. Interrupt vectors 0 to 31 are reserved for the CPU, while interrupt vectors 32 to
255 are usable for IRQs. In Intel's 32-bit IA-32 CPU architecture, this table is called the
\textbf{\gls{idt}}. The \textbf{\gls{idtr}} holds the location of the IDT and is loaded by the
\code{lidt}~\cite{x86isa} instruction.
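
The idea of a vector table can be sketched in C++ as an array of handler addresses indexed by the
vector number. This is a simplified model and not a real IDT (whose entries are gate descriptors
with additional fields). All names are hypothetical, and the mapping of IRQ \(n\) to vector
\(32 + n\) is an assumption made for this example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Handler = void (*)(); // hypothetical handler signature

constexpr std::size_t vectorCount = 256; // interrupt vectors 0 to 255
constexpr unsigned firstIrqVector = 32;  // vectors 0 to 31 are reserved for the CPU

// Simplified stand-in for an interrupt vector table: one handler address per vector.
std::array<Handler, vectorCount> vectorTable{};

// Assumption: IRQ n is delivered on vector 32 + n.
unsigned vectorForIrq(unsigned irq) {
    assert(irq < vectorCount - firstIrqVector);
    return firstIrqVector + irq;
}

// Registers a handler; refuses the vectors reserved for the CPU.
bool registerHandler(unsigned vector, Handler handler) {
    if (vector < firstIrqVector || vector >= vectorCount) {
        return false;
    }
    vectorTable[vector] = handler;
    return true;
}

// Looks up the handler for a vector and calls it, if one is registered.
void callHandler(unsigned vector) {
    assert(vector < vectorCount);
    if (vectorTable[vector] != nullptr) {
        vectorTable[vector]();
    }
}

// Example handler that only counts how often it was called.
int keyboardEvents = 0;
void keyboardHandler() { ++keyboardEvents; }
```

With a keyboard on IRQ1, \code{registerHandler(vectorForIrq(1), keyboardHandler)} stores the
handler at vector 33 in this model, and \code{callHandler(33)} invokes it.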
\subsection{Interrupt Trigger Mode}
\label{subsec:triggermode}
When an edge-triggered IRQ is received, its interrupt handler is called a single time, serviced,
and marked as completed in the interrupt controller (completion does not require interacting with
the device the interrupt originated from). The handler of an edge-triggered IRQ is called every
time the interrupt line changes to its active level (low to high for high pin polarity, high to low
for low pin polarity). This can lead to problems if ``glitches'' occur on the interrupt line,
because every glitch is registered as a new interrupt.
An alternative is the level-triggered IRQ: When the interrupt line is above/below a certain level,
the interrupt is signaled continuously. Servicing the interrupt thus requires not only signaling
completion to the interrupt controller, but also to the device the interrupt originated from, to
de-assert the interrupt line. Otherwise, the interrupt handler would be called again after
completion.
\subsection{Spurious Interrupt}
\label{subsec:spurious}
When an interrupt disappears while the interrupt controller is issuing the IRQ to the CPU, the
interrupt controller sends a \textbf{\gls{spurious interrupt}}. The reason for this could be
electrical noise on the interrupt line, masking of an interrupt through software at the same moment
this interrupt was received, or incorrectly sent EOIs. Thus, before an interrupt handler is called,
the OS has to check the validity of the interrupt and ignore it if it is deemed spurious.
\section{Used Technologies}
\label{sec:technologies}
\subsection{Advanced Configuration and Power Interface}
\label{subsec:acpi}
\textbf{\gls{acpi}} allows the kernel to gather information about the system hardware during
runtime. It also provides interactive functionality for power management, plug and play,
hot-swapping or status monitoring. To interact with ACPI, the
\textbf{\gls{aml}}~\cite[sec.~16]{acpi1}, a small language interpreted by the kernel, has to be
used\footnote{In this thesis, information from ACPI is only read, so AML is not required.}.
ACPI defines abstractions for different types of hardware, that are organized in multiple
\textit{system description tables}. In this thesis, ACPI 1.0~\cite{acpi1} is used to read
information about the system's interrupt hardware configuration, located in the ACPI
\textbf{\gls{madt}}~\cite[sec.~5.2.8]{acpi1}. The MADT contains information on used interrupt
controllers (version, physical memory addresses to access registers, etc.), available CPU cores in
multiprocessor systems and specific interrupt configuration (trigger mode and pin polarity).
To allow compatibility with different interrupt controllers, ACPI abstracts external interrupts
through \textbf{\glspl{gsi}}~\cite[sec.~5.2.9]{acpi1}. Different interrupt controllers may have the
same devices connected to different interrupt lines. Maintaining a mapping between actual hardware
interrupt lines and GSIs allows the OS to operate on different interrupt
controllers\footnote{Additional information in \autoref{subsec:ioapicpcat}.}.
\subsection{CPU Identification}
\label{subsec:cpuid}
\textbf{\gls{cpuid}} is used to gather information about a system's CPU. The x86 architecture
provides the \code{cpuid}~\cite{x86isa} instruction, which allows the software to identify present
CPU features at runtime, e.g.\ what instruction set extensions a processor implements\footnote{This
thesis uses CPUID to determine what APIC feature set is supported, see ``APIC'' and ``x2APIC''
features in the CPUID application note~\cite[sec.~5.1.2]{cpuid}.}.
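
A minimal sketch of how the ``APIC'' and ``x2APIC'' feature flags could be evaluated is shown
below. It assumes the register values were obtained by executing \code{cpuid} with EAX set to 1 and
takes them as plain parameters, so the sketch itself does not execute the instruction. The names
\code{ApicSupport} and \code{detectApicSupport} are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Feature flags reported by CPUID leaf 1 (EAX = 1).
constexpr std::uint32_t cpuidEdxApic = 1u << 9;    // EDX bit 9: local APIC present
constexpr std::uint32_t cpuidEcxX2Apic = 1u << 21; // ECX bit 21: x2APIC supported

enum class ApicSupport { None, Apic, X2Apic };

// Classifies the APIC feature set from the ECX and EDX values of CPUID leaf 1.
ApicSupport detectApicSupport(std::uint32_t ecx, std::uint32_t edx) {
    if ((edx & cpuidEdxApic) == 0) {
        return ApicSupport::None;
    }
    return (ecx & cpuidEcxX2Apic) != 0 ? ApicSupport::X2Apic : ApicSupport::Apic;
}

// Usage with made-up register values.
void cpuidExample() {
    assert(detectApicSupport(0, cpuidEdxApic) == ApicSupport::Apic);
    assert(detectApicSupport(cpuidEcxX2Apic, cpuidEdxApic) == ApicSupport::X2Apic);
}
```
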
\subsection{Symmetric Multiprocessing}
\label{subsec:smp}
\textbf{\gls{smp}} is a form of multiprocessing, where a system consists of multiple, homogeneous
CPU cores, that all have access to a shared main memory and the I/O devices. SMP systems are
controlled by a single OS, that treats all cores as individual, but equal processors. The shared
main memory allows every core to work on an arbitrary task, independent of its memory location.
Programming for SMP systems requires the usage of multithreading, to allow for an efficient
distribution of tasks to multiple CPU cores.
In SMP systems, only a single CPU core is active initially. This core is called the
\textbf{\gls{bsp}}. Other cores, called \textbf{\glspl{ap}}, are initialized by this core.
% TODO: I think I could remove this section
% \subsection{BIOS and UEFI}
% \label{subsec:biosuefi}
%
% The \textbf{\gls{bios}} provides low-level control over a computer system's hardware, most commonly
% used for booting an OS on system power-up. Its pre-boot environment is limited to the 16-bit
% \textbf{\gls{real mode}} and 1 MB of addressable memory. BIOS originates from the IBM PC and has
% been the default convention for computer system firmware until the \textbf{\gls{uefi}} standard.
%
% The UEFI standard is developed by the UEFI Forum, based on Intel's EFI. It improves on BIOS by
% providing modularity, support for larger partitions, 32-bit or 64-bit environments before booting
% and advanced features like networking capabilities, graphical user interfaces with mouse support or
% security features\footnote{See \url{https://uefi.org/specifications} (visited on 22/02/2023).}.
\subsection{Model Specific Register}
\label{subsec:msr}
A \textbf{\gls{msr}} is a special control register in an IA-32 CPU, that is not guaranteed to be
present in future CPU generations (it is not part of the architectural standard). MSRs that carried
over to future generations and are now guaranteed to be included in future IA-32 or Intel 64 CPUs
are called ``architectural'' MSRs. To interact with MSRs, the special instructions \code{rdmsr} and
\code{wrmsr} are used~\cite{x86isa}. These instructions allow atomic read/write operations on
MSRs\footnote{Detailed description in the IA-32 manual~\cite[sec.~4.2]{ia32}.}.
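
Both instructions select the MSR through ECX and transfer its 64-bit value in two 32-bit halves
(EDX holds the upper half, EAX the lower half). Because the instructions are privileged, the
following sketch only models this register layout in ordinary C++ and does not execute them. The
names \code{MsrHalves}, \code{combineMsr} and \code{splitMsr} are hypothetical, and the value in
the usage example is made up.

```cpp
#include <cassert>
#include <cstdint>

// Layout used by rdmsr and wrmsr: EDX holds the upper and EAX the lower 32 bits.
struct MsrHalves {
    std::uint32_t eax;
    std::uint32_t edx;
};

// Builds the 64-bit MSR value from the two halves, as done after rdmsr.
std::uint64_t combineMsr(MsrHalves halves) {
    return (static_cast<std::uint64_t>(halves.edx) << 32) | halves.eax;
}

// Splits a 64-bit value into the two halves, as needed before wrmsr.
MsrHalves splitMsr(std::uint64_t value) {
    return MsrHalves{static_cast<std::uint32_t>(value & 0xFFFFFFFFu),
                     static_cast<std::uint32_t>(value >> 32)};
}

// Usage with a made-up 64-bit value.
void msrExample() {
    const MsrHalves halves = splitMsr(0x00000001FEE00900ull);
    assert(halves.eax == 0xFEE00900u && halves.edx == 0x1u);
    assert(combineMsr(halves) == 0x00000001FEE00900ull);
}
```
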
\subsection{Memory Mapped I/O}
\label{subsec:mmio}
\textbf{\gls{mmio}} is a way for the CPU to perform I/O operations with an external device.
Registers of external devices are mapped to the same address space as the main memory, thus a
memory address can either be a location in physical memory, or belong to a device's register. Read
and write operations are then performed by reading or writing the address the register is mapped
to.
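
In C++, such register accesses are commonly written as reads and writes through a
\code{volatile} pointer, so the compiler does not optimize the access away. The following sketch
uses the hypothetical helpers \code{mmioRead32} and \code{mmioWrite32}; in the usage example, an
ordinary array stands in for the device registers so that the code can run without real hardware.

```cpp
#include <cassert>
#include <cstdint>

// Reads a 32-bit register that is mapped at the given address.
inline std::uint32_t mmioRead32(std::uintptr_t address) {
    return *reinterpret_cast<volatile std::uint32_t*>(address);
}

// Writes a 32-bit register that is mapped at the given address.
inline void mmioWrite32(std::uintptr_t address, std::uint32_t value) {
    *reinterpret_cast<volatile std::uint32_t*>(address) = value;
}

// Usage: ordinary memory stands in for the registers of a device.
void mmioExample() {
    std::uint32_t fakeRegisters[4] = {};
    const auto base = reinterpret_cast<std::uintptr_t>(fakeRegisters);
    mmioWrite32(base + 2 * sizeof(std::uint32_t), 0xABCDu); // third register
    assert(mmioRead32(base + 2 * sizeof(std::uint32_t)) == 0xABCDu);
    assert(fakeRegisters[2] == 0xABCDu);
}
```
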
\subsection{Programmable Interval Timer}
\label{subsec:pit}
The \textbf{\gls{pit}}\footnote{Specifically the Intel 8253/8254.} is a hardware timer from the
original IBM PC. It consists of multiple decrementing counters and can generate a signal (for
example an interrupt) once 0 is reached. This can be used to control a preemptive\footnote{A
scheduler is called ``preemptive'', when it is able to forcefully take away the CPU from the
currently working thread, even when the thread did not signal any form of completion.} scheduler or
the integrated PC speaker.
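
The start value of a counter determines the time until 0 is reached. As a worked example, the
following sketch computes the start value for a desired interval. It assumes the PC's usual PIT
input clock of roughly 1.193182 MHz and 16-bit counters; both facts and the name
\code{pitReloadValue} are not taken from the text above.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: input clock of the PIT in a PC, in Hz.
constexpr std::uint32_t pitBaseFrequencyHz = 1193182;

// Computes the counter start value so that 0 is reached after roughly
// intervalUs microseconds. The result is rounded down to whole clock ticks.
std::uint16_t pitReloadValue(std::uint32_t intervalUs) {
    const std::uint64_t ticks =
        static_cast<std::uint64_t>(pitBaseFrequencyHz) * intervalUs / 1000000u;
    // The interval has to fit into a 16-bit counter (about 54.9 ms at most).
    assert(ticks >= 1 && ticks <= 0xFFFFu);
    return static_cast<std::uint16_t>(ticks);
}
```

Under these assumptions, an interval of 1 ms corresponds to a start value of 1193.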
\section{Intel's Interrupt Controllers}
\label{sec:intelcontrollers}
Because it is logistically difficult to connect every external device directly to the CPU, this
task is delegated to an interrupt controller, usually located in the system's chipset. Notable
interrupt controllers (and the only ones considered in this thesis) are the Intel
\textbf{\gls{pic}} and Intel \textbf{\gls{apic}}.
\subsection{Programmable Interrupt Controller}
\label{subsec:intelpic}
The Intel 8259A PIC~\cite{pic} is the interrupt controller Intel used in the original Intel
8086\footnote{The 8259 was used with the Intel 8080 and 8085; the 8259A is a revised version for the
Intel 8086.}, the ``father'' of the x86 processor architecture. A single PIC has 8 interrupt lines
for external devices, but multiple PICs can be cascaded for a maximum of \(8 \cdot 8 = 64\)
interrupt lines with 9 PICs. The most common PIC configuration supports 15 IRQs with two PICs: A
slave, connected to interrupt line 2 of the master, which is connected to the CPU\footnote{This
configuration is called the \textbf{\gls{pcat pic architecture}}, as it was used in the IBM
PC/AT~\cite[sec.~1.13]{pcat}.}.
\begin{figure}[h]
\centering
\begin{subfigure}[b]{0.3\textwidth}
\includesvg[width=1.0\linewidth]{img/mp_spec_pic_configuration.svg}
\end{subfigure}
\caption{The PC/AT PIC architecture~\cite[sec.~1.13]{pcat}.}
\label{fig:pcatpic}
\end{figure}
If multiple interrupts are requested simultaneously, the PIC decides which one to service first
based on its \textbf{\gls{interrupt priority}}. The PIC IRQ priorities are determined by the
interrupt lines used: IRQ0 has the highest priority.
The PIC interrupt handling sequence can be briefly outlined as follows:
\begin{enumerate}
\item When the PIC receives an interrupt from a device (PIC interrupts are edge-triggered with high pin
polarity), it sets the corresponding bit in its \textbf{\gls{irr}}\footnote{Received IRQs that are
requesting servicing are marked in the IRR.}
\item If this interrupt is unmasked, an IRQ is sent to the CPU
\item The CPU acknowledges the IRQ
\item The PIC sets the corresponding bit in its \textbf{\gls{isr}}\footnote{IRQs that are being serviced
are marked in the ISR. To reduce confusion, interrupt service routines will be denoted as
``interrupt handler'' instead of ``ISR''.} and clears the previously set bit in the IRR. If the
same interrupt occurs again, it can now be queued a single time in the IRR\@.
\item When the interrupt has been handled by the OS, it sends an EOI to the PIC, which clears the
corresponding bit in the ISR\@.
\end{enumerate}
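
The sequence above can be sketched as a small software model of a single PIC. This is a
simplification for illustration and not a hardware-accurate emulation: the class \code{PicModel}
and its member names are hypothetical, cascading is omitted, and priorities are fixed with IRQ0 as
the highest.

```cpp
#include <cassert>
#include <cstdint>

class PicModel {
public:
    // Step 1: a device signals an interrupt, which is recorded in the IRR.
    void raise(unsigned irq) { irr = static_cast<std::uint8_t>(irr | bit(irq)); }

    void setMasked(unsigned irq, bool masked) {
        imr = static_cast<std::uint8_t>(masked ? (imr | bit(irq)) : (imr & ~bit(irq)));
    }

    // Step 2: returns the IRQ that would be sent to the CPU, or -1 if there is none.
    int pendingIrq() const {
        for (unsigned irq = 0; irq < 8; ++irq) {
            if (isr & bit(irq)) {
                return -1; // an interrupt of equal or higher priority is still in service
            }
            if (irr & ~imr & bit(irq)) {
                return static_cast<int>(irq);
            }
        }
        return -1;
    }

    // Steps 3 and 4: the CPU acknowledges; the IRQ moves from the IRR to the ISR.
    int acknowledge() {
        const int irq = pendingIrq();
        if (irq >= 0) {
            isr = static_cast<std::uint8_t>(isr | bit(static_cast<unsigned>(irq)));
            irr = static_cast<std::uint8_t>(irr & ~bit(static_cast<unsigned>(irq)));
        }
        return irq;
    }

    // Step 5: the OS sends an EOI, which clears the bit in the ISR.
    void endOfInterrupt(unsigned irq) { isr = static_cast<std::uint8_t>(isr & ~bit(irq)); }

    std::uint8_t requestRegister() const { return irr; }
    std::uint8_t inServiceRegister() const { return isr; }

private:
    static std::uint8_t bit(unsigned irq) {
        assert(irq < 8); // a single PIC has 8 interrupt lines
        return static_cast<std::uint8_t>(1u << irq);
    }

    std::uint8_t irr = 0; // requested interrupts
    std::uint8_t isr = 0; // interrupts in service
    std::uint8_t imr = 0; // masked interrupts
};
```

In this model, raising IRQ1 a second time while it is still in service only sets its IRR bit again;
the request is delivered after \code{endOfInterrupt(1)} has been called.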
With the introduction of multiprocessor systems and devices with an ever-increasing need for large
amounts of interrupts (like high-performance networking hardware), the PIC reached its
architectural limits:
\begin{itemize}
\item Multiple CPU cores require sending IPIs for initialization and coordination, which is not possible
using the PIC\@.
\item The PIC is hardwired to a single CPU core. It can be inefficient to handle the entire interrupt
workload on a single core.
\item The IRQ priorities depend on how the devices are physically wired to the PIC\@.
\item The PC/AT PIC architecture only has a very limited number of interrupt lines.
\item The PIC does not support MSIs, which allow efficient handling of PCI interrupts.
\end{itemize}
\subsection{Advanced Programmable Interrupt Controller}
\label{subsec:intelapic}
The original Intel 82489DX discrete APIC was introduced with the Intel i486 processor. It consisted
of two parts: A \textbf{\gls{local apic}}, responsible for receiving special \textbf{\glspl{local
interrupt}}, and external interrupts from the second part, the \textbf{\gls{io apic}}. Unlike in
modern systems, the i486's local APIC was not integrated into the CPU core (hence ``discrete''),
see \autoref{fig:discreteapic}.
The i486 was superseded by the Intel P54C\footnote{The Intel P54C is the successor of the Intel P5,
the first Pentium processor.}, which contained an integrated local APIC for the first time, see
\autoref{fig:integratedapic}\footnote{This is now the default configuration.}.
The APIC was designed to fix the shortcomings of the PIC, specifically regarding multiprocessor
systems:
\begin{itemize}
\item Each CPU core contains its own local APIC. It has the capabilities to issue IPIs and communicate
with the chipset I/O APIC. Also, it includes the APIC timer, which can be used for per-core
scheduling.
\item The chipset I/O APIC allows configuration on a per-interrupt basis (priorities, destination,
trigger mode or pin polarity are all configurable). Also, it is able to send interrupts to
different CPU cores, allowing the efficient processing of large amounts of
interrupts\footnote{There are tools that dynamically reprogram the I/O APIC to distribute
interrupts to available CPU cores, depending on heuristics of past interrupts (IRQ-balancing).}.
\item The order in which external devices are connected to the I/O APIC is flexible.
\item The APIC architecture supports MSIs.
\end{itemize}
To allow distributing the APIC hardware between local APICs of multiple CPU cores and I/O APICs,
the APIC system sends messages over a bus. In the Intel P6 and original Pentium processors a
dedicated APIC bus is used, but since the Intel Pentium 4 and Xeon processors the chipset's system
bus is used instead (see \autoref{fig:systemvsapicbus})\footnote{The Intel Core microarchitecture
is based on the P6 architecture (replacing the Pentium 4's NetBurst architecture), but still uses
the system bus instead of a discrete bus.}. Using the system bus is considered the default in this
document.
Because the order in which external devices are connected to the interrupt controller is flexible
in the APIC architecture, but fixed in the PC/AT PIC architecture, there can be deviations in
device-to-IRQ mappings between the PIC and APIC interrupt controllers. To handle this, ACPI
provides information about how the PIC compatible interrupts (IRQ0 to IRQ15) map to ACPI's GSIs in
the MADT, which has to be taken into account by the OS when operating with an APIC. The mapping
information for a single interrupt will be called \textbf{\gls{irq override}}.
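
A minimal sketch of how such overrides could be applied is shown below. The struct
\code{IrqOverride} and the function \code{gsiForIrq} are hypothetical names. The sketch assumes
that an IRQ without an override maps to the GSI with the same number; the override of IRQ0 to GSI2
in the usage example is illustrative and not read from a real MADT.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Mapping information for a single PIC compatible interrupt (hypothetical layout).
struct IrqOverride {
    std::uint8_t irq;  // PIC compatible IRQ number
    std::uint32_t gsi; // GSI the IRQ is mapped to
};

// Returns the GSI for a PIC compatible IRQ, taking the overrides into account.
std::uint32_t gsiForIrq(const std::vector<IrqOverride>& overrides, std::uint8_t irq) {
    assert(irq < 16); // only IRQ0 to IRQ15 are PIC compatible
    for (const IrqOverride& entry : overrides) {
        if (entry.irq == irq) {
            return entry.gsi;
        }
    }
    return irq; // assumption: identity mapping if no override exists
}

// Usage with an illustrative override list.
void irqOverrideExample() {
    const std::vector<IrqOverride> overrides{{0, 2}};
    assert(gsiForIrq(overrides, 0) == 2); // overridden
    assert(gsiForIrq(overrides, 1) == 1); // identity mapping
}
```
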
There have been different revisions of the APIC architecture after the original, discrete APIC,
notably the \textbf{\gls{xapic}} and \textbf{\gls{x2apic}} architectures. This thesis is mostly
concerned with the older xApic architecture\footnote{The x2Apic architecture is backwards
compatible with the xApic architecture. In addition, QEMU's TCG does not support the x2Apic
architecture.}. A notable difference between the xApic and x2Apic architectures is the register
access: xApic uses 32-bit MMIO-based register access, while x2Apic uses MSR-based register access,
which allows atomic register access to the 32- and 64-bit wide APIC control
MSRs~\cite[sec.~3.11.12]{ia32}.
\section{PC/AT Compatibility}
\label{sec:pcatcompat}
For compatibility with older computer systems, two cascaded PICs are usually present in current
computer systems (see \autoref{fig:discreteapic}/\autoref{fig:integratedapic}). The local APIC can
be initialized to operate with these PICs instead of the I/O APIC; this is called
\textbf{\gls{virtual wire mode}}. It is possible to operate with both the PIC and I/O APIC as
controllers for external interrupts in ``mixed mode'', but this is very uncommon. Because the
presence of a local APIC usually guarantees the presence of an I/O APIC, this thesis only focuses
on interrupt handling with either two PICs (\textbf{\gls{pic mode}}), in case no APIC is available,
or the full APIC architecture (local APIC and I/O APIC in \textbf{\gls{symmetric io
mode}})\footnote{For more information on operating modes, consult the MultiProcessor
specification~\cite[sec.~3.6.2.1]{mpspec}.}.
To switch from PIC to symmetric I/O mode, some\footnote{References to the IMCR are only found in
the relatively old MultiProcessor specification~\cite{mpspec}.} hardware\footnote{QEMU does not
emulate the IMCR.} provides the \textbf{\gls{imcr}}, a special register that controls the physical
connection of the PIC to the CPU. In the original discrete APIC architecture, the IMCR selects
whether the PIC or the local APIC is connected to the CPU (see \autoref{fig:discreteapic}). Since
the xApic architecture, the IMCR only connects or disconnects the PIC to the local APIC's LINT0 pin
(see \autoref{fig:integratedapic}). When the IMCR is set to connect the PIC, and the xApic ``global
enable/disable'' flag is unset (see \autoref{subsec:lapicenable}), the system is functionally
equivalent to an IA-32 CPU without an integrated local APIC~\cite[sec.~3.11.4.3]{ia32}.
Specifics on handling PC/AT compatible external interrupts follow in \autoref{subsec:ioapicpcat}.
\section{Interrupt Handling in hhuOS}
\label{sec:currenthhuos}
In hhuOS, external interrupts are handled in two steps:
\begin{enumerate}
\item After an IRQ is sent by an interrupt controller, the CPU looks up the interrupt handler address in
the IDT. In hhuOS, every IDT entry contains the address of the \code{dispatch} function (see
\autoref{lst:irqdispatch}), which is invoked with the vector number of the interrupt.
\item The \code{dispatch} function determines which interrupt handler will be called, based on the
supplied vector number. In hhuOS, interrupt handlers are implemented through the
\code{InterruptHandler} interface (see \autoref{lst:irqhandler}), that provides the \code{trigger}
function, which contains the interrupt handling routine. To allow the \code{dispatch} function to
find the desired interrupt handler, it has to be registered to a vector number by the OS
beforehand. This process is handled by the \code{plugin} function of the interrupt handler
interface, which uses the interrupt dispatcher's \code{assign} function (see
\autoref{lst:irqassign}) to register itself to the correct vector number. HhuOS supports assigning
multiple interrupt handlers to a single interrupt vector and cascading interrupts.
\end{enumerate}
\begin{codeblock}[label=lst:irqhandler]{The InterruptHandler interface (InterruptHandler.h)~\cite{hhuos}}{C++}
\cppfile{code/interrupthandler_def.cpp}
\end{codeblock}
\begin{codeblock}[label=lst:irqassign]{Assigning an interrupt handler (InterruptDispatcher.cpp)~\cite{hhuos}}{C++}
\cppfile{code/interruptdispatcher_assign.cpp}
\end{codeblock}
\begin{codeblock}[label=lst:irqdispatch]{Triggering an interrupt handler (InterruptDispatcher.cpp)~\cite{hhuos}}{C++}
\cppfile{code/interruptdispatcher_dispatch.cpp}
\end{codeblock}
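
Independently of the hhuOS listings, the two-step scheme can be illustrated with a self-contained
sketch. It reuses the names \code{InterruptHandler}, \code{trigger}, \code{assign} and
\code{dispatch} from the description above, but the signatures, the class \code{DispatcherSketch}
and the container choice are assumptions made for this illustration and do not reproduce the hhuOS
implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Interface for interrupt handlers; trigger contains the handling routine.
class InterruptHandler {
public:
    virtual ~InterruptHandler() = default;
    virtual void trigger() = 0;
};

class DispatcherSketch {
public:
    // Registers a handler for a vector; several handlers may share one vector.
    void assign(std::uint8_t vector, InterruptHandler& handler) {
        handlers[vector].push_back(&handler);
    }

    // Calls every handler registered for the vector and returns how many were called.
    std::size_t dispatch(std::uint8_t vector) {
        const auto entry = handlers.find(vector);
        if (entry == handlers.end()) {
            return 0;
        }
        for (InterruptHandler* handler : entry->second) {
            handler->trigger();
        }
        return entry->second.size();
    }

private:
    std::map<std::uint8_t, std::vector<InterruptHandler*>> handlers;
};

// Example handler that only counts its invocations.
class CountingHandler : public InterruptHandler {
public:
    int count = 0;
    void trigger() override { ++count; }
};

// Usage: one handler registered for vector 33.
void dispatchExample() {
    DispatcherSketch dispatcher;
    CountingHandler handler;
    dispatcher.assign(33, handler);
    assert(dispatcher.dispatch(33) == 1);
    assert(handler.count == 1);
}
```
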
To avoid the need to interact directly with a specific interrupt controller implementation
(e.g.\ the \code{Pic} class) or the dispatcher, a system service (the \code{InterruptService}) is
implemented to expose this functionality to other parts of the OS (it allows e.g.\ registering
interrupt handlers or masking and unmasking interrupts).
Currently, hhuOS utilizes the PIT to manage the global system time (see \autoref{lst:pithandler}),
which in turn is used to trigger hhuOS's preemptive round-robin scheduler (the \code{Scheduler}
class).
\begin{codeblock}[label=lst:pithandler]{The PIT interrupt handler (Pit.cpp)~\cite{hhuos}}{C++}
\cppfile{code/pit_trigger.cpp}
\end{codeblock}
The PIT and other devices are initialized before the system entry point, in the
\code{System::initializeSystem} function.