---
knit: (function(input, ...) {
input <- "index.Rmd";
thesis_formats <- "bs4";
source("scripts_and_filters/knit-functions.R");
knit_thesis(input, thesis_formats, ...)
})
---
# Consumer and Producer Theory
::: {.rmdcaution}
**This section is still in development.**
:::
::: {.rmdnote}
The definitions and theorems below can also be used for courses on:
1. **calculus**,
2. **linear algebra**,
3. **optimization**, and
4. **real analysis**.
:::
## Linear Algebra
### Basic Notions of Linear Algebra
> Vectors are always columns, but it is customary to write them as rows to save space.
::: {.definition name="Homogeneous System"}
If $b = 0_{m\times 1}$, the system is said to be *homogeneous* and there is at least one solution, given by $x = 0_{n\times 1}$.
$$
\ie \quad Ax = 0.
$$
:::
::: {.proposition}
If $b \neq 0$, there need not exist $x$ such that $Ax = b$.
:::
::: {.definition name="Linear Combination"}
$y \in \R^m$ is a _linear combination_ of $x_1, \dots, x_n \in \R^m$ if there exist $\alpha_1, \dots, \alpha_n \in \R$ such that $\displaystyle y = \sum_{i = 1}^n \alpha_i x_i$.
:::
::: {.example}
Consider three matrices $A_{m\times n}$, $B_{k\times m}$ and $C_{k\times n}$ such that $BA = C$.
Observe that:
1. rows of $C$ are linear combinations of rows of $A$, and
2. columns of $C$ are linear combinations of columns of $B$.
> These statements are in fact equivalent since $BA = C$ if and only if $A^T B^T = C^T$.
:::
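A quick numerical check of these two facts, as a minimal sketch in base R (the matrices below are arbitrary illustrative choices):
```{r}
# B is 2x3 and A is 3x2, so C = BA is 2x2
A <- matrix(c(1, 0,
              2, 1,
              0, 3), nrow = 3, byrow = TRUE)
B <- matrix(c(1, 2, 0,
              0, 1, 1), nrow = 2, byrow = TRUE)
C <- B %*% A

# Row 1 of C is the linear combination of the rows of A
# with coefficients taken from row 1 of B:
all.equal(C[1, ], as.vector(B[1, ] %*% A))  # TRUE
# Column 1 of C is the linear combination of the columns of B
# with coefficients taken from column 1 of A:
all.equal(C[, 1], as.vector(B %*% A[, 1]))  # TRUE
```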
### Elementary Row Operations and RREF
::: {.definition name="Elementary Row Operations"}
_Elementary row operations_ are simple operations that transform a system of linear equations into an equivalent system.
There are three elementary row operations:
1. switching rows,
2. multiplying a row with a non-zero constant, and
3. replacing a row with the sum of that row and another row.
:::
::: {.theorem}
Let $E_{n\times n}$ be an elementary matrix. Then
1. $E$ is invertible.
2. There exist elementary matrices $E_1, \dots, E_k$ such that $E^{-1} = E_1 \cdots E_k$.
:::
::: {.definition name="Row Reduced Echelon Form"}
Applying the three elementary row operations sequentially to any given matrix, we can find its _row reduced echelon form_ (RREF).
The RREF of a matrix has the following properties:
1. all the zero rows are at the bottom,
2. the first non-zero entry of each non-zero row is 1,
3. if the first non-zero entry of a row occurs on column $j$ and if the first non-zero entry of the next row occurs on column $j'$, then $j < j'$, and
4. if the first non-zero entry of a row occurs on column $j$, then all the earlier rows have zeros on column $j$.
:::
::: {.theorem}
Every matrix has a unique RREF.
:::
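Base R has no built-in RREF routine, but the `pracma` package provides one. A minimal sketch (assuming `pracma` is installed, e.g. via `install.packages("pracma")`):
```{r, eval=FALSE}
library(pracma)

A <- matrix(c(1, 2, 3,
              2, 4, 7), nrow = 2, byrow = TRUE)
rref(A)  # the unique RREF of A: rows (1, 2, 0) and (0, 0, 1)
```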
### Rank
::: {.definition name="Rank"}
The _rank_ of $A_{m\times n}$ is the number of non-zero rows in its RREF $A'$.
:::
::: {.definition name="Pivotal Column"}
A _pivotal column_ is one which contains the first non-zero entry of some row.
:::
::: {.corollary}
If $A$ is $m\times n$, then $\rk A = \rk A' \leq \min\{m, n\}$.
Also, for a given $A_{m\times n}$:
1. $\rk A = m$ if and only if $A'$ has no zero rows.
2. $\rk A = n$ if and only if all columns of $A'$ are pivotal.
:::
::: {.theorem}
Let $A$ be $m \times n$. Then $\rk A = m$ if and only if, $\forall b \in \R^m$, $\exists x \in \R^n$ such that $Ax = b$. Namely, there is **at least one solution** to the system.
:::
::: {.theorem}
Let $A$ be $m \times n$. Then $\rk A = n$ if and only if, $\forall b \in \R^m$, if $Ax = Ay = b$, then $x = y$. Namely, there is **at most one solution** to the system.
:::
::: {.corollary}
Let $A$ be $m\times n$. For each $b \in \R^m$, $Ax = b$ has a unique solution in $\R^n$ if and only if $\rk A = m = n$.
:::
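For instance, a square system with $\rk A = m = n$ can be solved with base R's `solve()`; the matrix below is an arbitrary full-rank example:
```{r}
A <- matrix(c(2, 1,
              1, 3), nrow = 2, byrow = TRUE)
b <- c(5, 10)
qr(A)$rank                        # 2, so rk A = m = n
x <- solve(A, b)                  # the unique solution of Ax = b
all.equal(as.vector(A %*% x), b)  # TRUE
```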
### Subspaces
::: {.definition name="Subspace"}
A non-empty set $S \subseteq \R^m$ is a _subspace_ of $\R^m$ if $\alpha x + \beta y \in S$ whenever $x, y \in S$ and $\alpha, \beta \in \R$.
:::
::: {.proposition}
The intersection of members of an arbitrary family of subspaces is a subspace. The union of subspaces need not be a subspace.
:::
::: {.definition name="Range"}
The _range_ of $A_{m\times n}$ is the set
$$
R(A) = \left\{ b \in \R^m : \exists x \in \R^n,\ Ax = b \right\}.
$$
:::
::: {.definition name="Null Space"}
The _null space_ of $A_{m\times n}$ is the set
$$
N(A) = \left\{ x \in \R^n : Ax = 0 \right\}.
$$
:::
::: {.proposition}
For any $A_{m\times n}$, the following propositions hold:
1. $N(A) = N(A')$,
2. $R(A)$ need not equal $R(A')$,
3. $N(A) \subseteq N(BA)$,
4. $R(BA) \subseteq R(B)$,
5. $R(A) = \R^m \iff \rk A = m$, and
6. $N(A) = \{0\} \iff \rk A = n$.
:::
::: {.definition name="Linear Span"}
Vectors $a_1, \dots, a_n \in S$ _span_ $S$ if, for every $x \in S$, there exist $\alpha_1, \dots, \alpha_n \in \R$ such that $\displaystyle x = \sum_{i = 1}^n \alpha_i a_i$.
Namely, each $x$ is a linear combination of $a_1, \dots, a_n$.
:::
::: {.definition name="Linear Independence"}
Vectors $a_1, \dots, a_n$ in $\R^m$ are _linearly independent_ if, for all $\alpha_1, \dots, \alpha_n \in \R$,
$$
\sum_{i=1}^n \alpha_i a_i = 0 \implies \alpha_1 = \cdots = \alpha_n = 0.
$$
:::
> A linearly independent set cannot contain $0$.
::: {.proposition}
For any given matrix, the following propositions hold:
1. the non-zero rows of any matrix in RREF are linearly independent,
2. all subsets of a linearly independent set are also linearly independent, and
3. a set $\{a_1, \dots, a_n\}$ in $\R^m$ is linearly independent if and only if $N(A_{m\times n}) = \{0_{n\times 1}\}$, where $\displaystyle A = \left[ a_1 \mid \cdots \mid a_n \right]_{m\times n}$ is formed by merging members of $\{a_1, \dots, a_n\}$ as columns.
:::
::: {.lemma}
$\{a_1, \dots, a_n\} \subset \R^m$ with $a_1 \neq 0$ is linearly dependent if and only if, for some $k \in \{2, \dots, n\}$, $a_k$ is a linear combination of $a_1, \dots, a_{k-1}$.
:::
::: {.definition name="Basis"}
Let $S$ be a subspace of $\R^m$. A set $\{a_1, \dots, a_n\} \subseteq S$ is a _basis_ for $S$ if it spans $S$ and is linearly independent.
:::
::: {.theorem}
Let $S$ be a subspace of $\R^m$ with basis $\{a_1, \dots, a_n\}$. If $\{b_1, \dots, b_r\} \subset S$ is linearly independent, then $r \leq n$.
:::
::: {.corollary}
Any $m + 1$ vectors in $\R^m$ are linearly dependent.
:::
::: {.corollary}
If $\{a_1, \dots, a_n\}$ and $\{b_1, \dots, b_r\}$ are two bases for $S$, then $n = r$.
:::
::: {.definition name="Dimension"}
The _dimension_ of a subspace $S$ is the number of elements in any basis for $S$.
:::
::: {.theorem name="Fundamental Theorem of Linear Algebra I"}
Let $A$ be $m\times n$. Then
1. $\dim R(A^T) = \rk A$,
2. $\rk A = \rk A^T$, and
3. $\dim N(A) = n - \rk A$.
:::
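Parts 2 and 3 can be checked numerically via the QR decomposition in base R; the rank-deficient matrix below is an illustrative choice:
```{r}
A <- matrix(c(1, 2, 3,
              2, 4, 6,
              1, 1, 1), nrow = 3, byrow = TRUE)
qr(A)$rank            # rk A = 2
qr(t(A))$rank         # rk A^T = 2 as well
ncol(A) - qr(A)$rank  # dim N(A) = n - rk A = 1
```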
### Orthogonality
::: {.definition name="Orthogonal Complement"}
Let $S$ be a subspace of $\R^m$. The _orthogonal complement_ of $S$ is the set
$$
S^\perp = \left\{ y \in \R^m : y \cdot x = 0,\ \forall x \in S \right\}.
$$
:::
::: {.lemma}
For any subspace $S$ of $\R^m$, $S \cap S^\perp = \{0\}$.
:::
::: {.theorem name="Fundamental Theorem of Linear Algebra II"}
Let $A$ be $m\times n$. Then
1. $R(A) = N(A^T)^\perp$, and
2. $R(A)^\perp = N(A^T)$.
:::
::: {.corollary}
For any subspace $S$ of $\R^m$, $(S^\perp)^\perp = S$.
:::
### Inverse
::: {.theorem}
Let $A$ and $C$ be both $n\times n$. If $AC = I$, then $\rk A = \rk C = n$.
:::
::: {.theorem}
Let $A$ and $C$ be both $n\times n$. If $AC = I$, then $CA = I$.
:::
::: {.definition name="Invertible Matrix"}
$A_{n\times n}$ is _invertible_ if there exists $C_{n\times n}$ such that $AC = CA = I$.
:::
::: {.theorem}
If $AC = CA = I$ and if $AB = BA = I$, then $C = B$.
:::
::: {.remark}
If $A$ and $B$ are invertible, then $AB$ is invertible as well.
:::
::: {.remark}
If $A$ is invertible, then so is $A^T$ and $(A^T)^{-1} = (A^{-1})^T$.
:::
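Both remarks on inverses are easy to verify numerically in base R, e.g. for an arbitrary invertible $2\times 2$ matrix:
```{r}
A    <- matrix(c(3, 1,
                 2, 4), nrow = 2, byrow = TRUE)
Ainv <- solve(A)                 # the inverse of A
all.equal(A %*% Ainv, diag(2))   # AC = I
all.equal(Ainv %*% A, diag(2))   # CA = I
all.equal(solve(t(A)), t(Ainv))  # (A^T)^{-1} = (A^{-1})^T
```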
## Real Analysis
### Basic Notions of Topology
::: {.definition name="Metric"}
Let $X$ be a set. A function $d : X \times X \to \R$ is a _metric_ on $X$ if, for all $x, y, z \in X$, the following conditions hold:
1. $d(x,y) \geq 0$,
2. $d(x,y) = 0 \iff x = y$,
3. $d(x, y) = d(y, x)$, and
4. $d(x, y) \leq d(x, z) + d(z, y)$.
:::
::: {.definition name="Metric Space"}
A _metric space_ is a pair $(X, d)$ where $X$ is a set and $d$ is a metric on $X$.
:::
::: {.definition name="Open Ball"}
Let $(X, d)$ be a metric space. For any $x \in X$ and $\varepsilon > 0$, $B(x, \varepsilon) = \{ y \in X : d(x, y) < \varepsilon \}$ is the _open ball_ centered at $x$ with radius $\varepsilon$.
:::
::: {.definition name="Interior"}
Let $S \subseteq X$. $x$ is an _interior point_ of $S$ if $B(x, \varepsilon) \subset S$ for some $\varepsilon > 0$. The set of all interior points of $S$ is the _interior_ of $S$.
:::
::: {.remark}
$$
\interior (S) \subseteq S
$$
:::
::: {.definition name="Open Set"}
$S$ is _open_ if $S = \interior (S)$.
:::
::: {.remark}
$\O$, $\interior(S)$ and $B(x, \varepsilon)$ are open.
:::
::: {.definition name="Closure"}
Let $S \subseteq X$. $x$ is a _closure point_ of $S$ if $B(x, \varepsilon) \cap S \neq \O$ for every $\varepsilon > 0$. The set of all closure points of $S$ is the _closure_ of $S$.
:::
::: {.remark}
$$
S \subseteq \closure(S)
$$
:::
::: {.definition name="Closed Set"}
$S$ is _closed_ if $S = \closure (S)$.
:::
::: {.remark}
$\O$ and $\closure(S)$ are closed.
:::
::: {.theorem}
$S$ is closed if and only if $X \setminus S$ is open.
:::
::: {.definition name="Boundary"}
Let $S \subseteq X$. $x$ is a _boundary point_ of $S$ if $B(x, \varepsilon) \cap S \neq \O$ and $B(x, \varepsilon) \cap (X \setminus S) \neq \O$ for every $\varepsilon > 0$. The set of all boundary points of $S$ is the _boundary_ of $S$.
:::
::: {.remark}
<br/>
* $\partial S = \partial(X \setminus S)$.
* $\partial S \subseteq \closure(S)$.
* $S$ is closed if and only if $\partial S \subseteq S$.
* $\partial S$ is a closed set.
:::
::: {.definition name="Metric Subspace"}
Let $(X, d)$ be a metric space and $S \subseteq X$. $d$ is a metric on $S$ and therefore $(S, d)$ is a metric space as well, usually referred to as a _metric subspace_ of $(X, d)$.
:::
::: {.theorem}
Let $(S, d)$ be a metric subspace of $(X, d)$.
* $A \subseteq S$ is open in $(S, d)$ if and only if there exists an open set $U$ in $(X, d)$ such that $A = S \cap U$.
* $A \subseteq S$ is closed in $(S, d)$ if and only if there exists a closed set $U$ in $(X, d)$ such that $A = S \cap U$.
:::
### Sequences
::: {.definition name="Sequence"}
Let $(X, d)$ be a metric space. A _sequence_ in $(X, d)$ is a map $f : \N \to X$.
:::
::: {.definition name="Convergent Sequence"}
A sequence $\{x_k\}$ is said to be _convergent_ if the following property holds: there exists some $x \in X$ such that, for every $\varepsilon > 0$, there exists some $l \in \N$ such that, for every $k > l$, $x_k \in B(x, \varepsilon)$. Such $x$ is called the limit of $\{x_k\}$.
:::
::: {.theorem}
If $\lim x_k$ exists, it is unique.
:::
::: {.definition name="Subsequence"}
Let $\{x_k\}$ be a sequence. A _subsequence_ of $\{x_k\}$ is a sequence obtained by deleting some (possibly none, possibly infinitely many) members of $\{x_k\}$. Namely, let $\{k_n\}$ be a strictly increasing sequence of positive integers. Then $\{x_{k_{n}}\}$ is a subsequence of $\{x_k\}$.
:::
::: {.theorem}
$\{x_k\}$ is convergent with a limit $x$ if and only if every subsequence of $\{x_k\}$ is convergent with limit $x$.
:::
::: {.theorem}
Let $(X, d)$ be a metric space and $S \subseteq X$. $x \in \closure(S)$ if and only if there exists a sequence $\{x_k\}$ such that $x_k \in S$ for all $k$, and $\lim x_k = x$.
:::
::: {.theorem}
$S$ is closed if and only if every convergent sequence in $S$ converges to an element of $S$.
:::
::: {.definition name="Bounded Sequence"}
Let $(X, d)$ be a metric space. A sequence $\{x_k\}$ in $X$ is _bounded_ if it is contained in a ball with finite radius, i.e., if for some $\varepsilon > 0$ and $y \in X$, $x_k \in B(y, \varepsilon)$ for every $k$.
:::
::: {.theorem name="Bolzano–Weierstrass"}
Every bounded sequence in $\R^m$ has a convergent subsequence.
:::
::: {.definition name="Cauchy Sequence"}
Let $(X, d)$ be a metric space. A sequence $\{x_k\}$ in $X$ is a _Cauchy sequence_ if, for every $\varepsilon > 0$, there exists a positive integer $k_\varepsilon$ such that, whenever $k, l > k_\varepsilon$, $d(x_k, x_l) < \varepsilon$.
:::
::: {.theorem}
If $\{x_k\}$ is convergent, then $\{x_k\}$ is Cauchy. If $\{x_k\}$ is Cauchy, then $\{x_k\}$ is bounded.
:::
::: {.definition name="Completeness"}
Let $(X, d)$ be a metric space. $(X, d)$ is _complete_ if every Cauchy sequence in $X$ converges to an element of $X$.
:::
::: {.theorem}
Let $(X, d)$ be complete and $S \subseteq X$. $(S, d)$ is complete if and only if $S$ is a closed subset of $X$.
:::
::: {.definition name="Open Cover"}
Let $(X, d)$ be a metric space and $S \subseteq X$. A collection of open sets $O_i$, $i \in I$, is an _open cover_ for $S$ if $\displaystyle S \subset \bigcup_{i \in I} O_i$.
:::
::: {.definition name="Finite Subcover"}
Let $O_i$, $i \in I$, be an open cover for $S$. If $I_0$ is a finite subset of $I$ and $\displaystyle S \subset \bigcup_{i \in I_0} O_i$, then the collection $O_i$, $i \in I_0$, is a _finite subcover_ of $S$.
:::
::: {.definition name="Compactness"}
$S$ is _compact_ if every open cover of $S$ has a finite subcover.
:::
::: {.theorem name="Compactness"}
A compact set is bounded and closed.
:::
::: {.theorem name="Sequential Compactness"}
$S$ is compact if and only if every sequence in $S$ has a subsequence that converges to a point in $S$.
:::
::: {.theorem name="Heine-Borel"}
A subset of $\R^m$ is compact if and only if it is closed and bounded.
:::
### Continuity
::: {.definition name="Continuity"}
Let $(X, d)$ and $(Y, \sigma)$ be metric spaces. A function $f : X \to Y$ is _continuous at $x \in X$_ if, for every $\varepsilon > 0$, there exists $\delta > 0$ such that, whenever $x' \in B(x, \delta)$, $f(x') \in B(f(x), \varepsilon)$.
:::
::: {.definition name="Global Continuity"}
$f$ is _continuous_ if it is continuous at $x$ for every $x \in X$.
:::
::: {.theorem}
$f$ is continuous at $x$ if and only if for every sequence $\{x_k\}$ in $X$, whenever $\lim x_k = x$, $\lim f(x_k) = f(x)$.
:::
::: {.definition name="Image"}
Let $(X, d)$ and $(Y, \sigma)$ be metric spaces and $f : X \to Y$. If $S \subseteq X$, then $f(S) = \{ f(x) \in Y : x \in S \}$.
:::
::: {.definition name="Inverse Image"}
Let $(X, d)$ and $(Y, \sigma)$ be metric spaces and $f : X \to Y$. If $S \subseteq Y$, then $f^{-1}(S) = \{ x \in X : f(x) \in S \}$.
:::
::: {.theorem}
The following are equivalent:
* $f$ is continuous.
* $f^{-1}(S)$ is open if $S$ is open.
* $f^{-1}(S)$ is closed if $S$ is closed.
:::
::: {.definition name="Upper Semicontinuous"}
Let $(X, d)$ be a metric space. A function $f : X \to \R$ is _upper semicontinuous_ if $\{x \in X : f(x) \geq \alpha \}$ is closed for every $\alpha \in \R$.
:::
::: {.definition name="Lower Semicontinuous"}
Let $(X, d)$ be a metric space. A function $f : X \to \R$ is _lower semicontinuous_ if $\{x \in X : f(x) \leq \alpha \}$ is closed for every $\alpha \in \R$.
:::
::: {.theorem}
$f : X \to \R$ is continuous if and only if $f$ is upper semicontinuous and lower semicontinuous.
:::
::: {.definition name="Maximizer / Minimizer"}
Let $(X, d)$ be a metric space and $S \subseteq X$. A function $f : S \to \R$ has a _maximizer_ if for some $x^* \in S$, $f(x) \leq f(x^*)$ for every $x \in S$. Similarly, $f$ has a _minimizer_ if for some $x_* \in S$, $f(x) \geq f(x_*)$ for every $x \in S$.
:::
::: {.theorem name="Weierstrass"}
Let $S$ be compact and $f : S \to \R$ be continuous. Then $f$ has a maximizer and a minimizer.
:::
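As an illustration, the continuous function below attains its extrema on the compact interval $[0, 2]$; base R's `optimize()` finds the interior minimizer, and comparing boundary values locates the maximizer (a minimal sketch with an arbitrary example function):
```{r}
f <- function(x) x^3 - 3 * x + 1
optimize(f, interval = c(0, 2))$minimum  # interior minimizer, approximately 1
c(f(0), f(2))                            # boundary values 1 and 3, so the maximizer is x = 2
```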
::: {.theorem}
Let $S$ be compact and $f : S \to \R$ be upper semicontinuous. Then $f$ has a maximizer.
:::
::: {.theorem}
Let $S$ be compact and $f : S \to \R$ be lower semicontinuous. Then $f$ has a minimizer.
:::
::: {.theorem name="Separation"}
Let $(X, d)$ be a metric space. If $S$ and $T$ are disjoint and closed subsets of $X$, then there exist disjoint open sets $U$ and $V$ such that $U \supset S$ and $V \supset T$.
:::
## Differentiable Functions
::: {.definition name="Differentiability"}
Let $S \subseteq \R$, $x \in S$ and $f : S \to \R$. Then $f$ is _differentiable at $x$_ if
$$
\lim_{u \to 0} \frac{f(x + u) - f(x)}{u}
$$
exists.
Namely, $f$ is _differentiable at $x$_ if there is some $y \in \R$ such that, for every $\varepsilon > 0$, there exists some $\delta > 0$ such that, whenever $x + u \in B_\delta(x) \cap S$ and $u \neq 0$,
$$
\left\lvert \frac{f(x + u) - f(x)}{u} - y \right\rvert < \varepsilon,
$$
where $y = f'(x)$.
:::
::: {.definition name="Partial Derivative"}
Let $S \subseteq \R^n$, $x \in S$ and $f : S \to \R$. The _$j$th partial derivative of $f$ at $x$_ is
$$
D_j f(x) := \lim_{u \to 0} \frac{f(x + ue_j) - f(x)}{u}
$$
if this limit exists.
:::
::: {.definition name="Gradient"}
If $D_j f(x)$ exists for all $j = 1, \dots, n$, then the _gradient of $f$ at $x$_ is the vector
$$
\nabla f(x) :=
\begin{bmatrix}
D_1 f(x) \\
\vdots \\
D_n f(x)
\end{bmatrix}
\in \R^n.
$$
:::
::: {.definition name="Jacobian"}
Let $S \subseteq \R^n$, $x \in S$ and $f : S \to \R^m$. Suppose that $D_j f_i(x)$ exists for all $i$ and $j$, then the _Jacobian of $f$ at $x$_ is the $m\times n$ matrix
$$
J_f(x) :=
\begin{bmatrix}
D_1 f_1(x) & \cdots & D_n f_1(x) \\
\vdots & \ddots & \vdots \\
D_1 f_m(x) & \cdots & D_n f_m(x)
\end{bmatrix}_{m\times n}
=
\begin{bmatrix}
\nabla f_1(x)^T \\
\vdots \\
\nabla f_m(x)^T
\end{bmatrix}.
$$
:::
::: {.definition name="Differentiability"}
Let $S \subseteq \R^n$. Then $f : S \to \R^m$ is _differentiable at $x \in S$_ if the following two conditions hold:
1. $D_j f_i (x)$ exists for all $i$ and $j$,
2. we have
$$
\lim_{u \to 0} \frac{\lVert f(x + u) - f(x) - J_f(x)u \rVert}{\lVert u \rVert} = 0.
$$
> The second condition holds if and only if, for every $\varepsilon > 0$, there is some $\delta > 0$ such that, whenever $x + u \in B_\delta(x) \cap S$ and $u \neq 0_{n\times 1}$, we have
$$
\frac{\lVert f(x + u) - f(x) - J_f(x)u \rVert}{\lVert u \rVert} < \varepsilon.
$$
:::
::: {.theorem}
Suppose $S\subseteq\R^n$, $x \in S$ and $f : S \to \R^m$. Suppose that, for each $i$ and $j$, $D_j f_i(x)$ exists and that $x \mapsto D_j f_i(x)$ is continuous on $S$. Then $f$ is differentiable at $x$.
:::
::: {.theorem name="Implicit Function"}
Let $F : \R^2 \to \R$. Suppose that $D_1F(x,y)$ and $D_2F(x,y)$ exist and are continuous in an open set $U$ containing $(a, b) \in \R^2$. Suppose that $F(a, b) = 0$ and that $D_2F(a, b) \neq 0$. Then there exist $\varepsilon > 0$ and $g : B_\varepsilon(a) \to \R$ such that $g(a) = b$, $F(x, g(x)) = 0$ for all $x \in B_\varepsilon (a)$ and
$$
g'(a) = -\frac{D_1F(a,b)}{D_2F(a,b)}.
$$
:::
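A numerical sanity check of the formula for $g'(a)$, using the example $F(x, y) = x^2 + y^2 - 1$ at $(a, b) = (0.6, 0.8)$, where the implicit function is $g(x) = \sqrt{1 - x^2}$:
```{r}
a <- 0.6; b <- 0.8
g <- function(x) sqrt(1 - x^2)   # the implicit function for this F
h <- 1e-6
(g(a + h) - g(a - h)) / (2 * h)  # central difference for g'(a): about -0.75
-(2 * a) / (2 * b)               # -D1F(a,b)/D2F(a,b) = -0.75
```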
::: {.theorem name="Implicit Function"}
Suppose $F : \R^{n + m} \to \R^m$ is differentiable in an open set $U$ containing $(a, b) \in \R^{n + m}$ and suppose that the entries of $J_F (x,y)$ are continuous at each $(x,y) \in U$. Let $A_{m\times n}$ and $B_{m\times m}$ be defined by $J_F (a, b)$ as
$$
A =
\begin{bmatrix}
D_1 F_1(a,b) & \cdots & D_n F_1(a,b) \\
\vdots & \ddots & \vdots \\
D_1 F_m(a,b) & \cdots & D_n F_m(a,b)
\end{bmatrix}_{m\times n}, \\
B =
\begin{bmatrix}
D_{n+1} F_1(a,b) & \cdots & D_{n+m} F_1(a,b) \\
\vdots & \ddots & \vdots \\
D_{n+1} F_m(a,b) & \cdots & D_{n+m} F_m(a,b)
\end{bmatrix}_{m\times m}
$$
and suppose that $F(a,b) = 0$ and $B$ is non-singular. Then there exist $\varepsilon > 0$ and $g : B_\varepsilon (a) \to \R^m$ such that $g(a) = b$, $F(x, g(x)) = 0$ for all $x \in B_\varepsilon (a)$ and $J_g (a) = -B^{-1}_{m\times m} A_{m\times n}$.
:::
::: {.theorem name="Inverse Function"}
Suppose that $f : \R^m \to \R^m$ is differentiable in an open set $U$ containing $b \in \R^m$. Suppose that the partials of $f$ are continuous on $U$. Suppose that $f(b) = a$ and $J_f(b)$ is non-singular. Then there exist some $\varepsilon > 0$ and some $g : B_\varepsilon (a) \to \R^m$ such that $g(a) = b$, $f(g(x)) = x$ for all $x \in B_\varepsilon (a)$, and $g$ is differentiable at $a$ with $J_g (a) = J_f (b)^{-1}$.
:::
::: {.theorem name="Weierstrass"}
Let $S \subseteq \R^n$ be compact and let $f : S \to \R$ be continuous. Then there exist $x'$ and $x''$ in $S$ such that, for all $x \in S$, $f(x') \leq f(x) \leq f(x'')$.
:::
::: {.theorem}
Suppose $f : \R^n \to \R$ attains its maximum or minimum on an open set $U$ at some $x \in U$. If $D_i f(x)$ exists, then $D_i f(x) = 0$.
:::
::: {.lemma name="Rolle's}
Suppose $f : [a, b] \to \R$ is continuous at each $x \in [a, b]$ and $f$ is differentiable at each $x \in (a, b)$. Furthermore, suppose that $f(a) = 0 = f(b)$. Then there exists some $c \in (a, b)$ such that $f'(c) = 0$.
:::
::: {.theorem name="Mean Value"}
Suppose that $f : [a, b] \to \R$ is continuous at each $x \in [a, b]$ and that $f$ is differentiable at each $x \in (a, b)$. Then there exists $c \in (a, b)$ such that
$$
f'(c) = \frac{f(b) - f(a)}{b - a}.
$$
:::
::: {.theorem name="Mean Value"}
Let $U$ be an open set in $\R^n$ containing $a$ and $b$. Suppose that $(1 - t)a + tb \in U$ whenever $t \in [0, 1]$. Suppose that $f : U \to \R$ is differentiable at each $x \in U$. Then there exists $t' \in (0,1)$ such that $\nabla f(c') \cdot (b - a) = f(b) - f(a)$ where $c' = (1 - t')a + t'b$.
:::
::: {.definition name="Hessian Matrix"}
Let $U$ be an open set in $\R^n$ and let $f : U \to \R$. The _Hessian of $f$ at $x \in U$_ is the matrix
$$
H_f (x) :=
\begin{bmatrix}
D_1 D_1 f(x) & \cdots & D_n D_1 f(x) \\
\vdots & \ddots & \vdots \\
D_1 D_n f(x) & \cdots & D_n D_n f(x)
\end{bmatrix}_{n \times n}
$$
:::
::: {.theorem}
Suppose $U$ is open in $\R^n$, $f : U \to \R$ and $x \mapsto D_i D_j f(x)$ is continuous at each $x \in U$ for every $i$ and $j$. Then $H_f (x)$ is symmetric for each $x \in U$.
:::
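The symmetry can be checked numerically with `stats::optimHess()`, which approximates the Hessian by finite differences; the smooth function below is an arbitrary example:
```{r}
f <- function(x) x[1]^2 * x[2] + x[2]^3
H <- optimHess(c(1, 2), f)  # numerical Hessian at x = (1, 2)
H                           # approximately [[4, 2], [2, 12]]
isSymmetric(round(H, 4))    # TRUE, up to numerical error
```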
::: {.theorem}
Suppose $U \subseteq \R^n$ is an open set containing $a$ and $b$ and let $f : U \to \R$. Furthermore, suppose that $U$ contains $(1 - t) a + tb$ for each $t \in [0, 1]$. Suppose that both $f$ and $\nabla f$ are differentiable on $U$. Then there is some $t' \in (0, 1)$ such that
$$
f(b) = f(a) + \nabla f(a) \cdot (b - a) + \frac{1}{2} (b - a)^T H_f (c') (b - a)
$$
where $c' = (1 - t')a + t'b$.
:::
::: {.definition name="Convex Function"}
Let $S \subseteq \R^n$ be convex. A function $f : S \to \R$ is _convex_ if, for all $t \in [0,1]$ and for all $x, x' \in S$, we have $f((1 - t)x + tx') \leq (1 - t) f(x) + tf(x')$. A function $f : S \to \R$ is _strictly convex_ if, for all $t \in (0,1)$ and for all $x, x' \in S$ such that $x \neq x'$, we have $f((1 - t)x + tx') < (1 - t) f(x) + tf(x')$.
:::
::: {.theorem}
Suppose that $U \subseteq \R^n$ is open and convex and that $f : U \to \R$ is differentiable. Then
1. $f$ is convex if and only if $f(x') \geq f(x) + \nabla f(x) \cdot (x' - x)$ for each $x, x' \in U$.
2. $f$ is strictly convex if and only if $f(x') > f(x) + \nabla f(x) \cdot (x' - x)$ for each $x, x' \in U$ such that $x \neq x'$.
:::
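For instance, part 1 can be spot-checked for the convex function $f(x) = \lVert x \rVert^2$, whose gradient is $2x$ (a minimal sketch at randomly drawn points):
```{r}
f    <- function(x) sum(x^2)
grad <- function(x) 2 * x
set.seed(1)
x <- rnorm(3); xp <- rnorm(3)
f(xp) >= f(x) + sum(grad(x) * (xp - x))  # TRUE: the tangent lies below f
```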
## Correspondences
::: {.definition name="Correspondence"}
A _correspondence_ from $X$ to $Y$ (both Euclidean) is a map $\varphi : X \rightrightarrows Y$ which associates with every $x \in X$ a non-empty set $\varphi (x) \subseteq Y$. Namely, a _correspondence_ from $X$ to $Y$ is a function from $X$ to $\{A \subseteq Y : A \neq \O\}$.
:::
::: {.definition name="Closed-Valued / Open-Valued / Compact-Valued / ... "}
A correspondence $\varphi$ is _closed-valued_ if $\varphi(x)$ is a closed subset of $Y$ for every $x \in X$. We similarly define correspondences which are _open-valued_, _compact-valued_, and so on.
:::
::: {.definition name="Graph"}
The _graph_ of $\varphi$ is the set
$$
G_\varphi := \{ (x, y) \in X \times Y : y \in \varphi(x) \} .
$$
:::
::: {.proposition}
$\varphi$ is _closed / open_ if $G_\varphi$ is a closed / open subset of $X \times Y$.
:::
::: {.definition name="Inverses"}
Let $U \subseteq Y$. The _strong and weak inverses_ of $U$ under $\varphi$ are defined respectively as
$$
\varphi_s^{-1} (U) := \{x \in X : \varphi(x) \subseteq U\} \\
\varphi_w^{-1} (U) := \{x \in X : \varphi(x) \cap U \neq \O\}
$$
:::
::: {.proposition}
<br/>
* $\varphi_s^{-1} (U) \subseteq \varphi_w^{-1} (U)$.
* If $\varphi$ is such that $\lvert \varphi (x) \rvert = 1$ for every $x$, then $\varphi_s^{-1} (U) = \varphi_w^{-1} (U)$.
* In general, $\varphi_w^{-1} (U) = \left(\varphi_s^{-1} \left(U^C\right)\right)^C$.
:::
::: {.definition name="Upper Hemi-Continuity"}
A correspondence $\varphi : X \rightrightarrows Y$ is _upper hemi-continuous (UHC) at $x \in X$_ if, whenever $U$ is an open subset of $Y$ and $x \in \varphi_s^{-1} (U)$, $B(x, \delta) \subseteq \varphi_s^{-1} (U)$ for some $\delta > 0$. $\varphi$ is _UHC_ if it is UHC at every $x \in X$.
:::
::: {.proposition}
$\varphi$ is UHC if and only if the strong inverse of any open set in $Y$ under $\varphi$ is open in $X$.
:::
::: {.definition name="Lower Hemi-Continuity"}
A correspondence $\varphi : X \rightrightarrows Y$ is _lower hemi-continuous (LHC) at $x \in X$_ if, whenever $U$ is an open subset of $Y$ and $x \in \varphi_w^{-1} (U)$, $B(x, \delta) \subseteq \varphi_w^{-1} (U)$ for some $\delta > 0$. $\varphi$ is _LHC_ if it is LHC at every $x \in X$.
:::
::: {.proposition}
$\varphi$ is LHC if and only if the weak inverse of any open set in $Y$ under $\varphi$ is open in $X$.
:::
::: {.definition name="Continuity"}
A correspondence is _continuous_ if it is LHC and UHC.
:::
::: {.theorem}
Let $f : X \to Y$ be a function and let $\varphi : X \rightrightarrows Y$ be defined as $\varphi (x) = \{f(x)\}$ for every $x$. Then the following are equivalent:
1. $\varphi$ is UHC.
2. $\varphi$ is LHC.
3. $f$ is continuous.
:::
::: {.proposition}
If the function $f : X \to Y$ is upper / lower semicontinuous, the correspondence $\varphi : X \rightrightarrows Y$ defined by $\varphi (x) = \{f(x)\}$ need not be upper / lower hemicontinuous.
:::
::: {.remark}
Closedness and UHC are not nested.
:::
::: {.theorem}
If $\varphi : X \rightrightarrows Y$ is UHC and closed-valued, then it is also closed.
:::
::: {.theorem}
If $\varphi : X \rightrightarrows Y$ is closed and $Y$ is compact, then $\varphi$ is UHC.
:::
::: {.theorem}
If $\varphi$ is open, then it is LHC.
:::
::: {.proposition}
For every $x \in X$, every sequence $\{x_n\}$ converging to $x$ and every sequence $\{y_n\}$ such that $y_n \in \varphi(x_n)$ for every $n$, there exists a convergent subsequence of $\{y_n\}$ with limit in $\varphi(x)$.
:::
::: {.proposition}
For every $x \in X$, every sequence $\{x_n\}$ converging to $x$ and every $y \in \varphi(x)$, there exists a sequence $\{y_n\}$ converging to $y$ such that $y_n \in \varphi(x_n)$ for every $n$.
:::
::: {.theorem}
<br/>
1. If $\varphi$ satisfies **proposition 1.10**, then $\varphi$ is UHC.
2. If $\varphi$ is UHC and compact-valued, then $\varphi$ satisfies **proposition 1.10**.
:::
::: {.theorem}
$\varphi$ satisfies **proposition 1.11** if and only if $\varphi$ is LHC.
:::
::: {.corollary}
If $\varphi : X \rightrightarrows Y$ is UHC and compact-valued and $K \subseteq X$ is compact, then $\displaystyle \varphi(K) := \bigcup_{x \in K} \varphi (x)$ is compact.
:::
::: {.remark}
A subset $S$ of a metric space is _compact_ if and only if every sequence in $S$ has a subsequence that converges to a point in $S$.
:::
::: {.theorem name="The Maximum"}
Let $T$ and $X$ be Euclidean sets. Let $\varphi : T \rightrightarrows X$ and $f : X \times T \to \R$. For every $t \in T$, consider the problem
$$
\max_{x \in \varphi(t)} f(x, t).
$$
Define $\displaystyle \mu (t) = \argmax_{x \in \varphi(t)} f(x, t)$. Suppose that $\mu(t) \neq \O$ for every $t$.
We can define a function $g : T \to \R$ as $g(t) = f(x, t)$ for some $x \in \mu(t)$; $g$ is well-defined since every $x \in \mu(t)$ yields the same maximal value.
If $\varphi$ is compact-valued, UHC and LHC and if $f$ is continuous, then
1. $\mu : T \rightrightarrows X$ is compact-valued and UHC, and
2. $g : T \to \R$ is continuous.
:::
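A minimal sketch of the theorem with $f(x, t) = tx - x^2$ and the continuous, compact-valued correspondence $\varphi(t) = [0, t]$: here $\mu(t) = \{t/2\}$ and $g(t) = t^2/4$, both continuous, and the numerical value function matches the closed form:
```{r}
g <- function(t) optimize(function(x) t * x - x^2,
                          interval = c(0, t), maximum = TRUE)$objective
ts <- seq(0.5, 3, by = 0.5)
cbind(t = ts, numeric = sapply(ts, g), exact = ts^2 / 4)
```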
## Convexity
::: {.definition name="Convex Set"}
A non-empty set $S \subseteq \R^m$ is _convex_ if, for every $x, x' \in S$ and every $t \in [0, 1]$, $tx + (1 - t)x' \in S$.
:::
::: {.definition name="Convex Combination"}
A _convex combination_ of vectors $x_1, \dots, x_n$ is a vector of the form $\displaystyle \sum_{i = 1}^n \alpha_i x_i$ where $\alpha_1, \dots, \alpha_n$ are non-negative numbers which add up to 1.
:::
::: {.theorem}
$S \subseteq \R^m$ is convex if and only if $S = \tilde{S}$ where
$$
\tilde{S} := \left\{ \sum_{i = 1}^n \alpha_i x_i : n \in \N,\ x_i \in S,\ \alpha_i \geq 0,\ \forall i,\ \sum_{i = 1}^n \alpha_i = 1 \right\};
$$
i.e., if and only if it contains all convex combinations of its elements.
:::
::: {.proposition}
<br/>
* Arbitrary intersections of convex sets are convex.
* $S + T := \{ s + t : s \in S, t \in T\}$ is convex if $S$ and $T$ are convex.
* For every scalar $\lambda \geq 0$, $\lambda S := \{\lambda s : s \in S\}$ is convex if $S$ is convex.
* The closure and interior of a convex set are convex using the Euclidean metric.
:::
::: {.definition name="Convex Hull"}
The _convex hull_ of a set $S \subseteq \R^m$ is the "smallest" convex superset of $S$. Namely,
$$
\convex S := \bigcap \left\{ G \subseteq \R^m : S \subseteq G \wedge G\ \text{is convex} \right\}.
$$
:::
::: {.proposition}
$\convex S$ is convex and $S \subseteq \convex S$.
:::
::: {.theorem}
$$
\convex S = \tilde{S}
$$
:::
::: {.proposition}
$S$ is convex if and only if $\convex S \subseteq S$.
:::
::: {.theorem name="Carathéodory"}
Let $S \subseteq \R^m$ be non-empty. If $x \in \convex S$, then $x$ can be written as a convex combination of no more than $m + 1$ members of $S$, i.e., there exist $x_1, x_2, \dots, x_{m+1} \in S$ and $\alpha_1, \alpha_2, \dots, \alpha_{m+1} \geq 0$ with $\displaystyle \sum_{i = 1}^{m + 1} \alpha_i = 1$ such that $\displaystyle x = \sum_{i = 1}^{m + 1} \alpha_i x_i$.
:::
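In $\R^2$ (so $m + 1 = 3$), convex hulls of finite sets can be explored with base R's `chull()`, which returns the indices of the extreme points:
```{r}
set.seed(42)
pts  <- matrix(rnorm(40), ncol = 2)  # 20 random points in R^2
hull <- chull(pts)                   # indices of the vertices of conv(pts)
pts[hull, ]                          # by Caratheodory, any point of conv(pts) is a
                                     # convex combination of at most 3 points of pts
```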
::: {.proposition}
<br/>
* $\displaystyle \convex \left( \sum_{i = 1}^n S_i \right) = \sum_{i = 1}^n \convex S_i$.
* If $A \subset \R^m$ is open, then $\convex A$ is open.
* The convex hull of a closed set in $\R^m$ need not be closed. But if $K \subset \R^m$ is compact, then $\convex K$ is compact.
:::
::: {.theorem name="Shapley-Folkman"}
Let $S_i \subseteq \R^m$ for every $i = 1, \dots, n$, and let $\displaystyle x \in \convex \sum_{i = 1}^n S_i$. Then there exist $x_1, \dots, x_n$ such that
1. $x_i \in \convex S_i$ for every $i$,
2. $\displaystyle x = \sum_{i = 1}^n x_i$, and
3. $\# \{i : x_i \notin S_i\} \leq m$.
:::
::: {.remark}
If $x \in \R^m$ can be written as $\displaystyle x = \sum_{i = 1}^k \alpha_i x_i$ where $\alpha_1, \dots, \alpha_k \in \R_+$ and $x_1, \dots, x_k \in \R^m$ with $k > m$, then there exist $\beta_1, \dots, \beta_k \in \R_+$ with $\# \{i : \beta_i > 0\} \leq m$ such that $\displaystyle x = \sum_{i = 1}^k \beta_i x_i$.
> Scalars $\alpha_i$ and $\beta_i$ do not need to add up to 1.
:::
::: {.definition name="Hyperplane"}
Fix $p \in \R^m$ and $\alpha \in \R$. The _hyperplane_ formed by $p$ and $\alpha$ is
$$
H(p; \alpha) := \left\{ x \in \R^m : p \cdot x = \alpha \right\}
$$
where $p$ is called the _normal vector_ of $H(p; \alpha)$.
:::
::: {.theorem name="Minkowski"}
Let $S \subseteq \R^m$ be non-empty, convex and closed, and let $\bar{x} \notin S$. There exist $p \in \R^m \setminus \{0\}$ and $x_0 \in S$ such that $p \cdot \bar{x} > p \cdot x_0 \geq p \cdot x$ for every $x \in S$.
:::
::: {.theorem}
Suppose that $S \subseteq \R^m$ is non-empty and convex, and that $x_n$ is a sequence in $\R^m\setminus \closure S$. If $x_n \to \bar{x}$, then there exists $p \in \R^m \setminus \{0\}$ such that, for every $x \in S$, $p \cdot x \leq p \cdot \bar{x}$.
:::
::: {.theorem name="Supporting Hyperplane"}
Suppose that $S \subseteq \R^m$ is non-empty and convex. If $\bar{x} \in \partial S$, then there exists $p \in \R^m \setminus \{0\}$ such that $p \cdot \bar{x} \geq p \cdot x$ for every $x \in S$.
:::
::: {.theorem}
Suppose that $S \subseteq \R^m$ is a non-empty and convex set, and let $\bar{x} \notin S$. Then there exists $p \in \R^m \setminus \{0\}$ such that $p \cdot x \leq p \cdot \bar{x}$ for every $x \in S$.
:::
::: {.theorem name="Separating Hyperplane"}
Let $S$ and $T$ be disjoint and convex subsets of $\R^m$. There exists $p \in \R^m \setminus \{0\}$ such that $p \cdot s \leq p \cdot t$ for every $(s, t) \in S \times T$.
:::
::: {.definition name="Convex Function"}
Let $S \subseteq \R^n$ be convex. A function $f : S \to \R$ is _convex_ if $f(tx + (1 - t)y) \leq tf(x) + (1 - t)f(y)$, $\forall x, y \in S$, $t \in [0, 1]$. Similarly, $f$ is said to be _strictly convex_ if $f(tx + (1 - t)y) < tf(x) + (1 - t)f(y)$, $\forall x, y \in S$, $t \in (0, 1)$.
:::
::: {.definition name="Concave Function"}
Let $S \subseteq \R^n$ be convex. A function $f : S \to \R$ is _concave_ if $f(tx + (1 - t)y) \geq tf(x) + (1 - t)f(y)$, $\forall x, y \in S$, $t \in [0, 1]$. Similarly, $f$ is said to be _strictly concave_ if $f(tx + (1 - t)y) > tf(x) + (1 - t)f(y)$, $\forall x, y \in S$, $t \in (0, 1)$.
:::
::: {.definition name="Subgradient"}
Suppose $S \subseteq \R^n$ is convex and $f : S \to \R$. A vector $p \in \R^n$ is a _subgradient_ for $f$ at $x \in S$ if
$$
f(y) \geq f(x) + p \cdot (y - x),
$$
for all $y \in S$.
:::
::: {.definition name="Supergradient"}
Suppose $S \subseteq \R^n$ is convex and $f : S \to \R$. A vector $p \in \R^n$ is a _supergradient_ for $f$ at $x \in S$ if
$$
f(y) \leq f(x) + p \cdot (y - x),
$$
for all $y \in S$.
:::
::: {.theorem}
Suppose that $S \subseteq \R^n$ is convex and $f : S \to \R$. If $f$ has a subgradient at every $x \in S$, then $f$ is convex.
:::
::: {.theorem}
Suppose that $S \subseteq \R^n$ is convex, $f : S \to \R$ is convex, and $x \in \interior (S)$. Then $f$ is continuous at $x$.
:::
::: {.theorem}
Suppose that $S \subseteq \R^n$ is convex, $f : S \to \R$ is convex, and $x \in \interior (S)$. Then $f$ has a subgradient at $x$.
:::
::: {.theorem}
Suppose that $S \subseteq \R^n$ is convex, $f : S \to \R$ is convex, $x \in \interior (S)$, and $f$ is differentiable at $x$. Then $\nabla f(x)$ is the unique subgradient of $f$ at $x$.
:::
::: {.theorem}
Suppose that $S \subseteq \R^n$ is convex and $f : S \to \R$. If $p_1$ is a subgradient of $f$ at $x_1$ and $p_2$ is a subgradient of $f$ at $x_2$, then $(p_1 - p_2) \cdot (x_1 - x_2) \geq 0$.
:::
::: {.theorem}
Suppose that $S \subseteq \R^n$ is open and convex, and $f : S \to \R$ is differentiable and convex. Then $\left(\nabla f(x_1) - \nabla f(x_2)\right) \cdot (x_1 - x_2) \geq 0$ for every $x_1, x_2 \in S$.
:::
## Support Functions
### Maximization
::: {.definition name="Barrier Cone"}
For any non-empty $S \subseteq \R^n$, the _barrier cone_ of $S$ is defined as
$$
b(S) := \left\{ p \in \R^n : \max_{x \in S} p \cdot x\ \text{is well-defined} \right\}.
$$
:::
::: {.definition name="Support Function"}
For any non-empty $S \subseteq \R^n$, the _support function_ of $S$ is a function $\sigma_S : b(S) \to \R$ such that
$$
\sigma_S (p) := \max_{x \in S} p \cdot x.
$$
:::
::: {.remark name="Cone"}
If $p \in b(S)$ and $\lambda \in \R_+$, then $\lambda p \in b(S)$ as well. Hence $b(S)$ is a _cone_ and, in particular, $0 \in b(S)$.
:::
::: {.proposition}
$b(S)$ need not be convex even if $S$ is.
:::
::: {.theorem}
Let $S \subseteq \R^n$ be non-empty.
* If $x_1$ solves $\displaystyle \max_{x \in S} p_1 \cdot x$ and $x_2$ solves $\displaystyle \max_{x \in S} p_2 \cdot x$, then $(x_1 - x_2)\cdot(p_1 - p_2) \geq 0$. This is known as monotonicity of solutions.
* $\sigma_S(tp) = t\sigma_S(p)$ for every $p \in b(S)$ and $t \geq 0$. Namely, $\sigma_S$ is homogeneous of degree 1.
* If $b(S)$ is convex, then $\sigma_S$ is convex.
* Suppose $S \subseteq \R^n$ is closed and convex, and $p \in b(S)$. $x_p$ solves $\displaystyle \max_{x \in S} p \cdot x$ if and only if $x_p$ is a subgradient of $\sigma_S$ at $p$.
* Suppose $S \subseteq \R^n$ is closed and convex, $p \in \interior (b(S))$ and $\sigma_S$ is differentiable at $p$. Then $x_p$ solves $\displaystyle \max_{x \in S} p \cdot x$ if and only if $x_p = \nabla \sigma_S (p)$. Also known as **Hotelling's Lemma** in profit maximization.
:::
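As a concrete check of the last bullet, take $S$ to be the closed unit ball in $\R^2$, for which $\sigma_S(p) = \lVert p \rVert$ and the maximizer is $x_p = p / \lVert p \rVert$; the numerical gradient of $\sigma_S$ recovers the maximizer (a base R sketch with central differences):
```{r}
sigma <- function(p) sqrt(sum(p^2))  # support function of the closed unit ball
p <- c(3, 4)
h <- 1e-6
num_grad <- sapply(1:2, function(j) {
  e <- replace(numeric(2), j, h)     # step h in coordinate j
  (sigma(p + e) - sigma(p - e)) / (2 * h)
})
num_grad      # approximately (0.6, 0.8)
p / sigma(p)  # x_p = p/||p|| = (0.6, 0.8)
```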
### Minimization
::: {.definition name="Barrier Cone"}
For any non-empty $S \subseteq \R^n$, the _barrier cone_ of $S$ is defined as
$$
b^-(S) := \left\{ p \in \R^n : \min_{x \in S} p \cdot x\ \text{is well-defined} \right\}.
$$
:::
::: {.definition name="Support Function"}
For any non-empty $S \subseteq \R^n$, the _support function_ of $S$ is a function $\tau_S : b^-(S) \to \R$ such that
$$
\tau_S (p) := \min_{x \in S} p \cdot x.
$$
:::
::: {.remark name="Cone"}
Note that $b^-(S) = -b(S) = \left\{ p \in \R^n : -p \in b(S) \right\}$. Furthermore, $\tau_S (p) = -\sigma_S (-p)$.
:::
::: {.theorem}
Let $S \subseteq \R^n$ be non-empty.
* If $x_1$ solves $\displaystyle \min_{x \in S} p_1 \cdot x$ and $x_2$ solves $\displaystyle \min_{x \in S} p_2 \cdot x$, then $(x_1 - x_2)\cdot(p_1 - p_2) \leq 0$. This is known as monotonicity of solutions.
* $\tau_S(tp) = t\tau_S(p)$ for every $p \in b^-(S)$ and $t \geq 0$. Namely, $\tau_S$ is homogeneous of degree 1.
* If $b^-(S)$ is convex, then $\tau_S$ is concave.
* Suppose $S \subseteq \R^n$ is closed and convex, and $p \in b^-(S)$. $x_p$ solves $\displaystyle \min_{x \in S} p \cdot x$ if and only if $x_p$ is a supergradient of $\tau_S$ at $p$.
* Suppose $S \subseteq \R^n$ is closed and convex, $p \in \interior (b^-(S))$ and $\tau_S$ is differentiable at $p$. Then $x_p$ solves $\displaystyle \min_{x \in S} p \cdot x$ if and only if $x_p = \nabla \tau_S (p)$. Also known as **Shephard's Lemma** in cost minimization.
:::
## Nonlinear Programming
### Minimization Problems