PART 0 — Foundations of Quantum Machine Learning

Why Part 0 Matters

Quantum Machine Learning (QML) is often presented as something mysterious or futuristic. In reality, it is a natural extension of classical machine learning into quantum state spaces.

Before discussing algorithms, speedups, or applications, we must understand what replaces vectors, layers, weights, and nonlinearities in a quantum setting. This part establishes the conceptual and mathematical foundations required to understand why QML works at all.


0.1 A Common Mathematical Language: Linear Algebra

Both machine learning and quantum computing are fundamentally based on linear algebra.

In classical machine learning:

Data is represented as vectors:

 x \in \mathbb{R}^n

Models are functions parameterized by matrices and vectors:

 f_\theta(x) = Wx + b

Learning means adjusting parameters to minimize a loss.

In quantum computing:

Information is represented as state vectors:

 |\psi\rangle \in \mathbb{C}^{2^n}

Computation is performed via linear transformations:

 |\psi\rangle \rightarrow U|\psi\rangle

where (U) is a unitary matrix.

📌 Key observation
Both paradigms manipulate vectors using matrices. The difference lies in:

  • the space (real vs complex),
  • the constraints (unitary vs arbitrary),
  • and how outputs are extracted (measurement vs direct readout).

This shared foundation is what makes QML possible.
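
To make the parallel concrete, here is a minimal NumPy sketch (all matrices and values are arbitrary illustrative choices) contrasting a classical affine layer with a unitary acting on a state vector:

```python
import numpy as np

# Classical layer: an arbitrary affine map f(x) = Wx + b on real vectors.
W = np.array([[0.5, -1.0], [2.0, 0.3]])
b = np.array([0.1, -0.2])
x = np.array([1.0, 2.0])
print(W @ x + b)                               # output read off directly

# Quantum analogue: a unitary acting on a complex state vector.
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)   # Hadamard gate
psi = np.array([1, 0], dtype=complex)          # |0>
psi_out = H @ psi
print(np.allclose(H.conj().T @ H, np.eye(2)))  # unitarity: U†U = I -> True
print(np.abs(psi_out) ** 2)                    # outputs only via measurement probabilities
```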


0.2 Classical Bits and Quantum Qubits

Classical Bit

A classical bit can take only one of two values:

 b \in \{0,1\}

At any instant, the bit is in a definite state.


Quantum Qubit

A qubit, by contrast, exists in a superposition of basis states:

 |\psi\rangle = \alpha|0\rangle + \beta|1\rangle

with the normalization condition:

 |\alpha|^2 + |\beta|^2 = 1

  • (\alpha, \beta \in \mathbb{C}) are probability amplitudes
  • Measurement yields:
    • outcome (0) with probability (|\alpha|^2)
    • outcome (1) with probability (|\beta|^2)

📌 Important distinction
A qubit is not “partly 0 and partly 1” in a classical sense.
It is a vector in a complex vector space, and probabilities emerge only upon measurement.
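
A minimal NumPy sketch of this (the amplitudes below are an arbitrary illustrative choice): the amplitudes define a probability distribution, and only sampling from it, i.e. measuring, produces classical outcomes.

```python
import numpy as np

rng = np.random.default_rng(0)

# An illustrative qubit state alpha|0> + beta|1>; amplitudes are complex.
alpha, beta = 0.6, 0.8j
psi = np.array([alpha, beta])
assert np.isclose(np.linalg.norm(psi), 1.0)   # |alpha|^2 + |beta|^2 = 1

# Measurement in the computational basis samples from the amplitudes.
probs = np.abs(psi) ** 2                      # [0.36, 0.64]
samples = rng.choice([0, 1], size=1000, p=probs)
print(probs, samples.mean())                  # mean of samples ≈ 0.64
```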


0.3 Geometric Interpretation: The Bloch Sphere


Any pure qubit state can be written as:

 |\psi\rangle = \cos(\theta/2)|0\rangle + e^{i\phi}\sin(\theta/2)|1\rangle

This maps to a point on the Bloch sphere:

  • North pole → (|0\rangle)
  • South pole → (|1\rangle)
  • Equator → equal superpositions

🧠 Machine learning intuition
A single qubit encodes two continuous parameters ((\theta, \phi)), similar to real-valued features, but constrained to lie on the surface of a sphere.
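
As a small sketch, the Bloch parameterization above translates directly into code (the test angles are illustrative):

```python
import numpy as np

def bloch_state(theta: float, phi: float) -> np.ndarray:
    """Pure qubit state at Bloch angles (theta, phi)."""
    return np.array([np.cos(theta / 2),
                     np.exp(1j * phi) * np.sin(theta / 2)])

print(bloch_state(0.0, 0.0))        # north pole: |0>
print(bloch_state(np.pi, 0.0))      # south pole: |1> (up to numerical noise)
print(bloch_state(np.pi / 2, 0.0))  # equator: (|0> + |1>)/sqrt(2)
```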


0.4 Multi-Qubit Systems and Tensor Products

When combining qubits, we use the tensor product.

Two-qubit state:

 |\psi\rangle = \sum_{i,j \in \{0,1\}} \alpha_{ij}|ij\rangle

This state lives in a 4-dimensional complex space.

 Number of qubits   Dimension
 1                  (2)
 2                  (4)
 (n)                (2^n)

⚠️ This exponential growth is often misunderstood as free computational power.

📌 Crucial point
The state lives in a (2^n)-dimensional complex space, but measuring (n) qubits returns only (n) classical bits per shot.
This is why QML is powerful, yet not trivially exploitable.
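
A short NumPy sketch of the growth (using the (|+\rangle) state purely for illustration): the state vector doubles in length with each added qubit, via the Kronecker product `np.kron`.

```python
import numpy as np
from functools import reduce

plus = np.array([1, 1]) / np.sqrt(2)          # (|0> + |1>)/sqrt(2)

# Tensor products multiply dimensions: n qubits live in C^(2^n).
for n in [1, 2, 3, 10]:
    state = reduce(np.kron, [plus] * n)
    print(n, state.shape)                     # (2,), (4,), (8,), (1024,)

# Yet a measurement of 10 qubits returns only 10 classical bits per shot.
```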


0.5 Tensor Products of Operators: Understanding (H \otimes I)

Quantum gates also combine via tensor products.

 H \otimes I

means:

  • Apply Hadamard (H) to the first qubit
  • Apply identity (I) to the second qubit

Acting on a basis state:

 (H \otimes I)|01\rangle = (H|0\rangle) \otimes |1\rangle

🧠 ML analogy
This is analogous to transforming one feature while leaving another unchanged.
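
A quick NumPy check of this identity (standard basis vectors, nothing assumed beyond the definitions above):

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)  # Hadamard
I = np.eye(2)

ket0 = np.array([1.0, 0.0])
ket1 = np.array([0.0, 1.0])
ket01 = np.kron(ket0, ket1)                   # |01>

lhs = np.kron(H, I) @ ket01                   # (H ⊗ I)|01>
rhs = np.kron(H @ ket0, ket1)                 # (H|0>) ⊗ |1>
print(np.allclose(lhs, rhs))                  # True
```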


0.6 Entanglement: Beyond Classical Correlations

Separable (non-entangled) state:

 |\psi\rangle = |\phi\rangle \otimes |\chi\rangle

Entangled state (Bell state):

 |\Phi^+\rangle = \frac{|00\rangle + |11\rangle}{\sqrt{2}}


This state cannot be written as a tensor product of two single-qubit states.

📌 Why this matters for QML

  • Classical ML models interactions using extra parameters
  • Quantum models generate non-factorizable feature interactions naturally

Entanglement acts as a built-in inductive bias for modeling complex dependencies.
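
One standard way to test separability numerically is the Schmidt rank: reshape the four amplitudes into a 2×2 matrix and count its nonzero singular values. A minimal sketch (the product state chosen for comparison is arbitrary):

```python
import numpy as np

bell = np.array([1, 0, 0, 1]) / np.sqrt(2)        # (|00> + |11>)/sqrt(2)
product = np.kron(np.array([1, 1]) / np.sqrt(2),  # |+> ⊗ |0>
                  np.array([1, 0]))

def schmidt_rank(state: np.ndarray) -> int:
    """Rank 1 means the state factorizes into a tensor product;
    rank > 1 means it is entangled."""
    singular_values = np.linalg.svd(state.reshape(2, 2), compute_uv=False)
    return int(np.sum(singular_values > 1e-12))

print(schmidt_rank(product))  # 1 -> separable
print(schmidt_rank(bell))     # 2 -> entangled
```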


0.7 Quantum Gates as Trainable Parameters

Quantum gates are unitary transformations:

 U^\dagger U = I

Of special importance are parameterized rotation gates:

 R_x(\theta), \quad R_y(\theta), \quad R_z(\theta)

Example:

 R_y(\theta) = \begin{bmatrix} \cos(\theta/2) & -\sin(\theta/2) \\ \sin(\theta/2) & \cos(\theta/2) \end{bmatrix}

📌 In QML:

  • These (\theta) values are learnable parameters
  • They play the same role as weights in neural networks
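
A minimal sketch of (R_y(\theta)) as a function of its trainable parameter (the test angles are arbitrary):

```python
import numpy as np

def ry(theta: float) -> np.ndarray:
    """Parameterized Y-rotation; theta plays the role of a weight."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

U = ry(0.7)
print(np.allclose(U.T @ U, np.eye(2)))   # unitarity (U is real, so U† = Uᵀ)
print(ry(np.pi) @ np.array([1.0, 0.0]))  # rotates |0> into |1>
```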

0.8 Measurement as the Source of Nonlinearity

Quantum evolution is strictly linear:

 |\psi\rangle \rightarrow U|\psi\rangle

However, measurement produces nonlinear classical outputs:

 \hat{y} = \langle \psi | O | \psi \rangle

where (O) is an observable (e.g., Pauli-Z).

🧠 Critical insight
Measurement plays the role of the activation function.
Without measurement, a quantum circuit is a purely linear map and cannot, by itself, produce the nonlinear input-output behavior that learning requires.
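
A small sketch of the readout (the state below is an arbitrary illustrative choice): the expectation value is quadratic in the amplitudes, which is exactly where the nonlinearity enters.

```python
import numpy as np

Z = np.diag([1.0, -1.0])                      # Pauli-Z observable

def expectation(psi: np.ndarray, O: np.ndarray) -> float:
    """<psi| O |psi>: the classical output, quadratic in the amplitudes."""
    return float(np.real(psi.conj() @ O @ psi))

psi = np.array([np.cos(0.35), np.sin(0.35)])  # illustrative state
print(expectation(psi, Z))                    # cos(0.7): nonlinear in the angle
```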


0.9 The Hybrid Quantum–Classical Learning Paradigm

Modern QML models are hybrid systems:

Classical data → Quantum encoding
               → Parameterized quantum circuit
               → Measurement
               → Classical loss + optimizer
               → Parameter update

  • Quantum computer: evaluates complex functions
  • Classical computer: performs optimization

This is not a limitation—it is a design principle.
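
To tie the pieces together, here is a minimal end-to-end sketch of the loop for a one-qubit model in NumPy (the target, initial angle, learning rate, and step count are arbitrary illustrative choices). The gradient uses the parameter-shift rule, which is exact for rotation gates:

```python
import numpy as np

Z = np.diag([1.0, -1.0])                       # observable to measure

def ry(theta):
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

def model(theta: float) -> float:
    """Prepare |0>, apply Ry(theta), measure <Z>; equals cos(theta)."""
    psi = ry(theta) @ np.array([1.0, 0.0])
    return float(psi @ Z @ psi)

target = -1.0                                  # toy task: drive <Z> to -1
theta, lr = 0.1, 0.4
for _ in range(200):
    # Parameter-shift rule: exact derivative of the expectation value.
    d_model = 0.5 * (model(theta + np.pi / 2) - model(theta - np.pi / 2))
    d_loss = 2.0 * (model(theta) - target) * d_model
    theta -= lr * d_loss                       # classical parameter update

print(theta, model(theta))                     # theta ≈ pi, <Z> ≈ -1
```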


Summary of Part 0

By the end of Part 0, we have established:

✔ QML is linear algebra on quantum states
✔ Qubits generalize classical features
✔ Entanglement encodes complex feature interactions
✔ Trainable gates replace weights
✔ Measurement provides nonlinearity
