In this notebook, we will explore least squares, one of the first fundamental optimization problems encountered in this course. We'll focus on two classical problems:
- Polynomial regression
- Denoising a signal using least squares with regularization (penalization) methods.
This first lab session (TP) involves the practical implementation of least-squares problems using two classical Python libraries:
- NumPy: For efficient numerical computations, matrix operations, and solving least-squares problems using built-in linear algebra functions.
- Matplotlib: For visualizing the data and displaying / interpreting results.
Running the Notebook¶
This notebook can be executed in the following environments:
- Google Colab: A convenient, cloud-based option that requires no setup. Simply open the notebook in Colab, and you’re ready to run the code.
- Locally: You can run the notebook on your local machine using environments like JupyterLab or Jupyter Notebook. You can also run it directly in Visual Studio Code if you have the Jupyter extension. In all cases, ensure you have Python installed along with the required libraries, NumPy and Matplotlib.
# load necessary dependencies
import numpy as np
import matplotlib.pyplot as plt
1. Polynomial regression¶
This first exercise reviews basic Python instructions as well as standard functions from NumPy and Matplotlib. The goal is to estimate the parameters of a polynomial model from experimental (i.e. noisy) data using the least-squares method. We'll first define the true model, generate some noisy synthetic data, and then estimate the model parameters.
Consider a ground-truth polynomial model $y^\star(t) = \sum_{k=0}^{d^\star} x^\star_k t^k$ for $t \in [0, T]$.
Questions
1. Plot the function $y^\star$ on $[0, T]$.
2. Given $n$ equispaced time instants $t_1, \dots, t_n$ over $[0, T]$, generate a noisy vector of measurements $y \in \mathbb{R}^n$ with entries $y_i = y^\star(t_i) + \varepsilon_i$, where $\varepsilon_i \sim \mathcal{N}(0, \sigma^2)$ is Gaussian. Plot the noisy measurements and the ground-truth model on the same graph.

From the data $(t_i, y_i)_{i=1,\dots,n}$, we wish to fit a polynomial model of order $d$ of the form

$$y(t) = \sum_{k=0}^{d} x_k t^k.$$

3. Write the linear model matrix $A \in \mathbb{R}^{n \times (d+1)}$ for an arbitrary order $d$. Construct this matrix numerically, given $d$ and the vector of time instants $t = (t_1, \dots, t_n)$. (Hint: check the np.vander function.)
4. Given an integer $d$, write and solve the associated least-squares problem $\min_{x} \|A x - y\|_2^2$. Display the estimated vector of coefficients $\hat{x}$. Compare the solution obtained by the explicit expression $\hat{x} = (A^\top A)^{-1} A^\top y$ with that obtained with the np.linalg.pinv function.
5. Plot the estimated polynomial model for several values of $d$. Comment. What happens when $d$ becomes large?
6. (Bonus) The choice of the degree of the polynomial can be evaluated using a standard criterion such as the Akaike Information Criterion (AIC), defined as $\mathrm{AIC}(d) = 2k + n \log\left(\|r\|_2^2 / n\right)$, where $k = d + 1$ is the number of parameters to be estimated and $\|r\|_2 = \|y - A\hat{x}\|_2$ is the norm of the residual (in this context). Plot this criterion for the different choices of $d$. For which value of $d$ is the minimum attained?
7. (Bonus bis) Play around with different values of $n$, the true polynomial $y^\star$, or the noise level $\sigma$!
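As a minimal sketch of the workflow above (the true polynomial, noise level, and problem sizes below are illustrative placeholders, not the values from the original notebook):

```python
import numpy as np

# Assumed example setup: a quadratic ground truth with small Gaussian noise.
rng = np.random.default_rng(0)
n = 50
t = np.linspace(0.0, 1.0, n)             # equispaced time instants
x_true = np.array([1.0, -2.0, 3.0])      # coefficients of 1, t, t^2 (placeholder)
y_clean = np.polyval(x_true[::-1], t)    # np.polyval expects highest degree first
y = y_clean + 0.1 * rng.standard_normal(n)

d = 2                                     # polynomial order to fit
A = np.vander(t, d + 1, increasing=True)  # columns: 1, t, ..., t^d

# Least-squares solution via the normal equations...
x_ne = np.linalg.solve(A.T @ A, A.T @ y)
# ...and via the pseudo-inverse; both agree when A has full column rank.
x_pinv = np.linalg.pinv(A) @ y

print("normal equations:", x_ne)
print("pseudo-inverse:  ", x_pinv)
```

For larger $d$, solving the normal equations directly becomes ill-conditioned (the Vandermonde matrix has nearly collinear columns), which is one reason to compare against np.linalg.pinv.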
# question 1
# question 2
# question 3
# question 4
# question 5
# question 6: Bonus
2. Denoising¶
This second problem is the denoising problem, which corresponds to the reconstruction of a signal $s$ from noisy measurements $y = s + \varepsilon$, where $\varepsilon$ is some noise. A typical assumption is that the noise is Gaussian, i.e., $\varepsilon(t_i) \sim \mathcal{N}(0, \sigma^2)$ for a given time instant $t_i$, where $\sigma^2$ is the noise variance.
The denoising problem can be formulated as a least squares problem. Let $t_1, \dots, t_n$ be time instants and denote by $s, y \in \mathbb{R}^n$ the vectors encoding the signal and the measurements, respectively. The denoising problem seeks to solve

$$\min_{x \in \mathbb{R}^n} \frac{1}{2} \|x - y\|_2^2.$$
Preliminary questions
- What is the solution to the least squares problem $\min_{x \in \mathbb{R}^n} \frac{1}{2} \|x - y\|_2^2$?
Usually, we have access to extra information about the signal $s$: e.g. it is non-negative, it belongs to some constraint set, it exhibits regularity in some representation domain, etc. A classical assumption is that the signal is smooth. At first order, this is encoded by the function

$$\phi(x) = \sum_{i=1}^{n-1} (x_{i+1} - x_i)^2.$$

- Show that $\phi$ can be expressed as $\phi(x) = \|D x\|_2^2$, where $D \in \mathbb{R}^{(n-1) \times n}$ is the first-order finite-difference matrix with entries $D_{i,i} = -1$, $D_{i,i+1} = 1$, and zero elsewhere.
To take the smoothness of the solution into account in the denoising problem, we formulate a new problem, called regularized least squares or penalized least squares. It reads

$$\min_{x \in \mathbb{R}^n} \frac{1}{2} \|x - y\|_2^2 + \frac{\lambda}{2} \|D x\|_2^2,$$

where
- $\frac{1}{2} \|x - y\|_2^2$ is called the data fidelity term;
- $\frac{1}{2} \|D x\|_2^2$ is the regularization (or penalty) and $\lambda > 0$ is called the regularization parameter.

- Compute the solution $\hat{x}_\lambda$ to the regularized least squares problem above with $\phi(x) = \|D x\|_2^2$. Comment on the uniqueness of the solution. What happens when $\lambda \to 0$? When $\lambda \to +\infty$?
Write your answers to the preliminary questions here using Markdown
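As a quick sanity check for the first-order finite-difference matrix, here is a minimal NumPy sketch (the np.eye-based construction is one possible choice, not necessarily the one expected in the TP):

```python
import numpy as np

def diff_matrix(n):
    """First-order finite-difference matrix D of shape (n-1, n),
    such that (D @ x)[i] = x[i+1] - x[i]."""
    return np.eye(n - 1, n, k=1) - np.eye(n - 1, n)

D = diff_matrix(5)
x = np.array([0.0, 1.0, 3.0, 6.0, 10.0])
print(D @ x)  # successive differences: [1. 2. 3. 4.]

# phi(x) = sum of squared successive differences = ||D x||^2
phi = np.sum(np.diff(x) ** 2)
print(phi, np.sum((D @ x) ** 2))
```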
Implementation tasks
- Generate a smooth signal $s$ (e.g. a sinusoid of frequency $f_0$ Hz observed over a duration of $T$ s). Choose a sampling frequency $f_s$ that satisfies the Shannon-Nyquist criterion, i.e. $f_s > 2 f_0$.
- Generate noisy measurements $y$ by corrupting the signal vector $s$ with additive white Gaussian noise at SNR = 10 dB. (Recall the formula for the SNR: $\mathrm{SNR} = 10 \log_{10}\left(\|s\|_2^2 / \|\varepsilon\|_2^2\right)$ dB!)
- Solve the regularized least squares problem for different values of $\lambda$, and plot the corresponding solutions $\hat{x}_\lambda$. Compare them to the ground-truth signal.
- A common question in penalized optimization problems is how to choose the hyperparameter $\lambda$. A classical approach is a heuristic called the L-curve: for various values of $\lambda$, plot the data fidelity term $\frac{1}{2} \|\hat{x}_\lambda - y\|_2^2$ versus the penalty term $\frac{1}{2} \|D \hat{x}_\lambda\|_2^2$ of the cost function. From this tool, what would be a good choice of $\lambda$ according to you? Why?
- Start the exercise again with the second-order penalty function $\phi_2(x) = \sum_{i=2}^{n-1} (x_{i+1} - 2 x_i + x_{i-1})^2 = \|D_2 x\|_2^2$, where $D_2 \in \mathbb{R}^{(n-2) \times n}$ is the second-order finite-difference matrix. Compare your results with the first-order difference penalty.
# question 1
# question 2
# question 3
# question 4
# question 5