{ "cells": [ { "cell_type": "markdown", "id": "header", "metadata": {}, "source": [ "# Least Squares: Finding the Best Fit\n", "\n", "When you have more equations than unknowns, or when your data is noisy, you can't always solve $A\\mathbf{x} = \\mathbf{b}$ exactly. Instead, you find the **best approximate solution** using least squares.\n", "\n", "## The Problem\n", "\n", "Suppose you want to fit a line $y = mx + c$ to data points that don't line up perfectly. You want to find $m$ and $c$ that minimize the total error.\n", "\n", "This is an **overdetermined system**: more equations (data points) than unknowns ($m$ and $c$)." ] }, { "cell_type": "code", "execution_count": null, "id": "setup", "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import matplotlib.pyplot as plt" ] }, { "cell_type": "markdown", "id": "example-data", "metadata": {}, "source": [ "## Example: Fitting a Line\n", "\n", "Let's generate some noisy data and fit a line to it:" ] }, { "cell_type": "code", "execution_count": null, "id": "generate-data", "metadata": {}, "outputs": [], "source": [ "# Generate noisy data\n", "np.random.seed(42)\n", "x_data = np.linspace(0, 10, 20)\n", "y_true = 2.5 * x_data + 1.5 # True line: y = 2.5x + 1.5\n", "y_data = y_true + np.random.normal(0, 2, size=x_data.shape) # Add noise\n", "\n", "# Plot the data\n", "plt.figure(figsize=(10, 6))\n", "plt.scatter(x_data, y_data, label='Noisy data', alpha=0.6)\n", "plt.plot(x_data, y_true, 'k--', label='True line: y = 2.5x + 1.5', alpha=0.5)\n", "plt.xlabel('x')\n", "plt.ylabel('y')\n", "plt.legend()\n", "plt.title('Noisy Data Points')\n", "plt.grid(True, alpha=0.3)\n", "plt.show()\n", "\n", "print(f\"We have {len(x_data)} data points but only 2 unknowns (m and c)\")\n", "print(\"This is an overdetermined system!\")" ] }, { "cell_type": "markdown", "id": "formulation", "metadata": {}, "source": [ "## Setting Up the Problem\n", "\n", "For each data point $(x_i, y_i)$, we want:\n", "\n", "$$\n", "y_i \\approx mx_i + c\n", "$$\n", "\n", "In matrix form:\n", "\n", "$$\n", "A\\mathbf{x} \\approx \\mathbf{b}\n", "$$\n", "\n", "Where:\n", "\n", "$$\n", "A = \\begin{bmatrix}\n", "x_1 & 1 \\\\\n", "x_2 & 1 \\\\\n", "\\vdots & \\vdots \\\\\n", "x_n & 1\n", "\\end{bmatrix}, \\quad\n", "\\mathbf{x} = \\begin{bmatrix}\n", "m \\\\\n", "c\n", "\\end{bmatrix}, \\quad\n", "\\mathbf{b} = \\begin{bmatrix}\n", "y_1 \\\\\n", "y_2 \\\\\n", "\\vdots \\\\\n", "y_n\n", "\\end{bmatrix}\n", "$$" ] }, { "cell_type": "code", "execution_count": null, "id": "setup-matrices", "metadata": {}, "outputs": [], "source": [ "# Build the matrix A and vector b\n", "A = np.column_stack([x_data, np.ones_like(x_data)])\n", "b = y_data\n", "\n", "print(\"Matrix A (first 5 rows):\")\n", "print(A[:5])\n", "print(f\"\\nShape of A: {A.shape}\")\n", "print(f\"Shape of b: {b.shape}\")\n", "print(f\"\\nWe have {A.shape[0]} equations but only {A.shape[1]} unknowns\")" ] }, { "cell_type": "markdown", "id": "solving", "metadata": {}, "source": [ "## Solving with Least Squares\n", "\n", "The least squares solution minimizes the sum of squared errors:\n", "\n", "$$\n", "\\text{minimize} \\quad \\|A\\mathbf{x} - \\mathbf{b}\\|^2\n", "$$\n", "\n", "Use `np.linalg.lstsq()` to find the best-fit solution:" ] }, { "cell_type": "code", "execution_count": null, "id": "lstsq-solve", "metadata": {}, "outputs": [], "source": [ "# Solve using least squares\n", "solution, residuals, rank, singular_values = np.linalg.lstsq(A, b, rcond=None)\n", "\n", "m_fit, c_fit = solution\n", "\n", "print(f\"Best-fit line: y = {m_fit:.3f}x + {c_fit:.3f}\")\n", "print(f\"True line: y = 2.500x + 1.500\")\n", "print(f\"\\nTotal squared error: {residuals[0]:.3f}\")" ] }, { "cell_type": "markdown", "id": "visualization", "metadata": {}, "source": [ "## Visualizing the Fit\n", "\n", "Let's plot the data and the fitted line:" ] }, { "cell_type": "code", "execution_count": null, "id": "plot-fit", "metadata": {}, "outputs": [], "source": [ "# Generate fitted line\n", "y_fit = m_fit * x_data + c_fit\n", "\n", "plt.figure(figsize=(10, 6))\n", "plt.scatter(x_data, y_data, label='Data', alpha=0.6, s=50)\n", "plt.plot(x_data, y_true, 'k--', label='True line', alpha=0.5, linewidth=2)\n", "plt.plot(x_data, y_fit, 'r-', label=f'Fitted line: y = {m_fit:.2f}x + {c_fit:.2f}', \n", " linewidth=2)\n", "\n", "# Show residuals (errors)\n", "for i in range(len(x_data)):\n", " plt.plot([x_data[i], x_data[i]], [y_data[i], y_fit[i]], 'g-', alpha=0.3)\n", "\n", "plt.xlabel('x')\n", "plt.ylabel('y')\n", "plt.legend()\n", "plt.title('Least Squares Line Fitting')\n", "plt.grid(True, alpha=0.3)\n", "plt.show()\n", "\n", "print(\"Green lines show the residuals (errors) that are minimized\")" ] }, { "cell_type": "markdown", "id": "normal-equations", "metadata": {}, "source": [ "## The Normal Equations\n", "\n", "Mathematically, the least squares solution satisfies the **normal equations**:\n", "\n", "$$\n", "A^T A \\mathbf{x} = A^T \\mathbf{b}\n", "$$\n", "\n", "The solution is:\n", "\n", "$$\n", "\\mathbf{x} = (A^T A)^{-1} A^T \\mathbf{b}\n", "$$\n", "\n", "Let's verify this gives the same answer:" ] }, { "cell_type": "code", "execution_count": null, "id": "normal-equations-solve", "metadata": {}, "outputs": [], "source": [ "# Solve using normal equations\n", "x_normal = np.linalg.inv(A.T @ A) @ A.T @ b\n", "\n", "print(\"Using lstsq:\")\n", "print(f\" m = {m_fit:.6f}, c = {c_fit:.6f}\")\n", "print(\"\\nUsing normal equations:\")\n", "print(f\" m = {x_normal[0]:.6f}, c = {x_normal[1]:.6f}\")\n", "print(\"\\nThey match! But lstsq() is more numerically stable.\")" ] }, { "cell_type": "markdown", "id": "polynomial-fit", "metadata": {}, "source": [ "## Fitting a Polynomial\n", "\n", "Least squares isn't limited to straight lines. Let's fit a quadratic: $y = ax^2 + bx + c$" ] }, { "cell_type": "code", "execution_count": null, "id": "polynomial-data", "metadata": {}, "outputs": [], "source": [ "# Generate quadratic data with noise\n", "np.random.seed(42)\n", "x_data = np.linspace(-3, 3, 30)\n", "y_true = 0.5 * x_data**2 - 2 * x_data + 1 # True: y = 0.5x^2 - 2x + 1\n", "y_data = y_true + np.random.normal(0, 1, size=x_data.shape)\n", "\n", "# Build matrix for quadratic: [x^2, x, 1]\n", "A_poly = np.column_stack([x_data**2, x_data, np.ones_like(x_data)])\n", "\n", "# Solve\n", "coeffs, *_ = np.linalg.lstsq(A_poly, y_data, rcond=None)\n", "a, b, c = coeffs\n", "\n", "print(f\"Fitted quadratic: y = {a:.3f}x² + {b:.3f}x + {c:.3f}\")\n", "print(f\"True quadratic: y = 0.500x² - 2.000x + 1.000\")\n", "\n", "# Plot\n", "y_fit = a * x_data**2 + b * x_data + c\n", "\n", "plt.figure(figsize=(10, 6))\n", "plt.scatter(x_data, y_data, label='Data', alpha=0.6)\n", "plt.plot(x_data, y_true, 'k--', label='True curve', alpha=0.5)\n", "plt.plot(x_data, y_fit, 'r-', \n", " label=f'Fitted: y = {a:.2f}x² + {b:.2f}x + {c:.2f}', linewidth=2)\n", "plt.xlabel('x')\n", "plt.ylabel('y')\n", "plt.legend()\n", "plt.title('Least Squares Quadratic Fitting')\n", "plt.grid(True, alpha=0.3)\n", "plt.show()" ] }, { "cell_type": "markdown", "id": "numpy-polyfit", "metadata": {}, "source": [ "## NumPy's Convenience Function\n", "\n", "For polynomial fitting, NumPy provides `np.polyfit()` as a shortcut:" ] }, { "cell_type": "code", "execution_count": null, "id": "polyfit", "metadata": {}, "outputs": [], "source": [ "# Fit a 2nd degree polynomial using polyfit\n", "coeffs_polyfit = np.polyfit(x_data, y_data, deg=2)\n", "\n", "print(\"Using lstsq manually:\")\n", "print(f\" [a, b, c] = [{a:.3f}, {b:.3f}, {c:.3f}]\")\n", "print(\"\\nUsing np.polyfit:\")\n", "print(f\" [a, b, c] = [{coeffs_polyfit[0]:.3f}, {coeffs_polyfit[1]:.3f}, {coeffs_polyfit[2]:.3f}]\")\n", "print(\"\\nSame result!\")" ] }, { "cell_type": "markdown", "id": "real-world", "metadata": {}, "source": [ "## Real-World Example: Sensor Calibration\n", "\n", "Suppose you have a temperature sensor that gives readings, and you want to calibrate it against a reference thermometer:" ] }, { "cell_type": "code", "execution_count": null, "id": "sensor-calibration", "metadata": {}, "outputs": [], "source": [ "# Sensor readings vs. true temperatures\n", "sensor_readings = np.array([20.5, 25.3, 30.1, 35.2, 40.0, 45.3, 50.1])\n", "true_temps = np.array([21.0, 26.0, 31.0, 36.0, 41.0, 46.0, 51.0])\n", "\n", "# Fit calibration curve: true_temp = m * sensor + c\n", "A_calib = np.column_stack([sensor_readings, np.ones_like(sensor_readings)])\n", "calib_coeffs, *_ = np.linalg.lstsq(A_calib, true_temps, rcond=None)\n", "\n", "m_calib, c_calib = calib_coeffs\n", "\n", "print(f\"Calibration equation: true_temp = {m_calib:.4f} × sensor + {c_calib:.4f}\")\n", "\n", "# Test it\n", "test_reading = 37.5\n", "calibrated_temp = m_calib * test_reading + c_calib\n", "print(f\"\\nSensor reads {test_reading}°C\")\n", "print(f\"Calibrated temperature: {calibrated_temp:.2f}°C\")\n", "\n", "# Plot calibration\n", "plt.figure(figsize=(8, 6))\n", "plt.scatter(sensor_readings, true_temps, label='Calibration data', s=50)\n", "x_line = np.linspace(20, 51, 100)\n", "y_line = m_calib * x_line + c_calib\n", "plt.plot(x_line, y_line, 'r-', label='Calibration curve', linewidth=2)\n", "plt.xlabel('Sensor Reading (°C)')\n", "plt.ylabel('True Temperature (°C)')\n", "plt.legend()\n", "plt.title('Sensor Calibration using Least Squares')\n", "plt.grid(True, alpha=0.3)\n", "plt.show()" ] }, { "cell_type": "markdown", "id": "summary", "metadata": {}, "source": [ "## Summary\n", "\n", "- **Least squares** finds the best approximate solution when $A\\mathbf{x} = \\mathbf{b}$ has no exact solution\n", "- It minimizes the sum of squared errors: $\\|A\\mathbf{x} - \\mathbf{b}\\|^2$\n", "- Use `np.linalg.lstsq(A, b)` for the general case\n", "- Use `np.polyfit(x, y, deg)` for polynomial fitting\n", "- The solution is given by the normal equations: $\\mathbf{x} = (A^T A)^{-1} A^T \\mathbf{b}$\n", "\n", "## Applications\n", "\n", "Least squares is everywhere:\n", "- **Curve fitting** in data analysis\n", "- **Linear regression** in machine learning\n", "- **Sensor calibration** in engineering\n", "- **Signal denoising** in DSP\n", "- **Image reconstruction** in computer vision\n", "\n", "## Next Steps\n", "\n", "Now explore how matrices are used in:\n", "- {doc}`dsp` - Digital signal processing\n", "- {doc}`graphics` - Computer graphics transformations\n", "- {doc}`beamforming` - Antenna array processing" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "name": "python", "version": "3.12.0" } }, "nbformat": 4, "nbformat_minor": 5 }