Trend Filtering - PETSc TAO with L1 Dictionary Regularization

Classic least squares optimization does not account for noise in the data: it fits the observations as closely as possible, noise included. Trend filtering using PETSc TAO with L1 dictionary regularization adds a penalty that keeps the least squares regression model from overfitting the noise, producing a fit that better captures the trend in the data. This implementation solves the following regularized least squares problem:

$$\min_x \frac{1}{2} \| Ax - b \|_2^2 + \lambda \| Dx \|_1$$
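
To make the two terms of this objective concrete, here is a minimal NumPy sketch that evaluates it for a given `x`. It is an illustration only: the function name `l1_dictionary_objective` and the toy data are assumptions made for this example and do not come from `tutorial_module.py` or from the PETSc TAO API.

```python
import numpy as np


def l1_dictionary_objective(A, b, D, x, lam):
    """Evaluate 0.5 * ||A x - b||_2^2 + lam * ||D x||_1 (hypothetical helper)."""
    residual = A @ x - b
    data_fit = 0.5 * residual @ residual      # least squares term
    penalty = lam * np.abs(D @ x).sum()       # L1 dictionary term
    return data_fit + penalty


# Toy problem (assumed values, for illustration only).
A = np.eye(3)
b = np.array([1.0, 2.0, 4.0])
D = np.eye(3)                                 # D = I: plain L1 regularization
x = np.array([1.0, 2.0, 3.0])

value = l1_dictionary_objective(A, b, D, x, lam=0.5)
print(value)
```

Raising `lam` weights the penalty more heavily relative to the data fit, which is the trade-off the regularization controls.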

The choice of the regularization weight $\lambda$ and of the dictionary matrix $D$ strongly influences how the regularization shapes the least squares fit. This method defaults to $D = I$, which gives standard L1 regularization, but $D$ can be replaced with a custom matrix for wavelets or domain-specific transforms.
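
As a sketch of how the choice of $D$ changes what is penalized, the NumPy example below compares $D = I$ with a second-order difference matrix, a common choice in the L1 trend filtering literature. This is an assumption for illustration: the helper `second_difference_matrix` is hypothetical, and the source does not state which custom $D$, if any, `tutorial_module.py` uses.

```python
import numpy as np


def second_difference_matrix(n):
    """Return the (n - 2) x n matrix whose rows are [1, -2, 1] (hypothetical helper)."""
    D = np.zeros((n - 2, n))
    for i in range(n - 2):
        D[i, i:i + 3] = [1.0, -2.0, 1.0]
    return D


n = 6
line = 2.0 * np.arange(n) + 1.0                      # perfectly linear signal
kinked = np.array([0.0, 1.0, 2.0, 3.0, 2.0, 1.0])    # one change of slope

D_identity = np.eye(n)
D_trend = second_difference_matrix(n)

# D = I penalizes the magnitude of every entry of x.
penalty_identity_line = np.abs(D_identity @ line).sum()

# The second-difference D penalizes only changes of slope.
penalty_trend_line = np.abs(D_trend @ line).sum()
penalty_trend_kinked = np.abs(D_trend @ kinked).sum()

print(penalty_identity_line, penalty_trend_line, penalty_trend_kinked)
```

With the difference matrix, a straight line incurs no penalty and each kink adds to it, so the L1 term favors piecewise-linear fits rather than sparse coefficients.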

The code for this experiment is a modification of the C tutorial posted here: tao-leastSquares. This code was first directly translated into Python to utilize the petsc4py package as the PETSc Python API, then adapted to fit the needs of trend filtering for the presentation notebook.

Calling the PETSc C implementation through petsc4py lets the code rely on PETSc's engine to speed up the vector-based computation. In the tutorial_presentation.ipynb notebook we apply this Python implementation, our interpretation of TAO with L1 dictionary regularization, to analyze CO2 trends over time from NOAA data, and then stock market trendlines with volatility analysis relative to the trendline.

AI Usage Experience

I noticed that the AI would often state that a result was the "correct final version" even though, when translating the code from C, it frequently fell back on a version that did not work. As a result, I ended up switching between Gemini and ChatGPT. From this experience, Gemini seems much stronger at reasoning its way to code that makes sense and at assisting with debugging. ChatGPT, however, worked extremely well at taking the Python code generated by Gemini and turning the raw translation of the C code into a class that I can call for trend analysis in the notebook.

In addition, both LLMs did extremely well at documenting the functions in the classes, which made the functions and their interactions easy to understand when building the presentation and visualizations notebook.

Based on this experience, I would not use LLMs for large tasks such as code translation: managing the translation, verifying that the outputs are correct, and debugging them took more time than I believe a full walk through the implementation by hand would have. Still, I would absolutely use LLMs for tasks such as function documentation or in-code docstrings, to keep the documentation consistent across the projects I develop.
