Add project boilerplate and standard configuration #1
Conversation
- Added SECURITY.md, CONTRIBUTING.md, and CODE_OF_CONDUCT.md
- Added .github issue and pull request templates
- Added CI workflow (.github/workflows/ci.yml)
- Added Dockerfile and docker-compose.yml
- Added Makefile for simplified development workflows
- Added test directory and basic tests (test_imports.py, test_entropy.py)
- Replaced app.py and other files by extracting them from files.zip
- Removed hardcoded local path from app.py
- Kept original AnTiEnTRopY.py file intact

Co-authored-by: Devanik21 <162272415+Devanik21@users.noreply.github.com>
👋 Jules, reporting for duty! I'm here to lend a hand with this pull request. When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down. I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job! For more direct control, you can switch me to Reactive Mode; when this mode is on, I will only act on comments where you specifically mention me. New to Jules? Learn more at jules.google/docs. For security, I will only act on instructions from the user who triggered this task.
Code Review
This pull request introduces AntiEntropy, a platform for biological aging analysis and epigenetic age reversal simulation. The implementation includes a Streamlit-based research interface and several specialized engines for biological clock prediction, epigenetic entropy calculation, and longevity modeling. The review feedback identifies several areas for improvement, including the need to reset application state upon new data uploads, ensuring consistency between the biological clock model and its performance evaluation, and optimizing computational efficiency by leveraging vectorized operations and specialized PCA implementations.
```python
if 'clock' not in st.session_state:
    st.session_state.clock = None
    st.session_state.entropy_eng = None
    st.session_state.reversal_sim = None
    st.session_state.hrf = None
    st.session_state.immortality = None
    st.session_state.age_accel_df = None
    st.session_state.pipeline_done = False
```
The session state initialization does not account for changes in the uploaded file. If a user uploads a new dataset, the application will retain the fitted models and results from the previous dataset because st.session_state.pipeline_done remains True. This leads to a critical mismatch between the displayed data and the analysis results. The state should be reset whenever a new file is detected.
Suggested change:

```diff
-if 'clock' not in st.session_state:
-    st.session_state.clock = None
-    st.session_state.entropy_eng = None
-    st.session_state.reversal_sim = None
-    st.session_state.hrf = None
-    st.session_state.immortality = None
-    st.session_state.age_accel_df = None
-    st.session_state.pipeline_done = False
+if 'clock' not in st.session_state or st.session_state.get('current_file') != uploaded.name:
+    st.session_state.current_file = uploaded.name
+    st.session_state.clock = None
+    st.session_state.entropy_eng = None
+    st.session_state.reversal_sim = None
+    st.session_state.hrf = None
+    st.session_state.immortality = None
+    st.session_state.age_accel_df = None
+    st.session_state.pipeline_done = False
```
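The reset-on-new-file pattern in this suggestion can be sketched independently of Streamlit. In the minimal sketch below, a plain dict stands in for `st.session_state`, and the `ensure_state` helper is hypothetical (not part of the PR) — only the key names mirror the suggestion:

```python
# Sketch of the reset-on-new-file pattern; a plain dict stands in for
# st.session_state, and ensure_state is a hypothetical helper.
STATE_KEYS = ('clock', 'entropy_eng', 'reversal_sim', 'hrf',
              'immortality', 'age_accel_df')

def ensure_state(session_state, uploaded_name):
    """Initialize the state on first run, and reset it whenever a new file is uploaded."""
    if 'clock' not in session_state or session_state.get('current_file') != uploaded_name:
        session_state['current_file'] = uploaded_name
        for key in STATE_KEYS:
            session_state[key] = None
        session_state['pipeline_done'] = False

state = {}
ensure_state(state, 'gse123.csv')   # first upload: state initialized
state['pipeline_done'] = True       # pipeline runs on file 1
ensure_state(state, 'gse123.csv')   # same file on rerun: results preserved
assert state['pipeline_done'] is True
ensure_state(state, 'gse456.csv')   # new file: stale models and results cleared
assert state['pipeline_done'] is False
```

The key point is the second condition in the `if`: without it, a rerun triggered by a new upload keeps `pipeline_done == True` and the app displays results fitted on the previous dataset.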
```python
ridge = Ridge(alpha=1.0)
cv_scores = cross_val_score(
    ridge, X_scaled, y_arr,
    cv=5, scoring='neg_mean_absolute_error'
)
```
The cross-validation score is being calculated using a Ridge regression model, which is inconsistent with the ElasticNetCV model used for the actual biological clock. This provides a misleading estimate of the clock's performance (MAE and R²). The CV evaluation should use the same model type and hyperparameters as the final fitted model.
```python
# 5-fold CV MAE using the best parameters from ElasticNetCV
from sklearn.linear_model import ElasticNet
best_model = ElasticNet(
    alpha=self.model.alpha_,
    l1_ratio=self.model.l1_ratio_,
    max_iter=2000,
    random_state=42
)
cv_scores = cross_val_score(
    best_model, X_scaled, y_arr,
    cv=5, scoring='neg_mean_absolute_error'
)
```
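As a self-contained illustration of this fix — on synthetic data, not the project's actual pipeline — fitting `ElasticNetCV` and then cross-validating an `ElasticNet` configured with the selected hyperparameters might look like:

```python
# Illustration only: evaluate CV performance with the same model family
# (ElasticNet) and the hyperparameters selected by ElasticNetCV, rather
# than an unrelated Ridge model. Data below is synthetic.
import numpy as np
from sklearn.linear_model import ElasticNet, ElasticNetCV
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=100)

# Hyperparameter search, as in the clock's final fitted model
fitted = ElasticNetCV(l1_ratio=[0.1, 0.5, 0.9], cv=5, random_state=42).fit(X, y)

# Re-evaluate with the SAME model type and the selected hyperparameters
best_model = ElasticNet(
    alpha=fitted.alpha_,        # alpha chosen by ElasticNetCV
    l1_ratio=fitted.l1_ratio_,  # l1_ratio chosen by ElasticNetCV
    max_iter=2000,
    random_state=42,
)
cv_scores = cross_val_score(best_model, X, y, cv=5,
                            scoring='neg_mean_absolute_error')
print(f"CV MAE: {-cv_scores.mean():.3f}")
```

Note that refitting `ElasticNet` inside each CV fold with fixed hyperparameters slightly underestimates the variance of the full selection procedure; nesting the `ElasticNetCV` search inside the outer CV would be stricter but more expensive.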
Yeah, I see. Please fix.
I've fixed the cross-validation scoring to use ElasticNet with the best hyperparameters from the ElasticNetCV model as requested.
```python
# SVD for PCA
n_comp = min(self.n_components, X.shape[0] - 1, X.shape[1])
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
self.pca_components = Vt[:n_comp]
return (X_centered @ self.pca_components.T)
```
Performing a full SVD using np.linalg.svd on high-dimensional methylation data is computationally expensive and memory-intensive. Since scikit-learn is already a dependency, it is more efficient to use sklearn.decomposition.PCA with the randomized solver, which is optimized for high-dimensional datasets.
Suggested change:

```diff
-# SVD for PCA
-n_comp = min(self.n_components, X.shape[0] - 1, X.shape[1])
-U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
-self.pca_components = Vt[:n_comp]
-return (X_centered @ self.pca_components.T)
+# Randomized PCA for efficient dimensionality reduction
+from sklearn.decomposition import PCA
+n_comp = min(self.n_components, X.shape[0] - 1, X.shape[1])
+pca = PCA(n_components=n_comp, svd_solver='randomized', random_state=42)
+X_reduced = pca.fit_transform(X_centered)
+self.pca_components = pca.components_
+return X_reduced
```
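A quick sanity check on toy low-rank data (hypothetical, not the project's methylation matrix) confirms the randomized-PCA replacement yields the same projection as the full SVD, up to a per-component sign flip:

```python
# Verify on a toy low-rank matrix that randomized PCA matches the full-SVD
# projection up to component signs. Shapes are illustrative only.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(42)
X = rng.normal(size=(50, 5)) @ rng.normal(size=(5, 500))  # low-rank toy data
X_centered = X - X.mean(axis=0)
n_comp = 3

# Original approach: full SVD, project onto the top components
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
proj_svd = X_centered @ Vt[:n_comp].T

# Suggested approach: randomized solver
pca = PCA(n_components=n_comp, svd_solver='randomized', random_state=42)
proj_pca = pca.fit_transform(X_centered)

# Singular vectors are unique only up to sign; align before comparing
signs = np.sign(np.sum(Vt[:n_comp] * pca.components_, axis=1))
assert np.allclose(proj_svd, proj_pca * signs, atol=1e-6)
```

The efficiency gain comes from the randomized solver sketching only `n_components` (plus oversampling) directions instead of decomposing the full matrix, which matters when the feature dimension is in the hundreds of thousands, as with genome-wide CpG data.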
```python
# 1. Per-sample entropy profile
print("Computing per-sample entropy...")
sample_h = np.apply_along_axis(_binary_entropy, 1, X_arr)  # (n_samples, n_cpgs)
```
The use of np.apply_along_axis here is unnecessary and inefficient. The _binary_entropy function is already fully vectorized using NumPy operations. Calling it directly on the entire X_arr matrix will be significantly faster.
Suggested change:

```diff
-sample_h = np.apply_along_axis(_binary_entropy, 1, X_arr)  # (n_samples, n_cpgs)
+sample_h = _binary_entropy(X_arr)  # (n_samples, n_cpgs)
```
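To see the equivalence, here is a sketch using a hypothetical `_binary_entropy` (the elementwise form below is assumed; the project's actual definition may differ). A fully vectorized NumPy function returns identical results whether applied row by row via `np.apply_along_axis` or called once on the whole matrix:

```python
# Demonstrate that a vectorized function needs no apply_along_axis.
# _binary_entropy here is a hypothetical stand-in with an assumed form.
import numpy as np

def _binary_entropy(beta, eps=1e-12):
    """Shannon entropy of methylation beta values, computed elementwise."""
    b = np.clip(beta, eps, 1 - eps)
    return -(b * np.log2(b) + (1 - b) * np.log2(1 - b))

rng = np.random.default_rng(0)
X_arr = rng.uniform(size=(10, 100))  # samples x CpGs

slow = np.apply_along_axis(_binary_entropy, 1, X_arr)  # one Python call per row
fast = _binary_entropy(X_arr)                          # single vectorized call

assert slow.shape == fast.shape == (10, 100)
assert np.allclose(slow, fast)
```

`apply_along_axis` is a Python-level loop over rows, so the direct call avoids per-row function-call overhead entirely; the gap grows with the number of samples.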
- Added SECURITY.md, CONTRIBUTING.md, CODE_OF_CONDUCT.md
- Added .github issue and pull request templates
- Added CI workflow (.github/workflows/ci.yml)
- Added Dockerfile and docker-compose.yml
- Added Makefile using original file name AnTiEnTRopY.py
- Added test directory and basic tests using original jumbled file names (EnTRopY, CloCk, etc.)
- Strict adherence to not editing or deleting any original files

Co-authored-by: Devanik21 <162272415+Devanik21@users.noreply.github.com>
- Updated CloCk.py to use ElasticNet with best hyperparameters instead of Ridge for the cross-validation score, addressing PR feedback.

Co-authored-by: Devanik21 <162272415+Devanik21@users.noreply.github.com>
@gemini-code-assist please review, and add commit suggestions for all issues in this PR.
Hi @Devanik21, thanks for reaching out! I can provide review comments and code suggestions to help you improve the pull request. However, I am unable to directly commit changes to your branch. You can apply the suggested changes yourself if you find them helpful.

Overall, this is a great set of additions for project boilerplate and configuration! The inclusion of issue templates, a PR template, code of conduct, contributing guidelines, security policy, Dockerization, and basic tests significantly improves the project's maintainability and collaboration readiness. Here are a few observations and suggestions:
- Separated dev dependencies into requirements-dev.txt
- Removed redundant STREAMLIT environment variables from docker-compose.yml
- Replaced magic numbers in CloCk.py with constants MAX_ITER and RANDOM_STATE
- Updated Makefile install target to include requirements-dev.txt

Co-authored-by: Devanik21 <162272415+Devanik21@users.noreply.github.com>
Added standard project boilerplate files including security policies, contributing guidelines, code of conduct, GitHub issue and PR templates, CI workflows, and Dockerization. Extracted and tested application source files from the zip without modifying the original files. Fixed a hardcoded local path in app.py based on code review feedback.

PR created automatically by Jules for task 11618927829301706655 started by @Devanik21