Skip to content

Lea/lokali#374

Open
LeahMalul wants to merge 13 commits intomainfrom
lea/lokali
Open

Lea/lokali#374
LeahMalul wants to merge 13 commits intomainfrom
lea/lokali

Conversation

@LeahMalul
Copy link
Copy Markdown
Collaborator

No description provided.

@greptile-apps
Copy link
Copy Markdown

greptile-apps Bot commented Nov 12, 2025

Greptile Overview

Greptile Summary

This PR implements a comprehensive refactoring across the AgCloud platform with three main themes: certificate handling flexibility, fruit data processing reorganization, and infrastructure scaling. The changes modify Docker certificate installation across 16 services to be conditional rather than mandatory, preventing build failures when certificates aren't present in different deployment environments. The fruit processing pipeline is restructured with new Kafka topics (inference.dispatched.fruit, inference.dispatched.camera) and updated MinIO event prefixes from "fruits/" to "fruit/tree/" to support hierarchical fruit classification. Infrastructure improvements include scaling Flink TaskManager resources (doubled slots, increased memory) and enhanced JSON response handling in HTTP dispatchers. The changes also add a new fruit-publisher simulator service and improve MQTT topic parsing flexibility for device identification.

Important Files Changed

Filename Score Overview
services/flink_writer_db/Dockerfile.flink 1/5 Critical bug: certificates copied to wrong directory, breaking certificate processing
GUI/src/vast/runner/Dockerfile 1/5 Flawed conditional certificate copying with tar commands that won't work properly
services/inference_http/app.py 2/5 Response structure change unpacking result dictionary could overwrite response keys
services/inference_http/adapters/fruit_segmentation_runner.py 2/5 Early return in loop breaks multi-fruit processing functionality
services/sounds_flink/Dockerfile 2/5 Certificate handling logic issues and removed COPY command affects functionality
services/alerts_forwarder/Dockerfile.flink 2/5 Conditional certificate check without COPY instruction makes condition always false
services/image-linker/Dockerfile.flink 2/5 Syntax error and certificate file existence check issues
GUI/src/vast/gateway/Dockerfile 2/5 Conflicting certificate handling with both conditional and unconditional approaches
services/API-notifications/src/Dockerfile 2/5 Tar-based certificate copying logic won't work as expected
GUI/src/vast/desktop/Dockerfile 2/5 Duplicated and conflicting certificate handling logic
simulators/docker-compose.yml 3/5 Volume mapping inconsistency for fruit-publisher metadata directory
docker-compose.yml 3/5 Volume permission change from read-only to read-write introduces security risk
streaming/flink/jobs/http_dispatcher.py 3/5 Complex event parsing changes remove validation logic with potential issues
storage_with_mqtt/mqtt_images/mqtt_ingest/app.py 3/5 MQTT topic parsing refactor adds complexity and potential debugging overhead
services/plant_stress/Dockerfile 3/5 Service activation with hardcoded credentials in environment variables
storage_with_mqtt/storage/Lifecycle_rules/minio-bootstrap/entrypoint/init.sh 4/5 Clean path structure change from "fruits/" to "fruit/tree/" prefix
services/inference_http/model_registry.py 4/5 Simple team identifier change from "fruit_defect" to "fruit"
simulators/Dockerfile 4/5 Well-implemented conditional certificate installation with proper error handling
services/ripeness-ml/deploy/Dockerfile 4/5 Robust conditional certificate processing with proper checks
services/db_api_service/app/contracts/Dockerfile 4/5 Clean conditional certificate installation preventing build failures
services/db_api_service/Dockerfile 4/5 Defensive programming improvement for certificate handling
storage_with_mqtt/storage/Lifecycle_rules/minio-bootstrap/Dockerfile 4/5 Resilient certificate handling preventing build failures
services/compression/Dockerfile 4/5 Graceful optional certificate installation with conditional logic
mqtt_and_kafka/kafka/kafka-files/create-topics.sh 5/5 Clean addition of inference topics for fruit and camera processing
services/sounds_classifier/Dockerfile.classifier-svc 5/5 Well-implemented conditional certificate installation making build resilient
GUI/src/vast/services/Dockerfile 5/5 Robust certificate handling with proper existence checks

Confidence score: 2/5

  • This PR contains several critical bugs and implementation issues that could cause build failures and runtime problems
  • Score lowered due to multiple Dockerfiles with broken certificate copying logic, early return bugs in fruit processing, and conflicting certificate handling approaches in several services
  • Pay close attention to services/flink_writer_db/Dockerfile.flink, GUI/src/vast/runner/Dockerfile, and services/inference_http/adapters/fruit_segmentation_runner.py which have critical functionality-breaking issues

Sequence Diagram

sequenceDiagram
    participant User
    participant DockerBuild as "Docker Build Process"
    participant CertManager as "Certificate Manager"
    participant PipInstaller as "Pip Installer"
    participant SystemPackages as "System Packages"
    participant VirtualEnv as "Virtual Environment"
    participant MinIOClient as "MinIO Client"
    participant KafkaConnector as "Kafka Connector"
    participant FlinkRuntime as "Flink Runtime"
    
    User->>DockerBuild: "Build container with NetFree support"
    DockerBuild->>CertManager: "Check for certificates in certs/ directory"
    alt Certificate files found
        CertManager->>CertManager: "Copy .crt files to ca-certificates"
        CertManager->>CertManager: "Update system certificate store"
        CertManager-->>DockerBuild: "Certificates installed"
    else No certificates
        CertManager-->>DockerBuild: "Skip certificate installation"
    end
    
    DockerBuild->>SystemPackages: "Install system dependencies"
    SystemPackages->>SystemPackages: "Install ca-certificates, curl, python3, build tools"
    SystemPackages-->>DockerBuild: "System packages ready"
    
    DockerBuild->>VirtualEnv: "Create Python virtual environment"
    VirtualEnv->>PipInstaller: "Install Python dependencies"
    PipInstaller->>PipInstaller: "Configure SSL certificates for pip"
    PipInstaller->>PipInstaller: "Install requirements.txt packages"
    alt NetFree environment
        PipInstaller->>PipInstaller: "Use trusted hosts configuration"
        PipInstaller->>PipInstaller: "Apply certificate bundle settings"
    end
    PipInstaller-->>VirtualEnv: "Dependencies installed"
    VirtualEnv-->>DockerBuild: "Virtual environment ready"
    
    DockerBuild->>KafkaConnector: "Download Kafka connector JARs"
    KafkaConnector->>KafkaConnector: "Download flink-connector-kafka JAR"
    KafkaConnector->>KafkaConnector: "Download kafka-clients JAR"
    KafkaConnector-->>DockerBuild: "Kafka connectors ready"
    
    DockerBuild->>MinIOClient: "Setup MinIO client"
    MinIOClient->>MinIOClient: "Copy mc binary from official image"
    MinIOClient->>MinIOClient: "Configure MinIO aliases and endpoints"
    MinIOClient-->>DockerBuild: "MinIO client configured"
    
    DockerBuild->>FlinkRuntime: "Configure Flink environment"
    FlinkRuntime->>FlinkRuntime: "Set Python executable paths"
    FlinkRuntime->>FlinkRuntime: "Configure JAR dependencies"
    FlinkRuntime->>FlinkRuntime: "Copy application code"
    FlinkRuntime-->>DockerBuild: "Flink runtime ready"
    
    DockerBuild-->>User: "Container build complete with NetFree compatibility"
Loading

Copy link
Copy Markdown

@greptile-apps greptile-apps Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additional Comments (4)

  1. services/inference_http/adapters/fruit_segmentation_runner.py, line 17 (link)

    syntax: Duplicate import of datetime - already imported on line 3

  2. services/image-linker/Dockerfile.flink, line 30 (link)

    syntax: Extra indentation before RUN command causes syntax error

  3. GUI/src/vast/gateway/Dockerfile, line 40-42 (link)

    logic: This unconditional COPY will fail if certs directory doesn't exist, breaking builds when USE_NETFREE=false. Remove this block since conditional certificate handling already exists above (lines 23-33).

  4. streaming/flink/jobs/http_dispatcher.py, line 177 (link)

    logic: Missing validation - payload sent with potentially null bucket/key values

26 files reviewed, 22 comments

Edit Code Review Agent Settings | Greptile

env_file: .env.fruit
volumes:
- ./data/fruit/images:/data/fruit/images:ro
- ./data/fruit/metadata:/data/metadata:ro
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logic: Inconsistent volume mapping - should be /data/fruit/metadata to match the pattern used by other publishers

Suggested change
- ./data/fruit/metadata:/data/metadata:ro
- ./data/fruit/metadata:/data/fruit/metadata:ro

"ok": True,
"team": TEAM,
"result": result,
**result,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logic: Unpacking result with **result could overwrite critical response fields if result contains conflicting keys like 'ok', 'team', 'image_uri', etc. Does the model runner guarantee that result will never contain keys that conflict with the response structure?

COPY certs/ /tmp/certs/
RUN if [ -d certs ]; then \
echo "Copying certs directory..."; \
tar -cf - certs | tar -xf -; \
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logic: Critical logic error: copying to root directory instead of /tmp/certs/. This breaks certificate processing on lines 15-21 which expects files in /tmp/certs/

Suggested change
tar -cf - certs | tar -xf -; \
tar -cf - certs | tar -xf - -C /tmp/;

Comment on lines +8 to +15
RUN if [ -f netfree-ca.crt ]; then \
echo "Installing netfree-ca.crt..."; \
cp netfree-ca.crt /usr/local/share/ca-certificates/netfree-ca.crt; \
chmod 644 /usr/local/share/ca-certificates/netfree-ca.crt; \
update-ca-certificates; \
else \
echo "No netfree-ca.crt found, skipping."; \
fi
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logic: missing COPY command - the RUN command checks for netfree-ca.crt but there's no COPY instruction to make the file available in the build context

COPY certs/*.crt /usr/local/share/ca-certificates/
RUN update-ca-certificates

RUN if [ -f netfree-ca.crt ]; then \
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logic: File existence check assumes netfree-ca.crt is in the build context root, but no COPY command brings it there. Should there be a COPY command to bring netfree-ca.crt into the container before checking for its existence?

event_id = event.get("event_id") or str(uuid.uuid4())

# 3) Extract bucket/key – support both native and MinIO S3-event formats
import sys, urllib.parse
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style: Importing modules inside function reduces performance and violates PEP 8

Suggested change
import sys, urllib.parse
import sys
import urllib.parse

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

raw_key = s3_block.get("object", {}).get("key")
if raw_key:
key = urllib.parse.unquote(raw_key)
# ננקה imagery/ כפול אם מופיע
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style: Hebrew comment may cause encoding issues in some environments

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

# --- DEBUG + try to detect device from filename ---
filename_base = os.path.splitext(result["filename"])[0]
if "-" in filename_base:
possible_device = filename_base.split("_")[0]
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logic: The split should use '-' not '_' to match the condition check

Suggested change
possible_device = filename_base.split("_")[0]
possible_device = filename_base.split("-")[0]

Comment on lines +213 to +214
if FORCE_DEVICE_ID:
device_id = FORCE_DEVICE_ID
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style: Inconsistent indentation - should use 4 spaces to match the rest of the file

Suggested change
if FORCE_DEVICE_ID:
device_id = FORCE_DEVICE_ID
if FORCE_DEVICE_ID:
device_id = FORCE_DEVICE_ID

Comment on lines +233 to +240
if ns == "imagery":
subpath = "/".join(parts[idx + 1 : -3])
if subpath:
key = f"{subpath}/{device_name}/{date_part}/{result['publish_ts_ms']}/{result['filename']}"
else:
key = f"{topdir}/{device_name}/{date_part}/{result['publish_ts_ms']}/{result['filename']}"
else:
key = f"{topdir}/{device_name}/{date_part}/{result['publish_ts_ms']}/{result['filename']}"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style: Inconsistent indentation - should use 4 spaces to match the rest of the file

Suggested change
if ns == "imagery":
subpath = "/".join(parts[idx + 1 : -3])
if subpath:
key = f"{subpath}/{device_name}/{date_part}/{result['publish_ts_ms']}/{result['filename']}"
else:
key = f"{topdir}/{device_name}/{date_part}/{result['publish_ts_ms']}/{result['filename']}"
else:
key = f"{topdir}/{device_name}/{date_part}/{result['publish_ts_ms']}/{result['filename']}"
if ns == "imagery":
subpath = "/".join(parts[idx + 1 : -3])
if subpath:
key = f"{subpath}/{device_name}/{date_part}/{result['publish_ts_ms']}/{result['filename']}"
else:
key = f"{topdir}/{device_name}/{date_part}/{result['publish_ts_ms']}/{result['filename']}"
else:
key = f"{topdir}/{device_name}/{date_part}/{result['publish_ts_ms']}/{result['filename']}"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant