Restore basic link handling and add !pendinglinks command #23

Merged
Rajrooter merged 1 commit into main from codex/review-codebase-for-improvements-and-fixes-bk50zl on Feb 3, 2026

@Rajrooter Rajrooter commented Feb 3, 2026

User description

Motivation

  • The bot was not responding to dropped links because the cog lacked an on_message listener to detect and enqueue non-media URLs for review.
  • Users need a way to inspect their queued items, so a simple !pendinglinks command was added to allow review of pending links.

Description

  • Added a @commands.Cog.listener() on_message handler that ignores bots, processes commands, extracts URLs via URL_REGEX, filters with is_valid_url and is_media_url, and forwards accepted links to a handler.
  • Implemented _handle_link which builds a pending entry, calls link_preview and make_verdict_embed, persists the entry via storage.add_pending_link, sends an interactive LinkActionView, and stores the bot message id with storage.update_pending_with_bot_msg_id while tracking guild_pending_counts.
  • Added a pending_links_command exposed as !pendinglinks that retrieves pending links for the invoking user via storage.get_pending_links_for_user and displays them.
  • Kept integration with existing helpers and views (link_preview, make_verdict_embed, LinkActionView, is_valid_url, is_media_url, and storage APIs) so the change restores end-to-end link drop → review behavior.

Testing

  • No automated tests were executed for this change (review-only patch).


PR Type

Enhancement, Bug fix


Description

  • Added on_message listener to detect and process non-media URLs from messages

  • Implemented _handle_link method to create verdict embeds and store pending links

  • Added !pendinglinks command to display user's queued links for review

  • Enhanced download_bytes with scheme validation and Content-Length checks

  • Fixed TinyURL API endpoint to use HTTPS instead of HTTP


Diagram Walkthrough

flowchart LR
  A["Message Posted"] -->|"on_message listener"| B["Extract URLs"]
  B -->|"Filter non-media"| C["Valid Links"]
  C -->|"_handle_link"| D["Create Verdict Embed"]
  D -->|"Storage API"| E["Store Pending Link"]
  E -->|"LinkActionView"| F["Interactive Prompt"]
  G["!pendinglinks Command"] -->|"Query Storage"| H["Display User Links"]

File Walkthrough

Relevant files
Enhancement
main.py
Link detection, pending command, and download safety improvements

  • Added on_message listener to detect and enqueue non-media URLs from
    messages
  • Implemented _handle_link method that creates verdict embeds, stores
    entries, and sends interactive views
  • Added pending_links_command (!pendinglinks) to retrieve and display
    user's pending links
  • Enhanced download_bytes with URL scheme validation and Content-Length
    header checks before downloading
  • Fixed TinyURL API endpoint from HTTP to HTTPS for secure requests
+68/-1   

vercel Bot commented Feb 3, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project | Deployment | Actions | Updated (UTC)
discord-link-bot | Ready | Preview, Comment | Feb 3, 2026 2:17pm
discord-link-bot-wfz6 | Ready | Preview, Comment | Feb 3, 2026 2:17pm

@qodo-code-review

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
Unbounded link processing

Description: The new on_message listener processes and enqueues every non-media URL found in user
messages without rate limiting or a cap on URLs per message, enabling a realistic spam/DoS
vector (e.g., a single message containing hundreds of URLs causing repeated storage writes
and bot responses via _handle_link).
main.py [910-954]

Referred Code
@commands.Cog.listener()
async def on_message(self, message: discord.Message):
    if message.author.bot:
        return
    await self.bot.process_commands(message)
    if not message.content:
        return
    links = re.findall(URL_REGEX, message.content)
    if not links:
        return
    filtered = []
    for link in links:
        if not is_valid_url(link):
            continue
        if is_media_url(link):
            continue
        filtered.append(link)
    if not filtered:
        return
    for link in filtered:
        await self._handle_link(message, link)


 ... (clipped 24 lines)
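
One way to bound the work done per message, in the spirit of the finding above, is to deduplicate the extracted URLs and cap how many are forwarded to `_handle_link`. The sketch below is illustrative only: `SIMPLE_URL_REGEX`, `MAX_LINKS_PER_MESSAGE`, and `select_links` are hypothetical names, and the regex is a stand-in for the bot's own `URL_REGEX`, whose definition is not shown in this PR.

```python
import re
from typing import Callable, List

# Hypothetical stand-in for the bot's URL_REGEX (not shown in the PR).
SIMPLE_URL_REGEX = re.compile(r"https?://[^\s<>]+")

# Hypothetical cap on links handled per message.
MAX_LINKS_PER_MESSAGE = 5


def select_links(
    content: str,
    is_acceptable: Callable[[str], bool] = lambda url: True,
    limit: int = MAX_LINKS_PER_MESSAGE,
) -> List[str]:
    """Return at most `limit` unique, acceptable URLs in first-seen order."""
    selected: List[str] = []
    seen = set()
    for url in SIMPLE_URL_REGEX.findall(content):
        if url in seen or not is_acceptable(url):
            continue
        seen.add(url)
        selected.append(url)
        if len(selected) >= limit:
            break  # stop scanning once the cap is reached
    return selected
```

In the listener, `is_acceptable` could combine the existing `is_valid_url` and `is_media_url` checks. A per-message cap does not replace per-user rate limiting across messages, which would need separate state.
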
Sensitive link disclosure

Description: The !pendinglinks command posts the invoking user's pending URLs directly into the channel
(non-ephemeral), which can unintentionally disclose sensitive/private links to other
channel members if used in public channels.
main.py [955-966]

Referred Code
@commands.command(name="pendinglinks")
async def pending_links_command(self, ctx: commands.Context):
    links = await asyncio.to_thread(storage.get_pending_links_for_user, ctx.author.id)
    if not links:
        await safe_send(ctx, content="✅ No pending links.")
        return
    lines = []
    for idx, entry in enumerate(links, start=1):
        url = entry.get("url", "unknown")
        lines.append(f"{idx}. {url}")
    await safe_send(ctx, content="🕒 **Your pending links:**\n" + "\n".join(lines))
Ticket Compliance
🎫 No ticket provided
Codebase Duplication Compliance
Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance
🟢
Generic: Meaningful Naming and Self-Documenting Code

Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Secure Error Handling

Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.

Status: Passed

Generic: Secure Logging Practices

Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.

Status: Passed

🔴
Generic: Comprehensive Audit Trails

Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.

Status:
Missing audit logs: The PR adds persistence of pending links (create/update operations) but does not add any
audit logging capturing user, action, timestamp, and outcome for these critical state
changes.

Referred Code
async def _handle_link(self, message: discord.Message, link: str):
    verdict, reason = get_link_verdict()
    preview = await link_preview(link)
    embed = make_verdict_embed(link, verdict, reason, preview)
    entry = {
        "url": link,
        "timestamp": datetime.datetime.utcnow().isoformat(),
        "author": str(message.author),
        "user_id": message.author.id,
        "guild_id": message.guild.id if message.guild else None,
        "archived": False,
        "expires_at": None,
    }
    pending_id = await asyncio.to_thread(storage.add_pending_link, entry)
    view = LinkActionView(link, message.author.id, message, pending_id, self, ai_verdict=verdict)
    prompt_msg = await safe_send(message.channel, embed=embed, view=view)
    if prompt_msg:
        view.message = prompt_msg
        self.pending_links[prompt_msg.id] = pending_id
        await asyncio.to_thread(storage.update_pending_with_bot_msg_id, pending_id, prompt_msg.id)
        if message.guild:


 ... (clipped 1 lines)
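
A minimal sketch of the kind of audit record this finding asks for, capturing user, action, timestamp, and outcome as one JSON line. The logger name `audit`, the function `audit_event`, and the action string used in the usage note are all hypothetical; nothing like them appears in the PR.

```python
import datetime
import json
import logging
from typing import Any, Optional

audit_logger = logging.getLogger("audit")  # hypothetical logger name


def audit_event(action: str, user_id: Optional[int], outcome: str, **details: Any) -> str:
    """Build one JSON audit record, log it, and return the serialized line."""
    record = {
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "action": action,
        "user_id": user_id,
        "outcome": outcome,
        "details": details,
    }
    # default=str keeps serialization from failing on unexpected value types.
    line = json.dumps(record, sort_keys=True, default=str)
    audit_logger.info(line)
    return line
```

In `_handle_link`, a call such as `audit_event("pending_link.create", message.author.id, "success", pending_id=pending_id)` could follow the `storage.add_pending_link` call. Logging the pending id rather than the URL itself keeps potentially private links out of the logs.
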

Generic: Robust Error Handling and Edge Case Management

Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation

Status:
Unhandled async failures: New asynchronous network/storage operations (e.g., link_preview, safe_send, and storage.*
calls) are performed without exception handling or fallback messaging/logging, risking
silent task failures and hard-to-debug runtime errors.

Referred Code
async def _handle_link(self, message: discord.Message, link: str):
    verdict, reason = get_link_verdict()
    preview = await link_preview(link)
    embed = make_verdict_embed(link, verdict, reason, preview)
    entry = {
        "url": link,
        "timestamp": datetime.datetime.utcnow().isoformat(),
        "author": str(message.author),
        "user_id": message.author.id,
        "guild_id": message.guild.id if message.guild else None,
        "archived": False,
        "expires_at": None,
    }
    pending_id = await asyncio.to_thread(storage.add_pending_link, entry)
    view = LinkActionView(link, message.author.id, message, pending_id, self, ai_verdict=verdict)
    prompt_msg = await safe_send(message.channel, embed=embed, view=view)
    if prompt_msg:
        view.message = prompt_msg
        self.pending_links[prompt_msg.id] = pending_id
        await asyncio.to_thread(storage.update_pending_with_bot_msg_id, pending_id, prompt_msg.id)
        if message.guild:


 ... (clipped 13 lines)
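
One possible way to address this finding is to route each network or storage await through a small wrapper that logs the failing step and returns a fallback. This is a sketch under assumptions: `guarded` is a hypothetical helper, not part of the bot, and whether callers such as `make_verdict_embed` tolerate a `None` result is not confirmed by the PR.

```python
import asyncio
import logging
from typing import Awaitable, Callable, Optional, TypeVar

logger = logging.getLogger(__name__)
T = TypeVar("T")


async def guarded(
    step: str,
    make_call: Callable[[], Awaitable[T]],
    fallback: Optional[T] = None,
) -> Optional[T]:
    """Await make_call(); on failure, log the step name and return fallback."""
    try:
        return await make_call()
    except asyncio.CancelledError:
        raise  # never swallow task cancellation
    except Exception:
        # logger.exception records the traceback for later debugging.
        logger.exception("link handling step failed: %s", step)
        return fallback
```

A call site might look like `preview = await guarded("link_preview", lambda: link_preview(link))`, with the caller deciding what to do when the result is `None`.
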

Generic: Security-First Input Validation and Data Handling

Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities

Status:
URL output sanitization: The !pendinglinks command echoes stored URLs back to the channel without additional
sanitization/normalization beyond earlier filtering, which may enable unwanted
formatting/mention effects or unsafe schemes if storage can contain non-validated entries.

Referred Code
@commands.command(name="pendinglinks")
async def pending_links_command(self, ctx: commands.Context):
    links = await asyncio.to_thread(storage.get_pending_links_for_user, ctx.author.id)
    if not links:
        await safe_send(ctx, content="✅ No pending links.")
        return
    lines = []
    for idx, entry in enumerate(links, start=1):
        url = entry.get("url", "unknown")
        lines.append(f"{idx}. {url}")
    await safe_send(ctx, content="🕒 **Your pending links:**\n" + "\n".join(lines))
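
A minimal sketch of display-time sanitization for the stored URLs, assuming the goal is to re-check the scheme, neutralize mention-like text, and suppress link previews. `display_url` is a hypothetical helper. The result is intended for display only: inserting a zero-width space changes any URL that contains `@`.

```python
from typing import Optional
from urllib.parse import urlparse

ZERO_WIDTH_SPACE = "\u200b"


def display_url(url: str) -> Optional[str]:
    """Return a chat-safe rendering of url, or None if it should not be shown."""
    try:
        parsed = urlparse(url)
    except ValueError:
        return None  # e.g. malformed IPv6 literal
    if parsed.scheme not in {"http", "https"} or not parsed.netloc:
        return None
    # Drop characters that would end the <...> wrapper or split the line.
    cleaned = "".join(ch for ch in url if ch not in "<>\r\n")
    # Break up "@" so "@everyone" or "@here" inside a URL cannot ping.
    cleaned = cleaned.replace("@", "@" + ZERO_WIDTH_SPACE)
    # Angle brackets ask Discord not to generate a link preview.
    return f"<{cleaned}>"
```

Entries for which `display_url` returns `None` could be listed as a placeholder instead of being echoed. If `safe_send` forwards keyword arguments to discord.py (an assumption), passing `allowed_mentions=discord.AllowedMentions.none()` is another way to prevent pings without altering the text.
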

Compliance status legend 🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label

@Rajrooter Rajrooter merged commit f903533 into main Feb 3, 2026
7 checks passed
@qodo-code-review

PR Code Suggestions ✨

Explore these optional code suggestions:

Category | Suggestion | Impact
Security
Handle invalid Content-Length headers securely

To prevent a potential memory exhaustion vulnerability, handle invalid
Content-Length headers by logging the error and returning None instead of
silently passing.

main.py [206-210]

 try:
     if int(content_length) > MAX_DOWNLOAD_BYTES:
         return None
-except ValueError:
-    pass
+except (ValueError, TypeError):
+    logger.warning(f"Invalid Content-Length header received from {url}: {content_length}")
+    return None
Suggestion importance[1-10]: 9

Why: This suggestion addresses a security vulnerability where a malformed Content-Length header could bypass size checks, potentially leading to memory exhaustion. Explicitly handling the error and aborting the download is a critical security improvement.

High
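
The header check in this suggestion can be isolated into a small pure function, which makes the policy explicit and easy to test. This is a sketch: `content_length_allows` is a hypothetical name, and the `MAX_DOWNLOAD_BYTES` value here is illustrative since the bot defines its own.

```python
from typing import Optional

MAX_DOWNLOAD_BYTES = 10 * 1024 * 1024  # illustrative value; the bot defines its own


def content_length_allows(header: Optional[str], limit: int = MAX_DOWNLOAD_BYTES) -> bool:
    """Decide whether a download may proceed based on Content-Length alone."""
    if header is None or header.strip() == "":
        # No declared size: allow, but the body read must still be capped.
        return True
    try:
        declared = int(header.strip())
    except ValueError:
        return False  # malformed header: refuse rather than guess
    return 0 <= declared <= limit
```

The header is only a declaration by the server, so this check complements a cap on the bytes actually read rather than replacing it.
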
Possible issue
Limit download read size

To prevent excessive memory usage, read the response content in chunks up to
MAX_DOWNLOAD_BYTES + 1 and reject if the size limit is exceeded, instead of
reading the entire response at once.

main.py [196-218]

     async def download_bytes(url: str) -> Optional[bytes]:
         try:
             parsed = urlparse(url)
             if parsed.scheme not in {"http", "https"}:
                 return None
             async with aiohttp.ClientSession() as session:
                 async with session.get(url, timeout=aiohttp.ClientTimeout(total=30)) as resp:
                     if resp.status == 200:
                         content_length = resp.headers.get("Content-Length")
                         if content_length:
                             try:
                                 if int(content_length) > MAX_DOWNLOAD_BYTES:
                                     return None
                             except ValueError:
                                 pass
                         content_type = resp.headers.get('Content-Type', '')
                         if (not content_type) or any(ct in content_type for ct in ALLOWED_CONTENT_TYPES):
-                            data = await resp.read()
-                            if len(data) <= MAX_DOWNLOAD_BYTES:
-                                return data
+                            data = await resp.content.read(MAX_DOWNLOAD_BYTES + 1)
+                            if len(data) <= MAX_DOWNLOAD_BYTES:
+                                return data
         except Exception as e:
             logger.debug(f"download_bytes error: {e}")
         return None

[To ensure code accuracy, apply this suggestion manually]

Suggestion importance[1-10]: 8

Why: This suggestion correctly identifies that reading the entire response before checking its size is risky. By reading the response in chunks up to a limit, it prevents potential memory exhaustion from large files, which is a significant robustness and security improvement.

Medium
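
A caveat on the suggested change: aiohttp documents `StreamReader.read(n)` as reading up to `n` bytes, so a single call may return less than the full body even when the body is within the limit. A loop over chunks avoids that. The sketch below is library-agnostic and `read_capped` is a hypothetical helper; with aiohttp the chunk source could be `resp.content.iter_chunked(64 * 1024)`.

```python
import asyncio
from typing import AsyncIterator, Optional


async def read_capped(chunks: AsyncIterator[bytes], limit: int) -> Optional[bytes]:
    """Collect chunks until EOF; return None as soon as the total exceeds limit."""
    buffer = bytearray()
    async for chunk in chunks:
        buffer.extend(chunk)
        if len(buffer) > limit:
            return None  # stop early instead of buffering the whole body
    return bytes(buffer)
```

Inside `download_bytes`, this would replace the `resp.read()` call, for example `return await read_capped(resp.content.iter_chunked(64 * 1024), MAX_DOWNLOAD_BYTES)`.
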
Avoid channel spam from multiple links

To avoid channel spam, handle multiple links in a single message by using
MultiLinkSelectView instead of processing each link individually.

main.py [929-930]

-for link in filtered:
-    await self._handle_link(message, link)
+if len(filtered) == 1:
+    await self._handle_link(message, filtered[0])
+elif len(filtered) > 1:
+    # This part assumes MultiLinkSelectView and other components are available
+    # and correctly implemented to handle multiple links, as suggested by context.
+    # The exact implementation might need adjustment based on the definitions
+    # of MultiLinkSelectView and how it processes links_data.
+    links_data = [{"url": link} for link in filtered]
+    view = MultiLinkSelectView(links_data, message.author.id, message, self)
+    await safe_send(message.channel, content=multi_link_message(len(filtered)), view=view)
Suggestion importance[1-10]: 7

Why: The suggestion correctly identifies that processing multiple links individually will spam the channel and proposes using existing components to create a single, consolidated message, which significantly improves the user experience of the new feature.

Medium
General
Chunk long responses

To avoid errors with long lists of pending links, chunk the response message
into smaller segments before sending.

main.py [955-965]

     @commands.command(name="pendinglinks")
     async def pending_links_command(self, ctx: commands.Context):
         links = await asyncio.to_thread(storage.get_pending_links_for_user, ctx.author.id)
         if not links:
             await safe_send(ctx, content="✅ No pending links.")
             return
-        lines = []
-        for idx, entry in enumerate(links, start=1):
-            url = entry.get("url", "unknown")
-            lines.append(f"{idx}. {url}")
-        await safe_send(ctx, content="🕒 **Your pending links:**\n" + "\n".join(lines))
+        lines = [f"{idx}. {entry.get('url','unknown')}" for idx, entry in enumerate(links, start=1)]
+        header = "🕒 **Your pending links:**\n"
+        payload = header + "\n".join(lines)
+        for chunk in [payload[i:i+1900] for i in range(0, len(payload), 1900)]:
+            await safe_send(ctx, content=chunk)

[To ensure code accuracy, apply this suggestion manually]

Suggestion importance[1-10]: 6

Why: The suggestion correctly identifies a potential unhandled edge case where a large number of pending links could exceed Discord's message length limit. Implementing message chunking makes the command more robust and reliable.

Low
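
Slicing the payload every 1900 characters, as suggested above, can cut a URL in half across two messages. A variant that packs whole lines into each chunk is sketched below; `chunk_lines` is a hypothetical helper, and the 1900 default follows the suggestion's margin under Discord's 2000-character message limit.

```python
from typing import Iterable, List


def chunk_lines(lines: Iterable[str], limit: int = 1900) -> List[str]:
    """Pack lines into newline-joined chunks of at most `limit` characters."""
    chunks: List[str] = []
    current = ""
    for line in lines:
        # Hard-split any single line that cannot fit in one chunk by itself.
        pieces = [line[i:i + limit] for i in range(0, len(line), limit)] or [""]
        for piece in pieces:
            candidate = piece if not current else current + "\n" + piece
            if len(candidate) <= limit:
                current = candidate
            else:
                chunks.append(current)
                current = piece
    if current:
        chunks.append(current)
    return chunks
```

The command could then send the header line followed by each element of `chunk_lines(lines)` in its own `safe_send` call.
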

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5849931c03

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
