@airajena
Contributor

Description:

This PR implements an automated mechanism to fetch public Slack conversations and archive them as static AsciiDoc files within the Fineract documentation. This directly addresses FINERACT-2171 by making ephemeral community knowledge indexable by search engines and permanently accessible.

Changes

  • New Script: Added fineract-doc/slack.gradle to handle the fetching and generation logic.
  • Build Integration: Updated fineract-doc/build.gradle to include the archiving task as a dependency of asciidoctor.
  • Documentation: Updated the main index.adoc to include the new "Slack Archive" chapter.
  • Safeguards: Implemented robust checks for missing tokens (CI/CD safe), rate limiting, and privacy control using an allowlist mechanism.
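The safeguards listed above could look roughly like the following. This is a minimal sketch in Java (one of the implementation languages proposed later in this thread), not the PR's Groovy script; the class name, method names, channel names, and the one-second fallback delay are all assumptions made for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical sketch; names here are illustrative and not taken from the PR.
final class SlackArchiveSafeguards {

    // Returns the token only if it is present and non-blank, so a CI run
    // without SLACK_TOKEN can skip archiving instead of failing the build.
    static Optional<String> resolveToken(Map<String, String> env) {
        String token = env.get("SLACK_TOKEN");
        return (token == null || token.isBlank()) ? Optional.empty() : Optional.of(token.trim());
    }

    // Keeps only channels that were explicitly allowlisted; everything else is dropped.
    static List<String> allowedChannels(List<String> requested, Set<String> allowlist) {
        return requested.stream().filter(allowlist::contains).collect(Collectors.toList());
    }

    // Slack signals rate limiting with HTTP 429 and a Retry-After header in seconds.
    static long retryDelayMillis(int status, String retryAfter) {
        if (status != 429) {
            return 0L;
        }
        try {
            return Long.parseLong(retryAfter.trim()) * 1000L;
        } catch (RuntimeException e) {
            return 1000L; // missing or malformed header: assumed 1 s fallback
        }
    }

    public static void main(String[] args) {
        Optional<String> token = resolveToken(System.getenv());
        if (token.isEmpty()) {
            System.out.println("SLACK_TOKEN not set; skipping Slack archive generation.");
            return;
        }
        // Only the channel list is logged; the token itself is never printed.
        System.out.println("Archiving channels: "
                + allowedChannels(List.of("fineract", "random"), Set.of("fineract")));
    }
}
```

Passing the environment in as a map keeps the missing-token path easy to exercise without touching the real process environment.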

Testing

  • Local Build: Verified that ./gradlew :fineract-doc:asciidoctor succeeds both with and without SLACK_TOKEN.
  • Formatting: Verified that generated AsciiDoc tables render correctly with timestamps and usernames.
  • Code Quality: Ran spotlessApply to ensure the new Gradle script follows project formatting standards.
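For reference, rendering messages as an AsciiDoc table with timestamps and usernames might be sketched as below. The column layout, the UTC timestamp format, and all names are assumptions; the PR's actual output format is not shown in this thread. Slack's `ts` value is treated as epoch seconds followed by a fractional part.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.List;

// Hypothetical sketch; the table layout is an assumption, not the PR's format.
final class AsciiDocTableSketch {

    static final class Message {
        final String ts;
        final String user;
        final String text;

        Message(String ts, String user, String text) {
            this.ts = ts;
            this.user = user;
            this.text = text;
        }
    }

    private static final DateTimeFormatter FMT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss 'UTC'").withZone(ZoneOffset.UTC);

    // Converts a Slack ts such as "86400.000100" to a readable UTC timestamp.
    static String formatTs(String slackTs) {
        long seconds = Long.parseLong(slackTs.split("\\.")[0]);
        return FMT.format(Instant.ofEpochSecond(seconds));
    }

    // Escapes the cell separator and flattens newlines so one message stays in one cell.
    static String cell(String s) {
        return s.replace("|", "\\|").replace("\n", " ");
    }

    static String render(List<Message> messages) {
        StringBuilder sb = new StringBuilder();
        sb.append("[cols=\"2,1,5\", options=\"header\"]\n");
        sb.append("|===\n");
        sb.append("| Timestamp | User | Message\n");
        for (Message m : messages) {
            sb.append("\n| ").append(formatTs(m.ts));
            sb.append("\n| ").append(cell(m.user));
            sb.append("\n| ").append(cell(m.text)).append('\n');
        }
        return sb.append("|===\n").toString();
    }
}
```

Escaping `|` matters because an unescaped pipe inside a message would start a new table cell and shift every following column.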

Related Issue

Fixes FINERACT-2171

outputs.dir outputDir

doLast {
def token = System.getenv('SLACK_TOKEN')
Contributor

The Gradle task pulls Slack messages via the Slack API using a token from SLACK_TOKEN. If this token is accidentally leaked (in build logs, CI cache, config, etc), it could expose your Slack workspace. The script doesn’t include any secure handling/obfuscation of the token – it’s passed raw to HTTP connections.

Risk

Token might be logged in CI output.

Token could end up in Gradle caches, backups, or public artifacts if misconfigured.

Slack rate/lifecycle issues (tokens expire or permissions change).

Mitigation ideas

Require encrypted CI secrets or a secrets-management tool rather than a raw environment token.

Add explicit handling to avoid leaking token in logs.
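One way to approach the log-leak mitigation is sketched here in Java, as an illustration rather than the PR's code. It sends the token in the `Authorization` header so it never appears in the request URL, and masks it in any text before logging. The class and method names and the `***` mask are assumptions.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical sketch; names are illustrative and not taken from the PR.
final class TokenHandlingSketch {

    // Builds a conversations.history request with the token in the Authorization
    // header, so the token never appears in the URL (URLs are commonly logged).
    static HttpRequest historyRequest(String channelId, String token) {
        URI uri = URI.create("https://slack.com/api/conversations.history?channel=" + channelId);
        return HttpRequest.newBuilder(uri)
                .header("Authorization", "Bearer " + token)
                .GET()
                .build();
    }

    // Masks the token in any text that is about to be logged.
    static String redact(String message, String token) {
        if (message == null || token == null || token.isEmpty()) {
            return message;
        }
        return message.replace(token, "***");
    }
}
```

Routing every error message through something like `redact` before it reaches the build log also covers the case where an exception message echoes request details.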

Contributor Author

Thanks for the security review, Aman!

I've addressed the concerns in the latest push:

  • Switched from System.getenv to providers.environmentVariable("SLACK_TOKEN").getOrNull().
    This aligns with modern Gradle best practices for input handling and configuration cache compatibility, ensuring we don't inadvertently bake secrets into the cache key in an unsafe way (which can happen with raw environment access).

  • Confirmed that the token variable is never printed or logged. The script only logs high-level lifecycle status messages.

  • The intention (which I’ll explicitly clarify in the documentation) is that this token should strictly be a Bot User OAuth Token with limited read-only scopes (channels:read, channels:history), rather than a full user token. This significantly limits the blast radius if the token were ever mishandled.

With these changes, the secret handling follows standard and safe practices for CI-injected secrets in Gradle scripts.

@airajena airajena force-pushed the FINERACT-2171/make-slack-messages-discoverable branch from b996353 to a3bf00f Compare January 26, 2026 10:48
Contributor

@adamsaghy adamsaghy left a comment

I really don't like this idea.

Fineract should not have much to do with this...

@meonkeys Am I missing something here?

meonkeys added a commit to apache/fineract-chat-archive that referenced this pull request Jan 26, 2026
@galovics
Contributor

@meonkeys Why would we want Gradle and the build to take care of this?

I don't get it.

This should rather be a crawler running outside of the Fineract codebase. To be fair, I'm not even sure about the value of publishing the Slack messages.

@meonkeys
Contributor

meonkeys commented Jan 26, 2026

I do want FINERACT-2171 implemented and I'm happy to see work being done on it. I would like to offer some suggestions on the implementation. For example, I'd like it kept separate from the main Fineract source code.

My first thought is that if ASF or someone can do this for us, that would be ideal. I asked for help in #asfinfra:

we'd like to capture and archive messages from Fineract's main (public) chat room. Is there somewhere on an ASF server we could host this content? There's a lot of good conversation in our chat room that would be nice to get indexed by search engines. Currently we just manually copy/paste important conversations to our main mailing list, but I think we can automate message capture with a bot by, e.g., slurping messages and generating static HTML. It could also save media (e.g. images) if that's acceptable, but maybe we just save text at first. I'm thinking we'd have like 10KiB/day of static HTML created. If we did also save media, we could limit the total saved to, say, 1MB/day or so.

They don't do that, but they had a couple suggestions:

  1. (fluxo) seems like you could commit them to a git repo, or host them on your project website. we don't offer another solution for web hosting
  2. (Humbedooh) or maybe automate sending it to a list every day?

Here are my ideas. Feedback welcome!

  1. The goal is to make public chat messages in #fineract easily indexable by search engines. I think the tool should archive messages into source for a Jekyll static site, commit it to a git repo, and let GitHub Pages render and serve the content.
  2. Is there an extant Slack chat archiver bot/tool out there we can just use as-is?
  3. Let's use a new git repo for this. I created https://github.com/apache/fineract-chat-archive .
  4. Stick with the Gradle build tool since our community is already familiar with that (but I'm open to another framework/stack here, it doesn't have to be a Java-ish project).
  5. Use Groovy or Java for the main implementation language, but maybe pull the code out of the Gradle build file? (again, I'm fine with another framework/stack, maybe choose that based on which one has everything we need out of the box and will be easiest to maintain)
  6. Support running this locally, e.g. gradle updateChatArchive.
  7. Create a very simple GitHub Action that just runs the same thing we'd run locally.
  8. Main build target is idempotent: It can be re-run safely. If there aren't any new messages/media to archive/update, nothing happens.
  9. Link message timestamp to original/upstream chat messages.
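Two of the ideas above, the idempotent build target and linking timestamps to the upstream message, could be sketched as follows. This is an illustration only: the class and method names are hypothetical, the archive is kept in memory for brevity, and the permalink shape (`/archives/<channelId>/p<ts without the dot>`) is an assumption based on how Slack message links commonly look. Slack's `chat.getPermalink` API method is the authoritative source for a permalink.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; names and the in-memory store are illustrative assumptions.
final class IdempotentArchiveSketch {

    // Archive keyed by Slack message ts, which identifies a message within a channel.
    // String ordering matches time ordering while the seconds part has a fixed width.
    private final TreeMap<String, String> messagesByTs = new TreeMap<>();

    // Adds unseen messages and returns how many were new.
    // A result of 0 means there is nothing to write or commit, so a re-run is a no-op.
    int merge(Map<String, String> fetched) {
        int added = 0;
        for (Map.Entry<String, String> e : fetched.entrySet()) {
            if (messagesByTs.putIfAbsent(e.getKey(), e.getValue()) == null) {
                added++;
            }
        }
        return added;
    }

    int size() {
        return messagesByTs.size();
    }

    // Assumed permalink shape; prefer chat.getPermalink when an API call is acceptable.
    static String permalink(String workspace, String channelId, String ts) {
        return "https://" + workspace + ".slack.com/archives/" + channelId + "/p" + ts.replace(".", "");
    }
}
```

Because `merge` ignores messages it has already seen, this sketch deliberately does not handle edited messages, which matches the open question in the post-MVP list.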

Additional ideas, maybe post-MVP:

  1. What should happen if a message in Slack is edited?
  2. Generate an index for easy browsing by
  3. Handle media (e.g. images).
  4. Handle threads.
  5. Include avatar images.
  6. Include reactions.
  7. Include a search tool (Simple-Jekyll-Search is archived but maybe it still works, maybe there's a decent alternative, or maybe we just DIY/NIH).

@meonkeys
Contributor

@airajena , what do you think? Will you create a PR against https://github.com/apache/fineract-chat-archive instead?

I'll add my long comment above to FINERACT-2171 . Please let me know your thoughts: Let's discuss in #fineract chat or on the mailing list or in comments on FINERACT-2171. Or directly... I'm happy to do a phone/video call at your convenience.

@meonkeys meonkeys closed this Jan 26, 2026
@airajena
Contributor Author

> @airajena , what do you think? Will you create a PR against https://github.com/apache/fineract-chat-archive instead?
>
> I'll add my long comment above to FINERACT-2171 . Please let me know your thoughts: Let's discuss in #fineract chat or on the mailing list or in comments on FINERACT-2171. Or directly... I'm happy to do a phone/video call at your convenience.

Thanks everyone for the feedback — this makes sense.

I agree that this functionality doesn’t belong in the core Fineract build and should live separately. Thanks @meonkeys for creating apache/fineract-chat-archive — that looks like the right direction.

I’m happy to move this work there and adapt the implementation accordingly (standalone Gradle project, runnable locally + via GitHub Actions, idempotent archive updates, etc.).

I’ll take a look at the new repo and follow up with a PR there. Happy to discuss details in comments or on #fineract chat as suggested.

5 participants