@jsnshrmn
Member

@jsnshrmn jsnshrmn commented Dec 16, 2025

Provides the features required to send mass survey emails through the MediaWiki API.

Rationale

In T412427, we found that we were unable to send high-volume mail via the WMF Cloud VPS SMTP infrastructure, so we're going to try using MediaWiki instead.

Phabricator Ticket

https://phabricator.wikimedia.org/T409420

How Has This Been Tested?

  • Unit tests have been added for the management command and user selection logic
  • No unit tests have been added for the backend api client
  • I have done quite a bit of manual testing

You can test run the command logic and djmail message tracking by passing through the default djmail backend (which should be console email for local dev).

For example:
python manage.py survey_active_users 000001 en fr de ja ru es --backend djmail.backends.default.EmailBackend --batch_size 10

You may omit the backend to send it through the API, but in MediaWiki you'll need to add and confirm email addresses for users with wp_usernames that match users in your TWLight. If you set up something like MailHog in MediaWiki to handle mail delivery, you'll be able to see the survey messages accumulate there. It's kind of a pain.
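To show how the example invocation breaks down, here is a minimal sketch of the argument handling written with plain argparse rather than Django's command framework. The parser itself, the positional names survey_id and lang, and the defaults are assumptions made for illustration; only the flags --backend, --batch_size and --staff_test come from the commands quoted in this thread.

```python
import argparse


def build_parser():
    """Hypothetical stand-in for the command's argument handling."""
    parser = argparse.ArgumentParser(prog="survey_active_users")
    # Assumption: the first value (000001 in the example) identifies the survey.
    parser.add_argument("survey_id")
    # One or more language codes, e.g. en fr de ja ru es.
    parser.add_argument("lang", nargs="+")
    # Dotted path of the mail backend; None here stands for "use the API backend".
    parser.add_argument("--backend", default=None)
    parser.add_argument("--batch_size", type=int, default=None)
    parser.add_argument("--staff_test", action="store_true")
    return parser


args = build_parser().parse_args(
    [
        "000001", "en", "fr", "de", "ja", "ru", "es",
        "--backend", "djmail.backends.default.EmailBackend",
        "--batch_size", "10",
    ]
)
print(args.lang, args.backend, args.batch_size, args.staff_test)
```

Leaving out --backend in this sketch yields None, which mirrors the "omit the backend to send through the API" behaviour described above.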

Screenshots of your changes (if appropriate):

N/A

Types of changes

What types of changes does your code introduce? Add an x in all the boxes that apply:

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Minor change (fix a typo, add a translation tag, add section to README, etc.)

@jsnshrmn jsnshrmn force-pushed the Jsn.sherman/T412427 branch from f685cf3 to 0201ab9 Compare December 16, 2025 18:59
@jsnshrmn jsnshrmn force-pushed the Jsn.sherman/T412427 branch from 065941a to c086792 Compare December 17, 2025 19:45
@jsnshrmn jsnshrmn force-pushed the Jsn.sherman/T412427 branch from c086792 to e6fd993 Compare December 18, 2025 21:04
@jsnshrmn jsnshrmn changed the title from "WIP requirements: use twl djmail fork" to "WIP enable mw api emailuser backend" Dec 18, 2025
@jsnshrmn jsnshrmn force-pushed the Jsn.sherman/T412427 branch from e6fd993 to 1608291 Compare December 18, 2025 23:09
@jsnshrmn jsnshrmn force-pushed the Jsn.sherman/T412427 branch from 1608291 to 1bc7bb9 Compare December 18, 2025 23:22
@jsnshrmn jsnshrmn force-pushed the Jsn.sherman/T412427 branch from 1bc7bb9 to a61018e Compare December 19, 2025 04:30
@jsnshrmn jsnshrmn force-pushed the Jsn.sherman/T412427 branch from 937980a to 66d0f18 Compare December 19, 2025 18:34
@jsnshrmn jsnshrmn force-pushed the Jsn.sherman/T412427 branch from b0b158c to 289f1a1 Compare December 19, 2025 19:20
@jsnshrmn jsnshrmn force-pushed the Jsn.sherman/T412427 branch from 289f1a1 to 6ffba2d Compare December 19, 2025 19:56
Contributor

@suecarmol suecarmol left a comment

Made a first pass of the code and everything looks good. I left some questions that might need to be addressed in this PR.

The command python manage.py survey_active_users 000001 en fr de ja ru es --backend djmail.backends.default.EmailBackend --batch_size 10 seemed to hang in my local environment, but python manage.py survey_active_users 000001 en fr de ja ru es --backend djmail.backends.default.EmailBackend --batch_size 10 --staff_test seems to be working fine.

- Validate language codes in command
- Remove signal handling and clean up email sending code
- Fix issues in custom model manager
- Fix duplicate email check:
    - stop using survey_email_sent userprofile field
        - actually dropping the field should be handled later
    - Now searches for existing messages based on translated subject and recipient
    - Skips sending if message (of any status) exists
- Outputs some basic statistics
- Updated tests to reflect the new approach
- better naming for maxlag handling method in backend

Bug: T409420
@jsnshrmn
Member Author

> Made a first pass of the code and everything looks good. I left some questions that might need to be addressed in this PR.
>
> The command python manage.py survey_active_users 000001 en fr de ja ru es --backend djmail.backends.default.EmailBackend --batch_size 10 seemed to hang in my local environment, but python manage.py survey_active_users 000001 en fr de ja ru es --backend djmail.backends.default.EmailBackend --batch_size 10 --staff_test seems to be working fine.

I changed the strategy for checking for existing emails; before I was searching based on the user profile language, but realized that we would miss messages if the user changed their language preference. What I didn't do was reload a prod-sized user set and retest. It didn't hang on my machine, but it did take a really, really long time. The staff test works against a much smaller subset of users. I'll just switch back to the original strategy, which runs much faster, and should only result in a few possible edge cases of duplicate emails.

- dramatic speedup
- more informative log messages during execution

Bug: T409420
@jsnshrmn
Member Author

> > Made a first pass of the code and everything looks good. I left some questions that might need to be addressed in this PR.
> > The command python manage.py survey_active_users 000001 en fr de ja ru es --backend djmail.backends.default.EmailBackend --batch_size 10 seemed to hang in my local environment, but python manage.py survey_active_users 000001 en fr de ja ru es --backend djmail.backends.default.EmailBackend --batch_size 10 --staff_test seems to be working fine.
>
> I changed the strategy for checking for existing emails; before I was searching based on the user profile language, but realized that we would miss messages if the user changed their language preference. What I didn't do was reload a prod-sized user set and retest. It didn't hang on my machine, but it did take a really, really long time. The staff test works against a much smaller subset of users. I'll just switch back to the original strategy, which runs much faster, and should only result in a few possible edge cases of duplicate emails.

I benchmarked several approaches with a large number of users and we now have the best of both worlds; the user query now happens quickly and we still search for all current translations of the subject line.
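As a rough illustration of the idea (not the code in this PR), the duplicate check can be pictured as one set lookup across every translated subject instead of a per-user search. The function name, the tuple shapes and the sample data below are all hypothetical; the real command works against the djmail message table and Django querysets.

```python
def select_recipients(users, tracked_messages, subject_translations):
    """Split users into (to_send, skipped) usernames.

    users: iterable of (username, email) pairs that qualify for the survey.
    tracked_messages: iterable of (email, subject) pairs for messages already
        recorded, whatever their status.
    subject_translations: every current translation of the survey subject.
    """
    subjects = set(subject_translations)
    # Collect recipients that already have a message under any translation,
    # so a user who changed language preference is still recognised.
    already_sent = {
        email for email, subject in tracked_messages if subject in subjects
    }
    to_send, skipped = [], []
    for username, email in users:
        (skipped if email in already_sent else to_send).append(username)
    return to_send, skipped


users = [("Alice", "alice@example.org"), ("Bob", "bob@example.org")]
tracked = [("bob@example.org", "Encuesta"), ("alice@example.org", "Unrelated")]
to_send, skipped = select_recipients(users, tracked, ["Survey", "Encuesta"])
print(to_send, skipped)
```

Building the set once keeps the cost roughly proportional to the number of tracked messages plus the number of users, which is the kind of speedup described above.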


# All Wikipedia Library users who:
for user in (
users = (
Member Author

@suecarmol @DHardy-WMF @katydidnot I added in support for the qualification exemptions: I did some checks and found that they had a small impact on the recipient count. Should make testing a little easier!

return wrapper


def _handle_maxlag(response):
Member Author

@DHardy-WMF if you have a local db setup that supports maxlag, I'd appreciate at least some minimal manual testing on this. This function is completely untested.
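For anyone testing this by hand, the behaviour to look for is roughly the following: when the MediaWiki API reports replication lag it answers with an error whose code is "maxlag" and a Retry-After header, and the client is expected to wait and try again. The sketch below is a standalone illustration with hypothetical names (call_with_maxlag_retry, MaxlagExceeded, the request callable); it is not the _handle_maxlag function from this PR.

```python
import time


class MaxlagExceeded(Exception):
    """Raised when the API still reports maxlag after all retries."""


def call_with_maxlag_retry(request, max_retries=3, default_wait=5, sleep=time.sleep):
    """Call request() until it stops returning a maxlag error.

    request: callable returning (data, headers), where data is the decoded
        JSON body and headers is a dict of response headers.
    """
    for attempt in range(max_retries + 1):
        data, headers = request()
        error = data.get("error", {})
        if error.get("code") != "maxlag":
            return data
        if attempt == max_retries:
            break
        # Prefer the server's hint; fall back if it is missing or malformed.
        try:
            wait = int(headers.get("Retry-After", default_wait))
        except ValueError:
            wait = default_wait
        sleep(wait)
    raise MaxlagExceeded(error.get("info", "maxlag"))


# Simulated responses: one lagged reply, then a successful send.
responses = iter(
    [
        ({"error": {"code": "maxlag", "info": "lagged"}}, {"Retry-After": "5"}),
        ({"emailuser": {"result": "Success"}}, {}),
    ]
)
waits = []
result = call_with_maxlag_retry(lambda: next(responses), sleep=waits.append)
print(result, waits)
```

Passing the sleep function in as a parameter is a design choice for testability: a unit test can record the requested waits instead of actually sleeping.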

"There was an error in the request to send the email."
)
emailuser_data = _handle_maxlag(emailuser_response)
if emailuser_data["emailuser"]["result"] != "Success":
Contributor

@katydidnot katydidnot Jan 15, 2026

After confirming my email address locally with a test account, I'm still seeing the following error on send:

{'error': {'code': 'cantsend', 'info': 'No send address', '*': 'See http://localhost:8080/w/api.php for API usage. Subscribe to the mediawiki-api-announce mailing list at <https://lists.wikimedia.org/postorius/lists/mediawiki-api-announce.lists.wikimedia.org/> for notice of API deprecations and breaking changes.'}}

Curious if anyone else is seeing this?

Contributor

Just tested this locally and "successfully" sent 10 emails in different languages, so it's working for me!

Contributor

@suecarmol
When testing this locally did you use the API with a locally confirmed user? How did you set up that test user? Any specific steps beyond visiting the local confirmation page from mailhog?

Member Author

> {'error': {'code': 'cantsend', 'info': 'No send address', '*': 'See http://localhost:8080/w/api.php for API usage. Subscribe to the mediawiki-api-announce mailing list at <https://lists.wikimedia.org/postorius/lists/mediawiki-api-announce.lists.wikimedia.org/> for notice of API deprecations and breaking changes.'}}

@katydidnot I think this looks like mediawiki doesn't know what the "from" address should be? So, maybe it's on the mw config side rather than the user account side?

Member Author

hmm, no, it looks like the api talks about the target/to/recipient/send address interchangeably

Member Author

I've discovered some bugs in my maxlag handling code while looking at this; rejiggering!

Contributor

@katydidnot, I used my personal account and changed the relevant information in the DB so that it matched the personal account I created in my MW local environment. I also saw that other emails were "sent" (they were printed on the console) to other users that qualified.

Member Author

@katydidnot I still haven't been able to reproduce this, but I have now improved error handling and retry logic quite a bit.

Contributor

@suecarmol suecarmol left a comment

Tested this locally and was able to "send" emails in different languages. Thank you for your work on this!

- correct connection and maxlag retry bugs
- bubble up api error messages
- include usernames in exceptions for unsendable messages
- more DRY

Bug: T412427
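To make "bubble up api error messages" and "include usernames in exceptions" concrete, here is a small standalone sketch of what such a check could look like. EmailUserError and check_emailuser_response are hypothetical names, not the ones in the backend; the response shapes follow the emailuser "Success" check and the cantsend error quoted earlier in this conversation.

```python
class EmailUserError(Exception):
    """Carries the username plus the API's own error code and message."""

    def __init__(self, username, code, info):
        super().__init__(f"could not email {username!r}: {code}: {info}")
        self.username = username
        self.code = code
        self.info = info


def check_emailuser_response(username, data):
    """Return data if the send succeeded, otherwise raise EmailUserError."""
    error = data.get("error")
    if error is not None:
        # Pass the API's code and info through instead of a generic message.
        raise EmailUserError(
            username, error.get("code", "unknown"), error.get("info", "")
        )
    result = data.get("emailuser", {}).get("result")
    if result != "Success":
        raise EmailUserError(username, "unexpected_result", str(result))
    return data


response = {"error": {"code": "cantsend", "info": "No send address"}}
try:
    check_emailuser_response("ExampleUser", response)
except EmailUserError as exc:
    message = str(exc)
    print(message)
```

With the username in the exception, a failed batch can report exactly which recipients were unsendable and why.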
@DHardy-WMF
Contributor

I was trying to review this patch locally, unfortunately this is all I see when I run the command:

(venv) /app# python manage.py survey_active_users 000001 en fr de ja ru es --backend djmail.backends.default.EmailBackend --batch_size 10
0 users qualify
0 users previously sent message will be skipped
0 remaining users qualify
attempting to send to 0 users

Could be due to OAuth not going through the local MediaWiki instance. Is my TWLight misconfigured somehow?

@suecarmol
Contributor

This is what I'm seeing when I run the same command:

[Screenshot 2026-01-16 at 18 59 15]

This is a partial screenshot of the result of running the command.

@suecarmol
Contributor

@jsnshrmn, when I run python manage.py survey_active_users 000001 es --backend djmail.backends.default.EmailBackend, emails are sent to users whose default language is not Spanish. Is this intended?
