Skip to content
This repository was archived by the owner on Apr 2, 2024. It is now read-only.

Fix handling of redirects. Previously, if a url caused a redirect, the r...#260

Open
jgsbennett wants to merge 2 commits into
mozilla-services:masterfrom
jgsbennett:redirects
Open

Fix handling of redirects. Previously, if a url caused a redirect, the r...#260
jgsbennett wants to merge 2 commits into
mozilla-services:masterfrom
jgsbennett:redirects

Conversation

@jgsbennett
Copy link
Copy Markdown

...edirected url and time would be recorded twice(or more times for more redirects) in the results, as opposed to correct info for each individual request being recorded before the subsequent redirect.

I'll confess that I haven't done extensive testing of this, but it fixes the problem I was seeing for me. (In particular, I haven't run any of the tests included in the module, so apologies if this is just plain broken!).

The problem this addresses:
If the page that Session.send() is trying to access is a url that causes a redirect, from url a to url b say, then a hit is recorded for url b twice, with the start/elapsed time both being for url b, since the res.started /res.elapsed and res.url properties are all overwritten. This means that loads does not record any info for url a(although the total number of hits it records is correct, assuming that you are in fact interested in the redirect hits).
In practice, if there were several redirects, this means you would only record the last request, and that it would be recorded once for each request in the redirect chain that led to it.

The suggested fix below makes resolve_redirects() record information for the hit before attempting the redirected request. The final non-redirected request is still recorded in 'send'.

Notes:

  • requests module has an is_redirect property being added to Response, which is not in the released 2.2.1 requests module. Depending on the next loads release date, it will probably be desirable to change the first line in resolve_redirects() to make use of this.
  • I've made an assumption that the redirected hits are interesting enough to fix logging them properly. I could imagine that users of the module might want to know the total elapsed time from first url until the end of any redirects for the first link? (i.e: Still allow redirects, but don't count it as a separate hit), although personally I'm not all that fussed.

…e redirected url and time would be recorded twice(or more times for more redirects) in the results, as opposed to correct info for each individual request being recorded before the subsequent redirect.
@tarekziade
Copy link
Copy Markdown
Contributor

Thanks a lot that looks good. Minor comments inline following

Comment thread loads/measure.py Outdated
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't need the parenthesis here

@tarekziade
Copy link
Copy Markdown
Contributor

for the Travis failure: you can use "make test" that will run the same test which includes pep8 tests.

Thanks

@jgsbennett
Copy link
Copy Markdown
Author

Hi Tarek, thanks for the comments - all of them valid.

The parenthisis slipped in because I was playing around with some bits and pieces after taking the code from requests.models, and I forgot to remove them. You're right, they're totally unnecessary.

As for the "if hasattr(self, '_need_to_analyse'):", I agree it's more than a little hacky. As it happens, I've already spotted another bug around that area caused by my fix, which is that if you now set allow_redirects to false any redirected url hits are recorded twice even though they were only visited once, and no redirect happened. I've made another minor fix for that which I'll submit shortly. Hopefully you'll find it more satisfactory. Any further comments welcome!

Is the Travis failure likely to be something related to this pull? I noticed there was one, but I hadn't looked into it yet.

Jake

@tarekziade
Copy link
Copy Markdown
Contributor

Is the Travis failure likely to be something related to this pull? I noticed there was one, but I hadn't looked into it yet.

it's because we run "flake8" against the code - and your changes are producing style warnings.

To try this at home, 'make build' then 'bin/flake8 loads/'

Last, since this is a touchy area, I think it would be great if we had a small test for that change. I can help there if you are not familiar with Loads tests

@jgsbennett
Copy link
Copy Markdown
Author

Hmm, I've changed my mind slightly about the minor extra fix I was going to make. I don't think I can get rid of the flag entirely, but I'm happy to move it onto the requests or response object if you think that's suitable.

I'll take a look at the style bits, and although I agree about the need for tests, I don't know where I'd start writing these. Additionally, I'm not sure I'm best placed to work out what other scenarios in loads I might have broken with this change? My usage is quite limited in scope currently.

…where setting allow_redirects=False caused hits to be logged twice.
@jgsbennett
Copy link
Copy Markdown
Author

Hi Tarek,

I've just pushed a couple changes, I hope they're more to your liking? This commit also fixes the additional bug I realised I had introduced where "allow_redirects = False" cases were logged twice. As best as I can make out, this now works.

Sadly, it seems I'm not really in a position to run the style checker right now as I'm not set up to run make on my machine. If it's easy for you, it'd work for me if you dumped the complaints in here for me to look at, or alternately I won't be offended if you wanted to skip me entirely and make the changes yourself.

Jake

Comment thread loads/measure.py
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To simplify this code (resolve_redirects is actually calling send method) i would propose change this method shown below without changing/adding resolve_redirects. Additional this will record any redirect hit.

allow_redirects = kwargs['allow_redirects']
kwargs['allow_redirects'] = False
start = datetime.datetime.utcnow()
res = _Session.send(self, request, **kwargs)
res.started = start
res.method = request.method
self._analyse_request(res)

if allow_redirects and res.is_redirect:
        for res in self.resolve_redirects(res, request):
            # resolve redirects returns generator which is calling send method 
            pass

return res

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants