Selenium Headers
After my DefCon talk a few weeks ago, someone came up to me and mentioned that something I had said (that Selenium browsers send the same headers as regular browsers) was incorrect. He said there were subtle differences between the headers Selenium sends and the headers a normal human-operated browser would send. I only had the chance to speak with him very briefly, so I may have misunderstood, but I thought I'd put this to the test and compare normal browser headers against headers as seen through the Selenium Python library. Some might point out that this sort of thing is probably Googleable. However, I invite them to try googling things like "Selenium different headers" and see where that gets them. I had no luck finding anything relevant!
In the interest of totally unofficial research, I'm using the page http://www.procato.com/my+headers/ as a neutral third-party judge of browser headers.
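For anyone who wants to repeat the Selenium half of the experiment, here is a minimal sketch of how the page could be loaded through the Selenium Python library. The helper name fetch_header_page and the make_driver parameter are my own illustration, not part of Selenium; the only Selenium calls it relies on are get(), page_source, and quit() on a WebDriver object. It assumes Selenium and a matching browser driver are installed.

```python
# Minimal sketch: load the header-echo page in a Selenium-driven browser.
# `fetch_header_page` and `make_driver` are hypothetical names for illustration.

HEADERS_URL = "http://www.procato.com/my+headers/"


def fetch_header_page(make_driver, url=HEADERS_URL):
    """Open `url` in a fresh browser session and return the page source."""
    driver = make_driver()  # e.g. webdriver.Firefox or webdriver.Chrome
    try:
        driver.get(url)
        return driver.page_source
    finally:
        # Always close the browser, even if the page load fails.
        driver.quit()


# Usage (needs selenium plus a browser driver on this machine):
#   from selenium import webdriver
#   html = fetch_header_page(webdriver.Firefox)
#   print(html)
```

The headers the page reports still have to be read out of the returned HTML by hand, which is what I did for the listings in this post.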
Firefox
Headers via a human opening the browser:
Host www.procato.com
User-Agent Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:35.0) Gecko/20100101 Firefox/35.0
Accept text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language en-US,en;q=0.5
Accept-Encoding gzip, deflate
Connection keep-alive
Headers via Selenium:
Host www.procato.com
User-Agent Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:35.0) Gecko/20100101 Firefox/35.0
Accept text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language en-US,en;q=0.5
Accept-Encoding gzip, deflate
Referer http://www.procato.com/my+headers/
Connection keep-alive
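Eyeballing two header dumps is error-prone, so here is a small sketch that diffs them mechanically. The function names parse_headers and diff_headers are made up for this example, and the parsing assumes each line is a header name followed by a space and its value, which is how the listings in this post are laid out.

```python
# Minimal sketch: diff two header dumps in "Name value" form.
# `parse_headers` and `diff_headers` are hypothetical helper names.

FIREFOX_HUMAN = """
Host www.procato.com
User-Agent Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:35.0) Gecko/20100101 Firefox/35.0
Accept text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language en-US,en;q=0.5
Accept-Encoding gzip, deflate
Connection keep-alive
"""

FIREFOX_SELENIUM = """
Host www.procato.com
User-Agent Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:35.0) Gecko/20100101 Firefox/35.0
Accept text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language en-US,en;q=0.5
Accept-Encoding gzip, deflate
Referer http://www.procato.com/my+headers/
Connection keep-alive
"""


def parse_headers(text):
    """Turn 'Name value' lines into a dict, splitting on the first space."""
    headers = {}
    for line in text.strip().splitlines():
        name, _, value = line.strip().partition(" ")
        headers[name] = value
    return headers


def diff_headers(a, b):
    """Return (only in a, only in b, present in both but with different values)."""
    only_a = {k: v for k, v in a.items() if k not in b}
    only_b = {k: v for k, v in b.items() if k not in a}
    changed = {k: (a[k], b[k]) for k in a if k in b and a[k] != b[k]}
    return only_a, only_b, changed


only_human, only_selenium, changed = diff_headers(
    parse_headers(FIREFOX_HUMAN), parse_headers(FIREFOX_SELENIUM)
)
print("Only in human run:", only_human)
print("Only in Selenium run:", only_selenium)
print("Different values:", changed)
```

For the two Firefox captures above, this reports Referer as the only header that appears in the Selenium run but not the manual one, with no differing values among the shared headers. Note that this sketch ignores header order, which the dumps also show.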
Chrome
Headers via a human opening the browser:
Host www.procato.com
Connection keep-alive
User-Agent Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.157 Safari/537.36
Accept */*
Referer http://www.procato.com/my+headers/
Accept-Encoding gzip, deflate, sdch
Accept-Language en-US,en;q=0.8
Headers via Selenium:
Host www.procato.com
Connection keep-alive
User-Agent Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.157 Safari/537.36
Accept */*
Referer http://www.procato.com/my+headers/
Accept-Encoding gzip, deflate, sdch
Accept-Language en-US,en;q=0.8
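So far, the headers look essentially the same (as I'd expect, given that, fundamentally, the exact same software is being used to make the exact same request here). The one difference in the captures above is the Referer header in the Firefox Selenium run. Chrome sends a Referer in both runs, so I suspect this has more to do with how the page was loaded than with Selenium itself. Just for fun, let's take a look at the headers that PhantomJS (obviously, via Selenium) is sending: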
User-Agent Mozilla/5.0 (Macintosh; Intel Mac OS X) AppleWebKit/534.34 (KHTML, like Gecko) PhantomJS/1.9.8 Safari/534.34
Referer http://www.procato.com/my+headers/
Accept */*
Connection Keep-Alive
Accept-Encoding gzip
Accept-Language en-US,*
Host www.procato.com
Well, there's certainly a big fat "PhantomJS" in the User-Agent. If you were checking headers for potential bots, I'd say you'd want to block that. However, I haven't been able to figure out what he was talking about, unless he meant "block things with PhantomJS in the headers" and I simply misunderstood. Anyway, if anyone has any advice or pointers, I welcome them in the comments!
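If you did want to flag that, a check could be as simple as the sketch below. The function name looks_like_phantomjs is hypothetical, and this only catches clients that leave the default User-Agent alone; the header is trivial to override, so treat it as a weak signal at best.

```python
# Minimal sketch: flag requests whose User-Agent advertises PhantomJS.
# `looks_like_phantomjs` is a hypothetical helper name.


def looks_like_phantomjs(user_agent):
    """Return True if the User-Agent string mentions PhantomJS (any case)."""
    return "phantomjs" in user_agent.lower()


phantom_ua = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X) AppleWebKit/534.34 "
    "(KHTML, like Gecko) PhantomJS/1.9.8 Safari/534.34"
)
firefox_ua = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:35.0) "
    "Gecko/20100101 Firefox/35.0"
)

print(looks_like_phantomjs(phantom_ua))
print(looks_like_phantomjs(firefox_ua))
```

With the User-Agent strings captured above, the PhantomJS one is flagged and the Firefox one is not.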