Accurate web measurement is critical for understanding and improving security and privacy online. Implicit in these measurements is the assumption that automated crawls generalize to the experiences of typical web users, despite significant anecdotal evidence to the contrary. This evidence suggests that the web behaves differently when approached from well-known measurement endpoints, or with well-known measurement and automation frameworks, for reasons including DDoS detection, bot detection, and attempts to hide malicious behavior. This work aims to improve the state of web privacy and security by investigating how, and in what ways, privacy and security measurements change when using typical web measurement tools, compared to measurement configurations intentionally designed to match “real” web users. We build a web measurement framework encompassing network endpoints and browser configurations ranging from off-the-shelf defaults commonly used in research studies to configurations more representative of typical web users, and we report the effect of these realism factors on security- and privacy-relevant measurements when applied to the Tranco top 25k web domains. We find that web privacy and security measurements are significantly affected by measurement vantage point and browser configuration, and conclude that unless researchers carefully consider if and how their web measurement tools match real-world users, the research community is likely systematically missing important signals. For example, we find that browser configuration alone can cause shifts in 19% of known ad and tracking domains encountered, and similarly affects the loading frequency of up to 10% of distinct families of JavaScript code units executed. We also find that the choice of measurement network endpoint has similar, though less dramatic, effects on privacy and security measurements.
To aid measurement replicability and future web research, we share our dataset and precise measurement configurations.
