Main PR UI. Will be used to create new rulesets and/or update existing rulesets
Home page for project. On this page users can search for existing rules and view corresponding attributes by clicking view.
Usefulness
You probably already know that the modern web mostly uses two protocols: plain-text (insecure) HTTP and encrypted (secure) HTTPS. Although the percentage of "pages loaded over HTTPS" tripled in the last 5 years (from 25% in January 2014 to over 75% now [Google][Firefox]), that metric does not account for sites that deploy HTTPS unreliably, a problem we can fix (see examples). Unreliable deployment has many causes, including the complexity of existing standards, legacy code bases and CMSs, or simply site administrators not bothering to ensure that all requests are made over HTTPS.

One client-side tool that aims to fix this problem is the browser extension HTTPS Everywhere, which relies on a database of rules telling it which requests to rewrite from HTTP to HTTPS and how to do it. Unfortunately, despite a large user base (over 3 million extension users across Chrome, Firefox and Opera), many rules are stale because updates are done mostly manually by a small number of volunteers on GitHub. We plan to automate this process, both updating existing rulesets and creating new ones. This application will be immediately useful to all 3 million HTTPS Everywhere users once we start merging rulesets upstream, and it will also be useful for site administrators interested in securing their services.
We differ from two related sites, SSL Observatory and HTTPS Everywhere Atlas. SSL Observatory is no longer maintained, and HTTPS Everywhere Atlas only searches for existing rulesets by target: it merely displays two XML files, the latest version of the rule on the master branch and the currently deployed version. It does not support search by any other criteria, collect any other information, or propose updates. We target the same group of people (HTTPS Everywhere maintainers and volunteers on GitHub), but we will offer much more advanced functionality.
Main Datasets
Current HTTPS Everywhere rulesets (over 100,000 domains in "rule targets"): the main dataset.
Chromium HSTS preload list: used for de-duplication (if a domain is preloaded, we don't need to include it in a ruleset). We don't check the Safari, Firefox, etc. lists because they are all based on the Chromium list.
HTTP redirect responses
Results of the HSTS preload tester
Results of Google searches with advanced queries: why crawl the whole domain if we can use the indexing done by Google?
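The de-duplication step against the preload list can be illustrated with a minimal sketch. It assumes the preload list has already been reduced to a plain dictionary mapping each domain to its include-subdomains flag (the real Chromium file needs its own parsing), and the sample ruleset, `PRELOAD` data and function names are hypothetical. Wildcard targets such as `*.example.com` are not handled here.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the HTTPS Everywhere ruleset XML format.
SAMPLE_RULESET = """\
<ruleset name="Example">
  <target host="example.com" />
  <target host="www.example.com" />
  <target host="static.example.net" />
  <rule from="^http:" to="https:" />
</ruleset>
"""

# Assumed simplified preload data: domain -> include_subdomains flag.
PRELOAD = {"example.com": True, "example.org": False}


def ruleset_targets(xml_text):
    """Return the hosts listed in <target host="..."> elements."""
    root = ET.fromstring(xml_text)
    return [target.get("host") for target in root.iter("target")]


def is_preloaded(host, preload):
    """True if host is preloaded itself or covered by a preloaded parent
    domain whose include_subdomains flag is set."""
    if host in preload:
        return True
    labels = host.split(".")
    for i in range(1, len(labels)):
        parent = ".".join(labels[i:])
        if preload.get(parent):
            return True
    return False


def redundant_targets(xml_text, preload):
    """Targets that no longer need a rule because HSTS preload covers them."""
    return [h for h in ruleset_targets(xml_text) if is_preloaded(h, preload)]


print(redundant_targets(SAMPLE_RULESET, PRELOAD))
# ['example.com', 'www.example.com']
```

A target is treated as redundant only when the preload entry applies to it exactly or through a parent with subdomains included, so `www.example.org` would still need a rule in this example.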
Basic Functions of database:
Search for ruleset
Search for proposed updates
Update or Delete existing ruleset, and Insert new ruleset
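As a rough illustration of these basic functions, the sketch below uses SQLite from the Python standard library. The schema, table names and function names are assumptions made for this example, not the project's actual design; current and proposed versions of a ruleset are told apart by a `status` column.

```python
import sqlite3

# Hypothetical schema: one row per (ruleset, status), plus one row per target host.
SCHEMA = """
CREATE TABLE IF NOT EXISTS rulesets (
    name   TEXT NOT NULL,
    status TEXT NOT NULL CHECK (status IN ('current', 'proposed')),
    xml    TEXT NOT NULL,
    PRIMARY KEY (name, status)
);
CREATE TABLE IF NOT EXISTS targets (
    name   TEXT NOT NULL,
    status TEXT NOT NULL,
    host   TEXT NOT NULL
);
"""


def open_db(path=":memory:"):
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def save_ruleset(conn, name, xml, hosts, status="current"):
    """Insert a new ruleset, or update it if (name, status) already exists."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO rulesets (name, status, xml) VALUES (?, ?, ?)",
            (name, status, xml),
        )
        conn.execute(
            "DELETE FROM targets WHERE name = ? AND status = ?", (name, status)
        )
        conn.executemany(
            "INSERT INTO targets (name, status, host) VALUES (?, ?, ?)",
            [(name, status, host) for host in hosts],
        )


def delete_ruleset(conn, name, status="current"):
    with conn:
        conn.execute(
            "DELETE FROM rulesets WHERE name = ? AND status = ?", (name, status)
        )
        conn.execute(
            "DELETE FROM targets WHERE name = ? AND status = ?", (name, status)
        )


def search(conn, host_fragment, status="current"):
    """Names of rulesets with at least one target containing host_fragment.
    LIKE wildcards in host_fragment are not escaped in this sketch."""
    rows = conn.execute(
        "SELECT DISTINCT name FROM targets "
        "WHERE status = ? AND host LIKE ? ORDER BY name",
        (status, f"%{host_fragment}%"),
    )
    return [row[0] for row in rows]


conn = open_db()
save_ruleset(conn, "Example", "<ruleset name='Example'/>",
             ["example.com", "www.example.com"])
save_ruleset(conn, "Example", "<ruleset name='Example'/>",
             ["example.com", "www.example.com", "cdn.example.com"],
             status="proposed")
print(search(conn, "cdn"))                      # []
print(search(conn, "cdn", status="proposed"))   # ['Example']
```

Searching with `status="proposed"` covers the "search for proposed updates" function, while the default searches the rulesets that are currently deployed.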
Advanced function 1: Suggest ruleset updates based on information observed in the real world.
Examples of how we do this are here.
Usefulness: This automates most of the ruleset maintainers' work. They no longer have to write the rulesets themselves; instead they get a "security report" and can just approve or reject it. This increases the speed of updates, and thus the effectiveness of the extension, and might even reduce rule breakage.
Technical challenge (very challenging): We collect the data and implement decision logic to prove that the new ruleset (1) does not break sites and (2) has maximum coverage.
Advanced function 2: Integration with GitHub (to push updates to and pull updates from) and FTP (to release our updated rules via an HTTPS Everywhere update channel).
Usefulness: HTTPS Everywhere maintainers will be able to see all the relevant information about a proposed update in our system and then commit the changes to GitHub in a single click (if they have write access) or create a patched branch and open a pull request (if they don't). Real-world HTTPS Everywhere users will be able to subscribe to our rules that haven't been merged upstream by adding our FTP URL to their HTTPS Everywhere extension.
**Technical challenge:**
GitHub integration:
We will export a subset of our total dataset in the HTTPS Everywhere ruleset format (XML), push the files to GitHub, create pull requests dynamically, and then track the state of those pull requests via the API.
We will import updates (made by other people) from GitHub, parse the rulesets (XML) and record them in our database.
We will integrate with the GitHub issue tracking system and pull requests, displaying relevant links to the GitHub issues and pull requests.
Our own HTTPS Everywhere update channel (via FTP): this is technically pretty simple. We just export some data as JSON, sign it, put the JSON and signature files on FTP, and update the timestamp in another file.
Obtaining data:
HTTPS Everywhere rules: from the official repository on GitHub.
Chromium HSTS preload list: from the Chromium source tree.
Output of the HSTS preload tester: we check out Chromium's preload tester from GitHub and run it locally.
HSTS headers, HTTP redirects, and hashes of responses (to compare content served over HTTP and HTTPS): by crawling.
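To make the decision logic and the XML export more concrete, here is a minimal sketch that works on crawl results that have already been collected. The `Observation` fields, the function names and the acceptance criteria (HTTPS answers with status 200, and either HTTP redirects to HTTPS or both responses hash identically) are assumptions for illustration only; a real check would need to be more tolerant, for example of dynamic pages whose content differs between any two loads.

```python
import hashlib
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Optional


def content_hash(body: bytes) -> str:
    """Hash of a response body, used to compare HTTP and HTTPS content."""
    return hashlib.sha256(body).hexdigest()


@dataclass
class Observation:
    """Hypothetical record of what the crawler saw for one host."""
    host: str
    https_status: Optional[int]   # None if the HTTPS request failed
    http_hash: Optional[str]      # None if no HTTP body was fetched
    https_hash: Optional[str]
    redirects_to_https: bool = False


def safe_to_rewrite(obs: Observation) -> bool:
    """Conservative check that rewriting to HTTPS should not break the host."""
    if obs.https_status != 200:
        return False
    if obs.redirects_to_https:
        return True
    return obs.http_hash is not None and obs.http_hash == obs.https_hash


def build_ruleset(name, observations):
    """Return ruleset XML covering every safe host, or None if there is none."""
    safe_hosts = [obs.host for obs in observations if safe_to_rewrite(obs)]
    if not safe_hosts:
        return None
    root = ET.Element("ruleset", name=name)
    for host in safe_hosts:
        ET.SubElement(root, "target", host=host)
    # "from" is a Python keyword, so the attributes are passed as a dict.
    ET.SubElement(root, "rule", {"from": "^http:", "to": "https:"})
    return ET.tostring(root, encoding="unicode")


page = content_hash(b"<html>same page</html>")
observations = [
    Observation("example.com", 200, page, page),
    Observation("broken.example.com", None, page, None),
    Observation("mixed.example.com", 200, page, content_hash(b"other")),
    Observation("redir.example.com", 200, None, page, redirects_to_https=True),
]
print(build_ruleset("Example", observations))
```

Only `example.com` and `redir.example.com` end up as targets here; the other two hosts are left out because HTTPS either fails or serves different content. Including every host that passes the check is one simple way to aim for maximum coverage without breaking sites.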