We have an excellent list of projects, and I’d like to thank all the project mentors for taking the time to work with our students.
Review Board is a powerful web-based code review tool that helps developers do peer review as they write code. Review Board is used by thousands of software companies including Twitter, Yahoo, and VMware, as well as many open-source projects like Apache and KDE.
For a full list of project suggestions, check out our wiki: Student Project Ideas
For more information, see the project web page at http://reviewboard.org, or our students' demo videos at https://www.youtube.com/channel/UCTnwzlRTtx8wQOmyXiA_iCg/featured
Umple is an open-source toolkit whose objective is to merge UML modeling and programming into a single activity. Umple can be used in several ways. It can be used as a textual language for UML. It can also be used as a programming-language pre-processor, allowing UML concepts like associations and state machines to be added directly to Java, C++, and PHP. In addition, Umple allows drawing UML diagrams online and generating code directly from those diagrams. The goal of the Umple team is to have large numbers of programmers and modelers incrementally adopt Umple. The barriers to entry are low, since Umple can be used in a minimal way, without disrupting the existing model or code. Umple is hosted on GitHub. You will have the opportunity to learn some or all of the following:
Code coverage can be used for many purposes. At Mozilla we have great ideas about how to use code coverage to improve the quality of each version of Firefox and to make developers' jobs easier by providing more data and easier testing.
We have two long-term goals we would like to see implemented:
For the purposes of the 2016 Fall UCOSP program we will start by delivering goal #1. In order to achieve this we will need to:
Last year, the Firefox Debugger team decided to rewrite the UI on GitHub with React. Today it has a modern web stack (React, Webpack, Babel) and is one of the Mozilla projects with the most contributors.
With this project, we want to take the next step and build on our relationships with the major framework and library teams to build the kind of tools that we would have liked to have while building the Debugger. We hope that these integrations will make the debugger feel more like a React or Ember debugger than a JS debugger.
Some examples of framework and library integrations include:
Highlighting framework frames in the call stack will let us tell the user what frameworks are doing when the application pauses. For instance, if the user pauses while making an API call, the debugger call stack can show the user that the application is updating the user record because the “save” button was pressed.
Showing framework components in the source tree will let users see their application’s framework types. In the case of React and Redux, that includes Components, Actions, and Stores. We think this view can be more valuable than just seeing source files because it is closer to how developers think about their applications. It also gives us a way to link to other framework information in the future.
Previewing library objects will let frameworks format the objects the way they were intended to be seen. For instance, the most important data in a React component is the props and state. We hope to give frameworks control over how users see their framework objects because we think it paves the way for future integration points.
These features are exciting because they show that when we work closely with library and framework teams, we can build new types of tools that help developers every day.
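As a rough illustration of the call stack idea, the sketch below collapses runs of consecutive framework frames into a single labelled entry. It is written in Python for brevity and is not the Debugger's implementation; the `Frame` type, the URL fragments in `FRAMEWORK_PATTERNS`, and the sample stack are all assumptions made up for the example.

```python
from dataclasses import dataclass
from itertools import groupby

# Hypothetical mapping from source-URL fragments to framework names.
FRAMEWORK_PATTERNS = {
    "node_modules/react": "React",
    "node_modules/redux": "Redux",
    "node_modules/ember": "Ember",
}


@dataclass(frozen=True)
class Frame:
    function: str  # name of the function on the stack
    url: str       # source file the function was loaded from


def framework_of(frame):
    """Return the framework a frame belongs to, or None for application code."""
    for fragment, name in FRAMEWORK_PATTERNS.items():
        if fragment in frame.url:
            return name
    return None


def collapse_frames(frames):
    """Collapse each run of consecutive frames from one framework into a label."""
    collapsed = []
    for name, run in groupby(frames, key=framework_of):
        run = list(run)
        if name is None:
            # Application frames are kept as they are.
            collapsed.extend(frame.function for frame in run)
        else:
            # Framework frames are summarized as "Name (count)".
            collapsed.append(f"{name} ({len(run)})")
    return collapsed


# Made-up stack, innermost frame first.
example_stack = [
    Frame("saveUser", "app/actions.js"),
    Frame("dispatch", "node_modules/redux/lib/createStore.js"),
    Frame("onClick", "app/SaveButton.js"),
    Frame("callCallback", "node_modules/react-dom/cjs/react-dom.js"),
    Frame("dispatchEvent", "node_modules/react-dom/cjs/react-dom.js"),
]

print(collapse_frames(example_stack))
```

Matching on URL fragments is only a placeholder heuristic; a real integration would more likely rely on information supplied by the framework teams themselves.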
In the Inspector, we will be working on a number of tools related to various CSS properties, layout, and font inspection. Some examples of the tools and features that previous students have worked on include:
To read about some of the accomplishments from previous UCOSP students, visit https://hacks.mozilla.org/2017/06/new-css-grid-layout-panel-in-firefox-nightly/.
For the Inspector, we want to continue working on better CSS layout tooling, such as tools for CSS Grid and a z-index inspector. In addition, we are aiming to implement more visual editors that will help developers and designers debug and edit complex CSS properties visually. Since CSS is very declarative by nature, we can provide tools like those commonly found in digital design applications, such as Adobe Photoshop, Illustrator, and Sketch, in order to bridge the learning gap for new developers and designers learning CSS. This would allow them to edit various CSS properties such as box-shadow, clip-path, gradient, filters, etc., without knowing what the properties are.
Our goal is for developers and designers to be able to come into the Inspector, easily learn and edit HTML and CSS, and make changes to their web pages. Eventually, we also want to provide better authoring tools that help them extract these changes.
The Taskcluster team at Mozilla builds a platform for performing the continuous integration of the Firefox web browser. We currently perform many of our operations using cloud resources, primarily the AWS suite of tools. We would like to expand our monitoring and management of our AWS resources. The first area we’d like to focus on is EC2 (VMs as a service).
We have a component that manages our EC2 account called ‘ec2-manager’. In this service, we’d like to monitor things like how many instances we have, how many spot requests are open and how many EBS volumes we have. We’d like to feed that into our standardized reporting tool called ‘statsum’ so that we can spot trends and highlight issues.
We’d also like to have tools to monitor for unused resources that cost us money or clutter our account. Examples are reporting on machine images (AMIs) and snapshots which haven’t been used in a long time, EBS volumes which aren’t attached to anything, and SSH key pairs.
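A minimal sketch of the unused-resource checks might look like the following. It works on plain dictionaries rather than calling any AWS API, and the record shape (the `id`, `attachments`, and `last_used` fields) is an assumption for illustration; in particular, how "last used" would actually be determined for an image is left open.

```python
from datetime import datetime, timedelta


def unattached_volumes(volumes):
    """Return the IDs of volumes that are not attached to any instance."""
    return [volume["id"] for volume in volumes if not volume["attachments"]]


def stale_images(images, now, max_age_days=90):
    """Return the IDs of images last used more than max_age_days before now."""
    cutoff = now - timedelta(days=max_age_days)
    return [image["id"] for image in images if image["last_used"] < cutoff]


# Hypothetical inventory records; a real tool would build these from the
# cloud provider's inventory data.
volumes = [
    {"id": "vol-a", "attachments": ["i-1"]},
    {"id": "vol-b", "attachments": []},
]
images = [
    {"id": "ami-old", "last_used": datetime(2017, 1, 1)},
    {"id": "ami-new", "last_used": datetime(2017, 8, 1)},
]
now = datetime(2017, 9, 1)

print(unattached_volumes(volumes))
print(stale_images(images, now))
```

The 90-day threshold is arbitrary; the appropriate cutoff would be a policy decision for the team.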
A third thing we would like to do is monitor the reasons for instance termination and spot request outcomes. This can be used to gather information on the health of EC2, since the Amazon-provided means of determining health are incomplete.
The final component would be to introduce management of resources. The main resource here is Security Groups for EC2 instances. These are whitelist-based sets of firewall rules which can be attached to EC2 instances. We’d like to have facilities for automated management and monitoring of them. We would also like to have checks in place to ensure that we are alerted if the security group settings deviate from what is expected.
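One simple way to think about the security group check is as a set comparison between the expected rules and the rules currently in effect. The sketch below assumes each rule can be reduced to a `(protocol, port, cidr)` tuple, which is a simplification chosen for the example rather than the format EC2 uses.

```python
def rule_drift(expected, actual):
    """Compare two sets of firewall rules.

    Returns (missing, unexpected): rules that should exist but do not,
    and rules that exist but were not expected.
    """
    missing = sorted(expected - actual)
    unexpected = sorted(actual - expected)
    return missing, unexpected


# Hypothetical rules as (protocol, port, source CIDR) tuples.
expected_rules = {("tcp", 22, "10.0.0.0/8"), ("tcp", 443, "0.0.0.0/0")}
actual_rules = {("tcp", 443, "0.0.0.0/0"), ("tcp", 3389, "0.0.0.0/0")}

missing, unexpected = rule_drift(expected_rules, actual_rules)
if missing or unexpected:
    # A real service would raise an alert here instead of printing.
    print("missing:", missing)
    print("unexpected:", unexpected)
```

Reporting both directions separately matters: an unexpected rule may expose a machine, while a missing rule may break a service.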
Formulize is a tool for making data management systems on the web.
Formulize has extensive support for modelling workflows, so that organizations can customize how users interact with the data that Formulize is managing. It is aimed at “power users” in not-for-profits and other organizations without large IT departments and resources, empowering them to create systems that would otherwise require custom programming to deploy.
The most basic operation in Formulize is the creation of forms. Administrators can specify what elements should appear on the form, and also how different groups of end users should be able to interact with the form. From there, administrators can make custom screens that control how lists of entries in each form are shown to end users.
Administrators can also control how different forms relate to each other, similar to describing table relationships in an ERD. These relationships then govern how data is queried from the database, enabling screens to display complex sets of information to users, rather than just entries from a single form.
Formulize can work as a standalone application, installed on a web server. Formulize can also be embedded within any PHP-driven web application on the same server where it is installed. A Drupal module has been created that supports extensive integration with the Drupal content management platform, including single sign-on for users. Integration plugins for WordPress and Joomla have also been created (by previous UCOSP students!).
Formulize is used by organizations around the world, for a variety of purposes, from tracking the status of housing renovations, to recording the activities of wilderness rescue teams. Formulize was created by Freeform Solutions, a Canadian not-for-profit organization that helps other not-for-profits with IT.
Because it is a tool that you use to create other systems, rather than a tool that does something for end users by itself, there is a high degree of abstraction throughout the codebase, especially the parts that interact with the database. The code has to read configuration information specified by the administrators, and use that to dynamically generate all operations, including database queries.
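To give a feel for what "dynamically generating database queries from configuration" means, here is a small sketch. It is written in Python rather than PHP and is not taken from the Formulize codebase; the function names, the configuration shape, the `?` placeholder style, and the table and column names are all assumptions for illustration.

```python
import re

# Only plain identifiers are accepted, since names cannot be bound as parameters.
IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def quote_ident(name):
    """Validate and quote a table or column name."""
    if not IDENTIFIER.match(name):
        raise ValueError(f"invalid identifier: {name!r}")
    return f"`{name}`"


def build_query(form, fields, link=None, filters=None):
    """Build a parameterized SELECT from administrator-supplied configuration.

    form:    table backing the main form
    fields:  list of (table, column) pairs to display
    link:    optional (other_table, main_column, other_column) relationship
    filters: optional dict mapping (table, column) to a required value
    """
    columns = ", ".join(f"{quote_ident(t)}.{quote_ident(c)}" for t, c in fields)
    sql = f"SELECT {columns} FROM {quote_ident(form)}"
    if link:
        other, main_col, other_col = link
        sql += (
            f" JOIN {quote_ident(other)} ON "
            f"{quote_ident(form)}.{quote_ident(main_col)} = "
            f"{quote_ident(other)}.{quote_ident(other_col)}"
        )
    params = []
    if filters:
        clauses = []
        for (table, column), value in filters.items():
            clauses.append(f"{quote_ident(table)}.{quote_ident(column)} = ?")
            params.append(value)  # values are bound, never interpolated
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params


# Hypothetical configuration: renovation entries joined to the houses form.
sql, params = build_query(
    "renovations",
    [("renovations", "status"), ("houses", "address")],
    link=("houses", "house_id", "id"),
    filters={("renovations", "status"): "open"},
)
print(sql)
print(params)
```

Because nothing about the tables is known when the code is written, every part of the statement has to be assembled from configuration, which is where the abstraction described above comes from.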
The newer parts of the codebase employ some object orientation. Older parts remain largely procedural. The codebase is maintained on GitHub.
Students will, of course, have extensive exposure to PHP and related web technologies. This term, we will begin by focusing on upgrading Formulize to be compatible with PHP 7. There are also several significant, older branches that are complete, but need updating to merge into the current master branch.
Over the years, Formulize has developed a specific process for managing code, tests and documentation all through our GitHub repository. Our tests are run automatically by a continuous integration system, based on Travis-CI and Sauce Labs. Students will have exposure to this entire process as part of contributing to the Formulize project.
Learn more about Formulize:
Download Formulize and docs: http://formulize.org/formulize-downloads
Browse the GitHub repository: https://github.com/jegelstaff/formulize
Video tutorials for using Formulize (the full series is about three hours, but you can skip around between various videos at your leisure): http://formulize.org/formulize-workshop
Learn about our version control and continuous integration process: https://jegelstaff.github.io/formulize/developers/
Apple is a dominant player in the mobile space, and Swift is its new language for writing mobile apps. While Apple provides tooling based on LLVM, that is low-level machinery that is not readily usable by researchers working on program analysis at a higher level. Current research frameworks such as WALA and SOOT have little support for Swift, while, on the other hand, they provide extensive support for Android, even its latest versions. The result is a plethora of published tools for Android (e.g. Flowdroid, Scandroid, Stringoid, and DroidInfer, to name a few) and a relative dearth of tools for Apple platforms. Given the popularity of Apple products and the wealth of published work for Android based on available infrastructure, we believe that the lack of suitable Swift infrastructure is a significant impediment. Support for Swift in a major platform like WALA would therefore enable a wide range of new work.
For example, Apple’s security model is significantly different from that of Android; similarly, Swift is rather different from popular languages on Android, especially at the type system level. It would be interesting to understand how these aspects affect program analysis, and analysis infrastructure is a prerequisite for such investigations.
There has been some ongoing work on Swift support since the WALA hack-a-thon held at PLDI 2017, where the participants implemented a basic level of functionality that lets people explore calling WALA from within the Swift compiler.
Evaluate and extend Swift call graph construction in WALA for some open source Swift code, and create any needed special-purpose policies to avoid precision loss in the face of advanced Swift features.
Current state-of-the-art work on safe browsing and tracker prevention largely relies on heuristic approaches or (semi-)manually updated white/black lists. When it comes to measuring the efficacy of privacy and tracking protection services, client-side performance is difficult to assess. Using web crawler technology to study the incidence and prevalence of tracker code also has limitations in terms of the portion of the web that can be seen. Current anti-tracking technologies attempt to block the execution of (primarily
We would like to gather a labeled dataset to guide development of (rule-based) classification of page elements in real time. The aim would be to prevent tracking code from executing while minimizing page breakage. The classifier should be evaluated such that it generalizes well to new pages without the need for a black-list/white-list paradigm.
The project will combine a small set of client-side measurements for a group of opt-in users, measured during unperturbed web browsing. The set of metrics to be collected in this context will be discussed during the September UCOSP sprint, with the first full team deliverable being a finalized list of measurements to be delivered on or before October 18th. Measurement of features indicative of fingerprinting should be the primary focus, with additional features relating to the user experience also considered, for example page breakage, load times, and layout disruption. Some starting points among projects active in this technology space are chameleon, panopticlick, and cliqz.
Collaboration on this project between Mozilla and OpenWPM maintainer Steven Engelhardt will allow us to combine this information with a broader view of such API calls by detecting their occurrence in a large corpus of crawl data. One team member will be tasked with interfacing the classifier trained on our client data with the OpenWPM platform.
In order to improve upon current methods for assessing anti-tracking performance, a criterion of “information leakage” will be developed and articulated in the context of reference statistics derived from the crawl data. This quantity will relate to the distinctiveness of a particular browser visit to a page in a feature space defined by the information accessible by the page.
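The information leakage criterion is still to be defined by the project. As one possible starting point only, distinctiveness can be expressed as surprisal: the rarer a visit's combination of feature values is in a reference population, the more bits of identifying information it reveals. The sketch below assumes each visit is reduced to a tuple of feature values; the features and reference data are made up.

```python
import math
from collections import Counter


def surprisal_bits(visit, reference_visits):
    """Estimate how many bits of identifying information a visit reveals.

    The visit itself is counted as one extra observation, so a combination
    never seen in the reference data still gets a finite value.
    """
    counts = Counter(reference_visits)
    probability = (counts[visit] + 1) / (len(reference_visits) + 1)
    return -math.log2(probability)


# Hypothetical reference statistics: (language, screen size) per visit.
reference_visits = [("en-US", "1920x1080")] * 3 + [("fr-CA", "1366x768")]

common = surprisal_bits(("en-US", "1920x1080"), reference_visits)
rare = surprisal_bits(("fr-CA", "1366x768"), reference_visits)
unseen = surprisal_bits(("de-DE", "2560x1440"), reference_visits)

print(common, rare, unseen)
```

Under this formulation, an effective anti-tracking measure would be one that lowers the surprisal of visits as seen by the page, for example by making rare feature values unavailable.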
The data derived from this two-prong collection will generate a semi-labeled dataset with which to train a classifier that does not rely on black-list/white-list resources. Assuming a sufficient level of performance is achieved, such a classifier could be used to generate and update a black-list through periodic crawls, or to block tracking in real time if deployed in a web extension. A reach goal of this project will be to investigate the feasibility of detecting and blocking first-party tracking efforts, which are a growing concern as a small number of entities gain an ever-larger presence on the web.
Data analytics and classifier implementation: In order to make use of the data collection efforts from both the client-collection and crawl portions, a feature selection procedure must be carried out. This will be performed by a fourth team member. This task may consider the additional context that we can get from project Fathom. The features selected here will form the input space of a classifier that will assign a label of safe/not-safe to specific page elements. The efficacy of classification strategies will be evaluated against the information leakage metric described above. A substantial validation step will be required here to prototype and evaluate various approaches to this problem, which are expected not only to prevent fingerprinting but also (where possible) to preserve page functionality.
Project lead: Martin Lopatka (Mozilla)
Differential Privacy expert: Dave Zeber (Mozilla)
TrackWare specialist: Luke Crouch (Mozilla)
OpenWPM and browser fingerprinting specialist: Steven Engelhardt (Princeton)