Welcome to UCOSP!

UCOSP is a program that brings together students from across Canada to work together on open source projects. Students learn how open source software development takes place, practice distributed development, and have the opportunity to work on project with knowledgeable industry or faculty mentors on software with real users.

Mar 24, 2016 - Jupyter Notebook – UCOSP 2016 – Recap

– Adam Wong

Hey everyone, now that I am halfway through my UCOSP project, I would like to take some time to reflect and share some of my experiences. I have had the honor of working with three other talented students from the University of British Columbia as well as with two friendly and intelligent Mozilla employees in the Jupyter Notebook project.

So what exactly is a Jupyter Notebook, and why should you really care?
Well, the Jupyter Notebook is an electronic document that can contain Python code as well as flexible human-readable text such as paragraphs, comments, figures, equations, etc. The code within these documents can be executed and analyzed.

One of the main challenges I have overcome during this experience so far is learning to function within and adapting to the somewhat distributed working environment. Previous assignments I’ve worked on were with a local team of developers usually in person. People could be called or messaged and then ideas could be bounced off each other on whiteboards, over a cup of coffee or even on the same computer.

This new approach has team members all over Canada working together (British Columbia, Ontario, Nova Scotia) as well as internationally (England). The first and most obvious obstacle is that with each different area, there is a different time zone. When scheduling meetings or Skype calls, it is common to find a reasonable time, taking into consideration the up-to-nine-hour time difference. Outside of these scheduled weekly meetings, we often find ourselves communicating via IRC or Skype, both platforms which I am familiar with.

The Jupyter Project uses many different technologies – some I have used before and others not so much. The languages include JavaScript and Python – the former I have used for mere hours (but am currently learning) and the latter I have not used since my first year. Ramping up was a significant challenge, as the time spent studying syntax and language structure put me even further behind my teammates. I often felt like the least useful member of the team as the other students were much better versed in the aforementioned technologies. In spite of this, however, after the initial ramping period I felt much more comfortable working on this project.

You might be asking, “Well I know what the notebooks are, so what is your team doing?” Our team is currently working on various add-ons for the Jupyter Notebook – extensions that will be released open-source and accessible to the public.

The first project was a simple extension that would allow Jupyter Notebook files to be uploaded to GitHub’s gist platform. Specifically, the content from the Notebook (code and text) would be uploaded to the authenticated user’s GitHub account via OAuth2 for safe keeping. During our Toronto code sprint in January, we laid down the foundation for this extension. The main goal for the meeting was to successfully implement a new button in the JavaScript front-end, and to link it to the Python back-end server (where the Jupyter Notebook was running) for processing. Knowing I had limited knowledge in JavaScript at the time, I took it upon myself to handle much of the back-end Python code instead. As I had previously studied OAuth2 in a Computer Security class at my university (ECE568), most of the APIs were familiar. For those that don’t know, OAuth2 is a protocol used by a client (our server acts as a client to GitHub) and a server (the GitHub server offering the OAuth2 service) to allow secure access to an authenticated user’s private information. Specifically, the user is redirected to GitHub, where they specify which services to allow us (the client) to use. After authentication, GitHub will then comply with our requests (in this case, uploading the notebook contents as a Gist) on the user’s behalf. After sifting through the many pages of the GitHub API documentation, I was able to independently implement the handshaking and authentication process between GitHub and our server. However, I got stuck with formatting the notebook content which was formatted in JSON. After spending several hours trying to figure out why my GitHub call would return with a ‘404’ error, I was at a loss. At that point, knowing nothing about JavaScript or JSON, sought out help from my teammates who were eager to jump in and lend a hand. Turns out there is a bizarre function call in JavaScript called “JSON.stringify” that ended up solving my issue. With the whole team working together, we were able to get the extension out and it is now available via ‘pip’ or at https://github.com/mozilla/jupyter-notebook-gist.

Our second extension was an add-on for Apache Spark integration. Apache Spark is a framework for cluster computing – a system where distributed machines can be used together to perform large computations. It is very useful when data processing, something that Mozilla would definitely use to process large volumes of Firefox usage information. If, within the Jupyter Notebook, an Apache Spark job was ongoing, we wanted to provide a progress bar. Unfortunately this task had just gotten off the ground as the university’s Reading Week began, and I had already made plans to be out of the country. I kept tabs on the progress (IRC) during that week, but sadly did not have time to make significant contributions. Upon return, I tried to quickly ramp up, but to little avail. Apache Spark does not like to run on native Windows, so I installed VirtualBox with Ubuntu 15.04 to get it working. Only after much debugging and installing, did I realize that all the tasks for the Apache project were taken and/or in progress by another team member. The feeling of insignificance returned, but after expressing my concern to the Mozilla team lead, he mentioned that it might be better to get started on the next project which would soon be fleshed out and discussed. At the end of our meeting, he mentioned someone around the office was looking for C++ assistance, which is my area of expertise so I will definitely follow up on that as soon as possible.

That pretty much sums up where I am now at currently in the project. I definitely enjoy working with everyone, and my mentors are very intelligent, open and approachable individuals. I feel like working in an open-source environment provides me with unrestricted access to the world’s knowledge, and even someone like myself can share everything that I have learned. Being able to see people happily using what I have worked on is such an amazing pleasure and honor for me, as a software developer. I hope that I will be able to continue to make a difference both in the Jupyter Notebook project, and in future endeavours.

While UCOSP will soon come to a close, the journey into the world of open-source is a never ending road of exploration, learning and sharing. Mozilla is a wonderful company with many wonderful people. It would be a dream to be given opportunities to work with them in the future.

Jan 12, 2016 - How to Generate UML Diagram of Models in Django Project

(An experience report from Yorie Nakayama)

If you are developing with python on Django, or on any frameworks that employ the MVC (Model-View-Controller) paradigm, you might come across whether there is a way to easily visualize all the models and their dependency relationships in your application as a diagram. Models are central to most applications using MVC, and being able to glance at the structure of models and their relationships to each other greatly helps any developer beginning to hack on the codebase.

When I worked with Reviewboard, a Django-based web app for doing code reviews, for my UCOSP, one of the first things I did was to generate a UML diagram of the models in the project using an extension package for Django called django-extensions. The diagram would let the students, who are new to the project, to see the overall picture of the models, and also let the mentors easily explain how some models are related to each other.

This post will provide a step-by-step instruction on generating a UML diagram of a Django application using django-extensions. Similar tools also exist for other MVC frameworks, such as railroady gem for Ruby on Rails, or phuml for PHP (not limited to CodeIgniter).

1. Install Dependencies.

For python, you need to install either pydot or pygraphviz as a python interface of the graph visualization library called graphviz. Either will be fine as far as django-extensions is concerned. I went with pydot because I had a trouble installing pygraphviz. If you choose to go with pygraphviz, you can ignore this step and resume from step 2 after installing pygraphviz.

pip install pyparsing==1.5.7
  
pip install graphviz
  
pip install pydot

The version is specified on pyparsing installation because pydot is not compatible with newer (>=2.x.x) versions of pyparsing. I recommend that graphviz is installed before pydot to avoid an issue where pydot is not able to find graphviz executable. A similar installation issue may also occur with pygraphviz.

2. Install django-extensions.

Django-extensions provides various tools that could help development of Django applications such as enhanced shell, or UML diagram generation.

pip install django-extensions

3. Enable django-extensions in your project.

Under your Django project directory, open the file setting.py with a text editor. Within the settings, you should find a python list called INSTALLED_APPS.

Append ‘django_extensions’ to the list and save. Then you can use the functionalities that django-extensions provides.

4. Generate the UML diagram.

On a command shell, go to the Django project directory, and execute the following command to generate your diagram of all models in the project.

python manage.py graph\_models -a -o myapp\_models.png

The -a flag specifies to include all apps listed in INSTALLED_APPS for diagram generation. The output path is relative to the project root directory.

If you decide that your project does not need django-extensions, you can simply remove ‘django_extensions’ from INSTALLED_APPS in settings.py.

With a UML diagram for your models, you can easily see the overall picture of the data organization in your application. Hopefully this diagram could help a new member on a Django project to find where to start with the development.

— Yorie Nakayama

Dec 15, 2015 - Getting Ready for the New Year

As we prepare for the next group of students who will start in January, we are seeing some turnover in the project offerings for UCOSP, so I’d like to welcome the newcomers and thank the project mentors from the past term.

The project mentors for Waterbear, CodeIgniter, and Mylyn are all based in Vancouver which makes it more difficult to attend a code sprint in Toronto.  It has been great working with Dethe, James and Sam, and I hope we will have a chance to work with you again in the future!

We are also saying goodbye a long-term project: Phonegap.  Tim Windsor has moved to a different job and is not able to supervise students this term.  Tim has been a terrific mentor over several years, and it would be a great pleasure to work with him again.

We are welcoming two new projects this term, both from Mozilla: Jupyter Notebooks for Data Analysis, and Code Coverage Explorer.  I’m excited to see what students will do on both of these projects.