Build and Test (BAT): continuous integration

BAT webpage
BAT was originally built at Jungo in 2001 and has been constantly upgraded and improved ever since.

checkout: cvs co
make: jake build system, compiling and building images
lint: js/json/perl/html/css
unit-test: js (zmocha)/C (jtest)
upload: uploading results and images to fs

BAT colors

GREEN - build and unit-tests passed successfully
ORANGE - build passed, but unit-test failed
YELLOW - build passed, but unit-test identified memory leak
RED - build failed
A white-striped background indicates a BAT request cycle
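
The color rules above can be sketched as a small function. This is only an illustration, not BAT's actual code: the function name and its three boolean inputs are hypothetical, and the precedence of ORANGE over YELLOW when both a failure and a leak occur is an assumption.

```python
def bat_color(build_ok: bool, tests_ok: bool, leak_found: bool) -> str:
    """Map the outcome of one BAT cycle to the color shown on the BAT page."""
    if not build_ok:
        return "RED"      # build failed, so unit-test results do not matter
    if not tests_ok:
        return "ORANGE"   # build passed, but a unit-test failed
    if leak_found:
        return "YELLOW"   # build passed, but unit-tests identified a memory leak
    return "GREEN"        # build and unit-tests passed successfully
```

For example, bat_color(True, True, False) gives GREEN, while any cycle with build_ok set to False gives RED.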

What happens when BAT is broken?

You don't necessarily have to fix it yourself.
If possible, find out whose commit broke it and get them to fix it.
Don't assume they already know it's broken.
If it's a simple fix, fix it yourself.
If you have to fix after someone else broke BAT, add that person to the commit message using NOTIFY:
If it's not easy to fix (requires thinking/debugging/cannot be fixed in 1 minute), revert the offending commit and notify whoever made it.
Alternatively, if a unit-test fails, disable the individual test using if (0) // XXX person.
Use your judgment as to when to revert and when to disable the test.
Disable when you know that the test itself is the problem, or that the breakage is not mission-critical. Always notify the person responsible for fixing the test.

Mongo Documents (Mdoc): our crm

Mdoc webpage
Originally built to control the Deploy work flow, it quickly expanded to hold most of our procedures for R&D, Deploy, sales, IT and HR.
The system is based on a mongo collection that stores data for every procedure. This allows us to control, verify, and investigate procedures, which we do daily.

Mdoc worlds

Deploy - holds all the deploy procedures related to deploying new versions of our products and servers
CDN/Lum - holds all the CDN/Luminati customer leads and the specific procedures executed in this department
R&D - holds all the R&D training procedures
IT - holds all the IT installation procedures
HR - holds all our candidates files

Jdoc notification system

Jdoc stands for 'Jungo documents', and was originally built at Jungo in 2001.
The jdoc mechanism allows you, or anyone interested, to register notifications for a specific file or directory's content, based on our CVS, and review every single commit.
The jdoc file is located at the root directory of our development tree.
This notification mechanism is used to implement our async Review and to support the co-ownership of a module or a file, by sending emails for every modification made to the file or directory's content. Note that you may use the NOTIFY: statement in any commit to manually activate this notification system for specific logins (people in Spark) you want to notify.

Wrong: NOTIFY: colin added 'eval' to dropdown list
Right: added 'eval' to dropdown list NOTIFY: colin

To cancel the notification for a commit you are about to make, for example because your change is cosmetic or unimportant, use NOTIFY: cancel in the commit message.
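
As a rough illustration of how a commit hook could interpret NOTIFY: statements, here is a minimal sketch. It is not the jdoc implementation: the function name, the idea that a statement runs to the end of its line, and the comma/space separation of several logins are all assumptions.

```python
import re

# Assumption: a NOTIFY statement runs from "NOTIFY:" to the end of the line.
NOTIFY_RE = re.compile(r"NOTIFY:\s*(.+)")

def notify_targets(commit_message, registered):
    """Return the set of logins to email for one commit.

    registered: logins already registered (via jdoc) on the modified files.
    """
    extra = set()
    for match in NOTIFY_RE.finditer(commit_message):
        logins = [l for l in re.split(r"[,\s]+", match.group(1).strip()) if l]
        if "cancel" in logins:
            return set()      # NOTIFY: cancel suppresses the notification
        extra.update(logins)
    return set(registered) | extra
```

With the commit message "added 'eval' to dropdown list NOTIFY: colin" and a file registered to nir, both colin and nir would be notified.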

Peer Review: keeping our code quality top-notch

Writing code is our profession, keeping it top-notch is our pleasure!
As it's the main form of communication between the team, having our codebase perfectly consistent in style makes it easy for developers to read and modify each other's code.
So, to increase the team's efficiency, we invest time in teaching everyone how to write code. The review process is our main tool for doing so.

Blocking sync Review: first review, then commit

During the bootcamp, and even for a few weeks after, your code will be reviewed by your mentor, who will provide you with all the tools and guidelines on how we write our code. During this period, you'll do commit sessions together with your mentor until he is fully convinced you are prepared to write code like everybody else in Spark and commit it on your own.
It's your responsibility to commit your code, and hence, you will probably call your mentor for a commit session once you are done with your task.

Pre-commit checklist

You are probably asking yourself what the mentor checks during this session, and whether your code is ready.
Below is the checklist to complete before deciding that your code is ready for a commit session:

Verify that all editors are closed before committing the code
Do cvsup -A on the root directory

Make sure all new files were added to the cvs using cvs add FILENAME (verify no new files marked with ? exist).
Verify that all modified (M) files are files you want to commit.
If there are files you want to commit later, duplicate the tree cp -a zon1 zon1.new, and make the current tree include exactly what you are going to commit.

Run zlint -m from the root directory (use cdr to move to the root directory).
Coding conventions: re-read the code, and validate it is compliant. Noobs can use the zlint -cm tool to help locate convention mistakes (the tool is only 95% accurate - so you still must know all the rules!).
Run zdiff and re-check all your changes. Validate there are no 'debug leftovers' (console.log()...).
It's OK to do a partial solution/implementation, but in such cases comment in the code, or add to the version_plan a description of what you will do later.
If this is a modification/removal, rgrep the whole tree to validate all related code is modified accordingly.
Run all the unit-tests from the root directory
Make sure the tree can compile (jmake cm release)
Test the feature and play with it in zlxc
Open questions? Not perfect? Not 100% ready for commit as-is? Then don't call your mentor for a commit session.
Rather, prepare the full list of questions in an editor, with your suggested solutions, and call your mentor for a Q&A session.
Later on, once your questions have been answered and your code is perfect, call your mentor for a commit session.
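
The file-status part of the checklist above (no ? files, only intended M files) can be sketched as follows. This is a hypothetical helper, not one of our tools: it assumes cvsup prints the standard CVS one-letter status lines ("M path", "? path", "C path"), and the function name and messages are made up for illustration.

```python
def check_tree(status_output, intended):
    """Return a list of problems found in CVS-style status lines.

    status_output: text with one "<code> <path>" line per file.
    intended: the set of paths you plan to commit.
    """
    problems = []
    for line in status_output.splitlines():
        code, _, path = line.partition(" ")
        path = path.strip()
        if code == "?":
            problems.append("unknown file, cvs add or remove it: " + path)
        elif code == "M" and path not in intended:
            problems.append("modified but not part of this commit: " + path)
        elif code == "C":
            problems.append("conflict, resolve before committing: " + path)
    return problems
```

An empty result means the tree contains exactly what you are going to commit, at least as far as file status is concerned; the remaining checklist items still apply.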

Committing

Commit only the files you want to commit

Wrong: cvs ci
Right: cvs ci file1.js file2.js file3.js

Commit message

Short, simple and descriptive - be minimalistic

Too long: fix mistake in passing of video url in opts from zone_init to zone_find and converting opt.url to opt.video_url
Shorter: move zone_init to zone_find and convert opt.url to opt.video_url
Minimal: zone_init -> zone_find opt.url -> opt.video_url

Start with a lowercase letter

Wrong: Update tasks
Right: update tasks

When doing a sync review, add your reviewer's login in brackets

fixed conventions (nir)

Do not include the filename you committed to in the message - it is redundant

Wrong: fixed conventions in agent_conf.js
Right: fixed conventions
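
Some of the commit message rules above are mechanical enough to check automatically. The sketch below is an illustration only (no such tool is described here); the function names are hypothetical, and "short and descriptive" is left to human judgment.

```python
import re

def lint_commit_message(message, committed_files):
    """Return rule violations for a commit message (empty list if none)."""
    problems = []
    first = message.strip()[:1]
    if first.isalpha() and not first.islower():
        problems.append("start with a lowercase letter")
    for path in committed_files:
        name = path.rsplit("/", 1)[-1]   # bare filename without directories
        if name in message:
            problems.append("drop the filename " + name)
    return problems

def reviewer_of(message):
    """Return the reviewer login given in brackets at the end, or None."""
    match = re.search(r"\((\w+)\)\s*$", message.strip())
    return match.group(1) if match else None
```

For instance, "fixed conventions (nir)" passes the lint and yields nir as the reviewer, while "Update tasks" is flagged for its capital letter.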

Redo your code for a commit session

You may call your mentor for a Q&A session. This is more than welcome and even recommended during the bootcamp. However, once you call your mentor for a commit session, the expectation is that you did your best to comply with ALL pre-commit checks. It means that all open questions have been looked into deeply and resolved, and your code is bright and shining, waiting to be sent out to our millions of users.
In Spark we respect each other's time: calling your mentor for a commit session that ends with a "redo" is a complete waste of time.
Make sure you meet this expectation. Take the remarks you received very seriously, and find out how to improve your personal commit procedures so you do not repeat these mistakes next time.

Post-commit checklist

After you have committed your code, complete the following checklist:

Run cvsup and make sure the tree is fully committed (no M or ? files)
Check BAT until you see your commit got compiled and tested green
If BAT is broken, make it green again by fixing the problem
Deploy your code - bring value!

Deploy your code immediately. Worst case, if it is after working hours, ask for a pending release in the morning
Track it immediately, or after deployment (60 seconds manual test after deployment, kibana, stats)

Non-blocking async Review: first commit, and later others will review

Once you pass the blocking review stage, you'll start committing your code on your own.
This time, the review will be done by co-owners of the code you changed or by other engineers who are registered on the files you have modified (using the jdoc system).

FS: Central shared filesystem

text missing: contribution welcome!

Jungo: Our origin: Passion for complex networking and a love for Open Source

text missing: contribution welcome!

rgrep: recursive grep

grep is an amazing tool. Just like you google the Internet for answers on public info, you grep source code for answers on our private tree.
rgrep is our recursive grep, with a few little usability improvements:

Scans by default all files in the tree, recursively
Skips version-controlled meta-files, editor temporary files (swp, bak, ...), and the build output directory
Allows language selection: for example, rgrep --js send_msg searches only *.js files for the string send_msg

We recommend always running rgrep from the root directory. This will ensure you never miss a reference.
rgrep is installed on your machine as part of our development environment.
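
To make the behavior described above concrete, here is a minimal sketch of a recursive grep with the same three properties. It is not the real rgrep: the directory names in SKIP_DIRS, the suffix list, and the ext parameter (standing in for flags such as --js) are assumptions for illustration.

```python
import os

SKIP_DIRS = {"CVS", "build"}            # assumed meta-file and build output dirs
SKIP_SUFFIXES = (".swp", ".bak", "~")   # editor temporary files

def rgrep(root, needle, ext=""):
    """Yield (path, line_number, line) for each line containing needle."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk never descends into skipped directories.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for name in sorted(filenames):
            if name.endswith(SKIP_SUFFIXES):
                continue
            if ext and not name.endswith("." + ext):
                continue
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="replace") as f:
                    for number, line in enumerate(f, 1):
                        if needle in line:
                            yield path, number, line.rstrip("\n")
            except OSError:
                continue   # unreadable file: skip it
```

Calling list(rgrep(".", "send_msg", ext="js")) from the root directory would roughly correspond to rgrep --js send_msg.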

zlint: general purpose linter

Can lint many types of files: js, json, perl, html, css and more. Always run it before commit.
zlint is installed on your machine as part of our development environment.

Screen capture

When reporting bugs, requesting UI-related tasks, or completing UI-related tasks, add a minimal (cropped) screen capture or animated GIF to your report demonstrating the item or the modification you made. Also, mark where the change is, so that the person reviewing it can easily find it and respond quickly.
An example of a cropped, tiny marked screenshot:

Implementing the above can be done by using applications such as FastStone for static screenshots, or ScreenToGIF for animated GIFs.

zmocha: mocha based JS unit-testing tool

Our unit-test framework is based on mocha. Every new function deserves a unit-test to make sure further development will not affect current behavior.
One of the most efficient ways of testing your code modifications is running unit-tests on the whole development tree by executing jmake run_tests.
Running tests on a specific directory can be done by running zmocha within the directory.
Running a specific test in a specific directory can be done with zmocha -g [test].

zupdate: Updating your development environment and tools

zupdate is a command-line tool that installs and updates your development environment, based on your .zon tree (this is why we keep .zon untouched and unmodified).
The development environment holds the gvim definition and themes, VM look&feel and all required development tools.
It is recommended to run zupdate every day, as our codebase is changing rapidly together with our development environment and tools.

gVim: text editor

Our common text editor based on Vim, powered with many vim plugins that enhance our development productivity, and used as our code IDE.
Why are we using gVim?
Because it is tightly connected to our development environment and tools.
Can I use any other IDE instead?
Sure! You may use any IDE you like, just keep in mind that our gVim is already adjusted to our coding conventions and has many features to support our way of development. Your IDE should be adjusted as well.
Moreover, when working with other developers, they mostly would like to work on gVim, so you'll have to adjust yourself to it.
Do you have a shortcut for using gVim in your development environment?
Just use g. It will open the gVim editor on your machine. More shortcuts and useful commands are in our vi Basic page.

CVS: Concurrent Version System

Our source control plays a major role as a daily inter-developer communication tool in the company.
Commits are also a form of email, being one of the core engines behind our module ownership and code reviews.
Our version of CVS, originally based on the open source CVS, has been greatly enhanced and improved over the years to provide:

transparent offline support (via local smart cache proxy)
instant checkout and update (via background prefetching)
remote web interface, GitHub-like
local web interface, similar to other version control GUI clients
MD5_CDN signature based smart annotate: find out who really originally wrote a function, no matter if it was copied, or indentation has been changed.
commit hooks: committing directly activates many different deployment tasks, updating thousands of servers, configuration, and code - instantly.

For a quick start, look at CVS basic command

cvsed: utility for editing and committing common files

An editor tool that combines editing a file in gvim with committing it, using the cvs update and commit commands.
cvsed has shortcuts to our most commonly used files, which can be listed by executing cvsed --help.
You may use this tool for editing your daily file (cvsed daily) or your version plan (cvsed vp) or simply open both (cvsed vpd).

zlxc: emulate everything in your PC

A staging environment is a great tool for developers: they can play around with the feature they developed in a 'full system', with all the hundreds of computers, IPs and route tables, frontend and backend web servers, databases, message queues etc... and validate that everything works well together.
In Spark we wanted every developer to own a complete staging environment of his own, so we used LXC technology to power it.
In a single command we manage to start up in under 6 seconds a full emulation of hundreds of NodeJS/C/C++ based servers with different IP addresses and complex route tables, multiple MongoDB and NGINX servers, graphite, elasticsearch, and many other servers and services.
Now every developer can fully test the complete product, inside his own laptop, even before the commit and see how everything plays well together.
Give the developers great testing and emulation tools, and they will deliver great quality products.
A full description of zlxc is available by running zlxc --help. Some of the most common zlxc commands and usages are:

zlxc run [group] - runs specific servers (saves booting time)
zlxc relink - relink modified files to the zlxc container
zlxc windows - runs zlxc and enables connecting an external Windows VM to the zlxc environment
cvsup -A && jmake cm release && zlxc run [group] --purge - run an up-to-date clean zlxc instance from scratch

Workstation

Your workstation enables you to be efficient. At Spark we invest a lot of effort in making it as well configured and efficient as possible.
To keep it up-to-date, make sure to run zupdate every day. It will make sure all your development tools (gvim, zlxc, cvsed, zlint, FastStone, etc.) and look&feel are kept configured and updated.
You may feel that the way your workstation is configured is not your style, and want to change it to suit you best. You may freely change it once you are fully independent and can work on your own. Until then, to save your colleagues' time (as they are used to this environment), work with the current configuration and suggest any environment enhancements to your mentor.
Our development tree is called zon. While the environment and tools are installed by zupdate on the .zon tree, you may duplicate it for development purposes to zon1 and/or zon2, by simply executing cp -a .zon zon1 or cvs co -d zon2 zon. This will give you the ability to work on one task in zon1, while doing other tasks in zon2.
Moreover, doing your daily and updating your version plan can be done very easily by executing cvsed daily (to update your daily), or cvsed vp to update your version plan's section, or cvsed vpd to update both daily and version plan at the end of the day.

Monitor and alerts

We use graphite as the base of our monitoring and alerting system. We add many zcounters in our code to send stats to our servers, and aggregate them up on a realtime monitoring and alerting dashboard. That's how we keep our services up and running 24/7/365.

zcounter: monitor your code

To validate your code runs well in the real world, add zcounters inside your code - count and measure events, and use our graphite based dashboard to monitor their values, and alert when something goes wrong.
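
As an illustration of the idea, here is a minimal sketch of in-process counters that are flushed as Graphite plaintext lines ("path value timestamp"). It is not the zcounter API: the class, its methods, and the metric name used in the example are hypothetical, and actually sending the lines to a server is left out.

```python
import time
from collections import Counter

class ZCounters:
    """Hypothetical in-process counters, flushed as Graphite plaintext lines."""

    def __init__(self):
        self._counts = Counter()

    def inc(self, name, by=1):
        """Count an event, for example one played video."""
        self._counts[name] += by

    def flush(self, now=None):
        """Return 'path value timestamp' lines and reset all counters."""
        ts = int(time.time() if now is None else now)
        lines = ["%s %d %d" % (name, value, ts)
                 for name, value in sorted(self._counts.items())]
        self._counts.clear()
        return lines
```

A caller would do counters.inc("stats.video.play") at the event site, and a periodic task would send the result of counters.flush() to the monitoring server.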

perr: collect errors and other reports

The idea of "perr" was that clients (initially it was for Spark Accelerator) could submit crashes and other runtime errors. Submission of perr reports is open to the outside world, and the system is built to handle a very large incoming flow.

These days, the technology has been repurposed to collect reports about events other than errors, too. For instance, our Spark loader and player send reports that include a lot of stats, such as how long it took to start playing, whether there were buffering events and how many, and so on. Our browser extensions, Android apps, etc. send reports on events such as "authentication", "first use" and so on.

The perr servers receive these reports over HTTP. Then they do several things:

Report zcounters about certain types of perr or their properties. Most, if not all, perr types produce a zcounter stats.glob.perr.$ID just to count them. Also, CDN reports produce many zcounters that are specific to particular customers/zones, based on individual events mentioned in the report. In fact, most of the CDN world in the dashboard is produced this way by the perr servers.
Drop certain reports. The server can be configured to drop reports according to specific rules. This is done so that the database does not get flooded with millions of identical reports when we only need a few to diagnose problems. Note that the zcounter submission above happens before the dropping.
Forward remaining reports to the database where their bodies will be stored.
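
The three steps above, and in particular the fact that counting happens before dropping, can be sketched like this. It is an illustration, not the perr server code: the report shape (a dict with an "id" key), the drop rule (keep the first few reports per ID), and the in-memory stand-ins for zcounters and the database are all assumptions.

```python
from collections import Counter

counters = Counter()   # stands in for zcounters
stored = []            # stands in for the database

def should_drop(report, seen, keep_per_id=3):
    """Hypothetical drop rule: keep only the first few reports per perr ID."""
    seen[report["id"]] += 1
    return seen[report["id"]] > keep_per_id

def handle_report(report, seen):
    """Process one incoming report; return True if it was forwarded."""
    # 1. Count first, so that dropped reports still show up in the stats.
    counters["stats.glob.perr." + report["id"]] += 1
    # 2. Drop according to the configured rule.
    if should_drop(report, seen):
        return False
    # 3. Forward the remaining report so its body gets stored.
    stored.append(report)
    return True
```

With this rule, five identical reports produce a counter value of 5 but only three stored bodies.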

There are different perr servers for the same reasons there are different www servers: better distribution of load, separation of code (for example, zs-perr-hola-b does not need the code that processes CDN player reports), and separate releases.

Different kinds of clients use different URLs to submit their perr reports, although the format is the same. For example, the CDN player submits to https://perr.h-cdn.com/client_cgi/perr (I think), while the browser extension submits to https://perr.holaspark.com/client_cgi/perr. These domains are served by nginx on zs-client-hola-f* and zs-client-spark-f*, which then forward these requests to zs-perr-spark-b* and zs-perr-hola-b*, respectively (with load balancing).

perr data visualization

ElasticSearch is the specialized database to which the perr service submits the reports in the third step above. In fact, the full data flow is like this:

(client) -> zs-client-*-f -> zs-perr-*-b -> zs-log -> zs-elasticsearch -> (kibana)

(The zs-log server does little more than just forward the reports. These days, its presence is mostly historical.)

Kibana is a web application that lets you explore the ElasticSearch database. It is to ElasticSearch what our dashboard is to metricdb.

Deploy: shipping your code to millions of users

Deployment is putting your work out for everyone to use. The deploy team will take your request, and will follow a detailed procedure that will label, test and eventually release your code.
If an issue occurs during this process, the deploy team will notify you - so you will be able to quickly fix it.
The deploy team is available to receive your requests and is responsible for monitoring the dashboard 24/7/365.

Bootcamp: your first days at Spark

A 3-week training program, in which you are introduced to the way we work, communicate and collaborate with each other.
It's time to learn and adopt our DNA, as well as to practice your professional skills.

Spark Noob: start doing real stuff, but still a lot of questions

Larger tasks, significant contribution while actively learning about the company's DNA, software architectures and work methods. Code reviews for most commits, some without a pre-commit review session, and sometimes even only after deployment.

Spark Junior: on your way to being part of the family!

A major contributor to Spark's products, developing specific domain knowledge in fields that are of interest to you within Spark. Commit directly to the product tree without prior review, and see your contributions going live to millions of users within hours of being developed. Still internalizing Spark's DNA and best practices, and open to comments about commits from other Spark Veterans.

Spark Veteran: A 'did-it-all' Veteran at Spark

An integral part of the Spark Family. A major contributor, knowledge center, and a person who can help Noobs and Juniors get up to speed on the Spark DNA and best practices through mentoring, and code reviews.

Spark Chef

We truly believe that eating healthy and good food should be part of having fun at work, while being creative and productive when creating fantastic products!
Our in-house chefs are responsible for those great meals we have every day.

Mentor: Your Spark veteran buddy

The Spark mentors are selected Veterans that embody the Spark DNA and have the ability to help Noobs learn to become great contributing team members.
Every Noob is assigned a mentor -- a Veteran who helps you progress from noob to junior to veteran status. It's up to you, the Noob, to get as much face time with the Veteran as possible and ask any question, so that you can learn as much as possible. The Mentor is busy with his tasks, but will take the time to help you as much as you need when you ask for it.
What's the best way to learn from a mentor? One very effective way is plain old imitation -- looking at his practices, daily routines, time and task management, email styling, at how he communicates with colleagues. These are all great practices to take on until you create your own style at Spark.

At the Bootcamp stage, the mentor helps you find what you need to fulfill your tasks: documentation, code, people. He also reviews your code.
The mentor of a Noob helps with learning the company's DNA, and still does the code reviews.
The mentor of a Junior is the go-to person for understanding how the company and its procedures work in detail.

MVP: Minimal Viable Product

In any task/feature/product we build at Spark, we try to progress incrementally, by building a Minimal Viable Product (or minimal viable task/feature) that brings value to the end user. This allows testing whether the solution is in the right direction. It also brings immediate value to the end user, or if it is an internal task - brings an immediate solution to a real problem. If the MVP works out well, additional incremental improvements will be done. If not, a different direction (thus a different MVP) will be tried out to solve the same problem.
This is a core concept in modern Lean Startups. We highly recommend you read The Lean Startup book. See also Wiki.

ZON: appears in the code - but what is it?

Spark was founded in 2008, and sticking to the MVP method, our 0.1 version of the company name was ZON. This gave us time to find a name that best represents our values: simple, direct, friendly, welcoming. By the beginning of 2009 we found it: "Spark", coupled with its friendly smiley :-)
So our source tree directory is still '.zon', our internal tools and functions often have a 'z' prefix, and our P2P protocol is called zmsg.

Coding conventions (Style Guides)

Source code is the main product of software developers. It's most of what we read and write all day - even more than the time we spend reading and writing emails (... we also have an email style guide to make them consistent!)
It's the main form of communication between the team, and therefore its consistency is highly important.
Having our codebase perfectly consistent in style makes it easy for developers to read and modify each other's code. We try to express our individuality and personal great engineering skills, not by spacing and indenting our code differently than others, rather by finding simple solutions to complex problems, great algorithms, original and efficient designs.
There's no 'right' or 'wrong' in coding conventions - any coding convention chosen can be 'right', as long as it exists, and is strictly followed by all of us - thus achieving consistency.

Email

Email is our primary way of communicating within the team, as we appreciate each other's time.
We use it to discuss issues offline, to notify about code modifications (jdoc), as well as provide tasks to each other and share useful information.
Since this interface is central to the way we work, just like with coding conventions, we also adopted an email style guide to speak the same language, and be more efficient while handling it as part of our regular tasks.

version_plan: Open planning for the near and far future

The version plan can be found in doc/design/version_plan.txt.
Each engineer will have a section with his login and username, holding his future and current tasks. The version plan holds the R&D planned tasks. It is the engineer's responsibility to update his progress in this file once a day.
Updating this file can be done easily with cvsed. Writing guidelines can be found in the Version plan code.

daily: Open reporting of daily achievements

Your daily file is a text file, located under your name in .zon/doc/report directory.
The goal of this file is to report your daily achievements, describe your commitment (tasks you will complete for sure in the next day) and state your next planned task/s (tasks you are about to start after completing your commitment).
This daily report makes our development transparent: it lets everybody understand what everybody else does, and lets everybody comment or suggest what is best to do next (you may use NOTIFY to let specific people know about your progress).
Updating this file is done every day, usually at the end of it (you may choose to update it at the beginning of the work day, summarizing your previous working day) using our cvsed tool.
What should be reported?
In general, every task that took you more than half a day, should be reported.
What kind of tasks should I report?
The file actually describes your working progress, according to the plan described in version_plan.txt. Writing guidelines can be found in the Daily code.

Product weekly report

A weekly executive summary, written by product or project owners for their team, managers and stakeholders, and sent by the end of the week to spread the news about the week's accomplishments.
What should be reported?
In general, every achievement completed during the last week (with respect to a specific product/responsibility) should be reported, along with the activities and goals to be carried out the following week.
Be precise. Write down numbers. Compare to previous weeks and explain the growth/regression, success/failure.

Wrong:
Was a great week: we reached 95% success rate in install flow, so our installs improved dramatically. we have many ideas we will work on next week to continue improving it. Also the homepage changes helped to get the numbers up.

Right:
Improved number of downloads on hola homepage as a result of improving the user flow during installation.
Completed:
+ web installer - increase the installs from 65% to 95% by fixing technical bugs during installation flow.
Planned:
* Design and implement the main view in hola homepage to increase the first click on the main button from 35% to 70% (attached first version of the new design. we will run a test during the weekend).

Online

The working hours of everyone, including management, are all open for the whole company to see - enabling real time peer collaboration.
This allows the company's R&D to be spread out in 20+ cities, over 15 time zones, and still be an efficient and productive development team.

DNA Manager

Spark is a DNA company
To keep our DNA, which is critical to our success, the DNA manager oversees all activities in the company, and makes sure they are compliant with our DNA, which we all participated in creating.
This includes activities such as:

version plan
daily report
attendance: hours, vacations, leave, sick, holiday...
emails and communications
R&D and Deploy shifts
DNA training sessions in bootcamp
Spark ranks
... and generally looking out for any deviation from our DNA

Attendance

At Spark you may choose to work from any location you prefer to be as productive and effective as you can.
We find that working in the office with your peers is typically the most productive environment and most fun to work at. This is the reason we have several offices worldwide, allowing our employees to spend their day with their peers. So while working from our offices, we usually work for 9.5 hours a day (working hours are usually 09:00-18:30), out of which 30 minutes is our lunch break. Minimum working day is usually 4-5h a day.
But some of us live far from the office or in countries where Spark does not have an office and so must work from home (cloud office).

Reporting attendance

We have several interfaces to report attendance:

attendance clock
When working from the office, use your card or your fingerprint on the attendance clock, at the entrance, to login / logout upon entering and leaving the office.
web
Timesheet web interface (requires VPN connection), which enables you to login and logout while defining your work location:
- At home - When working from home or hotel
- In Office - When working from Spark's office only if the attendance clock is not working or available
- On Road - Used by sales people, going to meetings outside their home or hotel
email (sales team only)
Send an email to login@holaspark.com with your location in the subject (home, office, meeting, on the road...):
From: nir@holaspark.com
To: login@holaspark.com
Subject: meeting
To log out, send:
From: nir@holaspark.com
To: logout@holaspark.com
command line
daily login / daily logout - which enable you to login and logout respectively from within your Ubuntu VM (usually being used by the R&D)

Reporting absence

If you have a planned absence - report it in advance on your daily report. Unplanned absence (e.g. sick leave) should be reported on the daily report as well.
If you plan to work fewer hours a day for a given period of time, notify your manager, to set expectations about your progress correctly. Your daily report for such days should also indicate it.

Suggest

When suggesting an improvement, whether to an internal tool or a new design to replace an existing one, think the suggestion through beforehand. The responsibility for thinking it through and bringing a well-prepared suggestion lies with the person suggesting it.
The cost-effectiveness of the proposed solution should be the main factor.

Suggestion checklist

How much time is spent today due to the existing design?
How much time will be saved after the implementation?
What is the Total Cost of Ownership of the change (preparing the suggestion, discussion time, agreeing on it, coding, debugging, deploying, finding all the edge cases)?
Do we lose any features? Are you fully aware that these features are not required?
Do you have a full idea on how to implement it, on all its technical aspects and details, and there are no 'black holes'?
What are the cheaper/quicker ways to solve the same problem?
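
One possible way to put numbers on the first three questions is a simple payback estimate: how many weeks it takes for the time saved to cover the total cost. This is only a sketch of the arithmetic, with hypothetical names and units (hours, hours per week); it does not replace the judgment the other questions call for.

```python
def payback_weeks(tco_hours, hours_lost_per_week, hours_lost_after=0.0):
    """Weeks until the time saved covers the total cost of the change.

    tco_hours: total cost of ownership of the change, in hours.
    hours_lost_per_week: time spent today due to the existing design.
    hours_lost_after: time still expected to be spent after the change.
    """
    saved_per_week = hours_lost_per_week - hours_lost_after
    if saved_per_week <= 0:
        return float("inf")   # the change never pays for itself
    return tco_hours / saved_per_week
```

For example, a change costing 40 hours that cuts a weekly loss from 5 hours to 1 hour pays for itself in 40 / (5 - 1) = 10 weeks.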