- When it’s used inside a $lookup pipeline, MongoDB’s $expr operator only supports indexes for equality matches. $in is apparently not considered an equality match, even when the search array only holds one value and in other situations the query parser would treat it as an $eq.
- MongoDB supports wildcard indexes to quickly define an index for all fields in a document or subdocument.
- Some more useful applications of JavaScript’s Array.reduce() function, beyond summing numbers. I do miss some of the array utility functions from underscore/lodash when I’m working in projects that don’t use them, but it had never occurred to me to try to recreate them like this.
- I enjoyed these tips for front-end security and learned a few things (particularly some new HTTP headers such as Content-Security-Policy and X-Frame-Options). I don’t agree that front ends should be sanitising input against SQL injection attacks, though – that is definitely a job for the back end since attackers could bypass the front end altogether and send their malicious content straight to the server, and doing it twice introduces the risk of unwanted double-escaping.
- There’s a new media query in town and it’s called prefers-reduced-data. There’s also an HTTP header called Save-Data, and both are intended to indicate to an application that it should be more conservative with how much data it sends to the user. I can see applications for this on the front end (polling the server for updates less frequently, for instance) and the back end (sending fewer fields for each record, or reducing the pagination size), and as a data miser myself I’ll be very interested to see if it achieves widespread support.
This Week I Learned: 2020-11-29
- The difference between the ADD and COPY Dockerfile commands. ADD does some magic stuff (like extracting files out of an archive) but is discouraged for everyday copying because it was sometimes a bit too magic. COPY does what it says on the tin and is the recommended way to do things now.
- How to run a Python Flask app in Docker. The development server only listens for requests on 127.0.0.1 by default, and it turns out that requests coming into the container through the Docker bridge arrive on the container’s external interface rather than on 127.0.0.1, so the server has to be told to listen on 0.0.0.0 instead.
- MongoDB has a change streams feature, which allows an application to receive a notification of any database write, without having to poll for changes.
- The MongoDB $lookup stage can take let and pipeline parameters, which can be used to perform a subquery and append the results to the return document. I’m looking forward to experimenting with this some more.
- I really enjoyed this talk about the Lean Web by Chris Ferdinandi. As a user I’ve never been a fan of the single-page app approach to simple static sites, and I like the example of using a service worker to pre-fetch all the other content the user might want to see instead.
Git for SVN Devs, Part 4: More Recipes
Creating a repository
For a brand new project I prefer to create the project in GitHub and then clone it locally. For existing projects, there is the GitHub Importer tool which should do the trick.
GitHub projects have a “default” branch. The default default is master, but if you are going to use a development branch then GitHub’s UI will be more helpful if you make this the default – instructions here.
Once you’ve created the project, you can click on the “Code” button in GitHub to get the URL, then use this to run git clone https://github.com/{myorg}/{myproject}.git
.
Making a .gitignore File
The .gitignore file tells Git what files you never want to commit to the repository. This is useful for keeping things you don’t want (like IDE settings, cache directories, and compilation output) out of your code base.
Just create a file called .gitignore in the root of your project, and add each path that you want to ignore on a separate line. If you’re using a JetBrains IDE, you’ll want to add the following line to exclude your IDE index files:
.idea/
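As a quick sanity check, here is a minimal sketch of that file in action. It builds a throwaway repository in a temporary directory. Only the .idea/ rule comes from the text above; the cache/ and *.log rules and all of the file names are made up for illustration.

```shell
set -e
repo="$(mktemp -d)"            # throwaway repository for the demo
cd "$repo"
git init -q .

# One ignore rule per line (cache/ and *.log are hypothetical extras).
printf '%s\n' '.idea/' 'cache/' '*.log' > .gitignore

mkdir -p .idea cache src
touch .idea/workspace.xml cache/tmp.bin debug.log src/index.php

# List the untracked files Git would offer to commit; ignored paths are left out.
visible="$(git status --porcelain --untracked-files=all)"
echo "$visible"

# check-ignore reports which rule is responsible for ignoring a path.
rule="$(git check-ignore -v .idea/workspace.xml)"
echo "$rule"
```

If a path you expected to be ignored still shows up in git status, git check-ignore -v is a handy way to see which rule (if any) matches it.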
Throwing Out Uncommitted Code
Sometimes you’ve stuffed something up so badly that you just want to chuck it out and start again. If you haven’t committed anything yet this is super easy.
PHPStorm: Right click on the project directory and choose Git -> Rollback. You can also do this for individual directories or files by right-clicking on them.
Terminal: From the project root do git checkout .
– or you can specify the path if you want to target a specific directory or file.
If you have committed something you didn’t mean to, there’s an excellent guide here for fixing the mess.
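To see what the terminal version does and doesn’t touch, here is a small sketch run against a scratch repository. The file names and the identity passed to git config are placeholders, not anything from a real project.

```shell
set -e
repo="$(mktemp -d)"
cd "$repo"
git init -q .
git config user.name "Example Dev"          # placeholder identity for the demo
git config user.email "dev@example.com"

echo "working code" > app.txt
git add app.txt
git commit -q -m "Initial commit"

echo "a terrible mistake" > app.txt         # uncommitted change to a tracked file
echo "scratch" > notes.txt                  # brand-new file Git has never seen

git checkout .                              # restore tracked files to the last commit

restored="$(cat app.txt)"
echo "$restored"
ls
```

The tracked file goes back to its committed contents, while the untracked notes.txt is left alone, so brand-new files still have to be deleted by hand.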
Committing and Pushing Code in PHPStorm
There are actually 3 steps for getting code from your local file system into the remote repository in Git. This does sound like a lot of overhead, but if you have the right tools you can do it all with a couple of clicks of the mouse.
The first step is adding, where you tell Git that you would like the changes you made to a file to be included in the next commit. Then you commit, which records those added changes as a new commit in your local repository. Finally you push, which sends your new commit (and any others that hadn’t been sent yet) to the remote.
Some devs prefer to make all their in-progress commits locally, and then push the whole feature to the remote once it’s done. Others prefer to always commit and push, so that their work is backed up somewhere if their hard drive crashes.
In PHPStorm adding is taken care of for you. Once you’re ready to go, right click on the project and choose Git -> Commit Directory. You’ll see a list of all the files you’ve changed, and can deselect any you don’t want to include. You won’t lose your changes to these files – they’ll still be there on your working copy so you can continue working on them, but they won’t be added or committed.
The Commit button on this widget has a secret which is hidden from impatient devs. If you click the little arrow on the right-hand side, it will give you the option to Commit and Push. This will push your commit for you, and create a new remote branch with the same name as your local branch if needed.
Committing in the Terminal
Your first step will be to list the files that you’ve changed. For this you need git status -s
, which will list all files which have been changed since the last commit.
Files that haven’t been added will have a little red status text next to them. The ones that have been added will have green status text. Use git add {path}
to add what you want, and git reset {path}
to un-add anything you don’t want.
Once you’re done, run git commit -m "Meaningful commit message"
to commit all the added files.
Your last step is to push your branch. git push
will work if your branch already has a remote tracking branch. If not, you want git push -u origin {branchname}
.
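Put together, the whole add, commit and push cycle might look something like the sketch below. It assumes a local bare repository standing in for GitHub, and the branch name, file names and identity are all invented for the demo.

```shell
set -e
work="$(mktemp -d)"
git init -q --bare "$work/origin.git"       # local stand-in for the GitHub remote
git init -q "$work/project"
cd "$work/project"
git config user.name "Example Dev"          # placeholder identity
git config user.email "dev@example.com"
git remote add origin "$work/origin.git"
git symbolic-ref HEAD refs/heads/1234-new-datatables   # hypothetical branch name

echo "readme" > README.md
git add README.md
git commit -q -m "Initial commit"

echo "first" > keep.txt
echo "second" > later.txt

git add keep.txt later.txt      # add both files
git reset -q later.txt          # un-add the one that isn't ready yet
staged="$(git status -s)"       # keep.txt is added (A), later.txt is untracked (??)
echo "$staged"

git commit -q -m "Add keep.txt"
git push -q -u origin 1234-new-datatables   # first push, so set up the tracking branch

upstream="$(git rev-parse --abbrev-ref '@{upstream}')"
echo "$upstream"
```

After the first push with -u, a plain git push is enough for later commits on the same branch.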
Opening a Pull Request
Open your repository’s GitHub homepage in your browser. If you’ve pushed something recently you may see a yellow banner across the top of the page with a quick link for you to open a PR. If not, you can go to the Pull Requests page and then click the green New Pull Request button.
Your PR’s “base” branch should be the branch you want to merge into (e.g. development) and the “compare” branch should be the one you’ve built your feature in.
The description field can be useful for a variety of things – testing or upgrade instructions, a brief explanation of why you chose to implement things the way you did, mentioning any limitations or future work that may be required. This becomes part of your project’s documentation and is easily searchable through GitHub after the PR has been closed.
Reviewing and Merging a Pull Request
The Files Changed tab is the best place to go through the code line-by-line and see what’s changed. You will notice a little blue “+” button appears on the left-hand margin of the line that you’re hovering over – you can click on this to add a note on a specific line. Use the “Start a Review” option, which just limits the amount of notification spam the PR’s author receives.
Once you’ve done your line-by-line review (and checked out and tested the code if desired) you can leave any general comments on the Conversation tab, then submit your review.
Once any concerns you raise have been addressed, it’s time to merge the PR. Once you’ve merged it, delete the branch (GitHub will give you a button to do this) since it’s no longer useful.
Deleting remote branches
As mentioned above, the easiest way to delete remote branches is to nuke them through GitHub as soon as they’ve been merged. If you didn’t do this, another option is to go to the repository home page in GitHub, open the branch switcher and choose “View All Branches” at the bottom, and then click on the delete icon next to the offending branch.
If you’d rather do it through the terminal, you can:
git push origin --delete {branchname}
Release Tags
A tag is a snapshot of the code base at a particular moment in time. Tags are most commonly used to record releases, so that you have a quick reference to see what code was released when. Making a new release tag off the master branch is typically one of the last things you would do before you actually publish the code onto your production servers.
The easiest way to make a release tag is to click on the link on the right-hand sidebar of your repository’s GitHub home page.
To make a new tag from the terminal:
git tag -a {v1.1} -m "{tag title}"
git push origin {v1.1}
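Here is a rough end-to-end sketch of tagging a release, again using a local bare repository as a stand-in for GitHub. The version number, tag message and identity are only examples.

```shell
set -e
work="$(mktemp -d)"
git init -q --bare "$work/origin.git"       # local stand-in for the GitHub remote
git init -q "$work/project"
cd "$work/project"
git config user.name "Example Dev"          # placeholder identity
git config user.email "dev@example.com"
git remote add origin "$work/origin.git"
git symbolic-ref HEAD refs/heads/master

echo "release candidate" > README.md
git add README.md
git commit -q -m "Ready for release"
git push -q -u origin master

git tag -a v1.1 -m "Release 1.1"            # annotated tag on the current commit
git push -q origin v1.1                     # tags are not pushed unless you ask

tag_type="$(git cat-file -t v1.1)"          # annotated tags are stored as "tag" objects
remote_tags="$(git ls-remote --tags origin)"
git tag -n                                  # list local tags with their messages
```

Using -a makes an annotated tag, which records who created it and when. That is usually what you want for a release.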
Summary
This concludes my series on Git for SVN devs. This is not a comprehensive guide to everything that Git has to offer, but I hope it has been a useful introduction to get you up to speed and working with Git on a daily basis. Happy coding!
This Week I Learned: 2020-11-22
- A bit about the internals of Docker, thanks to LinkedIn unlocking this course for me for a week. I found the demonstration of multi-layer builds to reduce the size of the final image particularly interesting.
- An overview of the Parsimmon library for building parsers in JavaScript. The use of the composition pattern to build quite complicated parsers out of very simple components is beautiful. Thanks to Hackle Wayne for presenting this talk.
- A comparison of the different I/O approaches and performance of PHP, Java, Node and Golang. Conclusion – Node is not truly non-blocking because the V8 engine runs each script in a single thread. In fact, Node was worse than or comparable with Java in all the benchmarks presented, while Golang was consistently ahead of the pack.
- I’m fascinated by this glossary of data patterns from the MongoDB blog. I haven’t had time to dig into them all yet, but look forward to putting a couple of them into use on my next project.
- Websites which use ARIA have 11 more accessibility errors on average than websites that don’t. I loved this entertaining video about ARIA by @heydonworks, complete with cameo appearances by potty-mouthed Ada Lovelace, a layout table dinosaur, and a herd of stoned horny goats.
Git for SVN Devs, Part 3: Recipes
The two major differences between Git and SVN are that Git is a distributed VCS (meaning that you have your own local repository as well as the one on GitHub) and that its internal structure makes merging branches easier.
This post is primarily focused on recipes for getting the latest code into your local repository and keeping it in sync with GitHub. The next post will explain how to get your completed code back into the development branch and provide some other miscellaneous tips.
Why Do I Need Two Repositories?
Personally speaking, I prefer Git mainly for the ease of branching. If your team’s workflow is designed so that you don’t finish up with multiple devs working on the same branch, I can’t think of any particularly compelling bonus that having a local repo gives you. Sure, it enables you to work without the internet if you want to – but how much work are you going to get done without your esteemed colleagues Google and Stack Overflow?
The good news is that having that local repo doesn’t introduce a lot of overhead into the way I work day-to-day. You can treat Git more or less like SVN from that perspective, and just push every commit as soon as you make it.
And on that note … on to the recipes! In this section I’ll cover what you need to know to get the right branch of code in front of you at the right time. Stay tuned for part 4 where I explain how to commit your code and then how it finds its way back into the development branch.
Initial checkout
Find the URL of your project in GitHub. Open a terminal, switch to the parent directory that your projects live in, and …
git clone https://github.com/{myorg}/{myproject}.git
Fetching Remote Branches
Your local repository keeps metadata about all of the branches that are in the remote repository. This will get stale as branches are created and deleted on the remote. A “fetch” updates that list of branches. If you ever want to checkout a remote branch but you can’t find it, a fetch should set you right.
PHPStorm: Right-click on project -> Git -> Repository -> Fetch
Terminal: git fetch
Seeing Available Branches
PHPStorm has a handy-dandy branch switcher in the bottom right-hand corner. This will show you a list of all of the remote and local branches.
Or if you’d rather use the terminal:
See Local branches: git branch
or git branch -l
See Remote branches: git branch -r
See All branches (local and remote): git branch -a
The PHPStorm branch switcher handily shows which remote branch each local branch is “tracking”. If you want to see something similar in the terminal, it’s included in the output of git branch -vv
.
See what branch I’m on
The current local branch is highlighted and displayed with an asterisk in the output from git branch
. In PHPStorm, it’s displayed in the status bar, where you click to open the branch switcher.
Checkout a Remote Branch
If you want to work with a branch locally and you don’t already have it in your repo, you’ll need to check it out from its remote source. In most scenarios there is only one remote, called origin, which is the GitHub repository. You want to create and checkout a local branch which is tracking the origin branch.
PHPStorm: Click on the remote branch in the bottom right-hand branch switcher, and choose “Checkout as new local branch”.
Terminal: git checkout -b {branchname} origin/{branchname}
With the current version of Git you can skip the -b flag and the second argument if you want your local branch to have the same name as the remote branch, e.g. you can just say git checkout {branchname}
. If Git can’t find a local branch with that name, it will look for a remote branch that matches and create a local branch with the same name.
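Both forms can be tried out safely with a sketch like this one. It simulates a colleague publishing two branches to a local bare repository (standing in for GitHub) and then checks them out from a second repository. All branch names, paths and identities are hypothetical.

```shell
set -e
work="$(mktemp -d)"
git init -q --bare "$work/origin.git"       # local stand-in for the GitHub remote

# A colleague's repository publishes development and a feature branch.
git init -q "$work/colleague"
cd "$work/colleague"
git config user.name "Example Colleague"    # placeholder identity
git config user.email "colleague@example.com"
git symbolic-ref HEAD refs/heads/development
echo "base" > README.md
git add README.md
git commit -q -m "Initial commit"
git checkout -q -b 1234-new-datatables
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Start feature"
git push -q "$work/origin.git" development 1234-new-datatables

# Your repository: fetch the branch list, then check the branches out.
git init -q "$work/mine"
cd "$work/mine"
git remote add origin "$work/origin.git"
git fetch -q origin
git branch -r                               # lists both origin/ branches

# Long form: name the local branch and the remote branch it tracks.
git checkout -q -b 1234-new-datatables origin/1234-new-datatables
feature_upstream="$(git rev-parse --abbrev-ref '@{upstream}')"

# Short form: Git finds origin/development and sets up tracking itself.
git checkout -q development
dev_upstream="$(git rev-parse --abbrev-ref '@{upstream}')"
```

Either way you end up with a local branch that tracks its origin counterpart, which is what lets a bare git pull or git push work later.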
Checkout a Local Branch
The process is very similar if you already have a local branch that you want to checkout. Just choose it from PHPStorm’s branch switcher, or from the terminal:
git checkout {branchname}
Create a new branch
When you create a new branch, the local branch that you already have checked out becomes its “parent”, if you will. Always make sure you have the right branch checked out first, and that it’s up-to-date with the remote if appropriate. Let’s consider a typical scenario where you’re going to start a new feature, so you want a new branch off development.
PHPStorm: checkout the development branch. Pull remote changes from the remote tracking branch (origin/development) by right-clicking on the project and choosing Git -> Repository -> Pull. Now open the branch switcher and choose New Branch (the item right at the top).
The terminal equivalent is:
git checkout development
git pull
git checkout -b {newbranchname}
Your branch will only exist in your local repository at this stage – I’ll talk about how to push it to the origin in the next part.
Merge origin into my branch
Your local development branch will get out of date very quickly as new features get merged into the origin. You can update it by switching to the branch and then running git pull
in the terminal, or right-clicking on the project and choosing Git -> Repository -> Pull.
Keeping a Feature Branch Up to Date
You should keep your feature branches up to date with development, so that as you develop you are working against the latest state of the code base.
I really prefer to do this in my IDE, as I like to have a UI to help me resolve any conflicts that might arise. In PHPStorm, checkout your feature branch, then open the branch switcher, click on the remote development branch, and choose Merge into Current.
If you’d really rather do it in the terminal, run a fetch first so that origin/development is current, then merge:
git fetch
git merge origin/development
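For the curious, here is a sketch of the terminal route from start to finish, with development moving on while a feature branch is in progress. The remote is a local bare repository and every name in it is made up for the demo.

```shell
set -e
work="$(mktemp -d)"
git init -q --bare "$work/origin.git"       # local stand-in for the GitHub remote
git init -q "$work/project"
cd "$work/project"
git config user.name "Example Dev"          # placeholder identity
git config user.email "dev@example.com"
git remote add origin "$work/origin.git"
git symbolic-ref HEAD refs/heads/development
echo "base" > README.md
git add README.md
git commit -q -m "Initial commit"
git push -q -u origin development

# Start a feature branch.
git checkout -q -b 1234-new-datatables
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Start feature"

# Meanwhile development moves on without it.
git checkout -q development
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "Work merged by somebody else"
git push -q origin development

# Bring the feature branch up to date.
git checkout -q 1234-new-datatables
git fetch -q origin                         # refresh origin/development first
git merge -q -m "Merge development into 1234-new-datatables" origin/development

ls                                          # feature.txt and fix.txt are both present
```

This example merges cleanly because the two branches touched different files. When they don’t, this is the point where an IDE’s conflict resolution UI earns its keep.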
Deleting local branches
First switch to a different branch – you can’t delete a branch that you’ve currently checked out.
Then either git branch -d {branchname}
or in PHPStorm open the branch switcher, click on your local branch, and choose Delete.
This won’t delete the remote tracking branch. That’s a topic for next time…
Pruning Orphaned Branches
After you’ve been working on a project for a while, you are likely to accumulate an impressive collection of orphaned local branches. These are branches that you built a feature in, where that feature has now been merged into development and the remote branch has been deleted.
Personally I just delete them one at a time in the PHPStorm branch switcher every now and then – it’s a nice quick thing I can do while I’m waiting for unit tests to run or for something to download. However, I know that most devs hate doing repetitive work like this and would rather automate it.
Git doesn’t provide a great built-in way to remove the orphaned branches for you. There are a lot of sample scripts on Stack Overflow, and the safest strategy seems to be to run git fetch --prune and then check for the “: gone]” marker appearing in the output from git branch -vv
. This will protect any other local branches which have never been pushed to the remote from being nuked as well.
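A minimal sketch of that strategy is below. It is not a polished tool: it assumes a remote called origin (a local bare repository here), invented branch names, and that no commit message happens to contain the same marker text.

```shell
set -e
work="$(mktemp -d)"
git init -q --bare "$work/origin.git"       # local stand-in for the GitHub remote
git init -q "$work/project"
cd "$work/project"
git config user.name "Example Dev"          # placeholder identity
git config user.email "dev@example.com"
git remote add origin "$work/origin.git"
git symbolic-ref HEAD refs/heads/development
echo "base" > README.md
git add README.md
git commit -q -m "Initial commit"
git push -q -u origin development

git branch merged-feature                   # pushed, then deleted on the remote
git push -q -u origin merged-feature
git branch local-experiment                 # never pushed, so it must survive

# Stand-in for clicking the delete button on GitHub after the merge.
git -C "$work/origin.git" branch -q -D merged-feature

# The actual recipe starts here.
git fetch -q --prune origin                 # forget remote branches that no longer exist
gone="$(git branch -vv | grep -v '^\*' | grep -F ': gone]' | awk '{print $1}')"
for branch in $gone; do
    git branch -d "$branch"                 # -d refuses to delete unmerged work
done

remaining="$(git for-each-ref --format='%(refname:short)' refs/heads)"
echo "$remaining"
```

The grep -v '^\*' step skips the branch you currently have checked out. If your team squashes commits when merging PRs, git branch -d may refuse some branches, and you would need -D for those once you’re sure the work is safely merged.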
Git for SVN Devs, Part 2: A Suggested Workflow
So you’ve moved to Git because all the cool kids are doing it. You’re still doing your old workflow, just with added complexity, and you’re wondering what all the hype is about.
There are lots of different ways of using Git. They all involve a lot of branches, because that is the best way to have lots of people working on the same code base without creating a giant mess. Let’s build a simple workflow for a team of 1-5 people, starting with a simple setup for single-dev teams and tweaking it until it’s enterprise-ready.
Master and Development
Most Git setups will have these two branches. master is the code that lives on your production server. development is the stuff that’s finished and will be included in the next scheduled release.
This is the setup for my one-person hobby projects. I commit new stuff directly to development (often unfinished), and when I’ve got a stable set of stuff I merge it into master and pull it onto the server.
Benefits: Your production code is separate from the new stuff. If you need to fix a bug urgently, you can do it on the master branch and pull that onto production, without simultaneously releasing your new features.
Drawbacks: Your development branch is still a dumping ground for everything that isn’t production-ready. As you add more devs it will become unstable and your team might fall back into the anti-pattern of not being allowed to commit anything just before the release.
Branch Per Feature
As above, but every feature (or improvement or bug fix) is developed in its own branch. Once it’s ready to go, its creator merges it into development or master as appropriate.
If you have to make a production bug fix, you make a branch off master. Give it a meaningful name, like bugfix-1234 (where 1234 is a ticket number) or bugfix-3.2.1 (where 3.2 is your release number and this is the first bug fix). Do your work in this branch, committing as many times as you like, and then merge it into master once you’re done.
For other work, you make a branch off development. Meaningful and consistent names are still very helpful. This is often just a ticket number (1234), or a ticket number with a brief description (1234-new-datatables). Once it’s been approved for the next release, it gets merged back into development.
Benefits: Half-finished code is not in your development branch. If devs are switching between tasks, they can commit what they’ve got without breaking things for others. Ditto if you want to transfer a half-finished task from one dev to another. Devs can check out each other’s completed features to test and review, or to demo to product owners.
Drawback: The review process is informal (a dev might ask a colleague to look at something before merging, but there’s nothing stopping them from skipping this step) and not particularly transparent (difficult to identify whether a task was reviewed, and by who). Useful feedback from reviews is only communicated between the reviewer and reviewee.
Pull Requests
As above, but when a dev has finished a task, they raise a pull request in GitHub to indicate that it is ready for review.
A pull request is an assertion that a piece of code is ready to be merged into development. The list of open pull requests can be seen in the GitHub UI. Senior devs can check this regularly and review the tasks that are waiting to be merged. They make comments on the pull requests which all devs on the team can see. For new features they will probably also want to check the code out and see it in action, but this may not be necessary for minor tweaks and bug fixes.
The dev addresses the reviewer’s comments, pushes their changes, and adds a note to the PR to say that they’ve fixed it. Once the reviewer is happy with the code, they click a button on GitHub and it gets merged into development.
Benefits: Review process becomes asynchronous (devs aren’t privately messaging each other for reviews) and transparent (because everybody can see what’s been said). New devs can learn a lot about the system and the team’s coding conventions by browsing open PRs. The archive of closed PRs doubles as a library of blueprints for future work, which can also be very helpful for juniors and newcomers.
Drawbacks: GitHub PR notification spam (but you can turn down your notification settings so that you only get emails about the stuff that’s relevant to you). Can lead to painful multi-round reviews where the dev deals with one reviewer’s feedback, but then gets more suggestions from another.
Automated Code Quality Checks
The leading Git providers have a range of tools to help prevent low-quality code from making it into your code base.
On GitHub this takes the form of GitHub Actions, which allows you to configure workflows which are triggered automatically by certain events. A common usage of these is to perform automated checks (such as running a linter or a set of unit tests) every time a new pull request is raised.
This is only scratching the surface of what you can do with Actions. There are almost 6000 actions available in the GitHub Marketplace, and a variety of other events you can trigger workflows on – you can even schedule them like cron jobs.
Benefits: Automatic linting saves reviewers from wasting their time pointing out basic style violations. Automatic test builds mean you can be confident that code going into the development branch won’t break anything – although of course this promise is only as good as your test coverage!
Drawbacks: Automated processes are not a substitute for human judgement.
Summary
Git can do a lot to solve common collaboration headaches. You don’t have to change everything about your workflow at once, but if you recognise any of the pain points above hopefully this guide will help you to address them.
Your team’s GitHub workflow should be reviewed regularly, as should the larger process of how a task goes from an idea in somebody’s brain to a completed, deployed feature.
This Week I Learned: 2020-10-25
- How to set up replication in OrientDB.
- I’ve been meaning to learn Docker for years, and a brand new Ubuntu install and a need to experiment with replication was the perfect opportunity. I found this video very helpful to get my sandbox OrientDB cluster up and running, although I’m sure there’s much more to explore.
- Microsoft Edge is coming to Linux. This feels like the scene in a teen movie where the popular kid starts pretending to be friends with the nerd so that they can betray their confidence later.
- A little about the Mongoose ODM JavaScript library and add-ons like restify-mongoose, from this great talk by Andrew Watkins of voluntarily.nz about the dependencies used in his project and how to choose a good library.
- How to boot from USB in GRUB after a minor mishap while trying to change the partition structure for my Ubuntu install.
Git for SVN Devs, Part 1: Why Switch?
Like many devs who’ve been round the block a few times, I got my first professional experience using SVN.
My first exposure to Git was at a place that was using it exactly like it was SVN, with one master branch that all the devs pushed to, and nobody was allowed to push for two days before a scheduled release in case they broke something. They’d switched because they knew Git was the cool thing that everybody raved about, but they got confused by the overly-detailed tutorials and the new jargon and never worked out how to use it to improve their workflow.
I Googled just enough to learn how to get my code into the master branch, understanding about 10% of what I read and liking even less. Why did I want a localised repository? How was this better than SVN when there were more steps required for me to do the “same thing”?
That was seven or eight years ago now. Since then I’ve worked on a huge variety of projects and seen the difference that a well-designed Git workflow can make to a team. I’ve recently dipped my toes back into a project using SVN and it’s helped to clarify the benefits in my mind.
SVN Devs are terrified of branches
Branches sound all fine and dandy until you need to sync things up again, and in SVN-land that usually seems to go to custard. It’s easy to lose half a day swearing at a bad merge … so they don’t. Everything happens on the trunk.
The first ever commit made by a rookie dev that breaks the build. The urgent bug fix for the production environment. Possibly the stuff somebody was in the middle of when they had to stop and do that urgent bug fix. The regular save-points for a major feature which isn’t ready for production yet. The stuff that’s supposed to go into the next release. The stuff that’s finished, but when you demo it to the business they don’t like it and don’t want it in the next release.
This works OK for a very small team, if everyone is communicating well and knows what code they are allowed to commit when. It’s easy, and at first glance there’s no overhead.
As the team gets larger, you finish up with an almost constantly broken branch that makes your devs tear their hair out. They start updating as rarely as possible, and when they do there’s usually a mess of merge conflicts.
Comments, Comments Everywhere
It also leads to a proliferation of comment blocks. Old code gets commented out because devs aren’t confident about how to get it back if they need to revert in a hurry later. Recent code gets commented out by other devs because it’s got bugs and they can’t get their work done. New features live in if (false) blocks or comment blocks until they’re ready to go live.
Your team has to scroll through all of this commented-out stuff every time they scan the codebase. And it gradually rusts because the IDE doesn’t pick it up when you’re refactoring, so then when you do decide you need a relic from the past, it often needs resuscitation first anyway.
How Git Can Help
Git is definitely more complex than SVN. The learning curve to understand everything is huge (I’m nowhere near the top) but it’s not that hard to learn enough to manage a simple workflow in Git.
Branching and merging are easier in Git, so your devs will be more inclined to use branches. It’s not that hard to keep your production code separate from the stuff you plan to release next week, and to keep your incomplete features aside in their own branches.
The major Git hosts (e.g. GitHub, GitLab, Bitbucket) all have sleek user interfaces that make it easy to explore the code base, review code before it’s merged into your main development branch, and travel back in time to see what changes were made when, by who. This helps with the cargo-cult commented-out code clutter – there’s another place to get that code back on the remote chance that you really do need it, and in the meantime it’s not cluttering up the source code.
A transparent code review process that happens before code makes it into the development branch can lead to big improvements in the quality and consistency of the code. Everyone can see the code that’s up for review and what’s been said about it, which helps to keep the common issues at the front of people’s minds. Newcomers can skim over the pull requests to get a quick idea of how things are formatted now (rather than 5 years ago when the class they’re working on was last changed) and to get a quick overview of how a new feature is put together.
There’s always going to be some disruption when you switch to something new, but the gains in long-term productivity mean that the change will quickly pay for itself.
This Week I Learned (2020-10-18)
- There’s a new kid on the web image format block. It’s called AVIF and it’s derived from the AV1 video format. I’m quite impressed with the size and quality of compressed images compared to other formats like JPEG.
- JavaScript’s Number.MIN_VALUE is actually the smallest possible positive float value (the one closest to zero), not the smallest (most negative) possible number.
- Chrome Dev Tools has an awesome screenshot tool as well as many other cool features I was already familiar with. And did you know that you can style console.log output?
- SQL Server for Linux is a thing. Probably not a thing I would recommend for a production app, but it will be useful for dev work for a project I’m working on where my app needs to connect to a standalone SQL Server instance on Azure.
- There is a new HTML element called <portal> in the works. It seems to be a new improved variation on the old <iframe>, and has interesting potential to replace the current fad for making everything a single-page app.
This Week I Learned (2020-10-11)
- PHPStorm has a quick definition lookup feature which pops up a window showing the source code definition of a symbol in your code when you press Ctrl + Shift + I. No more switching back to the mouse to Ctrl + Click and then losing my place!
- Imagine a world without targeted online advertising. I can’t be the only person who looked at a few articles about coronavirus and is now getting targeted by Southland wedding venues and cut-price tours of Italy. This Wired article explores how banning targeted ads might affect the world, from the quality of our journalism to election results.
- OrientDB is an open-source graph database implemented in Java. There’s a free official introduction course on Udemy which does a good job of explaining the concept of what a graph database is, although I think I’ve only grasped a fraction of the potential use cases so far. I hadn’t heard the terms “edge” and “vertex” since maths at uni so I’m glad that linear algebra paper finally came in handy!
- Flexbox Froggy is a fun little game to help you learn the CSS flexbox feature set. I’ve learned new properties, particularly order and flex-direction to adjust the order in which elements are displayed.
- A lot more about autofill and its behaviour in different browsers. These days the autocomplete property has more potential values than just on/off, and if you specify “new-password” as the value some browsers (e.g. Firefox) will help the user to generate a strong password.