Proposal: Designate commits as minor edits in Git

Git blame: A reliable rat

A great benefit of version control systems is that they make it possible to see who introduced substantive changes in the past. For example, in Git, git blame <file> reveals who last edited each line of <file>.

Despite the cheeky name, the greatest value of git blame isn’t so much blaming others for their mistakes, as identifying who to confer with when proposing changes. The last developer to touch a line of code may have an interest in its current state, can answer questions about it, and may have valuable perspective that will improve your proposed changes.
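To make that concrete, here is a small sketch that builds a throwaway repo with two made-up authors and asks git blame who owns each line. The author names, the file name, and the commit_as helper are all assumptions for the demo.

```shell
set -eu
# Throwaway repository; names, e-mails and file name are invented for this demo.
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Hypothetical helper: stage everything and commit as the given author.
commit_as() {
  git add -A
  git -c user.name="$1" -c user.email="$1@example.com" commit -q -m "$2"
}

printf 'line one\nline two\n' > notes.txt
commit_as Alice 'Add notes'

printf 'line one\nline two, revised\n' > notes.txt
commit_as Bob 'Revise line two'

# --line-porcelain prints an "author <name>" header for every line of the file.
authors=$(git blame --line-porcelain notes.txt | sed -n 's/^author //p')
echo "$authors"
```

Line one stays with Alice, while line two now belongs to Bob, because he was the last to touch it.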

Slow standards adoption? Blame blame.

Unfortunately, the way git blame credits each line to whoever touched it last is an obstacle to the adoption of consistent code standards in an open source project like WordPress.

Any patches you make to legacy code whose sole purpose is applying coding standards, without introducing substantive changes, will make you appear as the last author in git blame, losing valuable information about whoever made the last substantive changes. Thus, this type of edit is discouraged.
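Here is a minimal sketch of the problem, again in a throwaway repo with invented authors and a hypothetical commit_as helper. Bob only fixes indentation, yet default blame hands him the line. The existing -w option of git blame ignores whitespace when tracing lines, so it helps in this narrow case, but it does nothing for standards fixes that go beyond whitespace (brace placement, renames, and so on).

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Hypothetical helper: stage everything and commit as the given author.
commit_as() {
  git add -A
  git -c user.name="$1" -c user.email="$1@example.com" commit -q -m "$2"
}

# Alice writes the logic, without the indentation the standard asks for.
printf 'if ( $a ) {\nreturn $b;\n}\n' > legacy.php
commit_as Alice 'Add logic'

# Bob changes nothing but the indentation of line 2.
printf 'if ( $a ) {\n\treturn $b;\n}\n' > legacy.php
commit_as Bob 'Apply coding standards'

plain=$(git blame --line-porcelain -L 2,2 legacy.php | sed -n 's/^author //p')
ignoring_ws=$(git blame -w --line-porcelain -L 2,2 legacy.php | sed -n 's/^author //p')
echo "default blame: $plain"
echo "blame -w:      $ignoring_ws"
```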

As a result, WordPress’ adoption of its own coding standards in core code slows way down.

This is a bummer, because there would be dozens of people happy to make core contributions strictly to apply code standards. It’d be a great way for newbies to learn the ropes while making incremental improvements to code quality.

How about “minor commits” that blame is blind to?

Wouldn’t it be nice if you could indicate that a change is a minor edit when you commit it? git blame would skip over these minor edits to display only substantive edits from older, non-minor commits. Obstacle to code standards adoption solved.

This could look something like $ git commit --minor <file to commit>.
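The --minor flag is hypothetical. For comparison, Git 2.23 and later can approximate the same result from the blame side: git blame --ignore-rev <commit> skips a named commit, and --ignore-revs-file <file> reads a list of commits to skip. The sketch below assumes a Git that new; the authors, the file names, and the commit_as helper are made up for the demo.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Hypothetical helper: stage everything and commit as the given author.
commit_as() {
  git add -A
  git -c user.name="$1" -c user.email="$1@example.com" commit -q -m "$2"
}

printf 'function total($a,$b){\nreturn 1;\n}\n' > legacy.php
commit_as Alice 'Add total()'

# Bob only adjusts spacing on line 1 to match the coding standard.
printf 'function total( $a, $b ) {\nreturn 1;\n}\n' > legacy.php
commit_as Bob 'Apply coding standards'
minor=$(git rev-parse HEAD)

default_author=$(git blame --line-porcelain -L 1,1 legacy.php | sed -n 's/^author //p')
skipping_minor=$(git blame --ignore-rev "$minor" --line-porcelain -L 1,1 legacy.php | sed -n 's/^author //p')

# The same thing with a list of "minor" commits kept in a file (one full hash per line).
echo "$minor" > ignore-revs.txt
from_file=$(git blame --ignore-revs-file ignore-revs.txt --line-porcelain -L 1,1 legacy.php | sed -n 's/^author //p')

echo "default: $default_author, --ignore-rev: $skipping_minor, --ignore-revs-file: $from_file"
```

The difference from the proposal is where the "minor" decision lives: here it is made by whoever runs blame (or maintains the list file), not recorded by the committer at commit time.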

Implementation considerations (wherein I wade way out past my depth)

For this to work, I’m aware of at least three things that would need to change in Git’s internals:

  1. Implement the --minor flag (or whatever) in git commit
  2. Extend the data model of commit objects (where Git stores each commit's metadata) to include optional metadata that means “this is a minor edit”.
  3. Make git blame aware of the “this is a minor edit” metadata and walk as far back through history as needed to find an edit that is not minor.

Number 3 would add a bit of performance overhead to running git blame. I could be way off here, but I doubt that’s a deal breaker.

Number 2 might be, though. Commit objects are super lean: just a reference to a tree describing the current file structure, the commit’s author and committer, the commit message, and a reference to parent commit object(s). Nothing more. Thus, adding metadata to support this type of feature could increase every commit’s size by a significant percentage, and that would add up when applied to an entire repository’s object graph. Would that be justified by the limited utility that a minor edit functionality would add?

Perhaps this isn’t such a big issue, as that “minor” metadata flag could either be set to true, or be nonexistent and implied to be false. This would only take up more hard disk in the cases where minor = true, instead of with 100% of commits.
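To see how little a commit object holds, and how an opt-in marker would only cost bytes in the commits that carry it, here is a rough sketch. It fakes the flag with a line in the commit message; the "Minor-Edit: true" line is a made-up convention that Git itself does not interpret, and the identity and file names are placeholders.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q .
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

echo 'first' > file.txt
git add file.txt
git commit -q -m 'Substantive change'

echo 'first ' > file.txt
# A second -m adds a separate paragraph; the marker name is hypothetical.
git commit -q -a -m 'Apply coding standards' -m 'Minor-Edit: true'

# Dump the raw commit object: a few header lines, a blank line, then the message.
git cat-file -p HEAD

# Header field names only (everything before the first blank line).
headers=$(git cat-file -p HEAD | sed '/^$/q' | cut -d' ' -f1)

# Only commits that opted in carry the marker; every other commit is unchanged.
minor_commits=$(git log --format=%H --grep='Minor-Edit: true')
echo "$minor_commits"
```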

Applying this to WordPress

I wrote this up with Git examples because I’m much more familiar with it, but WordPress still uses SVN for core development, and probably will for some time.

So until and unless WordPress completely migrates to Git, we’d also need an equivalent new “minor edit” feature added to SVN if we were to benefit fully in WP developer land.

Git Submodule Cheatsheet

I’m aware git submodules aren’t awesome, but a lot of what makes them a pain is having to remember the arcane sequence of commands to invoke when using them in a collaborative team project. I’m creating this cheat sheet for my own reference. If you find it useful too, or have suggestions for improving it (or if you spot errors to correct), let me know.

Adding a submodule to a project

$ cd <repo root>
$ git submodule add <readable remote submodule repo> <relative local path to install target>
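For a look at what the add command actually records, here is a sketch that uses a local repo as a stand-in for the remote. The vendor/lib path and repo names are placeholders, and the protocol.file.allow=always setting is only there because recent Git versions refuse local-path submodule clones by default; a real remote URL should not need it.

```shell
set -eu
work=$(mktemp -d)
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# A local repo stands in for the "readable remote submodule repo".
git init -q "$work/lib"
echo 'library code' > "$work/lib/README"
git -C "$work/lib" add README
git -C "$work/lib" commit -q -m 'Initial commit'

git init -q "$work/parent"
cd "$work/parent"
git -c protocol.file.allow=always submodule --quiet add "$work/lib" vendor/lib

# The add stages two things: the .gitmodules file and a "gitlink" entry for the path.
cat .gitmodules
staged=$(git diff --cached --name-only)
mode=$(git ls-files -s vendor/lib | cut -d' ' -f1)
echo "$staged"
echo "$mode"

# Nothing is recorded in the parent project until you commit.
git commit -q -m 'Add lib submodule'
```

Mode 160000 marks the path as a reference to one specific commit of the submodule, not a copy of its files.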

Making changes to a submodule

Here we want to first push our changes to the submodule’s upstream repo, and then record the change in the parent project repo. It’s very important to not skip the first part, as that would break the submodule for other developers when they pull changes to the parent project with an updated reference to a nonexistent state of the submodule repo.

$ cd <submodule path>
# do stuff
$ git commit ... # Changes committed to submodule; parent repo only recognizes that submodule's commit has changed
$ git push ... # Push submodule changes; parent repo unaffected
$ cd <anywhere within parent repo>
$ git commit ... # Commit updated reference to new submodule state (reference by commit)
$ git push ...
# Tell your fellow coders to be sure and update submodules when they next pull.
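If you worry about forgetting the first push, git push has a --recurse-submodules=check option that refuses to push the parent while a referenced submodule commit is missing from the submodule's remote. The sketch below tries it with local bare repos standing in for both upstreams; all paths and names are placeholders, and protocol.file.allow=always is only needed because the demo clones from local paths.

```shell
set -eu
work=$(mktemp -d)
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# Bare repos stand in for the submodule's upstream and the parent's upstream.
git init -q "$work/lib-src"
echo 'v1' > "$work/lib-src/README"
git -C "$work/lib-src" add README
git -C "$work/lib-src" commit -q -m 'lib v1'
git clone -q --bare "$work/lib-src" "$work/lib.git"
git init -q --bare "$work/parent.git"

# Parent project with the submodule, pushed once so the upstream is in sync.
git init -q "$work/parent"
cd "$work/parent"
git remote add origin "$work/parent.git"
git -c protocol.file.allow=always submodule --quiet add "$work/lib.git" vendor/lib
git commit -q -m 'Add lib submodule'
git push -q origin HEAD

# Change the submodule and record the new reference in the parent,
# but "forget" to push the submodule first.
echo 'v2' > vendor/lib/README
git -C vendor/lib commit -q -am 'lib v2'
git add vendor/lib
git commit -q -m 'Update lib reference'

if git push -q --recurse-submodules=check origin HEAD 2>/dev/null; then
  guarded=no
else
  guarded=yes   # push refused: the submodule commit is not on any remote yet
fi
echo "parent push blocked: $guarded"

# Correct order: submodule first, then the parent.
git -C vendor/lib push -q origin HEAD
git push -q --recurse-submodules=check origin HEAD
```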

Cloning a repo with submodules for the first time

After cloning the repo, initialize and update your submodules. git submodule init registers the submodules listed in .gitmodules in your local repo configuration. git submodule update then clones each one and checks out the commit that the parent repo references.

$ git clone ...
$ git submodule init
$ git submodule update
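Two shortcuts I believe are equivalent to the steps above: git submodule update --init folds the init and update into one command, and git clone --recurse-submodules does everything at clone time. Here is a sketch with local stand-in repos; paths and names are placeholders, and protocol.file.allow=always is only needed because the demo uses local paths.

```shell
set -eu
work=$(mktemp -d)
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# Stand-in upstreams: a library and a parent project that embeds it.
git init -q "$work/lib"
echo 'library code' > "$work/lib/README"
git -C "$work/lib" add README
git -C "$work/lib" commit -q -m 'Initial commit'
git init -q "$work/parent"
git -C "$work/parent" -c protocol.file.allow=always submodule --quiet add "$work/lib" vendor/lib
git -C "$work/parent" commit -q -m 'Add lib submodule'

# A plain clone leaves the submodule directory empty.
git clone -q "$work/parent" "$work/clone-a"
before=$(ls -A "$work/clone-a/vendor/lib" | wc -l)

# Shortcut 1: init and update in a single command.
git -C "$work/clone-a" -c protocol.file.allow=always submodule --quiet update --init

# Shortcut 2: clone and populate submodules in one go.
git -c protocol.file.allow=always clone -q --recurse-submodules "$work/parent" "$work/clone-b"

ls "$work/clone-a/vendor/lib" "$work/clone-b/vendor/lib"
```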

Pulling commits including updates to submodules

$ git pull ...
$ git submodule update
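To check whether a pull left a submodule behind, git submodule status helps: a leading + means the checked-out submodule commit differs from the one the parent repo references. The sketch below walks through that with local stand-in repos; the "mate" clone, paths, and names are placeholders, and protocol.file.allow=always is only needed because the demo uses local paths.

```shell
set -eu
work=$(mktemp -d)
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
allow='protocol.file.allow=always'

# Upstream library and a parent project that embeds it at vendor/lib.
git init -q "$work/lib"
echo 'v1' > "$work/lib/README"
git -C "$work/lib" add README
git -C "$work/lib" commit -q -m 'lib v1'
git init -q "$work/parent"
git -C "$work/parent" -c "$allow" submodule --quiet add "$work/lib" vendor/lib
git -C "$work/parent" commit -q -m 'Add lib submodule'

# A teammate's clone, fully populated.
git -c "$allow" clone -q --recurse-submodules "$work/parent" "$work/mate"

# Upstream moves on: the library gets a commit and the parent records it.
echo 'v2' > "$work/lib/README"
git -C "$work/lib" commit -q -am 'lib v2'
git -C "$work/parent/vendor/lib" pull -q --ff-only
git -C "$work/parent" add vendor/lib
git -C "$work/parent" commit -q -m 'Update lib reference'

# The teammate pulls. The reference moves, the submodule checkout does not.
cd "$work/mate"
git -c "$allow" pull -q --ff-only
stale=$(git submodule status | cut -c1)   # "+" means out of sync
git -c "$allow" submodule --quiet update
fresh=$(git submodule status | cut -c1)   # " " means in sync
echo "before update: [$stale]  after update: [$fresh]"
```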