The Boston Diaries

The ongoing saga of a programmer who doesn't live in Boston, nor does he even like Boston, but yet named his weblog/journal “The Boston Diaries.”

Go figure.

Thursday, August 07, 2008

A primitive form of fine-grained revision control

Work continues on “Project: Leaflet” and when I last left off, I mentioned that git is nearly perfect for handling the fine-grained revision control.

I'm here to report—it is.

The ability to make changes to one version of “Project: Leaflet” (say, the MySQL version) and then selectively merge changes into the other version (in this case, the PostgreSQL version) isn't that bad with git.

I currently have three repositories for “Project: Leaflet”: the “master” repository with two branches (one for the MySQL version, one for the PostgreSQL version), a working MySQL repository, and a working PostgreSQL repository.

The workflow isn't that bad. I make changes on one of the work repositories, say, the MySQL version:

mysql-work> vi somefile.c # make changes, test, etc
mysql-work> git commit -a # have working version, commit changes

Then, when done there, I go to the master repository:

master> git checkout mysql
Switched to branch "mysql"
master> git pull server-path-to-mysql-work
 [ bunch of output ]
master> git log >/tmp/changes
master> git checkout postgresql
Switched to branch "postgresql"

I then view the changes made, and pick which commits I want to merge:

master> git cherry-pick f290b3e50e4cea1c3ee5e5265faa996943ef8542
 # that large value is the ID of the commit
 # I pick the ones that apply 
 [ bunch of output ]
master> git cherry-pick 574756ffaa10cdc8452b33bf3d0ab8b786395080
 [ bunch of output ]
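
Rather than reading through the full log, one possible way to see which commits on one branch have not yet been applied to the other is git cherry. It compares patches rather than commit IDs, so a change that was already cherry-picked is reported as applied. This is a minimal sketch, not part of the workflow described above, and the function name unmerged_commits is hypothetical:

```shell
# Hypothetical helper: list commits on branch $2 that have no
# equivalent patch on branch $1 yet.
# "git cherry -v" prefixes such commits with "+ " and prefixes
# already-applied (e.g. cherry-picked) ones with "- "; only the
# "+" lines are kept, with the marker stripped.
unmerged_commits() {
    git cherry -v "$1" "$2" | sed -n 's/^+ //p'
}
```

Run in the master repository as `unmerged_commits postgresql mysql`, this would print one line per candidate commit (ID followed by subject) that could then be handed to git cherry-pick.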

Then go to the other work repository, and pull the now-merged changes:

postgresql-work> git pull server-path-to-master
 [ bunch of output ]
postgresql-work> vi somefile.c # make any non-portable changes, test, etc
postgresql-work> git commit -a # have working version, commit changes

Then it's back to the master to pull in the PostgreSQL changes and any non-specific merges that may have come up. I could probably make it smoother, as git is also a revision control toolkit, but so far it's not annoying enough to warrant the work.
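
If the manual cherry-picking ever did get annoying, one small step toward smoothing it might look like the following. It is only a sketch: pick_onto is a hypothetical name, and it assumes it is run inside the master repository with the commit IDs already chosen by hand.

```shell
# Hypothetical helper: switch to branch $1 and apply the remaining
# arguments (commit IDs) to it, in the order given.
# Stops at the first cherry-pick that fails, so a conflict can be
# resolved by hand before continuing.
pick_onto() {
    target=$1
    shift
    git checkout "$target" || return 1
    for commit in "$@"; do
        git cherry-pick "$commit" || return 1
    done
}
```

With the two commits from the example above, the call would be `pick_onto postgresql f290b3e50e4cea1c3ee5e5265faa996943ef8542 574756ffaa10cdc8452b33bf3d0ab8b786395080`.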


Still obsessing over stupid benchmarks …

The problem. The PHP implementation is a lot slower. Embarrassingly slower. Without any caching the Java version is able to do ~6000 queries per second. The PHP counterpart can push through ~850 queries. The implementations are the same. The stats provided by the author of the library are 8000 vs 1200. So about the same as my measurements.

Via reddit.com, Case study: Is PHP embarrasingly slower than Java?

In my ever continuing obsession with stupid benchmarks and optimization, I decided to tackle this particular little problem like I did with Jumble—map everything into memory and avoid disk I/O altogether (well, explicit disk I/O—the system will page in the data implicitly as it's used). This time, the data maps down to an object file about 8½ megabytes in size (all constant data, so pages can be discarded, not paged out), and with that, I was able to get ~100,000 queries per second.

On a 120MHz machine!

It didn't even take all that long to write …

Obligatory Miscellaneous

You have my permission to link freely to any entry here. Go ahead, I won't bite. I promise.

The dates are the permanent links to that day's entries (or entry, if there is only one entry). The titles are the permanent links to that entry only. The format for the links is simple: start with the base link for this site, https://boston.conman.org/, then add the date you are interested in, say 2000/08/01, which makes the final URL:

https://boston.conman.org/2000/08/01

You can also specify the entire month by leaving off the day portion. You can even select an arbitrary portion of time.

You may also note subtle shading of the links and that's intentional: the “closer” the link is (relative to the page) the “brighter” it appears. It's an experiment in using color shading to denote the distance a link is from here. If you don't notice it, don't worry; it's not all that important.

It is assumed that every brand name, slogan, corporate name, symbol, design element, et cetera mentioned in these pages is a protected and/or trademarked entity, the sole property of its owner(s), and acknowledgement of this status is implied.

Copyright © 1999-2024 by Sean Conner. All Rights Reserved.