TechTalksTO is a series of technical presentations here in Toronto. Throughout 2011 they brought us folks from Mozilla, Twitter, Well.ca and other cool places around the internets. They even put on a fantastic all-day event at the not-too-classy-but-so-cool Toronto Underground Cinema, featuring speakers from Disqus, Github, OpsCode, and more. It was a fantastic event! We had dumplings!
Well, not everyone knows that TechTalksTO is the brainchild of two FreshBooks developers: Shey and Jason. They organized this whole thing, found every speaker and arranged every event. They worked their hearts out producing a series of valuable talks for the Toronto technology community. At FreshBooks we are super inspired by their example. Shey and Jason are both great developers, and great team leads, but clearly they had something more to offer than producing great software: they wanted to help build a community of great software development practice right here in Toronto.
We couldn’t be prouder of them. So we gave them a shield. It’ll come in handy during a zombie apocalypse, we figure, and blue goes with everything, right?
As a part of the UTF-8 work we’ve been doing, we have to turn HTML entities (numeric and named character references) into UTF-8 byte sequences. As it turns out, browsers don’t do this the way PHP 5.3’s html_entity_decode() function does.
There are 4 classes of difference:
- Surrogate codepoints
- Windows-1252 characters in the ISO-8859-1 dead zone
- Mathematical angle brackets
- U+0000 (NUL)
The first we can just ignore. Surrogate codepoints aren’t valid in UTF-8 (if you have them, you’ve actually got CESU-8), and we’re going to leave them entirely alone. This means that eventually things might get a bit uglier (instead of seeing a replacement character e.g. �, you’ll see the text of the character reference e.g.
훘). This way we’re not destroying data, even if it wasn’t displaying correctly before.
The second is a bit of fun. All the specs say that numeric character references use Unicode codepoint values. But (apparently) a lot of data contain codepoints in the range U+0080 through U+009F inclusive, which are (largely) non-printing control characters. Browsers rewrite numeric character references for the 27 codepoints which are defined in Windows-1252 in that range anyway. For example, the numeric character reference
’ looks like…
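The first two classes can be illustrated with a short Python sketch. This is not the code from the post (which concerns PHP) but a minimal illustration under stated assumptions: the function name `decode_numeric_refs` is hypothetical, only numeric references are handled, and leaving NUL and out-of-range values untouched is a simplification made here, not something the post specifies. It relies on Python's `cp1252` codec, which leaves exactly five bytes in the 0x80 to 0x9F range undefined, matching the 27 remapped codepoints mentioned above.

```python
import re

# Matches decimal (&#146;) and hexadecimal (&#x92;) numeric character references.
NUMERIC_REF = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")


def decode_numeric_refs(text: str) -> str:
    """Sketch: decode numeric references, remapping 0x80-0x9F via Windows-1252."""

    def replace(match: re.Match) -> str:
        hex_digits, dec_digits = match.groups()
        codepoint = int(hex_digits, 16) if hex_digits else int(dec_digits)

        # Surrogates are not valid in UTF-8: leave the reference text alone.
        if 0xD800 <= codepoint <= 0xDFFF:
            return match.group(0)
        # Assumption for this sketch: NUL and values beyond U+10FFFF are
        # also left untouched rather than decoded.
        if codepoint == 0 or codepoint > 0x10FFFF:
            return match.group(0)
        # C1 range: reinterpret the value as a Windows-1252 byte where defined.
        if 0x80 <= codepoint <= 0x9F:
            try:
                return bytes([codepoint]).decode("cp1252")
            except UnicodeDecodeError:
                # One of the five bytes Windows-1252 leaves undefined.
                return chr(codepoint)
        return chr(codepoint)

    return NUMERIC_REF.sub(replace, text)
```

For instance, `decode_numeric_refs("&#146;").encode("utf-8")` yields the three bytes `b"\xe2\x80\x99"` (U+2019, the right single quotation mark) rather than the two-byte encoding of the control character U+0092.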
We recently ran into an interesting and difficult problem: how do we change a large, heavily-used table without violating our “avoid downtime if possible” mantra?
The following is a slightly expurgated version of the postmortem I sent the rest of the team. Table names have been changed to protect the guilty.
In the Sinistar release’s post-release migrations list, there were two ALTER TABLE migrations slated to make structural changes to the customer_login tables (changing the types of some columns, specifically). When we tested these migrations on a copy of the production system’s data, we discovered that they would have caused those tables to be inaccessible for two and five minutes respectively. Since customers are core to our applications, we determined that this qualified as “downtime” in the release plan.
I prototyped an alternate approach to the customer changes (outlined below) intended to make changes to the table without taking the app offline while they were happening. The alternate approach was considerably more complex than the original ALTER TABLE statement, as well as being considerably slower (~15 minutes for our largest shard, as compared to two minutes); we discussed whether it was worth taking the app offline…
I have a great offer for the first 10 people to jump on this deal.
Every year PayPal throws a huge developer conference, Innovate, in San Francisco. This year they have expanded it to launch X.commerce, the new e-commerce platform from eBay, PayPal, Magento, and GSI Commerce.
That conference is next week, October 12-13 in the Moscone Conference Center in San Francisco. I have 10 free passes. Email me at sunir@freshbooks.com and I’ll send you a coupon code.
And of course I’ll be there representing FreshBooks. If you’re there, I’d love to meet you! Again, shoot me an email at sunir@freshbooks.com.
Last year we purchased the book Refactoring Databases. While the first chapter was preaching to the choir (database migrations are normal around here), there was one extremely valuable gem: the use of database triggers to effect seamless data migrations.
The book targets enterprise environments where there are multiple applications accessing the database concurrently, each with release cycles measured in months and years, and regularly scheduled downtime windows. In contrast, FreshBooks has fewer moving parts (and each part is much smaller), release cycles on the order of weeks (sometimes hours!), and no time is good for downtime. We’re used around the world, and it shows in the server activity logs 24/7.
However, we can still learn a lot from Refactoring Databases. Methods of performing migrations with staggered application releases over the course of months are equally applicable to a normal web app undergoing a 0-downtime deployment with rolling backend restarts to a new version of the code.
Database triggers are a way of telling the database server to react automatically to some other action, like If This Then That. The Big Three operations that change data are INSERT, UPDATE, and DELETE. One example use…
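To make the idea concrete, here is a minimal sketch of triggers that mirror every change from one table into another, which is the general shape of a trigger-based migration. It uses Python's built-in sqlite3 module purely so the example is runnable anywhere. The table names `customer` and `customer_new`, the trigger names, and the function `mirror_demo` are hypothetical, and trigger syntax on other database servers differs in its details.

```python
import sqlite3


def mirror_demo() -> list:
    """Sketch: keep customer_new in sync with customer using three triggers."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE customer_new (id INTEGER PRIMARY KEY, name TEXT);

        -- One trigger per data-changing operation.
        CREATE TRIGGER customer_ai AFTER INSERT ON customer
        BEGIN
            INSERT INTO customer_new (id, name) VALUES (NEW.id, NEW.name);
        END;

        CREATE TRIGGER customer_au AFTER UPDATE ON customer
        BEGIN
            UPDATE customer_new SET name = NEW.name WHERE id = OLD.id;
        END;

        CREATE TRIGGER customer_ad AFTER DELETE ON customer
        BEGIN
            DELETE FROM customer_new WHERE id = OLD.id;
        END;
    """)

    # The application only ever writes to the original table.
    conn.execute("INSERT INTO customer (id, name) VALUES (1, 'Ada')")
    conn.execute("INSERT INTO customer (id, name) VALUES (2, 'Grace')")
    conn.execute("UPDATE customer SET name = 'Ada L.' WHERE id = 1")
    conn.execute("DELETE FROM customer WHERE id = 2")

    rows = conn.execute("SELECT id, name FROM customer_new ORDER BY id").fetchall()
    conn.close()
    return rows
```

Calling `mirror_demo()` returns the contents of the mirrored table, which track the original even though no statement wrote to `customer_new` directly.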