ZFS on FUSE

Jeff Bonwick pointed out that ZFS is going to be ported to FUSE (filesystem in user space). This is being done as part of the Google Summer of Code by Ricardo Manuel da Silva Correia.

For more info check out the Google SoC application and the ZFS on FUSE/Linux Blog. Although this project specifically mentions FUSE on Linux I hope that it will work with FUSE for FreeBSD.

I don’t know what sort of performance penalty is involved with FUSE, but it would be darn cool to have this work with FreeBSD.

Misunderstanding Foreign Keys

Curtis Poe takes a turn at shooting down people who think foreign keys aren’t important with an O’Reilly blog post: Misunderstanding Foreign Keys. Curtis seems to have been inspired by this Are Foreign Keys Worth Your Time? blog post. It looks like Ruby on Rails is partly to blame for this attitude by not properly supporting foreign keys. That points back to MySQL and its off-again, on-again history with foreign keys.

It is unfortunate that there are people who think that databases without foreign keys should be the norm and not the exception.

Catching errors and problems at the application level is a good thing; by all means please do so. But don’t, ever, EVER, use that as an excuse to not have proper checks in place at the database layer. I’d liken this to the same people who feel that they don’t need to validate web data on the server side because they are doing that on the browser side with JavaScript. So what happens when someone comes along who has JavaScript turned off? So much for your data validation. The same thing is true of the database and application relationship. What happens when someone writes a new tool for accessing your database and all of your application level checks are skipped?

Very bad things, that’s what.
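To make that concrete, here is a minimal sketch of a database-level check catching a write from a tool that skips the application's validation. It uses Python's built-in sqlite3 module purely for illustration; the authors and posts tables are hypothetical, and the PRAGMA line is specific to SQLite (other databases enforce declared foreign keys without it).

```python
import sqlite3


def demo_foreign_key_enforcement():
    """Return (rejected, post_count) after attempting an orphaned insert."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this is enabled per connection.
    conn.execute("PRAGMA foreign_keys = ON")
    # Hypothetical schema: every post must point at an existing author.
    conn.execute(
        "CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE posts ("
        " id INTEGER PRIMARY KEY,"
        " author_id INTEGER NOT NULL REFERENCES authors(id),"
        " title TEXT NOT NULL)"
    )
    conn.execute("INSERT INTO authors (id, name) VALUES (1, 'alice')")
    conn.execute("INSERT INTO posts (author_id, title) VALUES (1, 'valid post')")

    # Simulate a new tool that bypasses the application level checks
    # and tries to insert a post for an author that does not exist.
    try:
        conn.execute(
            "INSERT INTO posts (author_id, title) VALUES (999, 'orphan post')"
        )
        rejected = False
    except sqlite3.IntegrityError:
        rejected = True

    post_count = conn.execute("SELECT COUNT(*) FROM posts").fetchone()[0]
    conn.close()
    return rejected, post_count


if __name__ == "__main__":
    print(demo_foreign_key_enforcement())
```

The bad row never makes it into the table, no matter which client issued the insert. That is the whole point of keeping the check in the database.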

SQL Server Equivalent To MySQL And PostgreSQL Limit Clause

Welcome to another episode of “blogging to remember something that I don’t do often enough to remember off the top of my head but now I’ll remember to go search my blog for the answer the next time that I have to do this”. Today’s episode involves coming up with the equivalent of LIMIT from MySQL and PostgreSQL in Microsoft SQL Server (2000 in this case, I’m not sure if this has changed in 2005). Although the LIMIT syntax doesn’t appear to be part of the SQL standard syntax, I prefer it over the more verbose methods that are mentioned in the standard. Using LIMIT is easy and very handy, I wish other database vendors would pick up on it.

SQL Server has a clause called TOP that looks like this:

SELECT TOP n *
FROM tablename
ORDER BY key

This is fine, but it doesn’t support offsets. To get the TOP n rows with an offset of x you’ll have to make use of a sub-select along the lines of:

SELECT TOP n *
FROM tablename
WHERE key NOT IN (
    SELECT TOP x key
    FROM tablename
    ORDER BY key
)
ORDER BY key

So listen up Oracle, DB2 and SQL Server, the LIMIT clause is a lot easier to use so set egos/NIH (not invented here) aside and adopt the LIMIT syntax. Please. Pretty please. Pretty please with sugar on TOP.

UPDATE 23 May 2006 @ 10:30am : For those of you who aren’t familiar with the LIMIT clause in PostgreSQL and MySQL, it allows you to limit the number of rows returned. To get the first 10 results you’d do something like this:

SELECT *
FROM tablename
ORDER BY key
LIMIT 10

But what if your app was paging result sets, displaying 10 at a time? With LIMIT you can provide an OFFSET that allows you to skip ahead. Here is an example of getting another 10 rows, with an OFFSET of 100:

SELECT *
FROM tablename
ORDER BY key
LIMIT 10 OFFSET 100

Simple and to the point. Thanks to Scott (see comment #1) for reminding me that not everyone reading this was already familiar with the LIMIT clause.
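As a sanity check on the idea that the NOT IN sub-select is just a clumsier way of expressing an offset, here is a rough sketch that runs both forms against the same data and compares the pages. It uses Python's sqlite3 module, which is an assumption made for convenience: SQLite has no TOP clause, so LIMIT stands in for TOP inside the sub-select form. The table contents and helper function names are made up for the demo.

```python
import sqlite3


def build_demo_table(row_count=250):
    """Create an in-memory table named tablename with keys 1..row_count."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tablename (key INTEGER PRIMARY KEY)")
    conn.executemany(
        "INSERT INTO tablename (key) VALUES (?)",
        [(k,) for k in range(1, row_count + 1)],
    )
    return conn


def page_with_offset(conn, page_size, offset):
    """The LIMIT ... OFFSET form from the post."""
    rows = conn.execute(
        "SELECT key FROM tablename ORDER BY key LIMIT ? OFFSET ?",
        (page_size, offset),
    )
    return [row[0] for row in rows]


def page_with_subselect(conn, page_size, offset):
    """The sub-select form: skip the first `offset` keys via NOT IN.

    LIMIT plays the role of SQL Server's TOP here, since SQLite
    does not support TOP.
    """
    rows = conn.execute(
        "SELECT key FROM tablename"
        " WHERE key NOT IN ("
        "     SELECT key FROM tablename ORDER BY key LIMIT ?"
        " )"
        " ORDER BY key LIMIT ?",
        (offset, page_size),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = build_demo_table()
    print(page_with_offset(conn, 10, 100))
    print(page_with_subselect(conn, 10, 100))
```

Both functions should hand back the same ten keys. Note that the outer query needs its own ORDER BY for the page to be deterministic, which is easy to forget in the sub-select version.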

Happy Birthday To Me

Yesterday (21 May) was my 33rd birthday. Thanks to Google I learned that Arthur Ignatius Conan Doyle (of Sherlock Holmes fame) and I almost share a birthday, his is today (22 May). I missed it by one day and 114 years :-)

Ethernet and I are also pretty close, if you count Robert Metcalfe’s Ethernet memo, dated 22 May 1973, as its original birth. Based on that I am one day older than Ethernet.

1973 was also the year that Unix was being rewritten in C.

Information Society Reconstituted

If you’ve been reading this blog for a while you already know that I’m a fan of Information Society. So I was pretty excited to hear Kurt Harland announce that the band name Information Society had been turned over to Paul Robb, who is reconstituting the band at informationsociety.us. Unfortunately Kurt will not be involved, so the new Information Society consists of Paul Robb, James Cassidy, Christopher Anton (vocals) & Sonja Meyers (keyboards).

I want to have high hopes for the new Information Society. I’ve started going through the LiveJournal Information Society community and their MySpace music site, which has a few tracks that you can listen to. They also have a MySpace blog, but the only thing there is the original announcement.

Only time will tell if Christopher Anton will be able to do well in Kurt Harland’s shoes. If they do put out a new CD I’ll likely buy it just to see how it turned out.

If you long for the good old days there are lots of Information Society videos on YouTube.

FlightAware: FreeBSD and PostgreSQL

BSD Talk #42 has an interview with Karl Lehenbauer about FlightAware.com. FlightAware tracks flight information, so for example here is their page on live flights to and from Sacramento Executive Airport. There is a lot of information that they are making available for free. Here is an outline of some of the more interesting bits that were mentioned by Karl during the interview.

All of FlightAware’s systems are 64-bit AMD based computers running FreeBSD 6.x, specifically the FreeBSD/amd64 port. They use PostgreSQL for the database back end. Slony 1 is being used to replicate data. Hard drives are in a RAID 1 (mirroring) configuration using 3ware controllers.

Now for some numbers:

  • Receiving the data and processing it puts them about 6 minutes behind real time
  • Generating one map can be done in about 160 milliseconds of CPU time
  • Capable of generating several million maps a day
  • About 1 TB of stored data
  • Approximately 40 million position updates on aircraft per day

PostgreSQL wasn’t able to keep up with the updates so they wrote a memory resident database to service queries. I’m still not exactly clear on what the relationship is between PostgreSQL and their memory resident database, which uses about 1 GB of RAM.

Nice to see a company putting FreeBSD and PostgreSQL to good use. I’m curious about the 40 million inserts per day number. Bring on the math!

  • 40,000,000 inserts per day
  • 40,000,000 / 24 = 1,666,667 inserts per hour
  • 1,666,667 / 60 = 27,778 inserts per minute
  • 27,778 / 60 = 463 inserts per second

So that boils down to about 463 inserts per second on average. I’d expect that their actual peak requirements are much higher than that (perhaps two or three times that number?). That is just data that they are receiving, that doesn’t include the queries being run against their system to power the website. This brings up another question, how much bandwidth do they have dedicated to receiving these updates? It is possible that each individual update is fairly small (lat, long, src, dest, flight id, airline, plane type, etc) so that might not be too bad. Even at 256 bytes per update, doing 40 million of those a day adds up very quickly.
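For anyone who wants to poke at these numbers, here is the same back-of-the-envelope math as a small Python sketch. The 40 million figure is from the interview; the 256 bytes per update is only my guess from above, not a number FlightAware gave, and the function names are just for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_rate_per_second(events_per_day):
    """Average events per second, assuming they are spread evenly."""
    return events_per_day / SECONDS_PER_DAY


def daily_volume_bytes(events_per_day, bytes_per_event):
    """Total payload per day, ignoring protocol overhead."""
    return events_per_day * bytes_per_event


if __name__ == "__main__":
    updates_per_day = 40_000_000    # from the interview
    assumed_bytes_per_update = 256  # a guess, not a published figure

    rate = average_rate_per_second(updates_per_day)
    volume = daily_volume_bytes(updates_per_day, assumed_bytes_per_update)
    bandwidth = rate * assumed_bytes_per_update  # average bytes per second

    print(f"{rate:.0f} updates/sec on average")
    print(f"{volume / 1e9:.2f} GB/day at {assumed_bytes_per_update} bytes each")
    print(f"{bandwidth * 8 / 1e6:.2f} Mbit/sec average inbound")
```

Under that 256 byte assumption the feed works out to a bit over 10 GB a day, which is just under 1 Mbit/sec when averaged out. Peak traffic would of course be higher than the average.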

Right now their website provides the following numbers:

Currently Tracking

Tracking 4,986 airborne aircraft (224 VFR) with 21,205,488 total flights in the database.
FlightAware has tracked 48,940 arrivals in the last 24 hours.

Cool stuff.

Google Trends

Just in case you missed it, Google had a press day yesterday where they announced Google Co-op, Google Desktop 4, Google Notebook (which hasn’t been released yet, but most folks are guessing this is based on Writely) and Google Trends. Out of all these, the one I’m most interested in is Google Trends.

Traditionally Google digests information and only rarely offers ways for the public to view details about information it has collected. With the release of Google Trends we finally get a chance to see what people have been looking for. You can compare search terms and narrow it down by time and region. Unfortunately their graphs don’t expose raw numbers, only the relative number of searches over time. Although the United States is included in the region search results, you can’t filter on it. Despite these shortcomings I think Trends will be a hit and likely spawn a whole new round of Google games.

The obvious use of Trends is to compare what people are looking for among related products. An example of this would be intel vs. amd or dell vs. hp. Another trend to look at is how searches for a product compare to searches for its replacement, like powerbook vs. macbook.

One thing to be careful of is that Google will start returning 403 Forbidden (HTTP status codes) if you attempt searches in rapid succession. I think this is rather lame, especially given the excuse they provide:

    We're sorry...

    ... but your query looks similar to automated requests from a
    computer virus or spyware application. To protect our users, we
    can't process your request right now.

    We'll restore your access as quickly as possible, so try again
    soon. In the meantime, if you suspect that your computer or network
    has been infected, you might want to run a virus checker or spyware
    remover to make sure that your systems are free of viruses and
    other spurious software.

    We apologize for the inconvenience, and hope we'll see you
    again on Google.

Given Google’s failure to prepare for previous product launches (remember how long Google Pages and Google Analytics were down after their initial debut?) this may be their way of preventing Google Trends from being crushed after only being available for one day. Is Google really concerned about people writing viruses that infect systems with the purpose of being able to issue a large number of queries to Google Trends? If so why aren’t they doing the same thing for their home page? Time to call Google out, while Trends is cool, their effort to “protect … users” is lame.

Joyent/TextDrive's Really, Really, Really Bad Netperf Benchmarks

I came across a Scales With Rails presentation at http://www.scalewithrails.com/downloads/ScaleWithRails-April2006.pdf that focused on scaling Ruby on Rails on Solaris instead of FreeBSD. I’m assuming that they were using FreeBSD at least in part because Joyent had purchased TextDrive which uses FreeBSD. So I’m just cruising through this PDF to see what sort of things they mention and then things got very strange on page 40 of the PDF.

Before I get into the details of page 40 go take a look at page 39. In this section they are talking about the limitations of Ethernet, specifically the gigabit variety. Page 39 provides the simple math, 1Gbps = 125MB/sec and 100Mbps = 12.5 MB/sec. So with that in mind take a look at page 40, where you find results of two netperf runs, one from FreeBSD to FreeBSD and another from Solaris to Solaris, both using gigabit ethernet. These results look to be quotes from Netperf benchmarks of FreeBSD versus Solaris. I won’t comment on the Solaris numbers as I don’t have much experience with Solaris and I don’t have any Solaris systems to run similar tests on. I do however have some FreeBSD systems to run these tests on and I have to say that I find their FreeBSD results to be a complete joke. I’m almost tempted to call them a lie. I’m specifically interested in the FreeBSD to FreeBSD over a switch results, which they show as:

pacific# /usr/local/netperf/netperf -fM -H private.comox.textdrive.com -tTCP_STREAM -- -m1472
TCP STREAM TEST to private.comox.textdrive.com
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    MBytes/sec

65536  32768   1472    10.02       4.86

So out of the theoretical 125MB/sec (which you’ll never be able to reach, but that is another story) they were only able to get 4.86MB/sec. Only 3.888% of the absolute max? I was immediately suspicious of these numbers, so I installed netperf on two old FreeBSD boxes. The client box is a Dell Optiplex GX300 (P3-800 256MB) running FreeBSD 6.1-RC1 with a 100Mbit network card. The server side for my netperf test (neptune) is a Dell PowerEdge 4400 (P3 XEON-800 512MB) running FreeBSD 6.1-BETA1 with a 100Mbit network card. These are plain installs of FreeBSD on old hardware, nothing exciting. The network path between these two systems runs through two Netgear gigabit switches. I ran the same test they did more than a dozen times and always got results back at more than 11MB/sec. Here is the fastest result from my test runs:

> /usr/local/netperf/netperf -fM -H neptune -tTCP_STREAM -- -m1472
TCP STREAM TEST to neptune
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    MBytes/sec

 65536  32768   1472    10.00      11.21

Here is the slowest result from my test runs:

> /usr/local/netperf/netperf -fM -H neptune -tTCP_STREAM -- -m1472
TCP STREAM TEST to neptune
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    MBytes/sec

 65536  32768   1472    10.00      11.07

I don’t understand how they got 4.86MB/sec with gigabit and I got 11.07MB/sec with 100Mbit ethernet. The difference is just huge, 6.21MB/sec, well over twice their throughput. Unfortunately I don’t have spare FreeBSD systems with gigabit network cards to test against, but based on my 100Mbit tests I’m going to assume that they would be much faster than TextDrive’s 4.86MB/sec. What may be even more disturbing is that the folks at TextDrive and Joyent have been running their systems in this condition and didn’t see any large red flags go into the air indicating that they were doing something wrong.
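To put the two sets of results on the same footing, here is a quick sketch that expresses each measured throughput as a percentage of its link's theoretical maximum, using the same 1Gbps = 125MB/sec conversion from page 39 of the PDF. The function names are mine; the throughput figures are the ones quoted above.

```python
def link_capacity_mbytes_per_sec(link_mbits_per_sec):
    """Theoretical ceiling: 8 bits per byte, no protocol overhead."""
    return link_mbits_per_sec / 8


def utilization_percent(measured_mbytes_per_sec, link_mbits_per_sec):
    """Measured throughput as a percentage of the theoretical ceiling."""
    capacity = link_capacity_mbytes_per_sec(link_mbits_per_sec)
    return 100 * measured_mbytes_per_sec / capacity


if __name__ == "__main__":
    results = [
        ("TextDrive, gigabit", 4.86, 1000),
        ("mine (slowest), 100Mbit", 11.07, 100),
        ("mine (fastest), 100Mbit", 11.21, 100),
    ]
    for label, mbytes_per_sec, link_mbits in results:
        pct = utilization_percent(mbytes_per_sec, link_mbits)
        print(f"{label}: {mbytes_per_sec} MB/sec = {pct:.2f}% of theoretical max")
```

By that measure my old 100Mbit boxes are running at roughly 88 to 90 percent of what the wire can carry, while the gigabit result sits at under 4 percent.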

A few minutes looking around on Google revealed that this came from Jason Hoffman on a post to TextDrive’s blog: Comparative netperf (network performance) of FreeBSD versus Solaris on identical hardware. I’d leave a comment for Jason on that post but there doesn’t appear to be any way to leave a comment. A post on the Joyent blog: Can you check my math? (Perhaps we’ll also call this, “What’s in a web server”?) posted on the same day, also by Jason, does seem to allow for comments so I’ll leave one pointing to this post.

I just can’t get over how completely bogus their numbers are. I can’t believe that they would then use them in a claim about how poorly FreeBSD does in netperf tests. Just amazing.

If you know what is going on here please let me know. If I’ve done something wrong in my tests please point it out, because right now it looks like Jason Hoffman/TextDrive/Joyent need to go hire new sysadmins.