SacStarts.com

At the beginning of January, Adam Kalsey launched SacStarts.com. The site grew out of Adam and Scott Hildebrand looking for tech get-togethers in the Sacramento/Davis area.

Now there are informal monthly dinners for tech heads and entrepreneurs to get together and chat. The February dinner is on the 22nd, 7pm at Hoppy Brewing Company on Folsom Blvd. If you are in the area, sign up; the only cost is your own dinner.

I’ve been to many of the dinners they’ve had so far and the conversation has always been interesting and informative.

In June there are plans for a BarCamp Sacramento.

Wikipedia Turns on Nofollow for Links

Word is that Wikipedia has turned on nofollow on links at the direction of Jimbo Wales. In the past Wikipedia voted to remove nofollow from links. There is more discussion of this at Slashdot, Network World and Google Blogoscoped.

The bottom line? This will do nothing to stop people from spamming pages at Wikipedia. Two years ago (almost to the day), nofollow was announced by Google and friends as a way to stop comment spam. From what I’ve seen, comment spam is still on the rise.

If you want to stop/prevent Wikipedia page spam then only apply techniques that actually work to achieve that goal. Using nofollow does nothing towards accomplishing that goal.

IBM developerWorks: Convert XML to JSON in PHP

IBM developerWorks recently posted an article showing how to Convert XML to JSON in PHP. There are a few things in this article that I wanted to mention.

The first one is the "Browser-side data processing" section. Let me repeat it one more time: JSON Does Not Require eval( ).

Next up, why limit yourself to just XML to JSON conversion? The next most obvious conversion is JSON to XML. And what about JSON with callbacks? Since all of this is being done in PHP there is another, perhaps less obvious, conversion that we can do: serialized PHP to XML and/or JSON. There is no reason why we shouldn’t be able to convert between all three of these formats in any order we want.

Back in November I wrote a simple library to convert Multiple Output Formats For Web APIs. The OutputFormat class (http://josephscott.org/code/php/outputformat/outputformat.tgz) is a simple wrapper that formats data as a PHP array, serialized PHP, XML, JSON and JSON with callback. Here’s an example:

<?php
require("./OutputFormat.php");

$php_array = array(
	"person"	=> array(
		"name"	=> "Joseph Scott",
		"age"	=> 33,
		"url"	=> "http://joseph.randomnetworks.com/"
	)
);

$of = new OutputFormat();

$json_array = $of->arrayToJSON($php_array);
$json_callback_array = $of->arrayToJSON($php_array, "my_js_function");
$serial_array = $of->arrayToSerial($php_array);
$xml_array = $of->arrayToXML($php_array);

print_r($php_array);
print("\n\n");

print($json_array . "\n\n");
print_r($of->jsonToArray($json_array));
print("\n\n");

print($json_callback_array . "\n\n");
// This conversion will fail because it has a JavaScript callback in it.
print_r($of->jsonToArray($json_callback_array));
print("\n\n");

print($serial_array . "\n\n");
print_r($of->serialToArray($serial_array));
print("\n\n");

print($xml_array . "\n\n");
print_r($of->xmlToArray($xml_array));
print("\n\n");
?>

The code above is from the example file in http://josephscott.org/code/php/outputformat/outputformat.tgz.
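
The failing conversion flagged in the comment above comes down to the callback wrapper not being JSON. Here is a minimal JavaScript sketch of that idea, separate from the OutputFormat class. The helper names wrapWithCallback and stripCallback are made up for illustration, and it assumes a runtime with the built-in JSON object:

```javascript
// Hypothetical helpers for illustration; they are not part of OutputFormat.

// Wrap a value as "JSON with callback": callbackName(<json>);
function wrapWithCallback(value, callbackName) {
  return callbackName + "(" + JSON.stringify(value) + ");";
}

// Remove a leading "name(" and a trailing ")" or ");" so the payload can be
// parsed as plain JSON. Returns null if the text is not in that shape.
function stripCallback(text) {
  var match = /^\s*[A-Za-z_$][\w$]*\s*\(([\s\S]*)\)\s*;?\s*$/.exec(text);
  return match ? match[1] : null;
}

var person = { person: { name: "Joseph Scott", age: 33 } };
var wrapped = wrapWithCallback(person, "my_js_function");

// Parsing the wrapped text directly fails, because a function call is not JSON.
var parsedDirectly;
try {
  parsedDirectly = JSON.parse(wrapped);
} catch (e) {
  parsedDirectly = null;
}

// Parsing works once the wrapper has been stripped.
var parsedPayload = JSON.parse(stripCallback(wrapped));
```

Keeping the wrapper as a separate step means the same payload can be served either way, with the callback added only when a caller asks for it.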

Perhaps developerWorks plans on future articles that will expand on this subject. In the meantime feel free to use my OutputFormat class. As an added bonus it works in both PHP 4 and PHP 5.

Windows Doesn't Like Trailing Periods/Dots

Just in case the mystery of Subversion and trailing spaces wasn’t enough for you, here’s another one. On one of the Windows workstations TortoiseSVN would continue to show a change in a folder where no changes had been made. So we started playing with this to figure out what was going on.

I did the most obvious thing first: I ran svn update on the folder. This brought everything up to date and the icon overlay would go green for about three seconds and then turn red again. I did this three times in a row. The results were the same every time, green for a few seconds and then red. Next we started a commit to see what Subversion thought had changed. It indicated that there was one directory (called misc) that had changed. But nothing in the folder had changed.

Once again it took looking at that directory name for a few minutes before I went and compared it to a checkout of the same repository on a Unix system. Then the light bulb went on! In the repository the directory name had a period as the last character.

This Windows XP system didn’t like seeing a dot as the last character in a directory name and so it removed it. This showed up as a change to Subversion. We’d run svn update and it would restore the directory with the period, which was fine for the couple of seconds it took Windows to notice and change the directory name. So we were chasing our own tail for a while.

I had no idea that Windows XP had a feature that automatically changed the name of a folder. What a completely annoying "feature".

The lesson here is that in at least some circumstances Windows XP doesn’t like having a trailing period/dot in a directory name and will remove it for you automatically.
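
One way to catch this before a Windows checkout trips over it is to scan a list of repository paths for names ending in a dot. The following is a minimal sketch, not something we actually ran at the time. The function name and the sample listing are made up, it assumes forward-slash paths such as those printed by svn list -R, and it also flags trailing spaces on the assumption that Windows drops those as well.

```javascript
// Flag paths containing a segment that ends in a dot or a space.
// Windows tends to drop those trailing characters, so such names
// will not round-trip through a Windows working copy.
function findWindowsUnsafePaths(paths) {
  return paths.filter(function (path) {
    return path
      .split("/")
      .filter(function (segment) {
        // Skip empty segments (from a trailing slash) and "." / "..".
        return segment !== "" && segment !== "." && segment !== "..";
      })
      .some(function (segment) {
        return /[. ]$/.test(segment);
      });
  });
}

// Made-up listing for illustration; directories end in "/".
var listing = ["docs/", "misc./", "misc./notes.txt", "src/main.c", "book /ch1.fm"];
var flagged = findWindowsUnsafePaths(listing);
// flagged is ["misc./", "misc./notes.txt", "book /ch1.fm"]
```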

Subversion Doesn't Like Trailing Spaces

My recent Subversion success hit an old snag the other day. I was importing another FrameMaker book, which seemed to go okay, but checking it out again failed on one directory over and over. It didn’t make any sense, there was nothing unusual about the directory that I could see. I removed the whole book and re-imported it just to see if I missed something. The second import went fine (so I thought), but the checkout failed in the same spot once again.

So I tried all the little variations of re-importing specific directories and checking out specific directories. The failure was always the same and always on the same directory. At one point I stopped and just looked at the error, which included the directory name, for a few minutes. I wondered, could there possibly be some white space on the end of the directory name?

Testing this theory was easy enough: filename completion in my shell confirmed that there was one space at the end of the directory name. Unbelievable! Who would put a space at the end of a directory name!? At this point it didn’t matter, I removed the directory with the space from the repository. This took a couple of tries because the import on that directory didn’t really succeed properly. So even though the error showed up on checkout, it was the importing that actually failed in the first place.

Once the directory with the trailing space was renamed to exclude the space I was able to import it. Checkout also worked normally after that.

So if you have what looks like a normal file or directory that is causing errors on an import or checkout, take a moment to ensure that there are no trailing spaces. Subversion won’t be happy if there are.
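
Staring at an error message is a slow way to spot an invisible character, so here is a rough sketch of a check that could be run on a directory tree before importing it. It assumes Node.js (version 10.10 or later, for the withFileTypes option), and the function name is made up for illustration. Each offending path is returned wrapped in quotes so the stray space is visible.

```javascript
var fs = require("fs");
var path = require("path");

// Recursively collect entries under `root` whose names start or end with
// whitespace. Each result is the full path, JSON-quoted so that leading
// or trailing spaces stand out.
function findWhitespaceNames(root) {
  var found = [];
  fs.readdirSync(root, { withFileTypes: true }).forEach(function (entry) {
    var full = path.join(root, entry.name);
    if (entry.name !== entry.name.trim()) {
      found.push(JSON.stringify(full));
    }
    // Descend into real directories only; symlinks are not followed.
    if (entry.isDirectory()) {
      found = found.concat(findWhitespaceNames(full));
    }
  });
  return found;
}
```

Calling findWhitespaceNames(".") from the top of the tree and printing the result gives a short list of names to rename before running the import.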

Implementing a Queue in SQL

Greg’s Implementing a queue in SQL (Postgres version) is an interesting exercise. The goal was to implement a simple queue that can be managed via SQL. The example is a simple first in, first out (FIFO) queue, with a limit of 5 items.

Two methods are demonstrated to accomplish this. The first makes use of PostgreSQL rules and is very short. The second is a little bit longer and uses a PL/pgSQL function as a trigger. One advantage of the second method is that it lets you know how many rows in your queue were recycled. Both methods require only one table.
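
To make the behaviour concrete, here is a small JavaScript sketch of a size-limited FIFO queue. It is not Greg’s SQL and uses no database. It assumes that "recycled" means the oldest entries are dropped to make room for new ones, and the names BoundedQueue, enqueue and dequeue are made up for illustration.

```javascript
// Bounded FIFO queue: holds at most `limit` items. Adding to a full queue
// discards the oldest items first and reports how many were discarded.
function BoundedQueue(limit) {
  this.limit = limit;
  this.items = [];
}

// Add an item at the tail; returns the number of old items recycled.
BoundedQueue.prototype.enqueue = function (item) {
  this.items.push(item);
  var recycled = 0;
  while (this.items.length > this.limit) {
    this.items.shift(); // drop the oldest item
    recycled += 1;
  }
  return recycled;
};

// Remove and return the oldest item, or undefined if the queue is empty.
BoundedQueue.prototype.dequeue = function () {
  return this.items.shift();
};

var queue = new BoundedQueue(5);
var recycledCounts = [];
for (var i = 1; i <= 7; i++) {
  recycledCounts.push(queue.enqueue("job " + i));
}
// queue.items is now ["job 3", "job 4", "job 5", "job 6", "job 7"]
// recycledCounts is [0, 0, 0, 0, 0, 1, 1]
```

Returning the recycled count from enqueue mirrors the advantage noted for the trigger-based method: the caller learns when old entries were pushed out.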

There’s a queue in MySQL article as well.

Is it practical to implement FIFO queues in a DBMS? Maybe not, but it is a good example of thinking outside the box.

JSON Does Not Require eval( )

With so many discussions about JSON going on lately (like XML Has Too Many Architecture Astronauts from last week) I’m tired of seeing people suggest that JSON is great because you can just eval( ) it. This immediately makes me nervous. Executing remote code that you don’t control just doesn’t strike me as a good idea. You can reduce the risk this brings by running it through the Javascript JSON parser.
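
As a rough illustration of what a parser buys you, the sketch below uses the built-in JSON.parse found in current JavaScript engines. That is an assumption on my part, since a parser of that era would have been a separate script. Well-formed data comes back as an object, while a response that contains code is rejected instead of being run.

```javascript
// A well-formed JSON response parses into plain data.
var goodResponse = '{"Result":{"Timestamp":1168227638}}';
var data = JSON.parse(goodResponse);

// A response carrying code instead of data is rejected by the parser,
// whereas eval( ) would have executed it.
var hostileResponse = 'alert("gotcha")';
var rejected = false;
try {
  JSON.parse(hostileResponse);
} catch (e) {
  rejected = e instanceof SyntaxError;
}
```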

Even better than that though is that you don’t have to use eval( ) at all to make use of JSON. By using callback functions you can avoid using eval( ) completely. Here’s an example of how this works using the Yahoo Time Service API (some lines wrapped for easy reading):

<script type="text/javascript">
function handleJson(json) {
    alert("Timestamp: " + json.Result.Timestamp);
}
</script>
<script type="text/javascript"
src="http://developer.yahooapis.com/TimeService/V1/getTime?
appid=YahooDemo&output=json&callback=handleJson">
</script>

The code from the Yahoo! API looks like:

handleJson({"Result":{"Timestamp":1168227638}});

So if you’ve run away from looking closely at JSON because of all this talk of eval( )’ing remote code then come back and look at callback functions.

While writing this I started learning more about how eval( ) works in JavaScript. Tom Insam has a great blog post on Javascript eval( ). It turns out that eval( ) isn’t a function at all, but an object method. On top of that it is limited to the context that it is used in. Just in case Tom’s post disappears one day I’ve included his examples.

Limit eval( )’ed code to a specific object:

  var code = "var a = 1";
  eval(code); // a is now 1

  var obj = new Object();
  obj.eval(code); // obj.a is now 1

Limit eval( )’ed code to a specific function:

  var a = 2;
  function foo() {
    eval(code); // a is 1, but scoped inside the function
  }
  foo();
  // a is back to being 2 here

Get eval( )’ed code into the global scope from inside a function:

  var a = 2;
  var global = this;
  function foo() {
    global.eval(code); // a is 1 globally now
  }
  foo();
  // a is now 1

So even if you don’t want to use eval( ) for remote scripts, now you’ve got a few tricks for using eval( ) in your own scripts.
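
The object-method form in the examples above depends on the JavaScript engine. As a hedged sketch of the same scoping idea using only the standard global eval( ), the following contrasts a direct call with an indirect one. It assumes an ES5-or-later engine that also provides globalThis, and the names scopeDemo, directEval and indirectEval are made up for illustration.

```javascript
// A global value and a local value that share a name.
globalThis.scopeDemo = "global";

function directEval() {
  var scopeDemo = "local";
  // A direct call to eval sees the local scope of the calling function.
  return eval("scopeDemo");
}

function indirectEval() {
  var scopeDemo = "local";
  // Reaching eval through any other expression makes the call indirect,
  // and an indirect eval runs in the global scope.
  return (0, eval)("scopeDemo");
}

var directResult = directEval();     // "local"
var indirectResult = indirectEval(); // "global"
```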

Akismet, One Year Later

On January 5th, 2006 I turned on the Akismet WordPress plugin for filtering comments and trackbacks. It wasn’t perfect (a one-month review showed ~0.1875% false positives) and went down a couple of times, but after one year I’d say it is good enough. After a couple of months the volume of spam was so high that I stopped going through the spam queue, hoping that anyone with a comment that never showed up would contact me to tell me about it.

Any comments marked as spam by Akismet are deleted after 15 days. That doesn’t seem like a very long time, but on this blog that queue is over 10,000 comments lately. I can’t imagine having to review more than 20,000 comments every month. Ugh.

After one year my WordPress dashboard indicates that Akismet has blocked more than 112,000 comments and trackbacks as spam.

UPDATE Sat 6 Jan 2007 @ 7:30am: Wouldn’t you know it, the morning after writing this Akismet let some 50 or so obvious spam comments get through. Looks like they were having problems around 3am; all of these comments came in at about that time. So it isn’t perfect, but I wouldn’t even think of turning it off.