Archive for September, 2007

MySQL Having Clause

Tuesday, September 25th, 2007

Last Friday, I had to run some complicated SQL queries at work. I was dealing with 300k rows of history items that I had to select based on an aggregate function. Unfortunately, WHERE clauses cannot contain the result of an aggregate function. Having clauses are like where clauses, but they’re applied last, and without optimization–and they can contain the result of an aggregate function. Here’s an example:

mysql> SELECT `col1`, COUNT(distinct `col2`) as as distcol2 , COUNT(distinct HOUR(`date`)) as hours FROM `sometable` WHERE foo GROUP BY `col1` HAVING `hours` = 2 and `distcol2` = 2 order by rand() DESC limit 100;

Doctoral Inspirations

Monday, September 17th, 2007

While reading my Gmail yesterday, I saw a sponsored link titled “The PHP Doctor“. Being a self-proclaimed PHP expert myself, I was interested to see exactly how good this PHP Doctor is. He has a blog–a PHP tips and tricks blog. The first post was about functions, and their use in writing modular code. Although the post used a solid example of functions, I couldn’t help but notice a few sloppy performance issues. The following are tips that I have created after reading that post, and the second post on the page.

  1. Where possible, do not use functions in the entrance condition for loops. Consider the following example: (From The PHP Doctor*)
    for( $x = 0 ; $x < strlen( $InString ); $x++ ) {
        if (strstr( $OKChars , $InString[$x] ) ) $OutString .= $InString[$x];
    }

    Note that for every iteration of that loop, the value strlen( $InString ) remains the same. However, on each iteration of the loop, the function is run. To avoid running this function each iteration, the value should be assigned to a variable before the first iteration:

    for ( $x = 0,$c = strlen( $InString ); $x < $c; $x++) {
        if ( strstr( $OKChars , $InString[$x] ) ) $OutString .= $InString[$x];
    }
  2. ++$i, $i++, $i += 1, and $i = $i + 1 all have the same basic effect, but they are indeed different. Using using $i++ or ++$i is faster than using$i += 1 or $i = $i + 1. For those who don’t remember, $i++ and ++$i are respectively, the post increment and the pre increment operators. The difference between the two is that pre increment increments $i, then returns the incremented value, whereas post increment returns the value of $i, then increments $i.
  3. Although PHP doesn’t really care, try to avoid implicitly casting numerical types to strings when it is not necessary. Consider a function that takes a numer, $Number, and rounds it to the nearest hundredth. (Note: this can be done via PHP’s round function: round($Number,2)). Here is an example from The PHP Doctor*:

    function roundToTwo( $Number ) {
        //pow(10, -3) = .001
        $Number += pow( 10 , -3 ); //Make sure the thousandths place is populated
        $Number = round( $Number * pow( 10 , 2 ) ) / pow( 10 , 2 );
        //Make sure the thousandths place is populated
        $Number += pow( 10 , -3 ); //pow() returns float or int (false on error)
        //Take the decimal point, and the following two characters.
        $Number = substr( $Number , 0 , strpos( $Number , '.' ) + 3 );
        return $Number;
    }

    Rounding is a mathematical function, but the preceding function relies on properties of strings. One downside of this is performance; another is formatting: roundToTwo() will return a string, and because strings have no concept of significant digits, non-significant digits will be returned. (return value could be something like 123.00) Anyway, with a little more thought, we can find a way to do this without using strings. When round is passed only one argument, it will round the argument to the nearest integer. It follows that round($Number*100)/100 will function the same as roundToTow($Number). But suppose we want to avoid using round()… Recall that if we are rounding to n decimal places, we round up when the digit in the (n+1)th decimal place is greater than or equal to 5, and down otherwise. Adding 5 to this digit will accomplish this. Thus, we can now use the floor() function instead of round(): floor($Number*100 + .5)/100.
  4. In the previous example, not the use of pow(x,y) where x and y are constant. For better performance, calculate the value pow(x,y), and use that value in your program. In some cases, this will also lead to more readable code.