Refactoring a few Jolt tests

Posted by Vic Cherubini on March 09, 2010

ZDC is a much more talented developer than I am, so I frequently throw my code his way for review. In reviewing some of the tests I wrote for Jolt, he had some criticisms. One test in particular, testRouteNamesAreValid(), was quite smelly.

It was incredibly verbose and tested two things at once: valid named routes and invalid named routes. He suggested I refactor it to use a data provider and split it into two distinct tests. The updated code is much more concise, easier to read, and more descriptive.

Read the new unit tests to see what I changed. Adding a new route to test as either valid or invalid is now simply adding another element to the array in either of the data providers.
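
To illustrate the data provider idea, here is a minimal sketch in plain PHP so it runs without PHPUnit. The validator isValidRouteName(), the provider names, and the countFailures() stand-in for the test runner are all hypothetical; the only rule assumed is that a named route must start with a "/". In PHPUnit itself, each provider would be attached to its test method with the @dataProvider annotation.

```php
<?php
// Hypothetical validator: the only rule assumed here is that a named route starts with "/".
function isValidRouteName($route) {
  return is_string($route) && strlen($route) > 0 && '/' === $route[0];
}

// Data providers: each element is one argument list for the test.
function providerValidRouteName() {
  return array(
    array('/create-message'),
    array('/send-message/%d/%d')
  );
}

function providerInvalidRouteName() {
  return array(
    array('create-message'),
    array('')
  );
}

// Stand-in for the test runner: feed every row of a provider to the check
// and count the rows that do not produce the expected result.
function countFailures(array $rows, $expected) {
  $failures = 0;
  foreach ( $rows as $row ) {
    if ( $expected !== isValidRouteName($row[0]) ) {
      $failures++;
    }
  }
  return $failures;
}

echo countFailures(providerValidRouteName(), true) . PHP_EOL;
echo countFailures(providerInvalidRouteName(), false) . PHP_EOL;
```

Covering another route then means adding one more array element to the relevant provider, with no new test code.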

I used this method in a few other test classes as well to test that exceptions are being thrown properly.

Writing Unit Tests has definitely tested my discipline as a developer more than any other tool I’ve ever used. As I’m becoming more familiar with them, they’ll become second nature like using source control. If you’re not using them, do yourself a favor and add them to your toolbelt.

Jolt Routing

Posted by Vic Cherubini on March 07, 2010

This is a valid article, and is considered technically accurate up to March 7, 2010.

Over the weekend, I worked on the routing code for Jolt. Following a strict TDD design, I wrote the tests first and then wrote the code to match the tests.

I primarily worked on the Named Route code. Jolt will have two types of routes: Named and RESTful. Named Routes are exactly that: they’re explicitly specified and can be routed to any controller and action. They do have a specific format, and Jolt won’t let you create a route if the format is invalid.

Named Routing

Take the example of a simple messaging application. You want to send messages to the server, but don’t wish to use a purely RESTful interface. Simply create a create-message and send-message route and you’re on your way.

namespace Jolt;
 
$router_config = array(
  'site_root' => 'http://joltcore.org',
  'site_root_secure' => 'https://joltcore.org'
);
 
$named_route_list = array(
  new Named_Route('/create-message', 'Message', 'create', Route::METHOD_POST),
  new Named_Route('/send-message/%d/%d', 'Message', 'send', Route::METHOD_POST)
);
 
$router = new Router();
$router->setConfig($router_config);
$router->setNamedList($named_route_list);
$router->dispatch();

All calls to http://joltcore.org/create-message through a POST method, along with any associated data (which the controller is responsible for loading), will be passed to the Message controller and the create() action. If the route is requested through GET, an HTTP 400[#1] will be returned because it is a bad request.

Additionally, a call made to http://joltcore.org/send-message/10/18 through a POST method will call the Message controller and the send() action, which would, for example, send the last created message from User #10 to User #18.

As you can see, Named Routes are clearly very powerful. All named routes must start with a “/” and can be nested infinitely. Values can be replaced with normal printf() replacements: %d for digits, %s for strings would be the most common.
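
As a rough illustration of how printf()-style replacements could be matched against a request path, here is a minimal sketch. The function matchNamedRoute() is hypothetical and is not Jolt's implementation; it assumes %d captures a run of digits and %s captures a single path segment.

```php
<?php
// Hypothetical helper, not Jolt's implementation: match a request path against
// a printf-style route and return the captured values, or null on no match.
function matchNamedRoute($route, $path) {
  // Escape regex metacharacters in the literal parts of the route.
  $regex = preg_quote($route, '#');
  // Assumption: %d captures a run of digits, %s captures one path segment.
  $regex = str_replace(array('%d', '%s'), array('(\d+)', '([^/]+)'), $regex);

  if ( 1 === preg_match('#^' . $regex . '$#', $path, $matches) ) {
    // Element 0 is the whole match; the rest are the captured values.
    return array_slice($matches, 1);
  }
  return null;
}

print_r(matchNamedRoute('/send-message/%d/%d', '/send-message/10/18'));
var_dump(matchNamedRoute('/send-message/%d/%d', '/send-message/ten/18'));
```

The captured values come back as strings, so a router built this way would still need to cast them before handing them to the action.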

RESTful Routing

RESTful routing is very powerful, but less flexible than Named routing. RESTful routing lets you define a route and a resource. From there, Jolt handles everything else. Assuming the setup from above, adding a RESTful route for managing Users would be simple.

$restful_route_list = array(
  new Restful_Route('User')
);
 
$router->setRestfulRouteList($restful_route_list);

By defining the User resource as RESTful, several methods will be automatically available.

  • http://joltcore.org/user GET – Get a list of all User models (objects).
  • http://joltcore.org/user/1 GET – Retrieve data for a specific user.
  • http://joltcore.org/user POST – Create a new user.
  • http://joltcore.org/user/1 PUT – Update a user’s data.
  • http://joltcore.org/user/1 DELETE – Delete a user.
  • http://joltcore.org/user DELETE – Delete all users.

You’ll be expected to write the actual controller to handle these requests. Following a formal RESTful interface, you should not add additional resources. If they are required, please use a Named route.

All routes will take into account the Accept and Content-Type HTTP headers. The resulting controller will then use these to determine the format of the incoming data and the format of the result data. Additionally, each controller will return the correct HTTP status code.

Routes can accept multiple formats as well. If the Content-Type header specifies the incoming data is text/json and wants the data back as XML (the Accept header specifies text/xml), the controller will be able to handle that.
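
A minimal sketch of how the response format might be chosen from the Accept header follows. The function pickResponseFormat() and its MIME-type table are hypothetical, and the sketch deliberately ignores q-values: it simply returns the first type it recognizes and falls back to HTML.

```php
<?php
// Hypothetical helper: map an Accept header to a short format name.
function pickResponseFormat($accept_header, $default = 'html') {
  // Assumed mapping from MIME type to format name.
  $formats = array(
    'text/html' => 'html',
    'text/json' => 'json',
    'application/json' => 'json',
    'text/xml' => 'xml'
  );

  foreach ( explode(',', $accept_header) as $entry ) {
    // Drop any parameters such as ";q=0.8" and normalize the case.
    $parts = explode(';', $entry);
    $mime = strtolower(trim($parts[0]));
    if ( isset($formats[$mime]) ) {
      return $formats[$mime];
    }
  }

  // Nothing recognized: fall back to the default format.
  return $default;
}

echo pickResponseFormat('text/xml') . PHP_EOL;
echo pickResponseFormat('image/png') . PHP_EOL;
```

The same lookup could be applied to the Content-Type header to decide how to parse the incoming data.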

Conclusion

Building an application in Jolt will be seamless and easy. Bundled with the framework is a set of tools to facilitate this. One command line tool, jolt-app, will read in your list of Routes and build all of the controller and action skeletons for them. It is my hope you’ll become an active and interested participant in the Jolt project. Please follow it on Github or subscribe to this blog.

References

  1. HTTP/1.1: Status Code Definitions

Jolt Interceptors

Posted by Vic Cherubini on February 23, 2010

This is a valid article, and is considered technically accurate up to Feb. 22, 2010

Interceptors will be a new feature added to Jolt. When dealing with a classic Model-View-Controller design, the Router should determine both the type of content coming in (via the Content-Type header) and the type of data the client will accept (via the Accept header).

If an HTTP/1.1 request contains these headers, the RESTful application should respond correctly.

Content-Type: text/json
Accept: text/json

The MIME-type application/json is also valid, I believe. The Router needs to interpret these headers and tell the Controller how to respond accordingly. Of course, the default Content-Type and Accept headers are text/html in Jolt, but in the cases when they are different, what should be done?

This is where the concept of Interceptors is introduced. An Interceptor takes the resulting object from the Controller and determines what View to send it to and how. By default, an Interceptor will take the entire Model and turn it into the appropriate type.

class User_Controller extends Jolt_Controller {
  public function bookListGet($user_id) {
    // Assuming the Accept header is text/json, Content-Type is normal text/html
    $user = new User($user_id);
    $book_list = $user->getBookList();
 
    // Render accordingly
    $this->book_list = $book_list;
    $this->render('book-list');
  }
}

The render() method will know what type of content the client accepts. If it’s the default HTML, that View will be loaded, rendered, and returned. In this example, the JSON Interceptor would turn the $book_list object (or array) into a JSON object and return that to the client.

What if the resource were to only return a single object? How might that look?

class Post_Controller extends Jolt_Controller {
  public function postGet($post_id) {
    $post = new Post($post_id);
 
    $this->post = $post;
    $this->render('post');
  }
}

The client would receive a JSON object that contains the post data.

{"date_create":1266926552,"date_modify":0,"title":"My First Post","content":"Here is the body of my first post."}

This is a fairly straightforward example, but what if you didn’t want the date_create or date_modify fields returned in the object? The default JSON Interceptor does a 1:1 conversion between a PHP object and JSON.

An extended Interceptor would be written for that resource that would remove those fields.

class Post_Interceptor extends Jolt_Interceptor_Json {
  public function run() {
    // $this->post is the object from above passed into here.
    $this->post->remove('date_create', 'date_modify');
 
    // Encode everything to JSON and return it to the client.
    parent::run();
  }
}

The above are just examples as I’m fleshing out Jolt, but this should allow for systems that can be quickly and easily built. Basic models will easily handle themselves. More complex models may require further action; an Interceptor could run additional operations on an object before it was returned to the client.
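
To make the idea concrete without depending on Jolt, here is a self-contained sketch. The classes Sketch_Interceptor_Json and Sketch_Post_Interceptor are hypothetical stand-ins that work on a plain array; Jolt's real Interceptor classes may look quite different.

```php
<?php
// Hypothetical stand-ins; Jolt's real Interceptor classes may look different.
class Sketch_Interceptor_Json {
  protected $data = array();

  public function __construct(array $data) {
    $this->data = $data;
  }

  // Default behavior: a 1:1 conversion of the data to JSON.
  public function run() {
    return json_encode($this->data);
  }
}

class Sketch_Post_Interceptor extends Sketch_Interceptor_Json {
  public function run() {
    // Strip the fields the client should not see, then defer to the parent.
    unset($this->data['date_create'], $this->data['date_modify']);
    return parent::run();
  }
}

$post = array(
  'date_create' => 1266926552,
  'date_modify' => 0,
  'title' => 'My First Post',
  'content' => 'Here is the body of my first post.'
);

$interceptor = new Sketch_Post_Interceptor($post);
echo $interceptor->run() . PHP_EOL;
/* {"title":"My First Post","content":"Here is the body of my first post."} */
```

Keeping the field filtering in the Interceptor leaves the Controller unaware of which format, or which subset of fields, the client ends up receiving.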

As Jolt becomes more mature, I promise the methods above will change; this should give you an idea of how a small system would work.

New Roadmap for Artisan System

Posted by Vic Cherubini on February 18, 2010

This is a valid article, and is considered technically accurate up to Feb. 21, 2010

I recently rejoined the ranks of the working at a great company. As a result, I am not as concerned about trying to get contract and freelance work and I can now concentrate on my personal projects more. In the last 6 months I spent a lot of time working on other software, and while it paid the bills, I’m anxious to get back to my own.

The first real piece of Open Source software I wrote is my PHP Framework, Artisan System. I haven’t had a chance to work on it much since it was originally released, only changing a few things and doing some restructuring. I’ve now written a new roadmap for 2010 covering the direction I’d like to take Artisan System.

2010 is going to bring a new change to Artisan System, and the roadmap includes the following items.

  • System fully under test (SUT). A new test suite will be written for all existing functionality, and new development will be done through tests.
  • Controller mechanism will be made fully RESTful. You will be able to access Controllers through a URI via PUT and DELETE methods. GET will list all items in a collection, PUT will update an item in a collection, POST will create a new item in a collection, and DELETE will delete an item in a collection.
  • A new design pattern called Interceptors will be introduced. Interceptors intercept the output of objects from a Controller to return them in the format that the Accept header asks for. HTML will be the default, but XML, JSON, JavaScript or other formats will also be allowed. Thus, if you wanted to return an object from a Controller that only includes a certain number of fields, a custom Interceptor could handle that, rather than having the Controller handle it.
  • The Controllers will read the Content-Type header to determine the format of the input.
  • The appropriate HTTP response code will be sent after a call to a URI. 200 for HTTP OK, 404 for not found, etc.
  • Everything will be time tested and sped up if necessary.
  • My other framework, DataModeler, will be integrated with Artisan System. They’ll become a single framework.
  • Redis support, and potentially support for other NoSQL databases will be added.
  • The codebase will be Doxygenated.
  • The Router will support routing tables for vanity URLs.
  • A new website and community will be launched to create a strong developer community. The website will include a bug tracker, wiki, and forum.
  • Videos, screencasts, tutorials, and podcasts will be released to help support this effort.

I’m really looking forward to seeing what I can create. Of course, I’ll appreciate any help I can get. You can fork the Artisan System project on Github and help develop it. After I get the new website up and running, you’ll be able to access the bug tracker, wiki, and forum. From there, we’ll coordinate development efforts.

More work on DataModeler

Posted by Vic Cherubini on January 24, 2010

This is a valid article, and is considered technically accurate up to Feb. 21, 2010

I’m having a blast working on DataModeler. So much, in fact, I don’t want to work on anything else!

The 0.0.2 tag was just released on GitHub and PearHub. DataModeler allows you to build your models decoupled from a data store. From there, you can add all of the business logic you like to each of them. When writing your tests, you can mock any data you need to send to them, and still have them execute as if they were tied to the data store.

The class DataObject is the base abstract class that handles creating the decoupled models. It must be extended, and the name of the extended class should be the table name the model will eventually use. You can override this if you wish. Let’s see what a simple Product may look like:

class Product extends DataObject {
}
 
$product = new Product();
$product->setPrice(1095)->setName('Sweet Salty Balls');
 
echo $product->getName(); /* 'Sweet Salty Balls' */
 
/* Changing the primary key, table, or adding all of the data at once is simple. */
$data = array(
  'price' => 1095,
  'name' => 'Sweet Salty Balls'
);
 
$product->table('data_products')
  ->pkey('data_product_id')
  ->model($data);
 
echo $product->getName(); /* 'Sweet Salty Balls' */
echo $product->table(); /* 'data_products' */

For each of the id(), children(), hasDate(), methodCache(), model(), pkey(), and table() methods, if no argument is passed, the current value of that class variable will be returned. If an argument is passed, the class variable will be set to that value. The reason these aren’t get*() and set*() methods is because a member of the object could very well be ‘table’ and thus, the automatically created getTable() would already exist and cause problems.
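
The get-or-set convention described above can be sketched as follows. Sketch_Object is a hypothetical class written for illustration, not DataModeler's DataObject; only the behavior of table() and pkey() mirrors the description.

```php
<?php
// Hypothetical class; only the get-or-set convention mirrors the text above.
class Sketch_Object {
  protected $_table = null;
  protected $_pkey = null;

  public function table($table = null) {
    // No argument: act as a getter.
    if ( 0 === func_num_args() ) {
      return $this->_table;
    }
    // Argument given: act as a setter and return $this for chaining.
    $this->_table = $table;
    return $this;
  }

  public function pkey($pkey = null) {
    if ( 0 === func_num_args() ) {
      return $this->_pkey;
    }
    $this->_pkey = $pkey;
    return $this;
  }
}

$object = new Sketch_Object();
$object->table('data_products')->pkey('data_product_id');

echo $object->table() . PHP_EOL; /* 'data_products' */
echo $object->pkey() . PHP_EOL;  /* 'data_product_id' */
```

Checking func_num_args() rather than comparing against null means an explicit null can still be passed in as a value to set.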

In your application, tying the Model to a data store is equally simple. The DataModel class handles this for you. It is not abstract, so you can create it directly.

/* Assuming class Product and $product from above exist. */
 
/* First need to connect to a database, DataModeler uses PHP PDO for this. Deal with it. */
$db = new DataAdapterPdoMysql('hostname', 'database', 'username', 'password');
$db->connect();
 
$model = new DataModel($db);
 
/* Load the first matched record. Return it to $product. */
$matched_product = $model->where('product_id = ?', 1)->loadFirst($product);
echo $matched_product->getName() . PHP_EOL;
 
/* Load all matched products into an iterator. Each element of the iterator is a Product object. */
$iterator = $model->field('product_id', 'name', 'price')
  ->where('product_id != ?', 4)
  ->where('name != ?', 'Second Product')
  ->orderBy('name', 'DESC')
  ->groupBy('name')
  ->limit(2)
  ->loadAll($product);
 
foreach ( $iterator as $obj ) {
  echo $obj->getName() . PHP_EOL;
}

Tying this in to your tests is simple. Let’s say the Product class was updated to have some logic with setPrice():

class Product extends DataObject {
  public function setPrice($price) {
    $price = abs(intval($price));
    $this->__set('price', $price);
    /* Return $this so setPrice() can still be chained like the other setters. */
    return $this;
  }
}
 
/* And in your ProductTest.php file */
 
require_once 'PHPUnit/Framework.php';
require_once 'Product.php';
 
class ProductTest extends PHPUnit_Framework_TestCase {
 
  public function testProductPriceCantBeNegative() {
    $product = new Product();
    $product->setPrice(-2059);
 
    $this->assertGreaterThan(0, $product->getPrice());
  }
}

This is a fairly simplistic example, but as the business logic in your DataObject classes becomes more complex, your tests can mock the data, ensuring the DataObjects still return the same values.

As DataModeler matures, I’ll release more complex examples. The near future holds work on relationships between DataObjects. Often, one table in a database is the “parent” of other tables (i.e., the other tables are useless without the parent table). For example, with simple products, the “product” table would be the parent, and “product_price”, “product_image”, and “product_description” would all be children. Accessing these from the parent table is beneficial. It cuts down on the number of objects to manage, and allows you to access many child objects from a single parent object. More on that to come.

Please download the 0.0.2 tag of DataModeler from GitHub or PearHub and let me know how it works for you.

New Weekend Project – DataModeler for Efficient TDD with PHP

Posted by Vic Cherubini on January 19, 2010

This is a valid article, and is considered technically accurate up to Feb. 21, 2010

I’m beginning my foray into Test Driven Development with PHPUnit. When building web applications, the ActiveRecord pattern becomes very handy. Using my ActiveRecord class, a programmer can easily have their models map 1:1 to tables in the database.

While this is very easy to use, it’s not very good for writing unit tests. The reason is that the class is tightly coupled with the database. A unit test should use the most basic data possible, which means the data should be mocked, and not come from a database. You can write separate unit tests for the database management classes themselves, but the models should be abstracted away from the database.

I started working on a new project called DataModeler over the weekend. It can be found at GitHub and PearHub. Development will take place on GitHub, and official releases pushed to PearHub.

There are several different classes within the project. Essentially, one can create an Object class and a Model class. The Object class is for managing data about that Object. It lacks any dependencies. The Model class is for loading and saving the Object to a datastore.

class ProductObject extends DataObject {
}
 
class ProductModel extends DataModel {
}
 
$data_adapter = new DataAdapterPdoMysql(/* connection info */);
 
$product = new ProductObject();
$product->setName('name')->setPrice(1495);
 
$product_model = new ProductModel($data_adapter);
$product_model->save($product);
 
/* OR */
 
$product_model = new ProductModel($data_adapter);
 
/* Load up product #1 into $product (by reference) */
$product_model->load($product, 1);
 
$product->setName('a new name');
 
/* Update the product now. */
$product_model->save($product);

Thus, writing a test for the ProductObject class is simple. Any logic that takes place in that class is only for data loaded up for that object. In your test, you can simply mock the data and the ProductObject class will be none the wiser.

In addition to easy TDD, the DataModeler framework comes with a very nice iterator class for really managing data properly. It uses PHP PDO for the database layer, so you have that going for you too.

Be sure to watch the project on GitHub, and let me know of any ideas you have for it.

Working with Artisan System – Basics

Posted by Vic Cherubini on December 14, 2009

This is a valid article, and is considered technically accurate up to Feb. 21, 2010

I’m incredibly proud of my PHP Framework, Artisan System. It’s gone through a lot of revisions over the last year, and it’s never really picked up. Part of the reason is I haven’t promoted it much, but another part is that no one knows what to do with it. As is the problem with a lot of open source software, there’s a lack of documentation to support it.

This occurred to me after I made a post on Hacker News about my Amazon S3 Backup Utility. Immediately, I had 7 new followers of it on Github.

I want to make a difference with my code, and I want to get the word out about Artisan System. It’s incredibly simple, but powerful. I’ve started this series of posts to help people start working with Artisan System.

Let’s get started.

Installation

Start by creating a new directory structure in your web root to store all of the files in. Download the latest tarball, untar it, and then rename the resulting directory to Artisan.

cd /var/www/
mkdir -p artisan-system
cd artisan-system
wget http://github.com/leftnode/Artisan-System/tarball/0.5.4
tar -xvzf leftnode-Artisan-System-cc2af85.tar.gz
mv leftnode-Artisan-System-cc2af85 Artisan
rm leftnode-Artisan-System-cc2af85.tar.gz

Artisan System is now installed. You could equally clone it from Github and use the clone.

Databases

The first thing you’ll want to explore is how to connect to a database. Until recently, Artisan was going to support multiple databases. After some reflection on the direction I wanted to take the project, I removed the potential support for additional databases and just kept MySQL. Adding a new database adapter would be simple. Furthermore, the current database adapter could be updated to use PDO to automatically support additional database wrappers.

Connecting

Connecting to a MySQL database is simple.

There no longer is any formal configuration object in Artisan System. Configurations are now simple arrays. You can optionally pass in the port to the configuration array, but Artisan selects the MySQL port, 3306, by default.

This example is the simplest of examples on how to connect to and then disconnect from a specified database. If the connection fails, the exception is caught and reported.
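
The default-port behavior mentioned above could be sketched like this. The helper withConnectionDefaults() and the array keys 'server' and 'port' are hypothetical and are not taken from Artisan System; the sketch only shows how a plain configuration array can be filled in with a default of 3306.

```php
<?php
// Hypothetical helper: fill in defaults for a connection configuration array.
// The key names used here are assumptions, not Artisan System's actual keys.
function withConnectionDefaults(array $config) {
  $defaults = array('port' => 3306);
  // Values in $config win over the defaults because they are merged last.
  return array_merge($defaults, $config);
}

$config = withConnectionDefaults(array('server' => 'localhost'));
echo $config['port'] . PHP_EOL; /* 3306 */

$config = withConnectionDefaults(array('server' => 'localhost', 'port' => 3307));
echo $config['port'] . PHP_EOL; /* 3307 */
```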

Querying

Querying the database is just as simple with the query() method in Artisan_Db. The query() method takes a raw SQL string and executes it. It does not take any precautions to prevent SQL Injection attacks. In general, you should not use the query() method directly.

Querying With Methods

Artisan System provides a much more robust system for querying a database: methodized querying. Methodized querying builds a query through an object via a chainable object. The most common query types (SELECT, INSERT, UPDATE, DELETE, and REPLACE) are included and accessible through the Artisan_Db object.

You should use methodized querying for several reasons.

  1. It’s safe. The data is automatically escaped to prevent SQL Injection attacks.
  2. It’s fast. It’s obviously not as fast as querying with a direct SQL string, but it’s faster than PHP’s parameterized querying.
  3. It’s semantic. Building a query reads like writing a sentence.
  4. It’s abstract. The classes behind the query actually build the SQL itself, so porting it from one database system to another would be simple.

Querying with methodized querying is simple and straightforward. One of my favorite features is being able to select a single row and get a specific value back. It reduces what would normally be 3 to 4 individual commands to a single one.
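
To show the general shape of a chainable query object, here is a minimal sketch. Sketch_Select is a hypothetical class, and Artisan_Db's real classes and method names may differ. Note one deliberate difference: instead of escaping values into the SQL string, this sketch keeps them in a separate parameter list so a driver could bind them.

```php
<?php
// Hypothetical builder; Artisan_Db's real classes and method names may differ.
class Sketch_Select {
  protected $_table = '';
  protected $_where = array();
  protected $_params = array();

  public function from($table) {
    $this->_table = $table;
    return $this;
  }

  public function where($condition, $value) {
    // Keep the condition text and its value apart; the "?" is a placeholder.
    $this->_where[] = $condition;
    $this->_params[] = $value;
    return $this;
  }

  // Build the SQL text; values stay separate so a driver can bind them safely.
  public function sql() {
    $sql = 'SELECT * FROM `' . $this->_table . '`';
    if ( count($this->_where) > 0 ) {
      $sql .= ' WHERE ' . implode(' AND ', $this->_where);
    }
    return $sql;
  }

  public function params() {
    return $this->_params;
  }
}

$select = new Sketch_Select();
$select->from('config')
  ->where('config_id = ?', 10)
  ->where('required = ?', 1);

echo $select->sql() . PHP_EOL;
/* SELECT * FROM `config` WHERE config_id = ? AND required = ? */
```

Because every method returns $this, the calls read left to right like a sentence, which is the property the list above calls semantic.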

Take the time to explore Artisan System. You’ll find a powerful, compact PHP Framework to really help you build a web application quickly. In the next article, we’ll cover the basic Model-View-Controller pattern in Artisan System.

Building an advanced ActiveRecord Model in PHP – Part 1

Posted by Vic Cherubini on October 26, 2009

This is a valid article, and is considered technically accurate up to Feb. 21, 2010

As Ruby on Rails took off, so did the idea of the ActiveRecord design pattern. ActiveRecord allows you to define a class that models a single table in the database. This works well for small tables that don’t rely on other tables for writing and loading. However, the pattern doesn’t work as well with a table that depends on many other tables.

This series will start off by examining a class to handle small tables, and then progress through several parts covering validation, using a separate data recorder, and finally handling relationships. The code here all uses PHP 5.2+ and will be available on our GitHub account and to download at the end of each article.

For example, in our eCommerce application, SpEEdy Cart, the table that stores global, editable configuration data is very straightforward.

CREATE TABLE `config` (
  `config_id` smallint(3) NOT NULL AUTO_INCREMENT,
  `group` varchar(32) collate utf8_unicode_ci NOT NULL,
  `name` varchar(64) collate utf8_unicode_ci NOT NULL,
  `value` text collate utf8_unicode_ci NOT NULL,
  `required` tinyint(1) NOT NULL DEFAULT '0',
  PRIMARY KEY  (`config_id`)
) ENGINE=MyISAM  DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

Each piece of configuration data is independent of any other piece and only modifies the execution of the software. There are no other tables that store configuration information, so this table is a prime candidate to have an ActiveRecord Model written for it.

The ActiveRecord pattern allows the programmer to create a new Config object, set the data, and write it to the database. This is done through the interface of the class itself, and the database writing is hidden from the user.

/* Load up configuration record ID #10, set some new data, and write it. */
$config = new Config(10);
$config->setGroup('category')
  ->setName('category_depth')
  ->setValue(3)
  ->write();
 
/* Create an empty configuration object, load the data from an array, and then write a new record. */
$cfg = array('group' => 'category', 'name' => 'category_depth', 'value' => 3);
$config = new Config();
$config->loadFromArray($cfg)->write();

The write() method knows if the object is loaded or not (if the primary key of the object is set). If it’s loaded, an UPDATE command is issued; otherwise, an INSERT command is sent.

Surprisingly, the Config class is empty.

class Config extends Model {
}

All of the work takes place in the class Model. First, take the time to analyze the class, and then it will be broken up into individual pieces. The Doxygen comments in this class have been omitted to reduce the number of lines and to improve the readability.

abstract class Model {
  protected $_id = 0;
  protected $_pkey = NULL;
  protected $_model = array();
  protected $_table = NULL;
  protected $_methodCache = array();
  protected $_forceLoad = false;
 
  const TABLE_ROOT = 'artisan_';
 
  public function __construct($id=0) {
    $class = strtolower(get_class($this));
    $this->_table = self::TABLE_ROOT . $class;
    $this->_pkey = $class . '_id';
 
    $this->load($id);
  }
 
  public function __destruct() {
    $this->_id = 0;
    $this->_model = array();
  }
 
  public function __call($method, $argv) {
    $argc = count($argv);
    if ( 0 === $argc && true === isset($this->_methodCache[$method]) ) {
      return $this->_methodCache[$method];
    } else {
      $k = substr($method, 3);
      $k = strtolower(substr($k, 0, 1)) . substr($k, 1);
      $k = preg_replace('/[A-Z]/', '_\\0', $k);
      $k = strtolower($k);
 
      if ( 0 === $argc ) {
        /* If the length is 0, assume this is a get() */
        $v = $this->__get($k);
        $this->_methodCache[$method] = $v;
        return $v;
      } else {
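        /* Else assume it's a set with the first element of $argv. */
        /* Drop any cached get*() value for this key so it cannot go stale. */
        unset($this->_methodCache['get' . substr($method, 3)]);
        $this->__set($k, current($argv));
        return $this;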
      }
    }
  }
 
  public function __set($k, $v) {
    $this->_model[$k] = $v;
    return true;
  }
 
  public function __get($k) {
    if ( true === isset($this->_model[$k]) ) {
      return $this->_model[$k];
    }
    return NULL;
  }
 
  public function enabled() {
    if ( false === isset($this->_model['status']) ) {
      return true;
    }
 
    if ( 1 == $this->_model['status'] ) {
      return true;
    }
    return false;
  }
 
  public function getModelAsArray() {
    return $this->_model;
  }
 
  public function getId() {
    return $this->_id;
  }
 
  public function getPkey() {
    return $this->_pkey;
  }
 
  public function getTable() {
    return $this->_table;
  }
 
  public function setPkey($pkey) {
    $this->_pkey = $pkey;
    return $this;
  }
 
  public function setTable($table) {
    $this->_table = $table;
    return $this;
  }
 
  public function loadFromArray($array) {
    $this->_model = $array;
 
    if ( true === isset($array[$this->_pkey]) ) {
      if ( true === $this->_forceLoad ) {
        $this->load($array[$this->_pkey]);
      } else {
        $this->_id = $array[$this->_pkey];
      }
    }
 
    return clone $this;
  }
 
  public function write() {
    if ( $this->_id > 0 ) {
      $this->_update();
    } else {
      $this->_insert();
    }
    return $this->_id;
  }
 
  public function load($id) {
    $id = intval($id);
    if ( $id > 0 ) {
      $result = LN::getDb()->select()
        ->from($this->_table)
        ->where($this->_pkey . ' = ?', $id)
        ->query();
      if ( 1 == $result->numRows() ) {
        $this->_model = $result->fetch();
        $this->_id = $this->_model[$this->_pkey];
      }
 
      return true;
    }
 
    return false;
  }
 
  protected function _insert() {
    LN::getDb()->insert()
      ->into($this->_table)
      ->values($this->_model)
      ->query();
 
    if ( 1 == LN::getDb()->affectedRows() ) {
      $this->_id = LN::getDb()->insertId();
      return true;
    }
 
    return false;
  }
 
  protected function _update() {
    LN::getDb()->update()
      ->table($this->_table)
      ->set($this->_model)
      ->where($this->_pkey . ' = ?', $this->_id)
      ->query();
 
    if ( 1 == LN::getDb()->affectedRows() ) {
      return true;
    }
 
    return false;
  }
}

All of the members of the class are protected so they can be accessed in extended classes. This class assumes the primary key (ID) of the table is an integer. However, load() could be updated to assume it’s any data type.

The member variable $_pkey stores the name of the primary key of the table. It is automatically created by taking the name of the class and appending “_id” after it. It can be overwritten via setPkey().

$_model stores the actual data itself. It’s simply a key/value array with each key corresponding to a field of the table. $_table is the name of the table being worked on, and is automatically determined by lowercasing the name of the class and prefixing it with the TABLE_ROOT constant. $_pkey is built from the lowercased class name alone. Thus, if the class was named Product_Description, the value of $_table would be artisan_product_description and the value of $_pkey would be product_description_id.

$_methodCache stores a list of get*() methods and their resulting values for quick lookups later in __call(), and finally $_forceLoad causes the loadFromArray() method to load the data freshly from the database, regardless of whether the data already exists in the model array.

The constructor takes an optional argument for the ID of the object to load. After the default table and primary key are established, the data is loaded (if possible). The loading takes place in load() which attempts to load the data based on the primary key.

load() is public in the case that it needs to be called if the primary key or table are changed.

loadFromArray() takes an array of key/value pairs and loads it into the object. This is useful if a list of objects is needed from a single query. For example, one could select all configuration options, and set them to a list of Config objects with a single query.

$config_list = array();
$config = new Config();
$result_config = LN::getDb()->select()->from('config')->query();
while ( $cfg = $result_config->fetch() ) {
  $config_list[] = $config->loadFromArray($cfg);
}

Because each $cfg contains the primary key of the Config object (`config_id`) the object will be fully loaded and the $config_list array will contain a list of Config objects. Note: loadFromArray() returns a clone of $this. This ensures each element of $config_list will have a new Zend refcount value and will be an entirely different object.

The other methods, excluding __call() are fairly evident in their purpose. __call() is where a lot of the work takes place, so understanding it fully is crucial.

__call() is a Magic Method in PHP, meaning it is present in classes and is called silently if defined. It takes two arguments, the name of the method called and the arguments passed to that method.

__call() is executed when a method that isn’t defined in the class is called. If __call() doesn’t exist in the class and an undefined method is called, PHP raises a fatal error (“Call to undefined method”).

As a result, any object can now easily handle the get*() and set*() methods that are suitable only for that object.

The full name of the method, without the (), is the value of $method in __call(). First, the value is checked against the $_methodCache array. If it’s found, that value is returned immediately and no extra processing is required.

However, if the call is a set*() type, or a get*() type that hasn’t been cached, further processing is necessary.

Fortunately, “get” and “set” are both three characters, so they are stripped off. Next, the first letter of the string is lowercased. Thus, “setModuleName” becomes “moduleName”. With the power of a simple regular expression, each uppercase character has an underscore prepended to it. “moduleName” becomes “module_Name”. Finally, the entire string is lowercased, becoming “module_name”. This is the key that will be used to get or set a property of the model.
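
Those steps can be pulled out into a standalone function to see them in isolation. The function name methodToKey() is made up for this sketch; the four statements inside it are the same ones used in __call() above.

```php
<?php
// Standalone version of the name conversion performed inside __call().
function methodToKey($method) {
  // Strip the three-character "get" or "set" prefix.
  $k = substr($method, 3);
  // Lowercase the first letter: "ModuleName" becomes "moduleName".
  $k = strtolower(substr($k, 0, 1)) . substr($k, 1);
  // Prepend an underscore to every remaining uppercase letter.
  $k = preg_replace('/[A-Z]/', '_\\0', $k);
  // Lowercase the whole string: "module_Name" becomes "module_name".
  return strtolower($k);
}

echo methodToKey('setModuleName') . PHP_EOL; /* module_name */
echo methodToKey('getName') . PHP_EOL;       /* name */
```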

If there are no arguments, the class assumes this is a get*() method, and if that key corresponds to a value in $_model, the data is returned. It is also at this point that the method and the data are cached for future lookups.

If there are one or more arguments, a set*() method is assumed, and every argument after the first is discarded. The first argument is the new value of the property, and it is set. $this is returned for chainability.
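Putting the pieces together, a __call() along these lines might look like the sketch below. SketchRecord is a hypothetical class, not the ActiveRecord code itself. Clearing the cached getter on every set*() call is my own assumption, added so a cached value can never go stale.

```php
<?php

// Hypothetical class; the property names follow the text above.
class SketchRecord
{
    protected $_model = array();
    protected $_methodCache = array();

    public function __call($method, $args)
    {
        // A cached get*() result is returned without further work.
        if (count($args) === 0 && array_key_exists($method, $this->_methodCache)) {
            return $this->_methodCache[$method];
        }

        // "setModuleName" / "getModuleName" -> "module_name"
        $key = strtolower(preg_replace('/([A-Z])/', '_$1', lcfirst(substr($method, 3))));

        if (count($args) === 0) {
            // get*(): return the value and remember it for the next call.
            if (!array_key_exists($key, $this->_model)) {
                return null;
            }
            return $this->_methodCache[$method] = $this->_model[$key];
        }

        // set*(): only the first argument is used.
        $this->_model[$key] = $args[0];

        // Assumption: drop the matching getter from the cache.
        unset($this->_methodCache['get' . substr($method, 3)]);

        return $this; // allows chaining
    }
}

$config = new SketchRecord();
$config->setModuleName('cart')->setConfigId(7);

echo $config->getModuleName(), "\n"; // cart
echo $config->getConfigId(), "\n";   // 7
```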

This simple class is very powerful for handling database tables easily. As shown earlier, setting up a class to handle global configuration data is simple.

Keep in mind, the ActiveRecord class described defines a 1:1 relationship between a record and its representation in your code. It should not represent a list or multiple objects. A separate class, or an Iterator is required to handle that case.

It should be evident where successive parts of this article series will go. One issue to resolve is classes that need to load or modify multiple other objects. For example, a Product class may need to load a list of Prices, Images, and Attributes. Each of these is a separate table, and thus has a separate class to handle it. However, in a read-only system (such as the front end of a shopping cart), knowing about a Product_Price is fairly meaningless without the context of the Product. Thus, it is the responsibility of the Product object to manage the Product_Price object.

The next article will cover using a separate recorder adapter to save the data in a different location, and to abstract the saving of data away from the Model object itself. From there, relationships like the scenario described above will be closely examined.

Subscribe to Leftnode’s RSS feed to read more about this topic in the upcoming days!

Introducing the Artisan System SDK

Posted by Vic Cherubini on October 07, 2009

While the content of this article is accurate, the Git repository is no longer valid as of Feb. 21, 2010.

I released the Artisan System Framework in November 2008. I was very excited when it was released. I got some good feedback and some bad feedback, but I was very happy with my work. It went through about 8 months of development, and was rewritten twice. Ultimately I derived a lot of pleasure from releasing my first open source product.

With the 0.4 release coming in the next week, I felt it was appropriate to launch the SDK, or software development kit, so developers can quickly become acclimated to how the framework operates. The feedback I received from explaining how the Model-View-Controller works in Artisan System was that it wasn’t as intuitive as I had thought. Rather than updating the post, I decided the best way to show its intuitiveness would be to release the SDK.

Because Leftnode is moving to git for some projects, I decided to release the SDK on github.com. You can find it at http://github.com/leftnode/artisan-sdk. It includes version 0.3beta of Artisan System, but will be upgraded to 0.4 when it is released.

The default SDK includes all of the code necessary to build a small website. It includes database access (for the contact form mail page), the Model-View-Controller, and a global static class for sharing objects across the application.

Installation is very simple.

  • Clone the repository into a web-accessible directory.
  • Create a new MySQL database named `artisan` and run the artisan.sql file against it.
  • Copy configure.template.php to configure.php
  • Open configure.php in your editor
  • On a development server, set DEBUG_MODE to 1; on a live server, set it to 0.
  • Set the value for the DIR_ROOT define for the root location. Essentially, this is the directory where index.php resides.
  • Set the keys server, username, password, and database in the $config_artisan_db variable to your appropriate database settings.
  • Set the values of site_root and site_root_secure in the $config_artisan_router variable.
  • If your web server has mod_rewrite (or similar) enabled, keep rewrite in $config_artisan_router as true; otherwise, set it to false.
  • Set the email_list value in $config_form to an array of email addresses the form should submit to.
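
As a rough guide, a filled-in configure.php might look something like the sketch below. Every value is a placeholder, and the exact layout of the real template may differ; only the constant, variable, and key names come from the steps above.

```php
<?php

// Sketch only: all values are placeholders, and the real template
// may arrange these settings differently.

define('DEBUG_MODE', 1);                      // 1 on development, 0 on live
define('DIR_ROOT', '/var/www/artisan-sdk/');  // directory holding index.php

$config_artisan_db = array(
    'server'   => 'localhost',
    'username' => 'artisan_user',
    'password' => 'change-me',
    'database' => 'artisan',
);

$config_artisan_router = array(
    'site_root'        => 'http://localhost/artisan-sdk/',
    'site_root_secure' => 'https://localhost/artisan-sdk/',
    'rewrite'          => true,   // false if mod_rewrite is unavailable
);

$config_form = array(
    'email_list' => array('owner@example.com'),
);
```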

This concludes the installation. Currently, it is done manually, but an automatic installation will be available in the future.

To access the initial site, if you have http://localhost/ set up properly, you can go to http://localhost/directory/ where directory is the name of the directory you placed all of the files in.

After you’ve verified the installation worked properly, and navigated through the small website, we’ll discuss the overall architecture of the SDK.

Application Directory Structure

The first directory you should pay attention to is app/. Open it, and you’ll see two sub-directories, Page/ and Root/. In Root/, you will see one PHP file, Root.php and a directory, View/. Opening Root.php shows a small class file.

<?php
 
class Root_Controller extends Artisan_Controller {
  protected $_layout = 'index';
 
  public function renderLayout($view) {
    $this->view->css_artisan = DIR_CSS . 'artisan.css';
 
    $this->render('root/header', 'header');
    $this->render('root/menu', 'menu');
 
    $this->render($view, 'body');
 
    $this->render('root/footer', 'footer');
  }
 
  protected function _redirect($url) {
    header("Location: " . $url);
    exit;
  }
}

This is the root Controller class. All Controllers in Artisan System must extend from the class Artisan_Controller. Because Root_Controller already extends Artisan_Controller, all subsequent Controllers can simply extend Root_Controller. All Controllers must also be named {Name}_Controller, where {Name} is the actual name of the Controller.

All Controllers reside in a central directory, with sub-directories corresponding to each Controller.

app/
  Root/
    Root.php
    View/
      header.phtml
      footer.phtml
  Index/
    Index.php
    Model/
      Index.php
    View/
      index.phtml
      alternate.phtml
  Account/
    Account.php
    Model/
      Account.php
    View/
      account.phtml
      update.phtml
      friends.phtml
      contacts.phtml
  Profile/
    Profile.php
    Model/
      Profile.php
    View/
      profile.phtml

With this structure, any Controller can load a View of any other Controller. Additionally, the file in the Model/ directory handles the validation rules for different sections of each Controller.

Rendering Views in a Controller is easy with the render() method. It takes two optional parameters, $view_file and $content_block. $view_file defaults to the name of the method you’re calling with the URL (see URL Routing below), and $content_block defaults to no content block, meaning the view is rendered directly to the output stream.

Passing a path that includes a Controller name for the $view_file parameter, such as 'root/header' in the code above, loads the View file from that Controller’s directory. If no path is given, the View file is loaded from the current Controller’s directory.

Because an application does not know of the concept of a header or footer (or other repeated content on the page), the Root_Controller was added. Each subsequent Controller extends Root_Controller, and every method in that Controller calls $this->renderLayout() rather than render() directly. As a result, the header, menu, and footer Views are always loaded in. Without this, one would have to load each of them in every method. Backing up a directory and opening Page/Page.php shows how this works.

<?php
 
require_once 'app/Root/Root.php';
 
class Page_Controller extends Root_Controller {
  // ##### POST Methods ##### //
 
  public function contactPost() {
    $form_id = intval($this->getParam('form_id'));
    $contact = $this->getParam('contact');
 
    try {
      $contactor = Artisan::getContactor();
      $contactor->setFormId($form_id)->setFieldList($contact)->send();
 
      $this->contactGet();
    } catch ( Artisan_Exception $e ) {
      // Determine what to do...
    }
  }
 
  // ##### GET Methods ##### //
 
  public function indexGet() {
    $this->renderLayout('index');
  }
 
  public function contactGet() {
    $this->renderLayout('contact');
  }
 
  public function aboutusGet() {
    $this->renderLayout('aboutus');
  }
 
  public function servicesGet() {
    $this->renderLayout('services');
  }
}

Because Page_Controller extends Root_Controller, it can call renderLayout() with the name of the View to load. Focusing on the *Get() methods for now, you see how simple they are. The method indexGet() loads up the index.phtml View, which is found in app/Page/View/.

Library Files

In the lib/ directory, you’ll see one sub-directory, Artisan/, and two PHP files. The Artisan/ sub-directory contains the entire Artisan System framework. Artisan.php is a static class that serves as a simple entry point into the system. Most of the methods within it could be combined into a single chunk of code in index.php. However, by putting them in separate static methods in a single class, they can be loaded in different areas of an application, making deployment easy and avoiding global variables. Finally, Contactor.php is a simple class for managing a Contact Us form.

URL Routing

Controllers are broken into different methods that are accessible through the URL. With mod_rewrite turned on, URLs look like: http://website.com/controller/method/arg1/arg2/argN. The first value after the base URL is the name of the Controller to load. Next is the method within that Controller. Finally, any additional values, separated by forward slashes, are passed in as arguments to the method. Which method is called in the Controller is determined by the request type. GET requests route to methodGet(), and POST requests route to methodPost(). Thus, going to http://website.com/account/update would attempt to load Account_Controller::updateGet(), while posting a form to that same URL would attempt to load Account_Controller::updatePost().
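The naming convention can be illustrated with a small sketch. resolveRoute() is a hypothetical function, not the router that ships with Artisan System, and the fallback to "index" for a missing controller or method is an assumption of mine.

```php
<?php

// Hypothetical function that mirrors the URL convention described above.
function resolveRoute($requestMethod, $path)
{
    // "/account/update/42" -> array('account', 'update', '42')
    $segments = array_values(array_filter(explode('/', trim($path, '/')), 'strlen'));

    // Assumption: fall back to "index" when a segment is missing.
    $controller = isset($segments[0]) ? $segments[0] : 'index';
    $method     = isset($segments[1]) ? $segments[1] : 'index';

    return array(
        'class'  => ucfirst(strtolower($controller)) . '_Controller',
        'method' => strtolower($method) . ucfirst(strtolower($requestMethod)),
        'args'   => array_slice($segments, 2),
    );
}

$route = resolveRoute('GET', '/account/update');
echo $route['class'], '::', $route['method'], "()\n";  // Account_Controller::updateGet()

$route = resolveRoute('POST', '/account/update/42/full');
echo $route['class'], '::', $route['method'], "()\n";  // Account_Controller::updatePost()
echo implode(', ', $route['args']), "\n";              // 42, full
```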

Public Facing Files

Navigating to the root directory and then to public/ shows several sub-directories: css/, image/, layout/, and locale/. The names should make their purposes fairly clear. One note: layout/ holds the different layout files. A layout file is loaded automatically after a view is rendered if the Controller’s $_layout property is defined. Because Root_Controller defines it as ‘index’ in the code above, the index.phtml file is loaded. This small file is parsed by the Controller itself, with each content block being rendered to the appropriate area.

Download

You can download the initial release of the Artisan System SDK using Git from github.com. The public clone URL is git://github.com/leftnode/artisan-sdk.git

PHP5 Database Iterators Performance

Posted by Vic Cherubini on October 01, 2009

This is a valid article, and is considered technically accurate up to Feb. 21, 2010

The previous article about introducing PHP5 Database Iterators received a lot of good discussion, particularly on Hacker News. Whenever an article generates a lot of discussion (the link had a 2:1 ratio of comments to upvotes), a great deal is learned and passed on. Rather than responding to each individual comment, I decided to write an entire blog post. This article will be primarily focused on performance, both in speed and space complexities.

One concern was the overhead of having an entire class devoted to simply looping through a result from the database. If the Db_Iterator class simply did that, it would be hard to argue my point. However, careful analysis of what the iterator actually does proves it to be both faster and more memory efficient.

Result Sets

An important concept in querying a database is the result set that is returned. When a query is executed against a database (MySQL will be used for this article), it returns a pointer to the first result in the set.

<?php
 
$db = mysqli_connect('localhost', 'username', 'password', 'iterator_test');
 
$sql = "SELECT * FROM `user` u WHERE u.email_address LIKE '%@gmail.com%'";
$result_set = mysqli_query($db, $sql);
 
var_dump($result_set);
 
mysqli_close($db);

This small script shows that the $result_set variable is of type mysqli_result.

object(mysqli_result)#2 (0) {
}

When PHP passes the query to MySQL and data is returned, the data is organized into rows. The result set points to the first row. (If the query is not of type SELECT, $result_set is the boolean true on a successful query and false otherwise.) Further investigation of the PHP source code shows what is behind the actual result set.

PHP Internals

Download the latest 5.2.x release (5.2.11 at the time of publication) and un-tar it to a local directory. In your editor, open up the file ext/mysqli/mysqli_nonapi.c +232. All of the following assumes a basic SELECT query that returns more than one row and that everything executes properly.

In PHP, new functions are generally defined through the preprocessor macro PHP_FUNCTION(). The argument to the macro is the name of the function, in this case mysqli_query.

The first 20 or so lines are not incredibly important; they just set up some local variables and do some error checking. Line 255 fetches the connection resource into the local mysql_link variable. Line 259 performs the actual query. If everything goes well, and the query is of type SELECT, the result_mode is checked. Because mysqli_query() takes an optional third parameter for the result mode, and none is specified, MYSQLI_STORE_RESULT is used by default. Thus, the function mysql_store_result() is called on the connection, and it returns the result set from MySQL.

Lines 284 through 287 are where the result set is created and returned.

mysqli_resource = (MYSQLI_RESOURCE *)ecalloc (1, sizeof(MYSQLI_RESOURCE));
mysqli_resource->ptr = (void *)result;
mysqli_resource->status = MYSQLI_STATUS_VALID;
MYSQLI_RETURN_RESOURCE(mysqli_resource, mysqli_result_class_entry);

mysqli_resource->ptr is the result set pointer to the first row returned. This pointer is what allows iterating (whether through a normal while() loop, or through an iterator) to happen. During each mysqli_fetch_assoc() (or similar) call, the pointer is advanced to the next row, until no more rows are found and NULL is returned. The macro MYSQLI_RETURN_RESOURCE() is called, returning the resource, which is passed into the above PHP script as the variable $result_set.

Back to PHP

The usefulness of this should be evident: rather than returning an array of all the rows found, a pointer to the first matched row is returned. This pointer is then stored in the Db_Iterator class.

/* Assuming the above query was executed properly. */
$iterator = new Db_Iterator($result_set, new User());
 
/* Iterate through a list of User objects now. */
foreach ( $iterator as $user ) {
  /* Do whatever with $user. */
}

The memory footprint of this result set pointer is tiny compared to storing all matched rows as an array. Many queries can easily return hundreds of thousands or millions of rows. Copying each of these into an array, even without creating an object for each element, is inefficient in both space and time: an array of N elements must be built to hold the results, so the extra memory and work grow linearly with N, while the pointer stays the same size.
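To make the idea concrete, here is a minimal sketch of a lazy iterator. It is not the Db_Iterator from the previous article: LazyRowIterator and the $fetch callback are hypothetical, and an in-memory row source stands in for a live connection so the example runs on its own. With a real result set, $fetch could simply wrap mysqli_fetch_assoc($result_set).

```php
<?php

// Hypothetical stand-in for Db_Iterator: pulls one row at a time from $fetch
// and fills a clone of $prototype with that row's fields.
class LazyRowIterator implements IteratorAggregate
{
    private $fetch;
    private $prototype;

    public function __construct(callable $fetch, $prototype)
    {
        $this->fetch = $fetch;
        $this->prototype = $prototype;
    }

    public function getIterator(): Traversable
    {
        $fetch = $this->fetch;

        // Only one row and one object are built per loop iteration.
        while (($row = $fetch()) !== null && $row !== false) {
            $object = clone $this->prototype;
            foreach ($row as $field => $value) {
                $object->$field = $value;
            }
            yield $object;
        }
    }
}

// In-memory row source used in place of a database connection.
$rows = new ArrayIterator(array(
    array('email_address' => 'a@gmail.com'),
    array('email_address' => 'b@gmail.com'),
));

// Returns the next row, or null when the rows run out.
$fetch = function () use ($rows) {
    if (!$rows->valid()) {
        return null;
    }
    $row = $rows->current();
    $rows->next();
    return $row;
};

$emails = array();
foreach (new LazyRowIterator($fetch, new stdClass()) as $user) {
    $emails[] = $user->email_address;
}

echo implode(', ', $emails), "\n"; // a@gmail.com, b@gmail.com
```

The loop never holds more than the current row and object, which is the property the paragraph above relies on.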

Another nice feature of using this pointer idea is being able to easily jump from row to row by referencing its index.

<?php
 
$db = mysqli_connect('localhost', 'username', 'password', 'iterator_test');
 
$sql = "SELECT * FROM `user` u WHERE u.email_address LIKE '%@gmail.com%'";
$result_set = mysqli_query($db, $sql);
 
/* Jump to the 11th actual row, row 0 points to the first matched row. */
mysqli_data_seek($result_set, 10);
 
while ( $row = mysqli_fetch_assoc($result_set) ) {
  echo $row['email_address'] . '<br>';
}
 
mysqli_close($db);

The mysqli_data_seek() function allows the programmer to easily jump to a specific row, similar to jumping to an array index (which, in C, is syntactic sugar for pointer arithmetic).

Race Conditions with SELECTs

What happens when a row that is matched in a result set is updated by another query? For example, say one wants to fetch all comments by a specific user. The query executes successfully and returns a result set to PHP. Because that result set is a reference to rows, and not the rows themselves, it might seem that another query could come along and update (or delete) one of those matched rows, resulting in inaccurate results. Fortunately, the database ensures that the result set returns the values as they were at query time. This situation, known as a race condition in multithreaded applications, is quite common and a source of many developer frustrations. However, MySQL handles it for the developer.

Conclusion

Understanding PHP’s internals is necessary for every developer. It can give helpful insights into how to better design your programs. One can think of a result set as a pointer in C. While it doesn’t allocate memory on the heap, it does allow for easy navigation through a list of elements.