WordPress 3.1: Advanced Taxonomy Queries

One of the new things in 3.1 that hasn’t got a lot of discussion yet is the new Advanced Taxonomy Queries mechanism. At the moment, this is still being actively developed, but the basic structure is finalized enough to give at least a semi-coherent explanation on how to use it. Since 3.1 is now going into beta, it’s unlikely to change much.


What’s a Query?

In WordPress land, a “query” is anything that gets Posts. There’s a lot of ways to do this.

  • You can use query_posts to modify the main query of the page being displayed.
  • You can create a new WP_Query, to get some set of posts to display in its own custom Loop.
  • You can do a get_posts to get some limited set of posts for display in some special way.

Regardless of the method, you have to pass parameters to it in order to specify which posts you want. If you’ve used this at all before, then you’ve used parameters like cat=123, or tag=example, or category_name=bob and so forth. When custom taxonomies were developed, you were eventually able to specify things like taxonomy=foo and term=bar and so on.

Querying for Posts

The problem with these is that people sometimes want to specify more than one of these parameters, and not all parameters worked well together. Things like cat=a and tag=b, or cat=a and tag is not b, and so forth. This is because cat and tag are both forms of taxonomies, and the code didn’t handle that well. Sure, some of it worked, for specific cases, but those were mostly there by accident rather than by design. In other words, those cases worked because the system just happened to get it right for that particular instance.

Well, all these old methods still work, but they have been made into a more comprehensive system of generically specifying arbitrary taxonomies to match against. When you specify cat=123, it’ll actually be converting it to this new method internally.

Query Strings are for Suckers

One side effect of this new system is that it doesn’t really work with query strings very well. It can be done, but it’s a lot easier and more sensible if you just start getting into the array method of doing things instead. What’s the array method? I’ll explain:

Imagine you used to have a query that looked like this:

1 query_posts('cat=123&author=456');

A simple query, really. The problem with it is that WordPress has to parse that query before it can use it. But there is another way to write that query as well:

1 query_posts(array(
2   'cat' => 123,
3   'author' => 456,
4 ) );

Essentially, you separate out each individual item into its own element in an array. This actually saves you some time in the query because it doesn’t have to parse it (there’s very little savings though).

The advantage of this is that you can build your arrays using any method of array handling you like. Here’s another way to do it:

1 $myquery['cat'] = 123;
2 $myquery['author'] = 456;
3 query_posts($myquery);

Simple, no? But what if you have to deal with the $query_string? The $query_string is that old variable that is built by WordPress. It comes from the “default” query for whatever page you happen to be on. One of the main uses for it was to deal with “paging”. A common method of doing it was like this:

1 query_posts($query_string . '&cat=123&author=456');

If you use arrays, you have to deal with that yourself a bit. There’s a couple of possible ways to do it. The easiest is probably to just parse the query string yourself, then modify the result as you see fit. For example:

1 $myquery = wp_parse_args($query_string);
2 $myquery['cat'] = 123;
3 $myquery['author'] = 456;
4 query_posts($myquery);

I started out with the $query_string, used wp_parse_args to turn it into an array, then overwrote the bits I wanted to change and performed the query. This is a handy technique I’m sure a lot of people will end up using.

On to Advanced Taxonomies

Advanced Taxonomy queries use a new parameter to the query functions called “tax_query”. The tax_query is an array of arrays, with each array describing what you want it to match on.

Let’s lead by example. We want to get everything in the category of “foo” AND a tag of “bar”. Here’s our query:

1 $myquery['tax_query'] = array(
2     array(
3         'taxonomy' => 'category',
4         'terms' => array('foo'),
5         'field' => 'slug',
6     ),
7     array(
8         'taxonomy' => 'post_tag',
9         'terms' => array('bar'),
10         'field' => 'slug',
11     ),
12 );
13 query_posts($myquery);

Here we’ve specified two arrays, each of which describes the taxonomy and terms we want to match it against. It’ll match against both of them, and only return the results where both are true.

There’s two things of note here:

First is that the “field” is the particular field we want to match. In this case, we have the slugs we want, so we used “slug”. You could also use “term_id” if you had the ID numbers of the terms you wanted.

Second is that the “terms” is an array in itself. It doesn’t actually have to be, for this case, as we only have one term in each, but I did it this way to illustrate that we can match against multiple terms for each taxonomy. If I used array(‘bar1?,’bar2?) for the post_tag taxonomy, then I’d get anything with a category of foo AND a tag of bar1 OR bar2.

And that second item illustrates an important point as well. The matches here are actually done using the “IN” operator. So the result is always equivalent to an “include” when using multiple terms in a single taxonomy. We can actually change that to an “exclude”, however, using the “operator” parameter:

1 $myquery['tax_query'] = array(
2     array(
3         'taxonomy' => 'category',
4         'terms' => array('foo', 'bar'),
5         'field' => 'slug',
6         'operator' => 'NOT IN',
7     ),
8 );
9 query_posts($myquery);

The above query will get any post that is NOT in either the “foo” or “bar” category.

But what about terms across multiple taxonomies? So far we’ve only seen those being AND’d together. Well, the “relation” parameter takes care of that:

1 $myquery['tax_query'] = array(
2     'relation' => 'OR',
3     array(
4         'taxonomy' => 'category',
5         'terms' => array('foo'),
6         'field' => 'slug',
7     ),
8     array(
9         'taxonomy' => 'post_tag',
10         'terms' => array('bar'),
11         'field' => 'slug',
12     ),
13 );
14 query_posts($myquery);

This gets us anything with a category of foo OR a tag of bar. Note that the relation is global to the query, so it appears outside the arrays in the tax_query, but still in the tax_query array itself. For clarity, I recommend always putting it first.

Useful Gallery Example

By combining these in different ways, you can make complex queries. What’s more, you can use it with any taxonomy you like. Here’s one I recently used:

1 $galleryquery = wp_parse_args($query_string);
2 $galleryquery['tax_query'] = array(
3     'tax_query' => array(
4         'relation' => 'OR',
5         array(
6             'taxonomy' => 'post_format',
7             'terms' => array('post-format-gallery'),
8             'field' => 'slug',
9         ),
10         array(
11             'taxonomy' => 'category',
12             'terms' => array('gallery'),
13             'field' => 'slug',
14         ),
15     ),
16 ) );
17 query_posts($galleryquery);

This gets any posts in either the gallery category OR that have a gallery post-format. Handy for making a gallery page template. I used the wp_parse_args($query_string) trick to make it able to handle paging properly, among other things.

Speed Concerns

Advanced taxonomy queries are cool, but be aware that complex queries are going to be slower. Not much slower, since the code does attempt to do things smartly, but each taxonomy you add is the equivalent of adding a JOIN. While the relevant tables are indexed, joins are still slower than non-joins. So it won’t always be a good idea to build out highly complex queries.

Still, it’s better than rolling your own complicated code to get a lot of things you don’t need and then parsing them out yourself. A whole lot easier too.