THE PUTXI PROJECT

Catalonia is Not Spain

Access SQL TRANSFORM Statement

Posted by putxi on March 16, 2007

The Access SQL TRANSFORM statement is a non-standard extension to SQL-92 implemented by Microsoft’s JET database engine to create crosstab queries. Crosstab queries calculate a sum, average, count, or some other aggregate on rows, and then group the results in two dimensions: one down the left side (row headings) and the other across the top (column headings). In Excel these are called PivotTables. In SQL Server you’re on your own.Crosstab queries provide a powerful data analysis tool, but can be tricky to use in Access. It’s not that they’re hard to create. Access provides a wizard for creating new ones, and the syntax is pretty straightforward, if under-documented (that TRANSFORM link above is to the Access 97 documentation). It’s using them, like as the basis for a report, that can be tricky.

The problem is that a crosstab query’s columns aren’t generally known until the query is run. After all, a crosstab query’s purpose is to turn row aggregations into columns, and so the number of columns (and their names) will generally vary depending on the data selected. Access reports, on the other hand, really want to know at design time what columns their base queries will return. Without this information, they can’t know what data columns are available for layout and presentation.

Luckily, there are lots of ways to get around this in Access. For instance, you can assemble, parse, and distribute the query’s results as delimited values, or use a multi-column subreport, or use a parallel non-TRANSFORM GROUP BY query and fill the report’s controls programmatically. The folks at Corporate Technologies have prepared a great demonstration of the many ways to do this, so I won’t repeat them here. What I am going to do is ramble some about creating a parameterized crosstab query with a fixed set of columns in order to highlight a couple of gotchas.

I prepared a sample database, in case you want to follow along.

Let’s start with a single table named mytable having the following columns and rows:

column  type
key     long integer
project long integer
year    integer
amount  currency

key  project year    amount
1    100     2003    $0.50
2    100     2004    $1.00
3    100     2004    $1.50
4    100     2005    $2.00
5    200     2004    $3.00
6    200     2005    $4.00
7    200     2005    $4.50
8    200     2006    $5.00

Using standard GROUP BY SQL, it’s easy to get a total of the amounts by project (see q0a):

SELECT mytable.project, Sum(mytable.amount) AS total
FROM mytable
GROUP BY mytable.project;

project   total
100       $5.00
200      $16.50

Or a total of the amounts by year (see q0b):

SELECT mytable.year, Sum(mytable.amount) AS total
FROM mytable
GROUP BY mytable.year;

year    total
2003    $0.50
2004    $5.50
2005   $10.50
2006    $5.00

However, to get a total of the amounts by project for each year, we’ll need a crosstab query.

To turn the former GROUP BY query into a crosstab, wrap it in a TRANSFORM statement, using the Sum() of mytable.amount for the TRANSFORM and mytable.year as the PIVOT (see q1a):

TRANSFORM Sum(mytable.amount) AS total
SELECT mytable.project
FROM mytable
GROUP BY mytable.project
PIVOT mytable.year;

project   2003   2004   2005   2006
100       $0.50  $2.50  $2.00
200              $3.00  $8.50  $5.00

Alternatively, we can GROUP BY year and PIVOT on the project (see q1b):

TRANSFORM Sum(mytable.amount) AS total
SELECT mytable.year
FROM mytable
GROUP BY mytable.year
PIVOT mytable.project;

year    100      200
2003    $0.50
2004    $2.50    $3.00
2005    $2.00    $8.50
2006             $5.00

Now suppose we need to create an Access report based on the first crosstab query above (i.e., total amounts by project for each year), but we want to restrict the data to a user-specified year and the year after. To do this, let’s add a WHERE clause that uses a numeric parameter named “foryear” (see q2):

PARAMETERS foryear Short;
TRANSFORM Sum(mytable.amount) AS total
SELECT mytable.project
FROM mytable
WHERE mytable.year In ([foryear],[foryear]+1)
GROUP BY mytable.project
PIVOT mytable.year;

When we run this query, we’ll be prompted to enter a value for the “foryear” parameter. If we specify 2004, we get this:

project   2004   2005
100       $2.50  $2.00
200       $3.00  $8.50

Perfect. Unfortunately, we can’t base a report on it. Well, we could, but we would have to bind the columns to a particular set of years, or programmatically establish the column bindings at runtime. Not what we want.

Luckily, in situations like this, where the number and/or type of columns desired are known in advance, there’s a simple solution. We can coerce the PIVOT data to a set of fixed values using an expression. In other words, instead of relying on the varying values of a column to determine our column headings, we’ll calculate fixed column headings from the column data using an expression.

In our query, we know we’ll only ever have two columns: one for the specified “foryear” parameter, and one for the next year. So, we can PIVOT on an expression that converts mytable.year values to the appropriate fixed values of “thisyear” or “nextyear” (see q3):

PARAMETERS foryear Short;
TRANSFORM Sum(mytable.amount) AS total
SELECT mytable.project
FROM mytable
WHERE mytable.year In ([foryear],[foryear]+1)
GROUP BY mytable.project
PIVOT IIf(mytable.year=[foryear],"thisyear","nextyear");

Now when we run this query and specify 2004 for the “foryear” parameter, we get this:

project   nextyear   thisyear
100       $2.00      $2.50
200       $8.50      $3.00

And if we specify 2005 for the “foryear” parameter, we get this:

project   nextyear   thisyear
100                  $2.00
200       $5.00      $8.50

Now we have a crosstab query with fixed columns to which we can bind a report.

I should point out that if you are going to create a report based on this query, you might want to avoid specifying the report’s RecordSource at design-time. In other words, create an unbound report, and bind the report to the query at run-time by setting the report’s RecordSource property in its Open event (see rep-q3a). If you don’t do this you’ll be prompted for the “foryear” parameter a million times while designing the report. Alternatively, you can bind the query’s parameter to a control on an open form, or wait until you’re done designing the unbound report and assign its RecordSource just before you save it (see rep-q3b). Note that you’ll have to assign the report a RecordSource at design-time if you intend to use it as a subreport.

So, are we done? Not quite, we’ve still got a problem. The problem is that our crosstab query has columns consisting of fixed values, but doesn’t have a fixed number of columns. In other words, we know what the possible column values are (and so can bind form/report controls to them), but we don’t know until the query is run how many of those columns it will actually have.

To illustrate this problem, run our last query again and specify 2002 for the “foryear” parameter:

project   nextyear
100       $0.50

Yikes, we lost a column.

This presents an obvious problem for any forms or reports we’ve bound to the query. If those forms or reports always assume the existence of both a “thisyear” and “nextyear” column then we’ll be seeing a 3070 error: “The Microsoft Jet database engine does not recognize <name> as a valid field name or expression.”

The other problem is that we can’t use this query as the basis for a subreport. If we do (see rep-q3bsub), we’ll get an error 2172: “You can’t use a pass-through query or a non-fixed-column crosstab query as a record source for a subform or subreport.” And, contrary to KB 209218, clearing the child report’s LinkChildFields and LinkMasterFields properties will not allow the report to run “showing every row in the subreport.” I also noticed that Access swallows this error if the parent report is opened programmatically using DoCmd.OpenReport, even though the report still doesn’t work (see frm-rep-q3bsub).

The solution is simple: explicitly declare the columns our crosstab query will always have. We do this by adding an IN clause to our PIVOT (see q4):

PARAMETERS foryear Short;
TRANSFORM Sum(mytable.amount) AS total
SELECT mytable.project
FROM mytable
WHERE mytable.year In ([foryear],[foryear]+1)
GROUP BY mytable.project
PIVOT IIf(mytable.year=[foryear],"thisyear","nextyear") IN ("thisyear", "nextyear");

Now if we specify 2002 for the “foryear” parameter we get:

project   thisyear   nextyear
100                  $0.50

Great. We’ve now got a fixed column crosstab query that we can bind to the RecordSource of a report at design-time without causing endless parameter value prompts (see rep-q4), and that we can successfully use as the RecordSource of a linked subreport (see rep-q4sub).

By the way, if you’re wondering why earlier we didn’t do something like this:

PARAMETERS foryear Short;
TRANSFORM Sum(mytable.amount) AS total
SELECT mytable.project
FROM mytable
WHERE mytable.year In ([foryear],[foryear]+1)
GROUP BY mytable.project
PIVOT mytable.year IN ([foryear], [foryear]+1);

It’s because Access doesn’t support the parameterization of IN clauses in a TRANSFORM statement’s PIVOT clause. This will get you an error 3071: “This expression is typed incorrectly, or it is too complex to be evaluated. For example, a numeric expression may contain too many complicated elements. Try simplifying the expression by assigning parts of the expression to variables.”

Now, let’s talk about another TRANSFORM gotcha. It seems that Access can’t see through TRANSFORM SQL the same way it can see through all other SQL statements. Specifically, Access can’t resolve implicit query parameters through a chain of separate nested queries that contain, somewhere in the chain, a TRANSFORM query, unless the TRANSFORM query also uses or declares the same downstream parameters. More simply, TRANSFORM queries terminate implicit parameter bubbling in Access.

An example will probably explain this better. If we have a query named q5 like this:

SELECT mytable.project, mytable.year, mytable.amount
FROM mytable
WHERE mytable.year In ([foryear],[foryear]+1);

And another query named q6 that relies on q5 like this:

SELECT mytable.project, Sum(mytable.amount)
FROM q5
GROUP BY mytable.project;

Running query q6 will cause Access to prompt us for a value for the “foryear” parameter, which is used, but not explicitly declared, in q5 . Likewise, if we programmatically evaluate the Parameters collection of q6’s QueryDef object, we’ll find that it contains a single “foryear” Parameter:

?CurrentDb.QueryDefs("q6").Parameters(0).Name
foryear

However, if we add another query named q7, which also relies on query q5, but uses a TRANSFORM statement, like this:

TRANSFORM Sum(mytable.amount) AS total
SELECT mytable.project
FROM q5
GROUP BY mytable.project
PIVOT mytable.year;

Running the query will not get us a prompt for the “fordate” parameter. Instead, it will will get us an error 3070: “The Microsoft Jet database engine does not recognize ‘foryear’ as a valid field name or expression.” We’ll also see this error if we try to programmatically evaluate the Parameters collection of q7’s QueryDef object.

The solution is simple: explicitly declare query parameters, either in the queries that use them, or in the TRANSFORM queries that use the queries that use them.

Finally, just to be complete, I should point out that there’s a way to get the same crosstab-style fixed column results illustrated above without using a TRANSFORM statement (see q8):

PARAMETERS foryear Short;
SELECT myvalues.project, Sum(myvalues.thisyear) AS thisyear, Sum(myvalues.nextyear) AS nextyear
FROM
  [SELECT mytable.project, IIf(mytable.year=foryear, mytable.amount, Null) as thisyear, IIf(mytable.year=foryear + 1, mytable.amount, Null) as nextyear
   FROM mytable
   WHERE mytable.year In (foryear,foryear+1)]. AS myvalues
GROUP BY myvalues.project;

Of course, this type of sub-SELECT SQL is always an option when you’re dealing with fixed column presentations. However, I find the TRANSFORM-based approach cleaner and easier to read.

Advertisements

3 Responses to “Access SQL TRANSFORM Statement”

  1. Stuart Voice said

    Hi,
    I have been trying to find a solution to missing columns in a pivot query (into a report) for a while now with no joy – the simple addition of the IN statement into the SQL query and it works fine – wish I had found it sooner, thanks for the assist.
    Stuart

  2. MissFitz said

    Thanks for the great example on TRANSFORM. I used it for a report and it worked perfectly.

  3. saqib said

    Hello
    Can in PIVOT we can use two fields. if it is required what will be the query i.e apply pivot on two column in one pivot table.

    Thanks

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

 
%d bloggers like this: