Business Intelligence, Small, Medium and Venti

The world “business” in the field of IT is heavily overloaded with surprising meanings.  For example, the fact that an street address has a maximum number of characters of 50, is sometimes called a business rule.  And so is the algorithm for calculating what your life insurance premium.

“Business Intelligence” is just as surprising.  Business intelligence is using computers to gather actionable information to be used in business decisions.  Often, this works with some privative technology, such as summing or sorting.

Small Business Intelligence
In short, small business intelligence is anything that can be achieved easily with ordinary SQL.

Finding your best customer, is a matter of  summing and sorting.

Finding the worst deadbeats, is a matter of sorting your accounts receivable by age and size.

Ratios and Dashboards. A businessman wants to know, am I making money?  That’s easy, subtract costs from revenues.  But it doesn’t adjust for scale.  Am I making a lot of money compared to other bigger and smaller companies?  Profits/Revenue gets us a ratio that can be tracked over time, benchmarked against other similar organization and son on.  The challenge of these ratios is that while the math is simple, assembling that much data is complicated, mostly by the fact that the more data you are moving around, the more problems you run into.  Deciding how to present the results as something other than a sea of numbers is also complicated– it is the user experience challenge of the database world.

Ratios analysis can quickly turn into my next category, especially if you want to have the profit rate for every department, office, district, country, etc in your organization– a trick that requires cross tabs and lots of them.

Business Intelligence is also confused with making the above look pretty, so often reporting products call themselves Business Intelligence, even if they really are only meant to deal with small “business intelligence”

Medium Business Intelligence
Crosstabs, also called pivots are the next level.  If you have 2 discrete variables, such as product category and state, you can find out what states are enjoying the full range of your products and which ones aren’t.   If one cell looks like every other, then we don’t have much to say, except the variables are probably not related to each other.  If they cells are very different, then you have a non-random patter and it probably means something, like widget X isn’t being advertised enough in Idaho.  In stats class this would have been Chi-Square statistics–surprisingly, chi-square statistics are never reported with cross tabs in any common database tool I’ve used.

There are two technical challenges with working with crosstabs.  First, SQL was never meant to support crosstabs. Relational algebra (which SQL is a type of), is all about two dimensional tables with columns, not multidimensional cubes of data.  Hence the need for MDX, yet another SQL-like query language that concisely queries cubes to return particular cross tabs of data, after all, we don’t have a very good way of visualizing crosstabs of more than two dimensions.  MDX also has an expression language which has the feel of excel formulae, especially in the way that

The final thing that makes business intelligence base on cubes and crosstabs hard is the combinatorial explosion of report types.  Cubes are often searched for unusual patterns by successively querying similar queries, sometimes at increasingly granular levels (drilling down), or some other ad hoc pattern.  This represents more canned queries than can hoped to be created by a finite IT department, so various cube browsers have been invented to allow non-database experts to craft an MDX query without realizing that they are crafting an MDX query.  Not unlike SQL query designers, this has mixed success as MDX is more expressive than what most cube browsers are capable of illustrating.

Big Business Intelligence
Big business intelligence is mostly the domain of  professional mathematicians and statisticians and includes data mining, multiple regression, and every other advanced statistics modeling technique that has ever been imagine by feverish mad professors.  The professional mathematicians are the only ones that have the credentials the business community wants to see before they will trust a mathematical model they don’t understand.

Microsoft has attempted to introduce “black box” data mining.  It is so black box that it is hard to figure out how to put data into these models and harder yet to understand what we are getting back.  This is a real tragedy because Microsoft could have figured out how to bring the power of advanced college statistics to the database world without resorting to proprietary black box algorithms.

By advanced college statistics, I mean things like ordinary least squares regression, monte carlo optimization and other models that can be understood with out a PhD.  Right now, doing this requires using the analysis features of excel, SAS, Mathematica, Gretl, etc.

Sadly, this level of business intelligence is likely to remain in the far future.