BI Project Management in the Age of “In Memory”

Eliran Azulay

BI projects tend to take a long time. This is a known problem. Many clients know that when they embark on a BI project with the IT department, they will probably have to wait a long time for the result: requirement surveys, design decisions, and so on. It won’t just arrive on a silver platter.

In Memory is not a new concept in the world of BI. Until recently, the only well-known tool in the field was QlikView. New tools have popped up lately, including Microsoft’s PowerPivot on the client side (the server version is called Analysis Services Tabular Mode). There are other tools, from companies such as Tableau, MicroStrategy and BusinessObjects, which I am less familiar with, but they are all similar in that they hold the data in the machine’s memory and let you arrive at a product quickly with less effort.

Right about now you’re probably asking yourselves: “So why not develop all the projects this way?” or, “Where’s the catch?” So first, let’s put things in order. We’ll present the pros and cons, and then we’ll see how In Memory can help in classic BI projects. And no, you are not going to dump OLAP in the trash.

The Pros:

  • Millions of rows processed at incredible speed.
  • The ability to build a queryable model that combines several data sources with no ETL (some would say this is a disadvantage).
  • Most tools let you share the product and query it in a very convenient and immediate way.
  • PowerPivot specifically is a completely free tool. You can take several data sources on your home computer and build yourself a dashboard in Excel. Once you have a basic understanding of the product, creating a dashboard takes no more than 20 minutes (see the sketch after this list).
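
For a sense of how little code that 20-minute dashboard needs, here is a minimal sketch of two DAX measures, assuming a hypothetical Sales table with an Amount column and a related, contiguous Date table (the names are illustrative, not from any real model):

    Total Sales := SUM ( Sales[Amount] )

    -- Year-over-year comparison; assumes 'Date' is marked as a date table
    Sales LY := CALCULATE ( [Total Sales], SAMEPERIODLASTYEAR ( 'Date'[Date] ) )

Drop [Total Sales] and [Sales LY] into a PivotChart and you already have the core of a dashboard.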

The Cons:

  • It’s new and it’s fresh (Tabular shipped with SQL Server 2012). Developers don’t yet have much knowledge in the area, and some aren’t quick to adopt technology this new (that’s why this post was created).
  • It can make you lazy as a developer. ETL is a well-known pain point, mainly for Israelis… Developing a stable ETL process is a milestone in every responsible BI project.
  • As soon as the machine runs out of RAM, the project will collapse; at the very least, an error will pop up. Users won’t like that.

So when should In Memory be used, and when shouldn’t it?

If you’re about to start a big project, I hope you do a comprehensive, cross-cutting specification before getting your hands dirty. Once you’ve gathered the requirements and you have access to the data sources, you’re ready to start. But you still need to specify the ETL and specify the cube, which means letting the developers in each of these fields understand the requirement and start writing. You can easily lose a couple of months on this process. The end client gets irritated: he handed over the requirement two months ago and hasn’t seen any product. Sending him the specification document would be tantamount to sending a grocery list in Chinese – utterly confusing.

This is where the new technology comes in. What I like to do is create a quick project in Tabular Mode (and if you don’t have such a server, you can install PowerPivot and skip waiting for the DBA), and then start importing the data sources. Once that’s done, you can easily take the date field and convert it to the PK format of your choice so it connects to your data warehouse’s time dimension, take another relevant field and connect it to a different data source, pull the dollar exchange rate from an RSS feed so it also enters the picture, and so on.
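
To make that concrete, here is a minimal sketch of the date-key conversion as a DAX calculated column, assuming a hypothetical imported Sales table with an OrderDate column and a time dimension keyed on a YYYYMMDD integer:

    -- Calculated column on the imported Sales table:
    -- builds a YYYYMMDD integer key matching the time dimension's PK
    = YEAR ( Sales[OrderDate] ) * 10000
        + MONTH ( Sales[OrderDate] ) * 100
        + DAY ( Sales[OrderDate] )

Once the column exists, you draw the relationship to the time dimension in the diagram view, and every imported source slices by the same calendar.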

Once you’ve processed the model and deployed it to the server, you can connect to the “Tabular cube” from any Excel client or relevant OLAP viewer. You build a few nice graphs, lay out a Pivot so it’s easy on the eye, and schedule a meeting with the client.
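
If you prefer to sanity-check the model outside a Pivot first, you can also query the deployed model directly with DAX, for example from SQL Server Management Studio. A minimal sketch, reusing the hypothetical Sales table and Total Sales measure from above:

    EVALUATE
    SUMMARIZE (
        Sales,
        'Date'[Year],
        "Total Sales", [Total Sales]
    )
    ORDER BY 'Date'[Year]

Excel itself only needs the server and database names in the connection dialog; the Pivot does the rest.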

In this meeting, you explain to the client that you’ve built a basic model so they can take a look and get a sense of what they might get. They will be very happy, want to play around with the data, and things will start to become much clearer to them. At this point new requirements and clarifications will come up about what should and shouldn’t be done. Clients will always change their minds about things they said in the past once they’ve seen the product.

In such an event, you’ve saved yourselves a lot of the specification work, as well as the correction work once the project is live. In Memory technology is very good at demonstrating capabilities.

At the same time, these models can serve as a product for many years. You can create a job that refreshes the data every few minutes. The data can also sit directly in the dedicated database and be read in real time while the user queries it (with the Microsoft tool, this DirectQuery mode works only when the data source is SQL Server). This saves the IT team a lot of time. And if you have a client you trust, you can let them develop a fast solution themselves in PowerPivot, and then import it into Tabular.

This works very well for small to medium models. However, if complicated calculations, sophisticated authorizations and special pyrotechnics are involved, you would probably want to consider performing an educated comparison before you begin the work.

What else is it good for?

Let’s say you have a very large OLAP cube, and you don’t want to change it because that would mean heavy reprocessing and you’re busy with a different project, but the client is really putting on the pressure. Give them a quick In Memory solution! Take the data behind the cube model (from the star schema in the database), add the data the client requires, and give them access. You can even create some reports and throw in a dashboard. They won’t know the difference. The next time you upgrade their classic cube, fold in what you built in the In Memory tool, so that everything is unified and secure. This way you stay focused and keep the client happy.
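
A minimal sketch of that pattern, with purely hypothetical names: the base measure comes straight from the star schema’s fact table, and the client’s new requirement is layered on top as another DAX measure instead of a change to the classic cube:

    Sales Amount := SUM ( FactSales[SalesAmount] )

    -- The client's extra requirement, added only in the In Memory model:
    -- the same figure restricted to one region
    Sales Amount West :=
        CALCULATE ( [Sales Amount], DimRegion[Region] = "West" )

When the classic cube gets its next upgrade, the same logic moves into it and the interim model can be retired.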