The world of data warehousing is wonderful. I don’t say this because some of the technologies used are enthralling and fun, nor because the return on investment of a successful data warehousing project is usually high. While those would be perfectly valid reasons to get excited about stars and snowflakes, column-stores and graphs, the truth is that data warehousing is wonderful because there are many ways (and tools) to go about doing it. The choices are seemingly endless.
The reasons for implementing it, on the other hand, are not.
Bringing analytical capability to your company is rarely cheap. Some projects take long. Other projects are shorter but may require more expertise, which makes them just as expensive. The harsh truth is that, wonderful as it may be, data warehouses (and variants thereof) require justification.
To make things easier, we have listed 10 reasons why you should consider getting in touch with us (other than the fact that some of us are really nice people):
1. Inefficient Use of Time
One of the greatest motivators for companies to implement data warehouses is that many of their employees are spending precious time collecting and analysing data in spreadsheets. Correcting the data and producing reports takes great effort, but the real crime is that this is generally done once a month or once a quarter, depending on the business requirement.
Having a data warehouse with a good set of dashboards and reports would automate this and save everyone a lot of time.
The true Achilles’ heel of the spreadsheet operation is in fact the inaccuracies and the discrepancies that inevitably crop up over time. We established earlier that manually collecting data and putting together reports is not an easy task, but the bigger problem is that the slightest mistake could throw a report off balance. It could be a difference of 2% or a difference of 20%. Without proper measures in place (and there usually aren’t), your company will have no way of finding out whether any reports are correct or not.
The likes of Richard Branson believe that leaps of faith are good for companies, but it’s an act of plain stupidity if that leap of faith is believing unverified numbers.
Manually verifying each report, on the other hand, would only exacerbate the problem of time consumption.
When it comes to implementing a data warehouse and building reports around it, validation and verification need only occur once. After that, you are guaranteed continuous correctness.
3. Variance between departments
Continuing on the theme of imprecision, another problem that plagues larger companies is the problem of data silos. Data is considered to be in a silo when data (often redundant) is collected and insulated from the rest of the company. Moreover, different departments might have different ways of calculating the same things, or might use terms differently.
As part of the data warehousing exercise, such processes are rigorously thrashed out, leaving one single source of the truth.
4. Not looking into the past
When analysing data directly from your source systems (payroll systems, point-of-sales systems, etc…), it might be difficult to look at historical data. In some cases, historical data is stored in older spreadsheets or is even purged. Being able to compare to past data helps give you an idea of the direction in which you are going. You could then supplement that analysis with correlating any trends with changes to different variables in your business.
Having a data warehouse will allow you to capture data from many sources (payroll, point-of-sales, weather forecasts, stock market indices) over time. Easy period-on-period analysis is one of the many things that a good data warehouse will allow.
5. Pushing a Spreadsheets to its Limits
A racing car with a puncture sounds pretty difficult to control. A spreadsheet with 75,000 rows and 20 columns sounds equally difficult, especially if you are on a deadline. There is no denying that spreadsheets are absolutely excellent tools, but many employees become increasingly reliant on spreadsheets – writing more reports, storing more data, and so on. Before you know it, your spreadsheets are taking forever to load, you are laboriously triple-checking every line to avoid mistakes and your time is just disappearing into thin air. Oh, and a spreadsheet is a lot more difficult to use once you are dealing with large wads of data.
A data warehouse is, by design, very scalable. It can grow to accommodate massive amounts of data. The right dashboards will allow you to quickly leaf through that data and find the answers you are looking for, without having to control a behemoth-like spreadsheet.
6. Trouble sharing work
This point continues from the assumption that large and complex reports are being built in spreadsheets, but looks at the next step. What happens once they’re done? Do you e-mail them and fill your colleague’s hard drives with more redundant copies? Do you put them on a shared drive where the file can only be edited by one person at any given time? What happens if someone finds a mistake and fixes it? Are you disciplined enough to replace your local (and original) spreadsheet with the updated one? These might be manageable for one-off reports but it could be quite the problem when you are churning out a dozen reports a year.
These (and many other) questions arise from problems that hamper the hard work some people put into their reports. The worrying part is that the problems do not stop at sharing one’s work – they could lead to more time wasting and worse yet, imprecision.
A centrally-accessible reporting server will allow employees to write their reports and publish them for others to see. What’s more, these report servers come with tools that manage changes and backups.
This ice cream is more editable than most spreadsheets, because two users can make changes at the same time.
7. Trouble with security
At most companies (especially the larger ones), employees are not allowed access to all the data. In some cases, this would simply be a matter of not sharing the data with them. But what if you had a large report with data from many departments? In a spreadsheet, you’d have to use up a lot of time toying around with security settings to hide the data. Even then, the security offered might not be up to scratch. A sure-fire way of protecting parts of the data would be to issue multiple versions of the same report, containing only the data that the audience is privy too. That would take a lot of effort though.
What if you are e-mailing your report, and rather than sending it to your colleague, you send it to a customer with the same name? Someone once sent me a sample of bank records by accident.
Dashboards can be designed to automatically secure parts of the data depending on the employee’s login credentials. That’s far easier than managing the problem manually.
That’s what happens when you leave the security of your information in the hands of every employee.
8. A need for varied data
What if you would like to start collecting data from new sources? For example, Facebook comments about a new product. Would you log into Facebook and copy-and-paste each comment into a spreadsheet? The smart money would use a tool to scrape comments off Facebook and store them in a database. But what then? Are you willing to spend time each day loading these comments and copying them into the various reports you might have? Will you match each comment to each product/characteristic manually?
A well-designed data warehouse can integrate multiple sources of data into one unified data model. Furthermore, new sources can be added with relative ease.
Variety can be found everywhere, not only in your data sources.
9. An interest in trends and patterns
Data mining comprises a set of techniques used to process (or “mine”) a set of data in order to identify patterns and discover knowledge. Finding patterns in sales, correlating stock movements with oil prices and forecasting next winter’s customer footfall are all examples of data mining.
Spreadsheets are capable of rudimentary data mining techniques, but there are limits on how much data can be mined in one sheet. I cannot imagine eBay fitting all their sales into one spreadsheet in order to find seasonal trends.
A data warehouse is not a data mining tool, but can be used as a source of clean and structured data which can be fed directly into a data mining tool.
10. Overloaded Operational Systems
Many operational systems (also known as source systems), such as payroll and point-of-sales systems, need to be maintained regularly in order to keep them operating at optimal speeds. This generally means reducing the number of times their databases are read, as well as purging some of the older data. This data is sometimes archived, though even then it isn’t in a readily accessible state.
Repetitively collecting data from operational systems will impact their stability and speeds. Any changes to the data will not be reflected in the collected data, either. By connecting reports directly to the operational databases would avoid the problem of showing outdated data, but that report is then vulnerable to any changes to the system. What would happen to your year-on-year reports if someone were to purge the POS system of all sales but this year’s?
A data warehouse is designed to read data from operational systems during off-peak hours, updating itself as it goes along. This data is then cleaned and stored in a highly-accessible, structured manner. This makes it a far more reliable source of data.
This is not the only kind of overloading that should be avoided.
If any of these are a thorn in your side, give us a call and let us help you out. There are many ways to go about implementing Business Intelligence in your organisation, and one of them is bound to be the right fit.