I watched the OOW13 (Oracle OpenWorld 2013) keynote yesterday. Oracle CEO Larry Ellison mentioned that Oracle 12c would maintain two copies of data: one for OLTP and another for OLAP, with both copies updated synchronously. The OLTP tables would be stored in row format, and the OLAP tables in columnar format.
The columnar-format OLAP tables would help read data faster for analytical queries, while the row-format OLTP tables would keep OLTP transaction processing fast.
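The tradeoff behind this design can be illustrated with a toy sketch (my own simplification in Python, not Oracle's implementation): an analytical aggregate only needs to touch one column array in a columnar layout, while a single-record change is cheapest in a row layout.

```python
# Toy illustration (my own sketch, not Oracle's implementation) of why
# columnar layout favors analytical reads while row layout favors OLTP.

# Row format: one record per tuple -- a whole order lives together,
# so a single-order change touches one place.
rows = [
    {"id": 1, "region": "EU", "amount": 100},
    {"id": 2, "region": "US", "amount": 250},
    {"id": 3, "region": "EU", "amount": 75},
]

# Columnar format: one array per column -- an analytical scan of a
# single column reads a small, contiguous slice of the data.
columns = {
    "id":     [1, 2, 3],
    "region": ["EU", "US", "EU"],
    "amount": [100, 250, 75],
}

# OLAP-style query: SUM(amount) needs only the "amount" array and can
# ignore every other column entirely.
total = sum(columns["amount"])  # 425

# OLTP-style update: in row format one record is modified in place;
# in columnar format the same change must be applied inside each
# affected column array, which is why pure columnar stores are slower
# at transactional updates.
rows[1]["amount"] = 300
columns["amount"][1] = 300
```

Maintaining both layouts gives each workload its preferred format, at the cost of doing every write twice.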
It sounded great when I first heard it. On further analysis and discussion, I have the following questions:
1) Mr Ellison mentioned OLAP queries would run faster because the data would be in columnar format. I get this. He also mentioned OLTP transactions would run faster. The reason: current systems have 10-20 analytical indexes per table and 1-3 OLTP indexes per table. Since users would run their queries and analytical workloads against the OLAP copy, the 10-20 analytical indexes would no longer be needed; as a result, updates, deletes and inserts would perform fewer operations. This, he reasoned, would improve performance. I would agree with him if tables really did have 10-20 analytical indexes on average.
However, I don't believe I've ever worked on a system (including an ERP system such as SAP) which had, or has, 11-23 indexes per table on average. At most, I've seen a few tables with ~10 indexes in total; on average, I've seen 3-5 indexes per table. And since customers normally run some kind of Data Warehouse for their analytical requirements, almost all of those indexes serve OLTP purposes. In that case, I doubt a significant number of indexes can be dropped to improve performance. 10-20 analytical indexes per table seems a very big assumption. If this assumption is wrong (not by a little, but by a big margin), wouldn't we see a significant OLTP performance degradation?
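The argument above is really just arithmetic about write amplification, so it can be made concrete with a back-of-the-envelope sketch (my own numbers and model, not Oracle's): each INSERT maintains the table plus one entry per index, so write work scales roughly with 1 + the index count.

```python
# Back-of-the-envelope sketch (my own model, not Oracle's) of how index
# count drives DML cost: each INSERT maintains the base table plus one
# entry per index, so relative write work is roughly 1 + n_indexes.

def insert_cost(n_indexes, base=1):
    """Rough relative write amplification for a single INSERT."""
    return base + n_indexes

# The keynote's assumed workload: 1-3 OLTP + 10-20 analytical indexes.
with_analytical    = insert_cost(3 + 20)   # 24 units of write work
without_analytical = insert_cost(3)        # 4 units after dropping them

# The workloads I've actually seen: 3-5 indexes, nearly all OLTP,
# so nothing can be dropped.
typical            = insert_cost(5)        # 6 units
typical_after_drop = insert_cost(5)        # still 6 units

print(with_analytical / without_analytical)  # 6.0x improvement claimed
print(typical / typical_after_drop)          # 1.0x in my experience
```

Under the keynote's assumption the win is dramatic; under the index counts I've actually observed, there is nothing to drop and hence no OLTP speedup to offset the cost of maintaining the second copy.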
2) How would this approach work with SAP? To improve scalability and performance, SAP performs asynchronous update processing using what are known as VB* tables; in other words, SAP implements its own transaction handling and does not rely entirely on Oracle's transaction handling to ensure data consistency. Would Oracle copy data to the OLAP side from the VB* tables, or trigger the updates/inserts/deletes only when the target tables are eventually updated asynchronously? And how would an error on the OLAP side be handled if it occurs during asynchronous processing?
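To make the timing problem concrete, here is a minimal sketch of SAP-style asynchronous update processing (my own simplification of the VB* mechanism, not SAP code): the user's dialog step only enqueues the change, and a separate update task applies it to the real tables later.

```python
# Minimal sketch (my own simplification, not SAP code) of asynchronous
# update processing: the dialog step records intent in a queue that
# stands in for the VB* update tables; a separate update task applies
# the change to the real database tables later.
from collections import deque

update_queue = deque()              # stands in for the VB* tables
committed_tables = {"orders": []}   # the "real" application tables

def dialog_step(change):
    """User transaction: enqueue the change only; no table is touched."""
    update_queue.append(change)

def update_task():
    """Asynchronous worker: apply queued changes to the target tables."""
    while update_queue:
        table, row = update_queue.popleft()
        committed_tables[table].append(row)

dialog_step(("orders", {"id": 1, "amount": 100}))
# At this point the user's transaction is "done" from SAP's view, yet a
# synchronous OLTP->OLAP copy keyed off the target table sees nothing:
assert committed_tables["orders"] == []
update_task()
print(committed_tables["orders"])  # change visible only after the worker runs
```

The question, then, is at which of these two points Oracle would propagate the change to the columnar copy, and who handles a failure if that propagation itself happens asynchronously.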
3) SAP also buffers tables on its application servers. How would buffered data be reflected on the OLAP side? Wouldn't there be an inconsistency between the OLAP and OLTP sides, because SAP knows a little more about the data (based on whether a table is buffered or not) than Oracle does?
4) What are the memory and disk size requirements for managing two copies of the data?
5) No logging for OLAP data. What does this mean for recovering the database from a crash? Would all OLAP data be recreated during restart after a crash? Wouldn't that make the recovery time after a crash much longer?
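My reading of "no logging" (an assumption on my part, not documented Oracle behavior) is that the columnar copy would have to be rebuilt from the recovered, logged row store after a crash, which means a full scan of every in-memory table on restart:

```python
# Toy sketch (my assumption about the mechanism, not documented Oracle
# behavior): if the columnar copy is unlogged, restart after a crash
# must repopulate it by scanning every row-format table.

def rebuild_column_store(row_tables):
    """Repopulate columnar copies by scanning each row-format table."""
    column_store = {}
    for name, table_rows in row_tables.items():
        if not table_rows:
            column_store[name] = {}
            continue
        cols = {c: [] for c in table_rows[0]}
        for row in table_rows:          # full table scan -- this is the
            for c, v in row.items():    # extra work added to recovery
                cols[c].append(v)
        column_store[name] = cols
    return column_store

# Row store as recovered from the redo logs after the crash:
orders = [{"id": 1, "amount": 100}, {"id": 2, "amount": 250}]
store = rebuild_column_store({"orders": orders})
print(store["orders"]["amount"])  # [100, 250]
```

If that is the mechanism, the time to reach full OLAP query performance after a crash would grow with the total size of the in-memory tables, even if the database itself opens quickly.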