Much has been written about SAP HANA. The technology has been variously described as “transformative” and “wacko.” Well, which is it?
Disclosures
I have a few disclosures to make before I continue my analysis and comments on HANA:
- I worked at SAP for six years, as well as eight years at Oracle (and at Ingres before that).
- I was at SAP when the technology underlying HANA was acquired, though I am referring to and using no trade secrets or proprietary information in preparing this analysis.
- I attended this year’s SAPPHIRE conference in Orlando, and SAP paid for my airfare and hotel.
Relational Databases
Relational databases have dominated the commercial information processing world for twenty years or more. There are many good reasons for this success.
- Relational databases are suitable for a broad range of applications.
- Relational databases provide relatively efficient access to data, even for queries that were not envisioned when the database was designed.
- Today’s relational databases are economical, available on a broad range of hardware and operating systems, generally compatible across vendors, performant for many queries, scalable to fairly large data volumes without resorting to partitioning, suitable for partitioning when larger scale is required, based on open standards, mature, and stable.
- There are a large number of developers, administrators, designers, and an ecosystem of service providers who are very knowledgeable about today’s popular relational databases, and who are available at economic rates of pay.
NoSQL, Columnar, and In-Memory Trends
There is an emerging trend towards databases that are designed to solve specific problems. While relational databases are good for solving many problems, it is easy to conceive of specific problems that are not well-solved by general-purpose databases. Relational databases are well-suited to handling structured data where the schema does not change, where text processing is not an important requirement, where data is measured in gigabytes rather than petabytes, where geographical or time-series (e.g., stream) processing is not required, and where the server does not need to support transactional and decision-support queries simultaneously.
Some problems do not fit those criteria. The data set is such that the schema varies from record to record, or over time. Text, image, “blob,” or geographical data may be a dominant data type. More and more frequently, applications manage “big data,” or huge volumes of data from millions of users or sensors. Some applications require simultaneous access to data for transactional updates as well as for aggregation in decision-support queries. For all of these cases, advanced architects and developers are looking at specialized data stores and data processing systems such as Hadoop, Cassandra, MongoDB, and others. These domain-specific data stores are known as “NoSQL” databases.
There is some controversy over whether NoSQL means “no SQL” or “Not Only SQL.” Regardless, non-relational stores such as Hadoop are growing in popularity, but they are not really a replacement for relational data stores. A key property of most commercial relational databases is their compliance with a set of guarantees called “ACID” (atomicity, consistency, isolation, durability), which essentially ensures that database transactions occur in a reliable way. Many NoSQL databases use techniques like “eventual consistency” to improve performance at the cost of inconsistent data – a sacrifice that is unsuitable for most business applications. After all, if you deposit money in a bank account, you want it to be available for withdrawal right away, not “eventually.”
Another trend in the database world is towards new methods of storing data, without eliminating the ACID properties that business applications need, and without sacrificing the SQL language that is so well-known and widely supported. Two specific approaches are quite popular these days – columnar storage and in-memory databases.
Column stores, such as HP’s Vertica or SAP Sybase IQ, store data by column. By contrast, traditional SQL databases store data as rows. The benefit of storing data as rows is that it is often the fastest way to look up a single value, such as salary, given a key value like the employee ID.
Columnar databases group data by column. Within a column, generally speaking, all the data is of the same type. A columnar store therefore keeps data of a single type together, which enables advantages such as significant compression. Good compression can reduce disk space requirements, memory requirements, and access times.
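To make the contrast concrete, here is a toy sketch in Python – purely illustrative, and not any vendor’s internals – of the same small table stored row-wise and column-wise, with simple dictionary encoding applied to a low-cardinality column:

```python
# Toy illustration of row vs. columnar storage (not HANA internals).

# Row store: each record's fields are kept together -- fast for
# "fetch everything about employee 102."
rows = [
    (101, "Alice", "Sales", 70000),
    (102, "Bob", "Sales", 65000),
    (103, "Carol", "Engineering", 90000),
    (104, "Dave", "Sales", 62000),
]

# Column store: each column's values are kept together -- fast for
# "average salary," and same-typed values compress well.
columns = {
    "id": [101, 102, 103, 104],
    "name": ["Alice", "Bob", "Carol", "Dave"],
    "dept": ["Sales", "Sales", "Engineering", "Sales"],
    "salary": [70000, 65000, 90000, 62000],
}

def dictionary_encode(values):
    """Replace repeated values with small integer codes -- the kind of
    compression a low-cardinality column like 'dept' benefits from."""
    dictionary, codes, index = [], [], {}
    for v in values:
        if v not in index:
            index[v] = len(dictionary)
            dictionary.append(v)
        codes.append(index[v])
    return dictionary, codes

dept_dict, dept_codes = dictionary_encode(columns["dept"])
print(dept_dict, dept_codes)  # ['Sales', 'Engineering'] [0, 0, 1, 0]

# Row-store strength: a point lookup by key touches one record.
print(next(r for r in rows if r[0] == 102))

# Column-store strength: an aggregate scans one column only.
print(sum(columns["salary"]) / len(columns["salary"]))
```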
In-memory databases take advantage of two hardware trends: a significant reduction in the cost of RAM, and a significant increase in the amount of addressable memory in today’s computers. It is possible, and economically feasible, to put an entire database in memory for fast data management and query. Using columnar or other compression approaches, even larger data sets can be loaded entirely into main memory. With high-speed access to memory-resident data, more users can be supported on a single machine. Also, with an in-memory database, both transactional and decision-support queries can be supported on a single machine, meaning there can be zero latency between data appearing in the system and that data being available to decision-support applications. In a traditional set-up, where data resides in the operational store and is then extracted into a data warehouse for reporting and analysis, there is always a lag between data capture and its availability for analysis.
SAP HANA
Several years ago, SAP acquired Transact In Memory, a company that had developed an in-memory database. In the years since, at virtually every annual SAPPHIRE conference, SAP has discussed how this in-memory technology would revolutionize business computing, but I personally found the explanations to be somewhat short on convincing details.
Even the name, HANA, has changed in meaning over the years. Initially, the name stood for “Hasso’s New Architecture” (and a beautiful vacation spot in Maui, Hawaii) and referred only to the software. Today, HANA stands for High-Performance Analytical Appliance, and refers to the software and the hardware appliance on which it is shipped. In addition, HANA has evolved from a data warehousing database into a more general-purpose platform.
SAP HANA does manage data in memory, for nearly incredible performance in some applications, but it also manages to persist that data on disk, making it suitable for analytical applications and transactional applications – simultaneously. But HANA’s capabilities do not end there, and that may be the key to HANA’s long-term value.
In the short-term, it seems that SAP still struggles to generate references for HANA, other than in a narrow set of custom data-warehouse-type analytics. That may obscure where HANA can really deliver its first market successes.
When HANA is generally available, it is expected to include both SQL and MDX interfaces, meaning that it can easily be dropped into Business Objects environments to dramatically improve performance. Some Business Objects analyses, whether in the Business Objects client or in Excel, can achieve orders-of-magnitude performance improvements with very little effort. Imagine reports that used to take a minute to run now running instantaneously. Imagine the satisfaction of your BOBJ user community if all or most of their reports and analyses ran instantaneously. Line-of-business users will pay for this capability, and that will open the door for SAP HANA in Business Objects accounts. After HANA gets in the door, I’m sure the CIO will find tons of additional uses for it. This is huge, and will generate truckloads of money for SAP, while also making customers super-satisfied.
And think of what SAP HANA means for competitive comparisons with Oracle, SAP’s arch-rival. Larry wants to sell you Exalogic and Exadata machines, costing millions; Hasso wants to sell you a simple, low-end, commodity device delivering the same benefits. If I were SAP, I’d have sales reps with HANA software installed on their laptops, demonstrating it at every customer interaction, comparing it (favorably) with Oracle Exadata, and suggesting that customers demand that Oracle sales reps bring in an Exadata box on their next sales call – and not bother showing up without one. Larry wants to sell you a cloud in a box; SAP will sell you apps on the cloud, or analytics in a box, at a hundredth to a thousandth of the cost of Oracle’s solution.
The longer term benefits of HANA will require new software to be written – software that takes advantage of objects managed in main memory, and with logic pushed down into the HANA layer. I’ll post more on this potential in the future, but just think of what instantaneous processing of enormous data sets will mean to business – continuous supply chain optimization, real-time pricing, automated and excellent customer service, and much more.
Summary
In the long run, SAP HANA may indeed revolutionize enterprise business applications, but that remains to be seen. Right now, SAP HANA should be capable of creating substantial customer benefits – and generating a very large revenue stream to SAP.

This was very good until we reached the point of pricing – maybe I missed it, but I haven’t seen any pricing anywhere on HANA. So (perhaps by design) how can anyone make any comparisons with a straight face? And where is the comparison for those who aren’t existing ERP/BI customers? Consider a company that invested several hundred million dollars in an SAP install primarily for competitive advantage (I am familiar with one such install in process): how will they feel if the value-add on top of the stack, the HANA appliance, is provided to competitors for even a few million? (Servers alone are reported to be in the $400k range now.)
Mark –
Thanks for the thoughts. You are right that SAP has not shared its thoughts on pricing overall, but we do know a few things:
1. Oracle Exadata runs only on machines that cost in the range of $10^6 to $10^7.
2. Hasso has said that SAP HANA can run an entire company on a box comparable to a Mac mini, which is in the price range of $10^3.
If SAP is wise, it will keep the software in a price range of $10^4 to $10^5, which would give it an overall solution price advantage of $10^1 to $10^3 versus Oracle. Even with a server in the price range of $10^4 to $10^5, SAP still has a price advantage of $10^1 to $10^3.
Further, SAP has the potential to package the integration with Business Objects so that the implementation costs are negligible as well.
But, as you point out, we must keep an eye on the pricing from SAP to really understand the economics in comparison. Thanks for keeping me honest!
– Dennis
I will post soon about pricing – I got some good information on pricing at SAP TechEd, and the news is very good for SAP customers (and potentially very bad for Oracle sales reps and shareholders!) … stay tuned …
– Dennis
Thought it wouldn’t hurt to give it a shot. I was right.
Mr. Moore,
I read your article and have a couple of questions that it did not address. I have a novice understanding of data warehousing, so my questions may seem fairly elementary.
What is the best way to discuss?
Jennifer –
Please post your questions here, or you can send them to me on twitter @dbmoore. Thanks!
– Dennis
Dennis,
Vertica and Sybase IQ are both relational and ACID. You may question their performance when being updated, but that’s a different matter.
And Vertica, at least, has some pretty good answers to that challenge.
Curt –
I’m not sure where you got the impression that I’m saying that Vertica and Sybase IQ are not ACID or relational. I know that Vertica and Sybase IQ are both ACID and relational.
I did say that ‘Many NoSQL databases use techniques like “eventual consistency” to improve performance at the cost of inconsistent data – a sacrifice that is unsuitable for most business applications.’ I did *not* say that Vertica or Sybase IQ is NoSQL.
I did contrast columnar databases with “traditional SQL databases,” the latter storing data by row.
Thanks for giving me the opportunity to clarify.
– Dennis
Hi Dennis,
I have a simple question regarding your argument that HANA’s running “in-memory” is, in turn, what makes the difference.
From what I know, classic RDBMSs cache disk pages as much as possible, meaning that in the ideal case all data relevant to a query already sits in a (memory) cache. Following your argument, that would mean Oracle, IBM DB2, Sybase, and SQL Server should run as fast as HANA. They should simply sit on a server with enough memory … which – by your argument – shouldn’t be a problem given today’s hardware and prices.
Is there a flaw in my (and consequently your) line of argument?
Regards
Bob
Wow – the great questions keep pouring in!
What is the difference between an in-memory database and a traditional database with the cache cranked up? Several items are fundamentally different in these two cases.
A traditional relational database is optimized for disk-based access, which is orders of magnitude slower than memory-based access. This leads to certain design choices that have traditionally been made; one could argue these choices could be made differently going forward, but databases like Oracle’s currently have not done so – not even in the TimesTen database product.
Traditional databases assume the data resides on disk or in a buffer. To operate on the data, the database must move it from disk into the buffer (if it is not already there). Greatly increasing the size of the cache does not guarantee the buffer pool will be large enough to host all the data, but let’s assume that it does in a particular case for comparison purposes.
Traditional databases assume the index also may be on disk. To operate on the index and data, both of which may be on disk, the traditional database (intelligently) optimizes for minimum disk accesses. In-memory databases have index structures optimized to reduce computational time, and these index structures are significantly faster for accessing data in memory than traditional database indices are. Today’s traditional databases do not offer these types of indices optimized for memory-resident data, though there is no intrinsic reason they cannot in the future.
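As a toy illustration of why those designs differ (no vendor’s actual index code, just a sketch): a disk-oriented index organizes keys into wide pages, because the expensive unit of work is fetching a block, while a memory-oriented index can use something like a hash table, which simply minimizes CPU work per probe:

```python
import bisect

# Disk-style index: sorted keys grouped into wide "pages," because the
# costly operation is fetching a block, not comparing keys.
pages = [[10, 20, 30, 40], [50, 60, 70, 80], [90, 100, 110, 120]]

def page_lookup(key):
    for page in pages:              # each iteration ~ one block fetch
        if key <= page[-1]:
            i = bisect.bisect_left(page, key)
            return i < len(page) and page[i] == key
    return False

# Memory-style index: a hash table with no notion of blocks at all;
# the goal is minimal CPU work per probe, straight into RAM.
hash_index = {k: True for page in pages for k in page}

def memory_lookup(key):
    return key in hash_index

assert page_lookup(70) and memory_lookup(70)
assert not page_lookup(75) and not memory_lookup(75)
```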
Traditional databases assume the primary way to access the database is from disk, and have been designed to support many failure cases (computer crash, database software crash, disk loss) that require the database to be recovered quickly and consistently after a crash. In-memory databases (generally, and HANA in particular) stream transactions to disk for persistence and to ensure consistent recoverability, but these databases assume access is to memory-resident data. This allows optimizations (without sacrificing data recoverability) that reduce the number of times the computer waits for the disks to acknowledge that they are in sync with memory. This would be very hard (in my opinion) to offer as an option for some data in a relational database. Although I have not measured the impacts of this topic, I believe it is one reason why lock contention is such a negligible problem in in-memory databases.
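As a minimal sketch of that streaming-persistence pattern – my own simplification in Python, not HANA’s actual design – the idea is to serve every read from memory while appending each committed change to a log that exists only so the state can be rebuilt after a crash:

```python
import json

class InMemoryKV:
    """Toy key-value store: memory is the primary copy; disk is only
    an append-only recovery log (a simplification, not HANA's design)."""

    def __init__(self, log_path):
        self.data = {}
        try:  # Recovery: replay the log to rebuild in-memory state.
            with open(log_path) as log:
                for line in log:
                    key, value = json.loads(line)
                    self.data[key] = value
        except FileNotFoundError:
            pass
        self.log = open(log_path, "a")

    def put(self, key, value):
        # Stream the change to disk for durability ...
        self.log.write(json.dumps([key, value]) + "\n")
        self.log.flush()  # a real system batches this (group commit)
        # ... but the authoritative copy lives in memory.
        self.data[key] = value

    def get(self, key):
        return self.data[key]  # no disk access on the read path

db = InMemoryKV("/tmp/kv.log")
db.put("order:1", {"qty": 3})
print(db.get("order:1"))
```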
One last item I’ll touch on is columnar storage and compression. Some newer relational databases use columnar storage and compression to greatly increase performance for some operations – although it is tricky to get good insert performance in a columnar database, particularly one that is disk-based. Columnar storage and compression are easily achieved in memory-resident databases for a number of reasons.
I could go on and on, but I would recommend two resources if you’d like more information on this topic. First, a nice little paper (only 8 pages, and very high level) is available on the web at http://www.inf.uni-konstanz.de/dbis/teaching/ws0203/main-memory-dbms/download/MainMemoryDatabaseSystemsAnOverview.pdf . A second resource is Hasso Plattner’s book on this topic, “In-Memory Data Management: An Inflection Point for Enterprise Applications,” http://www.amazon.com/-Memory-Data-Management-Inflection-Applications/dp/3642193625/ . Amazon shows the book as not yet being available, although there is a free copy of the electronic version available online at http://www.sapandasug.com/Complete_In-Memory_Data_Mgmt_Book.pdf .
I hope this helps! Thanks,
– Dennis
Dennis,
Excellent write-up, but could you please elaborate on this comment:
“Columnar storage and compression are easily achieved in memory-resident databases for a number of reasons.”
Telling us columnar databases have an issue with this, but in-memory ones do not, gives a one-sided argument. I was under the impression HANA would have similar problems.
Second, another analyst (I wish I could find the post) pointed out that the Sybase IQ database could return performance similar to HANA’s at a much lower cost. Any thoughts on that?
Finally, as an SAP customer, HANA draws no interest. Having a software application that works, and works well, is what many want from a software vendor – not an overpriced hardware appliance.
Thanks,
LB
LB –
Another great question. When I wrote the article, I didn’t exhaustively cover all the topics.
Updates in columnar databases can be a problem because of the need to update each column’s data (which can involve many blocks on disk, one for each column, plus potentially many more for indices). During that update, lots of locks have to be acquired, and the update can impact other processes, degrading the overall throughput of the system.
Updates to in-memory columnar data stores may involve the same number of locks, but these locks don’t need to be held for very long (updating memory is orders of magnitude faster than updating disk), resulting in much less of a performance impact to other users and processes. In fact, update performance is one area where in-memory database technology really shines.
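One common way columnar in-memory stores achieve this – HANA’s documentation calls its variant the “delta merge” – is to land inserts in a small, uncompressed, write-optimized delta buffer and periodically fold them into the compressed, read-optimized main store. A simplified single-threaded sketch of the idea (illustrative only, not HANA internals):

```python
class Column:
    """Toy column with a compressed "main" store and a write buffer."""

    def __init__(self):
        self.dictionary = []  # dictionary-compressed, read-optimized main
        self.main_codes = []
        self.delta = []       # uncompressed, write-optimized buffer

    def insert(self, value):
        # Cheap append; no recompression, so any lock is held briefly.
        self.delta.append(value)

    def merge(self):
        # Periodic background-style step: absorb the delta into the
        # compressed main store.
        for v in self.delta:
            if v not in self.dictionary:
                self.dictionary.append(v)
            self.main_codes.append(self.dictionary.index(v))
        self.delta = []

    def scan(self):
        # Queries must read both main and delta to see all data.
        yield from (self.dictionary[c] for c in self.main_codes)
        yield from self.delta

col = Column()
for dept in ["Sales", "Sales", "Engineering"]:
    col.insert(dept)
col.merge()
col.insert("Sales")      # visible via the delta before the next merge
print(list(col.scan()))  # ['Sales', 'Sales', 'Engineering', 'Sales']
```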
I’m sure there are applications where Sybase IQ would give performance similar to HANA, but there are many applications where HANA would give performance results that cannot be matched by any disk-based technology. As to cost – well, the cost of RAM has dropped a lot over the past few years, and a lot of data warehouses could fit nicely even in the memory you might have on your laptop.
I can understand that you are not interested in a new technology – that’s a wise approach. However, HANA is not just technology; it’s also the potential for a transformational user experience. When you reduce the time for a query to complete and return an answer by one or two orders of magnitude, users not only can solve business problems faster and better; users (and applications) can also include far more data analysis in activities that generate business value. An analysis that takes a second rather than two minutes can now become part of a call center agent’s script. An analysis that takes three seconds rather than a minute can run during a travel planning session. An analysis that takes a tenth of a second rather than ten seconds can run during a car crash (whereas the ten-second analysis could not). HANA can be the difference between having a software application that works well for users in scenarios where no software application could have worked before.
I hope this helps – I’m happy to provide additional references on the technical issues if that would help.
Thanks for the comments!
-DBM
Hi Dennis,
To achieve faster data retrieval, my understanding is that Exadata pushes the query from the DB server down to the hardware as much as possible, while HANA makes use of memory, querying memory instead of the hardware. Am I correct?
Thanks,
Ray
Ray –
It is possible that Exadata has some sort of smart disk controllers – I haven’t done any research on that topic. Regardless, getting data from (or updating data on) disk is always slower than getting (or updating) data in memory. BTW, the acquisition of Pillar Data by Oracle could provide Oracle with even more interesting performance options for Exadata. But it will still be slower for many queries than HANA …
– DBM
Hi Dennis,
Nice write-up and follow-on comments. As an SAP BW consultant, I’ve read a few articles on SAP HANA. Each article mentions the performance improvements and the benefits of querying data that resides in memory, but I haven’t seen any mention of how an in-memory database will impact the delivery of EDW solutions. Simply put, how will my job change? Or, from a business perspective, how will the delivery of my EDW solution change? I think we are going to see a fundamental change, in that EDW will move away from storing structured data in multiple layers to more of a virtual EDW model where data is built on the fly from the underlying source tables. For example, when ECC is moved onto the HANA platform, why extract the data into another environment and build the typical 3-tier data model?
Do you have any sources that discuss this topic in more detail?
Thank you,
Dae Jin
Dae Jin –
I agree with your supposition regarding how data warehouses will change. As users are able to ask more and more queries in real time, IT’s role will change from a heavy emphasis on structuring the data and managing performance, to a role of connecting as many data sources as possible and exposing metadata to tools so that users can query the system more effectively. In addition, as BPM and rules engines are connected to high performance analytics like SAP HANA, IT will also take on the role of capturing business rules and processes into such systems.
I don’t know of any sources that discuss this trend. Thanks for a great comment and questions!
– Dennis
PS – Are you by any chance related to Will Swope (formerly of Intel)? If so, please give him my best regards!
Hello Dennis,
What will be the impact of new high-speed SSDs, or other non-volatile memories (HP, IBM), on HANA’s future? Fast devices like the Micron P320h or Intel Ramsdale, reading data at speeds closer and closer to RAM, will be on the market in the near future, probably this year. It is foreseeable that even faster ones will come onto the market in 1-2 years. Will this take away HANA’s advantage of reading data from memory at high speed?
Thanks,
Florian
Florian –
Thanks for another great question!
Today’s relational databases were designed for a time when memory was fast but ultra-expensive, processors were much slower than today (and with many other limitations, such as much smaller caches), and when disk drives were small/slow/expensive. There was a lot of work done to optimize which data were brought into limited memory.
HANA, or SAP’s In-memory Compute Engine (ICE), was developed without these archaic assumptions. Regardless of how fast such memory can get, HANA should outperform a database architected for an outmoded set of assumptions. All databases (including HANA) should benefit from such faster memory, but – assuming you have enough memory to hold your entire database in memory – HANA should outperform traditional databases no matter how fast such permanent storage can get.
This is because in-memory databases, like HANA, do not have the overhead burdening Oracle, DB2, and other traditional databases. This overhead is related to disk management, cache management, and optimization for disk retrieval.
I hope this helps, and thanks for another great question!
— Dennis
Hi Dennis
Great article. As a Basis person, how can I present the advantages of SAP HANA to our potential customers who have budget constraints but performance issues in their BI/BW systems?
Secondly, will HANA impact storage vendors (like NetApp or EMC) in any way in the future?
Effy
Effy –
Thanks for the kind words and the excellent questions.
For customers who have budget constraints, the thing to do is to look at the cash flows over a few years, ideally along with an SAP sales rep. Maintaining a big Oracle BI data warehouse (or IBM, or Microsoft) costs a lot in terms of the database license, hardware, and DBAs. And, at the end, the users get slow results. For many customers with large data warehouses, HANA can pay for itself in the first year on cost savings alone.
However, HANA also has the potential to give users a major improvement in user experience and performance. If you can go from taking five minutes to process a query, to taking a second or two to process the same query, you can create business advantages. What would happen if we ran this discount? What would happen if we changed our manufacturing run size? What would happen if we expanded the sales force by 5%? by 6%? or shrank it by 10%? This could revolutionize business decision-making.
As to the impact on storage vendors, yes, HANA could have a major impact. First, data would not need to be stored redundantly in a data warehouse database. Second, data could be stored in a much more compact form. Third, data would not need large indices to be stored on disk. For these and other reasons, demand for disk drives could be impacted, although there is MUCH growth in demand for disk drives just to capture all the human and machine generated events (like web interactions and sensor readings). In addition, the disk drives that are needed for databases like Oracle or Microsoft SQL Server or IBM DB2 are disk drives at the high end of the cost scale – with intelligence built in, incredibly high speed, etc. Disk drives needed just to capture data for crash or disaster recovery, as with HANA (or web or sensor logs for that matter) can be cheap, commodity drives. Less demand for high end units could be an impact of HANA.
Thanks, and please let me know if you have other questions, or comments on my responses!
[…] Dennis Moore wrote a detailed post in June about SAP HANA. A question surfaced in the comments, asking how SAP HANA will compete as SSDs get faster. […]
[…] Dennis Moore wrote a detailed post in June about SAP HANA. A question surfaced in the comments, asking how SAP HANA will compete as SSDs get faster. […]
Hi Dennis,
As head of BI where I work, I am still struggling with some critical issues regarding BW and its reports in reference to HANA. Current information in SAP has a different structure from BW. SAP delivers standard extractors, and we developed some of our own. We also cross-reference data with multiple files and data from other sources, as well as other SAP servers. From there we can extract our own customized reports. To conclude, my questions are:
1. Will HANA have an interface where I can model my new structures for reporting?
2. Will HANA have enough logic to extract data from SAP through generic ABAP DataSources, as well as from other sources (PC_FILE, SQL_SERVER)?
3. Will SAP deliver standard extractors as they did for BW? I may say they really contribute to fast implementations (with enhancements).
4. Is there any way to leverage current BW after acquiring HANA?
I believe these factors are critical for current SAP R/3 -> BW enterprise solutions.
Thanks for your comments
David –
Tough, but important, questions!
One of the great mysteries of the universe is “how do I get my data out of SAP, and present it to Information Workers (and others) to help them do their jobs better?” I struggled with this issue when I was at SAP, attempting to create applications that spanned organizational silos (xApps), focused on Information Workers where they live(d) (Duet, which integrated the things workers do in Microsoft Office with the processes and data in SAP), and even a simplified enterprise search product (Argo – never made it out the door to my eternal regret).
The BW extractors are the key mechanism SAP has created to get data out of SAP and into a data warehouse structure. These extractors are crucial, because SAP does not store data in a normalized set of tables, enabling simple access via SQL SELECT statements – such an approach might not scale, probably would not be secure, and would limit flexibility. Instead, data must be programmatically accessed, and the BW extractors have been written to get this data out of SAP.
SAP HANA as previously announced can certainly use BW as a source of data, and it is even conceivable that a customer could write their own equivalents of BW extractors for some limited subset of SAP objects/data within the programmatic capabilities of SAP HANA. For customers already heavily invested in BW, the approach of using BW as a data source for SAP HANA could be appealing, especially if there are many existing reports and analyses that run fine without needing the SAP HANA capabilities. Still, it would be even nicer to forego the middle step of extracting data to BW in many cases.
I believe that SAP will make more announcements about SAP HANA at SAP TechEd next week in Las Vegas. I will be at the event and will cover SAP HANA there. Expect SAP to announce integration of SAP HANA with BW, as well as new SAP HANA applications, at this event. Perhaps we’ll have a better answer to your questions at that point. Here are my thoughts on your specific questions:
1. SAP HANA does have a metadata layer where you can model your analytical/reporting objects.
2. SAP HANA can have logic, if you provide it (or if SAP announces such a capability in the future) to extract data from SAP, SQL databases, files, etc.
3. SAP has not yet announced standard extractors for SAP HANA, but I expect they will announce integration with BW next week accomplishing this with SAP BW as an intermediary data store.
4. I believe SAP will announce this capability next week, but certainly SAP HANA can extract data from BW, and programmers can write code that will even process data in SAP HANA and write it back to SAP BW (although this is not for the faint of heart).
I hope this helps, and thanks for an excellent set of questions!
– Dennis
Forgot to add:
Not all business rules are in the extractors; some are also in the data mart. Can HANA handle this process of data cleansing or enhancement?
David –
To your follow up – I am not aware of any current automation for extracting business rules from data marts/warehouses or even from operational systems and automatically propagating them into SAP HANA.
– Dennis
David –
Also, I don’t know if you’re planning to attend SAP TechEd, but this would be a PHENOMENAL venue to really get hands-on with SAP HANA, and to learn the difficulties and tips/techniques (e.g., data modeling, replicating data into SAP HANA from productive environments, deploying SAP HANA (life cycle management) in a productive environment, SQL Script).
Also, check out SAP SLT trigger-based replication, as this appears to be SAP’s recommended way to get data into SAP HANA. Here are some good links on this, and on SAP HANA in general:
http://www.sdn.sap.com/irj/scn/go/portal/prtroot/docs/library/uuid/70890624-84b0-2e10-e4b8-ac2a3d296b98?QuickLink=events&overridelayout=true
http://help.sap.com/hana/hana1_tom_en.pdf
Good luck – let me know how your investigation and project go!
– Dennis
Hello Dennis,
Excellent article and good feedback on the questions.
I have one myself:
You mention in the article that extraction from HANA into Business Objects will become much faster. Will HANA also allow others (like Cognos, Qlikview, Tibco Spotfire) to easily extract (or connect to) large volumes of data in memory?
As background: currently, getting data out of SAP or BW for those who did not get the keys to the backdoor (hence, anybody other than SAP/BO) is a nightmare: slow and limited in the amount of data to be extracted, needing middleware like Simba or Composite.
How do you think this is going to change?
Kind regards, Andre
Andre –
Thanks for the kind words! One of the finest database geniuses I have ever known shared your last name, but with a slightly different anglicized spelling (Bob Koii, a great guy, may he rest in peace).
There will be many great benefits of integrating the operational and analytical processes. One of these benefits is the ability to see reports, dashboards, charts, and analyses on data with essentially no latency. As our understanding of this integration increases, you will see applications that integrate analytical data into operational processes – not just embedded charts and dashboards, but predictive analytics sending control signals into business processes automatically (like program trading does today in stock markets). I believe we will see, with a simplification of the layers, and databases optimized for the workloads rather than for developer productivity (the latter being the true benefit of the relational model), a significant reduction in total cost of ownership, with significantly reduced hardware, storage, and operational costs. It will be up to the vendors involved to make sure we also see reduced (overall) software costs, although we may see significant shifts in revenue from one vendor (e.g., Oracle) to another (e.g., SAP).
However, to achieve these kinds of benefits, we require integration at the metadata level as well. HANA can be accessed via SQL queries, so of course any improvements in query performance, real-time ETL, and TCO reductions are likely to benefit owners of other analytical platforms as well (and you should look into SAP Data Services, which may solve the problem you pose), but I suspect the largest benefits (by far!) will accrue to those who use an “SAP stack” including HANA, BW/Business Objects, and Business Suite (or other SAP applications).
Why? Well, to achieve some of the performance benefits, SAP must optimize the analytical application server to work with HANA, sending HANA optimized queries, pushing application logic (e.g., functions and calculations) into the database layer rather than moving data to the application server for all processing, and operating on compressed data in memory. Of course, sharing metadata all through the stack will ensure that all operational objects have analytical analogs that are current with the operational schema – this schema maintenance is no small effort today, and this effort would be greatly simplified (thus yielding significant TCO benefits) with HANA + BW/BOBJ + Business Suite.
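To see in miniature why pushing logic down matters, here is a deliberately generic sketch – using SQLite purely as a stand-in data layer, not SAP’s actual APIs – contrasting app-side processing, where every row crosses the boundary, with a pushed-down query that returns just the answer:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("EMEA", 120.0), ("APJ", 80.0), ("EMEA", 50.0)])

# App-side: every row is shipped to the application, which filters
# and aggregates itself.
total = 0.0
for region, amount in conn.execute("SELECT region, amount FROM sales"):
    if region == "EMEA":
        total += amount

# Pushed down: the data layer filters and aggregates; one value returns.
(total_pushed,) = conn.execute(
    "SELECT SUM(amount) FROM sales WHERE region = 'EMEA'").fetchone()

assert total == total_pushed == 170.0
```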
In the long term, I suspect SAP will have to rewrite its applications in large part. First, the user interface and user experience are not cleanly separated from application logic across all SAP apps, and this is going to be a large effort to redo. Second, SAP accesses its objects from persistent storage using a relational model, and this is not optimal; using an object-oriented paradigm in SAP’s persistence layer, rather than OpenSQL, will dramatically improve the performance of the applications, and open up new capabilities such as event-based processing (although this may eliminate the ability for SQL Server, DB2, Sybase, and Oracle to serve as SAP “databases”). Lastly, SAP’s functionality is largely provided through “stateful” interfaces, and the world of the future will likely require more stateless services for process automation, agility, developer productivity, and multi-party transactions. If SAP does rewrite large portions (or the whole enchilada) of its Business Suite, you can bet they will do so in a way optimized for HANA.
Thanks for another great set of questions!
– Dennis
Dennis, you wrote: “Larry wants to sell you Exalogic and Exadata machines, costing millions; Hasso wants to sell you a simple, low-end, commodity device delivering the same benefits.”
I do not think it is the correct approach to compare Oracle Exadata and SAP HANA. It looks like pure advertisement for the latter. Could you take Oracle’s TimesTen product for comparison? That is their comprehensive in-memory solution. What is the real advantage of SAP HANA vs. Oracle TimesTen?
Dave –
More has emerged lately about SAP HANA’s pricing. If SAP comes to you with a HANA deal, negotiate hard!
In some sense, a comparison between SAP HANA and Oracle Exadata *is* unfair, but that is because Oracle Exadata provides much lower ROI. However, Oracle will generally pitch Exadata into accounts that say they want better database performance, SAP certification of any kind, and TCO reduction due to landscape simplification – and that is exactly where SAP will position SAP HANA. These products, while dissimilar in many ways, are direct competitors in many situations.
Oracle Exadata has one significant advantage over SAP HANA – its 100% compatibility with the Oracle database – leading to one interesting use case for which SAP HANA is not (yet) appropriate. If your enterprise has a lot of Oracle database instances, you could consider and accomplish a landscape simplification where you replace many/most/all of your Oracle database instances with a single Exadata box, managed centrally, with full 100% compatibility with the databases they would replace – plus full compatibility with your employees’ skills, and an easy ability to hire new, knowledgeable employees. You could do this today, even for Oracle databases running under SAP Business Suite, and that can’t be said (yet!) for SAP HANA.
SAP HANA, on the other hand, is an in-memory database. Even when Oracle has data in cache, SAP HANA will outperform Oracle for many queries, using far fewer memory block move operations and far fewer CPU instructions. The reasons for this will be the subject of a future post.
Oracle TimesTen is more like SAP HANA than the Oracle RDBMS is, but TimesTen is not “eligible” for this discussion in that it is not (to my knowledge) certified for use with any SAP component. In other words, you cannot run something like SAP Business Suite or SAP BW on Oracle TimesTen, and I have heard of no plan to certify this combination. It may not be possible to get these SAP components to run on TimesTen – see http://docs.oracle.com/cd/E11882_01/timesten.112/e13073/oracle_tt.htm for an interesting discussion by Oracle of the compatibility issues between TimesTen and a “normal” relational database. Of particular interest are the second and third issues mentioned, issues with transactional semantics and cursors. Perhaps those issues could be overcome, although I have no reason to think they can, but again there has been no plan announced, of which I am aware, to make TimesTen certified with SAP. SAP HANA is also not currently certified for use as the database under SAP Business Suite, but it is certified for SAP BW, and SAP has announced a plan to bring HANA to the point where it will be certified with SAP Business Suite.
From a pure performance perspective, there is one other significant difference between TimesTen and SAP HANA – TimesTen uses a row-based storage approach, whereas SAP HANA is generally used as a columnar database (although SAP HANA does offer the option of row storage). Many select-style queries run much faster in a columnar database than in a row store – thus the prevalence of techniques like pre-aggregation/materialized views in row stores. Other statements, like inserts and updates, may run much faster in a row store than in a columnar database. Both of these statements are MASSIVE oversimplifications, but as generalizations I believe they hold true. For more on TimesTen as a row store, see http://docs.oracle.com/cd/E11882_01/timesten.112/e13065/comp.htm .
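A toy way to see where those generalizations come from (illustrative Python, not a benchmark of either product): an insert touches one place in a row store but every column array in a column store, while an aggregate scans one contiguous array in a column store but must walk every record in a row store:

```python
rows = []                                   # row store: list of records
cols = {"id": [], "region": [], "amt": []}  # column store: list per column

def row_insert(rec):
    rows.append(rec)           # one append; the fields stay together

def col_insert(rec):
    cols["id"].append(rec[0])  # one append *per column*
    cols["region"].append(rec[1])
    cols["amt"].append(rec[2])

for i in range(5):
    rec = (i, "EMEA", float(i))
    row_insert(rec)
    col_insert(rec)

# Aggregation: the row store walks whole records; the column store
# walks a single contiguous, compressible list.
assert sum(r[2] for r in rows) == sum(cols["amt"]) == 10.0
```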
Based on the maturity of TimesTen vs. SAP HANA, and their teams’ foci and optimizations, I would expect TimesTen to outperform SAP HANA in most highly transactional applications, and SAP HANA to outperform TimesTen in most highly analytical applications. I would also expect SAP HANA to outperform any Oracle database product when used with SAP software (such as Business Suite, Business Objects, or BW), when SAP HANA has been released and certified for that SAP software – SAP will optimize to ensure this is true, I suspect. Perhaps we can get SAP and Oracle (both for TimesTen and ORACLE RDBMS) to run identical benchmarks based on real-world query loads someday, so we can do an “apples to apples” comparison.
Thanks for a good topic to add to this discussion thread!
– Dennis
Dennis, thanks for your interesting information about HANA.
I think HANA has capabilities for better performance results in special cases, but I do not think HANA is the solution for SAP applications in general.
Why?
SAP applications are business-critical, and HANA is today a completely new solution!
How is it with scalability? –> HANA shared-nothing cluster based on HW, cluster based on cluster file systems –> simplify IT? I think not.
How is it with availability? –> a “warm-started” spare node is needed –> additional costs, and only failover with downtime (downtime in SAP environments?)
How is it with backup/recovery? –> you need a persistence layer on SSD drives –> costs
How is it with know-how in the field? –> scarce –> costs for workers with HANA know-how
How is it with operating system support? –> only the SUSE OS is supported
How is it with vendor lock-in? –> HANA is a solution optimized for SAP apps –> extreme costs
How is it with interoperability with other databases and non-SAP systems?
And so on.
Dennis –
Very informative and important write-up (especially for an SAP BI consultant).
I work as an SAP BI consultant, and I have a few questions which I think will come up in the coming days:
1) If HANA’s integration with SAP BW/ECC is successful, then what happens to SAP BW Accelerator (BWA)? Many businesses spent huge amounts buying BWA to improve BW reporting performance. What happens when SAP BWA customers realise that HANA would have been a far better investment, providing better results than BWA? (Although I think HANA would cost way more than BWA and will take time to stabilise in the market.)
2) If one can load one’s entire data into HANA, then why would there be a need for SAP BW (as a middle layer) at all? Wouldn’t SAP BW lose its key advantage? Why wouldn’t customers think that data from ECC can be loaded directly into HANA, structures can be created in HANA (instead of BW), and BusinessObjects used as the query design tool?
For businesses it takes a long time to load data into BW (data load maintenance and monitoring is another cost and effort), whereas if ECC data can be made directly available in HANA, wouldn’t that save time and money?
As far as I can see, many businesses would come to think of SAP BW as a bump in the road (since it would consume the same time to pass data on to HANA memory). Please correct me if you think otherwise.
Regards,
Prashant
Sorry – it looks like I missed your comment.
1. If HANA is successful, I would expect SAP BWA to be retired as a product. This must be handled carefully by SAP to be successful, because customers have invested in very expensive hardware for BWA, and HANA currently is not supported on any hardware other than a small list of certified HANA appliances (none of which would have been used by the BWA installed base). If SAP retires BWA, even if they offer customers a credit or free upgrade to HANA for BW, customers will still have to buy new hardware, which could be costly.
2. BW currently uses a database (often Oracle) to store data and metadata. That is the role HANA plays with BW – plus HANA also executes some functionality in the data layer that was executed in the app server layer with BW. BW provides real value on top of the database, and that won’t change with HANA. However, your point about integrating ECC on HANA is insightful – I think that will be the future for SAP.
Thanks for some great comments!
– Dennis
Dennis: Good job on the SAP HANA write-ups. Please can you share with us your experiences with SAP HANA database for SAP BW. It is currently in ramp-up and it looks very promising!
The article was awesome. I have a question for you. Can you describe the difference between normal SQLScript and SAP HANA SQLScript version 2?
Rani –
I wish I could! You should go to Vijaya Vijayasankar’s web site (http://andvijaysays.wordpress.com/) or John Appleby’s web site (http://www.bluefinsolutions.com/insights/profiles/john_appleby/) – they will likely be able to provide a much better answer to your question. Thanks!
– Dennis
Hi Dennis,
If SAP executes fully on its HANA roadmap, how do you view the impact to Teradata? Thanks!
Pratik –
If SAP executes fully on its HANA roadmap, I suppose they would hope to become the number 2 database company in the world very soon. Certainly, if HANA delivers on SAP’s ambitions, it could have a substantial, negative impact on competing database products, most notably Oracle (which has a large presence in SAP accounts). Other database vendors, especially those focused on high performance analytics, would also see a reduction in demand in SAP accounts. Those that offer similar benefits to SAP HANA might see an upswing in demand elsewhere.
Teradata would see some impact for sure, but HANA’s suitability for Big Data analysis is not yet proven, so the extent of the impact could range from almost nothing to significant (especially in SAP environments). The probability is high, however, that SAP’s focus and most obtainable market would be in its installed base in applications favoring data exclusively from SAP, and that is probably not a major market segment for Teradata, so the short-term impact on Teradata is probably not that severe. However, it is likely that every Teradata rep is being asked “what about SAP HANA” in their accounts right now; if Teradata has not developed a good answer for that, or if their sales staff has not learned that response, SAP HANA could have an impact on sales cycles (making them longer) and average selling price (ASP, making it lower) even without being technically suited as a competitor.
Thanks for a great strategic question!
– Dennis
Hi Dennis, great article.
We designed a system 8 years ago without anticipating the success and expansion we actually have.
Our DB is based on MySQL; now the volume of information, the new features in the system, and the reporting requirements have exceeded our infrastructure. Besides, our main client is migrating to SAP. What can we do to migrate to SAP and then to HANA?
Thanks a lot.
Jorge –
That’s a challenging question. I’m not sure I have enough data to give you advice. It sounds like your firm built an application on MySQL, and you’re wondering if HANA would be a good choice for scaling. Most companies in this situation would look first at Oracle DBMS and Microsoft SQL Server, or at rearchitecting the MySQL system to improve scalability (some of the most heavily trafficked web sites on the Internet use MySQL to deliver content).
If you can provide more information, I would be happy to comment further, but it may be a bit too early to consider moving your business to SAP HANA.
Buena suerte!
– Dennis
Great article and dialogue here – thanks for sharing so openly.
Question: when we all talk to our clients about SAP HANA, most will be at least eager to know more, some will want to move quickly in this direction. The very next questions will get into specifics around architecture changes (which I see discussed mostly here), then right into real budget issues to make this real for them … and this info is still very rough, even now. But, one area that still seems very hard to get discussion on is the actual impacts to existing IT infrastructures … servers, networks, core designs, reconfigurations, systems management challenges, performance monitors for this kind of environment, migration priorities, … on and on. Do you see this area being a significant part of the challenges for implementing HANA effectively? I realize there are already some certified platform vendors lined up, but even they aren’t putting out much information on exactly what needs to be tackled and in what order for an effective HANA leverage. Can you share your thoughts in this area too, or offer some additional helpful sources/links for this type of information. Thank you, again.
Ken –
Super questions. Information is just emerging about sizing and architectural considerations. Some decent information can be found in the SAP HANA Technical Operations Manual (http://help.sap.com/hana/hana1_tom_en.pdf). Some very good blogs can be found listed in my blog at http://dbmoore.blogspot.com/2011/12/my-favorite-sap-hana-blogs.html . Vijay, Vitaliy, and John Appleby have some good experience in the topic. You’ll certainly hear some good sessions at SAPPHIRENOW this May. I saw several great presentations from SAP TechEd online at http://www.sapvirtualevents.com/teched/Sessions.aspx?category=sessiontype&code=Lectures&NavId=10 . I especially liked http://www.sapvirtualevents.com/TechEd/sessiondetails.aspx?sId=132, http://www.sapvirtualevents.com/TechEd/sessiondetails.aspx?sId=124, and http://www.sapvirtualevents.com/TechEd/sessiondetails.aspx?sId=146 . If you learn more on the topic, please reply to this comment and add to your karma!
Thanks,
– Dennis
Dennis,
Thanks for the info links and feedback, contacts, etc. As SAP HANA momentum accelerates, I will be focusing hard on the real and practical issues that accompany a step change like this. This one brings great application and analytics improvements that will challenge many legacy IT architectures to also step up in ways they weren’t originally designed to address. We may see a significant need to reassess/redesign strategic platforms and infrastructures, even as a base enabler, for best realization of SAP HANA capabilities. We’ll be challenged to help our clients know just where to “tweak”, where to upgrade/add, and where to start fresh, and in what sequence. What I’ll be looking for from these early HANA pilot installations are guidelines and metrics, from actual implementations, on what kind of infrastructure changes are required, or recommended; what kind of costs and lead times should we plan for; and what kind of migration road maps are most practical to help our clients “get there” with least disruption and best results, with a real plan. I am looking for these pilot efforts to begin providing this kind of information for us soon, as we are already talking to clients about moving forward. I’ll certainly share whatever I begin to pick up along the way, too.
Ken
Thanks, Ken! BTW, a good place to go for this kind of info is always SCN. You can start with http://www.sdn.sap.com/irj/scn/advancedsearch?query=hana and explore from there. Best of luck!
– Dennis
Okay, stating the obvious here – but SAP has been building HANA for about 20 months. Kognitio has been building our in-memory row-based RDBMS for over 20 years. All processing is done in memory, and we are on a proven, mature version 7 of our software.
How could SAP HANA compare to THAT?
Michael –
Thanks for the comment. I believe there are many products in the market that can deliver many of the same technical features that SAP HANA can deliver. However, a few comments regarding your specific points:
1. SAP has been developing HANA for many years, depending on your perspective. First of all, much of SAP HANA is based on SAP DB, which is a really “mature” code base. Second, some of the “in memory” parts of HANA are based on code acquired around the end of 2005 I believe, and developed continuously since then. The rest of the “in memory” parts of HANA are based on the TREX code base, which is also quite a few years in the making. The HANA concept was revealed at least six years ago, and the company has been working on it since. Your point about 20 months of HANA “building” is not at all accurate.
2. I have no information on Kognitio, and will take you at your word that the product is mature. This blog was not about evaluating the technology of HANA or Kognitio, but explaining the strategic impact of HANA. HANA will have an impact because it is from SAP – it will drive down the price of Oracle in SAP shops, further the trend of vertical integration by vendors, and will create the potential for innovative solutions. Kognitio could not have any of these impacts without being acquired by a company like IBM, SAP, Oracle, or Salesforce.com.
Kognitio may become a successful product and company. I wish you all the best. HANA may be a complete and utter flop. It wouldn’t be the first time an innovative product idea from SAP flopped by being overhyped too early, failing to deliver, or because the market passed the product by. However, all signs point to SAP HANA changing the dynamics of the industry, the competitive relationship between Oracle and SAP, the “coopetition” with IBM and Microsoft, and with some nascent customer successes – and with somewhere around $200M+ in sales in its first nine months on the market.
Feel free to add some links here so I can learn more about Kognitio. Thanks!
– Dennis
Hi,
I’m new to SAP HANA. How do I start with SAP HANA?
How do I install the software on my system?
I want to have HANA on top of Oracle; how can I pull real-time data from Oracle into HANA?
Thanks,
Amit
Amit –
The best place to start is on http://www.experiencesaphana.com/ . I would also recommend participating in the forums on SAP Community Network at http://scn.sap.com/ . All the best,
– Dennis
Dennis,
How does using HANA compare to some of the other “big data” solutions out there, such as Netezza (an appliance), Vertica (HP), Aster Data (Teradata), Google’s BigQuery, a Hadoop-based system, etc.? Warning – I’m a business guy with just enough technical knowledge to be dangerous (to myself!), so I’m probably mixing apples and oranges here. Can you help me sort out where HANA fits among these options for a current SAP customer?
Also, I understand that in-memory provides speed that can be amazingly fast, but I would think that the near-term value to customers is likely to be their ability to do analyses that they’ve always wanted to do, but couldn’t afford the resources required to do them in a practical timeframe.
I would think every business analyst would know about one or two of these off the top of their head. They may even be implementing them using aggregated data, or a subset of data, and assuming the results they get are applicable across the population. Do you think that’s true? If you do, why hasn’t the acceptance of HANA been even faster? What’s holding back this latent, underserved pool of already-conceived and potentially implemented analyses?
Vikas
P.S. I’m being purposefully provocative to get to the root of the problem. Is HANA missing some functionality, do the processes it can enable not exist yet or are we just waiting for the ecosystem and support partners to build up momentum?