SAP HANA’s Big Data strategy

Category: SAP HANA Posted: May 22, 2017 By: Ashley Morrison

With digital data being created at an explosive rate, processing the data produced by enterprise applications, along with the enormous information loads streaming in from a plethora of external sources, is making analytical processes cumbersome. Businesses therefore require a wide range of analytical capabilities, and many organizations are looking to the HANA Big Data tools provided by SAP.

Forrester Research, a leading technology research firm, has said that Big Data will not only get bigger and richer but will also originate and stream from an ever-increasing number of sources, both internal and external. The report also stated that enterprises require a modern data analytics strategy that offers a pervasive, real-time data access layer linked to all relevant data flowing in from various sources.

HANA Strategy

To meet the requirements of such enterprises, SAP will progressively invest in offering enterprise users unrestrained access to advanced analytics tools that use its HANA in-memory RDBMS, according to Anne Moxie, a Senior Analyst at US-based Nucleus Research. Werner Hopf, the CEO of Dolphin Enterprise Solutions Corp., an SAP partner, echoed this view. Hopf said that SAP has invested heavily in development over the past three years to extend HANA’s capabilities so that it can serve as the database back end for transaction-processing systems.

SAP HANA Platform Big Data Management

For example, last September SAP announced HANA Vora, a new in-memory query engine for Hadoop that addresses the challenges companies face as they manage distributed big data, Moxie said. HANA as a stand-alone system has limitations: it is not well suited to extremely large data volumes, because its cost-effectiveness drops drastically at that scale. This was the view of John Appleby, General Manager at the global consultancy Bluefin Solutions, which is also an SAP partner.

SAP’s Hadoop Strategy

Earlier approaches to managing data still exist to some extent, but in past decades databases dealt only with recording and storing data, and reporting was used only for decision support. Before the massive influx of data, most data was high-value and highly structured. Today, by contrast, terabytes of data flow in continuously, and that data may or may not hold any value.

In the current market, businesses need data management strategies that can transact, analyse and act at the same time. These strategies should also support cloud, machine-to-machine, social and mobile modalities, and that support must be constant, on a 24/7 basis.

SAP strongly believes that the answer is SAP HANA, and that Hadoop is poised to play a major role in the solution. This view is supported by market surveys such as the April 2013 TDWI Best Practices report on integrating Hadoop into data warehousing and BI. It states that while over 78% of enterprises see Hadoop as a complement to a data warehouse, only around 10% had an active Hadoop implementation at the time of the survey.

SAP HANA and Hadoop have different, complementary roles in a Big Data solution.

Mitsui Knowledge Industry offers an example of how well SAP HANA and Hadoop complement each other. The company believes that one day patients will be able to receive personalized treatment for diseases based on DNA analysis alone. Its solution uses Hadoop to compare a patient’s DNA sequence with a standard (normal) sequence, since the data is in a partially structured format and the matching can be done in parallel across many machines. Recognizing mutations and forecasting the optimal treatment require large amounts of iterative analysis, which is well suited to SAP HANA. This has brought the time for such a task down from 2-3 days to 20 minutes.
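The parallel matching step can be illustrated with a small map/reduce-style sketch. This is only a toy model under stated assumptions, not Mitsui Knowledge Industry’s actual pipeline: a local thread pool stands in for a Hadoop cluster, the two sequences are assumed to be already aligned and of equal length, and the names `chunk_mismatches` and `find_mutations` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk_mismatches(task):
    """Map step: compare one chunk and return absolute mismatch positions."""
    offset, patient_chunk, reference_chunk = task
    return [
        offset + i
        for i, (p, r) in enumerate(zip(patient_chunk, reference_chunk))
        if p != r
    ]


def find_mutations(patient, reference, chunk_size=4, workers=4):
    """Split both sequences into chunks, compare chunks in parallel, merge results."""
    if len(patient) != len(reference):
        raise ValueError("this sketch assumes aligned sequences of equal length")

    # Each task carries its starting offset so positions stay absolute.
    tasks = [
        (start, patient[start:start + chunk_size], reference[start:start + chunk_size])
        for start in range(0, len(patient), chunk_size)
    ]

    # The thread pool is a stand-in for worker nodes (assumption for illustration).
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_results = pool.map(chunk_mismatches, tasks)
        # Reduce step: flatten the per-chunk lists; map() preserves chunk order.
        return [pos for part in partial_results for pos in part]


if __name__ == "__main__":
    reference = "ACGTACGTACGT"
    patient = "ACGTTCGTACGA"
    print(find_mutations(patient, reference))  # [4, 11]
```

Because each chunk is compared independently, the work scales out across machines; the iterative analysis of the resulting mutation positions is the part the article assigns to SAP HANA.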

Hadoop has three main common uses:

  • Pre-processing data before it is stored in databases, analysed with BI tools or used in applications
  • Analysis of documents and multi-structured data that many databases were not originally architected for
  • Archiving of massive amounts of data, especially for data of unknown value

Currently, SAP has partnered with three major vendors, Intel, Hortonworks and Cloudera, to make sure that the market can harness the combined capabilities of Hadoop and SAP HANA. In February 2013, SAP and Intel announced a collaborative effort to optimize SAP HANA further and introduce deeper Hadoop integration.

SAP HANA

SAP strongly believes that in-memory capabilities lie at the heart of a Big Data strategy, along with close integration with various data sources such as Hadoop and data warehouses. Some of the reasons behind this belief are given below:

Real-time access and data integration top CIO priorities:

In the IT market, the volume of data, whether structured or unstructured, has long been considered the major technology obstacle. But recent market surveys say that velocity and data consolidation are equally critical. According to the latest reports, integrating data silos is the biggest technical challenge.

The report goes on to say that Hadoop users deal with the data latency that seems inherent in Hadoop in a variety of ways. Data that needs real-time access is managed in a DBMS whenever possible. The TDWI Best Practices report also shows that new Hadoop projects have begun in an effort to overcome MapReduce’s batch-oriented nature. These include projects to build database-like capabilities and a separate project to build an in-memory cache.

Value from Big Data is realized in operational processes: 

Data analysis alone is not enough to realize true business value. Insights must be embedded in the processes and user behavior through which business value is realized, and this must happen in real time. Major companies such as eBay, Facebook and Google each have a platform that turns insights into value: Google has its search index, Facebook has a real-time ad platform and eBay has an e-commerce platform. The real-time nature of Facebook ad displays, Google searches and eBay’s e-commerce process is undeniable. The takeaway is that while it is essential to deploy a platform capable of transforming Big Data into insights, the next step of transforming those insights into business value is equally important.

SAP Big Data Processing Framework

SAP believes that, although in-memory technology is at the core of SAP HANA, Big Data is not just about databases, and technologies such as Hadoop have a critical and complementary role to play.

SAP believes that a comprehensive Big Data solution is end-to-end in nature, handling everything from low-level data ingestion through storage and processing to visualization. This extends to Big Data applications and analytic solutions. It must also serve all of an enterprise’s stakeholders, from BI analysts and IT staff to frontline staff, executive management and the CIO. It can even be embedded directly into enterprise applications and business processes. SAP is aiming to build an end-to-end solution that caters to every need mentioned above.

 Tony Baer, Principal Analyst, Software – Information Management (Expert-Speak)

SAP initially focused on building the HANA platform and has only now started to address its Big Data strategy. Recently, SAP announced the extension of Smart Data Access, its federated query technology from Sybase, to the HANA platform. SAP also announced OEM agreements with the Hadoop platform providers Intel and Hortonworks. Smart Data Access was well established on Sybase but was still in its first release in 2013 for HANA and Hadoop. By promoting federated query, SAP has added a touch of realism to its data management strategy, making it more tangible.

Venturing beyond in-memory

SAP’s HANA positioning has stressed its intended role as a platform for transaction-processing and analytics applications. According to a report from Ovum, SAP positions HANA as both an OLTP and an analytics platform, addressing latent demand for applications and use cases that benefit directly from real-time processing.

Nevertheless, Ovum has said that very few applications benefit from storing 100% of their data in memory. According to Ovum research, storage tiering is the new trend in databases, with different use cases for every form of storage, from disk to SSD Flash and DRAM. The long-term drop in DRAM costs has made the in-memory HANA platform far more feasible. Even so, compared with SSDs or plain old disks, DRAM still commands a premium price.

Ovum research predicts that analytic applications will trend towards aggressive use of data-tiering strategies, in which data is stored on the appropriate medium, such as DRAM, SSD or disk.
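As a rough illustration of what a data-tiering policy might look like, the sketch below assigns table partitions to DRAM, SSD or disk based on how often they are read. The thresholds, the partition names and the `choose_tier` function are all hypothetical; real products use richer signals (data age, size, cost per gigabyte) than a single read counter.

```python
from dataclasses import dataclass

# Hypothetical thresholds in reads per day; chosen only for illustration.
HOT_READS_PER_DAY = 1000
WARM_READS_PER_DAY = 10


@dataclass
class Partition:
    name: str
    reads_per_day: int


def choose_tier(partition):
    """Place frequently read data on faster, more expensive storage."""
    if partition.reads_per_day >= HOT_READS_PER_DAY:
        return "DRAM"
    if partition.reads_per_day >= WARM_READS_PER_DAY:
        return "SSD"
    return "DISK"


def plan_placement(partitions):
    """Return a mapping of partition name to storage tier."""
    return {p.name: choose_tier(p) for p in partitions}


if __name__ == "__main__":
    partitions = [
        Partition("orders_current_month", 50000),
        Partition("orders_last_year", 120),
        Partition("orders_2009", 1),
    ]
    for name, tier in plan_placement(partitions).items():
        print(f"{name}: {tier}")
```

The point of the sketch is the trade-off the article describes: only the small, hot share of the data pays the premium price of DRAM, while rarely touched data stays on cheaper media.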

SAP’s early messaging did not prominently feature a capability to page less-used data to disk. Smart Data Access, a federated query utility first developed by Sybase, has now been extended to HANA, and Ovum supports this strategy of embracing data tiering. Smart Data Access lets data in remote systems appear as virtual tables in HANA, with processing pushed down to the source systems.
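The idea of a virtual table with pushdown can be sketched in a few lines. This is a conceptual toy, not SAP’s API: an in-memory SQLite database stands in for the remote system, and the `VirtualTable` class, its `select` method and the `clicks` table are hypothetical names introduced for illustration.

```python
import sqlite3


class VirtualTable:
    """Local handle for a table that physically lives in a remote source."""

    def __init__(self, remote_conn, remote_table):
        self.remote_conn = remote_conn
        self.remote_table = remote_table

    def select(self, columns, where=None, params=()):
        # Build the query and send it to the remote source, so filtering
        # happens there and only matching rows travel back (pushdown).
        # Identifiers are trusted in this sketch; values use placeholders.
        sql = f"SELECT {', '.join(columns)} FROM {self.remote_table}"
        if where:
            sql += f" WHERE {where}"
        return self.remote_conn.execute(sql, params).fetchall()


def make_remote_source():
    """Create a stand-in remote system with a small sample table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE clicks (user_id INTEGER, page TEXT, ms INTEGER)")
    conn.executemany(
        "INSERT INTO clicks VALUES (?, ?, ?)",
        [(1, "home", 120), (2, "cart", 900), (1, "cart", 450)],
    )
    return conn


if __name__ == "__main__":
    clicks = VirtualTable(make_remote_source(), "clicks")
    # The WHERE clause is evaluated by the remote source, not locally.
    print(clicks.select(["user_id", "page"], where="ms > ?", params=(400,)))
```

The alternative, copying the whole remote table into the local database and filtering afterwards, moves far more data; pushdown keeps the data where it is and ships only the query and its result.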

Currently, Smart Data Access supports Sybase IQ, a columnar analytic database introduced around two decades ago, as well as Teradata and Sybase ASE. It also supports recent versions of Hadoop that include Hive 0.12, the version that carries performance enhancements from Hortonworks’ Stinger project. SAP also plans to extend support to targets such as Microsoft SQL Server and Oracle.

SAP is not alone in federated data access: Teradata presently supports such an approach within relational platforms. Nevertheless, SAP is unique in handling data in Hadoop without physical migration to SQL environments. According to an Ovum report, the modes of Hadoop-SQL integration presently range from batch processes to interactive query on Hive metadata. They also include ETL from Hadoop to a SQL data warehouse, extraction of Hadoop data to SQL as an external table, and the physical integration of SQL tables into HDFS.

Smart Data Access is not yet feature-complete. For example, it does not presently support clustered HANA instances. In addition, there is no clear support for the large object data types that are likely to be found on Hadoop. Smart Data Access is also restricted to data that can be reached through Hive, while raw data stays off limits. While none of this is a deal breaker, it leaves SAP room to broaden its federated query feature set.

 Ramping up Hadoop support

Unlike Teradata, Oracle and Microsoft, each of which has selected a single Hadoop platform OEM partner, SAP is spreading its bets with reseller agreements for:

  • Hortonworks for a plain Apache open source platform
  • Intel, serving as a premium provider of a higher-performance Hadoop

Intel’s Hadoop distribution aggressively taps into the native instructions of the Xeon chipset for data pre-processing in cache and for compute-intensive operations such as encryption and graph processing. When the distribution was first announced, Intel also claimed Terasort performance benchmarks on a platform built entirely on SSD Flash storage.

Between the two, Hortonworks is more established, having entered the Hadoop platform business two years ahead of Intel. Despite these differences, SAP’s OEM strategy plays out mainly with these two Hadoop providers.

In addition, SAP, like Teradata, has certifications for both Cloudera and MapR, and also has a minimal relationship with Cloudera.

Ovum believes that SAP’s selection of two providers is a form of strategy sampling, much as Pivotal (formerly EMC Greenplum) did with its initial OEM strategy with MapR for high-performance Hadoop, which was phased out in favor of the SQL-on-Hadoop Pivotal HD offering. Nevertheless, the Intel relationship builds on SAP’s joint development work with Intel, in which Xeon processors were tuned for the HANA platform.

Currently, SAP has no plans for tighter integration of HANA with the Intel Hadoop platform, which has been developed with a high-performance option comprising SSD Flash drives and 10GbE high-speed Ethernet interconnects.

While Ovum Research believes that disk-based Hadoop platforms will remain the market mainstream, developments in open source frameworks such as Spark (for tiering hot data to in-memory DRAM storage) and Shark (which runs Hive on Spark) present interesting product opportunities for SAP. One such opportunity is a high-performance converged HANA/Hadoop platform that could seamlessly integrate SAP Business Suite on HANA with Big Data analytics.

Conclusion

As the detailed plans SAP has laid out for its HANA Big Data strategy show, SAP has an elaborate rollout strategy intended to keep it years ahead of its competitors. If it executes this strategy successfully, SAP stands to benefit in both the near and the distant future.
