Bringing SAP data into enterprise analytics that include Oracle-based systems

Most Fortune 500 companies rely on SAP to manage critical business processes such as supply chain and finance.  Its solutions help these companies run their businesses efficiently and effectively.  However, most companies find that for certain functions, it pays to have third-party solutions, e.g. Maximo for MRO, PeopleSoft or Workday for HR, and Manhattan Associates for warehouse management.  They may also find that other solutions have very specialized vertical content built into them that SAP does not match.  At HEXstream, we see many utilities using Oracle for the utility-specific functions and SAP for generic functions.  Finally, we see people starting to incorporate streaming data from marketing or IoT applications into their data infrastructure.

Much of the non-SAP data is stored on Oracle databases.  In addition, most of these heterogeneous shops use some or all Oracle middleware to curate, transform, and protect this data.  They do so because Oracle middleware is among the most performant and richest in the industry.  Oracle middleware and databases regularly inhabit Gartner’s Magic Quadrants in their fields.   

However, the nature of SAP data does not make it easy to blend with non-SAP data, particularly with Oracle and other non-SAP tools.  In the past, SAP wanted people to bring non-SAP data into BW.   Customers we talked to have found the flat file interface clunky.  There were no accelerators worth the name to help import this data.  Now, SAP wants people to bring all the data together in HANA.  This has similar issues to BW and one more.  Because HANA is an in-memory appliance, it gets expensive as data volumes increase.   

Organizations that have a mixture of SAP and non-SAP data have largely taken one of two approaches to analytics.

  • They have looked at their SAP and non-SAP data in two or more different silos.  This works adequately for single-function analytics, but it does not address mission-critical problems that cross organizational silos.  I will give examples of these problems in the next paragraph.  In today’s fast-paced environments, organizations cannot afford to confine their analyses to single silos.  Their customers expect better, and their competitors are not standing still.
  • They have spent lots of money and hired expensive, scarce experts to integrate this data.  Because of the expense, this approach leaves many problems unaddressed: they cannot be solved cost-effectively, or quickly enough for the solutions to be relevant.

The founders of HEXstream have seen first-hand the value that comes from integrating data that crosses functional silos, data that helps answer questions like these: 

We have developed expertise and tools to allow SAP customers to answer these questions, questions that people outside the SAP ecosystem have been able to answer for years.  Because of Oracle middleware’s prevalence in the marketplace, this white paper will focus on integration using Oracle tools.  A later white paper will tackle the problem from the point of view of non-Oracle middleware.  The challenges and solutions are similar for Oracle and non-Oracle middleware, though.   

To do this, we have divided the problem into three sections because the solutions involve different sets of tools, tools HEXstream’s consultants have mastered as they apply to SAP solutions and Oracle middleware. 

  • Directly connecting a BI tool to SAP, whether ECC or HANA.  BW uses the same types of data structures that ECC does.  From the point of view of data integration, connecting to BW is equivalent to connecting to ECC.  
  • Using an ETL tool to extract data from ECC, HANA, BW, and non-SAP sources in batch and combining it into a data lake or data reservoir where all data can be viewed and combined easily.   
  • Using tools designed to handle streaming data to pull data from SAP, non-SAP ERPs, and native streaming data sources, like SCADA data, to allow real-time decisions to be made from data that is not yet at rest.

In this white paper, we will focus on using Oracle tools as examples because we have the most experience with these.   

Direct connection:  We do not usually recommend this because ill-considered queries can bring an operational system to its knees.  However, for small requests, such as retrieving all the outstanding invoices for one vendor, it does work.  OBIEE, Oracle Analytics Cloud, and Oracle Data Visualization support SAP ECC and HANA as data sources.  They also allow one to “play the data where it lies” by pulling data from more than one data source as part of a single query and mashing up the results in the BI Server.  These tools do not repeal laws of physics, and complex joins of large amounts of data in the BI Server will take time to execute.  However, one can execute these queries much more quickly and reliably than one could dump the data to spreadsheets, load it into a personal database like Microsoft Access, and produce the results there. 
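To make the "small request" pattern concrete, here is a minimal sketch in Python.  It is not how OBIEE or Oracle Analytics Cloud work internally.  Two in-memory SQLite databases stand in for an SAP source and an Oracle-based source, and the table and column names (open_invoices, vendors) are hypothetical.  The point it illustrates is that the vendor filter is pushed down to each source, so only a handful of rows are combined in the middle tier.

```python
import sqlite3


def make_source(ddl, insert_sql, rows):
    """Create an in-memory SQLite database standing in for one source system."""
    conn = sqlite3.connect(":memory:")
    conn.execute(ddl)
    conn.executemany(insert_sql, rows)
    return conn


# Hypothetical stand-in for an SAP source: open invoices by vendor.
sap = make_source(
    "CREATE TABLE open_invoices (vendor_id TEXT, invoice_no TEXT, amount REAL)",
    "INSERT INTO open_invoices VALUES (?, ?, ?)",
    [("V100", "INV-1", 1200.0), ("V100", "INV-2", 300.0), ("V200", "INV-3", 50.0)],
)

# Hypothetical stand-in for an Oracle-based source: vendor master data.
ora = make_source(
    "CREATE TABLE vendors (vendor_id TEXT, vendor_name TEXT)",
    "INSERT INTO vendors VALUES (?, ?)",
    [("V100", "Acme Valve Co."), ("V200", "Delta Pipe Inc.")],
)


def open_invoices_for_vendor(vendor_id):
    """Filter at each source first, then join the small result sets in memory."""
    invoices = sap.execute(
        "SELECT invoice_no, amount FROM open_invoices "
        "WHERE vendor_id = ? ORDER BY invoice_no",
        (vendor_id,),
    ).fetchall()
    row = ora.execute(
        "SELECT vendor_name FROM vendors WHERE vendor_id = ?", (vendor_id,)
    ).fetchone()
    name = row[0] if row else None
    return [(name, invoice_no, amount) for invoice_no, amount in invoices]


result = open_invoices_for_vendor("V100")
```

Dropping the WHERE clauses would pull every row from both sources before joining, which is the kind of ill-considered query that strains an operational system.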

Batch connection:  People ask us, “Isn’t all data integration moving towards real time?”  For many processes monitored by ERPs, supply chain systems, and HR systems, there is no added value in getting data more frequently than hourly or, in many cases, daily.  So long as ETL tools are cheaper to license, easier to implement, and provide richer libraries of transformations, batch integration will have a role to play.  Before specifying an architecture, we ask, “What is real time enough?”  If one is running a foreign exchange desk at a bank, 10 milliseconds may be too much latency.  By contrast, for updating a customer master file, weekly may be good enough.  Oracle Data Integrator (ODI) supports SAP ECC and HANA as data sources.  We at HEXstream know how to connect ODI to SAP as well as to many other data sources, and we can use these tools to build data warehouses that contain data from SAP, other off-the-shelf software, and custom software built on the many databases and file structures ODI supports.  These tools let people build data lakes and data warehouses that combine these data in a manner that BI tools can query easily and that will scale more easily than SAP HANA or BW.  They will not handle streaming data sources, though, which brings us to the third part of this problem.
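The "What is real time enough?" question can be written down as a simple decision rule.  The sketch below is only an illustration: the one-hour cut-off is an assumption drawn loosely from the hourly figure above, and the function name and return values are hypothetical.  A real decision would also weigh licensing cost, implementation effort, and the transformations required.

```python
from datetime import timedelta

# Assumed cut-off: refreshes of hourly or slower are treated as batch candidates.
BATCH_THRESHOLD = timedelta(hours=1)


def choose_integration_pattern(acceptable_latency: timedelta) -> str:
    """Return 'batch' when the business tolerates hourly-or-slower data, else 'streaming'."""
    if acceptable_latency >= BATCH_THRESHOLD:
        return "batch"
    return "streaming"


# The two examples from the text: a foreign exchange desk and a customer master file.
fx_desk = choose_integration_pattern(timedelta(milliseconds=10))
customer_master = choose_integration_pattern(timedelta(weeks=1))
```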

Real time and streaming data:  We see the need for people to integrate selected data sources in real time growing as the cost of real time integration falls and the number of data sources where real time integration makes sense increases.  As soon as a transaction comes across the wire, people need to make decisions based on it, whether it is to close a valve, make a new offer, or reroute a truck.  This action needs to be taken even before the data is stored in a data lake.  When some of this data comes from SAP, the problem becomes even more challenging.  SAP wants you to bring data into it, not to take data from it and store it elsewhere.   

Oracle’s GoldenGate can pull data from ECC in real time, but it cannot do the same for HANA.  GoldenGate for Big Data extends GoldenGate by letting it deliver data to big data targets.  Oracle positions GoldenGate as EtL, meaning it has only a limited set of transformations.  However, in that regard, it is no different from other real-time data integration tools.

But GoldenGate and other existing offerings leave much to be desired when it comes to real-time HANA integration.  To handle this, HEXstream has developed a technology-agnostic, non-invasive solution called Harmonia.  Based on Apache Kafka or MapR-ES, Harmonia listens for changes from many different sources, including (but not limited to) Oracle Applications, SAP HANA, and ECC.  It then takes the changed data and writes it into an output data store, such as Apache Hive, among many others.  Harmonia is deployed alongside existing architecture in the cloud, on premises, or in a hybrid configuration, and it operates asynchronously to avoid data loss.  It can serve as an event hub as well as a simple point-to-point data movement solution.
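The difference between an event hub and point-to-point movement can be sketched in a few lines of Python.  This is not Harmonia's implementation, which this paper does not describe, and it does not use the Kafka API.  An in-memory queue stands in for the message log, and the class, source names, and event fields are all hypothetical.  What it shows is that publishers and consumers are decoupled: sources enqueue change events without waiting, and each event is later delivered to every handler subscribed to that source.

```python
from collections import deque


class EventHub:
    """Toy in-memory event hub; a stand-in for a durable log such as a Kafka topic."""

    def __init__(self):
        self._queue = deque()      # buffered change events awaiting delivery
        self._subscribers = {}     # source name -> list of handler callables

    def subscribe(self, source, handler):
        """Register a handler for change events from one source."""
        self._subscribers.setdefault(source, []).append(handler)

    def publish(self, source, change):
        """Enqueue a change event; the publisher does not wait for consumers."""
        self._queue.append((source, change))

    def drain(self):
        """Deliver all buffered events in arrival order; return the delivery count."""
        delivered = 0
        while self._queue:
            source, change = self._queue.popleft()
            for handler in self._subscribers.get(source, []):
                handler(change)
                delivered += 1
        return delivered


hub = EventHub()
data_lake = []   # stand-in for an output store such as a Hive table
audit_log = []   # a second consumer of the same SAP events (event hub use)

hub.subscribe("sap_hana", data_lake.append)
hub.subscribe("sap_hana", audit_log.append)
hub.subscribe("oracle_apps", data_lake.append)   # one consumer: point-to-point

hub.publish("sap_hana", {"table": "orders", "op": "INSERT", "key": 1})
hub.publish("oracle_apps", {"table": "assets", "op": "UPDATE", "key": 7})
delivered = hub.drain()
```

With a single subscriber per source the hub behaves as point-to-point data movement; adding subscribers turns the same stream into a shared event feed without touching the source systems.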

Incorporating Harmonia into existing data operations brings many benefits, including the following: 

  • Organizations can expand the scope of their work to include projects requiring integrated SAP and non-SAP data. 
  • Real-time and IoT data can be combined with SAP and non-SAP data to determine the causes of anomalies detected by sensors. 
  • The speed, cost, and quality of decision making are improved by erasing the artificial divide between SAP and non-SAP data.
  • Analytic operations become more efficient by reducing the cost of combining data from the different sources required to advance organizational objectives. 


With this collection of tools, HEXstream has helped and will continue to help organizations integrate SAP data into a unified data architecture based on open standards.  Once an organization has implemented this functionality, it can break down the data silos that cause organizations to make suboptimal decisions because they ignore data that has traditionally been hard to integrate.   

About the Author

Will Hutchinson

Will Hutchinson is the Director of the Analytics Practice at HEXstream.  He is a former Distinguished Sales Consultant for Oracle and has over 30 years of experience with data warehousing and analytics. He is the author of a book on analytics and has extensive industry experience spanning pharmaceuticals, oil and gas, consumer goods, insurance, and manufacturing. Will is an expert in ROI and TCO analysis and is a polished speaker and trainer.
