Databricks, Hive, Spark, Scala, IntelliJ IDEA Community Edition, Unravel, Hive Shell, Spark2-shell, CDH, GitHub, Azure Cloud
The customer is an American multinational computer software company with game-changing innovations that are redefining the possibilities of digital experiences.
The customer had embarked on a multiyear initiative to move their Big Data platform from an on-premises Cloudera Hadoop instance to Cloudera Data Platform (CDP) on Azure. As a first step, they wanted to assess the prioritized MapReduce jobs in their current state and consider migrating them to Spark before moving the workloads to Azure Cloud.
They had initially built their solution on the Hadoop MapReduce engine and Hive queries (HQL). The existing setup had the following challenges:
- Slow code execution
- High storage requirements
- Workflows that were difficult to maintain
The new solution they envisioned had to address all of these issues through a revamped approach to processing Big Data. They were looking for a partner to help convert the identified MapReduce jobs to Spark, because the long execution and processing times of those jobs were affecting their business performance.
Ultimately, this conversion would enable them to move their Big Data platform from the on-premises Cloudera Hadoop instance to Cloudera Data Platform (CDP) on Azure.
In collaboration with the customer, WinWire selected two prioritized jobs (LTV and AES) for conversion from MapReduce to Spark. Both were categorized as high-complexity jobs.
The WinWire team transitioned the MapReduce code to Spark code. The transition enabled the customer to process data faster and improved overall job performance, reducing execution time by more than 50%.
- Reduced job execution and processing time by more than 50%
- Greater customer satisfaction through better project execution
- Better opportunities and improved business performance