Eui-Hong Han, Director, Big Data and Personalization, The Washington Post
Benefits of Cloud Computing for Data Analytics
The benefits of cloud computing for Data Analytics include scalability, flexibility, and collaboration. At The Washington Post, we have embraced cloud computing at full scale and have seen these benefits in many projects. As the business grows, the amount of data we deal with grows as well, and cloud infrastructure makes it much easier to scale our analysis to accommodate that data. Because we are constantly experimenting with new ideas and iterating very quickly, we need a very dynamic and flexible computing environment. When we adapt our analytical tools and models for new ideas, we are able to change our cloud environment accordingly with little operational overhead. Conversely, when there is a new feature or an upgrade in our cloud environment, we can adapt our tools and models to use the extra cloud power and enhance our analytical capabilities. Cloud computing also allows us to collaborate better. Team members can easily access the data at the same time and share information and their work through the platform. Each team member can be individually responsible for some applications while still providing support and dealing with issues as part of a team.
Challenges in Deploying Data Analytics Solutions
Some of the challenges include unrealistic expectations and a lack of collaboration or talent. I think not every data analysis task results in very clear patterns and valuable information. This may be because the problem itself is quite challenging, because deeply hidden knowledge is hard to discover, or because the data contains a lot of noise that hides the truly valuable information.
It is critical to understand the problem and its business value, and then set clear expectations in terms of metrics and ROI. When building analysis models, it is good to keep things simple, intuitive, and incremental. After getting the results, we need to interpret them from different perspectives and evaluate them against those metrics and ROI, so that we make good progress toward meeting expectations.
Successful deployment of a data analytics application requires several different talents. For example, an application with predictive models requires a business analyst who understands the problem in the domain and can articulate requirements, a data analyst who can collect and analyze data, a modeler who can build a model, an engineer who can deploy models, and so on. The success of the project depends on members with the right skill sets and on close collaboration among them.
Growth through Data Analytics
The key is to work closely with the business, keep experimenting with new ideas, and iterate quickly. We need to focus on analytics solutions that can help improve profit (e.g., increased subscription or ad revenue in the media industry), streamline processes (e.g., identification of bottlenecks in a news alert system), or provide better engagement with customers. Clear benefits foster more engagement from the business side and generate additional requirements, which in turn force us to develop innovative solutions. After the successful adoption of a particular service or system within an organization, the next growth opportunity might be to open up the system as a service to other companies in the same vertical.
Technologies for the Future
Deep Learning has shown great potential in several domains such as image classification and voice recognition. I expect to see more Deep Learning-based solutions developed in Natural Language Processing, Question Answering, Topic Categorization, Churn Analysis, etc. I think that we need to evaluate these solutions in the Data Analytics area in the near future for adoption.
The Wish List
We already see the challenge of finding good data scientists and data analysts. Having diverse and easy-to-use AI or Machine Learning algorithms as a service (Amazon, Facebook, and Google already have services like this) would be one item on my wish list. I want to see different kinds of models, ranging from ready-to-use models that can be plugged in without any modifications to models that can be trained with customer data.