Turn your DBA skills into big data skills.
If you are a DBA dealing with big databases, you might think that you already handle big data. You’ve already been managing databases that are in the petabytes for years now, right? However, big data isn’t completely about the size of the database or the data.
Big data can be structured or unstructured, but in either case it is a large quantity of information that is likely arriving at high velocity. Business owners need to figure out which data is valuable to them, and although it might be difficult for DBAs to make that determination, DBAs can help by making the data available and connecting datasources.
Looking at what big data means and how important it could be for companies in the near future (perhaps even becoming the standard way of doing business), I see an opportunity for DBAs to step out and expand their database administration skills to include big data administration. With existing skills in managing data storage, network traffic for data access, security, data loading, and integrations, DBAs already have a solid foundation for becoming key players in big data.
There are quite a few new skill areas emerging in the big data arena. A few of these areas leverage current DBA skills and can be a starting focus for the big data DBA.
Apache Hadoop. Hadoop is a framework for processing data across multiple nodes. With its high degree of parallelism, Hadoop can be a very scalable platform for processing large batches of data quickly. Hadoop does not require data to fit predefined tables or structures, so the data can be imported in various formats, and there are several ways to connect Hadoop to Oracle databases. Understanding how to process data in Hadoop and how to integrate it with existing datasources are valuable big data DBA skills to develop.
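For DBAs who have not yet seen the processing model behind Hadoop batch jobs, the following is a minimal sketch of the map, shuffle, and reduce steps in plain Python. It runs on one machine and does not use any Hadoop API; the function names and the sample log lines are hypothetical and chosen only for illustration. In a real cluster, the map and reduce steps would run in parallel on many nodes.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Group values by key, which the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single count."""
    return key, sum(values)


def run_job(records):
    """Run the three steps in sequence on an in-memory list of lines."""
    mapped = chain.from_iterable(map_phase(r) for r in records)
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


if __name__ == "__main__":
    # Hypothetical unstructured input: free-form log lines, no table required.
    lines = ["error disk full", "warning disk slow", "error network down"]
    print(run_job(lines))
```

Notice that the input has no schema: each line is split only when it is processed, which is why data can arrive in various formats.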
R. Understanding all of R scripting and the advanced analytics that R can perform against real data would be an achievement, but that work might best be left to the stakeholders or the data scientist. Learning how to set up an R environment, however, is another administrative focus area for the big data DBA. In addition, setup and configuration are required to support R statistics and advanced analytics scripts running in an Oracle Database environment, so DBAs can start by administering back-end R environments and expand into R scripting from there.
Integrations. Being able to integrate data is already an important DBA skill, and with big data, there is an opportunity to talk more to the business to understand the data and what is considered valuable information. Your new big data skill area might just be communication with these business teams. If you are not already connecting different datasources regularly, start by pulling in data from other sources. Developing a service for connecting different datasources will prove invaluable for big data projects.
In addition to working with datasource integrations, consider working with big data connectors. Oracle Big Data Connectors connect Hadoop and Oracle Database, and they enable the use of XQuery and SQL in Hadoop. By combining their current skills with knowledge of connectors, DBAs will find integrations a great area in which to get involved with big data.
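As a starting exercise in pulling data from another source, the sketch below loads an external CSV feed into a staging table and joins it to an existing relational table. It is an illustration under stated assumptions: Python's built-in sqlite3 module stands in for the database, and the table names (customers, clicks_stage), column names, and sample feed are all hypothetical. It does not use Oracle Big Data Connectors.

```python
import csv
import io
import sqlite3


def load_csv_rows(text):
    """Parse CSV text (standing in for an external feed) into dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def integrate(conn, rows):
    """Stage the external rows, then join them to the customers table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS clicks_stage "
        "(customer_id INTEGER, clicks INTEGER)"
    )
    conn.executemany(
        "INSERT INTO clicks_stage (customer_id, clicks) VALUES (?, ?)",
        [(int(r["customer_id"]), int(r["clicks"])) for r in rows],
    )
    return conn.execute(
        "SELECT c.name, SUM(s.clicks) FROM customers c "
        "JOIN clicks_stage s ON s.customer_id = c.id "
        "GROUP BY c.name ORDER BY c.name"
    ).fetchall()


def integration_demo():
    """Build a tiny in-memory database and combine it with a CSV feed."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)", [(1, "Acme"), (2, "Globex")]
    )
    feed = "customer_id,clicks\n1,10\n2,5\n1,7\n"  # hypothetical external data
    return integrate(conn, load_csv_rows(feed))


if __name__ == "__main__":
    print(integration_demo())
```

The staging-table pattern will be familiar from conventional data loading; the new part is mainly the variety and volume of the sources being connected.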
Reporting and visualization. Businesses need to be able to turn big data into a visual story that can be consumed and analyzed. To support this kind of reporting, big data DBAs should learn to administer reporting tools and servers for big data analysis. Another valuable skill to develop is the ability to create the data structures necessary for reporting. For acceptable performance, some reporting tools might require the creation of materialized views or even separate reporting databases. Tuning queries for reporting is a typical DBA skill, and understanding the tuning needs for different graphing and visualization tools is important for big data.
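To show why a reporting tool might need a precomputed structure, the sketch below builds a summary table that a dashboard could query instead of scanning detail rows. It is a rough stand-in for a materialized view with a complete refresh, not an implementation of one: SQLite (used here through Python's sqlite3 module) has no materialized views, and the sales and sales_summary tables are hypothetical.

```python
import sqlite3


def refresh_sales_summary(conn):
    """Rebuild the summary table from the detail rows (a complete refresh)."""
    conn.execute("DROP TABLE IF EXISTS sales_summary")
    conn.execute(
        "CREATE TABLE sales_summary AS "
        "SELECT region, COUNT(*) AS orders, SUM(amount) AS total "
        "FROM sales GROUP BY region"
    )
    conn.commit()


def reporting_demo():
    """Load sample detail rows, refresh the summary, and read it back."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("east", 100.0), ("west", 50.0), ("east", 25.0)],
    )
    refresh_sales_summary(conn)
    # The reporting query reads the small summary table, not the detail rows.
    return conn.execute(
        "SELECT region, orders, total FROM sales_summary ORDER BY region"
    ).fetchall()


if __name__ == "__main__":
    print(reporting_demo())
```

The trade-off is the usual one for reporting structures: queries become cheaper, but the summary is only as current as its last refresh.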
As more questions are asked of data and more data becomes available, developing big data skills provides new opportunities for every DBA. I will definitely be using my trusted sources and user group experts to expand my skill set in Hadoop and R this year.
Michelle Malcher (firstname.lastname@example.org) is president of IOUG. She is an Oracle ACE Director with more than 15 years of experience in database development, security, design, and administration. Malcher is a coauthor of Oracle Database 12c: Install, Configure & Maintain Like a Professional (Oracle Press, 2013) and Securing Oracle Database 12c: A Technical Primer (Oracle Press, 2013).