3 must-haves for effective data processing


Data can be a company’s most prized asset — it can be even more valuable than the company itself. But if the data is inaccurate or constantly delayed due to delivery issues, a business cannot properly use it to make informed decisions.

It’s not easy to understand a company’s data assets. Environments are changing and becoming more complex. Tracing the origin of a dataset, analyzing its dependencies, and keeping documentation up-to-date are all resource-intensive responsibilities.

This is where data operations (dataops) comes into play. Dataops, not to be confused with its cousin devops, started out as a set of best practices for data analytics. Over time, it evolved into a fully formed practice of its own. Here is its promise: dataops helps accelerate the data lifecycle, from developing data-centric applications to delivering accurate, business-critical information to end users and customers.

Dataops came into existence because most companies had inefficiencies in their data processes. Several IT silos were not communicating effectively (if they communicated at all). Tooling built for one team, which used the data for a specific task, often kept another team from gaining visibility. Data source integration was haphazard, manual and often problematic. The sad result: the quality and value of the information delivered to end users was below expectations or downright inaccurate.

While dataops offers a solution, those in the C-suite may worry that it promises a lot and delivers little value. It may seem risky to disrupt existing processes. Do the benefits outweigh the inconvenience of defining, implementing and adopting new processes? In my own organizational debates on the subject, I often cite the rule of ten: it costs ten times as much to complete a task when the data is flawed as when the data is good. By that argument, dataops is vital and well worth the effort.

You may already be using dataops without knowing it

Broadly speaking, dataops improves communication between data stakeholders. It frees companies from their rapidly growing data silos. Dataops is not new. Many agile companies already practice dataops constructs, but they may not use the term or be aware of it.

Dataops can be transformative, but like any great framework, achieving success requires a few ground rules. Here are the top three must-haves for effective dataops.

1. Bet on observability in the dataops process

Observability is fundamental to the entire dataops process. It gives companies a bird’s-eye view of their continuous integration and continuous delivery (CI/CD) pipelines. Without observability, your business cannot safely automate or adopt continuous delivery.

In a mature devops environment, observability systems provide that holistic view, and that view must be accessible across departments and incorporated into those CI/CD workflows. When you commit to observability, place it on the left side of your data pipeline: monitor and tune your systems of communication before data enters production. Start this process when you design your database, and observe your non-production systems along with the various consumers of that data. By doing this, you can see how well apps handle your data before the database goes into production.

Monitoring tools help you stay informed and run more diagnostics. In turn, your troubleshooting recommendations will improve and help fix errors before they become problems. Monitoring gives data professionals context. But remember to abide by the “Hippocratic Oath” of monitoring: first, do no harm.

If your monitoring involves so much overhead that your performance is declining, you’ve crossed a line. Make sure your overhead is low, especially when adding observability. When data monitoring is seen as the foundation of observability, data professionals can ensure operations are running as expected.
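One way to keep an eye on that overhead is to time the same workload with and without instrumentation and compare the two. The sketch below is only an illustration: the helper names and the 5% budget are assumptions made for this example, not figures from any particular monitoring product.

```python
import time
from typing import Callable


def time_call(fn: Callable[[], object], repeats: int = 5) -> float:
    """Return the best wall-clock time in seconds over several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def overhead_ratio(baseline_s: float, monitored_s: float) -> float:
    """Fractional slowdown added by monitoring (0.05 means 5% slower)."""
    if baseline_s <= 0:
        raise ValueError("baseline must be positive")
    return (monitored_s - baseline_s) / baseline_s


def within_budget(baseline_s: float, monitored_s: float,
                  budget: float = 0.05) -> bool:
    """True if monitoring stays inside the (hypothetical) overhead budget."""
    return overhead_ratio(baseline_s, monitored_s) <= budget
```

In practice you would pass `time_call` the same query or job twice, once with monitoring switched off and once with it on, and feed the two timings to `within_budget`.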

2. Map your data domain

You need to know your schemas and your data. This is fundamental to the dataops process.

First, document your overall data holdings to understand changes and their impact. As database schemas change, you need to measure their effects on applications and other databases. This impact analysis is only possible if you know where your data comes from and where it goes.
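As a rough illustration of impact analysis, lineage can be recorded as a graph of "feeds into" edges and walked to find everything downstream of a dataset whose schema is about to change. The dataset names below are hypothetical, and a real lineage catalog would carry far more detail.

```python
from collections import deque

# Hypothetical lineage: each dataset maps to the datasets it feeds directly.
LINEAGE = {
    "crm.customers": ["warehouse.dim_customer"],
    "warehouse.dim_customer": ["reports.churn", "apps.billing"],
    "erp.orders": ["warehouse.fact_orders"],
    "warehouse.fact_orders": ["reports.churn"],
}


def downstream(lineage: dict, source: str) -> set:
    """Return every dataset reachable from `source` (breadth-first walk)."""
    seen = set()
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for child in lineage.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen
```

Calling `downstream(LINEAGE, "crm.customers")` lists the warehouse table, report and application that a schema change to the customer source could affect.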

In addition to changes to database schema and code, you need to manage data privacy and compliance with a complete view of the data lineage. Tag the location and type of data, especially personally identifiable information (PII), so that you know where all your data lives and everywhere it goes. Where is sensitive information stored? What other apps and reports does that data flow through? Who can access it in each of those systems?
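Those three questions can be answered from even a very small tag catalog. The following is a minimal sketch that assumes a hand-maintained list of column tags; the `ColumnTag` structure, the column names and the reader groups are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnTag:
    location: str          # where the column lives, e.g. "crm.customers.email"
    data_type: str         # coarse type label chosen for this sketch
    is_pii: bool           # whether the column holds personal data
    flows_to: tuple = ()   # apps and reports the column feeds
    readers: tuple = ()    # teams that can access it


# Hypothetical catalog entries.
CATALOG = [
    ColumnTag("crm.customers.email", "email", True,
              flows_to=("reports.churn",), readers=("sales", "analytics")),
    ColumnTag("crm.customers.plan", "category", False,
              flows_to=("reports.churn",), readers=("analytics",)),
    ColumnTag("erp.orders.card_last4", "payment", True,
              flows_to=("apps.billing",), readers=("finance",)),
]


def pii_report(catalog):
    """Map each PII column to where it flows and who can read it."""
    return {
        tag.location: {"flows_to": list(tag.flows_to),
                       "readers": list(tag.readers)}
        for tag in catalog
        if tag.is_pii
    }
```

The report keys show where sensitive data is stored, `flows_to` shows which apps and reports it reaches, and `readers` shows who can access it.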

3. Automate data testing

The widespread adoption of devops has led to a common culture of unit testing for code and applications. Often overlooked is the testing of the data itself: its quality and how it works (or doesn’t) with code and applications. Effective data testing requires automation. It also requires constant testing with your latest data. New data is untested, and it is volatile.

To make sure you have the most stable system, test with the most volatile data you have. Break things early. Otherwise you will push inefficient routines and processes into production and be in for a nasty surprise when it comes to costs.

The product you use to test that data — whether it’s from a third party or you write your scripts yourself — should be solid and part of your automated testing and build process. As the data moves through the CI/CD pipeline, you must perform quality, access, and performance tests. Basically, you want to understand what you have before using it.
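A quality test in such a pipeline can be as simple as a function that inspects each new batch and reports what is wrong with it, so the build step can fail when the list is not empty. The sketch below covers only missing values and duplicate keys; the function name, column names and sample rows are assumptions for illustration, and access and performance tests would need their own checks.

```python
def check_batch(rows, required, key):
    """Return a list of human-readable quality failures for a batch of rows."""
    failures = []
    seen_keys = set()
    for i, row in enumerate(rows):
        # Required columns must be present and non-empty.
        for col in required:
            if row.get(col) in (None, ""):
                failures.append(f"row {i}: missing {col}")
        # The key column must be unique within the batch.
        k = row.get(key)
        if k in seen_keys:
            failures.append(f"row {i}: duplicate {key}={k}")
        seen_keys.add(k)
    return failures


# Hypothetical sample batches.
good = [{"id": 1, "email": "a@example.com"}, {"id": 2, "email": "b@example.com"}]
bad = [{"id": 1, "email": "a@example.com"}, {"id": 1, "email": None}]
```

Here `check_batch(good, ["id", "email"], "id")` returns an empty list, while the `bad` batch produces one failure for the missing email and one for the repeated id.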

Dataops is essential to becoming a data company. It is the ground floor of data transformation. With these three must-haves, you’ll know what you already have and what you need to get to the next level.

Douglas McDowell is the general manager of database at SolarWinds.

