Data management is a broad topic that encompasses the whole life cycle of data. It includes the extraction, transformation, and loading of data (ETL), data storage, data wrangling and cleaning, data analysis, data visualization, data governance, information security, data mining, and modeling.
Successful data management involves careful attention to every step of data processing, from choosing which data to collect to using it to make predictions. Below are the seven best practices I stress when advising clients.
1 - Start with a business question you need to answer
Begin by setting goals for your data. What kind of insights do you wish to uncover from it? How much of your data can you reliably and repeatably collect? Brainstorm with a multidisciplinary team and note the needs of each member, then prioritize your data fields. Your data should not be cumbersome to explore. Also, if you are collecting data from people, bear in mind that it's often poor business practice to overwhelm them with questions. Prioritize your metrics and select the most relevant ones for collection.
2 - Focus on data quality and integrity before quantity
Having no data is better than having bad data. Think carefully about how data will be collected, as well as the pitfalls of the process. If you are collecting data from people, remember that despite their best efforts, their answers may not be accurate. For example, think about these two questions:
How many children do you have?
How many calories do you eat in a week?
The answer to the first question will be far more reliable than the answer to the second. You can design a questionnaire that lets you check the accuracy of answers, for example by asking similar questions in different forms. Also, avoid excessive missing values. If the majority of a data field will be missing, there is little benefit in having it.
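To make the point about missing values concrete, here is a minimal sketch in Python of how you might measure how often each field is left empty. The field names, the sample responses, and the 50% cutoff are all hypothetical; pick a threshold that fits your own data.

```python
from typing import Any


def missing_fraction(records: list[dict[str, Any]]) -> dict[str, float]:
    """Return, for each field, the fraction of records where it is missing."""
    total = len(records)
    if total == 0:
        return {}
    # Collect every field name that appears in at least one record.
    fields = {name for record in records for name in record}
    result = {}
    for name in sorted(fields):
        # A value counts as missing if it is absent, None, or an empty string.
        missing = sum(1 for record in records if record.get(name) in (None, ""))
        result[name] = missing / total
    return result


def mostly_missing(records: list[dict[str, Any]], threshold: float = 0.5) -> list[str]:
    """List the fields whose missing fraction exceeds the (assumed) threshold."""
    return [name for name, frac in missing_fraction(records).items() if frac > threshold]


# Hypothetical survey responses, for illustration only.
responses = [
    {"children": 2, "weekly_calories": None},
    {"children": 0, "weekly_calories": 14000},
    {"children": 1, "weekly_calories": ""},
    {"children": 3},
]

print(missing_fraction(responses))
print(mostly_missing(responses))
```

A field that shows up in the second list is a candidate for dropping or for rethinking how it is collected.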
3 - Monitor and update your data
Data can go stale faster than you may think. People move, change jobs, and change their phone numbers and email addresses. To keep your data fresh, perform regular assessments to pinpoint problems and renew it. You can also establish data quality metrics to track your data integrity and intervene when they cross certain thresholds.
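One possible data quality metric is the share of records that have not been verified recently. The sketch below is only an illustration: the one-year maximum age, the 20% alert threshold, and the sample dates are assumptions, not recommendations.

```python
from datetime import date, timedelta


def stale_fraction(last_verified: list[date], today: date, max_age_days: int = 365) -> float:
    """Fraction of records last verified more than max_age_days before today."""
    if not last_verified:
        return 0.0
    cutoff = today - timedelta(days=max_age_days)
    stale = sum(1 for verified_on in last_verified if verified_on < cutoff)
    return stale / len(last_verified)


def needs_refresh(
    last_verified: list[date],
    today: date,
    max_age_days: int = 365,
    threshold: float = 0.2,
) -> bool:
    """True when the stale fraction exceeds the (assumed) intervention threshold."""
    return stale_fraction(last_verified, today, max_age_days) > threshold


# Hypothetical "last verified" dates for four contact records.
today = date(2024, 6, 1)
verified = [date(2024, 5, 1), date(2023, 1, 15), date(2022, 11, 30), date(2024, 1, 10)]

print(stale_fraction(verified, today))
print(needs_refresh(verified, today))
```

Running a check like this on a schedule turns "regular assessments" into a number you can watch over time.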
4 - Document your data
Have you ever opened a document and been overwhelmed by acronyms and jargon? Now imagine how much more daunting it is to open a table full of numbers, with cryptic acronyms as column names and jargon-filled, abstract descriptions. The lower the barrier to accessing your data, the more willing your employees will be to use it. If you can, give your data fields descriptive names, and make their descriptions easy to find and easy to understand, with clear illustrations and examples. Want to go above and beyond? Make video tutorials for your employees that explain what the data is about and why it is important.
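A lightweight way to keep descriptions easy to find is a small data dictionary that lives next to the data. The following is a rough sketch, not a prescribed format; the column names, descriptions, and examples are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldDoc:
    """Plain-language documentation for one column."""
    name: str
    description: str
    example: str


# Hypothetical entries; real ones would describe your own columns.
DATA_DICTIONARY = {
    doc.name: doc
    for doc in [
        FieldDoc("customer_signup_date", "Date the customer created an account", "2023-04-17"),
        FieldDoc("monthly_spend_usd", "Total purchases in the calendar month, in US dollars", "129.50"),
    ]
}


def undocumented_columns(columns: list[str], dictionary: dict[str, FieldDoc]) -> list[str]:
    """Return the columns that have no entry in the data dictionary."""
    return [column for column in columns if column not in dictionary]


def describe(column: str, dictionary: dict[str, FieldDoc]) -> str:
    """Render a one-line, human-readable description of a column."""
    doc = dictionary.get(column)
    if doc is None:
        return f"{column}: (no description yet)"
    return f"{doc.name}: {doc.description} (e.g. {doc.example})"


table_columns = ["customer_signup_date", "monthly_spend_usd", "cust_seg_cd"]
print(undocumented_columns(table_columns, DATA_DICTIONARY))
for column in table_columns:
    print(describe(column, DATA_DICTIONARY))
```

Checking a table's columns against the dictionary makes it easy to spot cryptic names that still need a description.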
5 - User experience and presentation still matter
Your data might not be available for the world to see, but your employees are still users who work better when using user-friendly interfaces. Whether you are building an interface or buying one as a part of a data management software platform, large fonts, color coding, and proper negative space should be on your priority list. Depending on how many users can benefit from your data and the nature of it, you may want to consider building data visualization dashboards.
6 - Protect your data
It goes without saying that data security is extremely important. Make sure your passwords are strong and regularly updated. Data security best practices can be summarized as follows:
Do not grant admin privileges or database write privileges to employees who do not need them
Use strong passwords for all accounts and change them regularly. This is especially important for administrator accounts
Set up a strong battle-tested firewall and do not allow more traffic than you need
Install good anti-malware software
Train your employees on the best practices and on ways to spot malicious software and email
If using cloud services, use strong data policies to protect your data
7 - Back up your data (and then back it up again)
Your data is precious, so back it up on several media to ensure its safety and integrity. Cloud services are among the best places to back up your data: you can save it in several data centers, in several regions, with built-in redundancy. Cloud services also offer several plans depending on how fast you need your data, from high-speed SSD storage to inexpensive "cold" storage that can take a day to serve your data.
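Whatever media you choose, it is worth confirming that each copy matches the original. Here is a minimal sketch that copies a file to several locations and compares SHA-256 checksums. The local folders `disk_a` and `disk_b` and the file `customers.csv` are hypothetical stand-ins; with a cloud provider, the copy step would be replaced by that provider's upload tooling.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 checksum of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(source: Path, destinations: list[Path]) -> list[Path]:
    """Copy source into each destination folder and verify every copy."""
    expected = sha256_of(source)
    copies = []
    for directory in destinations:
        directory.mkdir(parents=True, exist_ok=True)
        target = directory / source.name
        shutil.copy2(source, target)  # copies contents and file metadata
        if sha256_of(target) != expected:
            raise IOError(f"Checksum mismatch for {target}")
        copies.append(target)
    return copies


# Demonstration in a temporary folder so nothing is left behind.
with tempfile.TemporaryDirectory() as root:
    base = Path(root)
    source = base / "customers.csv"
    source.write_text("id,name\n1,Ada\n")
    copies = back_up(source, [base / "disk_a", base / "disk_b"])
    checksums_match = all(sha256_of(copy) == sha256_of(source) for copy in copies)
    copy_names = [copy.parent.name for copy in copies]

print(copy_names, checksums_match)
```

A backup that fails verification is caught at copy time rather than on the day you need to restore it.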
Data management goes beyond a single skillset. It is a multidisciplinary field that calls for both analytical and creative talents. The key to success is to keep things simple and in check. Data management software may be worth considering, and a custom solution may make sense if you want to maximize the value you get out of your data and your employees' involvement with it. Need some more advice? Get in touch, and I will help you plan your data journey.