How to Deal with Data Errors or Not Having Enough Data
Ways to address insufficient data
As you know, data cleaning and processing play a huge role in the working life of a data scientist. It is often said that nearly 80% of project time is spent on data preprocessing. Since data is such a crucial part of the work, it is our job to use it as efficiently as possible and get the most out of it.
A dataset has to be available in sufficient quantity before we can process it. In this post, I will cover:
Types of insufficient data
Ways to address insufficient data
Types of insufficient data
Data from only one source
Data that keeps updating
Outdated data
Geographically-limited data
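To make these types concrete, here is a minimal Python sketch that flags three of them in a small set of records. The field names (source, region, collected_on), the function name insufficiency_flags, and the one-year max_age_days threshold are all assumptions made for illustration; they do not come from any particular dataset or library. Data that keeps updating is not checked here, because spotting it usually needs more than one snapshot of the data.

```python
from datetime import date


def insufficiency_flags(records, today, max_age_days=365):
    """Return the insufficiency types that seem to apply to `records`.

    Each record is assumed (hypothetically) to be a dict with the keys
    'source', 'region', and 'collected_on' (a datetime.date).
    """
    if not records:
        raise ValueError("records must not be empty")

    flags = []
    sources = {r["source"] for r in records}
    regions = {r["region"] for r in records}
    newest = max(r["collected_on"] for r in records)

    # Every record comes from the same place.
    if len(sources) == 1:
        flags.append("data from only one source")
    # Every record covers the same region.
    if len(regions) == 1:
        flags.append("geographically limited data")
    # Even the newest record is older than the chosen threshold.
    if (today - newest).days > max_age_days:
        flags.append("outdated data")
    return flags


records = [
    {"source": "survey_a", "region": "EU", "collected_on": date(2020, 1, 15)},
    {"source": "survey_a", "region": "EU", "collected_on": date(2020, 3, 2)},
]
print(insufficiency_flags(records, today=date(2023, 1, 1)))
```

The thresholds are deliberately simple. In a real project, what counts as "outdated" or "limited" depends on the question you are trying to answer.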
Ways to address insufficient data
Identify trends with available data: if the data you already have is enough to move the analysis forward, continue with it.
Wait for more data if time allows: some projects have a tight schedule, but others leave room to wait. In that case, wait for more data to be collected over time.
Talk with stakeholders and adjust your objective: if neither of the two approaches above fixes the problem, talk with stakeholders and try to adjust the objective so that the available data becomes useful.
Look for a new dataset: this may sound like the wrong move, but when nothing else works there is no option left other than getting a new dataset.
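The four options above can be read as an ordered checklist. Here is a rough Python sketch of that reading. The function name choose_strategy and its three yes/no inputs are hypothetical, and the order simply follows the list above; treat it as one possible interpretation rather than a fixed rule.

```python
def choose_strategy(trends_visible, time_allows, objective_adjustable):
    """Pick one of the four approaches, trying them in the order listed above."""
    if trends_visible:
        # The available data is already enough to continue.
        return "identify trends with available data"
    if time_allows:
        # The schedule leaves room to collect more data.
        return "wait for more data"
    if objective_adjustable:
        # Stakeholders can agree on an objective the current data supports.
        return "talk with stakeholders and adjust the objective"
    # Nothing else worked, so a different dataset is needed.
    return "look for a new dataset"


print(choose_strategy(trends_visible=False, time_allows=True, objective_adjustable=True))
```

For example, a project with no usable trends yet but a flexible deadline ends up at "wait for more data".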
How to deal with data errors or not having enough data
For a better understanding, the following decision tree walks through the process of finding a solution.
Asking these questions should lead you to a solution for the problem you are facing.
Thank you for reading😁.
Have a nice day!