How to deal with Data Errors or not having enough Data

Ways to address insufficient data

As you know, data cleaning and processing play a huge role in a data scientist's work. A commonly quoted figure is that nearly 80% of project time is spent on data preprocessing. Because data is such a crucial part of the job, it is our role to use it as efficiently as possible and get the most out of it.

A dataset has to be of sufficient quantity before we can process it. In this post, I will cover:

  • Types of insufficient data

  • Ways to address insufficient data

Types of insufficient data

  • Data from only one source

  • Data that keeps updating

  • Outdated data

  • Geographically-limited data
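The four types listed above can be checked for in a rough way before analysis starts. The sketch below is only an illustration: the record keys `source`, `collected_on` and `region`, the function name `diagnose`, and the thresholds `max_age_days` and `fresh_days` are all assumptions made for this example, and treating "newest record is only a few days old" as a sign of data that keeps updating is a simple heuristic, not a rule.

```python
from datetime import date, timedelta


def diagnose(records, today, max_age_days=365, fresh_days=7):
    """Return a list of insufficiency flags for a list of record dicts.

    Each record is assumed to carry the hypothetical keys
    'source', 'collected_on' (a datetime.date) and 'region'.
    """
    flags = []
    sources = {r["source"] for r in records}
    regions = {r["region"] for r in records}
    newest = max(r["collected_on"] for r in records)
    age = today - newest

    if len(sources) == 1:
        flags.append("single source")
    if len(regions) == 1:
        flags.append("geographically limited")
    if age > timedelta(days=max_age_days):
        # Nothing has been collected for a long time.
        flags.append("outdated")
    elif age <= timedelta(days=fresh_days):
        # Heuristic: very recent records suggest collection is ongoing.
        flags.append("still updating")
    return flags


# Made-up example records, for illustration only.
example = [
    {"source": "survey", "collected_on": date(2020, 1, 1), "region": "IN"},
    {"source": "survey", "collected_on": date(2020, 3, 1), "region": "IN"},
]
print(diagnose(example, today=date(2023, 1, 1)))
```

An empty list does not prove the data is sufficient; it only means none of these simple checks fired.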

Ways to address insufficient data

  • Identify trends with available data: if it is possible to continue the process with the data you already have, go ahead.

  • Wait for more data if time allows: some projects have a tight schedule, but others allow more freedom with time. In that case, wait for more data to be collected over time.

  • Talk with stakeholders & adjust your objective: if neither of the two options above fixes the problem, talk with stakeholders and try to adjust the objective so that the available data becomes useful.

  • Look for a new dataset: this may sound a little odd, but if nothing else works there is no option left other than getting a new dataset.
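The four options above can be read as an ordered list of fallbacks, which is easy to sketch in code. The function below is a minimal illustration that assumes the options are tried in the order they are listed; the function name `next_step` and its three yes/no parameters are hypothetical, and in a real project each answer comes from judgment rather than a simple flag.

```python
def next_step(can_proceed_with_available: bool,
              time_allows_waiting: bool,
              objective_adjustable: bool) -> str:
    """Pick one of the four options, trying them in the listed order."""
    if can_proceed_with_available:
        return "Identify trends with available data"
    if time_allows_waiting:
        return "Wait for more data"
    if objective_adjustable:
        return "Talk with stakeholders and adjust your objective"
    # None of the earlier options applies.
    return "Look for a new dataset"


# Example: the data is not enough and the deadline is tight,
# but the stakeholders are open to changing the objective.
print(next_step(False, False, True))
```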

How to deal with data errors or not having enough data

For a better understanding, the following decision tree will help you work through the process of finding a solution.

By asking these questions, you should be able to find a solution to the problem you are facing.

Thank you for reading😁.

Have a nice day!
