19 Aug 2020
Knowing when to stop
In predictive analytics, it can be tricky to know when to stop. Unlike many of life’s activities, there’s no definitive finishing line after which you can say “tick, I’m done”. The possibility always remains that a little more work will yield an improvement to your model. With so many variables to tweak, it’s easy to end up obsessing over tenths of a percentage point, pouring huge amounts of effort into the details before looking up and wondering “Where did the time go?”
Here are some strategies that can help you decide when to wrap things up:
Set a deadline - Parkinson’s law states that “work expands so as to fill the time available for its completion”. An open-ended time frame invites you to procrastinate by spending time on things that ultimately add little value to the end result. Setting yourself a deadline keeps costs low and predictable by forcing you to prioritise effectively. The downside, of course, is that if you set your deadline too aggressively, you may deliver a poor-quality model.
Acceptable error rate - You could decide beforehand on an acceptable error rate and stop once you reach it. For example, a self-driving car might aim to identify cyclists with 99.99% accuracy. The difficulty with this approach is that before you start experimenting, it’s very hard to know how accurate your model could be. Your target accuracy might be unattainable, given the level of irreducible error. On the other hand, you might stop prematurely whilst there is still room to improve your model easily.
Value gradient method - By plotting the real-world cost of error in your model against the effort required to reduce it, you gain an understanding of the return on investment for each incremental improvement. This allows you to keep developing your model, stopping only when the predicted value of additional tuning falls below the value of your time.
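To make the value gradient idea concrete, here is a minimal sketch of the stopping rule in Python. Everything in it is an assumption made for illustration: the function names, the hourly rate, and the cost figures for each tuning round are invented, and in practice the "cost after" numbers would be your own estimates rather than known quantities.

```python
def should_continue(error_cost_now, error_cost_after, effort_hours, hourly_rate):
    """Return True if the predicted saving from one more tuning round
    exceeds the value of the time that round would take."""
    predicted_value = error_cost_now - error_cost_after
    cost_of_time = effort_hours * hourly_rate
    return predicted_value > cost_of_time


def rounds_worth_doing(rounds, hourly_rate):
    """Count how many consecutive tuning rounds pay for themselves."""
    done = 0
    for cost_before, cost_after, hours in rounds:
        if not should_continue(cost_before, cost_after, hours, hourly_rate):
            break  # the next improvement is worth less than your time
        done += 1
    return done


# Hypothetical figures: (cost of model error before, cost after, hours of effort)
rounds = [
    (100_000, 60_000, 40),
    (60_000, 50_000, 40),
    (50_000, 48_000, 40),
]
HOURLY_RATE = 100  # assumed value of an hour of your time

print(rounds_worth_doing(rounds, HOURLY_RATE))
```

With these made-up numbers, each round costs 4,000 in time. The first two rounds save 40,000 and 10,000, so they are worth doing; the third saves only 2,000, which is the point at which you would stop.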
from: Towards Data Science