Simplicity vs Complexity

We will now discuss the notion of model simplicity and complexity in detail and use some examples and analogies to understand the pros and cons of simple and complex models in the upcoming videos.

Before going into the details of simple vs complex models, let’s look at some practical considerations you need to keep in mind when shortlisting models for a given business problem. Some of these include:

Type of data – numerical, categorical, strings, dates etc.
Data quality – missing values, noisy data
Dimensionality of data – high or low

Different models work well in different situations. Some work well with high-dimensional data, whereas others do not. Similarly, some handle noisy data and missing values naturally, whereas others fail to do so. Thus, every class of models has its own strengths and weaknesses.

Depending on the computational resources and the kind of data that you have, you need to shortlist the models that you can consider for a given problem. Out of these models, how do you pick one? This is where model evaluation comes in. One of the most important rules to keep in mind is to never evaluate your model on the data it was trained on. Instead, test it on data it has never seen before; that is the true test of a model.
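To make the evaluation rule concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the toy data (y is roughly 2x with a small deterministic wobble), the hold-out rule (every fourth point) and the hypothetical `Memoriser` model, which simply stores the training pairs and falls back to the training mean for inputs it has not seen.

```python
def make_data(n=20):
    # Hypothetical toy data: y is roughly 2*x plus a small deterministic wobble.
    return [(x, 2.0 * x + ((7 * x) % 5 - 2) * 0.5) for x in range(n)]


def split(data, every=4):
    # Hold out every 4th point; the model never sees these during training.
    train = [p for i, p in enumerate(data) if i % every != every - 1]
    test = [p for i, p in enumerate(data) if i % every == every - 1]
    return train, test


class Memoriser:
    """Stores every training pair; uses the training mean for unseen x."""

    def fit(self, points):
        self.table = dict(points)
        self.default = sum(y for _, y in points) / len(points)
        return self

    def predict(self, x):
        return self.table.get(x, self.default)


def mse(model, points):
    # Mean squared error of the model's predictions on the given points.
    return sum((model.predict(x) - y) ** 2 for x, y in points) / len(points)


data = make_data()
train, test = split(data)
model = Memoriser().fit(train)

train_error = mse(model, train)  # measured on data the model has seen
test_error = mse(model, test)    # measured on data the model has never seen

print(f"error on training data: {train_error:.2f}")
print(f"error on held-out data: {test_error:.2f}")
```

The training error here is zero simply because the model has stored the answers, so it says nothing about how the model behaves on new inputs. Only the held-out error reflects that.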

Simple vs complex: which one should you choose first while building a model, and what are the key reasons behind it?

Recall Occam’s razor, which you learnt in the earlier segment: keep your model as simple as possible. Always start with a simple model to set the baseline. Try fitting more complex models on your dataset only if the simple models do not produce the desired results.

From your school or college days, you can probably recall those few fellows who seemed to study less but understood much more than others. They never seemed to care about memorising or mechanically practising what was being taught, yet were able to explain complex problems in physics or mathematics with simplicity and elegance.

Assuming that people learn using ‘mental models’, do these students have remarkably different mental models from those who work through a pile of books and focus on memorisation? How can they learn so much from a finite amount of information and apply that to solve unseen, complex problems?

In the next video, Prof. Raghavan will explain the meaning of model simplicity and complexity, as well as the pros and cons associated with each. As a by-product, you will also understand that the best way to ‘learn’ is ‘to keep your mental models simple’.

In the competitive exam analogy, the first person learns using a much more complex mental model than the second one. He mugs up all the guides and tutorials but cannot solve unseen problems that he never learnt before. The second person, on the other hand, focuses on the core principles of the subject and can solve problems he has not seen before, because his basics are strong and clear. This is exactly what a simple model does. It is more generic in nature and captures the abstract patterns in the data, which apply to a wider range of unseen data.

A complex model makes far too many assumptions about the data it has not seen before. Such assumptions may not hold true for all kinds of unseen data that it may encounter later. This is where simple models stand out: they keep the assumptions they make about unseen data to a minimum. Another advantage of simple models over complex models is that they require fewer samples to train. Complex models require far more training data to ensure that they capture the information well and perform reliably.

Now, as soon as the exam pattern changes, the first person will be caught off guard, because he was trained specifically for one examination pattern; once the organisers change it, he is completely lost. This is where the robustness of a simple model comes in. The change in exam pattern does not really affect the second person, because he is well versed in all the concepts and does not require any reorientation.

Simpler models tend to make more errors on the training data. Let’s understand this using the same analogy. The second person, who understands the basic principles well, may not crack this particular examination, because he derives each formula and takes longer to solve the questions. The first person, who has mugged up the required formulae and can apply them directly, might perform well in this particular exam and crack it. Similarly, complex models tend to make fewer errors on the training data, but in the long run they may not do well, because they tend to capture the inconsistencies and noise in the training data, which are neither generalisable nor desirable.
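The trade-off between training error and error on unseen data can be sketched with two toy models in plain Python. This is only an illustration under stated assumptions: the data is a made-up noisy straight line, the ‘simple’ model is a least-squares straight line, and the ‘complex’ model is a nearest-neighbour memoriser that repeats the answer of the closest training point. The helper names `fit_line` and `fit_nearest` are hypothetical.

```python
def make_data(n=20):
    # Hypothetical toy data: y is roughly 2*x plus a small deterministic wobble.
    return [(x, 2.0 * x + ((7 * x) % 5 - 2) * 0.5) for x in range(n)]


def fit_line(points):
    # Simple model: least-squares straight line (two parameters).
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def fit_nearest(points):
    # Complex model: memorise everything, answer with the closest stored point.
    return lambda x: min(points, key=lambda p: abs(p[0] - x))[1]


def mse(predict, points):
    # Mean squared error of a prediction function on the given points.
    return sum((predict(x) - y) ** 2 for x, y in points) / len(points)


data = make_data()
train = [p for i, p in enumerate(data) if i % 4 != 3]  # seen during training
test = [p for i, p in enumerate(data) if i % 4 == 3]   # held out, never seen

line = fit_line(train)
nearest = fit_nearest(train)

line_train, line_test = mse(line, train), mse(line, test)
near_train, near_test = mse(nearest, train), mse(nearest, test)

print(f"simple  (line):      train={line_train:.2f}  test={line_test:.2f}")
print(f"complex (memoriser): train={near_train:.2f}  test={near_test:.2f}")
```

On this toy data, the memoriser has zero training error because it reproduces the wobble exactly, while the line has a small non-zero training error. On the held-out points the ordering flips: the line has the lower error, because the wobble the memoriser copied does not carry over to new inputs.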

Finally, you learned four points in favour of using a simpler model wherever possible:

A simpler model is usually more generic than a complex model. This is important because generic models tend to perform better on unseen datasets.
A simpler model requires fewer training data points. This is extremely important because in many cases one has to work with limited data.
A simple model is more robust and does not change significantly if the training data points undergo small changes.
A simple model may make more errors in the training phase, but it tends to outperform complex models when it sees new data.
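The robustness point in the list above can be checked with a small experiment: change one training label slightly, refit, and see how far the predictions move. The sketch below is an illustration only. The toy data, the size of the change (one label moved by 1.0) and the two hypothetical helpers `fit_line` and `fit_nearest` are all assumptions.

```python
def make_data(n=20):
    # Hypothetical toy data: y is roughly 2*x plus a small deterministic wobble.
    return [(x, 2.0 * x + ((7 * x) % 5 - 2) * 0.5) for x in range(n)]


def fit_line(points):
    # Simple model: least-squares straight line.
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return lambda x: slope * x + (mean_y - slope * mean_x)


def fit_nearest(points):
    # Complex model: answer with the label of the closest stored point.
    return lambda x: min(points, key=lambda p: abs(p[0] - x))[1]


def max_shift(fit, original, perturbed, queries):
    # Largest change in prediction caused by refitting on the perturbed data.
    before, after = fit(original), fit(perturbed)
    return max(abs(after(q) - before(q)) for q in queries)


data = make_data()
# Small change to the training data: move a single label (at x = 10) by 1.0.
perturbed = [(x, y + 1.0) if x == 10 else (x, y) for x, y in data]
queries = [x for x, _ in data]

line_shift = max_shift(fit_line, data, perturbed, queries)
nearest_shift = max_shift(fit_nearest, data, perturbed, queries)

print(f"largest prediction change, simple  (line):      {line_shift:.3f}")
print(f"largest prediction change, complex (memoriser): {nearest_shift:.3f}")
```

The line spreads the effect of the changed label across all of its predictions, so each prediction moves only a little. The memoriser passes the full change straight through to the prediction at that point.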
