Today, you often hear that you should always use the latest version of a piece of software, because it has fewer bugs. (But it may also contain new features, and so new bugs.)
But from a reverse engineer's perspective, old versions may be very useful.
This is a real story from my experience. I tried to dig into a piece of software X, and it was obfuscated. I procured every version of X I could find, and discovered that one (single) version wasn't obfuscated: the developers simply forgot to run the obfuscator that time.
Also, during my Oracle RDBMS years, I amassed a huge collection of Oracle versions up to 7.x, simply because older versions have less code and fewer features, and hence are easier to understand and analyze.
When I prepared my blog posts about SSH, I studied the oldest SSH versions I could find.
So when you are stuck with X, try other (older) versions. Find the same feature in an older version: it may be implemented in a simpler way.
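A rough way to triage a pile of collected versions is sketched below. This is only an illustration, not the method used in the stories above: it assumes each version is a single binary file in one directory, and it uses two crude hypothetical signals, file size (smaller may mean less code) and the share of bytes that sit in printable ASCII runs (an unobfuscated build may expose more readable names and messages). The function names `readable_ratio` and `rank_versions` are made up for this example.

```python
import re
from pathlib import Path

# Runs of 6 or more printable ASCII bytes, roughly what the `strings` utility reports.
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{6,}")


def readable_ratio(data: bytes) -> float:
    """Fraction of bytes that belong to printable ASCII runs of length >= 6."""
    if not data:
        return 0.0
    covered = sum(len(m.group()) for m in PRINTABLE_RUN.finditer(data))
    return covered / len(data)


def rank_versions(directory: str) -> list[tuple[str, int, float]]:
    """Return (file name, size in bytes, readable ratio), smallest file first."""
    rows = []
    for path in Path(directory).iterdir():
        if path.is_file():
            data = path.read_bytes()
            rows.append((path.name, len(data), readable_ratio(data)))
    # Sort by size only: the ratio is a hint to eyeball, not a ranking criterion.
    return sorted(rows, key=lambda row: row[1])
```

Calling `rank_versions("versions/")` (the directory name is hypothetical) gives a list to look through by eye. A version whose ratio stands out from its neighbours is worth opening first, but neither number proves anything on its own.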
On the other hand, when fuzzing, use newer versions, because they have more (poorly tested) features and therefore more bugs.
Today, I use Racket (a dialect of Scheme, which is itself a Lisp dialect), because it's feature-rich, has good manuals, etc. But I learned pure Scheme, because Scheme is such a small language.
I can use OCaml, but I learned Standard ML first, again, because it's small.
Today I can't imagine how anyone can start learning C++ (which is huge) without starting with pure C (which is small).