A recent discussion (closed group/registration required) on the Stata users LinkedIn group highlights the use of -recode- to create 5-year periods in a panel dataset. The question asks how to take yearly data and create a variable that contains the 5-year average of some data.
The first step is to recode your year data to 5-year periods. One does this by running:
recode year (2001/2005 = 2000) (2006/2010 = 2006), gen(year2)
Then, to create the averages of X by period, use the -collapse- command:
collapse (mean) X, by(year2 id)
where id is a unique identifier for each cross-sectional entity in the dataset. Good luck!
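The two steps can be tried end to end on a toy panel. This is only a sketch: the data are made up, the variable name period is a hypothetical choice, and each period is labeled by its first year.

```stata
* Hypothetical toy panel: 2 entities (id), one row per year, 2001-2010
clear
set obs 20
gen id = ceil(_n / 10)
bysort id: gen year = 2000 + _n
* Made-up values: X runs 1..10 within each id
gen X = year - 2000

* Step 1: map each year to a 5-year period, labeled by its first year
recode year (2001/2005 = 2001) (2006/2010 = 2006), gen(period)

* Step 2: keep one row per id and period, holding the period mean of X
collapse (mean) X, by(id period)
list, sepby(id)
```

Note that -collapse- replaces the dataset in memory, so save the yearly data first (or wrap the steps in -preserve- and -restore-) if they are still needed.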
The -recode- method is easy to think about, but with long series it is tedious to code and in any case susceptible to typos (as in this example, where 2000 is presumably intended to stand for 2001-2005).
An alternative for automation is illustrated by
gen year1 = 1 + 5 * floor((year - 1) / 5)
gen year2 = 2 + 10 * floor((year - 2) / 10)
Here the arguments 5 and 10 give subseries length and 1 and 2 tune the start of subseries.
This is, I readily admit, not so transparent.
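Listing the results for a few years may make the arithmetic easier to follow. A small check, assuming a made-up run of years from 2001 to 2010:

```stata
* Hypothetical data: one row per year, 2001-2010
clear
set obs 10
gen year = 2000 + _n

* 5-year blocks starting at years ending in 1 or 6
gen year1 = 1 + 5 * floor((year - 1) / 5)
* 10-year blocks starting at years ending in 2
gen year2 = 2 + 10 * floor((year - 2) / 10)

list year year1 year2
```

Here year1 is 2001 for 2001-2005 and 2006 for 2006-2010, while year2 is 1992 for 2001 (which falls in the block 1992-2001) and 2002 for 2002-2010.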
Anyone intrigued by this small problem may want to play a little with -ceil()- or -chop()-.
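One possible reading of the -ceil()- suggestion, offered only as a sketch, is to label each block by its last year rather than its first. The variable name yearend and the data are hypothetical:

```stata
* Hypothetical data: one row per year, 2001-2010
clear
set obs 10
gen year = 2000 + _n

* Label each 5-year block by its final year: 2001-2005 -> 2005, 2006-2010 -> 2010
gen yearend = 5 * ceil(year / 5)

list year yearend
```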
Thanks Nick! I suppose one could package something like this up in a small ado file but I question how broad the use case for it would be.
Indeed. Much depends on how often it is needed. I would rather work out what works with a few lines of experimentation than create a program whose name and usage then have to be remembered or re-learned.