Good job thanks. I had compiled some similar statistics some time ago, but never got past the 00s decade.
Since you have already taken on the job, some questions:
Do the four slams follow the same pattern of distribution?
Did you count retirements (past the 3rd set) as a normal match? (pure curiosity)
I counted retirements as normal matches for the purposes of set length; here are the numbers with retirements and defaults removed:
Total matches I count: 19436
5-setters: 3599, 18.5%
4-setters: 5867, 30.2%
3-setters: 9383, 48.3%
2-setters: 587, 3.0%
As for the breakdown of set counts by event, it is further down.
One note: I removed from this data all the rounds that were best of three, since they were unevenly distributed and would tend to skew the results. Also, these results exclude retirements, since I was too lazy to change that when I moved on from your first question.
These rounds, by the ITF data, were:
1970 AO R1
1973 AO R1
1974 AO R1
1982 AO R3-4
1973 RG R1-2
1974 RG R1-2
1975 RG R1-2
1975 USO R1-3
1976 USO R1-3
1977 USO R1-4
1978 USO R1-3
Event  3 sets      %  4 sets      %  5 sets      %  Total
AO       1873  48.6%    1237  32.1%     743  19.3%   3853
RG       2406  49.6%    1514  31.2%     926  19.1%   4846
USO      2334  49.8%    1467  31.3%     882  18.8%   4683
W        2534  48.4%    1649  31.5%    1048  20.0%   5231
So, even controlling for the best-of-three matches (which were never played at an Open Era Wimby), Wimbledon still had the highest percentage of five-setters. If I threw back in the matches from the rounds listed above, the difference would of course be even greater.
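For anyone who wants to check the percentages, here is a minimal Python sketch that recomputes them from the match counts in the table. The names `counts`, `shares` and `most_five_setters` are just made up for illustration; only the numbers come from the post.

```python
# Completed best-of-five matches by event, copied from the table above:
# (three-setters, four-setters, five-setters)
counts = {
    "AO": (1873, 1237, 743),
    "RG": (2406, 1514, 926),
    "USO": (2334, 1467, 882),
    "W": (2534, 1649, 1048),
}


def shares(row):
    """Return each set-length count as a percentage of the row total."""
    total = sum(row)
    return tuple(round(100 * n / total, 1) for n in row)


# Percentage breakdown per event, rounded to one decimal place
table = {event: shares(row) for event, row in counts.items()}

# Event with the largest share of five-setters (index 2 in each tuple)
most_five_setters = max(table, key=lambda event: table[event][2])

for event, (three, four, five) in table.items():
    print(f"{event:<4}{three:>6}%{four:>6}%{five:>6}%")
print("Highest five-set share:", most_five_setters)
```

The gaps between events are small (roughly one percentage point between the highest and lowest five-set share), so the ranking is worth reading with that in mind.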
Also, a retirement/default breakdown:
AO 119 3.0%
RG 128 2.6%
USO 152 3.1%
W 91 1.7%
Overall 490 2.6%
Retirements have become much more frequent: 296 of these retirements came in the period 1969-1999, and 201 since.
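The per-event retirement/default percentages can be reproduced with a short Python sketch, under one assumption that is mine and not stated in the post: the denominator is completed best-of-five matches plus retirements/defaults for that event. The names `completed`, `retired` and `retirement_rate` are hypothetical.

```python
# Completed best-of-five matches per event (the "Total" column of the table)
completed = {"AO": 3853, "RG": 4846, "USO": 4683, "W": 5231}
# Retirements/defaults per event, as listed above
retired = {"AO": 119, "RG": 128, "USO": 152, "W": 91}


def retirement_rate(event):
    """Percentage of an event's matches that ended in a retirement/default.

    Assumption: denominator = completed matches + retirements/defaults.
    """
    r = retired[event]
    return round(100 * r / (completed[event] + r), 1)


rates = {event: retirement_rate(event) for event in completed}

# Same calculation across all four events combined
all_retired = sum(retired.values())
overall = round(100 * all_retired / (sum(completed.values()) + all_retired), 1)

print(rates, overall)
```

Under that assumption the sketch gives the same one-decimal figures as the list. Dividing by completed matches alone would not, for AO and USO at least.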
Another way to look at it: for an average 127-match GS across Open Era history (assuming all matches are best-of-five)
I'm a little uncertain on the retirement data, since I'm not convinced the ITF results will correctly identify all such matches, but the errors should even out.
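On the 127-match framing, one rough way to get at it is to scale the overall set-length shares from the per-event table to a 127-match draw. This is only a sketch in Python: it assumes every match is best-of-five and completed (retirements and defaults ignored), and the names `counts` and `expected` are hypothetical.

```python
# Completed best-of-five matches by event, from the per-event table:
# (three-setters, four-setters, five-setters)
counts = {
    "AO": (1873, 1237, 743),
    "RG": (2406, 1514, 926),
    "USO": (2334, 1467, 882),
    "W": (2534, 1649, 1048),
}

DRAW_MATCHES = 127  # matches in a 128-player single-elimination draw

# Sum each column across the four events
totals = [sum(row[i] for row in counts.values()) for i in range(3)]
grand_total = sum(totals)

# Expected matches of each length in one 127-match event,
# assuming the overall shares apply and no match is cut short
expected = {
    sets: DRAW_MATCHES * n / grand_total
    for sets, n in zip((3, 4, 5), totals)
}

for sets, n in expected.items():
    print(f"{sets}-setters: about {n:.1f} per event")
```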