September 13, 2018 – As every owner participant knows, filling out the annual questionnaire is no easy task. Not only do we ask for a huge amount of detailed information, we ask for it year after year. This translates into thousands and thousands of individual data points. Add the thousands of biological samples to the questionnaire data and you’ve got a mountain of data points: we estimate 5 million (give or take a few thousand) by the study’s end!
Enter statistics. We use statistics to make sense of the data our dedicated owners and veterinarians painstakingly collect.
Mean, Median and Mode, Oh My!
Statistics includes many types of tests that can tell us different things about our Golden Retriever Lifetime Study data set. Some of the most well-known statistics are the measures of central tendency. These stats tell us where the center of a set of data lies, and include the mean (or average), median and mode. The mean is the sum of all observations divided by the number of observations; the median is the value located at the midpoint of the data once the observations are sorted from smallest to largest; and the mode is the value that occurs most frequently in the data.
For example, if we have five dogs that weigh 45, 50, 50, 52, and 60 pounds, the mean weight is 51.4 pounds, the median is 50 pounds and the mode also is 50 pounds.
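For readers who like to check the arithmetic, here is a minimal sketch of the same calculation using Python's built-in statistics module. The variable names are ours, chosen for illustration; this is not the study's actual analysis code.

```python
from statistics import mean, median, mode

# Weights in pounds of the five dogs from the example above
weights = [45, 50, 50, 52, 60]

avg = mean(weights)          # sum of all observations / number of observations
mid = median(weights)        # middle value once the weights are sorted
most_common = mode(weights)  # value that occurs most often

print(f"mean = {avg}, median = {mid}, mode = {most_common}")
```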
Remember the bell curve from high school biology? A perfect bell curve is considered a representation of data that has a normal distribution. If a set of data has a perfectly normal distribution, the mean, median and mode are the same. This happens in nature with measurements such as height and weight, which tend to be approximately normally distributed within populations, including golden retrievers.
An outlier is an observation that is extreme in value, and outliers can exert undue influence on the rest of the data. The mean is vulnerable to influence by outliers, but the median can resist their pull, so we often rely on the median when we are describing data.
A good example in real life is median home price. We know that $0.00 is the lowest possible price for a house, but there is no limit to home prices on the upper end, so a handful of expensive houses can pull the mean up and make home prices look artificially high. In this case, the median would be the more realistic description of what a person can expect to pay for a house in a specific market.
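The same pull can be seen with the five dog weights from earlier. The short sketch below adds one hypothetical, unusually heavy 150-pound dog (a made-up value for illustration, not study data) and compares how far the mean and the median move.

```python
from statistics import mean, median

# Weights in pounds of the five dogs from the earlier example
weights = [45, 50, 50, 52, 60]

# Hypothetical outlier: a made-up 150-pound dog, for illustration only
weights_with_outlier = weights + [150]

mean_shift = mean(weights_with_outlier) - mean(weights)        # about 16.4 pounds
median_shift = median(weights_with_outlier) - median(weights)  # 1 pound

print(f"mean moved by {mean_shift:.1f} pounds")
print(f"median moved by {median_shift:.1f} pounds")
```

One extreme value drags the mean up by more than 16 pounds, while the median barely moves. That is why the median is often the safer summary when outliers are present.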
What are the Odds?
Two additional statistical concepts, probabilities and odds, come into play whether you’re looking at data from the Golden Retriever Lifetime Study or picking your numbers for Powerball. A probability is the chance that something will happen. If we take the five dogs from the example above and put their names in a hat, the chance that you would draw the name of a dog weighing 50 pounds is two-fifths, or 40 percent.
Odds are a ratio of probabilities: the chance of drawing the name of a dog weighing 50 pounds divided by the chance of drawing the name of a dog who doesn’t weigh 50 pounds. That is two-fifths divided by three-fifths, which gives odds of two to three (about 0.67). When an event is extremely rare, probability and odds are approximately the same. If your probability of winning Powerball is 1/1,000,000, then your odds of winning are 1/999,999, very nearly the same tiny number. Of course, when you win, please remember Morris Animal Foundation in your giving plan.
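As a rough sketch of the difference between the two, the snippet below computes the probability and the odds for the names-in-a-hat example, and then for the one-in-a-million lottery figure used above. The helper function odds_from_probability is our own name, introduced only for illustration, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def odds_from_probability(p):
    """Odds = chance the event happens / chance it does not happen."""
    return p / (1 - p)

# Weights in pounds of the five dogs whose names are in the hat
weights = [45, 50, 50, 52, 60]

# Probability of drawing a 50-pound dog: 2 names out of 5
p_50 = Fraction(weights.count(50), len(weights))
odds_50 = odds_from_probability(p_50)  # two to three

# The illustrative one-in-a-million lottery probability from the text
p_rare = Fraction(1, 1_000_000)
odds_rare = odds_from_probability(p_rare)

print(f"probability = {p_50} ({float(p_50):.0%}), odds = {odds_50}")
print(f"rare event: probability = {p_rare}, odds = {odds_rare}")
```

For the dogs, the probability (2/5) and the odds (2/3) are clearly different numbers. For the rare event, they are nearly indistinguishable.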
Statistics and Associations
These concepts are essential to statistics and research, but there are many other tests we use to look for associations between disease and specific data points. Finding those associations is the reason all of us have devoted time and effort to collecting, storing and analyzing the data.
Statistics allow us to feel confident in our conclusions and recommendations based on our study data, and demonstrate to other scientists and veterinarians the validity of our study. It’s going to take us some time to sift through all the data but it’s worth it – in the end, we’re confident we’ll be able to improve the lives of thousands of dogs and their owners, as well as the practice of veterinary medicine, for years to come.