A Changing Landscape for Statistical Surveys

Published by
December 7, 2015

When the Polish statistician and mathematician Jerzy Neyman presented his theories and methods for sample surveys in the 1930s, they were the answer to many people’s prayers. Suddenly it became possible to base statistical surveys on a sample instead of observing the entire population. It also became possible to quantify the uncertainty in the results obtained from the sample. These measures were called the margin of error and the confidence interval.
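As a rough illustration of these two measures, the sketch below computes the margin of error and confidence interval for a proportion. It assumes a simple random sample and the usual normal approximation at the 95% level; the function names and the figures (1,000 respondents, 40% answering yes) are made up for the example.

```python
import math


def margin_of_error(p_hat: float, n: int, z: float = 1.96) -> float:
    """Half-width of an approximate confidence interval for a proportion.

    Assumes simple random sampling and a normal approximation;
    z = 1.96 corresponds to a 95% confidence level.
    """
    return z * math.sqrt(p_hat * (1.0 - p_hat) / n)


def confidence_interval(p_hat: float, n: int, z: float = 1.96) -> tuple[float, float]:
    """Lower and upper bound of the approximate confidence interval."""
    moe = margin_of_error(p_hat, n, z)
    return (p_hat - moe, p_hat + moe)


# Hypothetical survey: 1,000 respondents, 40% give a certain answer.
moe = margin_of_error(0.40, 1000)
low, high = confidence_interval(0.40, 1000)
print(f"margin of error: +/- {moe:.3f}")
print(f"95% interval: {low:.3f} to {high:.3f}")
```

With these made-up numbers the margin of error is about three percentage points. Note that it reflects the sampling error only.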

This was a powerful breakthrough for the research business. Above all, it meant that costs could be reduced significantly. But in many places it took time before the method was fully accepted; in Sweden, it was well into the 1950s. Sampling theory assumed that the sample would be drawn at random: the probability that a given member of the population would be selected had to be known in advance and be greater than zero (everyone had to have a chance of being selected).
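The requirement of a known, non-zero selection probability is what makes estimation possible: each sampled person can be weighted by the inverse of that probability. The sketch below uses the Horvitz-Thompson estimator of a population total, a standard technique that the text does not name explicitly; the data and the function name are hypothetical.

```python
def horvitz_thompson_total(values, inclusion_probs):
    """Estimate a population total from a probability sample.

    Each sampled value is divided by its known inclusion probability,
    so a person selected with probability 0.01 'stands in' for 100 people.
    """
    if len(values) != len(inclusion_probs):
        raise ValueError("values and inclusion_probs must have the same length")
    for p in inclusion_probs:
        # The theory breaks down if a probability is zero or unknown.
        if not 0.0 < p <= 1.0:
            raise ValueError("every inclusion probability must be in (0, 1]")
    return sum(y / p for y, p in zip(values, inclusion_probs))


# Hypothetical sample: 1 = has the characteristic, 0 = does not.
sampled_values = [1, 0, 1, 1]
sampled_probs = [0.01, 0.01, 0.02, 0.02]
estimated_total = horvitz_thompson_total(sampled_values, sampled_probs)
print(f"estimated number of people with the characteristic: {estimated_total:.0f}")
```

If someone in the population has no chance of being selected, no weight can make up for it, which is why the condition "greater than zero" matters.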

The new theory assumed that the only uncertainty reflected by the margin of error was the sampling error, i.e. the error that arises because the results are based on a sample instead of the entire population. Many, including Neyman himself, realised that other sources of error also affect the results, e.g. dropout (nonresponse), the sampling frame’s coverage of the population, and measurement errors. But the dropout rate was usually very low at that time, and measurement errors were largely invisible because the cognitive aspects of the response process were not well understood. In addition, it was important to launch the new sampling methods and their benefits. Probability sampling became the prevailing paradigm.

Over the past two decades, the conditions for conducting traditional, good-quality sample surveys have gradually deteriorated. Many factors come into play. The dropout rate is increasing. In market research it is not uncommon for the dropout rate to be around 90%, if it can be calculated at all. It is also increasing in Statistics Sweden’s surveys: the dropout rate in the labour force survey is approaching 40%, and in the party sympathy survey 50%. When dropout is this large, the responding sample is no longer representative and adjustment methods are required; a number of weighting methods are available. All serious researchers compensate for the loss, but it is important to point out that all such methods rest on assumptions of various kinds, and that applying them requires skill. The analyst must choose the model for selection and adjustment carefully and be able to carry out sensitivity analyses of how the results are affected by deviations from the model.
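To make the idea of a weighting adjustment concrete, here is a minimal sketch of post-stratification, one of the simplest such methods. It assumes that the population shares of the groups are known and that respondents within a group resemble the non-respondents in that group, which is exactly the kind of assumption mentioned above. The groups, shares and answers are invented.

```python
from collections import Counter


def poststratification_weights(groups, population_shares):
    """Weight each respondent by population share / sample share of their group."""
    n = len(groups)
    counts = Counter(groups)
    return [population_shares[g] / (counts[g] / n) for g in groups]


def weighted_mean(values, weights):
    """Weighted average of the answers."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical respondents: the young are half of the population
# but only 20% of those who answered.
groups = ["young"] * 2 + ["old"] * 8
answers = [1, 1] + [1, 0, 0, 0, 0, 0, 0, 0]  # 1 = yes, 0 = no
population_shares = {"young": 0.5, "old": 0.5}

weights = poststratification_weights(groups, population_shares)
unweighted = sum(answers) / len(answers)
weighted = weighted_mean(answers, weights)
print(f"unweighted share answering yes: {unweighted:.4f}")
print(f"weighted share answering yes:   {weighted:.4f}")
```

A simple sensitivity analysis in this setting would be to rerun the calculation with different assumed population shares, or with different groupings, and see how much the weighted result moves.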

Nowadays, we also know much more about the effects of various measurement errors, and that some of them can have a great impact on the results. Among the different collection methods, telephone surveys are particularly vulnerable. Mobile phones have largely replaced fixed landlines, and many survey companies refrain from tracking down mobile numbers and therefore never call many of them. This of course leads to samples that are not representative.

All these errors and uncertainties mean that the traditional margin of error, which only takes the sampling error into account, underestimates the real uncertainty. When we say that a result is within the margin of error, it is therefore usually wishful thinking. A good deal of development work is going on in the field that deals with the total error in surveys. A conference on the topic is currently being held and the literature is extensive. Two examples are the articles ‘Total Survey Error: Design, Implementation, and Evaluation’ (Biemer, P.) and ‘Total Survey Error: Past, Present, and Future’ (Groves, Lyberg).
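A small calculation shows why the traditional margin of error can be misleading. The sketch below assumes that the estimate is normally distributed with a known standard error and a fixed bias from non-sampling errors; the bias of three percentage points is a purely hypothetical figure. It then computes how often the nominal 95% interval would actually contain the true value.

```python
import math


def normal_cdf(x: float) -> float:
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def coverage_with_bias(bias: float, standard_error: float, z: float = 1.96) -> float:
    """Actual coverage of the interval estimate +/- z * SE when the estimate is biased."""
    shift = bias / standard_error
    return normal_cdf(z - shift) - normal_cdf(-z - shift)


def root_mean_squared_error(bias: float, standard_error: float) -> float:
    """Total error combining sampling variance and squared bias."""
    return math.sqrt(standard_error ** 2 + bias ** 2)


# Hypothetical survey: n = 1,000, true proportion 40%.
standard_error = math.sqrt(0.40 * 0.60 / 1000)
assumed_bias = 0.03  # assumption: three percentage points of non-sampling bias

unbiased_coverage = coverage_with_bias(0.0, standard_error)
biased_coverage = coverage_with_bias(assumed_bias, standard_error)
rmse = root_mean_squared_error(assumed_bias, standard_error)
print(f"coverage without bias: {unbiased_coverage:.3f}")
print(f"coverage with bias:    {biased_coverage:.3f}")
print(f"sampling SE: {standard_error:.4f}, total RMSE: {rmse:.4f}")
```

Under these assumptions the interval that is reported as a 95% interval covers the true value only about half the time, because the bias is roughly as large as the margin of error itself.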

Traditional telephone interviews and face-to-face interviews have become increasingly expensive to conduct. They are also time-consuming, and customers and users no longer have the same patience to wait for the results. Given this situation, web surveys have become increasingly common. There are examples of web surveys that use probability sampling, but many of the problems we see in interviews remain with this method, i.e. it is difficult to get representative samples. Another alternative is self-recruited web panels, where the principle of probability sampling has been abandoned and replaced by weighting methods that aim to make the samples representative of the population you want to study. The panels are sometimes called access panels, opt-in panels or dual opt-in panels. Recruitment takes place through invitations that pop up on various websites, and can lead to very large panels from which samples can be drawn and weighted with the aim of achieving representativeness.
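One common way of weighting a panel sample towards known population figures is raking (iterative proportional fitting), which adjusts the weights one variable at a time until the weighted sample matches the population on each of them. The sketch below is an illustration only and not a description of how any particular panel provider works; the panel members, variables and target shares are invented.

```python
def rake(rows, targets, iterations=50):
    """Iteratively adjust weights so weighted shares match the targets.

    rows:    list of dicts, one per panel member, e.g. {"sex": "f", "age": "young"}
    targets: {variable: {category: population share}}, shares summing to 1
    """
    weights = [1.0] * len(rows)
    for _ in range(iterations):
        for variable, shares in targets.items():
            total = sum(weights)
            # Current weighted size of each category of this variable.
            current = {}
            for row, w in zip(rows, weights):
                current[row[variable]] = current.get(row[variable], 0.0) + w
            # Scale every weight so the category hits its target share.
            weights = [
                w * shares[row[variable]] * total / current[row[variable]]
                for row, w in zip(rows, weights)
            ]
    return weights


def weighted_share(rows, weights, variable, category):
    """Weighted share of panel members in a given category."""
    matching = sum(w for row, w in zip(rows, weights) if row[variable] == category)
    return matching / sum(weights)


# Hypothetical panel sample: 30% women and 40% young, against targets of 50% each.
panel = (
    [{"sex": "f", "age": "young"}] * 1
    + [{"sex": "f", "age": "old"}] * 2
    + [{"sex": "m", "age": "young"}] * 3
    + [{"sex": "m", "age": "old"}] * 4
)
targets = {"sex": {"f": 0.5, "m": 0.5}, "age": {"young": 0.5, "old": 0.5}}

weights = rake(panel, targets)
print(f"weighted share of women: {weighted_share(panel, weights, 'sex', 'f'):.3f}")
print(f"weighted share of young: {weighted_share(panel, weights, 'age', 'young'):.3f}")
```

Matching the population on the weighting variables does not by itself guarantee that the sample is representative on the questions being asked; that still depends on the assumptions behind the weighting.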

Initially, the method was criticised by industry organisations such as the American Association for Public Opinion Research (AAPOR) and by organisations that always use probability sampling. Opt-in panels consist of samples that are not probability samples, and there is not yet a sustainable theory for them similar to the one developed by Neyman. But as we have seen, Neyman’s theory has major problems in practice, and its implementation rests on 80-year-old conditions that no longer hold. Anyone who advocates probability sampling with telephone interviews must make a series of assumptions about its suitability as a collection method. In today’s world, with fewer and fewer fixed landlines and more and more mobile numbers per household, those assumptions are quite a stretch. The same applies to those who advocate opt-in panels. In a way, everyone is in the same boat, where the inference problem must be solved as well as possible.

The initial criticism from AAPOR has abated as more and more research institutes have realised the need for constructive discussion and innovation in the field. In particular, more advanced forms of sampling that are not based on probability sampling are being discussed. AAPOR has presented such a discussion in a report. In that report, there seems to be a relatively broad consensus that non-random sampling may be useful depending on the purpose of the survey. Such samples are often used in clinical trials, evaluations of reforms and studies of populations that are difficult to reach, such as the homeless, people with rare characteristics, and stigmatised groups. But as we at Inizio, as well as American researchers such as Andrew Gelman, have shown, non-random sampling can work well even in regular surveys, and especially in voter surveys, where so much additional information is available. The quality of the weighting is often crucial. There is no razor-sharp boundary between probability sampling and non-probability sampling, especially since the defining requirement of probability sampling, a known probability greater than zero for each member of the population, is never met in social surveys. It is important that we follow and participate in the continued method development in the field.

The technological revolution that has changed the way we communicate is also helping to shape the new research landscape. It opens up new data collection methods and gives us the opportunity to measure phenomena we may not have thought of before. Various mobile interfaces, such as SMS, apps, photos, videos and GPS, provide new opportunities to ask questions and observe phenomena. New technology is discussed in this report from AAPOR. We are only at the beginning of this development. The next stage is about how access to big data can be utilised. Surveys are already being conducted on Facebook and Twitter through various methods, including sentiment analysis. The publication below describes a comparison between Dutch ‘sentiments’ related to consumption on Twitter, Facebook, LinkedIn, Google+ and Hyves and a standard survey of Dutch consumer behaviour. The comparison showed a high correlation between the two and is presented in this report. Statistics Netherlands, the Dutch central statistical bureau, has also reported its first official statistics based on big data, namely traffic flow statistics based on data from sensors set up along the major roads in the Netherlands. The agency is probably the first in the world to publish such statistics, which are presented here. The study testifies to the power of this type of big data. Statistics Sweden also uses big data, in the form of certain scanner data, as a complement to the regular data collection for the consumer price index. Anyone who wants to know more about big data can read AAPOR’s report that came out six months ago.

The new technology naturally places new demands on security and confidentiality, but also calls for new ways of reporting uncertainty. The phrase ‘total error’ now has a modified meaning. Each new data collection method has its own error structure that must be sorted out. In the future, we will report from time to time on some of what is happening in this changed landscape, both conceptually and methodologically.