Until the mid-1990s it was widely believed that representative
samples of lesbian, gay and bisexual people (LGBs) were too difficult,
or even impossible, to draw. It was thought that people wouldn't
identify as lesbian, gay or bisexual to researchers, or that these
populations were so rare that sampling them wasn't economically
feasible. The surveys described on this website have shown that
representative samples can be drawn economically.
Today, sexual orientation data is generally not collected either
because researchers and program planners don't think to collect the
data (because it hasn't crossed their mind or they don't know the
relevance to their work), or for political reasons having nothing to
do with science or community needs.
On this page we discuss several topics: 1) probability versus
non-probability sampling, 2) modes of sexual orientation
data collection, and 3) sample size. A full discussion
of constructing samples is beyond the scope of this website. For
more detailed information, see appropriate texts on sampling (such
as the classic text Applied Sampling by Seymour Sudman, 1976)
or contact us for guidance. The topics discussed here were chosen
because they are among the more common concerns that arise when
sampling LGBs.
PROBABILITY VERSUS NON-PROBABILITY SAMPLING: There are two
types of sampling methods: probability sampling and non-probability
sampling. In probability sampling, every unit in the population has
a known, nonzero chance of being selected, and that chance can be
quantified. This is not true of non-probability sampling, where the
chance of selection is unknown and some units may have no chance
of being selected. Historically, samples of LGBs were non-probability
samples drawn from locations such as mental institutions, prisons,
or bars. Not surprisingly, data from these samples were biased
in ways that stigmatized LGBs and supported arguments made by some
that they were inherently "sick." With the advent
of probability samples, many but not all of these myths have been
dispelled.
Because probability sampling allows for the generalization of results
to larger populations, this website has focused on data sources
that have used this method. Probability sampling involves
the selection of a sample from a population, based on the principle
of randomization or chance. Probability sampling is more complex,
more time-consuming and usually more costly than non-probability
sampling. However, because units from the population are randomly
selected and each unit's probability of inclusion can be calculated,
reliable estimates can be produced along with estimates of the sampling
error, and inferences can be made about the population.
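As a rough illustration of what it means to produce an estimate together with its sampling error, the sketch below computes a prevalence estimate, its standard error and an approximate 95% confidence interval. It assumes a simple random sample and ignores any finite population correction or survey weights; the function name and the response data are hypothetical and do not come from any survey described on this website.

```python
import math

def estimate_proportion(responses, z=1.96):
    """Estimate a prevalence from 0/1 responses drawn by simple random sampling.

    Returns (estimate, standard_error, (lower, upper)).
    Assumption: equal selection probabilities and no finite population
    correction; complex designs need design-based variance estimates.
    """
    n = len(responses)
    p = sum(responses) / n                 # sample proportion
    se = math.sqrt(p * (1 - p) / n)        # standard error under SRS
    return p, se, (p - z * se, p + z * se)

# Hypothetical data: 50 of 1,000 respondents identify as LGB.
responses = [1] * 50 + [0] * 950
p, se, (lower, upper) = estimate_proportion(responses)
print(f"estimate={p:.3f}  se={se:.4f}  95% CI=({lower:.3f}, {upper:.3f})")
```

With a non-probability sample the same arithmetic can be carried out, but the resulting interval has no such interpretation, because the selection probabilities behind it are unknown.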
There are several different ways in which a probability sample can
be selected. The method chosen depends on a number of factors, such
as the available sampling frame, how spread out the population is,
how costly it is to survey members of the population and how users
will analyse the data. When choosing a probability sample design,
your goal should be to minimize the sampling error of the estimates
for the most important survey variables, while simultaneously minimizing
the time and cost of conducting the survey. The following
are the most common probability sampling methods:
- simple random sampling - In simple random sampling, each member
of a population has an equal chance of being included in the sample.
- systematic sampling - Sometimes called interval sampling, systematic
sampling selects units at a fixed interval from an ordered list,
beginning at a randomly chosen starting point.
- sampling with probability proportional to size - Probability
sampling requires that each member of the survey population have
a chance of being included in the sample, but it does not require
that this chance be the same for everyone. In this method, each
unit's chance of selection is proportional to a measure of its size.
- stratified sampling - Using stratified sampling, the population
is divided into homogeneous, mutually exclusive groups called
strata, and then independent samples are selected from each stratum.
- cluster sampling - Cluster sampling divides the population into
groups or clusters. A number of clusters are selected randomly
to represent the total population, and then all units within selected
clusters are included in the sample.
- multi-stage sampling - Multi-stage sampling is like the cluster
method, except that it involves picking a sample from within each
chosen cluster, rather than including all units in the cluster.
- multi-phase sampling - A multi-phase sample collects basic information
from a large sample of units and then, for a subsample of these
units, collects more detailed information.
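Four of the methods listed above can be sketched in a few lines each. This is a minimal illustration that assumes a complete sampling frame held in memory as a Python list; the function names, the frame and the stratum rule are all hypothetical, and real survey designs usually combine several of these steps.

```python
import random

def simple_random_sample(frame, n, rng):
    """Every unit in the frame has the same chance of selection."""
    return rng.sample(frame, n)

def systematic_sample(frame, n, rng):
    """Take every k-th unit after a random start within the first interval."""
    k = len(frame) // n                # sampling interval (requires n <= len(frame))
    start = rng.randrange(k)
    return [frame[start + i * k] for i in range(n)]

def stratified_sample(frame, stratum_of, n_per_stratum, rng):
    """Draw an independent simple random sample within each stratum."""
    strata = {}
    for unit in frame:
        strata.setdefault(stratum_of(unit), []).append(unit)
    return {name: rng.sample(units, n_per_stratum)
            for name, units in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Select whole clusters at random and keep every unit inside them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]

rng = random.Random(2024)              # fixed seed so the draw can be repeated
frame = list(range(1000))              # hypothetical frame of 1,000 unit IDs
print(simple_random_sample(frame, 5, rng))
print(systematic_sample(frame, 5, rng))
print(stratified_sample(frame, lambda u: "even" if u % 2 == 0 else "odd", 3, rng))
```

A multi-stage design would replace the last line of `cluster_sample` with a further random draw inside each chosen cluster rather than keeping every unit.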
For detailed descriptions of each of these methods, see appropriate
texts on sampling (such as the classic text Applied Sampling by
Seymour Sudman, 1976) or contact us for guidance. Screeners can
also be used with any of these methods. A screener is a tool for
identifying the persons (units) of interest within a sample. For
an example of a screener that was used to identify lesbians, gays
and bisexuals see: Kaiser Screener.
MODE OF SEXUAL ORIENTATION DATA COLLECTION: As demonstrated
in the surveys described on this website, sexual orientation data
has now been collected: 1) face-to-face, 2) over the telephone,
3) using audio computer-assisted self-interviewing (audio-CASI),
4) in mail surveys, 5) using self-completed questionnaires, and
6) over the internet. As each method was
first attempted, there was understandably some trepidation concerning
whether it would work. However, we now know that data can
be successfully collected using each of these methods. That
said, further research on the relative benefits and limitations
of each is needed. For further information on the success
of any of these methods, please contact survey administrators who
have used them, or contact us.
SAMPLE SIZE: The level of precision needed for survey
estimates (such as estimates of the prevalence of gays or lesbians
in a population, or the prevalence of smoking among gays and lesbians)
will impact the sample size that one needs to draw. Unfortunately,
it is not as easy to determine the sample size as one may think.
Generally, the final sample size of a survey is a compromise between
the level of precision to be achieved, the survey budget and other
operational constraints, such as time. In order to achieve
a certain level of precision, the sample size depends, among other
things, on the following factors:
- The variability of the characteristics being observed: If every
person in a population had the same sexual orientation, then a
sample of one person would be all you would need to describe the
sexual orientation of the population. The more the characteristic
varies from person to person, the bigger the sample you need in
order to produce a reliable estimate.
- The population size: To a certain extent, the bigger the population,
the bigger the sample needed. But once you reach a certain level,
an increase in population no longer affects the sample size. For
instance, the necessary sample size to achieve a certain level
of precision will be about the same for a population of one million
as for a population twice that size.
- The sampling and estimation methods: Not all sampling and estimation
methods have the same level of efficiency. You will need a bigger
sample if your method is not the most efficient. But because of
operational constraints and the unavailability of an adequate
frame, you cannot always use the most efficient technique.
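The three factors above can be made concrete with the standard textbook formula for estimating a proportion. The sketch below is illustrative only: it assumes simple random sampling as the baseline, uses a design effect multiplier as a crude stand-in for a less efficient sampling method, and the function name and default values are hypothetical rather than taken from any survey on this website.

```python
import math

def required_sample_size(margin, p=0.5, population=None, z=1.96,
                         design_effect=1.0):
    """Approximate sample size needed to estimate a proportion.

    margin        -- desired half-width of the confidence interval (e.g. 0.03)
    p             -- anticipated proportion; 0.5 is the most variable case
    population    -- population size, or None to treat it as very large
    z             -- normal quantile (1.96 for roughly 95% confidence)
    design_effect -- assumed inflation factor for designs less efficient
                     than simple random sampling (1.0 means no inflation)
    """
    n = (z ** 2) * p * (1 - p) / margin ** 2   # variability of the characteristic
    n *= design_effect                         # efficiency of the sampling method
    if population is not None:
        n = n / (1 + (n - 1) / population)     # finite population correction
    return math.ceil(n)

for size in (2_000, 1_000_000, 2_000_000):
    print(size, required_sample_size(0.03, population=size))
```

Running the loop shows the pattern described in the second bullet: the requirement is noticeably smaller for a population of 2,000, but it barely changes between one million and two million.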
Estimating overall sample sizes in order to examine a topic of
interest is always a challenge. The best guidance one can
get is from the surveys that have already been conducted. It
is therefore to your advantage when choosing a sample size to review
data sources that have already sampled LGBs.