Dr. Johanna Choumert-Nkolo
Have you carefully thought about the definition of “a household”? This might sound like a trivial question, but the social and statistical constructions of a household vary greatly between and within countries (see Randall et al., 2015). A randomised survey experiment conducted in Mali by Beaman and Dillon (2012) even found that “additional keywords in definitions increase rather than decrease household size and significantly alters household composition”, with interesting implications for measuring welfare and production.
The standard definition of a household includes individuals who live together under the same roof and pool resources. But when listing household members, should we include children in boarding school or living with a relative, individuals who have temporarily migrated, or house workers with no family ties? There is no simple answer to this question. Researchers must nevertheless make informed choices based on their research question and on existing qualitative and quantitative data about the country and the social norms of the surveyed communities. For example, for a research project about education expenses, it will be important to collect information on all children, even those living in boarding school or with other relatives.
Overall, researchers have to decide whether the unit is the statistical household, the family or the people living under the same roof, how to count children and elders, and how to treat mobile household members and other specific sub-populations, such as polygamous households and pastoralists (see Coast et al., 2010). For example, in the context of Tanzania, several definitions of households have been used, with obvious implications for household composition:
- “A person or group of related or unrelated persons who live together in the same dwelling unit(s), who acknowledge one adult male or female as the head of the household, who share the same housekeeping arrangements, and who are considered a single unit.” (Demographic and Health Survey and Malaria Indicator Survey, 2015-2016)
- “All individuals who normally live and eat their meals together in this household, starting with the head of household.” (National Panel Survey, 2014-2015)
- “A person or group of persons who live in the same dwelling and eat meals together for at least three of the 12 months preceding the date of the survey. There are four exceptions to this definition: (1) Persons who have recently joined the household, such as spouses, newborn infants, adopted orphans and others who intend to stay until the next interview. (2) The head of the household is identified by the household without any criteria established by the study team and is considered a household member regardless of his/her length of absence. (3) “Makubaliano” servants, who live with the household without contracts, are considered household members as long as they satisfy the residency requirement. (4) Tenants and boarders are not household members, regardless of their length of residence.” (Kagera Health and Development Survey, 2004)
- “All people, including children, who: (1) lived under this “roof” or within the same house for at least 30 days in the past year, and (2) when they are together, they share food from a common source, and (3) contribute to and/or share in a common resource pool. Do not list servants who have a household elsewhere, and guests who are visiting temporarily and have a household elsewhere.” (Measuring Living Standards within Cities, Dar es Salaam 2014-2015)
- “All individuals who normally live and eat their meals together in this household, starting with the household head. Do not include anyone who has been away for 6 months or more. If an individual has just moved in, ask if they intend to stay for more than 6 months, if not, then do not include them.” (Cash Plus Household Questionnaire, Baseline 2017)
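To see how much the choice of definition can matter, here is a minimal Python sketch that applies two simplified membership rules to the same roster. The rules are loosely inspired by the “three of the 12 months” and “away for 6 months or more” wordings quoted above, but they are deliberate simplifications (they ignore, for example, the exceptions for recently joined members and the question about intention to stay). The roster, the field names and the function names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Person:
    name: str
    months_present: int        # months spent in the dwelling over the past 12
    shares_meals: bool         # eats with the household when present
    is_head: bool = False
    is_tenant: bool = False


def member_three_month_rule(p: Person) -> bool:
    """Simplified rule: present at least 3 of the last 12 months.

    Assumption: the head always counts and tenants never do.
    """
    if p.is_tenant:
        return False
    if p.is_head:
        return True
    return p.shares_meals and p.months_present >= 3


def member_six_month_rule(p: Person) -> bool:
    """Simplified rule: exclude anyone away for 6 months or more."""
    months_away = 12 - p.months_present
    return p.shares_meals and months_away < 6


def list_members(roster, rule):
    """Return the names of the people who count as members under a rule."""
    return [p.name for p in roster if rule(p)]


# Hypothetical roster, for illustration only.
roster = [
    Person("head (works away)", 2, True, is_head=True),
    Person("spouse", 12, True),
    Person("child at boarding school", 3, True),
    Person("tenant", 12, False, is_tenant=True),
]

for rule in (member_three_month_rule, member_six_month_rule):
    members = list_members(roster, rule)
    print(rule.__name__, len(members), members)
```

With this made-up roster, the first rule lists three members and the second only one, which is the kind of gap that feeds directly into per capita welfare measures.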
Are you going to target the right respondent? In most household surveys, the household head is the respondent interviewed, under the assumption that he/she is the most knowledgeable person about the household. Although this may be true on average, existing research shows that it is not always the case and that it certainly depends on the topic of the survey. Indeed, some researchers have highlighted how responses depend on the characteristics of the respondent (Alwang et al., 2017; Anderson et al., 2017; Fisher et al., 2010). This literature is based on recent theoretical and empirical approaches that underline the unequal ability of spouses to remember information about the household or individual household members (such as household assets, individual income, and consumption), problems of asymmetry of information within the household, the distinct roles of spouses in household activities, and conflicts between household members over the distribution of resources (Glennerster et al., 2018; Haddad et al., 1997).
The study of household economics has in fact challenged unitary approaches, meaning that more and more field protocols now consist of interviewing several household members. For example, Fisher et al. (2010) conducted a survey in rural Malawi to test the accuracy of household income information obtained when the household head is the only one interviewed. Their results are striking: “In 28% of households, the husband overestimated the earnings of his wife and, therefore, total household income by an average of 17%. In 66% of households, the husband underestimated his wife’s income by an average of 47%.” They also found that the “husband is less aware of the household economy when he works away from home at least part of the time; when household livelihoods are more complex, that is, involve more earners; and when the household is more sophisticated, for example, has educated female members or is located in a bigger town.”
These examples show that limiting surveys to a single adult in the household can lead to measurement errors on key variables, even if interviewing fewer household members has obvious time- and cost-saving advantages. Researchers should thus first ask themselves: who is likely to have the information needed, and to have it reliably? What is the probability that only one household member has accurate information on key variables? What are the intra-household dynamics? What are the roles of the different household members? This preliminary work, which can be qualitative or quantitative, is necessary to implement rigorous research protocols and to decide whether the cost of interviewing several household members leads to a substantial gain in terms of data quality.
Lastly, it is sometimes assumed that interviewing the household head will yield enough female respondents. In Tanzania, for example, female-headed households represent 25% of households. However, surveys that compare individual behaviours or opinions by gender face the problem that female-headed households are often more vulnerable, so the women who head them may not be representative of women more broadly. One could therefore opt instead for a random selection of the respondent within the household. The literature proposes no fewer than fifteen different methods for selecting respondents, spanning probabilistic, quasi-probabilistic and non-probabilistic approaches (see Choumert-Nkolo et al., 2018; Gaziano, 2005; Yan, 2009).
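As an illustration of the simplest probabilistic option, the sketch below draws one respondent at random among the eligible adults listed in the roster. It is not one of the specific methods reviewed in the papers cited above, only a minimal example. The roster format, the age threshold of 18 and the function name `select_respondent` are assumptions made for the example. Seeding the draw with the household identifier makes the selection reproducible, so supervisors can check that interviewers did not substitute a more convenient respondent.

```python
import random


def select_respondent(members, household_id, min_age=18, seed="survey-2024"):
    """Draw one eligible adult at random, reproducibly for a given household.

    `members` is a list of dicts with hypothetical keys "line", "name", "age".
    Returns None when nobody in the roster is eligible.
    """
    # Sort by roster line number so the draw does not depend on input order.
    eligible = sorted(
        (m for m in members if m["age"] >= min_age),
        key=lambda m: m["line"],
    )
    if not eligible:
        return None
    # A string seed gives the same draw on every run for the same household.
    rng = random.Random(f"{seed}-{household_id}")
    return rng.choice(eligible)


# Hypothetical roster, for illustration only.
household = [
    {"line": 1, "name": "head", "age": 52},
    {"line": 2, "name": "spouse", "age": 47},
    {"line": 3, "name": "adult daughter", "age": 23},
    {"line": 4, "name": "son", "age": 12},
]

chosen = select_respondent(household, household_id="HH-0042")
print(chosen["name"])
```

In this sketch every eligible adult has the same chance of selection within a household, so adults in larger households are less likely to be picked. Analyses of individual-level outcomes would typically weight by the number of eligible adults.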
Is your questionnaire too long? To get accurate answers, you need focused and motivated respondents. Needless to say, the same is expected from interviewers. Long interviews can lead to respondent fatigue, measurement errors, yes-saying bias, poor recall efforts, higher non-response rates, etc. Put simply, data quality is at stake. Is there an ideal questionnaire length? Not to my knowledge, but the shorter the better. Also, the relationship between time spent on each question and the total number of questions is not linear: respondents take less time per question on average in longer surveys, meaning that they rush or fall into the yes-saying syndrome.
Time matters, but a lot of factors come into play, such as the topic, the attitude of interviewers, the context of the survey (e.g. whether survey participants benefit from an intervention or the urban/rural location of the survey). In any case, keep in mind that the average attention span of an adult is around 20 minutes.
Here are a few recommendations to cut the length of your questionnaire:
- Use qualitative approaches to build your questionnaire. For example, if you implement a road survey, you can use participatory mapping to understand how the road is used, by whom and to which destinations, which will help you write your survey questions more efficiently.
- Go through each question and ask yourself how it is going to be used during the analysis phase. If the answer is “I will ask this question in case I need it”, this may be an indication that the question can be dropped. Put differently, having a pre-analysis plan or a theory of change, with a list of outcome variables and covariates, will help a lot.
- Look at the existing literature and questionnaires. For example, instead of having a list of 50 assets (to build a wealth index), you can easily go down to 10 by looking at those used in other studies (DHS, LSMS, Poverty Probability Index, etc.).
- Look out for redundant questions. It is tempting to ask different versions of the same question, but this does not increase the accuracy of the responses. Instead, pilot what you think is the best version of the question.
- Ask for feedback from colleagues who have implemented similar surveys, and ask whether they have used timestamp paradata. In our paper, we explain how we used timestamps to cut the length of our questionnaire between the pilot and fieldwork. We also estimate the time taken to ask and answer different types of questions about asset ownership and perceptions. We found that the “average length of factual questions was 7 seconds compared with 10 seconds for perception questions—around 45% longer for perception questions”. Note that in this survey, perception questions were simple ones. In other surveys I have worked on, I saw questions that took between 1 and 2 minutes to answer, given their complexity.
- Test your questionnaire to see what works and which sections take too long. Between the outdoor practice and the first week of data collection, the length always goes down. It keeps falling as interviewers become more familiar with the questionnaire and more efficient at using it, until no further time gains can be made. So yes, the length will decrease, but be realistic about how much.
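For readers who have not worked with timestamp paradata, here is a minimal Python sketch of the kind of summary they allow: the average time per question by question type and by section, computed from pilot data. It is not the procedure used in our paper. The record layout (`question`, `section`, `kind`, `start`, `end`, in seconds) and all the values are made up for the example, and real exports from survey software will look different.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical pilot paradata: one record per question asked, times in seconds.
records = [
    {"question": "a1", "section": "assets", "kind": "factual", "start": 0.0, "end": 5.0},
    {"question": "a2", "section": "assets", "kind": "factual", "start": 5.0, "end": 12.0},
    {"question": "p1", "section": "perceptions", "kind": "perception", "start": 12.0, "end": 20.0},
    {"question": "p2", "section": "perceptions", "kind": "perception", "start": 20.0, "end": 32.0},
]


def mean_duration_by(records, key):
    """Average time spent per question, grouped by the given record field."""
    durations = defaultdict(list)
    for r in records:
        durations[r[key]].append(r["end"] - r["start"])
    return {group: mean(values) for group, values in durations.items()}


def total_duration_by(records, key):
    """Total time spent, grouped by the given record field."""
    totals = defaultdict(float)
    for r in records:
        totals[r[key]] += r["end"] - r["start"]
    return dict(totals)


print(mean_duration_by(records, "kind"))
print(total_duration_by(records, "section"))
```

Ranking sections by total time after the pilot gives a quick list of candidates for trimming, while the per-question averages help to estimate how much time dropping a given block of questions would actually save.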
Last, if you anticipate that your questionnaire will be long, you can think about in-field strategies to maintain the attention of respondents, such as administering the questionnaire over several sessions and allowing for breaks (also giving time for household chores).
Will respondents understand each question without ambiguity? The language and wording used should be simple and clear, and adapted to the local context and to the age and education level of respondents. For example, with children, it is preferable to avoid long response lists or complex scales. In addition, some concepts do not translate well: in some situations, keeping the word in English (or any other relevant language) can be a suitable option, while in others it will be necessary to discuss these challenges with field teams to ensure that the right words will be used and understood by local communities. When asking respondents to recall historical information or information from a particular period, referring to important events that happened in the community or the country can make the question easier to understand.
To ensure that respondents will understand every single question, it will be highly beneficial to work with staff from the country/community you are researching so that they can provide you with feedback on the question texts and response lists. In any case, it is essential to pilot the questionnaire and to run it by people who know the local context and the surveyed populations well.
Is the questionnaire subject to measurement errors? It probably is, no matter the topic of the survey. There is no such thing as a perfect questionnaire, but recent research has made tremendous progress towards understanding these errors and their sources. Even though controlling for all biases would be resource intensive, going over the literature around your main research question will certainly help. Here are a few starting points, if you are looking into this:
- Recall periods in agriculture, see Arthi et al. (2018), Beegle et al. (2012), Gaddis et al. (2019);
- Measuring consumption, see Beegle et al. (2012), Caeyers et al. (2012), Farfán et al. (2017), Oseni et al. (2017);
- Land area measurement, see Carletto et al. (2015), Carletto et al. (2016), Kilic et al. (2017);
- Sensitive topics, see Blair and Imai (2012), Blair et al. (2015), McKenzie and Siegel (2013), Rosenfeld et al. (2016), Tourangeau and Yan (2007).