The Necessity and Complexity of Data Collection on the Roma People in Europe

By: Jozefien Van Caeneghem

The prohibition of discrimination on the basis of race and/or ethnic origin is contained in a number of international human rights instruments, such as the International Convention on the Elimination of All Forms of Racial Discrimination, which defines “racial discrimination” as “any distinction, exclusion, restriction or preference based on race, colour, descent, or national or ethnic origin which has the purpose or effect of nullifying or impairing the recognition, enjoyment or exercise, on an equal footing, of human rights and fundamental freedoms in the political, economic, social, cultural or any other field of public life”. The Convention requires States to condemn racial discrimination and to pursue its elimination in all its forms by all appropriate means. Ethnic data collection can be very helpful in this regard. It facilitates the gathering of information on the level of social and economic integration of various groups and the uncovering of discrimination. It can also help to raise awareness of the situation of different groups, to implement and evaluate policies, and to support discrimination claims.

The collection of ethnic data for anti-discrimination law and policy purposes is quite a common practice in the United States. In Europe, by contrast, the practice is quite controversial, with the exception of the United Kingdom. At the international and European level, there have been numerous calls to collect ethnic data on the composition and situation of different equality groups in order to render existing anti-discrimination frameworks fully operational. The Roma minority deserves special attention in this debate, because data collection on this group, which is considered one of Europe’s most vulnerable minorities, raises a number of questions and problems that must be addressed carefully in order to produce reliable and accurate data for anti-discrimination purposes.

Who Are the Roma?

Before these issues can be addressed, it is important to take a closer look at who the Roma are. While this appears to be an easy question, the answer is complicated by several factors. Despite the popular perception of Roma as one homogeneous community, the Roma are a very diverse group in terms of nationalities, languages, religions, migration, and socio-economic status. In Europe, the term “Roma” is often used for practical reasons in the framework of anti-discrimination policies as an umbrella term to designate several groups such as Sinti, Roms, Manouches, Travellers, and Roma, who share similar cultural characteristics and a history of discrimination and stigmatization. Each group has its own specific characteristics and cultural identity. Despite this significant heterogeneity, the various Roma groups in Europe are often all treated as one and the same. Prejudice, discrimination, and hostility against Roma are often referred to as anti-Gypsyism. It is based on age-old stereotypes, still very present in society today, that portray the Roma as a well-defined group who travel around, speak their own language, and are lazy, dirty, and criminal.

An Unmet Need for Data

The Council of Europe estimates that there are approximately ten to twelve million Roma in Europe, of whom approximately six million live in the European Union. Estimates of the size of the Roma community in various European countries vary greatly, either because official data on the Roma are lacking or, where official data are available, because they are unreliable as a result of the reluctance of many Roma to self-identify. Over the years, several initiatives have been taken to advance the situation of Roma in education, employment, housing, and health. For example, the Decade of Roma Inclusion (2005-2015) and the EU Framework for National Roma Integration Strategies (up to 2020) both call on States to set specific targets and to introduce robust monitoring systems to track progress. So far, however, very few States have made considerable efforts to fill gaps in data on Roma. Many States still hide behind international and European privacy and data protection legislation by misinterpreting it as prohibiting the collection of such sensitive data. Moreover, there is often a lack of political will to make real changes in the lives of the Roma, who have always been and still remain an unpopular political topic.

Another problem relates to the ambivalent feelings many Roma have towards ethnic data collection. On the one hand, there seems to be some understanding that such data can play a key role in lobbying efforts by demonstrating the extent of discrimination against Roma and the lack of progress, as well as in invalidating unreliable data such as unrealistically high crime rates among Roma. Ethnic statistics have also been very helpful in proving indirect discrimination, as has been done successfully before the European Court of Human Rights to prove discrimination against Roma pupils in education. On the other hand, many fear that the possible benefits of ethnic data collection do not outweigh the potential risks it entails. Ethnic data can be misused to harm the Roma, as was done during WWII on the basis of population registers and numerous censuses on the Roma. More recently, in Italy, the fingerprinting of Roma caused international concern. Experience from France shows that data can also be used to limit the freedom of movement of the Roma. Lack of awareness, both of the benefits of ethnic data collection and of the existence of anti-discrimination and data protection legislation, makes many Roma hesitant towards this practice and reluctant to self-identify.

A Variety of Potential Data Sources

Data on Roma can come from a number of different sources. Official statistics, such as censuses, are a first possible data source. The main problem with this source is that many States do not collect ethnic data, let alone data on the Roma, in their census. Nationality or national origin can be an indication but is no guarantee of Roma identity. Moreover, official statistics do not count those Roma who are present in a country but who are stateless or who lack identity documents and/or valid residency permits. Even where censuses do collect data on Roma, they are usually conducted only every ten years, which results in data gaps on the size and geographic location of the Roma in a given State in between censuses. Large national surveys are not a good alternative: because their samples are usually based on census data, they include too few Roma to yield reliable data on this group.

Academic and ad hoc research can fill many gaps left by official statistics and has been used extensively to collect data on Roma at various levels and on a variety of topics such as education, employment, housing, and health. The usefulness of research on Roma is, however, often limited in terms of representativeness and comparability as a result of methodological choices and because research is often only conducted once. Testing has also proven to be a valuable source of information on discrimination against Roma in accessing a variety of goods and services such as renting a hotel room or visiting a disco. The reports from several international and European monitoring bodies in which they express concern over the situation of and violence against the Roma in various States can also be a valuable source of information. Such reports are often based upon information provided by various stakeholders, as well as information gathered by the experts themselves during country visits.

Complaints data from the police, equality bodies, or other organizations authorized to receive complaints can also be a valuable data source, provided such information is collected, disaggregated, and published, which is often not the case. Where such data exist, they must be interpreted carefully, because research shows that many Roma do not report violence and discrimination to the police for a variety of reasons. These include, among others, that they are used to being discriminated against, that they do not believe filing a complaint will make a difference, or that they fear further discrimination and victimization. Research also shows that many Roma are unaware of where to file a complaint or which procedures to follow. But while complaints data seriously under-represent Roma, crime data often over-represent them. One must be aware, however, that such over-representation may point to problems in the system, with Roma being over-represented among those stopped by police and receiving more severe sentences in court, rather than to a problem among the Roma.

The Complexity of Categorizing and Identifying the Roma

In order to collect ethnic data, one needs to choose the appropriate categories by which people can be identified. With regard to the Roma, the delineation of such categories is quite complex. A first problem relates to the heterogeneity of Roma communities. Because various groups might have different characteristics and problems, it is best to include a large number of specific categories. Roma communities should be involved in this decision in order to ensure that data collection practices include the categories to which they feel they belong. Moreover, because Roma might feel equally Roma and, for example, Hungarian, it is important that multiple identifications are allowed in order to increase levels of participation and self-identification. Although this may limit comparability over time, categories should also evolve when they acquire a negative connotation, as has happened already in various contexts with “Gypsy” or “Tigan”.

The identification of Roma in data collection practices is also a much-debated question. Self-identification is the preferred identification method in ethnic data collection practices, because it is considered to be most in line with the respondents’ human rights. When it comes to the Roma, however, self-identification often results in insufficient, unreliable, or inaccurate data. While in certain data collection practices, such as self-report surveys on Roma discrimination, self-identification is the logical approach, it is argued that in other data collection practices for anti-discrimination purposes, external identification is more appropriate because discrimination is based on perception rather than self-identification. Whereas self-identification may lead to the under-representation of Roma in data collection efforts, external identification on the basis of visual observation risks over-representing individuals who correspond to the observer’s stereotypical view of who is Roma. The data may thus include people who are not Roma but who have darker skin and show more visible signs of social exclusion and poverty, while leaving out Roma with lighter skin tones and Roma who are more educated and socially included.

External identification is sometimes also done on the basis of so-called objective criteria, such as language. This method risks under-representing Roma, as many no longer speak the Romani language or its dialects as a result of discrimination and assimilation, and even if they do, they might prefer not to disclose this out of fear of facing discrimination. Identification of Roma is also sometimes done by other members of the Roma community, such as Roma self-governments or Roma working for the government or NGOs. The main argument against this method is that their own ethnic affiliation does not make persons identifying as Roma more qualified to identify other Roma. A possible solution to these issues may be to combine various identification methods. For example, a method based on the implicit endorsement of external identification has been used at the international and European level. Respondent households are first selected on the basis of external identification, after which the interviewer asks whether the household would be interested in participating in a survey of the Roma. Agreement to participate is considered confirmation of Roma identity. The respondent thus retains control over the identification process.

The Importance of Raising Awareness and Active Participation

In order to render data collection practices on Roma successful, Roma should be well informed and made aware of the existence of anti-discrimination and data protection legislation, the benefits and categories of ethnic data collection, and the availability of complaints bodies and procedures. Experience in various countries shows that mobilization campaigns by means of mediation, information meetings, and door-to-door visits can raise levels of trust and self-identification. NGOs can play an important role in this regard at the grassroots level. Moreover, encouraging genuine dialogue between Roma and the rest of society can contribute to a better understanding of the position and worries of both sides, which helps avoid miscommunication.

Roma should also be actively involved in all stages of data collection, from the development of measures to implementation, monitoring, evaluation, interpretation, and dissemination of results, in order to increase feelings of ownership of the results and to invalidate arguments of manipulation and misinterpretation. Experience shows that the active involvement of Roma in the actual collection of the data, for example as enumerators or as language assistants, also generates more reliable results. Not only do Roma generally have better access to Roma communities, they can also improve communication and explain concepts and procedures, and they are better positioned to encourage others to participate and self-identify. Other stakeholders, such as local and regional authorities, should also be closely involved in data collection efforts. Statistical agencies and data protection authorities must have adequate capacity to conduct and support data collection practices on Roma. They could benefit from the involvement of experts with experience in this field, such as NGOs or Roma enumerators.

A Complex Yet Important Undertaking

The European Commission states clearly that in order to advance non-discrimination and equal opportunities for different racial and ethnic groups, there is an urgent need to supplement the existing legislative framework with a range of policy tools. One of these tools is ethnic data collection. In the words of the UK Commission for Racial Equality: “To have an equality policy without ethnic monitoring is like aiming for good financial management without keeping financial records”. When it comes to the Roma, however, such data collection is anything but an easy or straightforward undertaking. Problems range from how to define the Roma to choosing appropriate categories and identifying who is Roma. Notwithstanding these stumbling blocks, it is imperative that States, equality bodies, NGOs, and academics increase their efforts to close the large data gaps on Roma that still persist across Europe. Considering the complexity of data collection practices on Roma, it is inevitable that the information on this group will come from many different sources and will often not be representative or comparable with other data sets as a result of different methodological approaches. This need not be a problem, however, because different purposes call for different kinds of data: States must have general data on the size and geographic location of Roma communities in order to set targets and allocate funding, while at the local level, where social inclusion starts, there is a need for specific and in-depth data in order to take the specificities of various (sub)groups into account in policymaking.

 Jozefien Van Caeneghem is a visiting scholar at Berkeley Law, a BAEF Hoover Foundation Brussels Fellow, and a PhD candidate at Vrije Universiteit Brussel.

