Data license: CC-BY
<!-- wp:heading {"level":2} --> <h2>How is literacy measured?</h2> <!-- /wp:heading --> <!-- wp:html --> <div class="blog-info">Our World in Data presents the data and research to make progress against the world’s largest problems. <p>This post draws on data and research discussed in our entry on <strong><a href="https://ourworldindata.org/literacy" target="_blank" rel="noopener noreferrer">Literacy</a></strong>.</p> </div> <!-- /wp:html --> <!-- wp:paragraph --> <p>The chart shows global literacy rates among adults since 1800. This is a powerful graph: it tells us that over the last two centuries the share of illiterate adults has gone down from 88% to less than 14%.</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>This global perspective on education leads to a natural question: What does it actually mean that a person is ‘literate’ in these international statistics?</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>Literacy rates are only a proxy for what we actually care about, namely literacy <em>skills</em>. The distinction matters because literacy skills are complex and span a range of proficiency levels, while literacy rates assume a sharp, binary distinction between those who are and aren’t ‘literate’.</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>What is the definition of literacy that underlies the estimates in the chart? And how do these estimates compare to other measures of educational achievement and literacy skills?</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>In this post we answer these questions.
We begin with an overview of recent estimates published by UNESCO, and then move on to discuss long-run estimates that rely on historical data.</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p><iframe style="width: 100%; height: 600px; border: 0px none;" src="https://ourworldindata.org/grapher/literate-and-illiterate-world-population"></iframe></p> <!-- /wp:paragraph --> <!-- wp:heading {"level":3} --> <h3>Measurement today</h3> <!-- /wp:heading --> <!-- wp:heading {"level":4} --> <h4>Methodologies for measuring literacy</h4> <!-- /wp:heading --> <!-- wp:paragraph --> <p>Let's start by taking a look at recent estimates of literacy. Specifically, the estimates of literacy rates compiled by UNESCO from different sources.</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>In the chart we present a breakdown of these estimates, showing the main methodologies that countries use to measure literacy, and how these have changed over time. (To explore changes across time use the slider underneath the map.)</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>The breakdown covers four categories: self-reported literacy declared directly by individuals, self-reported literacy declared by the head of the household, tested literacy from proficiency examinations, and indirect estimation or extrapolation.</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>In most cases, the categories covering 'self-reports' (green and orange) correspond to estimates of literacy that rely on answers provided to a simple yes/no question asking people if they can read and write.</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>The category 'indirect estimation' (red) corresponds mainly to estimates that rely on indirect evidence from educational attainment, usually based on the highest degree of completed education.</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>In <a href="https://owid.cloud/app/uploads/2018/03/Literacy-measurement-OWID-metadata.xlsx" target="_blank" rel="noopener 
noreferrer">this table</a> you will find details regarding all literacy definitions and sources, country by country, and how we categorised them for the purpose of this chart.</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>This chart tells us that:</p> <!-- /wp:paragraph --> <!-- wp:list --> <ul><li>There is substantial cross-country variation, with recent estimates covering all four measurement methods.</li><li>There is variation within countries across time (e.g. Mexico switches between self-reports and extrapolation).</li><li>The number of countries that base their estimates on self-reports and testing is increasing.</li></ul> <!-- /wp:list --> <!-- wp:paragraph --> <p><iframe style="width: 100%; height: 600px; border: 0px none;" src="https://ourworldindata.org/grapher/mode-of-reporting-literacy-rates"></iframe></p> <!-- /wp:paragraph --> <!-- wp:heading {"level":4} --> <h4>Data sources for measuring literacy</h4> <!-- /wp:heading --> <!-- wp:paragraph --> <p>Another way to dissect the same data is to classify literacy estimates according to the type of measurement instrument used to collect the relevant data. Which countries use household sampling instruments such as UNICEF's <em>Multiple Indicator Cluster Surveys</em>? Which countries use census data? And which countries do not collect literacy data directly, but rely instead on other sources?</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>In the chart we explore this, splitting estimates into three categories: sampling, including data from literacy tests and household surveys; census data; and other instruments (e.g. administrative data on school enrollment).</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>Here we can see that most countries use sampling instruments (coded as 'surveys' in the map), although in the past census data was more common.
Literacy surveys have the potential to be more accurate – when the sampling is done correctly – because they allow for more specific and detailed measurement than the short, generic questions in population censuses. Below we discuss this further.</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p><iframe style="width: 100%; height: 600px; border: 0px none;" src="https://ourworldindata.org/grapher/literacy-rate-source-type"></iframe></p> <!-- /wp:paragraph --> <!-- wp:heading {"level":4} --> <h4>Data quality: Challenges and limitations</h4> <!-- /wp:heading --> <!-- wp:paragraph --> <p>As mentioned above, recent data on literacy is often based on a single question included in national population censuses or household surveys presented to respondents above a certain age, where literacy skills are self-reported. The question is often phrased as "can you read and write?". These self-reports of literacy skills have several limitations:</p> <!-- /wp:paragraph --> <!-- wp:list --> <ul><li>Simple questions such as "can you read and write?" frame literacy as a skill you either possess or do not, when in reality literacy is a multi-dimensional skill that exists on a continuum.</li><li>Self-reports are subjective, in that the question depends on what each individual understands by "reading" and "writing". The form of a word may be familiar enough for a respondent to recall its sound or meaning without actually ‘reading’ it. Similarly, writing out one’s name to demonstrate writing ability can be accomplished by ‘drawing’ a familiar shape rather than producing a written text with meaning.</li><li>In many cases surveys ask only one individual to report literacy on behalf of the entire household.
This indirect reporting potentially introduces further noise, in particular when it comes to estimating literacy among women and children, since these groups are less often considered 'head of household' in the surveys.</li></ul> <!-- /wp:list --> <!-- wp:paragraph --> <p>Inferring literacy from data on educational attainment is similarly problematic, since schooling does not produce literacy in the same way everywhere: Proficiency tests show that in many low-income countries, a <a href="https://ourworldindata.org/grapher/students-in-grade-2-who-cant-read-a-single-word-ca-2015" target="_blank" rel="noopener noreferrer">large fraction</a> of second-grade primary-school students cannot read a single word of a short text; and in these countries, going to school for four or five years <a href="https://www.cgdev.org/blog/measuring-quality-girls-education-across-developing-world" target="_blank" rel="noopener noreferrer">guarantees basic literacy</a> for very few people.</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>Even at a conceptual level there is a lack of consensus – national definitions of literacy that are based on educational attainment vary substantially from country to country.
For example, in Greece people are considered literate if they have finished six years of primary education, while in Paraguay people qualify as literate if they have completed two years of primary school.{ref}You can find more details about this in <a href="http://www.unesco.org/education/GMR2006/full/chapt6_eng.pdf" target="_blank" rel="noopener noreferrer">Chapter 6 of the Education for All Global Monitoring Report (2006)</a>.{/ref}</p> <!-- /wp:paragraph --> <!-- wp:heading {"level":4} --> <h4>New perspectives through standardized literacy tests</h4> <!-- /wp:heading --> <!-- wp:paragraph --> <p>Given the limitations of self-reported or indirectly inferred literacy estimates, efforts are being made at both national and international levels to conduct standardized literacy tests that assess proficiency in a systematic way.</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>In particular, large cross-country assessment surveys have been developed to overcome the challenges of producing comparable literacy data. Two important examples are the Program for the International Assessment of Adult Competencies (PIAAC), a test used for measuring literacy mostly in rich countries; and the Literacy Assessment and Monitoring Programme (LAMP), a household assessment aimed at measuring literacy skills in developing countries while remaining comparable across countries, languages, and scripts.{ref}The LAMP tests three skill domains: reading continuous texts (prose), reading non-continuous texts (documents), and numeracy skills. Each skill domain is divided into three performance levels: the top 30% of respondents at Level 3, the middle 40% at Level 2, and the bottom 30% at Level 1.</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>The LAMP tests have only recently been field tested.
The LAMP structure in these pilots (implemented in Jordan, Mongolia, Palestine, and Paraguay) is as follows:</p> <!-- /wp:paragraph --> <!-- wp:list --> <ul><li>Background questionnaire: collects information on the respondent’s educational attainment, self-reported literacy, literacy use in and outside of work, and occupation, amongst other information.</li><li>Filter test: a booklet with 17 items that determines whether the respondent takes Module A (for low performers) or Module B (for high performers).</li><li>Module A: tests prose, document, and numeracy items.</li><li>Module B: the respondent is randomly assigned Booklet 1 or Booklet 2, both testing prose, document, and numeracy skills for high performers on the filter test.</li></ul> <!-- /wp:list --> <!-- wp:paragraph --> <p>{/ref}</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>The LAMP tests have only recently been field tested in four countries: Jordan, Mongolia, Palestine, and Paraguay. The PIAAC tests, on the other hand, have been administered in about 30 countries, and the results are shown in the chart.{ref}The World Bank's STEP Skills Measurement Program also provides a direct assessment of reading proficiency and related competencies scored on the same scale as the OECD's PIAAC. This adds another 11 countries to the coverage of PIAAC tests. You can read more about it here: <a rel="noopener noreferrer" href="http://microdata.worldbank.org/index.php/catalog/step/about" target="_blank">http://microdata.worldbank.org/index.php/catalog/step/about</a>{/ref}</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>We only have these tests for a few countries, but we can still see that there is an overall positive correlation between literacy rates and test scores. Moreover, we see that there is substantial variation in scores even for countries with identical and almost perfect literacy rates (e.g. Japan vs Italy).
This confirms that PIAAC tests capture a related but broader concept of literacy.</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p><iframe style="width: 100%; height: 600px; border: 0px none;" src="https://owid.cloud/grapher/average-literacy-score-piaac-test-vs-national-literacy-rate-cia"></iframe></p> <!-- /wp:paragraph --> <!-- wp:heading {"level":3} --> <h3>Reconstructing estimates from the past</h3> <!-- /wp:heading --> <!-- wp:heading {"level":4} --> <h4>Census data</h4> <!-- /wp:heading --> <!-- wp:paragraph --> <p>Literacy, under the UNESCO definition of "people who can, with understanding, read and write a short, simple statement on their everyday life", started being recorded in census data from the end of the 19th century onwards. Hence, despite variation in the types of questions and census instruments used, these historical census data remain the best source of data on literacy for the period prior to 1990.</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>The longest series attempting to reconstruct literacy estimates from historical census data is provided in the OECD report <a href="http://www.oecd-ilibrary.org/economics/how-was-life_9789264214262-en" target="_blank" rel="noopener noreferrer">"How Was Life? Global Well-being since 1820"</a>, published in 2014. This is the source used in the first chart in this blog post for the period 1800 to 1990.{ref}The OECD report relies on a number of underlying sources. For the period before 1950, the underlying source is the UNESCO report on the Progress of Literacy in Various Countries Since 1900 (about 30 countries). For the mid-20th century, the underlying source is the UNESCO report on Illiteracy at Mid-Century (about 36 additional countries).
And up to 1970, the source is UNESCO statistical yearbooks.{/ref}</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>How accurate are these UNESCO estimates?</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>The evidence suggests that the narrow concept of literacy measured in early census data provides an imperfect, yet informative account of literacy skills. The chart is a good example: as we can see, as early as 1947, census estimates from the US correlate strongly with educational attainment, as one would expect.</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p><iframe style="width: 100%; height: 600px; border: 0px none;" src="https://owid.cloud/grapher/literacy-by-years-of-schooling-us-1947"></iframe></p> <!-- /wp:paragraph --> <!-- wp:heading {"level":4} --> <h4>The relationship between educational attainment and literacy over time</h4> <!-- /wp:heading --> <!-- wp:paragraph --> <p>Importantly, the correlation between educational attainment and literacy also holds across countries and over time. The chart shows this by plotting changes in literacy rates and average years of schooling. Each country in this chart is represented by a line, where the beginning and ending points correspond to the first and last available observation of these two variables over the period 1970-2010.
(As we mention above, before 1990 almost all observations correspond to census data.)</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>As we can see, literacy rates tend to be much higher in countries where people have more years of education; and as average years of education go up in a country, literacy rates also go up.</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p><iframe style="width: 100%; height: 600px; border: 0px none;" src="https://owid.cloud/grapher/literacy-rates-vs-average-years-of-schooling"></iframe></p> <!-- /wp:paragraph --> <!-- wp:heading {"level":4} --> <h4>Official estimates vs one-sentence test results</h4> <!-- /wp:heading --> <!-- wp:paragraph --> <p>Countries with high literacy rates also tend to have higher results in the basic literacy tests included in the DHS surveys (a test that requires survey respondents to read a sentence shown to them). As we can see in the chart, there is obviously noise, but these two variables are closely related.</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p><iframe style="width: 100%; height: 600px; border: 0px none;" src="https://ourworldindata.org/grapher/literacy-rate-adult-total-dhs-surveys-vs-unesco"></iframe></p> <!-- /wp:paragraph --> <!-- wp:heading {"level":4} --> <h4>Other historical sources</h4> <!-- /wp:heading --> <!-- wp:paragraph --> <p>When census data is not available, a common method used to estimate literacy is to calculate the <a href="https://ourworldindata.org/grapher/historical-literacy-in-england-by-sex" target="_blank" rel="noopener noreferrer">share of people who could sign official documents</a> (e.g. court documents, marriage certificates, etc.).{ref}Since estimates of signed documents tend to rely on small samples (e.g. parish documents from specific towns), researchers often rely on additional assumptions to extrapolate estimates to the national level.
For example, Bob Allen provides estimates of the evolution of literacy in Europe between 1500 and 1800 using data on urbanization rates. For more details see Allen, R. C. (2003). <a href="https://web.archive.org/web/20190122001259/http://www.econ.queensu.ca/files/other/allen03.pdf">Progress and poverty in early modern Europe</a>. The Economic History Review, 56(3), 403-443.{/ref} As the researcher Jeremiah Dittmar <a href="http://www.jeremiahdittmar.com/files/Print-Welfare-Paper-5.27.2011.pdf" target="_blank" rel="noopener noreferrer">explains</a>, this approach only gives a lower bound on literacy, because the number of people who could read was higher than the number who could write.</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>Other methods have therefore been proposed that rely instead on historical estimates of the number of people who could read. For example, researchers Eltjo Buringh and Jan Luiten van Zanden deduce <a rel="noopener noreferrer" href="https://ourworldindata.org/grapher/estimated-historical-literacy-rates?country=GBR+BEL+FRA+DEU+IRL+ITA+NLD+POL+ESP+SWE+Western%20Europe" target="_blank">literacy rates from estimated per capita book consumption</a>.{ref}They use a demand equation that links book consumption to a number of factors, including literacy and book prices. For more details see Buringh, E. and Van Zanden, J.L., 2009. Charting the “Rise of the West”: Manuscripts and Printed Books in Europe, a Long-term Perspective from the Sixth through Eighteenth Centuries. The Journal of Economic History, 69(2), pp. 409-445.{/ref} As Buringh and Van Zanden show, their estimates based on book consumption are different from, but still fairly close to, alternative estimates based on signed documents.</p> <!-- /wp:paragraph --> <!-- wp:heading {"level":3} --> <h3>Concluding remarks</h3> <!-- /wp:heading --> <!-- wp:paragraph --> <p>Literacy is a key skill and a key measure of a population's education.
However, measuring literacy is difficult because literacy is a complex, multi-dimensional skill.</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>For decades, most countries have been measuring literacy through yes/no, self-reported answers to a single question in a survey or census along the lines of "can you read and write?".</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>These estimates provide a basic perspective on literacy skills. They tell us something meaningful about broad changes in education in the long run, since the changes across decades are much larger than the underlying error margins at any point in time. But they remain insufficient to fully characterise literacy skills and to understand the challenges ahead.</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>As populations become more educated, we need more accurate instruments to measure abilities – asking people whether they can read and write is insufficient to meaningfully detect differences in the way people apply skills for work and life.</p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p>Efforts have been made in recent years, both at national and international levels, to develop standardised, tailor-made survey instruments that measure literacy skills more accurately through tests.
Several countries already implement these tests, but we will need to wait several years for these new instruments to become the norm for measuring and reporting literacy rates internationally.</p> <!-- /wp:paragraph -->
The distinction matters because literacy skills are complex and span over a range of proficiency shades, while literacy rates assume a sharp, binary distinction between those who are and aren\u2019t \u2018literate\u2019.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "What is the definition of literacy that underlies the estimates in the chart? And how do these estimates compare to other measures of educational achievement and literacy skills?", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "In this post we answer these questions. We begin with an overview of recent estimates published by UNESCO, and then move on to discuss long-run estimates that rely on historical data.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "text": [ { "text": "Measurement today", "spanType": "span-simple-text" } ], "type": "heading", "level": 2, "parseErrors": [] }, { "text": [ { "text": "Methodologies for measuring literacy", "spanType": "span-simple-text" } ], "type": "heading", "level": 3, "parseErrors": [] }, { "type": "text", "value": [ { "text": "Let's start by taking a look at recent estimates of literacy. Specifically, the estimates of literacy rates compiled by UNESCO from different sources.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "In the chart we present a breakdown of these estimates, showing the main methodologies that countries use to measure literacy, and how these have changed over time. 
(To explore changes across time use the slider underneath the map.)", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "The breakdown covers four categories: self-reported literacy declared directly by individuals, self-reported literacy declared by the head of the household, tested literacy from proficiency examinations, and indirect estimation or extrapolation.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "In most cases, the categories covering 'self-reports' (green and orange) correspond to estimates of literacy that rely on answers provided to a simple yes/no question asking people if they can read and write.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "The category 'indirect estimation' (red) corresponds mainly to estimates that rely on indirect evidence from educational attainment, usually based on the highest degree of completed education.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "In ", "spanType": "span-simple-text" }, { "url": "https://owid.cloud/app/uploads/2018/03/Literacy-measurement-OWID-metadata.xlsx", "children": [ { "text": "this table", "spanType": "span-simple-text" } ], "spanType": "span-link" }, { "text": " you find details regarding all literacy definitions and sources, country by country, and how we categorised them for the purpose of this chart.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "This chart is telling us that:", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "list", "items": [ { "type": "text", "value": [ { "text": "There is substantial cross-country variation, with recent estimates covering all four measurement methods.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "There is variation within countries 
across time (e.g. Mexico switches between self-reports and extrapolation).", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "The number of countries that base their estimates on self-reports and testing is increasing.", "spanType": "span-simple-text" } ], "parseErrors": [] } ], "parseErrors": [] }, { "text": [ { "text": "Data sources for measuring literacy", "spanType": "span-simple-text" } ], "type": "heading", "level": 3, "parseErrors": [] }, { "type": "text", "value": [ { "text": "Another way to dissect the same data, is to classify literacy estimates according to the type of measurement instrument used to collect the relevant data. Which countries use household sampling instruments such as UNICEF's ", "spanType": "span-simple-text" }, { "children": [ { "text": "Multiple Indicator Cluster Surveys", "spanType": "span-simple-text" } ], "spanType": "span-italic" }, { "text": "? Which countries use census data? And which countries do not collect literacy data directly, but rely instead on other sources?", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "In the chart we explore this, splitting estimates into three categories: sampling, including data from literacy tests and household surveys; census data; and other instruments (e.g. administrative data on school enrollment).", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "Here we can see that most countries use sampling instruments (coded as 'surveys' in the map), although in the past census data was more common. Literacy surveys have the potential of being more accurate \u2013 when the sampling is done correctly \u2013 because they allow for more specific and detailed measurement than short and generic questions in population censuses. 
Below we discuss this further.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "text": [ { "text": "Data quality: Challenges and limitations", "spanType": "span-simple-text" } ], "type": "heading", "level": 3, "parseErrors": [] }, { "type": "text", "value": [ { "text": "As mentioned above, recent data on literacy is often based on a single question included in national population censuses or household surveys presented to respondents above a certain age, where literacy skills are self-reported. The question is often phrased as \"can you read and write?\". These self-reports of literacy skills have several limitations:", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "list", "items": [ { "type": "text", "value": [ { "text": "Simple questions such as \"can you read and write?\" frame literacy as a skill you either possess or do not when, in reality, literacy is a multi-dimensional skill that exists on a continuum.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "Self-reports are subjective, in that the question is dependent on what each individual understands by \"reading\" and \"writing\". The form of a word may be familiar enough for a respondent to recall its sound or meaning without actually \u2018reading\u2019 it. Similarly, when writing out one\u2019s name to convey written ability, this can be accomplished by \u2018drawing\u2019 a familiar shape rather than writing in an effort to produce a written text with meaning.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "In many cases surveys ask only one individual to report literacy on behalf of the entire household. 
This indirect reporting potentially introduces further noise, in particular when it comes to estimating literacy among women and children, since these groups are less often considered 'head of household' in the surveys.", "spanType": "span-simple-text" } ], "parseErrors": [] } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "Similarly, inferring literacy from data on educational attainment is also problematic, since schooling does not produce literacy in the same way everywhere: Proficiency tests show that in many low-income countries, a ", "spanType": "span-simple-text" }, { "url": "https://ourworldindata.org/grapher/students-in-grade-2-who-cant-read-a-single-word-ca-2015", "children": [ { "text": "large fraction", "spanType": "span-simple-text" } ], "spanType": "span-link" }, { "text": " of second-grade primary-school students cannot read a single word of a short text; and for very few people in these countries going to school for four or five years ", "spanType": "span-simple-text" }, { "url": "https://www.cgdev.org/blog/measuring-quality-girls-education-across-developing-world", "children": [ { "text": "guarantees basic literacy", "spanType": "span-simple-text" } ], "spanType": "span-link" }, { "text": ".", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "Even at a conceptual level there is lack of consensus \u2013 national definitions of literacy that are based on educational attainment vary substantially from country to country. 
For example, in Greece people are considered literate if they have finished six years of primary education; while in Paraguay you qualify as literate if you have completed two years of primary school.{ref}You find more details about this in ", "spanType": "span-simple-text" }, { "url": "http://www.unesco.org/education/GMR2006/full/chapt6_eng.pdf", "children": [ { "text": "Chapter 6 of the Education for All Global Monitoring Report (2006)", "spanType": "span-simple-text" } ], "spanType": "span-link" }, { "text": ".{/ref}", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "text": [ { "text": "New perspectives through standardized literacy tests", "spanType": "span-simple-text" } ], "type": "heading", "level": 3, "parseErrors": [] }, { "type": "text", "value": [ { "text": "Given the limitations of self-reported or indirectly inferred literacy estimates, efforts are being made at both national and international levels to conduct standardized literacy tests to assess proficiency in a systematic way.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "In particular, large cross-country assessment surveys have been developed to overcome the challenges of producing comparable literacy data. Two important examples are the Program for the International Assessment of Adult Competencies (PIAAC), which is a test used for measuring literacy mostly in rich countries; and the Literacy Assessment and Monitoring Programme (LAMP), which is a household assessment aimed at measuring literacy skills in developing countries, while remaining comparable across countries, languages, and scripts.{ref}The LAMP tests three skill domains: reading continuous texts (prose), reading non-continuous texts (documents), and numeracy skills. 
Each skill domain is divided into three performance levels: the top 30% of respondents at Level 3, the middle 40% at Level 2, and the bottom 30% at Level 1.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "The LAMP tests have only recently been field tested. The LAMP structure in these pilots (implemented in Jordan, Mongolia, Palestine, and Paraguay) is as follows:", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "list", "items": [ { "type": "text", "value": [ { "text": "Background questionnaire: collects information on the respondent\u2019s educational attainment, self-reported literacy, literacy use in and outside of work, and occupation, amongst other information.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "Filter test: Booklet with 17 items to determine whether the respondent takes Module A (for low performers) or Module B (for high performers).", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "Module A: Testing prose, document, and numeracy items.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "Module B: Respondent is randomly assigned Booklet 1 or Booklet 2, both testing prose, document, and numeracy skills for high performers of the filter test.", "spanType": "span-simple-text" } ], "parseErrors": [] } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "{/ref}", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "The LAMP tests have only recently been field tested in four countries: Jordan, Mongolia, Palestine, and Paraguay.
The PIAAC tests, on the other hand, have been administered in about 30 countries, and the results are shown in the chart.{ref}The World Bank's STEP Skills Measurement Program also provides a direct assessment of reading proficiency and related competencies scored on the same scale as the OECD's PIAAC. This adds another 11 countries to the coverage of PIAAC tests. You can read more about it here: ", "spanType": "span-simple-text" }, { "url": "http://microdata.worldbank.org/index.php/catalog/step/about", "children": [ { "text": "http://microdata.worldbank.org/index.php/catalog/step/about", "spanType": "span-simple-text" } ], "spanType": "span-link" }, { "text": "{/ref}", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "We only have these tests for a few countries, but we can still see that there is an overall positive correlation. Moreover, we see that there is substantial variation in scores even for countries with identical and almost perfect literacy rates (e.g. Japan vs Italy). This confirms that PIAAC tests capture a related but broader concept of literacy.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "text": [ { "text": "Reconstructing estimates from the past", "spanType": "span-simple-text" } ], "type": "heading", "level": 2, "parseErrors": [] }, { "text": [ { "text": "Census data", "spanType": "span-simple-text" } ], "type": "heading", "level": 3, "parseErrors": [] }, { "type": "text", "value": [ { "text": "Literacy, under the UNESCO definition of \"people who can, with understanding, read and write a short, simple statement on their everyday life\", started being recorded in census data from the end of the 19th century onwards.
Hence, despite variation in the types of questions and census instruments used, these historical census data remain the best source of data on literacy for the period prior to 1990.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "The longest series attempting to reconstruct literacy estimates from historical census data is provided in the OECD report ", "spanType": "span-simple-text" }, { "url": "http://www.oecd-ilibrary.org/economics/how-was-life_9789264214262-en", "children": [ { "text": "\"How Was Life? Global Well-being since 1820\"", "spanType": "span-simple-text" } ], "spanType": "span-link" }, { "text": ", published in 2014. This is the source used in the first chart in this blog post for the period 1800 to 1990.{ref}The OECD report relies on a number of underlying sources. For the period before 1950, the underlying source is the UNESCO report on the Progress of Literacy in Various Countries Since 1900 (about 30 countries). For the mid-20th century, the underlying source is the UNESCO report on Illiteracy at Mid-Century (about 36 additional countries). And up to 1970, the source is UNESCO statistical yearbooks.{/ref}", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "How accurate are these UNESCO estimates?", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "The evidence suggests that the narrow concept of literacy measured in early census data provides an imperfect, yet informative account of literacy skills.
The chart is a good example: As we can see, as early as 1947, census estimates from the US correlate strongly with educational attainment, as one would expect.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "text": [ { "text": "The relationship between educational attainment and literacy over time", "spanType": "span-simple-text" } ], "type": "heading", "level": 3, "parseErrors": [] }, { "type": "text", "value": [ { "text": "Importantly, the correlation between educational attainment and literacy also holds across countries and over time. The chart shows this by plotting changes in literacy rates and average years of schooling. Each country in this chart is represented by a line, where the beginning and ending points correspond to the first and last available observations of these two variables over the period 1970-2010. (As we mention above, before 1990 almost all observations correspond to census data.)", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "As we can see, literacy rates tend to be much higher in countries where people have more years of education; and as average years of education go up in a country, literacy rates also go up.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "text": [ { "text": "Official estimates vs one-sentence test results", "spanType": "span-simple-text" } ], "type": "heading", "level": 3, "parseErrors": [] }, { "type": "text", "value": [ { "text": "Countries with high literacy rates also tend to have higher results in the basic literacy tests included in the DHS surveys (this is a test that requires survey respondents to read a sentence shown to them).
As we can see in the chart, there is obviously some noise, but these two variables are closely related.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "text": [ { "text": "Other historical sources", "spanType": "span-simple-text" } ], "type": "heading", "level": 3, "parseErrors": [] }, { "type": "text", "value": [ { "text": "When census data is not available, a common method used to estimate literacy is to calculate the ", "spanType": "span-simple-text" }, { "url": "https://ourworldindata.org/grapher/historical-literacy-in-england-by-sex", "children": [ { "text": "share of people who could sign official documents", "spanType": "span-simple-text" } ], "spanType": "span-link" }, { "text": " (e.g. court documents, marriage certificates, etc.).{ref}Since estimates of signed documents tend to rely on small samples (e.g. parish documents from specific towns), researchers often rely on additional assumptions to extrapolate estimates to the national level. For example, Bob Allen provides estimates of the evolution of literacy in Europe between 1500 and 1800 using data on urbanization rates. For more details see Allen, R. C. (2003). ", "spanType": "span-simple-text" }, { "url": "https://web.archive.org/web/20190122001259/http://www.econ.queensu.ca/files/other/allen03.pdf", "children": [ { "text": "Progress and poverty in early modern Europe", "spanType": "span-simple-text" } ], "spanType": "span-link" }, { "text": ".
The Economic History Review, 56(3), 403-443.{/ref} As the researcher Jeremiah Dittmar ", "spanType": "span-simple-text" }, { "url": "http://www.jeremiahdittmar.com/files/Print-Welfare-Paper-5.27.2011.pdf", "children": [ { "text": "explains", "spanType": "span-simple-text" } ], "spanType": "span-link" }, { "text": ", this approach only gives a lower bound on literacy estimates, because the number of people who could read was higher than the number who could write.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "Other methods have therefore been proposed that rely instead on historical estimates of people who could read. For example, researchers Eltjo Buringh and Jan Luiten van Zanden deduce ", "spanType": "span-simple-text" }, { "url": "https://ourworldindata.org/grapher/estimated-historical-literacy-rates?country=GBR+BEL+FRA+DEU+IRL+ITA+NLD+POL+ESP+SWE+Western%20Europe", "children": [ { "text": "literacy rates from estimated per capita book consumption", "spanType": "span-simple-text" } ], "spanType": "span-link" }, { "text": ".{ref}They use a demand equation that links book consumption to a number of factors, including literacy and book prices. For more details see Buringh, E. and Van Zanden, J.L., 2009. Charting the \u201cRise of the West\u201d: Manuscripts and Printed Books in Europe, a long-term Perspective from the Sixth through Eighteenth Centuries. The Journal of Economic History, 69(2), pp.409-445.{/ref} As Buringh and Van Zanden show, their estimates based on book consumption are different, but still fairly close to alternative estimates based on signed documents.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "text": [ { "text": "Concluding remarks", "spanType": "span-simple-text" } ], "type": "heading", "level": 2, "parseErrors": [] }, { "type": "text", "value": [ { "text": "Literacy is a key skill and a key measure of a population's education.
However, measuring literacy is difficult because literacy is a complex, multi-dimensional skill.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "For decades, most countries have been measuring literacy through yes/no, self-reported answers to a single question in a survey or census along the lines of \"can you read and write?\".", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "These estimates provide a basic perspective on literacy skills. They tell us something meaningful about broad changes in education in the long run, since the changes across decades are much larger than the underlying error margins at any point in time. But they remain insufficient to fully characterise literacy skills or to understand the challenges ahead.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "As populations become more educated, we need more accurate instruments to measure abilities \u2013 asking people whether they can read and write is insufficient to meaningfully detect differences in the way people apply skills for work and life.", "spanType": "span-simple-text" } ], "parseErrors": [] }, { "type": "text", "value": [ { "text": "Efforts have been made in recent years, both at national and international levels, to develop standardised, tailor-made survey instruments that measure literacy skills more accurately through tests.
Some countries already implement these tests, but it will take several years for these new instruments to become the norm for measuring and reporting literacy rates internationally.", "spanType": "span-simple-text" } ], "parseErrors": [] } ], "type": "article", "title": "How is literacy measured?", "authors": [ "Esteban Ortiz-Ospina", "Diana Beltekian" ], "dateline": "June 8, 2018", "sidebar-toc": false, "featured-image": "Screen-Shot-2021-07-19-at-09.35.03.png" }, "createdAt": "2018-04-03T01:56:36.000Z", "published": false, "updatedAt": "2022-02-11T16:11:47.000Z", "revisionId": null, "publishedAt": "2018-06-08T08:44:50.000Z", "relatedCharts": [], "publicationContext": "listed" } |
{ "errors": [ { "name": "unexpected elements in p", "details": "Found 1 elements" }, { "name": "unexpected wp component tag", "details": "Found unhandled wp:comment tag list" }, { "name": "unexpected elements in p", "details": "Found 1 elements" }, { "name": "unexpected elements in p", "details": "Found 1 elements" }, { "name": "unexpected wp component tag", "details": "Found unhandled wp:comment tag list" }, { "name": "unexpected wp component tag", "details": "Found unhandled wp:comment tag list" }, { "name": "iframe with src that is not a grapher", "details": "Found iframe with src that is not a grapher" }, { "name": "unexpected elements in p", "details": "Found 1 elements" }, { "name": "iframe with src that is not a grapher", "details": "Found iframe with src that is not a grapher" }, { "name": "unexpected elements in p", "details": "Found 1 elements" }, { "name": "iframe with src that is not a grapher", "details": "Found iframe with src that is not a grapher" }, { "name": "unexpected elements in p", "details": "Found 1 elements" }, { "name": "unexpected elements in p", "details": "Found 1 elements" } ], "numBlocks": 54, "numErrors": 13, "wpTagCounts": { "html": 1, "list": 3, "heading": 11, "paragraph": 45 }, "htmlTagCounts": { "p": 46, "h3": 3, "h4": 8, "ul": 3, "div": 1, "iframe": 7 } } |
2018-06-08 08:44:50 | 2024-02-16 14:22:47 | 1YN5wV6Lb2A48s0OoiWA0iZD0Ix2k8xrPpCXrs0iNCWo | [ "Esteban Ortiz-Ospina", "Diana Beltekian" ] |
2018-04-03 01:56:36 | 2022-02-11 16:11:47 | https://ourworldindata.org/wp-content/uploads/2018/06/Screen-Shot-2021-07-19-at-09.35.03.png | {} |
Our World in Data presents the data and research to make progress against the world’s largest problems. This post draws on data and research discussed in our entry on **[Literacy](https://ourworldindata.org/literacy)**. The chart shows global literacy rates among adults since 1800. This is a powerful graph: it tells us that over the last two centuries the share of illiterate adults has gone down from 88% to less than 14%. This global perspective on education leads to a natural question: What does it actually mean that a person is ‘literate’ in these international statistics? Literacy rates are only a proxy for what we actually care about, namely literacy _skills_. The distinction matters because literacy skills are complex and span a range of proficiency levels, while literacy rates assume a sharp, binary distinction between those who are and aren’t ‘literate’. What is the definition of literacy that underlies the estimates in the chart? And how do these estimates compare to other measures of educational achievement and literacy skills? In this post we answer these questions. We begin with an overview of recent estimates published by UNESCO, and then move on to discuss long-run estimates that rely on historical data. ## Measurement today ### Methodologies for measuring literacy Let's start by taking a look at recent estimates of literacy: specifically, the estimates of literacy rates compiled by UNESCO from different sources. In the chart we present a breakdown of these estimates, showing the main methodologies that countries use to measure literacy, and how these have changed over time. (To explore changes across time, use the slider underneath the map.) The breakdown covers four categories: self-reported literacy declared directly by individuals, self-reported literacy declared by the head of the household, tested literacy from proficiency examinations, and indirect estimation or extrapolation.
In most cases, the categories covering 'self-reports' (green and orange) correspond to estimates of literacy that rely on answers provided to a simple yes/no question asking people if they can read and write. The category 'indirect estimation' (red) corresponds mainly to estimates that rely on indirect evidence from educational attainment, usually based on the highest degree of completed education. In [this table](https://owid.cloud/app/uploads/2018/03/Literacy-measurement-OWID-metadata.xlsx) you can find details regarding all literacy definitions and sources, country by country, and how we categorised them for the purpose of this chart. This chart tells us that: * There is substantial cross-country variation, with recent estimates covering all four measurement methods. * There is variation within countries across time (e.g. Mexico switches between self-reports and extrapolation). * The number of countries that base their estimates on self-reports and testing is increasing. ### Data sources for measuring literacy Another way to dissect the same data is to classify literacy estimates according to the type of measurement instrument used to collect the relevant data. Which countries use household sampling instruments such as UNICEF's _Multiple Indicator Cluster Surveys_? Which countries use census data? And which countries do not collect literacy data directly, but rely instead on other sources? In the chart we explore this, splitting estimates into three categories: sampling, including data from literacy tests and household surveys; census data; and other instruments (e.g. administrative data on school enrollment). Here we can see that most countries use sampling instruments (coded as 'surveys' in the map), although in the past census data was more common.
Literacy surveys have the potential to be more accurate – when the sampling is done correctly – because they allow for more specific and detailed measurement than short and generic questions in population censuses. Below we discuss this further. ### Data quality: Challenges and limitations As mentioned above, recent data on literacy is often based on a single question included in national population censuses or household surveys presented to respondents above a certain age, where literacy skills are self-reported. The question is often phrased as "can you read and write?". These self-reports of literacy skills have several limitations: * Simple questions such as "can you read and write?" frame literacy as a skill you either possess or do not when, in reality, literacy is a multi-dimensional skill that exists on a continuum. * Self-reports are subjective, in that the question is dependent on what each individual understands by "reading" and "writing". The form of a word may be familiar enough for a respondent to recall its sound or meaning without actually ‘reading’ it. Similarly, writing out one’s name to demonstrate writing ability can be accomplished by ‘drawing’ a familiar shape rather than producing a written text with meaning. * In many cases surveys ask only one individual to report literacy on behalf of the entire household. This indirect reporting potentially introduces further noise, in particular when it comes to estimating literacy among women and children, since these groups are less often considered 'head of household' in the surveys.
Similarly, inferring literacy from data on educational attainment is also problematic, since schooling does not produce literacy in the same way everywhere: Proficiency tests show that in many low-income countries, a [large fraction](https://ourworldindata.org/grapher/students-in-grade-2-who-cant-read-a-single-word-ca-2015) of second-grade primary-school students cannot read a single word of a short text; and in these countries, going to school for four or five years [guarantees basic literacy](https://www.cgdev.org/blog/measuring-quality-girls-education-across-developing-world) for only very few people. Even at a conceptual level there is a lack of consensus – national definitions of literacy that are based on educational attainment vary substantially from country to country. For example, in Greece, people are considered literate if they have finished six years of primary education, while in Paraguay people qualify as literate if they have completed two years of primary school.{ref}You can find more details about this in [Chapter 6 of the Education for All Global Monitoring Report (2006)](http://www.unesco.org/education/GMR2006/full/chapt6_eng.pdf).{/ref} ### New perspectives through standardized literacy tests Given the limitations of self-reported or indirectly inferred literacy estimates, efforts are being made at both national and international levels to conduct standardized literacy tests to assess proficiency in a systematic way. In particular, large cross-country assessment surveys have been developed to overcome the challenges of producing comparable literacy data.
Two important examples are the Program for the International Assessment of Adult Competencies (PIAAC), which is a test used for measuring literacy mostly in rich countries; and the Literacy Assessment and Monitoring Programme (LAMP), which is a household assessment aimed at measuring literacy skills in developing countries, while remaining comparable across countries, languages, and scripts.{ref}The LAMP tests three skill domains: reading continuous texts (prose), reading non-continuous texts (documents), and numeracy skills. Each skill domain is divided into three performance levels: the top 30% of respondents at Level 3, the middle 40% at Level 2, and the bottom 30% at Level 1. The LAMP tests have only recently been field tested. The LAMP structure in these pilots (implemented in Jordan, Mongolia, Palestine, and Paraguay) is as follows: * Background questionnaire: collects information on the respondent’s educational attainment, self-reported literacy, literacy use in and outside of work, and occupation, amongst other information. * Filter test: Booklet with 17 items to determine whether the respondent takes Module A (for low performers) or Module B (for high performers). * Module A: Testing prose, document, and numeracy items. * Module B: Respondent is randomly assigned Booklet 1 or Booklet 2, both testing prose, document, and numeracy skills for high performers of the filter test. {/ref} The LAMP tests have only recently been field tested in four countries: Jordan, Mongolia, Palestine, and Paraguay. The PIAAC tests, on the other hand, have been administered in about 30 countries, and the results are shown in the chart.{ref}The World Bank's STEP Skills Measurement Program also provides a direct assessment of reading proficiency and related competencies scored on the same scale as the OECD's PIAAC. This adds another 11 countries to the coverage of PIAAC tests.
You can read more about it here: [http://microdata.worldbank.org/index.php/catalog/step/about](http://microdata.worldbank.org/index.php/catalog/step/about){/ref} We only have these tests for a few countries, but we can still see that there is an overall positive correlation. Moreover, we see that there is substantial variation in scores even for countries with identical and almost perfect literacy rates (e.g. Japan vs Italy). This confirms that PIAAC tests capture a related but broader concept of literacy. ## Reconstructing estimates from the past ### Census data Literacy, under the UNESCO definition of "people who can, with understanding, read and write a short, simple statement on their everyday life", started being recorded in census data from the end of the 19th century onwards. Hence, despite variation in the types of questions and census instruments used, these historical census data remain the best source of data on literacy for the period prior to 1990. The longest series attempting to reconstruct literacy estimates from historical census data is provided in the OECD report ["How Was Life? Global Well-being since 1820"](http://www.oecd-ilibrary.org/economics/how-was-life_9789264214262-en), published in 2014. This is the source used in the first chart in this blog post for the period 1800 to 1990.{ref}The OECD report relies on a number of underlying sources. For the period before 1950, the underlying source is the UNESCO report on the Progress of Literacy in Various Countries Since 1900 (about 30 countries). For the mid-20th century, the underlying source is the UNESCO report on Illiteracy at Mid-Century (about 36 additional countries). And up to 1970, the source is UNESCO statistical yearbooks.{/ref} How accurate are these UNESCO estimates? The evidence suggests that the narrow concept of literacy measured in early census data provides an imperfect, yet informative account of literacy skills.
The chart is a good example: As we can see, as early as 1947, census estimates from the US correlate strongly with educational attainment, as one would expect. ### The relationship between educational attainment and literacy over time Importantly, the correlation between educational attainment and literacy also holds across countries and over time. The chart shows this by plotting changes in literacy rates and average years of schooling. Each country in this chart is represented by a line, where the beginning and ending points correspond to the first and last available observations of these two variables over the period 1970-2010. (As we mention above, before 1990 almost all observations correspond to census data.) As we can see, literacy rates tend to be much higher in countries where people have more years of education; and as average years of education go up in a country, literacy rates also go up. ### Official estimates vs one-sentence test results Countries with high literacy rates also tend to have higher results in the basic literacy tests included in the DHS surveys (this is a test that requires survey respondents to read a sentence shown to them). As we can see in the chart, there is obviously some noise, but these two variables are closely related. ### Other historical sources When census data is not available, a common method used to estimate literacy is to calculate the [share of people who could sign official documents](https://ourworldindata.org/grapher/historical-literacy-in-england-by-sex) (e.g. court documents, marriage certificates, etc.).{ref}Since estimates of signed documents tend to rely on small samples (e.g. parish documents from specific towns), researchers often rely on additional assumptions to extrapolate estimates to the national level. For example, Bob Allen provides estimates of the evolution of literacy in Europe between 1500 and 1800 using data on urbanization rates. For more details see Allen, R. C. (2003).
[Progress and poverty in early modern Europe](https://web.archive.org/web/20190122001259/http://www.econ.queensu.ca/files/other/allen03.pdf). The Economic History Review, 56(3), 403-443.{/ref} As the researcher Jeremiah Dittmar [explains](http://www.jeremiahdittmar.com/files/Print-Welfare-Paper-5.27.2011.pdf), this approach only gives a lower bound on literacy estimates, because the number of people who could read was higher than the number who could write. Other methods have therefore been proposed that rely instead on historical estimates of people who could read. For example, researchers Eltjo Buringh and Jan Luiten van Zanden deduce [literacy rates from estimated per capita book consumption](https://ourworldindata.org/grapher/estimated-historical-literacy-rates?country=GBR+BEL+FRA+DEU+IRL+ITA+NLD+POL+ESP+SWE+Western%20Europe).{ref}They use a demand equation that links book consumption to a number of factors, including literacy and book prices. For more details see Buringh, E. and Van Zanden, J.L., 2009. Charting the “Rise of the West”: Manuscripts and Printed Books in Europe, a long-term Perspective from the Sixth through Eighteenth Centuries. The Journal of Economic History, 69(2), pp.409-445.{/ref} As Buringh and Van Zanden show, their estimates based on book consumption are different, but still fairly close to alternative estimates based on signed documents. ## Concluding remarks Literacy is a key skill and a key measure of a population's education. However, measuring literacy is difficult because literacy is a complex, multi-dimensional skill. For decades, most countries have been measuring literacy through yes/no, self-reported answers to a single question in a survey or census along the lines of "can you read and write?". These estimates provide a basic perspective on literacy skills.
They tell us something meaningful about broad changes in education in the long run, since the changes across decades are much larger than the underlying error margins at any point in time. But they remain insufficient to fully characterise literacy skills or to understand the challenges ahead. As populations become more educated, we need more accurate instruments to measure abilities – asking people whether they can read and write is insufficient to meaningfully detect differences in the way people apply skills for work and life. Efforts have been made in recent years, both at national and international levels, to develop standardised, tailor-made survey instruments that measure literacy skills more accurately through tests. Some countries already implement these tests, but it will take several years for these new instruments to become the norm for measuring and reporting literacy rates internationally. | { "id": 17539, "date": "2018-06-08T09:44:50", "guid": { "rendered": "https://owid.cloud/?p=17539" }, "link": "https://owid.cloud/how-is-literacy-measured", "meta": { "owid_publication_context_meta_field": { "latest": true, "homepage": true, "immediate_newsletter": true } }, "slug": "how-is-literacy-measured", "tags": [ 114 ], "type": "post", "title": { "rendered": "How is literacy measured?"
}, "_links": { "self": [ { "href": "https://owid.cloud/wp-json/wp/v2/posts/17539" } ], "about": [ { "href": "https://owid.cloud/wp-json/wp/v2/types/post" } ], "author": [ { "href": "https://owid.cloud/wp-json/wp/v2/users/10", "embeddable": true } ], "curies": [ { "href": "https://api.w.org/{rel}", "name": "wp", "templated": true } ], "replies": [ { "href": "https://owid.cloud/wp-json/wp/v2/comments?post=17539", "embeddable": true } ], "wp:term": [ { "href": "https://owid.cloud/wp-json/wp/v2/categories?post=17539", "taxonomy": "category", "embeddable": true }, { "href": "https://owid.cloud/wp-json/wp/v2/tags?post=17539", "taxonomy": "post_tag", "embeddable": true } ], "collection": [ { "href": "https://owid.cloud/wp-json/wp/v2/posts" } ], "wp:attachment": [ { "href": "https://owid.cloud/wp-json/wp/v2/media?parent=17539" } ], "version-history": [ { "href": "https://owid.cloud/wp-json/wp/v2/posts/17539/revisions", "count": 29 } ], "wp:featuredmedia": [ { "href": "https://owid.cloud/wp-json/wp/v2/media/44300", "embeddable": true } ], "predecessor-version": [ { "id": 48864, "href": "https://owid.cloud/wp-json/wp/v2/posts/17539/revisions/48864" } ] }, "author": 10, "format": "standard", "status": "publish", "sticky": false, "content": { "rendered": "\n<div class=\"blog-info\">Our World in Data presents the data and research to make progress against the world\u2019s largest problems.\n<p>This post draws on data and research discussed in our entry on <strong><a href=\"https://ourworldindata.org/literacy\" target=\"_blank\" rel=\"noopener noreferrer\">Literacy</a></strong>.</p>\n</div>\n\n\n\n<p>The chart shows global literacy rates among adults since 1800. 
This is a powerful graph: it tells us that over the last two centuries the share of illiterate adults has gone down from 88% to less than 14%.</p>\n\n\n\n<p>This global perspective on education leads to a natural question: What does it actually mean that a person is \u2018literate\u2019 in these international statistics?</p>\n\n\n\n<p>Literacy rates are only a proxy for what we actually care about, namely literacy <em>skills</em>. The distinction matters because literacy skills are complex and span over a range of proficiency shades, while literacy rates assume a sharp, binary distinction between those who are and aren\u2019t \u2018literate\u2019.</p>\n\n\n\n<p>What is the definition of literacy that underlies the estimates in the chart? And how do these estimates compare to other measures of educational achievement and literacy skills?</p>\n\n\n\n<p>In this post we answer these questions. We begin with an overview of recent estimates published by UNESCO, and then move on to discuss long-run estimates that rely on historical data.</p>\n\n\n\n<p><iframe style=\"width: 100%; height: 600px; border: 0px none;\" src=\"https://ourworldindata.org/grapher/literate-and-illiterate-world-population\"></iframe></p>\n\n\n\n<h3>Measurement today</h3>\n\n\n\n<h4>Methodologies for measuring literacy</h4>\n\n\n\n<p>Let’s start by taking a look at recent estimates of literacy. Specifically, the estimates of literacy rates compiled by UNESCO from different sources.</p>\n\n\n\n<p>In the chart we present a breakdown of these estimates, showing the main methodologies that countries use to measure literacy, and how these have changed over time. 
(To explore changes across time, use the slider underneath the map.)</p>\n\n\n\n<p>The breakdown covers four categories: self-reported literacy declared directly by individuals, self-reported literacy declared by the head of the household, tested literacy from proficiency examinations, and indirect estimation or extrapolation.</p>\n\n\n\n<p>In most cases, the categories covering ‘self-reports’ (green and orange) correspond to estimates of literacy that rely on answers provided to a simple yes/no question asking people if they can read and write.</p>\n\n\n\n<p>The category ‘indirect estimation’ (red) corresponds mainly to estimates that rely on indirect evidence from educational attainment, usually based on the highest degree of completed education.</p>\n\n\n\n<p>In <a href=\"https://owid.cloud/app/uploads/2018/03/Literacy-measurement-OWID-metadata.xlsx\" target=\"_blank\" rel=\"noopener noreferrer\">this table</a> you can find details regarding all literacy definitions and sources, country by country, and how we categorised them for the purpose of this chart.</p>\n\n\n\n<p>This chart tells us that:</p>\n\n\n\n<ul><li>There is substantial cross-country variation, with recent estimates covering all four measurement methods.</li><li>There is variation within countries across time (e.g. Mexico switches between self-reports and extrapolation).</li><li>The number of countries that base their estimates on self-reports and testing is increasing.</li></ul>\n\n\n\n<p><iframe style=\"width: 100%; height: 600px; border: 0px none;\" src=\"https://ourworldindata.org/grapher/mode-of-reporting-literacy-rates\"></iframe></p>\n\n\n\n<h4>Data sources for measuring literacy</h4>\n\n\n\n<p>Another way to dissect the same data is to classify literacy estimates according to the type of measurement instrument used to collect the relevant data. Which countries use household sampling instruments such as UNICEF’s <em>Multiple Indicator Cluster Surveys</em>? Which countries use census data?
And which countries do not collect literacy data directly, but rely instead on other sources?</p>\n\n\n\n<p>In the chart we explore this, splitting estimates into three categories: sampling, including data from literacy tests and household surveys; census data; and other instruments (e.g. administrative data on school enrollment).</p>\n\n\n\n<p>Here we can see that most countries use sampling instruments (coded as ‘surveys’ in the map), although in the past census data was more common. Literacy surveys have the potential to be more accurate \u2013 when the sampling is done correctly \u2013 because they allow for more specific and detailed measurement than the short, generic questions in population censuses. Below we discuss this further.</p>\n\n\n\n<p><iframe style=\"width: 100%; height: 600px; border: 0px none;\" src=\"https://ourworldindata.org/grapher/literacy-rate-source-type\"></iframe></p>\n\n\n\n<h4>Data quality: Challenges and limitations</h4>\n\n\n\n<p>As mentioned above, recent data on literacy is often based on a single question included in national population censuses or household surveys presented to respondents above a certain age, where literacy skills are self-reported. The question is often phrased as “can you read and write?”. These self-reports of literacy skills have several limitations:</p>\n\n\n\n<ul><li>Simple questions such as “can you read and write?” frame literacy as a skill you either possess or do not, when in reality literacy is a multi-dimensional skill that exists on a continuum.</li><li>Self-reports are subjective, in that the question is dependent on what each individual understands by “reading” and “writing”. The form of a word may be familiar enough for a respondent to recall its sound or meaning without actually \u2018reading\u2019 it.
Similarly, a respondent asked to write out their name to demonstrate writing ability may \u2018draw\u2019 a familiar shape rather than produce written text with meaning.</li><li>In many cases surveys ask only one individual to report literacy on behalf of the entire household. This indirect reporting potentially introduces further noise, in particular when it comes to estimating literacy among women and children, since these groups are less often considered ‘head of household’ in the surveys.</li></ul>\n\n\n\n<p>Similarly, inferring literacy from data on educational attainment is also problematic, since schooling does not produce literacy in the same way everywhere: Proficiency tests show that in many low-income countries, a <a href=\"https://ourworldindata.org/grapher/students-in-grade-2-who-cant-read-a-single-word-ca-2015\" target=\"_blank\" rel=\"noopener noreferrer\">large fraction</a> of second-grade primary-school students cannot read a single word of a short text; and in these countries, going to school for four or five years <a href=\"https://www.cgdev.org/blog/measuring-quality-girls-education-across-developing-world\" target=\"_blank\" rel=\"noopener noreferrer\">guarantees basic literacy</a> for very few people.</p>\n\n\n\n<p>Even at a conceptual level there is a lack of consensus \u2013 national definitions of literacy that are based on educational attainment vary substantially from country to country.
For example, in Greece people are considered literate if they have finished six years of primary education, while in Paraguay you qualify as literate if you have completed two years of primary school.{ref}You can find more details about this in <a href=\"http://www.unesco.org/education/GMR2006/full/chapt6_eng.pdf\" target=\"_blank\" rel=\"noopener noreferrer\">Chapter 6 of the Education for All Global Monitoring Report (2006)</a>.{/ref}</p>\n\n\n\n<h4>New perspectives through standardized literacy tests</h4>\n\n\n\n<p>Given the limitations of self-reported or indirectly inferred literacy estimates, efforts are being made at both national and international levels to conduct standardized literacy tests that assess proficiency in a systematic way.</p>\n\n\n\n<p>In particular, large cross-country assessment surveys have been developed to overcome the challenges of producing comparable literacy data. Two important examples are the Programme for the International Assessment of Adult Competencies (PIAAC), a test used to measure literacy mostly in rich countries; and the Literacy Assessment and Monitoring Programme (LAMP), a household assessment aimed at measuring literacy skills in developing countries while remaining comparable across countries, languages, and scripts.{ref}The LAMP tests three skill domains: reading continuous texts (prose), reading non-continuous texts (documents), and numeracy skills. Each skill domain is divided into three performance levels, with the top 30% of respondents at Level 3, the middle 40% at Level 2, and the bottom 30% at Level 1.</p>\n\n\n\n<p>The LAMP tests have only recently been field tested.
The LAMP structure in these pilots (implemented in Jordan, Mongolia, Palestine, and Paraguay) is as follows:</p>\n\n\n\n<ul><li>Background questionnaire: collects information on the respondent\u2019s educational attainment, self-reported literacy, literacy use in and outside of work, and occupation, amongst other information.</li><li>Filter test: a booklet with 17 items that determines whether the respondent takes Module A (for low performers) or Module B (for high performers).</li><li>Module A: tests prose, document, and numeracy items.</li><li>Module B: the respondent is randomly assigned Booklet 1 or Booklet 2, both testing prose, document, and numeracy skills for high performers of the filter test.</li></ul>\n\n\n\n<p>{/ref}</p>\n\n\n\n<p>The LAMP tests have only recently been field tested in four countries: Jordan, Mongolia, Palestine, and Paraguay. The PIAAC tests, on the other hand, have been administered in about 30 countries, and the results are shown in the chart.{ref}The World Bank’s STEP Skills Measurement Program also provides a direct assessment of reading proficiency and related competencies scored on the same scale as the OECD’s PIAAC. This adds another 11 countries to the coverage of PIAAC tests. You can read more about it here: <a rel=\"noopener noreferrer\" href=\"http://microdata.worldbank.org/index.php/catalog/step/about\" target=\"_blank\">http://microdata.worldbank.org/index.php/catalog/step/about</a>{/ref}</p>\n\n\n\n<p>We only have these tests for a few countries, but we can still see that there is an overall positive correlation. Moreover, we see that there is substantial variation in scores even for countries with identical and almost perfect literacy rates (e.g. Japan vs Italy).
This confirms that PIAAC tests capture a related, but broader, concept of literacy.</p>\n\n\n\n<p><iframe style=\"width: 100%; height: 600px; border: 0px none;\" src=\"https://owid.cloud/grapher/average-literacy-score-piaac-test-vs-national-literacy-rate-cia\"></iframe></p>\n\n\n\n<h3>Reconstructing estimates from the past</h3>\n\n\n\n<h4>Census data</h4>\n\n\n\n<p>Literacy, under the UNESCO definition of \u201cpeople who can, with understanding, read and write a short, simple statement on their everyday life\u201d, started being recorded in census data from the end of the 19th century onwards. Hence, despite variation in the types of questions and census instruments used, these historical census data remain the best source of data on literacy for the period prior to 1990.</p>\n\n\n\n<p>The longest series attempting to reconstruct literacy estimates from historical census data is provided in the OECD report <a href=\"http://www.oecd-ilibrary.org/economics/how-was-life_9789264214262-en\" target=\"_blank\" rel=\"noopener noreferrer\">\u201cHow Was Life? Global Well-being since 1820\u201d</a>, published in 2014. This is the source used in the first chart in this blog post for the period 1800 to 1990.{ref}The OECD report relies on a number of underlying sources. For the period before 1950, the underlying source is the UNESCO report on the Progress of Literacy in Various Countries Since 1900 (about 30 countries). For the mid-20th century, the underlying source is the UNESCO report on Illiteracy at Mid-Century (about 36 additional countries). And up to 1970, the source is UNESCO statistical yearbooks.{/ref}</p>\n\n\n\n<p>How accurate are these UNESCO estimates?</p>\n\n\n\n<p>The evidence suggests that the narrow concept of literacy measured in early census data provides an imperfect, yet informative account of literacy skills.
The chart is a good example: As we can see, as early as 1947, census estimates from the US correlate strongly with educational attainment, as one would expect.</p>\n\n\n\n<p><iframe style=\"width: 100%; height: 600px; border: 0px none;\" src=\"https://owid.cloud/grapher/literacy-by-years-of-schooling-us-1947\"></iframe></p>\n\n\n\n<h4>The relationship between educational attainment and literacy over time</h4>\n\n\n\n<p>Importantly, the correlation between educational attainment and literacy also holds across countries and over time. The chart shows this by plotting changes in literacy rates and average years of schooling. Each country in this chart is represented by a line, where the beginning and ending points correspond to the first and last available observation of these two variables over the period 1970-2010. (As we mention above, before 1990 almost all observations correspond to census data.)</p>\n\n\n\n<p>As we can see, literacy rates tend to be much higher in countries where people have more years of education; and as average years of education go up in a country, literacy rates also go up.</p>\n\n\n\n<p><iframe style=\"width: 100%; height: 600px; border: 0px none;\" src=\"https://owid.cloud/grapher/literacy-rates-vs-average-years-of-schooling\"></iframe></p>\n\n\n\n<h4>Official estimates vs one-sentence test results</h4>\n\n\n\n<p>Countries with high literacy rates also tend to have higher results in the basic literacy tests included in the DHS surveys (a test that requires survey respondents to read a sentence shown to them).
As we can see in the chart, there is obviously noise, but these two variables are closely related.</p>\n\n\n\n<p><iframe style=\"width: 100%; height: 600px; border: 0px none;\" src=\"https://ourworldindata.org/grapher/literacy-rate-adult-total-dhs-surveys-vs-unesco\"></iframe></p>\n\n\n\n<h4>Other historical sources</h4>\n\n\n\n<p>When census data is not available, a common method used to estimate literacy is to calculate the <a href=\"https://ourworldindata.org/grapher/historical-literacy-in-england-by-sex\" target=\"_blank\" rel=\"noopener noreferrer\">share of people who could sign official documents</a> (e.g. court documents, marriage certificates, etc).{ref}Since estimates of signed documents tend to rely on small samples (e.g. parish documents from specific towns), researchers often rely on additional assumptions to extrapolate estimates to the national level. For example, Bob Allen provides estimates of the evolution of literacy in Europe between 1500 and 1800 using data on urbanization rates. For more details see Allen, R. C. (2003). <a href=\"https://web.archive.org/web/20190122001259/http://www.econ.queensu.ca/files/other/allen03.pdf\">Progress and poverty in early modern Europe</a>. The Economic History Review, 56(3), 403-443.{/ref} As the researcher Jeremiah Dittmar <a href=\"http://www.jeremiahdittmar.com/files/Print-Welfare-Paper-5.27.2011.pdf\" target=\"_blank\" rel=\"noopener noreferrer\">explains</a>, this approach only gives a lower bound on literacy estimates, because the number of people who could read was higher than the number who could write.</p>\n\n\n\n<p>Other methods have therefore been proposed that rely instead on historical estimates of the number of people who could read.
For example, researchers Eltjo Buringh and Jan Luiten van Zanden deduce <a rel=\"noopener noreferrer\" href=\"https://ourworldindata.org/grapher/estimated-historical-literacy-rates?country=GBR+BEL+FRA+DEU+IRL+ITA+NLD+POL+ESP+SWE+Western%20Europe\" target=\"_blank\">literacy rates from estimated per capita book consumption</a>.{ref}They use a demand equation that links book consumption to a number of factors, including literacy and book prices. For more details see Buringh, E. and Van Zanden, J.L., 2009. Charting the \u201cRise of the West\u201d: Manuscripts and Printed Books in Europe, a long-term Perspective from the Sixth through Eighteenth Centuries. The Journal of Economic History, 69(2), pp.409-445.{/ref} As Buringh and Van Zanden show, their estimates based on book consumption differ from, but remain fairly close to, alternative estimates based on signed documents.</p>\n\n\n\n<h3>Concluding remarks</h3>\n\n\n\n<p>Literacy is a key skill and a key measure of a population’s education. However, measuring literacy is difficult because literacy is a complex, multi-dimensional skill.</p>\n\n\n\n<p>For decades, most countries have been measuring literacy through yes/no, self-reported answers to a single question in a survey or census along the lines of \u201ccan you read and write?\u201d.</p>\n\n\n\n<p>These estimates provide a basic perspective on literacy skills. They tell us something meaningful about broad changes in education in the long run, since the changes across decades are much larger than the underlying error margins at any point in time.
But they remain insufficient to fully characterise literacy skills or to understand the challenges ahead.</p>\n\n\n\n<p>As populations become more educated, we need more accurate instruments to measure abilities \u2013 asking people whether they can read and write is insufficient to meaningfully detect differences in the way people apply skills for work and life.</p>\n\n\n\n<p>Efforts have been made in recent years, both at national and international levels, to develop standardised, tailor-made survey instruments that measure literacy skills more accurately through tests. Several countries already implement these tests, but we will need to wait several years for these new instruments to become the norm for measuring and reporting literacy rates internationally.</p>\n", "protected": false }, "excerpt": { "rendered": "", "protected": false }, "date_gmt": "2018-06-08T08:44:50", "modified": "2022-02-11T16:11:47", "template": "", "categories": [ 82, 1 ], "ping_status": "closed", "authors_name": [ "Esteban Ortiz-Ospina", "Diana Beltekian" ], "modified_gmt": "2022-02-11T16:11:47", "comment_status": "closed", "featured_media": 44300, "featured_media_paths": { "thumbnail": "/app/uploads/2018/06/Screen-Shot-2021-07-19-at-09.35.03-150x72.png", "medium_large": "/app/uploads/2018/06/Screen-Shot-2021-07-19-at-09.35.03.png" } } |