censobr: Download Data from Brazil's Population Census

{censobr} is an R package to download data from Brazil's Population Census. It provides a simple and efficient way to download and read the data sets and documentation of all population censuses taken in the country since 1960. The package is built on top of the Arrow platform, which allows users to work with larger-than-memory census data using familiar {dplyr} functions.

Installation

# install from CRAN
install.packages("censobr")

# or use the development version with latest features
utils::remove.packages('censobr')
remotes::install_github("ipeaGIT/censobr", ref="dev")
library(censobr)

Basic usage

The package currently includes 6 main functions to download & read census data:

  1. read_population()
  2. read_households()
  3. read_mortality()
  4. read_families()
  5. read_emigration()
  6. read_tracts()

{censobr} also includes a few support functions to help users navigate the documentation of Brazilian censuses, providing convenient information on data variables and methodology:

  1. data_dictionary()
  2. questionnaire()
  3. interview_manual()

Finally, the package includes three functions to help users manage the data cached locally:

  1. censobr_cache()
  2. set_censobr_cache_dir()
  3. get_censobr_cache_dir()
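
As a rough sketch of how the documentation helpers listed above might be called together, the hypothetical wrapper below opens the data dictionary, questionnaire and interview manual for one census. The function name open_census_docs() is made up for illustration, and the argument values dataset = "microdata" and type = "long" are assumptions about the accepted options; check each function's help page before relying on them.

```r
# Hypothetical helper: open the three documentation resources for one census.
# Assumptions: `dataset = "microdata"` and `type = "long"` are valid options
# for the chosen year. Nothing is downloaded until the function is called.
open_census_docs <- function(year = 2010) {
  censobr::data_dictionary(year = year, dataset = "microdata")
  censobr::questionnaire(year = year, type = "long")
  censobr::interview_manual(year = year)
  invisible(year)
}

# Usage (downloads and opens the files on first run):
# open_census_docs(2010)
```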

All {censobr} functions that read data follow the same logic, so any data set can be downloaded with a single line of code. Like this:

read_households(
  year,          # year of reference
  columns,       # select columns to read
  add_labels,    # add labels to categorical variables
  as_data_frame, # return an Arrow DataSet or a data.frame
  showProgress,  # show download progress bar
  cache,         # cache data for faster access later
  verbose        # whether to print informative messages
  )
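
For illustration, a call with every argument filled in might look like the sketch below. The year 2010 and the variable V0010 (assumed here to be the household sampling weight) are illustrative choices rather than values taken from the text above, and read_households_sample() is a hypothetical wrapper used so that nothing is downloaded until it is invoked.

```r
# Hypothetical helper: one fully specified call to read_households().
# Assumptions: the 2010 census is available and has a column named V0010.
read_households_sample <- function(year = 2010) {
  censobr::read_households(
    year          = year,                                    # year of reference
    columns       = c("code_muni", "abbrev_state", "V0010"), # read only these columns
    add_labels    = NULL,                                    # do not add labels
    as_data_frame = FALSE,                                   # keep the Arrow object
    showProgress  = TRUE,                                    # show download progress bar
    cache         = TRUE,                                    # reuse the local copy later
    verbose       = TRUE                                     # print informative messages
  )
}

# Usage (downloads the file on first run):
# hh <- read_households_sample(2010)
# names(hh)
```

Selecting only the columns needed keeps later queries lighter, since unused variables are never read.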

Note: all data sets in {censobr} are enriched with geography columns that follow the naming standards of the {geobr} package, which eases data manipulation and integration with spatial data from {geobr}. The added columns are: c('code_muni', 'code_state', 'abbrev_state', 'name_state', 'code_region', 'name_region', 'code_weighting').
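
Because the geography columns share names with {geobr}, census results can be joined to municipality geometries on code_muni. The sketch below is one possible way to do it, not the package's own method: join_census_to_municipalities() is a hypothetical helper, and it coerces the key to character on both sides on the assumption that the two sources may store the code with different types.

```r
# Hypothetical helper: attach census results to a table of municipalities.
# `census_df` is a collected data.frame with a `code_muni` column;
# `muni` is a municipality table (for example an sf object from {geobr}).
join_census_to_municipalities <- function(census_df, muni) {
  # Harmonise the key type before joining (assumption: types may differ).
  census_df$code_muni <- as.character(as.integer(census_df$code_muni))
  muni$code_muni      <- as.character(as.integer(muni$code_muni))
  dplyr::left_join(muni, census_df, by = "code_muni")
}

# Usage (both calls download data on first run):
# muni <- geobr::read_municipality(code_muni = "all", year = 2010)
# out  <- join_census_to_municipalities(my_census_summary, muni)
```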

Data cache

The first time the user runs a function, {censobr} will download the file and store it locally. This way, the data only needs to be downloaded once. When the cache parameter is set to TRUE (the default), the function will read the cached data, which is much faster.

  • censobr_cache(): can be used to list and/or delete data files cached locally
  • set_censobr_cache_dir(): can be used to set a custom cache directory for {censobr} files
  • get_censobr_cache_dir(): returns the path of the cache directory in use
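
The three cache functions above could be combined roughly as follows. This is a sketch under assumptions: use_custom_cache() is a hypothetical helper, the temporary folder is only an example location, and the list_files and delete_file arguments (including delete_file = "all") are assumed from the package help and should be checked there.

```r
# Hypothetical helper: point {censobr} at a custom cache folder and
# return the directory now in use.
use_custom_cache <- function(path = file.path(tempdir(), "censobr_cache")) {
  dir.create(path, recursive = TRUE, showWarnings = FALSE)
  censobr::set_censobr_cache_dir(path = path)
  censobr::get_censobr_cache_dir()
}

# Usage:
# use_custom_cache()
# censobr::censobr_cache(list_files = TRUE)    # list files cached locally
# censobr::censobr_cache(delete_file = "all")  # assumed: removes every cached file
```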

Larger-than-memory Data

Microdata of Brazilian censuses are often too big to load into the user's RAM. To avoid this problem, {censobr} will by default return an Arrow table, which can be analyzed like a regular data.frame using the dplyr package without loading the full data into memory.
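
A minimal sketch of this workflow is shown below. The dplyr verbs build a query against the Arrow object, and only collect() brings the (small) summarised result into memory. The helper name households_by_state() is hypothetical, and V0010 is assumed to be the household sampling weight in the 2010 census; consult the data dictionary for the right variable in other years.

```r
# Hypothetical helper: weighted number of households per state.
# `hh` can be the Arrow object returned by read_households() or a plain
# data.frame with the columns `abbrev_state` and `V0010` (assumed weight).
households_by_state <- function(hh) {
  hh |>
    dplyr::group_by(abbrev_state) |>
    dplyr::summarise(households = sum(V0010, na.rm = TRUE)) |>
    dplyr::collect()  # only the aggregated result is loaded into memory
}

# Usage (downloads the file on first run):
# hh <- censobr::read_households(year = 2010, as_data_frame = FALSE)
# households_by_state(hh)
```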

More info in the package vignette.

Contributing to censobr

If you would like to contribute to {censobr}, you're welcome to open an issue to explain the proposed contribution.


Related projects

As far as we know, {censobr} is the only R package that provides fast and convenient access to the complete data sets and documentation of Brazilian censuses. The microdadosBrasil package used to provide access to microdata of several public data sets, but unfortunately, it has been discontinued.

Credits

Original Census data is collected by the Brazilian Institute of Geography and Statistics (IBGE). The {censobr} package is developed by a team at the Institute for Applied Economic Research (Ipea), Brazil. If you want to cite this package, you can cite it as:

  • Pereira, Rafael H. M.; Barbosa, Rogério J. (2023) censobr: Download Data from Brazil's Population Census. R package version v0.4.0, https://CRAN.R-project.org/package=censobr. DOI: 10.32614/CRAN.package.censobr.

bibentry(
  bibtype     = "Manual",
  title       = "censobr: Download Data from Brazil's Population Census",
  author      = "Rafael H. M. Pereira [aut, cre] and Rogério J. Barbosa [aut]",
  year        = 2023,
  version     = "v0.4.0",
  url         = "https://CRAN.R-project.org/package=censobr",
  textVersion = "Pereira, R. H. M.; Barbosa, R. J. (2023) censobr: Download Data from Brazil's Population Census. R package version v0.4.0, <https://CRAN.R-project.org/package=censobr>."
)

  • Monthly downloads: 752
  • Version: 0.5.0
  • License: MIT + file LICENSE
  • Maintainer: Rafael H. M. Pereira
  • Last published: July 7th, 2025

Functions in censobr (0.5.0)

  • read_households: Download microdata of household records from Brazil's census
  • read_mortality: Download microdata of death records from Brazil's census
  • read_emigration: Download microdata of emigration records from Brazil's census
  • read_population: Download microdata of population records from Brazil's census
  • read_tracts: Download census tract-level data from Brazil's censuses
  • read_families: Download microdata of family records from Brazil's census
  • using_default_censobr_cache_dir: Check if user is using the default cache dir of censobr
  • error_missing_datasets: Error missing data sets
  • interview_manual: Interview manual of the data collection of Brazil's censuses
  • get_censobr_cache_dir: Get path to cache directory for censobr files
  • data_dictionary: Data dictionary of Brazil's census data
  • censobr_cache: Manage cached files from the censobr package
  • questionnaire: Questionnaires used in the data collection of Brazil's censuses
  • censobr: censobr: Download Data from Brazil's Population Census
  • download_file: Download file from url
  • merge_household_var: Add household variables to the data set
  • arrow_open_dataset: Safely use arrow to open a Parquet file
  • error_missing_years: Error missing years
  • cache_message: Message when caching file
  • set_censobr_cache_dir: Set custom cache directory for censobr files