The user-level disk cache has been removed to comply with CRAN's policy on
home-filespace writes. The package no longer writes to
~/.cache/realestatebr/ (or any other location outside the R session's
temporary directory).
get_dataset() is now two-tier: it tries the package's GitHub release
asset first and falls back to a fresh download from the original source.
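As a rough illustration of the two-tier lookup described above, the sketch below tries a release asset first and falls back to a fresh download. The helpers `fetch_github_asset()` and `fetch_fresh()` are hypothetical stand-ins rather than the package's internal functions, and the simulated failure exists only to exercise the fallback path.

```r
# Hypothetical stand-ins for the two tiers (assumptions, not package internals).
fetch_github_asset <- function(name) {
  stop("release asset unavailable for ", name)  # simulate a failed first tier
}
fetch_fresh <- function(name) {
  data.frame(dataset = name, origin = "fresh")
}

get_two_tier <- function(name, source = c("auto", "github", "fresh")) {
  source <- match.arg(source)  # rejects any other value, e.g. "cache"
  switch(source,
    github = fetch_github_asset(name),
    fresh = fetch_fresh(name),
    # "auto": try the release asset, then fall back to the original source
    auto = tryCatch(
      fetch_github_asset(name),
      error = function(e) fetch_fresh(name)
    )
  )
}

result <- get_two_tier("abecip")
```

Because `match.arg()` validates `source`, a call such as `get_two_tier("abecip", source = "cache")` fails immediately instead of silently falling through.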
- The source argument no longer accepts "cache"; valid values are
"auto", "github", and "fresh". The max_age argument has been
removed (cache freshness is now managed by the weekly CI/CD release
pipeline).
- get_dataset() calls within one R session are served from a
package-private in-memory environment. Use the new
clear_session_cache() function to drop it.
- clear_user_cache(), check_cache_status(), and
update_cache_from_github() have been removed. There is no user cache
to manage.
- Individual dataset functions (get_abecip_indicators(), get_secovi(),
get_rppi_*(), etc.) no longer accept a cached argument. They always
download from the original source; use get_dataset(source = "github")
for the pre-processed asset.
- piggyback (formerly Suggests) and rappdirs (formerly Imports) have
been dropped. GitHub release assets are now fetched directly via
httr::GET() against the public release-asset URL, avoiding the
transitive gh::gh() cache writes that also violated CRAN policy.
- nre_ire has been removed. It required fully manual updates and had no
automated pipeline support; it will be reconsidered for a future release.
- property_records has been removed because the upstream data source is no
longer available.
- cbic has been removed. The upstream CBIC portal migrated to a
restricted-access platform. The five cement tables will be rebuilt from
IBGE open data in a future release.
- itbi_summary and the internal ITBI helpers (get_itbi, get_itbi_bhe)
have been removed. They were incomplete (single-municipality coverage) and
are deferred to a future version.
- get_dataset("bcb_series") now returns only four columns: date,
code_bcb, name_simplified, and value. Full metadata is available
via bcb_metadata.
- The table argument of get_dataset("bcb_series") now accepts a hierarchy level: "core" (default),
"primary", "secondary", "tertiary", or "full". The levels are
cumulative: "primary" includes all "core" series plus key macro
indicators such as SELIC, IPCA, and INCC. Previously the argument
accepted BCB category names ("credit", "price", etc.).
- bcb_metadata gains a hierarchy column (integer 1–4) that records the
relevance tier assigned to each series.
- get_dataset("rppi", table = "all") now returns two additional columns:
transaction_type ("sale" or "rent") and source (the index name,
e.g., "IGMI-R", "IVG-R", "FipeZap"). Previously the stacked table
had no way to distinguish transaction type from index source.
- Fixed "Sheets not found" errors after FIPE added a new summary sheet. Sheet
selection now uses numeric indices to avoid Latin-1/UTF-8 mismatches in
accented sheet names.
- Fixed get_range() failing with a malformed cell range when tidyxl
received a sheet name with a mismatched encoding attribute. The function
now derives the sheet lookup key directly from the cells returned by
tidyxl rather than from the user-supplied string.
- Restored fetch_fgv_local(), which was inadvertently dropped during
earlier refactoring but is still called by the targets pipeline to
process the manually-maintained FGV IBRE CSV export.
- Adopted a download_* / clean_* naming
convention: download_* functions return a file path; clean_* functions
parse and tidy the file at that path into a tibble.
- Replaced tryCatch with rlang::try_fetch throughout, using
parent = cnd to preserve the original error chain.
- Dataset functions now call validate_dataset_params(),
attach_dataset_metadata(), and validate_dataset() from
R/helpers_dataset.R and R/helpers_download.R instead of
re-implementing these patterns inline.
- Fixed stale target names in .github/workflows/update_data_weekly.yml.
The workflow was silently skipping abecip and abrainc targets on every
automated run; those entries now use the current granular target names
(abecip_sbpe_data, abecip_units_data, abrainc_indicator_data, etc.).
- Added fgv_ibre_file to the weekly and all target groups so the FGV
file-change target is included in scheduled runs.
- Updated the bcb_series table reference in vignettes/getting-started.Rmd
to use the new hierarchy levels ("core", "primary", "secondary",
"tertiary", "full") instead of the removed category names.
- Updated the rppi_bis table listing to include detailed_annual and
detailed_halfyearly, which were previously omitted.
- get_secovi.R
- Added URL field to DESCRIPTION
- @source tag for dim_city dataset documentation
- Fixed typo in b3_real_estate documentation ("mian" -> "main")
- skip_on_cran() definition that shadowed testthat

Version 0.6.0 introduces an intelligent cache freshness detection system with relaxed defaults to avoid annoying users with unnecessary warnings.
- get_cache_age(): Returns cache age in days for any dataset
- is_cache_stale(): Checks if cache exceeds recommended freshness thresholds
- check_cache_status(): Diagnostic function showing status of all cached datasets

Cache warnings only appear when data is significantly stale (exceeds 2x the update frequency):

- max_age parameter in get_dataset(): Force fresh download if cache exceeds specified age

All datasets in inst/extdata/datasets.yaml now include:
- update_schedule: "weekly", "monthly", or "manual"
- warn_after_days: Custom threshold for staleness warnings (NULL for manual datasets)

# Check status of all cached datasets
check_cache_status()
# Get age of specific dataset
get_cache_age("bcb_series")
# Check if dataset is stale (uses relaxed defaults from registry)
is_cache_stale("bcb_series")
# Advanced: Force very fresh data (< 1 day old)
get_dataset("bcb_series", max_age = 1)
# Advanced: Only use cache if less than 3 days old
get_dataset("rppi", table = "sale", max_age = 3)
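The relaxed staleness rule described above (warn only when the cache age exceeds twice the update frequency) can be sketched roughly as follows. The function name `is_stale_sketch()` and the day counts assigned to each schedule are assumptions made for illustration; the package reads its real thresholds from `warn_after_days` in the registry.

```r
# Assumed nominal frequency, in days, for each update_schedule value.
schedule_days <- c(weekly = 7, monthly = 30)

# Hypothetical helper: TRUE only when the cache is older than the threshold.
is_stale_sketch <- function(age_days, update_schedule, warn_after_days = NULL) {
  if (update_schedule == "manual") {
    return(FALSE)  # manual datasets carry no threshold (warn_after_days is NULL)
  }
  if (is.null(warn_after_days)) {
    # Relaxed default: twice the nominal update frequency
    warn_after_days <- 2 * schedule_days[[update_schedule]]
  }
  age_days > warn_after_days
}

is_stale_sketch(10, "weekly")   # FALSE: within the 14-day threshold
is_stale_sketch(15, "weekly")   # TRUE: past the 14-day threshold
is_stale_sketch(400, "manual")  # FALSE: manual datasets never warn
```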
Version 0.6.0 introduces 7 generic helper functions that consolidate 890 lines of repetitive code patterns across dataset functions.
| File | Before | After | Lines Saved | % Reduction |
|------|--------|-------|-------------|-------------|
| get_abecip_indicators.R | 551 | 431 | 120 | 21.8% |
| get_abrainc_indicators.R | 544 | 445 | 99 | 18.2% |
| get_secovi.R | 438 | 356 | 82 | 18.7% |
| get_bcb_series.R | 334 | 278 | 56 | 16.8% |
| get-dataset.R | 833 | 773 | 60 | 7.2% |
| TOTAL | 2,700 | 2,283 | 417 | 15.4% |
validate_dataset_params() (R/helpers-dataset.R)
handle_dataset_cache() (R/helpers-dataset.R)
attach_dataset_metadata() (R/helpers-dataset.R)
validate_dataset() (R/helpers-dataset.R)
validate_excel_file() (R/helpers-dataset.R)
download_with_retry() (R/rppi-helpers.R - REUSED)
apply_table_filtering() (R/get-dataset.R)
See .claude/phase3_completion_summary.md for complete details.
Version 0.6.0 removes 8 deprecated functions from the public API. These functions are now internal-only.
Removed from NAMESPACE: 8 deprecated functions no longer exported:
- get_abecip_indicators()
- get_abrainc_indicators()
- get_bcb_realestate()
- get_bcb_series()
- get_fgv_ibre()
- get_nre_ire()
- get_rppi_bis()
- get_secovi()

get_dataset()

These functions were deprecated in v0.4.0 (18+ months ago). Users must now use get_dataset():
# Old way (NO LONGER WORKS):
data <- get_secovi()
data <- get_bcb_series(table = "price")
data <- get_abecip_indicators(table = "sbpe")
# New way (REQUIRED):
data <- get_dataset("secovi")
data <- get_dataset("bcb_series", "price")
data <- get_dataset("abecip", "sbpe")
- One public data function (get_dataset()) instead of 15+
- Renamed get_from_legacy_function() → get_from_internal_function()

Files changed: R/get-dataset.R

- suppress_external_warnings() - Never called
- explore_cbic_structure() - Only in examples
- get_cbic_files() - Only in examples
- get_cbic_materials() - Only in examples

get_cbic_steel() and get_cbic_pim():
- attr(result, "source")
- attr(result, "download_time")
- attr(result, "download_info")
- steel_prices and pim tables now accessible
- steel_production remains blocked (has data quality issues)

Files changed: R/get_cbic.R
Version 0.6.0 removes usage examples from deprecated legacy functions to simplify the codebase. Since we are pre-1.0.0, this is an acceptable breaking change.
Removed: All @examples blocks from 8 deprecated functions:
- get_secovi()
- get_bcb_realestate()
- get_abrainc_indicators()
- get_abecip_indicators()
- get_rppi_bis()
- get_bcb_series()
- get_fgv_ibre()
- get_nre_ire()

- Removed: Verbose @section blocks (Progress Reporting, Error Handling)
- Simplified: @details sections to 1-3 essential lines
- Enhanced: @section Deprecation blocks with code migration examples
- get_dataset() instead

These functions were deprecated in v0.4.0. Users should migrate to the modern API:
# Old way (still works, but no longer documented with examples):
data <- get_secovi()
# New way (recommended):
data <- get_dataset("secovi")
Full migration examples are available in each function's @section Deprecation block.
- get_dataset() interface

Fixed SECOVI dataset to return all categories by default instead of only "condo"
Problem: get_dataset("secovi") was only returning the "condo" category (1,939 rows) instead of all categories (9,398 rows). This caused test failures for launch/rent/sale tables.
Root Cause: When no table parameter was specified, the code defaulted to the first category alphabetically ("condo"), rather than fetching all categories.
Solution:
- Added default_table configuration support in datasets.yaml
- Updated validate_and_resolve_table() to check for default_table setting
- Set default_table: "all" in registry

Impact:
# Now returns all categories by default
get_dataset("secovi") # → 9,398 rows, 4 categories ✅
# Specific tables still work correctly
get_dataset("secovi", "launch") # → 780 rows
get_dataset("secovi", "rent") # → 2,779 rows
get_dataset("secovi", "sale") # → 3,900 rows
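The resolution order behind this fix can be sketched as follows: an explicit `table` wins, then the registry's `default_table`, and only then the first table listed. The in-memory `registry` list and the function `resolve_table_sketch()` are hypothetical illustrations; the real logic lives in `validate_and_resolve_table()` and reads `datasets.yaml`, and the "abecip" entry is included only to show the no-default case.

```r
# Hypothetical in-memory stand-in for the datasets.yaml registry.
registry <- list(
  secovi = list(
    tables = c("condo", "launch", "rent", "sale"),
    default_table = "all"
  ),
  abecip = list(tables = c("sbpe", "units"))  # no default_table configured
)

resolve_table_sketch <- function(dataset, table = NULL) {
  entry <- registry[[dataset]]
  if (is.null(entry)) {
    stop("Unknown dataset: ", dataset, call. = FALSE)
  }
  if (!is.null(table)) {
    # An explicit request must name a known table (or "all")
    if (!table %in% c(entry$tables, "all")) {
      stop("Unknown table '", table, "' for dataset '", dataset, "'", call. = FALSE)
    }
    return(table)
  }
  if (!is.null(entry$default_table)) {
    return(entry$default_table)  # registry default takes precedence
  }
  entry$tables[[1]]  # previous behaviour: first table listed
}

resolve_table_sketch("secovi")          # "all"
resolve_table_sketch("secovi", "rent")  # "rent"
resolve_table_sketch("abecip")          # "sbpe"
```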
- devtools::load_all() instead of library() to ensure testing of development version
- tests/comprehensive_check_v0.5.qmd
- tests/TEST_RESULTS_SUMMARY.md, tests/QUICK_SUMMARY.md
- _targets.R to always load development version for consistency

Version 0.5.0 introduces user-level caching, removing bundled datasets from the package to comply with CRAN's 5MB size limit. This is a BREAKING CHANGE that affects how datasets are accessed.

- inst/cached_data/ (previously ~25MB)
- ~/.local/share/realestatebr/ (Linux/Mac) or %LOCALAPPDATA%/realestatebr/Cache/ (Windows)
- source="cache" now refers to user cache, not package cache
- source="github" now downloads from GitHub releases, not package files

# First use: downloads from GitHub releases to user cache
data <- get_dataset("abecip") # Downloads once
# Subsequent uses: loads from user cache (instant, offline)
data <- get_dataset("abecip") # Loads from ~/.local/share/realestatebr/
# Force fresh download from original source
data <- get_dataset("abecip", source = "fresh") # Downloads and caches
# Explicit source selection
data <- get_dataset("abecip", source = "cache") # User cache only
data <- get_dataset("abecip", source = "github") # GitHub releases only
- ~/.local/share/realestatebr/ (instant, offline)
- piggyback package
- rappdirs (Imports) - Cross-platform user cache directory support
- piggyback (Suggests) - GitHub releases download support

- get_user_cache_dir(): Get path to user cache directory
- list_cached_files(): List all cached datasets
- clear_user_cache(): Remove cached datasets
- is_cached(): Check if dataset is in cache
- list_github_assets(): List available datasets on GitHub releases
- download_from_github_release(): Download specific dataset from releases
- update_cache_from_github(): Update cached datasets from GitHub
- is_cache_up_to_date(): Compare local vs GitHub cache timestamps

# Install updated package
install.packages("realestatebr") # or devtools::install_github()
# Install piggyback for GitHub downloads (recommended)
install.packages("piggyback")
# First use after update: will download datasets to user cache
data <- get_dataset("abecip")
# Check cache location
get_user_cache_dir()
# Manage cache
list_cached_files() # See what's cached
clear_user_cache("abecip") # Clear specific dataset
clear_user_cache() # Clear all (with confirmation)
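To make the cache-management calls above more concrete, here is a minimal sketch of a file-based dataset cache. It is not the package's implementation: the `*_sketch()` helper names are hypothetical, the cache lives under `tempdir()` rather than the user data directory, and clearing skips the confirmation prompt that `clear_user_cache()` shows.

```r
# Assumption: a throwaway directory stands in for the user cache directory.
cache_dir <- file.path(tempdir(), "realestatebr-sketch")
dir.create(cache_dir, showWarnings = FALSE, recursive = TRUE)

# One .rds file per dataset name
cache_file <- function(name) file.path(cache_dir, paste0(name, ".rds"))

write_cache_sketch <- function(name, data) saveRDS(data, cache_file(name))
is_cached_sketch <- function(name) file.exists(cache_file(name))
list_cached_sketch <- function() {
  sub("\\.rds$", "", list.files(cache_dir, pattern = "\\.rds$"))
}

# Remove one dataset, or every cached dataset when name is NULL
clear_cache_sketch <- function(name = NULL) {
  files <- if (is.null(name)) {
    list.files(cache_dir, pattern = "\\.rds$", full.names = TRUE)
  } else {
    cache_file(name)
  }
  invisible(file.remove(files[file.exists(files)]))
}

write_cache_sketch("abecip", data.frame(x = 1))
cached_before <- is_cached_sketch("abecip")
clear_cache_sketch("abecip")
cached_after <- is_cached_sketch("abecip")
```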
- .Rbuildignore
- inst/cached_data/ kept for development/CI but excluded from distribution
- data-raw/publish-cache.R
- get_dataset() interface unchanged
- import_cached(): Still works but now loads from user cache (previously from inst/)
- cached=TRUE parameter in legacy functions: Still supported but uses new cache

- R/cache-user.R - User cache management
- R/cache-github.R - GitHub releases integration
- data-raw/publish-cache.R - Upload cache to releases
- R/get-dataset.R - Refactored cache logic
- R/cache.R - Marked as deprecated (kept for compatibility)
- .Rbuildignore - Exclude inst/cached_data/ files
- DESCRIPTION - Added rappdirs and piggyback dependencies

- source="fresh" to source="github" for manually-updated datasets
- get_fgv_ibre() and get_nre_ire()
- fgv_data and ire objects from R/sysdata.rda
- cached=FALSE
- manual_update flag to datasets.yaml for FGV IBRE and NRE-IRE
- update_notes field documenting why fresh downloads aren't available
- _targets.R explaining data source choices

- _targets.R: Updated fetch_dataset() to support source parameter; FGV and NRE-IRE now use source="github"
- R/get_fgv_ibre.R: Removed broken internal data fallback; added clear error for fresh downloads
- R/get_nre_ire.R: Removed broken internal data fallback; added clear error for fresh downloads
- inst/extdata/datasets.yaml: Added manual update flags and notes

- get_property_records.R (14% code reduction: 780→673 lines)
- get_ri_capitals() and get_ri_aggregates() with warning messages
- source, download_time, download_info) that were never used
- scrape_registro_imoveis_links() with better connection cleanup and reduced complexity
- nrow() before CLI interpolation to avoid closure issues
- purrr::possibly() pattern

- bcb_category when table specified
- bcb_metadata dynamically (now downloads all 140 series, not just 15)
- get-dataset.R
- bcb_category
- table="all" in validate_and_resolve_table() function
- bcb_series categories in datasets.yaml to match metadata
- .envir = parent.frame() to cli::cli_inform() calls in cli_user() and cli_debug()
- standardize_city_names() call after binding FipeZap data
- property_records structure in get-dataset.R

- get_dataset() functionality
- source="fresh" to catch real-world failures before production
- tests/basic_checks.R for development
- eval=FALSE for faster development

- get_dataset("rppi", "ivgr") and other individual RPPI tables now work correctly
- get_rppi() function now supports all individual RPPI tables (fipezap, ivgr, igmi, iqa, iqaiw, ivar, secovi_sp) in addition to stacked tables (sale, rent, all)
- get_bcb_realestate.R, get_cbic.R, get_fgv_ibre.R, get_property_records.R, get_rppi.R, get_rppi_bis.R, get_secovi.R
- category= parameter to table= in tests/sanity_check.R

This release implements a major breaking change that consolidates 15+ individual get_*() functions into
a single, unified get_dataset() interface. This dramatically simplifies the package API while maintaining full functionality.
BREAKING CHANGE: All individual get_*() functions have been removed:
- get_abecip_indicators(), get_abrainc_indicators(), get_rppi(), get_bcb_realestate(), etc.
- Use get_dataset("dataset_name") instead

Major refactoring of RPPI functions for better maintainability:
- name_muni == "Brazil"
- rppi-helpers.R with common functions to eliminate duplication
- stack parameter, cli_debug calls, and metadata attributes
- @keywords internal: Only get_dataset() is user-facing

Benefits:
Note: In v0.4.0, the CBIC dataset is limited to cement tables only (validated data). Steel and PIM tables will be added in v0.4.1.
Available in v0.4.0:
- cement_monthly_consumption - Monthly cement consumption by state
- cement_annual_consumption - Annual cement consumption by region
- cement_production_exports - Production, consumption, and export data
- cement_monthly_production - Monthly cement production by state
- cement_cub_prices - CUB cement prices by state

Coming in v0.4.1:
# Works in v0.4.0
get_dataset("cbic") # Default: cement_monthly_consumption
get_dataset("cbic", "cement_cub_prices")
# Will error with informative message
get_dataset("cbic", "steel_prices") # Deferred to v0.4.1
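The "error with informative message" behaviour shown above might look roughly like the check below. This is a sketch only: `check_cbic_table_sketch()` is a hypothetical name, the error wording is invented for illustration, and the deferred table names other than `steel_prices` are assumed from table names mentioned elsewhere in this changelog.

```r
# Tables listed as available in v0.4.0
cbic_available <- c(
  "cement_monthly_consumption",
  "cement_annual_consumption",
  "cement_production_exports",
  "cement_monthly_production",
  "cement_cub_prices"
)
# Assumption: names of the tables deferred to v0.4.1
cbic_deferred <- c("steel_prices", "steel_production", "pim")

check_cbic_table_sketch <- function(table = cbic_available[[1]]) {
  if (table %in% cbic_deferred) {
    stop(
      sprintf(
        "Table '%s' is deferred to v0.4.1. Available tables: %s",
        table, paste(cbic_available, collapse = ", ")
      ),
      call. = FALSE
    )
  }
  if (!table %in% cbic_available) {
    stop(sprintf("Unknown CBIC table '%s'", table), call. = FALSE)
  }
  table
}

# Capture the message instead of aborting the script
msg <- tryCatch(check_cbic_table_sketch("steel_prices"), error = conditionMessage)
```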
- fetch_*() functions with @keywords internal
- inst/extdata/datasets.yaml
- rppi and rppi_indices into single hierarchical structure
- table, cached, quiet, max_retries

New unified interface:
# Get data from any dataset
data <- get_dataset("abecip") # Default table
data <- get_dataset("abecip", table = "sbpe") # Specific table
data <- get_dataset("rppi", table = "fipezap") # Hierarchical access
# Discover datasets
datasets <- list_datasets()
info <- get_dataset_info("rppi")
Removed functions (now internal):
- get_abecip_indicators() → get_dataset("abecip")
- get_abrainc_indicators() → get_dataset("abrainc")
- get_rppi() → get_dataset("rppi")
- get_bcb_realestate() → get_dataset("bcb_realestate")
- get_bcb_series() → get_dataset("bcb_series")

- source = "cache"/"github"/"fresh" options
- test-internal-functions-0.4.0.R with 100 tests

# OLD (0.3.x) - Will no longer work
data <- get_abecip_indicators(table = "sbpe")
data <- get_rppi(table = "fipezap")
data <- get_bcb_realestate(table = "all")
# NEW (0.4.0) - Required migration
data <- get_dataset("abecip", table = "sbpe")
data <- get_dataset("rppi", table = "fipezap")
data <- get_dataset("bcb_realestate", table = "all")
| Old Function | New get_dataset() Name |
|-------------|---------------------|
| get_abecip_indicators() | "abecip" |
| get_abrainc_indicators() | "abrainc" |
| get_rppi() | "rppi" |
| get_bcb_realestate() | "bcb_realestate" |
| get_bcb_series() | "bcb_series" |
| get_rppi_bis() | "rppi_bis" |
| get_secovi() | "secovi" |
| get_fgv_indicators() | "fgv_indicators" |
| get_b3_stocks() | "b3_stocks" |
| get_nre_ire() | "nre_ire" |
| get_cbic_*() | "cbic" |
| get_itbi() | "itbi" |
| get_property_records() | "registro" |
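When migrating scripts in bulk, the table above can be encoded as a lookup. The sketch below copies the mapping from the table; the helper `dataset_name_for()` is a hypothetical convenience and not part of the package.

```r
# Mapping copied from the migration table above
legacy_map <- c(
  get_abecip_indicators = "abecip",
  get_abrainc_indicators = "abrainc",
  get_rppi = "rppi",
  get_bcb_realestate = "bcb_realestate",
  get_bcb_series = "bcb_series",
  get_rppi_bis = "rppi_bis",
  get_secovi = "secovi",
  get_fgv_indicators = "fgv_indicators",
  get_b3_stocks = "b3_stocks",
  get_nre_ire = "nre_ire",
  get_itbi = "itbi",
  get_property_records = "registro"
)

# Hypothetical helper: translate a legacy function name to a dataset name
dataset_name_for <- function(legacy_function) {
  key <- sub("\\(\\)$", "", legacy_function)  # tolerate a trailing "()"
  if (grepl("^get_cbic_", key)) {
    return("cbic")  # every get_cbic_*() function maps to one dataset
  }
  name <- legacy_map[key]
  if (is.na(name)) {
    stop("No get_dataset() name known for ", legacy_function, call. = FALSE)
  }
  unname(name)
}

dataset_name_for("get_secovi()")     # "secovi"
dataset_name_for("get_cbic_cement")  # "cbic"
```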
# OLD - Multiple functions
fipezap <- get_rppi_fipezap()
igmi <- get_rppi_igmi()
bis <- get_rppi_bis()
# NEW - Unified hierarchical access
fipezap <- get_dataset("rppi", table = "fipezap")
igmi <- get_dataset("rppi", table = "igmi")
bis <- get_dataset("rppi", table = "bis")
- fetch_rppi(), fetch_abecip(), etc.
- datasets.yaml
- get_from_internal_function() → get_from_legacy_function()
- get_dataset(), list_datasets(), utilities

This release represents a major architectural shift toward a unified, maintainable API. While it introduces breaking changes, the new interface is significantly simpler and more powerful.
Full Changelog: https://github.com/viniciusoike/realestatebr/compare/v0.3.0...v0.4.0
- _targets.R workflow with automated dependency management and parallel processing
- cache.R with better fallback mechanisms
- get_abrainc_indicators() (category → table)
- get_nre_ire() to use internal package data directly
- sysdata.rda with latest processed datasets
- targets and tarchetypes to package dependencies

This release establishes the foundation for automated data processing and validation, setting the stage for Phase 3 implementation with large dataset support.
Full Changelog: https://github.com/viniciusoike/realestatebr/compare/v0.2.0...v0.3.0
- get_* functions with consistent APIs, CLI-based error handling, and progress reporting
- table, cached, quiet, and max_retries parameters
- list_datasets() - Discover available datasets with filtering capabilities
- get_dataset() - Unified data access function with intelligent fallback
- inst/extdata/datasets.yaml for centralized dataset management
- table parameter replacing category across all functions
- category parameter
- table = "all"
- get_cbic_cement() - Cement consumption, production, and CUB prices
- get_cbic_steel() - Steel prices and production data
- get_cbic_pim() - Industrial production indices
- cli package integration for long-running operations
- category parameter deprecated across all functions in favor of table
- category = "value" with table = "value"
- cached_data/ to inst/cached_data/ for package compliance

- get_abecip_indicators() - ABECIP real estate financing data
- get_abrainc_indicators() - ABRAINC launches and sales data
- get_b3_stocks() - B3 stock market data with improved column naming
- get_bcb_realestate() - Central Bank real estate credit data
- get_bcb_series() - BCB macroeconomic time series
- get_rppi_bis() - Bank for International Settlements RPPI data
- get_cbic_cement() - CBIC cement industry data (NEW)
- get_cbic_steel() - CBIC steel industry data (NEW)
- get_cbic_pim() - CBIC industrial production data (NEW)
- get_fgv_indicators() - FGV construction confidence indicators
- get_nre_ire() - Real Estate Index from NRE-Poli USP
- get_property_records() - Property registration data with robust Excel processing
- get_rppi() - Comprehensive RPPI coordinator with all sources
- get_secovi() - SECOVI-SP real estate data with parallel processing
- get_rppi_bis() - Main function with modernized backend and single tibble returns
- get_itbi() and get_itbi_bhe() - Planned for Phase 3 (DuckDB integration)
- devtools integration

# Old (deprecated but still works)
data <- get_abecip_indicators(category = "all")
# New (recommended)
data <- get_abecip_indicators(table = "all")
# Discover available datasets
datasets <- list_datasets()
# Get data with unified interface
data <- get_dataset("abecip_indicators")
# Use modernized functions with progress
data <- get_abecip_indicators(table = "indicators", quiet = FALSE)
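The "deprecated but still works" behaviour of `category` can be sketched with a small shim that forwards the old argument to `table` and emits a warning. `get_example_data()` is a hypothetical function written for this illustration; the package's actual functions and warning text may differ.

```r
# Hypothetical function showing a category -> table deprecation shim
get_example_data <- function(table = "all", category = NULL) {
  if (!is.null(category)) {
    warning("`category` is deprecated; use `table` instead.", call. = FALSE)
    table <- category  # forward the old argument to the new one
  }
  table  # a real function would fetch and return the requested table here
}

old_style <- suppressWarnings(get_example_data(category = "sbpe"))
new_style <- get_example_data(table = "sbpe")
identical(old_style, new_style)  # TRUE: both calls resolve to the same table
```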
- cli for modern error handling and progress reporting
- dplyr, readr, httr, and rvest

This release represents the completion of Phase 1 modernization, establishing a solid foundation for Phase 2 (data pipeline automation) and Phase 3 (large dataset support with DuckDB).
Full Changelog: https://github.com/viniciusoike/realestatebr/compare/v0.1.5...v0.2.0