Last week, I was fortunate to join a group of leaders building solutions for a problem many of us have faced over the last 6 months—important data being removed from government websites. Public data used to set baselines for interventions, identify capital access challenges, and shape policy went missing or was taken offline and then restored with key variables missing.
This erasure of specific data points underscores the risk of relying solely on official data streams. Even before the current threats, alternative data sources (ranging from credit card transactions and satellite imagery to web and social media analytics) were increasingly being used to inform local economic development and fine-tune supports for small businesses. These non-traditional datasets provide granular, real-time insights that traditional government statistics often cannot. Without proper safeguards and transparency, the risks that come with these opportunities threaten both individual privacy and the larger goal of sparking a more inclusive economy. But ignoring the opportunity altogether threatens evidence-based decision-making, especially when the public data infrastructure is itself under threat.
Innovating Through Disruption
Integrating different government, field collection, and proprietary data sets to build and evaluate interventions has been an important piece of my past decade of work as a social scientist. Even in less extreme versions of managing disappearing data, like the last government shutdown and pandemic-era agency data release delays, our teams have thoughtfully diversified our data stacks to get a fuller picture of the economy in real time.
We prioritize these alternative data sources when:
Decision-making requires speed: Local leaders increasingly need to make quick decisions about targeting resources or expanding smaller-scale initiatives due to funding uncertainty or shifting public sentiment. Granular location or customer data—generated and updated continuously—can pinpoint communities or even specific streets where targeted support can have outsized effects. In local capital deployment, directing small business grants to neighborhoods where foot traffic has been affected by new development or changing public transit creates immediate impact. As we've discussed in previous posts, alternative data is transforming small business lending by providing new ways to evaluate a firm's creditworthiness. For businesses operating on thin margins, timely support to manage a downturn or position for a larger opportunity is also a matter of speed.
Conventional indicators can only give conventional insights: Just as conventional data may underestimate who can afford mortgage loans or what types of businesses are creditworthy, alternative data captures aspects of regional economies that traditional metrics miss, uncovering hidden or novel trends. As we saw with the techniques deployed by Opportunity Insights, anonymized credit card transaction data can reliably complement slowly released official statistics to monitor small business activity. On a larger scale, researchers have also constructed novel datasets by mapping web linkages between companies to understand regional innovation networks, essentially using the "relationship maps" of firms and populations as an indicator of key growth levers.
Historical data is not available: When traditional statistical sources lack historical records or are insufficient for long-term trend analysis, geospatial information—including satellite imagery, GIS data, and location traces from mobile devices—offers valuable insights. High-resolution satellite images can measure vehicle traffic and real estate development, providing insight into retail and trade activity weeks before official reports while also offering comparison data through archived imagery dating back decades. Geospatial data can also fill critical gaps when examining underserved communities or rural areas that are often overlooked in standard economic surveys.
Perception matters as much as reality: The imbalance of opportunity and investment is a glaring example of how money follows both hard data and perception. The vast information on the web, from consumer trends and preferences to sentiment on government actions, is a rich trove of economic insight that can drive inclusive growth. For example, aggregated social media data provides signals of public confidence or anxiety about the economy that complement and contextualize data from traditional resident surveys. Nonprofits and local chambers of commerce monitor these channels to identify specific areas and businesses to support; combined with traditional metrics, these signals can reduce investment risk.
Mitigating Risks
Despite the opportunities, both data use and the associated decision-making demand heightened scrutiny. This is particularly true at a moment when data reliability is frequently challenged and the bias introduced by the rapid proliferation of AI models is poorly understood. As my team builds new products, most relying on a mix of traditional and alternative data sources, here are some of the mitigation strategies we're deploying.
Risk: Data Quality and Representativeness
Many of these datasets represent only the users of specific services or platforms, missing important segments of the population. Analysts who only adjust data weights for underrepresented groups (e.g., unbanked individuals) would still miss the compounded bias that arises when multiple datasets are combined. Left unaddressed, these biases can lead to blind spots and skewed policy decisions.
Approach: Validation and Integration
Our primary approach is to validate alternative indicators against traditional metrics and benchmark data and apply the appropriate statistical techniques to adjust for known biases. A great starting point is the Bureau of Labor Statistics framework for checking source reliability, consistency, and bias before integration.
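To make the two steps above concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not our production method and not the Bureau of Labor Statistics framework itself. The function names, the group labels, and all numbers are hypothetical. It shows a simple post-stratification reweighting for one known bias, plus a correlation check of an alternative indicator against an official benchmark series. Note that single-variable reweighting like this does not address the compounded bias that arises when several datasets are combined.

```python
def poststratify(sample_counts, population_shares):
    """Weight per group so the weighted sample matches known population shares."""
    total = sum(sample_counts.values())
    return {
        group: population_shares[group] / (count / total)
        for group, count in sample_counts.items()
    }


def weighted_mean(values_by_group, sample_counts, weights):
    """Mean of a group-level metric after applying the group weights."""
    num = sum(values_by_group[g] * sample_counts[g] * weights[g] for g in sample_counts)
    den = sum(sample_counts[g] * weights[g] for g in sample_counts)
    return num / den


def pearson(xs, ys):
    """Pearson correlation between an alternative indicator and a benchmark."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


# Hypothetical example: a card-transaction panel that under-samples unbanked residents.
sample_counts = {"banked": 900, "unbanked": 100}
population_shares = {"banked": 0.8, "unbanked": 0.2}  # assumed benchmark shares
avg_spend = {"banked": 100.0, "unbanked": 50.0}       # made-up values

weights = poststratify(sample_counts, population_shares)
adjusted = weighted_mean(avg_spend, sample_counts, weights)  # lower than the raw mean of 95

# Hypothetical monthly series: alternative index vs. official statistic.
alt_index = [1.0, 2.0, 3.0]
official = [2.0, 4.0, 6.0]
agreement = pearson(alt_index, official)
```

A low correlation, or a large gap between the raw and adjusted figures, would be a prompt to investigate the source before integrating it, not a verdict on its own.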
Risk: Privacy and Ethical Concerns
Unlike aggregated official statistics, alternative datasets (e.g., location data, transaction histories, social media posts) often contain personally identifiable or sensitive information. This raises significant privacy risks and ethical concerns, particularly when individuals are not aware that they have consented to their data being used for policy decisions and other purposes.
Approach: Confidentiality Measures
Our standard for mitigating privacy risks is informed by privacy by design principles and the Five Safes framework. Beyond mere compliance with privacy laws, transparent communication about data use has proven essential in maintaining trust among funders, research firms, and the communities we collaborate with.
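One output-side safeguard that fits the "safe outputs" dimension of the Five Safes is suppressing small cells before aggregated counts are shared. The sketch below is a hypothetical illustration of that single technique, not a description of our full confidentiality process. The function name, the record layout, and the minimum cell size of 10 are all assumptions chosen for the example.

```python
from collections import Counter


def suppressed_counts(records, key, min_cell=10):
    """Count records per category, replacing counts below min_cell with None.

    min_cell=10 is an illustrative threshold, not a recommended standard.
    """
    counts = Counter(record[key] for record in records)
    return {
        category: (n if n >= min_cell else None)
        for category, n in counts.items()
    }


# Hypothetical records: one row per business, keyed by census tract.
records = [{"tract": "A"}] * 12 + [{"tract": "B"}] * 3
published = suppressed_counts(records, key="tract")
# Tract A is reported; tract B falls below the threshold and is suppressed.
```

Suppression alone does not guarantee privacy, since suppressed cells can sometimes be inferred from totals, so it works best alongside the other controls discussed above.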
Risk: Bias in Decision-Making
Algorithms, even well-intentioned ones, can perpetuate or amplify existing inequities. Previous iterations of tools designed to broaden credit eligibility (e.g., those drawing on social media data) have inadvertently disadvantaged the very businesses they were intended to help, or have led to costly interventions launched without accurate estimates of outcome likelihood and effect size.
Approach: Fairness Audits and Contextual Analysis
Though fairness audits and bias testing methodologies should be routinely applied to models, integrating alternative data with traditional sources requires intentional controls that extend beyond using demographics alone. End-users (think small business employees and other community members) often have the domain knowledge that ensures interventions most directly target root causes, especially when interpreting sentiment or assessing subpopulation needs.
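As a starting point for the routine audits mentioned above, a simple check compares outcome rates across groups. This is a minimal sketch under stated assumptions: the function names, group labels, and decisions are hypothetical, and a rate comparison like this is only one of many fairness tests. It does not replace the contextual review by end-users described above, and any flagging threshold (for instance 0.8) would be an illustrative choice, not a standard this post prescribes.

```python
def approval_rates(decisions):
    """Approval rate per group from an iterable of (group, approved) pairs."""
    totals, approved = {}, {}
    for group, ok in decisions:
        totals[group] = totals.get(group, 0) + 1
        approved[group] = approved.get(group, 0) + (1 if ok else 0)
    return {group: approved[group] / totals[group] for group in totals}


def rate_ratio(rates):
    """Ratio of the lowest group rate to the highest; None if no approvals."""
    highest = max(rates.values())
    if highest == 0:
        return None
    return min(rates.values()) / highest


# Hypothetical model decisions for two neighborhoods.
decisions = (
    [("neighborhood_a", True)] * 8 + [("neighborhood_a", False)] * 2
    + [("neighborhood_b", True)] * 4 + [("neighborhood_b", False)] * 6
)
rates = approval_rates(decisions)
ratio = rate_ratio(rates)  # a ratio well below 1 signals a gap worth investigating
```

A low ratio does not by itself show that a model is unfair; it flags where community members and domain experts should look more closely at root causes.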
Protect, Nurture & Build
While many colleagues are working on protective legal strategies, data archiving, training programs, and data governance structures to preserve federal data, others have doubled down on building and maintaining datasets that exceed traditional standards for completeness and representativeness. We need to continue to do both at the same time.