Package 'shadowpop'

Title: Creates a "shadow population" for comparative data analysis
Description: Creates a "shadow population" for comparative data analysis.
Authors: Fran Barton [aut, cre]
Maintainer: Fran Barton <[email protected]>
License: MIT + file LICENSE
Version: 0.0.1
Built: 2024-09-08 13:21:02 UTC
Source: https://codeberg.org/francisbarton/shadowpop

Help Index


Create a shadow population that closely resembles a sample population

Description

This is similar to creating a control group or a training/testing split. The difference is that the aim here is to make the shadow population (generated by the shadowpop() function) match the sample population (provided by the user) as closely as possible.

Usage

shadowpop(
  sample_data,
  source_data,
  match_cols = NULL,
  id_col = NULL,
  shadow_size = NULL
)
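
A minimal sketch of a call, assuming the package has been installed from the source repository listed above. The toy data frames and their column names (person_id, age_band, sex) are invented for illustration. The structure of the returned object is not described on this page, so the example only stores the result.

```r
# Toy data: all column names and values here are illustrative assumptions.
set.seed(42)
source_data <- data.frame(
  person_id = seq_len(200),
  age_band  = sample(c("18-39", "40-64", "65+"), 200, replace = TRUE),
  sex       = sample(c("F", "M"), 200, replace = TRUE)
)
sample_data <- data.frame(
  person_id = 1001:1020,
  age_band  = sample(c("18-39", "40-64", "65+"), 20, replace = TRUE),
  sex       = sample(c("F", "M"), 20, replace = TRUE)
)

# Only attempt the call if shadowpop is actually installed.
if (requireNamespace("shadowpop", quietly = TRUE)) {
  shadow <- shadowpop::shadowpop(
    sample_data,
    source_data,
    match_cols = c("age_band", "sex"),  # most important variable first
    id_col = "person_id"
    # shadow_size is left as NULL, so it defaults to nrow(sample_data)
  )
}
```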

Arguments

sample_data

data frame. Your sample population.

source_data

data frame. A larger population from which to extract a shadow sample.

match_cols

A character vector of column names shared between sample_data and source_data, to be used as variables to match against. Put the variable you care about most at the start of the vector, and "nice to have" matches towards the end.

id_col

character. A column name not included in match_cols, which acts as a unique ID or key for each row in sample_data and in source_data. An ID column named "id" will be automatically created if no id_col is specified.

shadow_size

integer. The number of rows of shadow population to create. If not specified, this defaults to nrow(sample_data). Should be (meaningfully) less than nrow(source_data).
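
The constraints described in the arguments above can be checked before calling shadowpop(). The sketch below uses a hypothetical helper, check_shadowpop_args(), which is not part of the package and does not reflect how shadowpop() itself validates input. It assumes match_cols is supplied, since the behaviour of the NULL default is not described here. The toy data frames are likewise invented.

```r
# Hypothetical helper (not part of shadowpop): checks inputs against the
# constraints described in the Arguments section.
check_shadowpop_args <- function(sample_data, source_data, match_cols,
                                 id_col = NULL, shadow_size = NULL) {
  stopifnot(is.data.frame(sample_data), is.data.frame(source_data))
  # match_cols must name columns present in both data frames
  stopifnot(
    is.character(match_cols),
    all(match_cols %in% names(sample_data)),
    all(match_cols %in% names(source_data))
  )
  if (!is.null(id_col)) {
    # id_col must exist in both data frames, must not be a matching
    # variable, and must identify each row uniquely
    stopifnot(
      id_col %in% names(sample_data),
      id_col %in% names(source_data),
      !id_col %in% match_cols,
      !anyDuplicated(sample_data[[id_col]]),
      !anyDuplicated(source_data[[id_col]])
    )
  }
  # Documented default: the same number of rows as sample_data
  if (is.null(shadow_size)) shadow_size <- nrow(sample_data)
  # The shadow population must be smaller than the source population
  stopifnot(shadow_size < nrow(source_data))
  invisible(shadow_size)
}

# Toy inputs (illustrative only)
src <- data.frame(id = 1:50, region = rep(c("N", "S"), 25))
smp <- data.frame(id = 101:105, region = c("N", "N", "S", "N", "S"))
size <- check_shadowpop_args(smp, src, match_cols = "region", id_col = "id")
```

The helper returns the effective shadow_size invisibly and stops with an error if any check fails. It only tests that shadow_size is strictly smaller than nrow(source_data); how much smaller counts as "meaningfully" less is left to the user.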