Title: | Creates a "shadow population" for comparative data analysis |
---|---|
Description: | Creates a "shadow population" for comparative data analysis. |
Authors: | Fran Barton [aut, cre] |
Maintainer: | Fran Barton <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.0.1 |
Built: | 2024-11-08 05:45:08 UTC |
Source: | https://codeberg.org/francisbarton/shadowpop |
This is a bit like creating a control group, or a training/testing set.
However, the aim is to create the best possible match between the sample
population (provided by the user) and the shadow population (generated by
the shadowpop() function).
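To make the idea of a matched comparison group concrete, here is a minimal base R sketch. It is not the shadowpop() implementation, which this page does not describe. It assumes the simplest possible rule (exact agreement on every matching column, each source row used at most once), and the helper name exact_match_shadow and the toy columns age_band and sex are hypothetical.

```r
# Illustration only: NOT the shadowpop() algorithm.
# Hypothetical helper: for each sample row, take the first unused source row
# that agrees exactly on every column named in match_cols.
exact_match_shadow <- function(sample_data, source_data, match_cols) {
  # Collapse the matching columns into one key string per row
  make_key <- function(df) {
    do.call(paste, c(unname(as.list(df[match_cols])), sep = "|"))
  }
  sample_keys <- make_key(sample_data)
  source_keys <- make_key(source_data)

  used <- rep(FALSE, nrow(source_data))
  picked <- integer(0)
  for (k in sample_keys) {
    candidates <- which(source_keys == k & !used)
    if (length(candidates) > 0) {
      picked <- c(picked, candidates[1])
      used[candidates[1]] <- TRUE
    }
  }
  # Rows are returned in the order the sample rows were matched
  source_data[picked, , drop = FALSE]
}

# Made-up toy data
sample_data <- data.frame(
  age_band = c("20-29", "30-39", "30-39"),
  sex = c("F", "M", "F"),
  stringsAsFactors = FALSE
)
source_data <- data.frame(
  age_band = c("30-39", "20-29", "30-39", "40-49", "30-39"),
  sex = c("M", "F", "F", "M", "F"),
  stringsAsFactors = FALSE
)

shadow <- exact_match_shadow(sample_data, source_data, c("age_band", "sex"))
```

In this toy case every sample row finds an exact partner. A real matching routine would also need a rule for sample rows with no exact partner, for example by relaxing the matching columns one at a time.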
shadowpop( sample_data, source_data, match_cols = NULL, id_col = NULL, shadow_size = NULL )
Argument | Description |
---|---|
sample_data | data frame. Your sample population. |
source_data | data frame. A larger population from which to extract a shadow sample. |
match_cols | A character vector of column names shared between sample_data and source_data, to be used as variables to match against. Put the variable you care about most at the start of the vector, and "nice to have" matches towards the end. |
id_col | character. A column name not included in match_cols, which acts as a unique ID or key for each row in sample_data and in source_data. An ID column named "id" will be created automatically if no id_col is specified. |
shadow_size | integer. The number of rows of shadow population to create. If not specified, this is set to nrow(sample_data). It should be (meaningfully) less than nrow(source_data)! |
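A call using all five arguments might look like the sketch below. The data are invented for illustration, and the column names person_id, age_band and sex are assumptions, as is the choice to give the sample and source rows non-overlapping IDs. The call is guarded so that the snippet still runs when the package is not installed. The structure of the returned object is not documented on this page, so nothing is assumed about it here.

```r
# Made-up toy data; column names and values are illustrative assumptions.
set.seed(1)
source_data <- data.frame(
  person_id = seq_len(200),
  age_band = sample(c("20-29", "30-39", "40-49"), 200, replace = TRUE),
  sex = sample(c("F", "M"), 200, replace = TRUE),
  stringsAsFactors = FALSE
)
sample_data <- data.frame(
  person_id = 1000 + seq_len(10),
  age_band = sample(c("20-29", "30-39", "40-49"), 10, replace = TRUE),
  sex = sample(c("F", "M"), 10, replace = TRUE),
  stringsAsFactors = FALSE
)

# Only attempt the call if the package is available.
if (requireNamespace("shadowpop", quietly = TRUE)) {
  shadow <- shadowpop::shadowpop(
    sample_data,
    source_data,
    match_cols = c("age_band", "sex"), # most important variable first
    id_col = "person_id",              # not one of the match_cols
    shadow_size = 10L                  # well below nrow(source_data)
  )
}
```

Here age_band is listed before sex, so it is treated as the variable that matters most, following the ordering advice for match_cols.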