Misassign probability plots and table — misassign • ScatMatch

misassign assesses different mismatch thresholds by comparing the overlap between the allele mismatch distribution of samples assigned to the same group and that of samples assigned to different groups. It calculates the probability of misassigning group membership and summarises the results in plots and a table.

misassign(dist, maxh = 10, lt = 0.005, ut = 0.995, bins = 30)

Arguments

dist: A list object created after running dissimilarity.
maxh: Integer. The maximum "height", which here means the maximum number of allowable mismatches; it also determines the length of the x-axis. Defaults to 10.
lt: Numeric. Lower threshold. Represents the lower tail of the between-individuals distribution. Defaults to 0.005.
ut: Numeric. Upper threshold. Represents the upper tail of the within-individuals distribution. Defaults to 0.995.
bins: Numeric. Number of bins to split the data. Defaults to 30.

Value

A series of histogram plots (the number is governed by the maxh parameter) showing the possible overlap and the probability of misassignment is written to file in JPG format. A CSV summary of key values is also saved.

Details

The function compares different mismatch thresholds by generating the distribution of pairwise allele mismatch scores across all sample pairs. This distribution is then separated into two groups: allele mismatches between samples assigned to the same group (i.e. mismatches between scats from the same putative individual) and allele mismatches between samples assigned to different groups (i.e. mismatches between scats from different putative individuals).

To assess individual identification success, the same-group and different-group mismatch distributions are ranked, and the upper 0.5 percentile of the same-group distribution and the lower 0.5 percentile of the different-group distribution are calculated. If the lower percentile minus the upper percentile is positive (the overlap column in the summary table), the two distributions show little overlap and less than 1% of samples have been wrongly assigned. In addition, the probability of misassignment is calculated using the `overlap` function in the package `birdring` with 100,000 simulations and the upper and lower bounds of the parameter space set at the 99.5th and 0.5th percentiles, respectively.
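The percentile comparison described above can be sketched in base R. This is only an illustration under stated assumptions: the mismatch counts are simulated (they are not ScatMatch output), the names `within_mm`, `between_mm` and `gap` are hypothetical, and the sketch does not reproduce the `birdring` probability calculation.

```r
# Simulated mismatch counts for illustration only (assumption, not real data).
set.seed(42)
within_mm  <- rpois(500,  lambda = 1)   # pairs assigned to the same group
between_mm <- rpois(5000, lambda = 15)  # pairs assigned to different groups

lt <- 0.005  # lower tail of the between-group distribution
ut <- 0.995  # upper tail of the within-group distribution

# Upper 0.5 percentile of the within-group mismatches.
upper <- unname(quantile(within_mm, probs = ut))
# Lower 0.5 percentile of the between-group mismatches.
lower <- unname(quantile(between_mm, probs = lt))

# Positive when the between-group lower tail sits above the
# within-group upper tail, i.e. the tails do not cross.
gap <- lower - upper
tails_separated <- gap > 0
```

A positive `gap` corresponds to a positive value in the overlap column of the summary table, as read from the description above.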

The function outputs a series of plots, one for each threshold, and a summary table. Each plot shows the “within” group distribution in red and the “between” groups distribution in blue. The upper 0.5 percentile of the “within” group distribution and the lower 0.5 percentile of the “between” groups distribution are plotted as dashed lines. The number of individuals indicates the total number of groups identified at each threshold (h) value. The probability of misassignment is calculated with the “overlap” function as described above.

The summary table consists of the following columns:

h: The threshold number.
ind: The total number of groups identified at each threshold.
upper: The upper 0.5 percentile value of the “within” group distribution.
lower: The lower 0.5 percentile value of the “between” groups distribution.
overlap: The difference between the lower and upper columns.
prob_misassign: The probability of misassignment.
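The plot layout described in this section can be approximated with base R graphics. The following is a rough sketch, not the package's plotting code: the data are simulated, all object names are hypothetical, and the null PDF device is used only so that nothing is written to disk.

```r
# Simulated mismatch counts for illustration only (assumption, not real data).
set.seed(1)
within_mm  <- rpois(300,  lambda = 1)   # same-group pairs
between_mm <- rpois(3000, lambda = 15)  # different-group pairs
bins <- 30

# Shared breaks so both histograms use the same bins.
all_mm <- c(within_mm, between_mm)
breaks <- seq(min(all_mm), max(all_mm), length.out = bins + 1)

h_within  <- hist(within_mm,  breaks = breaks, plot = FALSE)
h_between <- hist(between_mm, breaks = breaks, plot = FALSE)

upper <- quantile(within_mm,  probs = 0.995)  # within-group upper tail
lower <- quantile(between_mm, probs = 0.005)  # between-group lower tail

pdf(NULL)  # null device: draw without creating a file
plot(h_between, freq = FALSE, col = adjustcolor("blue", alpha.f = 0.5),
     ylim = c(0, max(h_within$density, h_between$density)),
     main = "Mismatch distributions (simulated)", xlab = "Allele mismatches")
plot(h_within, freq = FALSE, col = adjustcolor("red", alpha.f = 0.5), add = TRUE)
abline(v = upper, lty = 2, col = "red")   # dashed line: within-group percentile
abline(v = lower, lty = 2, col = "blue")  # dashed line: between-group percentile
invisible(dev.off())
```

Densities rather than counts are drawn here so that the two groups remain comparable when they differ greatly in size; whether the package does the same is not stated in the source.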

Author

Rujiporn Thavornkanlapachai, rujiporn.sun@dbca.wa.gov.au

For more details, see the ScatMatch website: https://dbca-wa.github.io/ScatMatch/index.html

Examples

if (FALSE) {
# Not run. dissimilarity_list is the list created by running dissimilarity.
library(ScatMatch)
misassign(dist = dissimilarity_list, maxh = 5)
}