Sequential importance sampling for multiresolution Kingman-Tajima coalescent counting

dc.contributor.author: Cappello, Lorenzo
dc.contributor.author: Palacios, Julia A.
dc.date.accessioned: 2024-02-28T08:06:18Z
dc.date.available: 2024-02-28T08:06:18Z
dc.date.issued: 2020
dc.description.abstract: Statistical inference of evolutionary parameters from molecular sequence data relies on coalescent models to account for the shared genealogical ancestry of the samples. However, inferential algorithms do not scale to available data sets. A strategy to improve computational efficiency is to rely on simpler coalescent and mutation models, resulting in smaller hidden state spaces. An estimate of the cardinality of the state space of genealogical trees at different resolutions is essential to decide the best modeling strategy for a given data set. To our knowledge, there is neither an exact nor an approximate method to determine these cardinalities. We propose a sequential importance sampling algorithm to estimate the cardinality of the sample space of genealogical trees under different coalescent resolutions. Our sampling scheme proceeds sequentially across the set of combinatorial constraints imposed by the data which, in this work, are completely linked sequences of DNA at a nonrecombining segment. We analyze the cardinality of different genealogical tree spaces on simulations to study the settings that favor coarser resolutions. We apply our method to estimate the cardinality of genealogical tree spaces from mtDNA data from the 1000 Genomes Project and a sample from a Melanesian population at the β-globin locus.
dc.format.mimetype: application/pdf
dc.identifier.citation: Cappello BL, Palacios JA. Sequential importance sampling for multiresolution Kingman-Tajima coalescent counting. Ann Appl Stat. 2020;14(2):727-51. DOI: 10.1214/19-AOAS1313
dc.identifier.doi: http://dx.doi.org/10.1214/19-AOAS1313
dc.identifier.issn: 1932-6157
dc.identifier.uri: http://hdl.handle.net/10230/59281
dc.language.iso: eng
dc.publisher: Institute of Mathematical Statistics
dc.relation.ispartof: The Annals of Applied Statistics. 2020;14(2):727-51.
dc.rights: © Institute of Mathematical Statistics, 2020
dc.rights.accessRights: info:eu-repo/semantics/openAccess
dc.title: Sequential importance sampling for multiresolution Kingman-Tajima coalescent counting
dc.type: info:eu-repo/semantics/article
dc.type.version: info:eu-repo/semantics/publishedVersion

Files

Original bundle

Name: Cappello_ann_sequ.pdf
Size: 807.62 KB
Format: Adobe Portable Document Format