Unique Molecular Identifiers (UMIs) provide an opportunity to count individual RNA or DNA molecules while eliminating the redundant and distorting effects of PCR amplification and sequencing errors. Most UMI processing pipelines are algorithmic and ad hoc. Baker Center personnel (Peng and Dorman) are working on a novel probabilistic framework to detect true biological sequences and accurately estimate their deduplicated abundance from amplicon sequence data. The underlying model is a one-step Hidden Markov Model for the relationship between each true UMI and the true sample sequence attached to it. A penalty is imposed on the transition probabilities to encourage a sparse UMI-to-sample sequence mapping, except when the data support UMI collision, in which the same UMI is accidentally reused. The framework is not limited to amplicon sequence data and may be extended to droplet-based single-cell RNA-seq data.
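For context, the basic idea of UMI deduplication can be illustrated with a minimal sketch of the naive, algorithmic kind of counting that the framework aims to improve on. This is not the probabilistic model described above: it simply counts distinct UMIs per sequence, and it assumes error-free reads and no UMI collisions. The function name `naive_umi_counts` and the toy reads are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def naive_umi_counts(reads):
    """Count distinct UMIs observed with each sample sequence.

    `reads` is an iterable of (umi, sequence) pairs. Reads sharing both
    UMI and sequence are treated as PCR duplicates of one molecule.
    Assumption: no sequencing errors and no UMI collisions.
    """
    umis_per_seq = defaultdict(set)
    for umi, seq in reads:
        umis_per_seq[seq].add(umi)
    return {seq: len(umis) for seq, umis in umis_per_seq.items()}


# Toy data (hypothetical): six reads derived from three original molecules.
reads = [
    ("AACG", "seqA"), ("AACG", "seqA"), ("AACG", "seqA"),  # one molecule, amplified
    ("TTGC", "seqA"),                                      # a second seqA molecule
    ("GGAT", "seqB"), ("GGAT", "seqB"),                    # one seqB molecule
]
counts = naive_umi_counts(reads)
print(counts)  # {'seqA': 2, 'seqB': 1}
```

Raw read counts would give 4 and 2 for the two sequences, while the deduplicated counts are 2 and 1. A single sequencing error in a UMI or in the sequence would inflate these naive counts, and a collision would deflate them, which are the cases a model-based approach is meant to handle.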