Distribution of complete CRISPR-Cas systems in Vibrio cholerae and its effect on the presence of plasmid-derived contigs in complete genome assemblies
Abstract
Cholera is a water-borne disease caused by Vibrio cholerae and is characterized by severe diarrhea and dehydration. Plasmids disseminate antibiotic resistance and can play critical roles in epidemic outbreaks. Understanding the distribution and coexistence of plasmids and CRISPR-Cas systems in V. cholerae is therefore important for investigating its pathophysiology and developing effective control methods. A total of 5873 genome assemblies were retrieved from the NCBI database. PlasForest, a machine learning-based classifier, was used to predict plasmid sequences, and CRISPRCasTyper was used to identify Cas operons and CRISPR arrays. The percentage of plasmid-derived contigs (%PDC) was significantly lower in groups with one or two complete CRISPR-Cas systems (CCS) than in the group with no CCS, whereas the difference in %PDC across groups with multiple CCS was not significant.
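The grouping behind this comparison can be sketched in a few lines of Python. This is only an illustrative sketch, not the pipeline used in the study: it assumes that %PDC is the share of contigs in an assembly that are predicted to be plasmid-derived, and the names `percent_pdc`, `group_by_ccs`, `n_ccs` and `contig_is_plasmid` are hypothetical. The records are made-up toy values, not results, and no statistical test is shown because the abstract does not name one.

```python
from collections import defaultdict
from statistics import median


def percent_pdc(contig_is_plasmid):
    """Percentage of contigs in one assembly flagged as plasmid-derived.

    Assumption: %PDC is computed per assembly as a share of contig count.
    """
    if not contig_is_plasmid:
        return 0.0
    return 100.0 * sum(contig_is_plasmid) / len(contig_is_plasmid)


def group_by_ccs(assemblies):
    """Map the number of complete CRISPR-Cas systems to a list of %PDC values."""
    groups = defaultdict(list)
    for assembly in assemblies:
        groups[assembly["n_ccs"]].append(percent_pdc(assembly["contig_is_plasmid"]))
    return dict(groups)


# Toy records for illustration only (hypothetical, not study data).
toy = [
    {"n_ccs": 0, "contig_is_plasmid": [True, False, False, False]},
    {"n_ccs": 0, "contig_is_plasmid": [True, True, False, False]},
    {"n_ccs": 1, "contig_is_plasmid": [False, False, False, False]},
    {"n_ccs": 2, "contig_is_plasmid": [False, False, False, True, False]},
]

groups = group_by_ccs(toy)
medians = {n_ccs: median(values) for n_ccs, values in sorted(groups.items())}
print(medians)
```

In a real analysis, the per-contig flags would come from the PlasForest predictions and the CCS counts from the CRISPRCasTyper output, and the per-group %PDC lists would then be passed to whichever significance test is appropriate for the data.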