The DeePaC-F fungal database and models presented at ECCB 2022
You can find our database of labelled fungal pathogens and their genomes HERE!
This database contains a manually curated set of human, animal and plant pathogens, annotated with their confirmed host range and relevant sources. We also include further sets of plant-associated fungi (which may include non-pathogens), as well as fungi with an automatically assigned, putative human, animal or plant host. The labelled fungal species are linked to their representative GenBank genomes wherever possible. Genomes that were screened but for which no label was found are also included.
The database is stored in a flat-file format. All metadata are stored in all_data.csv, and all_data.rds contains the same data in a compressed format that can be easily loaded in R. The core database is limited to manually confirmed human, animal and plant pathogens with available genomes.
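For readers working outside R, here is a minimal sketch of how the flat file could be inspected using only the Python standard library. It assumes that all_data.csv is a comma-separated, UTF-8 encoded table with a header row. The helper names and the column name "label" are hypothetical and not part of the database, so check the actual header before filtering.

```python
import csv
from collections import Counter


def parse_metadata(lines):
    """Parse CSV text (any iterable of lines, e.g. an open file) into a list of row dicts."""
    return list(csv.DictReader(lines))


def load_metadata(path="all_data.csv"):
    """Read the flat-file metadata table from disk (assumes UTF-8 and a header row)."""
    with open(path, newline="", encoding="utf-8") as handle:
        return parse_metadata(handle)


def count_values(rows, column):
    """Count rows per distinct value of `column`; empty or absent cells count as '(missing)'."""
    return Counter(row.get(column) or "(missing)" for row in rows)


def select_rows(rows, column, value):
    """Keep only the rows whose `column` equals `value` exactly."""
    return [row for row in rows if row.get(column) == value]


# Example usage. The column name "label" is hypothetical; list the real
# column names first and substitute the one you need.
# rows = load_metadata("all_data.csv")
# print(sorted(rows[0].keys()))
# print(count_values(rows, "label"))
```

Listing the header first shows which columns hold the host labels and genome accessions, and select_rows can then be used to narrow the table to a subset of interest.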
You may also be interested in the trained neural network models, which predict the pathogenic potential of novel fungi from DNA sequences, and in the simulated Illumina read sets used to train and evaluate them.
You can classify your own reads with DeePaC, which we also used to train the models.
Please also have a look at the associated paper!