Oliver Schwengers

Microbial bioinformatics, WGS bacteria, plasmids, PostDoc @JLUGiessen, father of 2, husband, astrophotographer

Oliver Schwengers oschwengers@mstdn.science
2023-11-29

And for the sake of completeness, there's already a v1.9.1 patch release catching 2 minor bugs 😆
github.com/oschwengers/bakta/r
(7/7)

Oliver Schwengers oschwengers@mstdn.science
2023-11-29

We replaced HMMER with PyHMMER and updated Pyrodigal to v3.1. Furthermore, we bumped various dependencies to their most recent versions.
(6/7)

Oliver Schwengers oschwengers@mstdn.science
2023-11-29

We introduce auxiliary scripts for common downstream tasks, for example the extraction of annotations for certain subregions or the aggregation of annotation stats across multiple genomes. Ideas, contributions & PRs are highly welcome!
(5/7)
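
To illustrate the kind of downstream task mentioned above, here is a minimal sketch of pulling annotations for a subregion out of a GFF3 file. It is not one of Bakta's auxiliary scripts; the function name `features_in_region` and its parameters are hypothetical, and it only assumes the standard 9-column GFF3 layout with 1-based, inclusive coordinates.

```python
def features_in_region(gff3_lines, seq_id, start, end):
    """Yield GFF3 feature lines on seq_id overlapping [start, end] (1-based, inclusive).

    Hypothetical helper for illustration only, not part of Bakta.
    """
    for line in gff3_lines:
        # An embedded FASTA section ends the feature table.
        if line.startswith("##FASTA"):
            break
        # Skip blank lines, directives and comments.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[0] != seq_id:
            continue
        f_start, f_end = int(cols[3]), int(cols[4])
        # Two closed intervals overlap if each starts before the other ends.
        if f_start <= end and f_end >= start:
            yield line.rstrip("\n")
```

Passing an open file handle works as well as a list of lines, e.g. `list(features_in_region(open("genome.gff3"), "contig_1", 300, 950))`, where the file name and contig ID are placeholders.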

Oliver Schwengers oschwengers@mstdn.science
2023-11-29

Bakta now annotates and exports spacer and repeat sequences within CRISPR arrays.
(4/7)

Oliver Schwengers oschwengers@mstdn.science
2023-11-29

Currently, only the import of CDS coordinates is supported, but more might come later.

BTW, to additionally provide functional annotations of these CDS, you can provide related aa sequences with custom annotations via --proteins.
(3/7)

Oliver Schwengers oschwengers@mstdn.science
2023-11-29

We introduce a new --regions parameter supporting user-provided pre-annotated feature regions in GenBank/GFF3 format.
For example, CDS coordinates are imported, supersede ab initio-predicted CDS, and then are subject to the regular internal annotation workflow.
(2/7)

Oliver Schwengers oschwengers@mstdn.science
2023-11-29

🦠🧬💻 Just released Bakta 1.9.0 with new features & various improvements:

- new --regions option to provide pre-annotated feature regions
- annotation of spacer & repeat sequences in CRISPR arrays

github.com/oschwengers/bakta/r

More information below 👇 (1/7)

Oliver Schwengers oschwengers@mstdn.science
2023-05-31

We fixed some rare occasions of wrong 5' / 3' ("prime") characters in product descriptions causing issues in downstream analyses. (6/6)
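
As a rough illustration of this kind of cleanup, the sketch below normalizes look-alike characters after a 5 or 3 to a plain ASCII apostrophe. The post does not say which characters were affected or how Bakta handles them, so the character set and the function name `normalize_primes` are assumptions, not Bakta's actual fix.

```python
import re

# Assumed set of look-alikes: prime, right/left single quote, acute accent, backtick.
PRIME_LOOKALIKES = "\u2032\u2019\u2018\u00b4\u0060"

# Only touch a look-alike that directly follows a 5 or 3, as in 5' / 3'.
PRIME_PATTERN = re.compile(rf"(?<=[35])[{PRIME_LOOKALIKES}]")


def normalize_primes(product):
    """Replace prime look-alikes after 5 or 3 with an ASCII apostrophe (illustrative only)."""
    return PRIME_PATTERN.sub("'", product)
```

Restricting the substitution to positions after 5 or 3 leaves other typographic quotes in a description untouched.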

Oliver Schwengers oschwengers@mstdn.science
2023-05-31

Now "bakta_proteins" writes its full annotation results as a comprehensive JSON - just like the main workflow. (5/6)

Oliver Schwengers oschwengers@mstdn.science
2023-05-31

Compatibility with NCBI BankIt was improved:
- setting genome sequences' attributes "location" and "plasmid-name" (explicitly or auto-generated)
- removing strain designation from "organism"
(4/6)

Oliver Schwengers oschwengers@mstdn.science
2023-05-31

We also improved the --plasmid parameter, which now automatically sets input sequence attributes (complete, circular) for convenience in most cases.
This can be overridden via a replicon table (--replicons) (3/6)

Oliver Schwengers oschwengers@mstdn.science
2023-05-31

We introduced a new --force option that explicitly allows overwriting existing output directories (2/6)

Oliver Schwengers oschwengers@mstdn.science
2023-05-31

🦠🧬💻 Just released Bakta 1.8.0 with various improvements:

- new --force option & improved --plasmid parameter
- improved @NCBI #Bankit compatibility
- increased sensitivity for user protein sequences

github.com/oschwengers/bakta/r

More information below 👇 (1/6)

Oliver Schwengers oschwengers@mstdn.science
2023-03-13

🦠🚀FYI: Bakta is now available via @galaxyproject!

Thanks & kudos to @Pi_R_Marin@twitter.com and the #UseGalaxy team.

v1.6.1 is available for all Galaxy instances via IUC.
So, if you're interested in using Bakta in Galaxy, kindly ask your local admins 😉

Oliver Schwengers oschwengers@mstdn.science
2023-02-28

Tons of minor fixes and improvements have also been applied. (10/10)

Oliver Schwengers oschwengers@mstdn.science
2023-02-28

Of note, we updated Pyrodigal to the most recent v2.1.0, which fixes a bug in the SD motif detection on reverse contig edges (9/10)

Oliver Schwengers oschwengers@mstdn.science
2023-02-28

We introduced a "--meta" option to run gene prediction in metagenome mode. Of course, still only bacterial genome features and proteins will be annotated. (8/10)

Oliver Schwengers oschwengers@mstdn.science
2023-02-28

Several creation/curation steps for 3-letter gene symbols were also implemented for tRNAs/rRNAs/ncRNAs. (7/10)

Oliver Schwengers oschwengers@mstdn.science
2023-02-28

"TssL" is automatically extracted from protein description, added to a list of alternative gene symbols, and finally the best-matching gene symbol for each gene is selected in line with its neighbors:
tssH T4SS ... TssH
tssL T4SS ... TssL
tssK T4SS ... TssK
(6/10)
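
The curation idea in this post can be sketched roughly as follows. This is a simplified illustration, not Bakta's implementation: the regular expression, the function names, and the rule of voting on the shared 3-letter prefix across the operon are all assumptions made for the example.

```python
import re
from collections import Counter

# Assumed pattern: a product description ending in a symbol such as "TssL".
SYMBOL_IN_PRODUCT = re.compile(r"\b([A-Z][a-z]{2}[A-Z0-9])$")


def candidate_symbols(gene, product):
    """Return the current gene symbol plus any symbol derived from the product text."""
    candidates = [gene] if gene else []
    match = SYMBOL_IN_PRODUCT.search(product)
    if match:
        derived = match.group(1)
        derived = derived[0].lower() + derived[1:]  # TssL -> tssL
        if derived not in candidates:
            candidates.append(derived)
    return candidates


def curate_operon(cds_list):
    """For each (gene, product) pair, pick the candidate whose 3-letter prefix
    is most common among all candidates in the operon (hypothetical rule)."""
    all_candidates = [candidate_symbols(gene, product) for gene, product in cds_list]
    prefix_counts = Counter(c[:3] for cands in all_candidates for c in cands)
    # max() returns the first maximum, so the current symbol is kept on ties.
    return [
        max(cands, key=lambda c: prefix_counts[c[:3]]) if cands else None
        for cands in all_candidates
    ]


operon = [
    ("tssH", "T4SS ATPase TssH"),
    ("icmH", "T4SS protein TssL"),
    ("tssK", "T4SS baseplate subunit TssK"),
]
print(curate_operon(operon))  # ['tssH', 'tssL', 'tssK']
```

Here icmH gains the alternative tssL from its product description, and the tss prefix wins because it dominates among the neighboring genes.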

Oliver Schwengers oschwengers@mstdn.science
2023-02-28

We implemented several gene symbol curation steps for CDS operons.
For example:
tssH T4SS ATPase TssH
icmH T4SS protein TssL
tssK T4SS baseplate subunit TssK
... (5/10)
