Red-eying it to Boston to get some exciting projects started at Ginkgo this week. Lots of meetings and lots of design decisions to make in a very short amount of time.
Bioinformaticist interested in #Proteomics, #Genomics, and #DataScience. Currently building software for #SyntheticBiology at Ginkgo Bioworks.
Another periodic reminder that multiprocessing.cpu_count() will not give you the correct number of vCPUs on AWS Batch. It returns the CPU count of the host machine, even if your job can't use all the cores.
"An assembly is a hypothesis of the genome": something I try to keep in mind through all this.
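One way to get a more honest number is to ask the container's cgroup rather than the host. The sketch below assumes a Linux container with cgroup v2 mounted at /sys/fs/cgroup and a hard CPU quota written to cpu.max; whether a given AWS Batch setup sets that quota (rather than relative CPU shares) is an assumption, so treat this as a starting point. The helper names parse_cpu_max and usable_cpus are made up for illustration.

```python
import math
import os
from pathlib import Path


def parse_cpu_max(text):
    """Parse cgroup v2 cpu.max content: "<quota> <period>" or "max <period>".

    Returns the CPU limit rounded up to a whole core, or None if unlimited.
    """
    quota, period = text.split()
    if quota == "max":
        return None
    return max(1, math.ceil(int(quota) / int(period)))


def usable_cpus(cpu_max_path="/sys/fs/cgroup/cpu.max"):
    """Best-effort count of CPUs this process can actually use."""
    try:
        limit = parse_cpu_max(Path(cpu_max_path).read_text())
    except (OSError, ValueError):
        # File missing or unreadable (not cgroup v2, not Linux): no known limit.
        limit = None

    # CPUs the scheduler will let this process run on; Linux only.
    if hasattr(os, "sched_getaffinity"):
        visible = len(os.sched_getaffinity(0))
    else:
        visible = os.cpu_count() or 1

    return min(limit, visible) if limit else visible
```

If no quota is found, this falls back to the scheduler affinity mask and then to os.cpu_count(), so it never reports more cores than the naive call would. When the job definition already states the vCPU count, passing that number to the script explicitly is probably the most reliable option.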
Cochrane Reviews has issued an editor's statement about the mask-wearing paper that has been getting so much attention lately.
Below, the statement, in which they both endeavor to clarify the implications of the study and take responsibility for the poor initial job of public communication.
I understand that micromamba is supposed to be faster than conda. But I didn't know it was SO much faster.
We can and still do enforce schemas though. We've just moved this logic out of the main database and the API that serves it.
My 300th blog post where I write about customising BLAST output https://davetang.org/muse/2023/02/15/til-that-you-can-customise-blast-output/
An interesting conundrum in dealing with data at a company with so many different types of biological teams and experiments is satisfying domain-specific needs with very general database infrastructure.
We have a relational database and the models, like Sample, have a well-defined meaning. But if a team wants a Tissue Sample, now I have to specify a set of properties to store for each Sample to make it a new "type", e.g. organ.
Generality is awesome but sometimes kinda messy.
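To make the "general model plus typed properties" idea concrete, here is a minimal sketch. It is not the actual Ginkgo data model: the Sample fields, the SAMPLE_TYPE_SCHEMAS registry, and the validate helper are all hypothetical, and the only property taken from the posts is organ for a tissue sample. The point is that the generic record stays the same while the per-type rules live in a separate layer.

```python
from dataclasses import dataclass, field

# Hypothetical registry: sample type -> required property names and their types.
SAMPLE_TYPE_SCHEMAS = {
    "tissue": {"organ": str},
}


@dataclass
class Sample:
    """A deliberately general record; domain detail goes in `properties`."""
    name: str
    sample_type: str = "generic"
    properties: dict = field(default_factory=dict)


def validate(sample):
    """Return a list of schema violations for this sample (empty if valid)."""
    schema = SAMPLE_TYPE_SCHEMAS.get(sample.sample_type, {})
    errors = []
    for key, expected_type in schema.items():
        if key not in sample.properties:
            errors.append(f"missing property: {key}")
        elif not isinstance(sample.properties[key], expected_type):
            errors.append(f"{key} should be {expected_type.__name__}")
    return errors
```

For example, validate(Sample("s1", "tissue", {"organ": "liver"})) returns an empty list, while a tissue sample with no organ is flagged. Adding a new "type" then means adding an entry to the registry rather than a new table.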
New preprint, by Victor Rossier with the group of Christophe Dessimoz (#UNIL and #SIB), introducing #Matreex, a new dynamic tool to scale-up the visualisation of gene families, and its application to showing loss of intraflagellar transport in a myxozoan
https://www.biorxiv.org/content/10.1101/2023.02.18.529053v1
#phylogeny #phylogenomics #bioinformatics #BigData #visualization #vizbi #myxozoan @dee_unil
1/thread
🦠I'm happy to present a new nf-core pipeline for people interested in functional analysis of #MicrobialGenomes / #Metagenomes / #Microbiomes !
If you're interested in mining metagenomes for functional groups such as #AntiMicrobialResistance genes, #AntiMicrobialPeptides or #BiosyntheticGeneClusters, @nf_core /funcscan automates the screening of such sequences from input contigs or genomes in a highly parallelised, portable, and reproducible manner
https://genomic.social/@nf_core@mstdn.science/109868600236675816 🧵 [1/2]
What a nice profile of @reneegeck in @AJHGNews! [twitter handles]
Sequencing is fun but gosh dang do I miss mass spectrometry. Maybe it's just how up close and personal you get with the raw data, but I do love the feeling of analytical potential that is in spectra.
Living on the opposite coast from my company certainly has its annoyances. Got invited to a discussion about using proteomics for one of our current projects... at 6 am my time...
1) The topic of data storage and longevity in research labs came up recently, so I wanted to throw my $0.02 into how I try to manage things.
Active Projects: This is work that's going on now. We're still poking it with a stick and need it on hand. All working data is *copied* from an original data source. Never, ever work with the original data directories from long-term storage. A single typo in an rm command can wreck it all. For that reason, all active projects get a folder and a Git
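The copy-first habit for active projects can be sketched in a few lines. This assumes a layout where each project keeps its working data under a data/ subfolder; that layout and the start_working_copy name are illustrative choices, not something stated in the thread.

```python
import shutil
from pathlib import Path


def start_working_copy(source, project_dir):
    """Copy a dataset from long-term storage into an active project folder.

    The original directory is only read, never modified or moved.
    """
    source = Path(source)
    dest = Path(project_dir) / "data" / source.name
    if dest.exists():
        # Refuse to overwrite an existing working copy by accident.
        raise FileExistsError(f"working copy already exists: {dest}")
    dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.copytree(source, dest)
    return dest
```

All later analysis, including any cleanup with rm, then points at the returned path inside the project folder, so a typo can only damage the copy.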
Sent off my first review for JOSS today, and it was a fairly pleasant and familiar-feeling process. I liked that reviewers are encouraged to open issues directly on the submitted repository. Feels like exactly how scientific software should be reviewed.
@PhilippBayer haha I have no idea why my eyes just zeroed in on that one K. Your explanation makes total sense though.
@PhilippBayer How big of a subset is that? Is there a way to ban characters in the output instead?
@PhilippBayer What does K represent in this case?
Short blog post on stopping BLAST from phoning home https://davetang.org/muse/2023/02/08/stop-blast-from-phoning-home/