Lmst

#BioPerl

Here's a #BioPerl blast from the past @hyphaltip - https://github.com/OBF/wp-content/blob/gh-pages/uploads/2006/01/bioperl_graffiti.jpg

(Spotted while working with @gedankenstuecke et al. on the trial @OpenBio website migration from WordPress to Hugo on GitHub Pages)

Wrapping #hmmer (http://hmmer.org/) made me appreciate AUTOLOAD in #perl @Perl
The actual code I had to write was minimal, i.e. about 23 lines in the pm file and ~85 in the alienfile, but it ended up "containerizing" (inside perlbrew) all 41 programs of the HMMER and EASEL suites #bioinformatics
#Github repo:
https://github.com/chrisarg/alien-seqalignment-hmmer3
#cpan:
https://metacpan.org/pod/Alien::SeqAlignment::hmmer3
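
For anyone curious how AUTOLOAD can keep such a wrapper short, here is a minimal sketch of the general idea. It is not the code from Alien::SeqAlignment::hmmer3: the package name `My::ToolWrapper`, the install prefix in `bin_dir`, and the four-program list are all hypothetical stand-ins. One AUTOLOAD sub answers for every program name, so no per-program method has to be written.

```perl
package My::ToolWrapper;    # hypothetical name, not the actual Alien module
use strict;
use warnings;

our $AUTOLOAD;

# Hypothetical subset of wrapped programs (the real suites ship many more).
my %is_tool = map { $_ => 1 } qw(hmmbuild hmmsearch hmmscan esl-sfetch);

# Hypothetical install prefix, assumed only for this illustration.
sub bin_dir { '/opt/hmmer/bin' }

# Called by Perl whenever an undefined method is invoked on this package.
sub AUTOLOAD {
    my $class = shift;
    ( my $name = $AUTOLOAD ) =~ s/.*:://;    # strip the package qualifier
    return if $name eq 'DESTROY';            # never autoload destructors
    ( my $tool = $name ) =~ s/_/-/g;         # esl_sfetch -> esl-sfetch
    die "Unknown program '$name'\n" unless $is_tool{$tool};
    return join '/', $class->bin_dir, $tool;
}

package main;

print My::ToolWrapper->hmmsearch,  "\n";    # /opt/hmmer/bin/hmmsearch
print My::ToolWrapper->esl_sfetch, "\n";    # /opt/hmmer/bin/esl-sfetch
```

Each method call returns the path to the matching executable, which could then be handed to `system` or a pipe open. Unknown names die instead of silently returning a bogus path.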

Relevant #bioperl modules:
https://metacpan.org/pod/Bio::Tools::Run::Hmmer
https://metacpan.org/pod/Bio::Index::Hmmer
Great start for building one's own programs.

The beauty, succinctness & speed of #bioperl #perl
(creating and accessing an index of 191,106 sequences, ~275MB of biological (human #cDNA and #ncRNA) sequence data)

5 sec to create the index (using BerkeleyDB), and 12 sec to traverse the sequence data
#bioinformatics @Perl

(Code in alt text of the left image edited to fit the character limit)

    use strict;
    use warnings;
    use LWP::Simple;
    use FindBin qw($Bin);
    use File::Basename;
    use File::Spec;
    use Bio::DB::Fasta;
    use Memory::Usage;

    my $download_dir = File::Spec->catfile( $Bin, 'fastaloc' );
    mkdir $download_dir unless -e $download_dir;

    my @fasta_files = qw(
      https://ftp.ensembl.org/pub/release-110/fasta/homo_sapiens/ncrna/Homo_sapiens.GRCh38.ncrna.fa.gz
      https://ftp.ensembl.org/pub/release-110/fasta/homo_sapiens/cds/Homo_sapiens.GRCh38.cds.all.fa.gz
    );
    my @index_files;

    for my $dataset (@fasta_files) {
        # download the files into $download_dir and unpack them
        my $dataset_fname             = basename($dataset);
        my $uncompressed_dataset_name = $dataset_fname =~ s/\.gz$//r;
        $dataset_fname = File::Spec->catfile( $download_dir, $dataset_fname );
        $uncompressed_dataset_name =
          File::Spec->catfile( $download_dir, $uncompressed_dataset_name );

        unless ( -e $dataset_fname || -e $uncompressed_dataset_name ) {
            my $rc = getstore( $dataset, $dataset_fname );
            if ( is_error($rc) ) {
                warn "getstore of <$dataset> failed with $rc";
                next;
            }
        }
        system 'gzip', '-d', $dataset_fname
          unless -e $uncompressed_dataset_name;
        push @index_files, $uncompressed_dataset_name;
    }

    my $mu = Memory::Usage->new();
    $mu->record('Before db');
    my $db = Bio::DB::Fasta->new($download_dir);    # builds or reuses the index
    $mu->record('After db');
    my $stream = $db->get_PrimarySeq_stream;
    $mu->record('After stream');

    # walk over every indexed sequence
    while ( my $seq = $stream->next_seq ) {
        my $sequence = $seq->seq();
    }
    $mu->record('Acc seq');
    $mu->dump();
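
Why is indexed access this quick? A toy illustration of the underlying idea, using only core Perl: record the byte offset of each FASTA header once, then `seek` straight to a record on lookup. This is a sketch under my own assumptions and not how Bio::DB::Fasta is actually implemented. `build_index` and `fetch_seq` are hypothetical helper names, and the index lives in a plain in-memory hash rather than an on-disk DBM file.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Build a byte-offset index: sequence ID => position of its header line.
sub build_index {
    my ($path) = @_;
    my %offset;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    while (1) {
        my $pos  = tell $fh;      # offset before reading the next line
        my $line = <$fh>;
        last unless defined $line;
        $offset{$1} = $pos if $line =~ /^>(\S+)/;
    }
    close $fh;
    return \%offset;
}

# Jump straight to one record and return its sequence as a single string.
sub fetch_seq {
    my ( $path, $index, $id ) = @_;
    return unless exists $index->{$id};
    open my $fh, '<', $path or die "Cannot open $path: $!";
    seek $fh, $index->{$id}, 0;
    my $header = <$fh>;           # skip the header line
    my $seq    = '';
    while ( my $line = <$fh> ) {
        last if $line =~ /^>/;    # next record starts here
        chomp $line;
        $seq .= $line;
    }
    close $fh;
    return $seq;
}

# Tiny made-up FASTA file, just for the demonstration.
my ( $tmp, $fasta ) = tempfile( SUFFIX => '.fa', UNLINK => 1 );
print {$tmp} ">seq1 first\nACGT\nACGT\n>seq2 second\nGGCC\n";
close $tmp;

my $index = build_index($fasta);
print fetch_seq( $fasta, $index, 'seq2' ), "\n";    # GGCC
```

The file is scanned once to build the index. After that, each lookup costs one `seek` plus reading the record itself, regardless of where the record sits in the file.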