
Here's a new post on my first encounter with building a simple deep learning model on manually-compiled adverse drug reactions data (thanks to @baoilleach for feedback) - https://jhylin.github.io/Data_in_life_blog/posts/22_Simple_dnn_adrs/2_ADR_regressor.html

Notes re. data - https://jhylin.github.io/Data_in_life_blog/posts/22_Simple_dnn_adrs/1_ADR_data.html

#PyTorch #RDKit #ChEMBL #embeddings #cheminformatics

At the ChEBI 2.0 workshop, Muhammad Arsalan is presenting how ChEBI is using the Bioregistry to standardize its cross-references, generate URLs on their front-end, and more

#chembl #ebi #chebi #sssom #cheminformatics

A schematic diagram of how database cross-references in ChEBI are standardized using the Bioregistry

An update to an older post on first saving a relatively large CSV file (though some may not consider it large) as a Parquet file. Three shorter posts will follow later, detailing the use of Polars with scikit-learn without using Pandas at all.

https://jhylin.github.io/Data_in_life_blog/posts/21_ML1-1-1_Small_mols_in_chembl_update/ML1-1-1_chembl_cpds_parquet_new.html

#Scikit_Learn #Polars #parquet #Python #ChEMBL #Cheminformatics

Here are some snapshots from the #ChEMBL symposium! Dr. Samantha attended & delivered a wonderful talk about the #SemanticWeb! You can find the slides right here ➡️ https://zenodo.org/records/13882075

#PSDI #event #semantics #semanticweb

Don't forget to catch Dr. Samantha's talk about the topic: The Semantic Web is dead, Long live the Semantic Web - The future of Semantics in the Physical Sciences at 14:00 BST at the @chembl symposium. Join virtually using the link in the agenda: https://t.ly/H20xH

#PSDI #CheMBL #event #talk #semantics2024

Dr. Samantha Pearman-Kanza is speaking at the @chembl 15 year symposium!

In 2024 European Bioinformatics Institute | @emblebi celebrated the 15th anniversary of the first public release of the #ChEMBL database as well as the 10th anniversary of #SureChEMBL.

oh, that's new for me (well, I haven't been using Google a lot lately)... when searching for #ChEMBL and #RDF it actually first lists a few datasets.

Did anyone else see this too? I have it in Chrome/Brave but not in Firefox nor Falkon. Is this because Google knows I like facts (and likely data)?

Page with Google Search results, showing three datasets before the first general web page. The three datasets are the ChEMBL website (okay, mixed one), but then my DataVerse archive of the v13.5 ChEMBL-RDF, and the third one is the metadata of that propagated into a German index.

sometime in the coming months I am going to try to repeat this: https://github.com/egonw/chembl.rdf #chembl #cheminformatics #rdf

I have finally grown more trees leading to this new post on boosted trees - re. chaining Scikit-mol's transformers along with AdaBoost and XGBoost via Scikit_learn's interface and pipelines
https://jhylin.github.io/Data_in_life_blog/posts/19_ML2-3_Boosted_trees/1_adaboost_xgb.html
#cheminformatics #chembl #rdkit #python #ml #xgboost #adaboost #sklearn #scikit_mol

Blog post on "Every ChEMBL everywhere, all at once"
https://baoilleach.blogspot.com/2024/06/every-chembl-everywhere-all-at-once.html
#chembl #cheminformatics

In an attempt to complete the random forest (RF) series, here's another follow-up post on RF classifier with more on imbalanced dataset - https://jhylin.github.io/Data_in_life_blog/posts/17_ML2-2_Random_forest/2_random_forest_classifier.html

#cheminformatics #ml #rf #chembl #chembl_downloader #scikit_mol #rdkit #Scikit_Learn #ghostml #Python

A follow-up on the decision tree series leading to a random forest this time with details on model building, imbalanced dataset, feature importances & hyperparameter tuning - https://jhylin.github.io/Data_in_life_blog/posts/17_ML2-2_Random_forest/1_random_forest.html

Jupyter notebook link: https://github.com/jhylin/ML2-2_random_forest/blob/main/1_random_forest.ipynb

Post updated to show a different max_features used for regression task (thanks @dr_greg_landrum for pointing this out)

#ml #randomforest #scikitlearn #pandas #seaborn #matplotlib #python #cheminformatics #chembl #drugdiscovery

okay, the `curl` command is not correct yet (after shopping/dinner), but the "Run" and "Edit" links are now working for all SPARQL endpoints :) https://bigcat-um.github.io/PRA3006-SPARQL/wikipathways.html #wikidata #wikipathways #chembl #AOPWiki

Screenshot of part of the linked webpage, showing a syntax-highlighted SPARQL query and, just above it, two links: one for "run" (which executes the query against the SPARQL endpoint directly) and one for "edit" (which opens the query in the editor of the SPARQL endpoint, if available). At the bottom, we see a table with some of the query results.
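
For readers wondering how "Run" and "Edit" links and a `curl` command for a SPARQL endpoint can be put together, here is a minimal sketch. It is not the code behind the linked page. It assumes the public Wikidata endpoint and query editor URLs, and the function names (`run_link`, `edit_link`, `curl_command`) are made up for illustration. Other endpoints will need their own URLs and may not offer an editor at all.

```python
import shlex
from urllib.parse import quote, urlencode

# Assumption: the public Wikidata endpoint and its query editor.
WIKIDATA_ENDPOINT = "https://query.wikidata.org/sparql"
WIKIDATA_EDITOR = "https://query.wikidata.org/"


def run_link(query: str, endpoint: str = WIKIDATA_ENDPOINT) -> str:
    """Build a URL that executes the query with a GET request (query= parameter)."""
    return f"{endpoint}?{urlencode({'query': query})}"


def edit_link(query: str, editor: str = WIKIDATA_EDITOR) -> str:
    """Build a URL that opens the query in the editor (query goes in the URL fragment)."""
    return f"{editor}#{quote(query, safe='')}"


def curl_command(query: str, endpoint: str = WIKIDATA_ENDPOINT) -> str:
    """Build a shell-safe curl command; -G sends the URL-encoded data as a GET query string."""
    parts = [
        "curl", "-G", endpoint,
        "-H", "Accept: application/sparql-results+json",
        "--data-urlencode", f"query={query}",
    ]
    return " ".join(shlex.quote(part) for part in parts)


# A deliberately generic query, used only to show the generated strings.
example_query = "SELECT * WHERE { ?s ?p ?o } LIMIT 1"
print(run_link(example_query))
print(edit_link(example_query))
print(curl_command(example_query))
```

Letting `curl` do the URL encoding via `--data-urlencode` avoids escaping the query by hand, and `shlex.quote` keeps the braces and spaces safe when pasting the command into a shell.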

my lightning talk from the #RDKitUGM2023 is now on YouTube - all about making your work that uses datasets derived from ChEMBL more reproducible

📺 Video: https://youtu.be/PY-xaoRoSOY?list=PLugOo5eIVY3ExzpyKll6GGz4FRgBD2qzN&t=28

📜 Slides: https://bit.ly/cth-rdkit-ugm-2023

🤖 Code/Docs: https://github.com/cthoyt/chembl-downloader

Get started with: pip install chembl-downloader

#cheminformatics #chemoinformatics #chembl #pubchem

Shiny app in R - https://lnkd.in/gGwVYQ2r - this post walks through the process of making a simple Shiny app (without the help of any LLMs). #shiny #rstats #rladies #chembl #cheminformatics

Here are my instructions on how to download the #pubChem + #ChEMBL + #NCI60 database.

This goes directly into #neo4j and will only take an hour instead of the weeks my computer crunched to create the database

https://medium.com/p/d9ee9779dfbe

It's always a bit awkward for me to post the first official post on any new platform (my day 3 on 🐘), so here it is, my latest post from my humble portfolio blog on data science, drug discovery, pharmaceuticals and their related data - Re-training and re-evaluation of machine learning model with scikit-learn - ML series 3 on "Small molecules in ChEMBL database" - https://jhylin.github.io/Data_in_life_blog/posts/11_ML3_Small_molecules_in_ChEMBL_database/ML3_chembl_cpds.html #machinelearning #scikitlearn #chembl #cheminformatics #python #pyladies
