#chembl

Charles Tapley Hoyt cthoyt@scholar.social
2025-09-22

ChEMBL 36 is out! if you're using chembl-downloader for all of your ChEMBL needs, then you just have to re-run your reproducible workflows and get arbitrarily new and better results, you beautiful nerd

chembl.blogspot.com/2025/09/ch

#chembl #cheminformatics #reproducibility

Charles Tapley Hoyt cthoyt@scholar.social
2025-08-26

I used chembl-downloader to create some nice charts on how the number of compounds, assays, activities, and other entities in ChEMBL have grown over time

📖 cthoyt.com/2025/08/26/chembl-h

#chembl #chemistry #chemometrics #chemoinformatics #cheminformatics #rdkit #cdk #proteochemometrics

Egon Willighagen egonw@social.edu.nl
2025-08-09

new blog: "One Million IUPAC names #4: a lot is happening" chem-bla-ics.linkedchemistry.i

"A lot is happening. If you have been following this project more closely, you may have already seen some interesting updates, but I will post it here too."

replies to this post become blog comments.

#iupac #chemistry #openscience #chembl #beilstein

Charles Tapley Hoyt cthoyt@scholar.social
2025-07-25

Most cheminformatics code that queries ChEMBL struggles with reproducibility.

chembl-downloader can help:

>>> import chembl_downloader as cd
>>> df = cd.query("""
... SELECT chembl_id, pref_name
... FROM molecule_dictionary
... WHERE pref_name IS NOT NULL
... """)
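For readers without a local ChEMBL copy, here is a minimal sketch of what that SQL returns. It runs against an in-memory SQLite table that stands in for ChEMBL's molecule_dictionary. The three rows are a toy illustration (the third ID is made up), and nothing here calls chembl-downloader itself:

```python
import sqlite3

# Toy stand-in for ChEMBL's molecule_dictionary table (illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE molecule_dictionary (chembl_id TEXT, pref_name TEXT)")
conn.executemany(
    "INSERT INTO molecule_dictionary VALUES (?, ?)",
    [
        ("CHEMBL25", "ASPIRIN"),
        ("CHEMBL521", "IBUPROFEN"),
        ("CHEMBL_HYPOTHETICAL", None),  # made-up row with no preferred name
    ],
)

# Same query as in the post: keep only molecules that have a preferred name.
sql = """
SELECT chembl_id, pref_name
FROM molecule_dictionary
WHERE pref_name IS NOT NULL
"""
rows = conn.execute(sql).fetchall()
conn.close()
print(rows)
```

The row without a preferred name is filtered out, so only the named molecules come back.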

It's even sneaking its way into @wpwalters and @dr_greg_landrum blogs :)

Code/Docs: github.com/cthoyt/chembl-downl

Preprint: arxiv.org/pdf/2507.17783

#cheminformatics #chemoinformatics #chembl #reproducibility #chemistry #openscience

2025-01-09

Here's a new post on my first encounter with building a simple deep learning model on manually-compiled adverse drug reactions data (thanks to @baoilleach for feedback) - jhylin.github.io/Data_in_life_

Notes re. data - jhylin.github.io/Data_in_life_

#PyTorch #RDKit #ChEMBL #embeddings #cheminformatics

2024-11-19

At the ChEBI 2.0 workshop, Muhammad Arsalan is presenting how ChEBI is using the Bioregistry to standardize its cross-references, generate URLs on their front-end, and more

#chembl #ebi #chebi #sssom #cheminformatics

A schematic diagram of how database cross-references in ChEBI are standardized using the Bioregistry

2024-10-17

An update on an older post about saving a relatively large CSV file (although some may not consider it large) as a Parquet file first (to be followed by 3 smaller posts detailing the use of Polars with scikit-learn without using Pandas at all)

jhylin.github.io/Data_in_life_

#Scikit_Learn #Polars #parquet #Python #ChEMBL #Cheminformatics

2024-10-03

Here are some snapshots from the #ChEMBL symposium! Dr. Samantha attended & delivered a wonderful talk about the #SemanticWeb! You can find the slides right here ➡️ zenodo.org/records/13882075

#PSDI #event #semantics #semanticweb

2024-10-02

Don't forget to catch Dr. Samantha's talk about the topic: The Semantic Web is dead, Long live the Semantic Web - The future of Semantics in the Physical Sciences at 14:00 BST at the @chembl symposium. Join virtually using the link in the agenda: t.ly/H20xH

#PSDI #CheMBL #event #talk #semantics2024

2024-09-30

Dr. Samantha Pearman-Kanza is speaking at the @chembl 15 year symposium!

In 2024, the European Bioinformatics Institute | @emblebi celebrated the 15th anniversary of the first public release of the #ChEMBL database as well as the 10th anniversary of #SureChEMBL.

Egon Willighagen egonw@social.edu.nl
2024-08-16

oh, that's new for me (well, I haven't been using Google a lot lately)... when searching for #ChEMBL and #RDF it actually first lists a few datasets.

Did anyone else see this too? I have it in Chrome/Brave but not in Firefox nor Falkon. Is this because Google knows I like facts (and likely data)?

Page with Google Search results, showing three datasets before the first general web page. The three datasets are the ChEMBL website (okay, mixed one), but then my DataVerse archive of the v13.5 ChEMBL-RDF, and the third one is the metadata of that propagated into a German index.

Egon Willighagen egonw@social.edu.nl
2024-08-08

somewhere in the next months I am going to try to repeat this: github.com/egonw/chembl.rdf #chembl #cheminformatics #rdf

2024-06-06

I have finally grown more trees leading to this new post on boosted trees - re. chaining Scikit-mol's transformers along with AdaBoost and XGBoost via Scikit_learn's interface and pipelines
jhylin.github.io/Data_in_life_
#cheminformatics #chembl #rdkit #python #ml #xgboost #adaboost #sklearn #scikit_mol

2024-01-17

In an attempt to complete the random forest (RF) series, here's another follow-up post on the RF classifier with more on imbalanced datasets - jhylin.github.io/Data_in_life_

#cheminformatics #ml #rf #chembl #chembl_downloader #scikit_mol #rdkit #Scikit_Learn #ghostml #Python

2023-11-22

A follow-up on the decision tree series leading to a random forest this time with details on model building, imbalanced dataset, feature importances & hyperparameter tuning - jhylin.github.io/Data_in_life_

Jupyter notebook link: github.com/jhylin/ML2-2_random

Post updated to show a different max_features used for regression task (thanks @dr_greg_landrum for pointing this out)

#ml #randomforest #scikitlearn #pandas #seaborn #matplotlib #python #cheminformatics #chembl #drugdiscovery

Egon Willighagen egonw@social.edu.nl
2023-11-04

okay, the `curl` command is not correct yet (after shopping/dinner), but the "Run" and "Edit" links are now working for all SPARQL endpoints :) bigcat-um.github.io/PRA3006-SP #wikidata #wikipathways #chembl #AOPWiki

Screenshot of part of the linked webpage, showing a syntax-highlighted SPARQL query and just above that two links: one for "run" (which will execute the query with the SPARQL endpoint directly) and one for "edit" (which opens the query in the editor of the SPARQL endpoint, if available). At the bottom, we see a table with some of the query results.

Charles Tapley Hoyt cthoyt@scholar.social
2023-10-26

my lightning talk from the #RDKitUGM2023 is now on YouTube - all about making your work that uses datasets derived from ChEMBL more reproducible

📺 Video: youtu.be/PY-xaoRoSOY?list=PLug

📜 Slides: bit.ly/cth-rdkit-ugm-2023

🤖 Code/Docs: github.com/cthoyt/chembl-downl

Get started with: pip install chembl-downloader

#cheminformatics #chemoinformatics #chembl #pubchem

2023-04-08

Shiny app in R - lnkd.in/gGwVYQ2r - this post walks through the process of making a simple Shiny app (without the help of any LLMs). #shiny #rstats #rladies #chembl #cheminformatics

Tom Nijhof-Verhees wagenrace@tfwnogf.nl
2023-03-29

I have written up my instructions on how to download the #pubChem + #ChEMBL + #NCI60 databases.

This goes directly into #neo4j and will only take an hour instead of the weeks my computer crunched to create the database

medium.com/p/d9ee9779dfbe
