@BenjaminHan @natolambert is one of my favorites in the game right now. Thanks for sharing!
University of Toronto + Vector Institute + MIT Machine Learning PhD student interested in Reinforcement Learning, Decision Making under Uncertainty & Transfer Learning, motivated by challenges in Healthcare
Risk Sensitive Dead-end Identification in Safety-Critical Offline Reinforcement Learning
Taylor W. Killian, Sonali Parbhoo, Marzyeh Ghassemi
I also need to mention the #RLDM and #TMLR communities. Having venues such as these, that welcome cross-disciplinary work, is such a benefit to our ML research community.
Our paper can be seen at: https://openreview.net/forum?id=oKlEOT83gI
And our code (soon!): https://github.com/MLforHealth/DistDeD
(7/7)
There are lots of people to thank. Foremost, I wanted to publicly acknowledge those whose enthusiasm and encouragement helped push me to continue building upon these ideas. Thank you Mehdi Fatemi, Marc Bellemare, Will Dabney, Vinith Suriyakumar, and Haoran Zhang. (6/7)
The improvements we've made with DistDeD are exciting! This is one promising direction to make RL useful in the real world.
We are working with several clinical collaborators to determine how we can best use DistDeD. Lots to come in the near future, stay tuned! (5/7)
There are 2 immediate benefits of using CVaR and distributional RL in dead-end discovery:
1) We enable *even earlier indication* of when things may go wrong.
2) The implementation of DistDeD is tunable in order to account for specific aspects of the intended use case. (4/7)
We use distributional RL to give this rich representation of possible outcomes for each action and use the CVaR to assess the risk of any action leading to a dead-end. Extending our prior work on dead-end discovery (https://tinyurl.com/Neurips21DeD) we introduce DistDeD! (3/7)
In safety-critical environments, such as healthcare, it is important to be mindful of worst-case outcomes when assessing which actions to avoid. With an estimated distribution over expected return, we can use the conditional value at risk (CVaR) to characterize this. (2/7)
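For readers new to CVaR, here is a minimal sketch of the idea, assuming we already have a set of sampled returns for each action. The sample values, the `cvar` helper, and `alpha` are illustrative assumptions, not taken from the paper or the DistDeD code. CVaR at level alpha is approximated as the mean of the worst alpha-fraction of samples.

```python
import numpy as np

def cvar(samples, alpha):
    """Approximate lower-tail CVaR: mean of the worst alpha-fraction of samples."""
    ordered = np.sort(np.asarray(samples, dtype=float))
    # Keep at least one sample so the tail is never empty.
    k = max(1, int(np.ceil(alpha * ordered.size)))
    return ordered[:k].mean()

# Hypothetical return samples for two actions (made up for illustration).
safe = np.array([-0.2, -0.1, 0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7])
risky = np.array([-1.0, -1.0, -0.9, 0.6, 0.7, 0.8, 0.9, 1.0, 1.0, 1.0])

alpha = 0.2  # assumed risk level: look at the worst 20% of outcomes
print("mean:", safe.mean(), risky.mean())
print("CVaR:", cvar(safe, alpha), cvar(risky, alpha))
```

In this toy example the second action has the higher mean return but a much worse tail, which is the kind of difference a risk-sensitive criterion is meant to surface. Lowering `alpha` makes the assessment more conservative.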
🎉 Last week, amid the #ICML2023 rush, we were informed that our paper (w/ Sonali Parbhoo and Marzyeh Ghassemi): "Risk Sensitive Dead-end Identification in Safety-Critical Offline Reinforcement Learning" was accepted to #TMLR! 🎉 #ReinforcementLearning #Healthcare
While this may be somewhat of a faux pas, in about an hour @natolambert and I will be joining @talkrl in a Twitter Spaces discussion to chat about recent advances in the world of #ReinforcementLearning!
I hope that anyone who is interested and can afford a break from #ICML2023 writing feels free to join!
@mmakar this is possibly another reason I will choose not to be a professor. I couldn’t help but care what all of my students think.
I don't know what's more problematic, that #ICML2023 reviewer invites went out during a holiday, or that many of us still saw the email during the holiday.
@jonny we can only hope that grad students everywhere are more discerning than just blindly pulling these kinds of citations into their papers and that reviewers are willing to call out their BS.
As a (re-) #introduction (#hashtags): I am a senior ML researcher at Microsoft Research in Cambridge (UK, aka "Original Cambridge") and part of Health Futures.
I work in ML for #health, in particular on #EHR data (especially #ICU time series), and these days am thinking about #biomedical vision-language processing (#vlp) in #radiology.
@_hylandSL humble brag
I do want to follow up and share that the title of the paper (and the bibtex) seem to be completely made up. I.e., this is more wishcasting than actually viable.
The link provided to the pubmed article is unrelated.
In the Google vs. ChatGPT search debate, you.com just went ahead and built a really intriguing integration (youChat)
This is a great add-on that I have been testing a bit today while doing literature review alongside writing a draft of my PhD thesis.
While requiring some validation--don't believe everything on the internet--this has been a research upgrade for me (see attached pictures):
Working in pen and paper is a detangler for my brain.
Just a quick reminder of the many interesting talks / interviews on the "Talk RL" podcast: https://www.talkrl.com/
I am a scientist at Meta AI in NYC and study machine learning and optimization, recently involving reinforcement learning, control, optimal transport, and geometry. On social media, I enjoy finding and boosting interesting content from the original authors on these topics
I made this small animation with my recent project on optimal transport that connects continuous structures in the world. The source code to reproduce this and other examples is online at https://github.com/facebookresearch/w2ot