#MLLMs

2025-06-13

MATP-BENCH: Can MLLM be a good automated theorem prover for multimodal problems? ~ Zhitao He et al. arxiv.org/abs/2506.06034 #AI #MLLMs #Math #ITP #IsabelleHOL #LeanProver #CoqProver #AIforMath

Alex Jimenez @AlexJimenez@mas.to
2024-12-28

Collective Monte Carlo Tree Search (CoMCTS): A New Learning-to-Reason Method for Multimodal Large Language Models

marktechpost.com/2024/12/27/co

#LLMs #MLLMs #AI

Harald Klinke @HxxxKxxx@det.social
2024-02-07

If you would like to learn more about how it works: Guiding Instruction-based Image Editing via Multimodal Large Language Models. Check out the code repository for the ICLR'24 Spotlight paper by Tsu-Jui Fu, Wenze Hu, Xianzhi Du, William Yang Wang, Yinfei Yang, and Zhe Gan.
github.com/apple/ml-mgie
#ICLR24 #ImageEditing #MLLMs #AIResearch

A diagram illustrating the process of a multimodal large language model (MLLM) editing an image of a cabin in the woods to place it in a desert setting.
Gerego
2023-12-24

Apple has released Ferret, a new multimodal large language model (MLLM) that excels at both image understanding and language processing, with particularly strong results on understanding spatial references.

Paper: arxiv.org/abs/2310.07704
Github: github.com/apple/ml-ferret?tab

Source: threads.net/@luokai/post/C1OE1
