New blog post:
Isotonic regression, PAVA algorithm and a bit of optimization.
https://josephsalmon.eu/blog/isotonic/
Inspired by an old post by @fabian
The Machine Learning Research (MLR) team @ Apple has an open position for a research software engineer (RSWE) in Paris.
Please apply here:
https://jobs.apple.com/en-us/details/200517435/aiml-machine-learning-engineer-mlr?team=MLAI
RSWEs stand at the core of what we do @ MLR. If you have any questions on fit / research topics, please reach out to MLR_Paris_FTE@group.apple.com .
@ogrisel done!
Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond https://arxiv.org/abs/2304.13712v2
Finished formalizing in #Lean4 the proof of an actual new theorem (Theorem 1.3) in my recent paper https://arxiv.org/abs/2310.05328 : https://github.com/teorth/symmetric_project/blob/master/SymmetricProject/jensen.lean . The proof in my paper is under a page, but the formal proof uses 200 lines of Lean. For instance, in the paper I simply asserted that \( t \mapsto \log(e^t + a) \) was convex on the reals for any \( a > 0 \), as this was a routine calculus exercise, and then invoked Jensen's inequality; writing out all the details took about 50 lines of code. But it seems there are tools in development to automate such exercises.
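For a sense of what that step looks like, here is a minimal sketch of the convexity claim stated in Lean 4 with Mathlib; the phrasing is mine (not the project's code) and the proof body is elided with sorry rather than the ~50 lines mentioned above.

import Mathlib

-- Sketch only: the convexity fact asserted in the paper, stated via Mathlib's
-- ConvexOn; the actual ~50-line formal proof is elided here.
example (a : ℝ) (ha : 0 < a) :
    ConvexOn ℝ Set.univ (fun t : ℝ => Real.log (Real.exp t + a)) := by
  sorry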
The ability of Github copilot to correctly anticipate multiple lines of code for various routine verifications, and inferring the direction I want to go in from clues such as the names I am giving the theorems, continues to be uncanny.
Lean's "rewrite" tactic of being able to modify a lengthy hypothesis or goal by making a targeted substitution is indispensable, allowing one to manipulate such expressions without having to always type them out in full. When writing out proofs in LaTeX, I often crudely simulated such a tactic by cutting-and-pasting the lengthy expression I was manipulating from one line to the next and making targeted edits, but this sometimes led to typos propagating themselves for multiple lines in the document, so it is nice to have this rewriting done in an automated and verified fashion. But the current tools still have some limitations; for instance, rewriting expressions that involve bound variables (e.g., the summation variable in a series) is not always easy to accomplish with current tactics. Looking forward to when one can simply ask an LLM in natural language to make such transformations...
This quote by Carl Sagan hangs in my office. #science
📢 New blog post 📢 Stochastic Polyak step-size (SPS): A simple step-size tuner with optimal rates.
This covers a very short and simple proof of convergence rates for SPS developed by Garrigos et al. Unlike other proofs, it doesn't depend on interpolation, nor does the variance at the optimum enter into the rates. Simple, short, beautiful #optimization.
Read more: https://fa.bianp.net/blog/2023/sps/
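For reference, the basic SPS update for minimizing \( \frac{1}{n}\sum_i f_i(x) \) reads as follows (my notation; the post may use a scaled or capped variant):
\[
x_{k+1} = x_k - \gamma_k \nabla f_{i_k}(x_k), \qquad \gamma_k = \frac{f_{i_k}(x_k) - f_{i_k}^*}{\|\nabla f_{i_k}(x_k)\|^2},
\]
where \( i_k \) is sampled uniformly at random and \( f_{i_k}^* = \inf_x f_{i_k}(x) \).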
Apple MLR has a few intern positions in Paris. They can start pretty much anytime now, and last for up to a year (provided they finish before Sept. 24).
You must be a PhD student and have published in ML (e.g. NeurIPS/ICML/ICLR/AISTATS etc...). Topics of interest include differentiable optimization, generative models and uncertainty quantification.
Please reach out by email to MLR_Paris_Internships@group.apple.com , listing your CV + GitHub + whatever relevant details, if you are interested!
We propose a new family of probability densities that have closed-form normalizing constants. Our densities use two-layer neural networks as parameters, and strictly generalize exponential families. We show that the squared norm of the network can be integrated in closed form, yielding the normalizing constant. We call these densities Squared Neural Families (#SNEFY); they are closed under conditioning.
Accepted at #NeurIPS2023. #MachineLearning #Bayesian #GaussianProcess
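A rough sketch of the construction as I read it (notation mine; details in the paper): with a two-layer network \( f(x) = V\sigma(Wx + b) \) and base measure \( \mu \), the density is
\[
p(x) = \frac{\|V\sigma(Wx+b)\|^2}{Z}\,\mu(x), \qquad Z = \int \|V\sigma(Wx+b)\|^2 \, d\mu(x) = \operatorname{tr}\big(V^\top V K\big),
\]
where \( K = \int \sigma(Wx+b)\,\sigma(Wx+b)^\top d\mu(x) \) admits a closed form for suitable activation/base-measure pairs, which is what makes \( Z \) tractable.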
scikit-learn 1.3.1 is out!
This release fixes a bunch of annoying bugs. Here is the changelog:
https://scikit-learn.org/stable/whats_new/v1.3.html#version-1-3-1
Thanks very much to all bug reporters, PR authors and reviewers, and thanks in particular to @glemaitre, the release manager of 1.3.1.
A surprising property, mentioned to me by Robert Gower, of the stochastic Polyak step-size variant below: the iterate error is monotonically decreasing under just convexity.
Note that the decrease is _not_ just in expectation! What other stochastic methods have this property? Can't think of any!
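A short sketch of why this holds (my derivation, assuming the variant anchored at \( f_i(x^*) \) with the step size clipped at zero): writing \( g = \nabla f_{i}(x_k) \) and \( \gamma_k = \max(f_{i}(x_k) - f_{i}(x^*),\, 0)/\|g\|^2 \), convexity gives \( \langle g, x_k - x^* \rangle \ge f_{i}(x_k) - f_{i}(x^*) \), hence
\[
\|x_{k+1} - x^*\|^2 = \|x_k - x^*\|^2 - 2\gamma_k \langle g, x_k - x^* \rangle + \gamma_k^2 \|g\|^2 \le \|x_k - x^*\|^2 - \frac{\max(f_i(x_k) - f_i(x^*),\, 0)^2}{\|g\|^2},
\]
and the bound holds for every sampled \( i \), not just on average.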
I'm at #ECMLPKDD2023:
Presenting https://link.springer.com/article/10.1007/s10994-022-06277-7 in room A9i today at 2pm
Exciting work on embeddings in databases
Tomorrow, in the Causal Machine Learning for Operational Decision Making workshop, I'll be giving a keynote on various results on individualizing treatment effects: how to select models, how to choose covariates, and which summary statistics to use
https://upliftworkshop.ipipan.waw.pl
Nice paper that presents the Quantum Path Kernel (QPK), a new quantum machine learning approach that mirrors deep learning's hierarchy and generalization in the classical domain. The method's efficacy is showcased on Gaussian XOR mixture classification.
📝 M. Incudini, M. Grossi, A. Mandarino, S. Vallecorsa, A. D. Pierro and D. Windridge, "The Quantum Path Kernel: A Generalized Neural Tangent Kernel for Deep Quantum Machine Learning," in IEEE Transactions on Quantum Engineering, vol. 4, pp. 1-16, 2023, Art no. 3101616, doi: 10.1109/TQE.2023.3287736.
🔗 https://ieeexplore.ieee.org/document/10158043
🏷️ #quantumComputing #machineLearning
@marcobuiatti @vborghesani Thanks Marco!
@vlad @vborghesani thanks Vlad!
@vborghesani My heart melts every time I look at her
Hello world, I'm Lea! Even though I've been around for just one week, I've filled the hearts of these two with awe & love. So don't fret about their silence on all platforms: I'm keeping them busy.
https://arxiv.org/abs/2302.10081
'Improved dimension dependence of a proximal algorithm for sampling'
- Jiaojiao Fan, Bo Yuan, Yongxin Chen
The 'proximal sampler' is an MCMC algorithm which is particularly compelling from the point of view of complexity analysis, and of increasing practical relevance. This work presents a modification of the ideal algorithm which i) is essentially implementable, ii) is inexact, and iii) mixes well in terms of both dimension and error tolerance.
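For context, my rough summary of the (ideal) proximal sampler: to sample from \( \pi(x) \propto e^{-f(x)} \), run Gibbs sampling on the augmented target
\[
\pi(x, y) \propto \exp\Big( -f(x) - \tfrac{1}{2\eta}\|x - y\|^2 \Big),
\]
alternating the Gaussian step \( y \mid x \sim \mathcal{N}(x, \eta I) \) with a draw from \( \pi(x \mid y) \propto \exp\big(-f(x) - \tfrac{1}{2\eta}\|x - y\|^2\big) \), the so-called restricted Gaussian oracle; the contribution here concerns carrying out that second step inexactly while retaining good dependence on dimension and error tolerance.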
Patches Are All You Need?
Beautiful paper on the importance of being precise when talking about large language models:
Shanahan, Murray. "Talking About Large Language Models."
https://arxiv.org/abs/2212.03551