New blog post:
Isotonic regression, PAVA algorithm and a bit of optimization.
https://josephsalmon.eu/blog/isotonic/
Inspired by an old post by @fabian
The Machine Learning Research (MLR) team @ Apple has an open position for a research software engineer (RSWE) in Paris.
Please apply here:
https://jobs.apple.com/en-us/details/200517435/aiml-machine-learning-engineer-mlr?team=MLAI
RSWEs stand at the core of what we do @ MLR. If you have any questions on fit / research topics, please reach out to MLR_Paris_FTE@group.apple.com .
@ogrisel done!
Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond https://arxiv.org/abs/2304.13712v2
Finished formalizing in #Lean4 the proof of an actual new theorem (Theorem 1.3) in my recent paper https://arxiv.org/abs/2310.05328 : https://github.com/teorth/symmetric_project/blob/master/SymmetricProject/jensen.lean . The proof in my paper is under a page, but the formal proof uses 200 lines of Lean. For instance, in the paper I simply asserted that \( t \mapsto \log(e^t + a) \) was convex on the reals for any \( a > 0 \), as this was a routine calculus exercise, and then invoked Jensen's inequality; writing out all the details took about 50 lines of code. But it seems there are tools in development to automate such exercises.
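For a sense of what that step looks like, here is a minimal sketch of the convexity claim stated in Lean 4 with Mathlib; the phrasing is mine (not the project's code) and the proof body is elided with sorry rather than the ~50 lines mentioned above.

import Mathlib

-- Sketch only: the convexity fact asserted in the paper, stated via Mathlib's
-- ConvexOn; the actual ~50-line formal proof is elided here.
example (a : ℝ) (ha : 0 < a) :
    ConvexOn ℝ Set.univ (fun t : ℝ => Real.log (Real.exp t + a)) := by
  sorry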
The ability of Github copilot to correctly anticipate multiple lines of code for various routine verifications, and inferring the direction I want to go in from clues such as the names I am giving the theorems, continues to be uncanny.
Lean's "rewrite" tactic of being able to modify a lengthy hypothesis or goal by making a targeted substitution is indispensable, allowing one to manipulate such expressions without having to always type them out in full. When writing out proofs in LaTeX, I often crudely simulated such a tactic by cutting-and-pasting the lengthy expression I was manipulating from one line to the next and making targeted edits, but this sometimes led to typos propagating themselves for multiple lines in the document, so it is nice to have this rewriting done in an automated and verified fashion. But the current tools still have some limitations; for instance, rewriting expressions that involve bound variables (e.g., the summation variable in a series) is not always easy to accomplish with current tactics. Looking forward to when one can simply ask an LLM in natural language to make such transformations...
This quote by Carl Sagan hangs in my office. #science
📢 New blog post 📢 Stochastic Polyak step-size (SPS): A simple step-size tuner with optimal rates.
This covers a very short and simple proof of convergence rates for SPS developed by Garrigos et al. Unlike other proofs, it doesn't depend on interpolation, nor does the variance at the optimum enter into the rates. Simple, short, beautiful #optimization.
Read more: https://fa.bianp.net/blog/2023/sps/
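For reference, the basic SPS update for minimizing \( \frac{1}{n}\sum_i f_i(x) \) reads as follows (my notation; the post may use a scaled or capped variant):
\[
x_{k+1} = x_k - \gamma_k \nabla f_{i_k}(x_k), \qquad \gamma_k = \frac{f_{i_k}(x_k) - f_{i_k}^*}{\|\nabla f_{i_k}(x_k)\|^2},
\]
where \( i_k \) is sampled uniformly at random and \( f_{i_k}^* = \inf_x f_{i_k}(x) \).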
Apple MLR has a few intern positions in Paris. They can start pretty much anytime now, and last for up to a year (provided they finish before Sept. 24).
You must be a PhD student and have published in ML (e.g. NeurIPS/ICML/ICLR/AISTATS etc...). Topics of interest include differentiable optimization, generative models and uncertainty quantification.
Please reach out by email to MLR_Paris_Internships@group.apple.com , listing your CV + GitHub + whatever relevant details, if you are interested!
We propose a new family of probability densities that have closed-form normalizing constants. Our densities use two-layer neural networks as parameters, and strictly generalize exponential families. We show that the squared norm of the network can be integrated in closed form, yielding the normalizing constant. We call these densities Squared Neural Families (#SNEFY); they are closed under conditioning.
Accepted at #NeurIPS2023. #MachineLearning #Bayesian #GaussianProcess
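A rough sketch of the construction as I read it (notation mine; details in the paper): with a two-layer network \( f(x) = V\sigma(Wx + b) \) and base measure \( \mu \), the density is
\[
p(x) = \frac{\|V\sigma(Wx+b)\|^2}{Z}\,\mu(x), \qquad Z = \int \|V\sigma(Wx+b)\|^2 \, d\mu(x) = \operatorname{tr}\big(V^\top V K\big),
\]
where \( K = \int \sigma(Wx+b)\,\sigma(Wx+b)^\top d\mu(x) \) admits a closed form for suitable activation/base-measure pairs, which is what makes \( Z \) tractable.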
scikit-learn 1.3.1 is out!
This release fixes a bunch of annoying bugs. Here is the changelog:
https://scikit-learn.org/stable/whats_new/v1.3.html#version-1-3-1
Thanks very much to all bug reporters, PR authors and reviewers, and thanks in particular to @glemaitre, the release manager of 1.3.1.
A surprising property, mentioned to me by Robert Gower, of the stochastic Polyak step-size variant below: the iterate error is monotonically decreasing under just convexity.
Note that the decrease is _not_ just in expectation! What other stochastic methods have this property? Can't think of any!
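A short sketch of why this holds (my derivation, assuming the variant anchored at \( f_i(x^*) \) with the step size clipped at zero): writing \( g = \nabla f_{i}(x_k) \) and \( \gamma_k = \max(f_{i}(x_k) - f_{i}(x^*),\, 0)/\|g\|^2 \), convexity gives \( \langle g, x_k - x^* \rangle \ge f_{i}(x_k) - f_{i}(x^*) \), hence
\[
\|x_{k+1} - x^*\|^2 = \|x_k - x^*\|^2 - 2\gamma_k \langle g, x_k - x^* \rangle + \gamma_k^2 \|g\|^2 \le \|x_k - x^*\|^2 - \frac{\max(f_i(x_k) - f_i(x^*),\, 0)^2}{\|g\|^2},
\]
and the bound holds for every sampled \( i \), not just on average.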
I'm at #ECMLPKDD2023:
Presenting https://link.springer.com/article/10.1007/s10994-022-06277-7 in room A9i today at 2pm
Exciting work on embeddings in databases
Tomorrow, in the Causal Machine Learning for Operational Decision Making workshop, I'll be giving a keynote on various results on individualizing treatment effects: how to select models, how to choose covariates, and which summary statistics to use
https://upliftworkshop.ipipan.waw.pl
Nice paper that presents the Quantum Path Kernel (QPK), a new quantum machine learning approach that mirrors deep learning's hierarchy and generalization in the classical domain. The method's efficacy is showcased on Gaussian XOR mixture classification.
📝 M. Incudini, M. Grossi, A. Mandarino, S. Vallecorsa, A. D. Pierro and D. Windridge, "The Quantum Path Kernel: A Generalized Neural Tangent Kernel for Deep Quantum Machine Learning," in IEEE Transactions on Quantum Engineering, vol. 4, pp. 1-16, 2023, Art no. 3101616, doi: 10.1109/TQE.2023.3287736.
🔗 https://ieeexplore.ieee.org/document/10158043
🏷️ #quantumComputing #machineLearning
@marcobuiatti @vborghesani Thanks Marco!
@vlad @vborghesani thanks Vlad!
@vborghesani My heart melts every time I look at her
Hello world, I'm Lea! Even though I've been around for just one week, I've filled the hearts of these two with awe & love. So don't fret about their silence on all platforms: I'm keeping them busy.
https://arxiv.org/abs/2302.10081
'Improved dimension dependence of a proximal algorithm for sampling'
- Jiaojiao Fan, Bo Yuan, Yongxin Chen
The 'proximal sampler' is an MCMC algorithm which is particularly compelling from the point of view of complexity analysis, and of increasing practical relevance. This work presents a modification of the ideal algorithm which i) is essentially implementable, ii) is inexact, and iii) mixes well in terms of both dimension and error tolerance.
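For context, my rough summary of the (ideal) proximal sampler: to sample from \( \pi(x) \propto e^{-f(x)} \), run Gibbs sampling on the augmented target
\[
\pi(x, y) \propto \exp\Big( -f(x) - \tfrac{1}{2\eta}\|x - y\|^2 \Big),
\]
alternating the Gaussian step \( y \mid x \sim \mathcal{N}(x, \eta I) \) with a draw from \( \pi(x \mid y) \propto \exp\big(-f(x) - \tfrac{1}{2\eta}\|x - y\|^2\big) \), the so-called restricted Gaussian oracle; the contribution here concerns carrying out that second step inexactly while retaining good dependence on dimension and error tolerance.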
Patches Are All You Need?
Beautiful paper on the importance of being precise when talking about large language models:
Shanahan, Murray. "Talking About Large Language Models."
https://arxiv.org/abs/2212.03551