#ieee754

2025-06-13

OK -- I finished a prototype for my custom mono-field floating-point format, in Golang. I haven't done extensive testing, but it works fine so far -- addition, subtraction, multiplication, and division.

One of its distinguishing characteristics is that the number of bits *available* for the significand varies with the exponent. You get between 0 (yes) and 32 (yes) bits for the significand.

I think I'll put this up on #GitLab at some point.
#computerScience #computing #IEEE #IEEE754

2025-06-02

and for the #IEEE754 / #floatingpoint nerds (you know who you are!) here's a much more definitive answer/breakdown of our IEEE Binary FP32 conformance for the Vector Unit! github.com/tenstorrent/tt-isa-

#RVV #SIMD #HPC

2025-03-06

I've finally finished my blog post about formalizing floating point numbers in Lean. I've put my work in a little library called "Flean". It has some of the basic properties of IEEE-754 floats.

josephmckinsey.com/flean2.html

The post has details about my process, the library design, and what I learned about Lean itself. Next time I'm aiming for a smaller side project, since I've spent way too much time on this already.

#lean #ieee754

2025-01-05

For #programmers:
You are familiar with 64-bit floating point and 32-bit floating point, and may have heard about 16-bit floating point (present in some GPUs), but there is actually work on 8-BIT floating-point!

arxiv.org/abs/2209.05433
developer.nvidia.com/blog/nvid

There is the "E5M2" variant, a "truncated IEEE FP16 format" (nice if lacking FP8). However, at the minuscule 8-bit level you don't necessarily need multiple NaNs or infinities, so there is the "E4M3" variant as well.

#IEEE754 #AI
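
To make the two variants concrete, here is a minimal Python sketch of decoding an FP8 byte. It assumes the layouts described in the linked arXiv paper: E5M2 with a 5-bit exponent (bias 15), 2 mantissa bits, and IEEE-style infinities and NaNs; E4M3 with a 4-bit exponent (bias 7), 3 mantissa bits, no infinities, and a single NaN mantissa pattern. The function names are made up for illustration and do not come from any library.

```python
import math

def decode_fp8(byte: int, exp_bits: int, man_bits: int, ieee_specials: bool) -> float:
    """Decode one FP8 byte (hypothetical helper, layout passed in as parameters)."""
    sign = -1.0 if byte >> 7 else 1.0
    exp = (byte >> man_bits) & ((1 << exp_bits) - 1)
    man = byte & ((1 << man_bits) - 1)
    bias = (1 << (exp_bits - 1)) - 1
    max_exp = (1 << exp_bits) - 1

    if ieee_specials and exp == max_exp:
        # E5M2: all-ones exponent means infinity (mantissa 0) or NaN.
        return sign * math.inf if man == 0 else math.nan
    if not ieee_specials and exp == max_exp and man == (1 << man_bits) - 1:
        # E4M3: only the all-ones bit pattern is NaN; there is no infinity.
        return math.nan
    if exp == 0:
        # Subnormal: no implicit leading 1.
        return sign * man * 2.0 ** (1 - bias - man_bits)
    return sign * (1 + man / (1 << man_bits)) * 2.0 ** (exp - bias)

def decode_e5m2(byte: int) -> float:
    return decode_fp8(byte, 5, 2, ieee_specials=True)

def decode_e4m3(byte: int) -> float:
    return decode_fp8(byte, 4, 3, ieee_specials=False)

print(decode_e5m2(0x7B))  # largest finite E5M2 value: 57344.0
print(decode_e5m2(0x7C))  # inf
print(decode_e4m3(0x7E))  # largest finite E4M3 value: 448.0
print(decode_e4m3(0x7F))  # nan
```

Giving up the infinities and all but one NaN pattern is what lets E4M3 use the top exponent value for ordinary numbers.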

2024-12-12

A certified IEEE 754 moment at the Finnish Meteorological Institute.

#IEEE754 #FloatingPoint #ohjelmointi

Precipitation: 1.70000000000000002 mm.

2024-12-09

Channeling my inner @shafik, assuming a standard, compliant #riscv processor, what kind of float instructions can be executed on the vector unit of a processor that advertises

"RV32IMFDZve64f"

#HPC #IEEE754 #SIMD #RISCV #RVV

github.com/riscvarchive/riscv-

2024-12-02

Improving the effectiveness of education with the "Mad Max" method, applied to high-speed computing hardware

When a student takes a job at an electronics company, it is great if they already know how to build the same electronic circuit in different ways, depending on the requirements for throughput, maximum clock frequency, size, and power consumption. How do you train such a skill? For the new homework assignments in the Digital Circuit Synthesis School program, we decided to tear a real processor apart into blocks and give students the task of assembling various specialized computing devices from those blocks, much as the heroes of the film "Mad Max: Fury Road" assembled their battle jalopies from the parts of real cars. As the first victim we chose ...

habr.com/ru/articles/862734/

#Verilog #VHDL #микроархитектура #riscv #FPU #ieee754 #SystemVerilog #школа_синтеза_цифровых_схем #openhwgroup #образование

Torsten Bronger (bronger)
2024-11-15

Whom should I call to add NaN support in ?

2024-10-26

🎉 🎉 C23 and C++23 are finally joining the quadruple precision club, by bringing a standard way to handle 128-bit floating point numbers!
(FP16 is also here if you need it)

Here is hoping that a future Fortran standard will adopt the C_Float128 kind specifier that gcc/gfortran already has as an extension.
en.cppreference.com/w/cpp/type

#c23 #cpp23 #cpp #ieee754 #Fortran #floatingpoint

HP van Braam (hp@tmm.cx)
2024-09-12

Presented without comment.

#computer #nerdjoke #computerjoke #IEEE754

A version of the "Look What They Need to Mimic a Fraction of Our Power" meme.

The focal point of the meme is a picture of the IEEE 754 Floating Point Standard.

In the final panel of the meme the words "of Our Power" are blacked out, making the final panel read "Look What They Need to Mimic a Fraction".

Yes, I Know IT ! 🎓 (YesIKnowIT)
2024-09-03

Understanding "posits", an alternative to the IEEE-754 floating point formats for representing reals:

johndcook.com/blog/2018/04/11/

2024-08-29

What is Decimal64 from the decimal floating point of IEEE 754, or exact decimal floating-point numbers in a computer

More than 90% of all programmers know what ordinary floating-point numbers are: binary32/binary64/binary128, often called float, double, and so on respectively. There is plenty of information about why 0.1 cannot exist in binary form, and about how precision falls short with a large number of significant digits even if you stay within 16 digits; but at least they are fast… Yet there is almost no information about the fact that a wonderful solution, one that keeps all the advantages and fixes the shortcomings, exists, and has been in even the most recently updated floating-point standard, IEEE 754-2008, for more than 15 years: decimal floating point (DFP). To start, let's recall the layout of an ordinary binary64: 1 sign bit, 11 exponent bits, 52 mantissa bits. Let me show a picture instead:

habr.com/ru/articles/839524/

#плавающая_запятая #плавающая_точка #dpd #decimal #float #floating_point #ieee754 #математика #технологии #компьютер
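
The binary64 layout recalled in the post (1 sign bit, 11 exponent bits, 52 mantissa bits) and the 0.1 problem can be seen directly in Python. This is only a sketch: `binary64_fields` is a name made up here, and Python's `decimal` module is a general decimal floating-point implementation, not the decimal64 encoding itself. Setting the context precision to 16 digits merely approximates decimal64's precision.

```python
import struct
from decimal import Decimal, Context, ROUND_HALF_EVEN

def binary64_fields(x: float) -> tuple[int, int, int]:
    """Split a binary64 value into (sign, biased exponent, fraction) fields."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF       # 11 bits, bias 1023
    fraction = bits & ((1 << 52) - 1)     # 52 explicitly stored bits
    return sign, exponent, fraction

sign, exponent, fraction = binary64_fields(0.1)
print(sign, exponent, hex(fraction))      # 0 1019 0x999999999999a

# The stored binary value is close to, but not exactly, one tenth.
print(Decimal(0.1) == Decimal("0.1"))     # False
print(0.1 + 0.2 == 0.3)                   # False

# Decimal arithmetic at 16 significant digits (an approximation of decimal64).
ctx = Context(prec=16, rounding=ROUND_HALF_EVEN)
print(ctx.add(Decimal("0.1"), Decimal("0.2")) == Decimal("0.3"))  # True
```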

2024-08-15

The whiteboard next to the department coffee machine is covered in ancient comics and tedious administrative notices, so to liven it up a bit, I put up the three-page list of definitions from #IEEE754-2019.

Pustam | पुस्तम | পুস্তম 🇳🇵 (pustam_egr@mathstodon.xyz)
2024-06-24

1. The IEEE 754 standard (floating-point arithmetic), which Desmos uses, cannot accurately compute mod \(10\) for a number this large (\(10^{18}\varphi \)): when it tries to divide the large number by \(10\) and find the remainder, the last bits are \(512\) and \(2048\), and when divided by \(10\) these become \(256\) and \(512\) (at that precision). The mod algorithm accidentally grabs the \(256\) bit and reverses the sign.

2. Desmos displays undefined, but internally it stores \(\infty\), as per the IEEE 754 standard. Then \(\frac1\infty\) is treated as \(0\), which is why that fraction equals \(0\). It doesn't store the value of \(\frac10\); instead, that is stored as the Desmos-specific mathematical constant \(\infty\), roughly equal to \(10^{308}\). So \(\frac{3}{2(2-2)}-1\) is about \(10^{308}\), and \(\frac{5}{10^{308}}\) is roughly \(0\), since all numbers in Desmos are rounded to about \(15\) significant decimal digits.

Here's an online calculator you can use to see how some values are represented in IEEE754 floating point!
h-schmidt.net/FloatConverter/I
#Desmos #IEEE754 #FloatingPoint #Error #FloatingPointError #FloatingPointNumber #Precision #Accuracy #Numbers #Maths #NumericalValues
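
A short Python check of the first point, assuming Desmos works in IEEE 754 binary64 as the post says (this does not reproduce Desmos's own mod algorithm). Near \(10^{18}\varphi\) adjacent doubles are 256 apart, so the units digit that mod \(10\) needs is not stored at all. `math.ulp` requires Python 3.9 or later.

```python
import math

phi = (1 + math.sqrt(5)) / 2
x = 1e18 * phi              # roughly 1.618e18, between 2**60 and 2**61

# Gap between x and the next representable double.
print(math.ulp(x))          # 256.0

# Every double in this range is a multiple of 256, so the stored value
# carries no information about the true last decimal digit.
print(int(x) % 256)         # 0

# IEEE 754 defines 1/inf as exactly zero.
print(1 / math.inf)         # 0.0
```

Note that plain Python raises `ZeroDivisionError` for `1 / 0.0` instead of returning infinity, so the second point cannot be reproduced with a literal division by zero.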

Pavankumar 🦅 (pvnkmr_s)
2024-06-01

Can anyone help me with an (at least addition and multiplication) for point numbers with .... Pls respond...🙃

A good dive into how Python represents floating point numbers, and how comparisons against integers can give unexpected results.

blog.codingconfessions.com/p/h

#python #programming #floatingpoint #math #ieee754
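
One example of the kind of surprise the post refers to, as a small sketch (the variable names are arbitrary). Python compares an `int` with a `float` by exact mathematical value, whereas converting the `int` to `float` first rounds it to the nearest representable double.

```python
big = 2**53 + 1             # 9007199254740993, not representable as a double
as_float = float(big)       # rounds to 2**53 (round half to even)

print(as_float == 2**53)    # True: the conversion lost the trailing +1
print(big == as_float)      # False: int/float comparison is exact
print(big > as_float)       # True

# Converting both sides to float hides the difference.
print(float(big) == float(2**53))  # True
```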

2024-04-22

if female is 0.3, my gender is 0.1 + 0.2
#IEEE754

2024-04-17

How it started:

I should write a nifty NIR helper for doing format conversion.

How it's going:

Why doesn't our pack function for R11G11B10_FLOAT round or handle denorms properly?!? I should fix that.

Cursed to understand #IEEE754...
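
For readers wondering what rounding and denorm handling mean here, below is a minimal Python sketch of packing one 11-bit channel of R11G11B10_FLOAT: 5 exponent bits with bias 15, 6 mantissa bits, no sign bit. It is not the NIR helper or the driver's pack function. The names `pack_uf11` and `unpack_uf11`, the NaN bit pattern, and the policy of clamping negatives to zero and finite overflow to the largest finite value are assumptions made for illustration.

```python
import math

def unpack_uf11(bits: int) -> float:
    """Decode an unsigned 11-bit float (5-bit exponent, 6-bit mantissa)."""
    exp, man = bits >> 6, bits & 0x3F
    if exp == 0:
        return man * 2.0 ** -20                # denormal: man/64 * 2**-14
    if exp == 31:
        return math.inf if man == 0 else math.nan
    return (1 + man / 64) * 2.0 ** (exp - 15)

def pack_uf11(x: float) -> int:
    """Encode with round-to-nearest-even; the clamping policy is an assumption."""
    if math.isnan(x):
        return 0x7C1                           # one arbitrary NaN pattern
    if x <= 0.0:
        return 0                               # no sign bit: clamp to zero
    if math.isinf(x):
        return 0x7C0
    m, e = math.frexp(x)                       # x = m * 2**e, 0.5 <= m < 1
    e -= 1                                     # x = (2m) * 2**e, 1 <= 2m < 2
    if e < -14:
        # Denormal range: the result is a count of 2**-20 steps. Python's
        # round() is half-to-even, and a result of 64 is exactly the
        # encoding of the smallest normal value, 2**-14.
        return round(x * 2**20)
    man = round((2 * m - 1) * 64)              # 0..64, half-to-even
    bits = ((e + 15) << 6) + man               # man == 64 carries into the exponent
    return min(bits, 0x7BF)                    # clamp overflow to max finite (65024)

print(hex(pack_uf11(1.0)))          # 0x3c0
print(hex(pack_uf11(1 + 1 / 128)))  # tie rounds to even mantissa: 0x3c0
print(hex(pack_uf11(1 + 3 / 128)))  # tie rounds to even mantissa: 0x3c2
print(hex(pack_uf11(2.0 ** -20)))   # smallest denormal: 0x1
```

Adding the rounded mantissa to the shifted exponent, instead of OR-ing the two fields together, is what lets a mantissa that rounds up to 64 carry cleanly into the next exponent.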
