
I ended up adding new super instructions (merging IR entities into one), first by following my own intuition to optimize something done quite frequently:

(while (< index 100) …)

This would generate 4 instructions for the condition: load symbol, load const, LT, and jump if false. I compacted it to 2: load sym and LT_CONST_JUMP_IF_FALSE, which does the comparison with a constant and the jump in one go. This gave a 30% perf improvement on the comparison/looping!
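To make the fusion concrete, here is a minimal peephole sketch in Python. It is not ArkScript's actual compiler code: the tuple encoding, the `fuse_super_instructions` helper and the `loop_end` label are assumptions for illustration, and only the LT_CONST_JUMP_IF_FALSE name comes from the post.

```python
# Instructions are (opcode, argument) tuples. This encoding is hypothetical;
# only the fused opcode name is taken from the post.
def fuse_super_instructions(code):
    """Replace LOAD_CONST, LT, JUMP_IF_FALSE with one LT_CONST_JUMP_IF_FALSE."""
    out = []
    i = 0
    while i < len(code):
        window = code[i:i + 3]
        if [op for op, _ in window] == ["LOAD_CONST", "LT", "JUMP_IF_FALSE"]:
            const = window[0][1]   # constant to compare against
            target = window[2][1]  # where to jump when the comparison is false
            out.append(("LT_CONST_JUMP_IF_FALSE", (const, target)))
            i += 3
        else:
            out.append(code[i])
            i += 1
    return out


# Condition of (while (< index 100) ...), before fusion: 4 instructions.
condition = [
    ("LOAD_SYMBOL", "index"),
    ("LOAD_CONST", 100),
    ("LT", None),
    ("JUMP_IF_FALSE", "loop_end"),
]
fused = fuse_super_instructions(condition)
```

A real pass would also have to check that no jump lands in the middle of the fused sequence; the sketch sidesteps that by using symbolic labels.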

Well, according to my IR entity frequencies, it wasn’t the most common set of entities, so I have more to add, and some are… disturbing

#ArkScript #pldev #compiler #optimization

#ArkScript April 2025 update is up!

https://lexp.lt/posts/arkscript_update_april_2025/

#pldev #langdev #compiler #blog

We're in May and I haven't posted a new #ArkScript update article yet, but you can read this one instead:

https://lexp.lt/posts/inst_source_tracking_in_arkscript/

I talk about how I added source tracking on a per-instruction basis inside the ArkScript VM, and it was quite well received on the not-very-orange-website (https://www.reddit.com/r/ProgrammingLanguages/comments/1kcef2l/instruction_source_location_tracking_in_arkscript/)

#pldev #langdev #cpp

God I hate error recovery in #PLDev, there's just no good way to think about it that I've ever found. It's my bane -_-

Tempted to try to do some #PLDev today 🤔

On another note, I’ve added instruction source location tracking to #ArkScript!

Meaning, we can (finally) have runtime errors that point to the line which threw the error. As well as go up the call tree and display it with the line of each call as well!
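One common way to get there (a sketch under assumptions, not necessarily how ArkScript stores it) is a side table sorted by instruction index, searched with a binary search when an error is raised. The table contents, the file name and the helper names below are made up for illustration.

```python
import bisect

# Hypothetical side table: (first instruction index, file, line),
# sorted by instruction index. Each entry covers every instruction
# up to the next entry's start.
SOURCE_TABLE = [
    (0, "main.ark", 1),
    (4, "main.ark", 2),
    (9, "main.ark", 5),
]
STARTS = [start for start, _, _ in SOURCE_TABLE]


def locate(ip):
    """Return (file, line) for instruction index ip (assumed >= 0)."""
    pos = bisect.bisect_right(STARTS, ip) - 1
    _, file, line = SOURCE_TABLE[pos]
    return file, line


def format_trace(ips):
    """Turn saved instruction indices (innermost call first) into file:line strings."""
    return [f"{file}:{line}" for file, line in map(locate, ips)]
```

Walking up the call tree then amounts to calling `locate` on the return address saved by each call frame.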

However I’m still dueling with #msvc that loves generating weird errors at runtime (and my favorite OS, Windows, using back slashes in path instead of forward slashes…)

#pldev #compiler #cplusplus

Error messages generated by the tests being run on Windows. A lot of « unknown characters » appear (the question mark in a rotated square), with the test runner reporting that « weird name here » already exists (three times).

There is also an error saying that some content does not match the expected content. The only difference is a backward slash in place of a forward one. Thanks Windows.

Turns out I was semi-wrong in my article: I can reference a local by its index on the stack, because we now have a dedicated stack for locals, and closures now own their fields and have their own lookup algorithm

I will try to implement a LOAD_SYMBOL_BY_INDEX instruction. Currently we load by id, which means we must iterate over the entire scope to find the variable we need. Since it’s a new instruction, it shouldn’t break things and I’ll be able to iterate

Again, I expect a significant performance boost from this as we will finally be able to load a variable value in O(1) instead of O(n), with n being the size of the current scope
#ArkScript #pldev
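The difference between the two lookups, as a rough Python sketch. The function names and the list-of-pairs scope are stand-ins for illustration, not the VM's real data structures; the index is assumed to be resolved by the compiler ahead of time.

```python
def load_by_id(scope, symbol_id):
    """O(n): scan the (id, value) pairs of the scope until the id matches."""
    for sid, value in scope:
        if sid == symbol_id:
            return value
    raise KeyError(symbol_id)


def load_by_index(locals_stack, index):
    """O(1): the compiler already knows the slot, so just read it."""
    return locals_stack[index][1]


# Hypothetical scope with three locals, stored as (id, value) pairs.
scope = [(3, "a"), (7, "b"), (9, "c")]
```

Both `load_by_id(scope, 9)` and `load_by_index(scope, 2)` read the same slot; only the first one has to walk the scope to find it.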

I indeed found a better memory layout to store variables in ArkScript, and I got a 76% performance boost on the binary tree benchmark, and a 21% perf boost on Ackermann(3, 7)
Who knew using a contiguous storage buffer could be beneficial? 🤡

I retraced all the performance improvements I applied to #ArkScript through the last five years, with updated benchmarks, AND DAMN what a journey

https://lexp.lt/posts/optimizing_scopes_data_in_arkscript_vm/

#pldev #compiler #cplusplus

I might have found yet another (better? At least on paper) memory layout for storing #ArkScript scopes and locals

Currently I create a vector<pair<id, value>> for each scope. Quite costly in terms of copies and all

What if I had a
array<pair<id, value>, N>
And my scopes were just views:
view(start, length, min id, max id)

Since only the last scope can grow… it could work. Min and max id are there for a basic bloom filter. I just have to solve the problem for closures that have their own scope that must be kept alive, but by pushing references (value holding a ptr to value) this could be solved easily
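A small Python sketch of that layout, to check the idea holds together. Everything here is an assumption for illustration: a Python list stands in for the fixed-size array, the class and method names are invented, and closures are left out entirely. Note that the min/max pair acts as a range check that can give false positives but never false negatives.

```python
import sys
from dataclasses import dataclass


@dataclass
class ScopeView:
    start: int                  # index of the scope's first pair in the storage
    length: int = 0
    min_id: int = sys.maxsize   # empty view: no id can fall in [min_id, max_id]
    max_id: int = -1


class Locals:
    def __init__(self):
        self.storage = []  # contiguous (id, value) pairs shared by all scopes
        self.scopes = []   # views over the storage; only the last one can grow

    def push_scope(self):
        self.scopes.append(ScopeView(start=len(self.storage)))

    def pop_scope(self):
        view = self.scopes.pop()
        del self.storage[view.start:]  # the last scope always sits at the end

    def define(self, symbol_id, value):
        view = self.scopes[-1]
        self.storage.append((symbol_id, value))
        view.length += 1
        view.min_id = min(view.min_id, symbol_id)
        view.max_id = max(view.max_id, symbol_id)

    def lookup(self, symbol_id):
        for view in reversed(self.scopes):
            # Cheap rejection: skip scopes that cannot contain this id.
            if not (view.min_id <= symbol_id <= view.max_id):
                continue
            for i in range(view.start + view.length - 1, view.start - 1, -1):
                sid, value = self.storage[i]
                if sid == symbol_id:
                    return value
        raise KeyError(symbol_id)
```

Pushing a scope is just recording an offset, and popping one truncates the storage, so no per-scope vector gets allocated or copied.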

Also because of closures, scopes are shared ptr to Scope (the class holding the vec of pair) currently. Quite costly to construct…

I will go back to my old scopes/locals management code and I think I will write an article about how it evolved, and the various performance boosts it yielded

#pldev #proglang

One big item in the list (plus a few bugfixes). Taking it easy today as I'm feeling sickly, but I managed to get raylib's GOL going again on the self-hosted implementation.

#badlang #pldev

Haven't posted much about #badlang this past week but been quietly working to get the self-hosting happen. Many pieces are already in place and it can already compile many of the tests and a few examples (fib, rule110, gol).

Still missing a few big chunks (namely poly functions and methods) but happy with the progress.

Gotta be nice when this is over and I can add a few more QoL features, like foreach loops for iterators, interfaces, and potentially overloaded operators (!?).

#PLDev

A todo list for the steps needed on the self-hosting compiler. Many checked steps and a few steps missing on the codegen section (array/struct literals, methods, namespaces, polymorphism, match expressions)

This sure feels a lot better than what I was doing before. Sometimes you just need to walk the walk a few times until you reach the right destination I guess!

Got variables, arithmetic/logical ops and conditionals working on the new IR.

#PLDev #badlang

Code for some conditional expression in badlang alongside the linear IR representation it produces.

After reading yesterday's post by @tekknolagi [1] I felt the urge to clean up my linear IR implementation, as I wasn't happy with how the backend was turning out. At the moment I'm not trying to do SSA, but I took inspiration from the post to separate functions into basic blocks (a function contains N basic blocks, each with M instructions), and jumps go to basic blocks instead of labels.

Starting out with let/set expressions to avoid running into the same issues I was having in the previous implementation.
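The shape described above (functions made of basic blocks, jumps pointing at blocks rather than labels) could look roughly like this in Python. This is not badlang's implementation: the class names, opcodes and the `dump` helper are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class BasicBlock:
    name: str
    instructions: list = field(default_factory=list)


@dataclass
class Function:
    name: str
    blocks: list = field(default_factory=list)

    def new_block(self, name):
        block = BasicBlock(name)
        self.blocks.append(block)
        return block


def dump(fn):
    """Render a function as text, printing block operands by name."""
    lines = [f"fn {fn.name}:"]
    for block in fn.blocks:
        lines.append(f"  {block.name}:")
        for op, *args in block.instructions:
            rendered = ", ".join(
                a.name if isinstance(a, BasicBlock) else str(a) for a in args
            )
            lines.append(f"    {op} {rendered}")
    return "\n".join(lines)


# if x < 10 then y = 1 else y = 2 (hypothetical opcodes)
fn = Function("example")
entry, then_b, else_b, done = (fn.new_block(n) for n in ("entry", "then", "else", "done"))
entry.instructions += [("lt", "t0", "x", 10), ("branch", "t0", then_b, else_b)]
then_b.instructions += [("set", "y", 1), ("jump", done)]
else_b.instructions += [("set", "y", 2), ("jump", done)]
done.instructions += [("ret", "y")]
```

Since the branch operands are the block objects themselves, there is no label table to resolve before the backend can follow a jump.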

[1]: https://mastodon.social/@tekknolagi/113948010540553742

#badlang #PLDev

a text representation of a linear IR and the badlang code that generates it

We are so back baby. Self-hosting works with rule110.

#pldev #badlang

Terminal screen showcasing the output of the rule110 program.

Everyone seemed to disagree with me on the poll I put up for this, but I really like how having both the parameters and return (denoted by ->) inside the parentheses on function declarations makes for pretty aesthetic code (bias alert!).

It's pretty clear where the argument list ends and the function body starts, especially since commas are just whitespace in this language and can be omitted here when a long list is broken up into multiple lines.

#PLDev #badlang

some random badlang code, where a function with a longish argument list is broken up into multiple lines.

The self-hosted compiler is now able to compile the Fibonacci example. Still a ways to go, but getting closer every day.

#badlang #pldev

a terminal window showcasing some badlang code for calculating fibonacci numbers for N=1000 and the result of compiling it and running it with the self-hosted compiler

Might not look like much but this is pretty huge. Self-hosted typechecker is mostly done (the biggest priority tasks at least) and I've started with code generation.

I've decided to go with a 2 step approach for codegen. The first step linearizes the AST into a high level linear IR (so no gotos/labels at this stage yet), and the second level translates the LIR into C. This way it should be easier to make new backends I think, and it's a bit easier on my brain too.
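As a toy version of the two steps, here is a Python sketch that linearizes a tiny arithmetic AST into a flat IR and then prints C from it. The AST encoding, the temp naming and both function names are assumptions, and the real compiler is written in badlang, not Python.

```python
def linearize(node, lir):
    """Step 1: flatten a nested (op, lhs, rhs) / int AST into a flat list.

    Returns the name of the temporary that holds the node's value.
    """
    if isinstance(node, int):
        tmp = f"t{len(lir)}"
        lir.append(("const", tmp, node))
        return tmp
    op, lhs, rhs = node
    left = linearize(lhs, lir)
    right = linearize(rhs, lir)
    tmp = f"t{len(lir)}"
    lir.append((op, tmp, left, right))
    return tmp


C_OPS = {"add": "+", "sub": "-", "mul": "*"}


def emit_c(lir, result):
    """Step 2: translate the flat IR into a C program that prints the result."""
    body = []
    for inst in lir:
        if inst[0] == "const":
            body.append(f"    long {inst[1]} = {inst[2]};")
        else:
            op, dst, left, right = inst
            body.append(f"    long {dst} = {left} {C_OPS[op]} {right};")
    body.append(f'    printf("%ld\\n", {result});')
    return "\n".join(["#include <stdio.h>", "", "int main(void) {", *body, "    return 0;", "}"])


lir = []
result = linearize(("add", 12, 36), lir)
c_source = emit_c(lir, result)
```

Only `emit_c` knows about C, so another backend would swap that one function and keep the linearizer as is.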

#badlang #PLDev

A simple expression 12 + 36 being fully compiled into C and executed, printing out the result: 48

Sneaking some more badlang improvements before I start my workday. Typechecking is coming along, so I wanted to visualize the symbol tables. This time, all the columns are properly padded to the maximum length of that column. In the previous version I was just using `\t` for separating things, but that breaks very easily.
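The padding trick, sketched in Python rather than badlang: compute the widest cell per column first, then left-justify every cell to that width. The `format_table` name and the sample symbols and types are made up for illustration.

```python
def format_table(rows):
    """Pad every cell to the widest cell of its column, two spaces between columns."""
    widths = [max(len(str(cell)) for cell in column) for column in zip(*rows)]
    return "\n".join(
        "  ".join(str(cell).ljust(width) for cell, width in zip(row, widths)).rstrip()
        for row in rows
    )


# Hypothetical symbol table rows: name, kind, type.
rows = [
    ("name", "kind", "type"),
    ("fib", "function", "(U64) -> U64"),
    ("n", "param", "U64"),
]
table_text = format_table(rows)
```

Unlike `\t`, the result does not depend on the terminal's tab stops, so a long name cannot push one row out of alignment.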

Happy with the ergonomics of the language and the stdlib for doing this. In fact, after a few 1000 lines of code written in badlang, I'm really liking it (shocking and unbiased opinion of course). Once the self-host is complete I'll do a pass for a few more syntactic sugar things but don't have many big changes planned for the syntax.

#badlang #PLDev

badlang code alongside a properly tabulated symbol table

A little distraction, trying to create a generic hash table for the symbol tables et al.

Ran into an issue where my bootstrap implementation didn't fully support nested generic type declarations. Alas, we push on, making specialized implementations for now (StrMap::V instead of Map::(K,V)) to continue with self-hosting. I prefer to fix this on the new implementation, and not on code I'm throwing away.

#badlang #PLDev

code showing a couple of structs for a StrMap implementation written in badlang.

The self-hosted parser can now parse itself :ouroboros:.

Funnily enough, it ended up being more or less the same CLOC as the C implementation, but with a few extra features (plus the C parser makes use of some macros to avoid repetitious code).

This parser should be much more robust and not segfault, since it uses a combination of U32 handles instead of pointers and Optional values that are exhaustively verified for None values. Errors are also much nicer and more informative!
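The handle idea can be sketched in Python like this: nodes live in one arena, other code only holds integer handles, and a bad handle yields None instead of a wild pointer. The `NodeArena` class and its methods are hypothetical, and Python will not force callers to check for None the way the post describes for badlang.

```python
from typing import Optional


class NodeArena:
    """Owns every AST node; the rest of the parser only keeps integer handles."""

    def __init__(self):
        self._nodes = []

    def alloc(self, node: dict) -> int:
        self._nodes.append(node)
        return len(self._nodes) - 1  # the handle is just an index into the arena

    def get(self, handle: int) -> Optional[dict]:
        # An out-of-range handle gives None rather than reading invalid memory.
        if 0 <= handle < len(self._nodes):
            return self._nodes[handle]
        return None


arena = NodeArena()
lhs = arena.alloc({"kind": "int", "value": 12})
rhs = arena.alloc({"kind": "int", "value": 36})
add = arena.alloc({"kind": "add", "lhs": lhs, "rhs": rhs})

node = arena.get(add)
if node is None:
    description = "invalid handle"
else:
    description = f"{node['kind']} node with children {node['lhs']} and {node['rhs']}"
```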

#badlang #PLDev

