#RegExps

2025-04-30

Empty matches in Python’s `re` module

blog.narf.ssji.net/2025/04/30/

Python’s `re.sub` method has a weird, though documented, behaviour.

A pattern that can match the empty string, such as `/.*/`, produces two matches on a non-empty string: one for the whole string, and one for the empty string left at its end. The replacement is therefore applied twice.

A simple fix is to make sure the pattern is not empty-matching, e.g. `/.+/`.

#debugging #Python #regexps #sed
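
A minimal sketch of the behaviour described above, assuming Python 3.7 or later (where empty matches adjacent to a previous match are also replaced). The sample string and the replacement `X` are illustrative only.

```python
import re

text = "abc"  # any non-empty string

# `.*` first matches the whole string, then the empty string at its end.
star_matches = re.findall(r".*", text)
print(star_matches)   # ['abc', '']

# Two matches, so the replacement is applied twice.
star_result = re.sub(r".*", "X", text)
print(star_result)    # XX

# `.+` cannot match the empty string, so there is a single replacement.
plus_result = re.sub(r".+", "X", text)
print(plus_result)    # X
```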

Joaquim Homrighausen @joho@mastodon.online
2024-09-11

Best answer on regexps ever? 🧐 😎

- What is the plural form of regex?

- If you've used more than one of them, you'll know that the plural of "regex" is "regrets." – cjs (Mar 14, 2022 at 23:46)

#regex #regexps #programmer #programming #devops #regularexpression

R. L. Dane :debian: :openbsd: @RL_Dane@fosstodon.org
2024-05-06

Extended #regexps are great for place names like "Wellesley" where you cannot remember if it's Welesley, Welleseley, Wellesselley, Welleslley, or Welesslley to save your dadburn life. ;)

/wel+es+e?l+ey/ !!!

cc: @amin
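
A quick check of the pattern from the post against every spelling it lists. The `re.IGNORECASE` flag and the use of `fullmatch` are assumptions added here so that capitalised names match as whole words; the post itself only gives the bare pattern.

```python
import re

# Pattern from the post; the IGNORECASE flag is an assumption of this sketch.
PLACE = re.compile(r"wel+es+e?l+ey", re.IGNORECASE)

spellings = [
    "Wellesley", "Welesley", "Welleseley",
    "Wellesselley", "Welleslley", "Welesslley",
]

# Map each spelling to whether the whole word matches the pattern.
matches = {name: PLACE.fullmatch(name) is not None for name in spellings}
for name, ok in matches.items():
    print(f"{name}: {ok}")

# An unrelated name is rejected.
print(PLACE.fullmatch("Wesley") is None)
```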

Marcos Dione @mdione@en.osm.town
2024-03-19

I just noticed. #regexps are code. They're a text matching and cutting language. When you're writing a regexp, you're writing a function. And as such, of course, you have to test it.

Marcos Dione @mdione@en.osm.town
2024-03-19

I had a problem. I used #regexps... and I tested them. Thoroughly. I cut them into pieces. And tested the pieces first...

And now I don't have a problem!

Test! And don't forget to also test your regexps!
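
One way to read "test the pieces first", sketched in Python. The date pattern, the piece names, and the `accepts` helper are all hypothetical and chosen only for illustration; they do not come from the post.

```python
import re

# Hypothetical pieces of a date pattern, each small enough to test alone.
YEAR = r"\d{4}"
MONTH = r"(?:0[1-9]|1[0-2])"
DAY = r"(?:0[1-9]|[12]\d|3[01])"
DATE = rf"{YEAR}-{MONTH}-{DAY}"


def accepts(piece: str, text: str) -> bool:
    """Return True if `piece` matches the whole of `text`."""
    return re.fullmatch(piece, text) is not None


# Test the pieces first...
assert accepts(YEAR, "2024") and not accepts(YEAR, "24")
assert accepts(MONTH, "12") and not accepts(MONTH, "13")
assert accepts(DAY, "31") and not accepts(DAY, "32")

# ...then the assembled pattern.
assert accepts(DATE, "2024-03-19")
assert not accepts(DATE, "2024-13-19")
print("all regexp tests passed")
```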

Marcos Dione @mdione@en.osm.town
2024-03-19

#til

* Base64 is not idempotent: `b64(x)` != `b64(b64(x))`. This is because it represents values 0-63 with values 65-90 (`A-Z`), 97-122 (`a-z`), 48-57 (`0-9`), 43 (`+`) and 47 (`/`); and 61 (`=`) for padding.
* `grep` not only has options to support more complex #regexps (`-E, --extended-regexp`, `-P, --perl-regexp`), it also has `-F, --fixed-strings` to treat the pattern as just a string. This not only makes it slightly faster, it's easier to write if you would need too much escaping.
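
The Base64 point can be checked with Python's standard `base64` module. This is a small sketch; the input `b"hello"` is an arbitrary example value.

```python
import base64

data = b"hello"  # arbitrary example input

once = base64.b64encode(data)
twice = base64.b64encode(once)

print(once)
print(twice)

# Encoding already-encoded output yields a different, longer value,
# so the operation is not idempotent.
print(once != twice)

# Each encoding layer needs its own decoding step to get back to the input.
restored = base64.b64decode(base64.b64decode(twice))
print(restored == data)
```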

2024-02-23

@neustradamus #PCRE continues to be a misnomer; it’s a modified subset of #Perl #RegularExpressions with dozens of differences: pcre.org/current/doc/html/pcre

It's not "(C)ompatible." Accept no substitutes: perldoc.perl.org/perlre

#PCRE2 #PerlIncompatibleRegularExpressions #RegularExpression #RegExes #RegExps #regex #regexp

Todd A. Jacobs | Rubyist @todd_a_jacobs@ruby.social
2023-12-30

@hschne I think a lot of folks know about it, but there aren't too many use cases in my own code where it isn't easier just to assign capture groups to variables post facto. Plus, I'm one of those people who think #regex is often abused when simple matches followed by code are better than really complicated #regexps.

Plus, "shiny and cool" isn't always better than readable, and named capture groups just hurt my eyes. 😎

2023-06-09

Solved #bug today in #matomo import script.
Took me two weeks to find what went wrong.
Turns out if the host variable in the import script has no "http" in it, host is not recorded in database.
You gotta love #regexps

2023-06-02

@regehr @commodore @dev There is even a (low-severity, a/k/a “cruel”) #PerlCritic policy to discourage everything but $_, @_, $], and numbered #RegularExpression capture variables: metacpan.org/pod/Perl::Critic:

metacpan.org/pod/Perl::Critic: already protects you against the performance-sapping $`, $&, and $' match variables

And you can configure your own prohibited list with metacpan.org/pod/Perl::Critic:

#Perl #RegEx #RegExp #RegExes #RegExps

a66ey 🇪🇺🏳️‍🌈 she/her @a66ey
2023-02-17

Spent half a day polishing complex in . And they work too, but perl has a little different standard, so I'm basically stuck in grep -p punching a file until it works. As they say in , sometimes you just have to brute force prototype.

2023-02-06

@Codely @drupler It helps to build your complicated #RegularExpressions in pieces and store them in separate variables. You can then test them in isolation and not be confused when you concatenate them together for your actual matching.

Both #PHP and #JavaScript also support named capture groups if you’re doing replacements. They’re a lot more readable.

Also, PHP’s #PCRE-based engine has a PCRE_EXTENDED flag that lets you add whitespace, newlines, and comments.

#regexes #regexps
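
The post is about PHP and JavaScript; the sketch below shows the same two ideas in Python instead, which is a substitution made here for illustration. `re.VERBOSE` plays the role of an extended flag that permits whitespace and comments, and `\g<name>` refers to a named group in the replacement. The date format and sample text are hypothetical.

```python
import re

# re.VERBOSE lets the pattern carry whitespace, newlines, and comments.
DATE = re.compile(
    r"""
    (?P<year>\d{4})  -   # four-digit year
    (?P<month>\d{2}) -   # two-digit month
    (?P<day>\d{2})       # two-digit day
    """,
    re.VERBOSE,
)

# Named groups keep the replacement readable: no counting of parentheses.
reordered = DATE.sub(r"\g<day>/\g<month>/\g<year>", "Posted on 2023-02-06.")
print(reordered)  # Posted on 06/02/2023.
```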

Brendan Halpin @bthalpin
2023-01-23

In the absence of #regexps, my anti-twitter-RTs filter is now

twitter.com/

The trailing slash catches links to tweets (twitter.com/screenname/12345 etc) but allows other mentions of twitter.com, e.g., a username.
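
A sketch of how such a fixed-string filter behaves, assuming it is a plain substring test. The `is_filtered` helper and the sample posts are hypothetical and only illustrate the effect of the trailing slash.

```python
FILTER = "twitter.com/"  # the filter string from the post


def is_filtered(post: str) -> bool:
    """Return True if the post contains the filter string as a plain substring."""
    return FILTER in post


samples = [
    "RT: twitter.com/screenname/12345",   # link to a tweet
    "You can reach me as user@twitter.com",  # username-style mention
    "I finally left twitter.com.",        # bare mention of the site
]

for post in samples:
    print(f"{is_filtered(post)!s:5} {post}")
```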

2023-01-23

@Perl Are you working through the #OReilly #book ‘Learning #Perl’? Get extra practice with co-author brian d foy’s ‘Learning Perl Exercises’ #ebook: leanpub.com/learning_perl_exer

Don’t have ‘Learning Perl’ yet? Buy it in paperback or ebook here: shop.aer.io/oreilly/p/learning
Prefer #Amazon #Kindle? amzn.to/3QZj7t6 (affiliate link)

#books #bookstodon #coding #programming #SoftwareDevelopment #ProgrammingLanguages #Perl5 #RegularExpressions #regexes #regexps #Unicode #CPAN

2023-01-22

@sjn @cb 99% of the “#Perl is line noise” complaints are because of unformatted #RegularExpressions. Every language worth anything eventually supports them, but only @Perl (and #awk, earlier) makes them first-class citizens. And with Perl you can format and comment them for readability: perldoc.perl.org/perlretut#Emb

We format the rest of our code for humans. Why not #regexps?

#PerlCritic can warn against bad regexps: metacpan.org/search?size=200&q

#regex #regexes #programming #coding #SoftwareDevelopment

2023-01-07

@randomatic @ChristosArgyrop@mstdn.science @HaplogroupNews @ChristosArgyrop@mastodon.social A previous employer of mine used re::engine::RE2 for many things, because a) you can cap the memory usage to avoid #DoS attacks, and b) the lead developer was all about premature optimization. metacpan.org/pod/re::engine::R

But you can’t use the /x flag for better readability, and we ran into some nasty #Unicode bugs and had to fall back to regular Perl #RegExps in those cases.

GenghisKen Coar @GenghisKen@ruby.social
2023-01-06

@Xiy
I was unclear; I use either ! or § for #regexps (%r, %R) but [...] for others of these... whatever you call them, because they define arrays.

GenghisKen Coar @GenghisKen@ruby.social
2023-01-04

@Xiy
I use this a lot for #Ruby #regular expressions; I typically use %𝚛!...!. Also, for arrays of symbols; less commonly for arrays of strings. Oddly enough, I'll use paired delimiters (such as brackets, braces, or parentheses) for any 𝘰𝘵𝘩𝘦𝘳 list of this sort, just not #regexps. Mostly brackets, since it's an array.

I found locating an authoritative list of these, by implementing Ruby version, much more of a chore than I expected.
