LapisRising 👌

I like good coffee, cycling and software engineering. Preferably in that order after I wake up.

LapisRising 👌 LapisRising@mastodon.online
2023-12-30

@molly0xfff Update hell is bad on one machine, but what about 2 or 3? Many people have a work laptop and a home laptop. Add a beefy PC tower and you are updating things all the time :/

LapisRising 👌 LapisRising@mastodon.online
2023-12-29

@artemesia @carnage4life Yes, they own their content. If they don't want their data to be scraped, they can use a wildcard and prohibit all bots from doing so.

It is their content and they should decide who gets access.

The internet is by default open, so the default should be "everybody can use the content".

Do you have another opinion?

LapisRising 👌 LapisRising@mastodon.online
2023-12-29

@artemesia @carnage4life I would prefer to have categories like: "SearchEngine", "DataModel", "NewsAggregator" and the like.

robots.txt is not meant for that, but changing the interpretation of the agent string would technically allow for that, similar to using multiple CSS classes for one element back in the day.
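A minimal sketch of how such category matching could work, assuming a crawler declares several space-separated categories in its agent string, much like multiple CSS classes on one element. Nothing here is part of robots.txt today: `CATEGORY_RULES`, `is_allowed`, and the "any blocked category blocks the bot" policy are hypothetical choices for illustration. The open default follows the "everybody can use the content" stance from this thread.

```python
# Hypothetical category rules a site owner might publish.
# True = allowed, False = disallowed. Category names come from the post above.
CATEGORY_RULES = {
    "SearchEngine": True,
    "NewsAggregator": True,
    "DataModel": False,
}


def is_allowed(agent_string, rules, default=True):
    """Decide access for an agent that declares space-separated categories.

    Assumed policy: the agent is blocked if ANY declared category is
    disallowed; agents with no known category fall back to `default`.
    """
    categories = agent_string.split()
    decisions = [rules[c] for c in categories if c in rules]
    if not decisions:
        return default
    return all(decisions)


print(is_allowed("SearchEngine", CATEGORY_RULES))
print(is_allowed("SearchEngine DataModel", CATEGORY_RULES))
print(is_allowed("SomeUnknownBot", CATEGORY_RULES))
```

A stricter site could pass `default=False` to block agents that declare no known category.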

LapisRising 👌 LapisRising@mastodon.online
2023-12-29

@artemesia @carnage4life
You can match all with * or name specific agents like Googlebot.

I believe the whitelisting and blacklisting must be done by the content owner, so robots.txt is the right place.

I also believe that we need categories instead of specific agents, otherwise we get a closed web very fast. Why should ChatGPT be allowed to scrape and Gemini not?

For B2B "deals", APIs are the correct approach. They are more efficient and accurate.

moz.com/learn/seo/robotstxt
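As a rough illustration of the wildcard and named-agent matching described above, here is a small sketch using Python's standard `urllib.robotparser`. The robots.txt content, the `example.com` URLs, and the `SomeOtherBot` name are made up for the example; real crawlers may interpret rules slightly differently.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: Googlebot may crawl everything except /private/,
# every other agent (the * wildcard) is prohibited from the whole site.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /private/

User-agent: *
Disallow: /
"""


def build_parser(text):
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

googlebot_public = parser.can_fetch("Googlebot", "https://example.com/articles/1")
googlebot_private = parser.can_fetch("Googlebot", "https://example.com/private/x")
other_bot = parser.can_fetch("SomeOtherBot", "https://example.com/articles/1")

print(googlebot_public, googlebot_private, other_bot)
```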

LapisRising 👌 LapisRising@mastodon.online
2023-12-28

@carnage4life Technical solutions exist, though. Take robots.txt for example. Every website can specify which agents are allowed to crawl which parts of it. It would be easy to extend this to classifications of bots, for example search engines and GPTs.

LapisRising 👌 LapisRising@mastodon.online
2023-12-24

The critical slope is 13-15% where walking becomes more efficient than #cycling.

This applies almost always to recreational cyclists. Basically, when you are able to cycle up the hill, cycling is more efficient.

pedalchile.com/blog/uphill

LapisRising 👌 LapisRising@mastodon.online
2023-12-24

@carnage4life We only have so many original documents and texts, and they have biases.

Filtering content for #LLMs only reduces the total amount of training data.

Until someone rewrites our history and knowledge within their worldview in its entirety, humanity will live in its bubble for better or worse.

LapisRising 👌 LapisRising@mastodon.online
2023-12-23

@carnage4life The way out of this problem is just legislation and a good judicial system.
They better not be biased or corrupt. 🤞

LapisRising 👌 LapisRising@mastodon.online
2023-12-23

@atineoSE @themarkup I believe that the currently available tools are enough to regulate web scraping. There is robots.txt, IP-based blocking / throttling, and CAPTCHAs. Additionally, external APIs clearly define pricing and access control.
On the one hand, I don't want to see prosecuting scraping on the internet become commonplace. On the other hand, I don't like people circumventing paid APIs to get the data for free.
It boils down to the scale and how the agent / scraper behaves.

LapisRising 👌 LapisRising@mastodon.online
2023-12-23

@carnage4life I just hope that the same news sites don't use these AI models to generate their articles.
This would lead to paywalls for artificial content and new AI models that don't improve because they train on the data they produce.

LapisRising 👌 LapisRising@mastodon.online
2023-12-20

@carnage4life We had different scooter brands, but I also don't see them as often as before.
I hope bike sharing will survive, because #cycling is often faster than public transport or cars in big cities.

LapisRising 👌 LapisRising@mastodon.online
2023-12-20

Wow, today's Cell Tower was really straightforward...

#CellTower 594

➡️➡️➡️➡️➡️➡️➡️➡️➡️➡️➡️➡️➡️

andrewt.net/puzzles/cell-tower

#celltower

LapisRising 👌 LapisRising@mastodon.online
2023-12-20

@danep People who drive a Tesla chose to do so; those cars are more expensive than safer alternatives.
All the information about the failures of the cars is public and covered by the news. I think it's too easy to blame all the problems on a company and call it a day.
I'm open to alternatives, but I see the victims and value their lives at least as highly.

LapisRising 👌 LapisRising@mastodon.online
2023-12-19

@seldo There is a reason why they call it survival of the fittest. This is natural selection at its finest.
The only tragedy I see is the innocent victims who didn't have a choice.

LapisRising 👌 LapisRising@mastodon.online
2023-12-19

@hanno all look similar, but I feel the aom looks faster and a bit more hectic than the rav one.

LapisRising 👌 LapisRising@mastodon.online
2023-12-19

I'm calling it, #GTA6 will have the #markzuckerberg vs #elonmusk fight we all dream of. My money is on Mark.

LapisRising 👌 LapisRising@mastodon.online
2023-12-18

#youtube became the biggest podcast platform in the last few years. Many creators have started to sprinkle podcasts into their channels.
I assume it's because it's easy to keep the content fresh when they invite new guests.
I think it's hit and miss in many cases; not everybody is #joerogan.

LapisRising 👌 LapisRising@mastodon.online
2023-12-16

"Getting riders to move around and stop pressure build up during turbo sessions – “literally just stand up for two seconds every 10 minutes” ..." msn.com/en-us/health/wellness/ #cycling #kickr

LapisRising 👌 LapisRising@mastodon.online
2023-12-16

#strava basically rickrolled me. There was a call to action to update the iOS app to get my highlights of 2023.
After following the instructions they teased me with some fancy animations and presented me with an upsell page.
Not appreciated, deleted the app from my phone...
Do they think I don't back up my FIT files and can't do some basic statistics?
