Elias Dabbas :verified:elias@seocommunity.social
2024-11-13

@wraptile

One of the fundamental rules in open source:

"Talk is cheap. Show me the code."

- Linus

Elias Dabbas :verified:elias@seocommunity.social
2024-11-13

What's your favorite Python SEO crawler?

According to The Google, there seems to be one dominant option.

python3 -m pip install advertools

If you dig deeper, you'll find two other crawlers:
- One for status codes and all response headers
- One for downloading images from a list of URLs

#advertools
#SEO
#DataScience
#Python

Elias Dabbas :verified:elias@seocommunity.social
2024-11-01

Video explaining how you can make these

youtube.com/watch?v=nCLBOrMMg1

2/2

Elias Dabbas :verified:elias@seocommunity.social
2024-11-01

Plotly is running an app challenge, for apps to analyze a Michelin Guide restaurant awards dataset.

I created a few charts on LearnPlotly[.]com without any code to explore the dataset.

These might be interesting examples of how to explore the data using highly configurable/customizable charts:

- pie chart
- treemap
- histogram (categorical)
- density heatmap (2d histogram)
- scatter map chart

#DataScience #DataVisualization #AgGrid

1/2

Elias Dabbas :verified:elias@seocommunity.social
2024-10-23

App update: upload your own CSV file and visualize it.

Here are two examples of how to visualize some GSC data (countries on a map, and queries on an ECDF chart).

#DataScience #DataVisualization #Python #Plotly #Dash

youtube.com/watch?v=62eAMWigoT

Elias Dabbas :verified:elias@seocommunity.social
2024-10-21

🔵 No "sane defaults", and it doesn't even try to "save you from yourself". In fact you're encouraged to try out different combinations, make meaningless charts until you find the one that works.
🔵 Error messages that you would have received from a selected combination are always displayed at the bottom of the page

I'm finding this much easier to explore than actually typing the code for every combination of options. Hope you do too.

Check it out here:

bit.ly/4fcqzfc

Elias Dabbas :verified:elias@seocommunity.social
2024-10-21

New dataviz app: plotly express interactive

The (almost) full library, made interactive so you can explore all combinations of options

🔵 10 px datasets available to choose from
🔵 40 plot types to visualize any dataset
🔵 Can be used to learn/explore px or dataviz in general

#DataScience #DataVisualization #Python #Plotly #Dash

1/2

Elias Dabbas :verified:elias@seocommunity.social
2024-10-15

Would this be useful?
What would you use it for?

Examples:

- Page testing using different proxies
- Page testing using different devices (still can't figure out why it's not working properly though)
- Ads tracking (yours and/or competitors)
- Tracking featured content
- other?

bit.ly/486KQRa

#scrapy #Python #playwright

2/2

Elias Dabbas :verified:elias@seocommunity.social
2024-10-15

Screenshot taker tool, first step

Just created a very simple tool that takes a list of URLs, and takes screenshots of them, saving the images as PNG files.

It takes a screenshot of the full page if it needs scrolling.

You need scrapy and scrapy-playwright, so make sure you read the latter's install instructions, especially for Windows.

1/2

Elias Dabbas :verified:elias@seocommunity.social
2024-10-11

science. I have moved from the first to the second activity, which I now primarily focus on. This is a career shift and most people probably don’t want to do this. I’m writing the software so most of you can use it, and not worry about building it. You focus on gaining insights, and getting more productive, while I focus on building and improving. You're more than welcome to join in the development of course.

#DataScience #Python #SEO

4/4

Elias Dabbas :verified:elias@seocommunity.social
2024-10-11

select products, prices, they don’t do advertising, and so on. I've written previously about the difference between programming and software development, and I think it is empowering and liberating for all digital marketing people. You can really boost your data work and productivity, and analyze data in ways you never thought possible. Yet, you don’t have to get into software development, and thereby provide high quality entertainment for your developer friends. Try googling data

3/4

Elias Dabbas :verified:elias@seocommunity.social
2024-10-11

libraries can take your data work (analytics and productivity) to a new level. Building software tools is software development, even if you are building SEO tools, and it is completely out of scope of SEO. Just like a web developer can build an ecommerce website (software development), yet what they are doing is not called “ecommerce”. Sure, they need to have an understanding of ecommerce, the process, delivery, returns, invoicing, etc. But they still don’t “do” ecommerce. They don’t

2/4

Elias Dabbas :verified:elias@seocommunity.social
2024-10-11

Using Python for doing SEO, versus using Python to develop software/tools for SEO.

The first activity is called SEO.

The second one is called software development.

Important difference.

You can use Python for crawling (try one of the #advertools crawlers), analyzing log files (also advertools), XML sitemaps (yes, yes, advertools), running bulk robots.txt tests, weighted n-grams, and much more. These are SEO tasks. Running them in bulk, using a programming language and its powerful

1/4

Elias Dabbas :verified:elias@seocommunity.social
2024-10-10

would it be easy for you to run this operation (in Python, spreadsheet, R, or with a paper, pen and a bunch of crayons).
It’s much clearer and more concise when expressed in Python than it is in English:

df.sort_values(['date', 'views'], ascending=[True, False]).groupby('date').head(3)

You will never learn this if you just want to “learn Python”. It's not a Python thing, it's a data skill. Unless you “learn data science/analysis with Python”, then yes.
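
To make the one-liner above concrete, here is a minimal sketch on made-up data. The column names match the instructions ("date", "views"), while the URLs and numbers are invented purely for illustration, and it assumes pandas is installed.

```python
import pandas as pd

# Hypothetical page-view data: one row per (date, URL) pair.
df = pd.DataFrame({
    "date": ["2024-10-01"] * 4 + ["2024-10-02"] * 4,
    "url": ["/a", "/b", "/c", "/d", "/a", "/b", "/c", "/d"],
    "views": [50, 300, 120, 10, 80, 40, 500, 90],
})

top3 = (
    # Sort by date ascending, then by views descending within each date.
    df.sort_values(["date", "views"], ascending=[True, False])
    # Treat each date as its own group of rows.
    .groupby("date")
    # Keep the first three rows of every group (the most viewed URLs).
    .head(3)
    .reset_index(drop=True)
)

print(top3)
```

Swapping `.head(3)` for `.tail(3)` should give the bottom three URLs per day instead.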

3/3

#DataScience #Python #SEO #SEM

Elias Dabbas :verified:elias@seocommunity.social
2024-10-10

order.

This is done by taking every group of rows that share the same date, and sorting only those rows in descending order (according to the “views” column).
This is independent of all other rows in the table.

- Using the “date” column as a grouping variable, take the first/last three rows of each group of rows

Now we have the top pages that received the most views in descending order for each day.

If you understood the above instructions, and can imagine the final outcome, only then

2/3

Elias Dabbas :verified:elias@seocommunity.social
2024-10-10

Learning #Python for #SEO: Sorting by multiple columns

Did you know you can sort a table by two (or more) columns?
And that they can be independently sorted in a mix of descending/ascending orders? Why the hell would I want to do that?
What would that look like anyway?

Why: I want the top/bottom N URLs for every day. What would that look like? Here are the instructions:

- Sort the table using the “date” column in ascending order

- Sort the table using the “views” column in descending

1/3

Elias Dabbas :verified:elias@seocommunity.social
2024-10-08

Python for SEO

I don’t talk much about Python for SEO, because I mainly spend time writing actual Python for SEO, SEM, & digital marketing.

I’ll be talking a bit more about it, about how to approach it, learn it, enjoy it, and why it’s a misleading keyword.

Do you have any questions?
Any strong beliefs?
Any suggested topics?

I’d like to validate some ideas, learn more about what you think, and hopefully provide something useful.

#DataScience #Python

Elias Dabbas :verified:elias@seocommunity.social
2024-09-18

Here's one of the already crawled sites:
bit.ly/3XMewzq

Or start crawling and then analyze:
bit.ly/3MTYGM4

Elias Dabbas :verified:elias@seocommunity.social
2024-09-18

🕷 🕸 🕷 🕸 🕷 🕸
Crawl analytics: added link analysis tools

Internal vs external links

Default editable regex to distinguish internal from external

External links: De-duplicated, and domains extracted and counted. Shows which domains you really give importance to in terms of linking.

Internal links:

inlinks + outlinks = links (simple counting per URL)
Degree centrality: percentage of links for a URL vs potential links
PageRank: internal PageRank
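
The three internal link metrics above can be sketched in plain Python. This is not the app's code, just an illustration under stated assumptions: the edge list is made up, "potential links" is taken to mean n - 1 possible inlinks plus n - 1 possible outlinks, and PageRank is a basic power iteration with a damping factor of 0.85.

```python
from collections import defaultdict

# Hypothetical internal link edges: (source URL, target URL).
edges = [
    ("/", "/blog"),
    ("/", "/about"),
    ("/blog", "/"),
    ("/blog", "/post-1"),
    ("/post-1", "/blog"),
    ("/post-1", "/"),
    ("/about", "/"),
]

nodes = sorted({url for edge in edges for url in edge})
outlinks = defaultdict(set)
inlinks = defaultdict(set)
for src, dst in edges:
    outlinks[src].add(dst)
    inlinks[dst].add(src)

# Simple counting per URL: links = inlinks + outlinks.
link_counts = {u: len(inlinks[u]) + len(outlinks[u]) for u in nodes}

# Degree centrality (assumed definition): actual links / potential links,
# where each URL could link to, and be linked from, the n - 1 other URLs.
potential = 2 * (len(nodes) - 1)
degree_centrality = {u: link_counts[u] / potential for u in nodes}


def pagerank(nodes, outlinks, damping=0.85, iterations=50):
    """Basic power-iteration PageRank over an internal link graph."""
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iterations):
        # Rank held by pages with no outlinks is spread evenly over all pages.
        dangling = sum(rank[u] for u in nodes if not outlinks[u])
        new_rank = {u: (1.0 - damping) / n + damping * dangling / n for u in nodes}
        for src in nodes:
            for dst in outlinks[src]:
                new_rank[dst] += damping * rank[src] / len(outlinks[src])
        rank = new_rank
    return rank


ranks = pagerank(nodes, outlinks)
for url in nodes:
    print(url, link_counts[url], round(degree_centrality[url], 3), round(ranks[url], 3))
```

On a real crawl the edges would come from the crawled links, after the internal/external regex has filtered out external targets.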

#crawling #scraping #SEO #dash #plotly #AGGrid
