#FeatureSelection

InterData VN @interdatavn
2025-04-28

What is Feature Selection? An A-Z guide to feature selection in ML

In Machine Learning, Feature Selection acts as a smart filter that refines a model's input data. By screening out low-value or noisy features, it directly improves the performance and reliability of predictive models. The article below explains what Feature Selection is and details its advantages.

Read now: interdata.vn/blog/feature-sele

2025-02-10

⚙️ Hybrid Approach: The Hybrid method combines the strengths of both Filter and Wrapper approaches, offering a balance between speed and accuracy. By using a filter to narrow down the features and a wrapper for fine-tuning, it provides an effective and efficient feature selection process.
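
A minimal plain-Python sketch of the two-stage idea described in this post. The feature names, the toy data, the correlation filter, and the 1-nearest-neighbour model are all assumptions chosen for illustration; in practice a library such as scikit-learn would usually supply both stages.

```python
from itertools import combinations
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 for a constant column."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)

def loo_1nn_accuracy(rows, labels):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier."""
    hits = 0
    for i, row in enumerate(rows):
        nearest = min(
            (j for j in range(len(rows)) if j != i),
            key=lambda j: sum((a - b) ** 2 for a, b in zip(row, rows[j])),
        )
        hits += labels[nearest] == labels[i]
    return hits / len(rows)

def hybrid_select(columns, labels, shortlist_size):
    # Stage 1 (filter): keep the features most correlated with the label.
    ranked = sorted(columns, key=lambda n: abs(pearson(columns[n], labels)),
                    reverse=True)
    shortlist = ranked[:shortlist_size]
    # Stage 2 (wrapper): score every subset of the shortlist with a model.
    best_subset, best_score = (), -1.0
    for size in range(1, len(shortlist) + 1):
        for subset in combinations(shortlist, size):
            rows = list(zip(*(columns[f] for f in subset)))
            score = loo_1nn_accuracy(rows, labels)
            if score > best_score:  # strict: smaller subsets win ties
                best_subset, best_score = subset, score
    return shortlist, list(best_subset), best_score

# Hypothetical toy data: "a" separates the classes, "b" partly, "c" not at all.
columns = {
    "a": [0.1, 0.2, 0.3, 0.9, 1.0, 1.1],
    "b": [0.0, 0.4, 0.5, 0.6, 0.7, 1.2],
    "c": [1.0, 5.0, 3.0, 3.0, 1.0, 5.0],
}
labels = [0, 0, 0, 1, 1, 1]
shortlist, subset, score = hybrid_select(columns, labels, shortlist_size=2)
print(shortlist, subset, score)
```

The cheap filter removes "c" before any model is trained, so the expensive subset search only runs over two features instead of three.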

2025-02-10

🤖 Ensemble Approach: This technique combines multiple models to select the best features. By using multiple algorithms and aggregating their results, it improves robustness and accuracy. Common methods include Random Forest and Gradient Boosting.
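
One way to picture the aggregation step is a rank vote across several importance tables. The sketch below assumes three hypothetical tables of importance scores (made-up numbers standing in for what models such as Random Forest or Gradient Boosting might report) and combines them by mean rank; it does not train any model itself.

```python
def rank_positions(scores):
    """Map each feature to its rank (0 = best) under one scorer."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: pos for pos, name in enumerate(ordered)}

def ensemble_select(score_tables, k):
    """Aggregate several importance tables by mean rank (a Borda-style vote)."""
    ranks = [rank_positions(table) for table in score_tables]
    features = list(score_tables[0])
    mean_rank = {f: sum(r[f] for r in ranks) / len(ranks) for f in features}
    # Lower mean rank means the models agree the feature is important.
    return sorted(features, key=mean_rank.get)[:k], mean_rank

# Hypothetical importance scores from three different models.
scores_forest = {"age": 0.40, "income": 0.35, "zip": 0.05, "clicks": 0.20}
scores_boosting = {"age": 0.30, "income": 0.45, "zip": 0.10, "clicks": 0.15}
scores_linear = {"age": 0.50, "income": 0.10, "zip": 0.05, "clicks": 0.35}

selected, mean_rank = ensemble_select(
    [scores_forest, scores_boosting, scores_linear], k=2
)
print(selected)
```

Voting on ranks rather than raw scores avoids having to put importances from different models on a common scale.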

2025-02-10

🎯 Wrapper Approach: Unlike the Filter approach, the Wrapper method evaluates feature subsets by training a model. It iteratively adds or removes features to find the optimal set. While more computationally expensive, it tends to provide better results when combined with powerful models.
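
The iterative "add a feature, retrain, compare" loop can be sketched as greedy forward selection. Everything below is an assumption for illustration: the toy data, the feature names, and the choice of a nearest-centroid classifier scored by leave-one-out accuracy.

```python
def loo_accuracy(rows, labels):
    """Leave-one-out accuracy of a nearest-centroid classifier."""
    correct = 0
    for i, (row, label) in enumerate(zip(rows, labels)):
        # Train: per-class centroids from every sample except sample i.
        groups = {}
        for j, (other, other_label) in enumerate(zip(rows, labels)):
            if j != i:
                groups.setdefault(other_label, []).append(other)
        centroids = {
            cls: [sum(col) / len(col) for col in zip(*members)]
            for cls, members in groups.items()
        }
        # Predict: the class whose centroid is closest to the held-out sample.
        pred = min(
            centroids,
            key=lambda cls: sum((a - b) ** 2 for a, b in zip(row, centroids[cls])),
        )
        correct += pred == label
    return correct / len(rows)

def forward_select(columns, labels):
    """Greedily add the feature that most improves the model's score."""
    selected, best = [], 0.0
    remaining = list(columns)
    while remaining:
        scored = []
        for name in remaining:
            feats = selected + [name]
            rows = [list(vals) for vals in zip(*(columns[f] for f in feats))]
            scored.append((loo_accuracy(rows, labels), name))
        score, name = max(scored)
        if score <= best:  # stop when no candidate improves the score
            break
        selected.append(name)
        remaining.remove(name)
        best = score
    return selected, best

# Hypothetical toy data: only "a" carries class information.
columns = {
    "a": [0.1, 0.2, 0.3, 0.9, 1.0, 1.1],
    "b": [1.0, 5.0, 3.0, 3.0, 1.0, 5.0],
    "c": [2.0, 2.0, 9.0, 2.0, 9.0, 2.0],
}
labels = [0, 0, 0, 1, 1, 1]
selected, best = forward_select(columns, labels)
print(selected, best)
```

Each round retrains the model once per remaining feature, which is where the extra computational cost of wrapper methods comes from.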

2025-02-10

🔍 Filter Approach: This technique evaluates the relevance of features based on statistical tests, such as correlation, chi-square, or mutual information. It selects features independently of the learning algorithm, making it fast but sometimes less accurate.
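
As a rough illustration of a filter, the sketch below scores each feature by its absolute Pearson correlation with the target and keeps the top k, without training any model. The feature names and values are hypothetical toy data; chi-square or mutual information could be swapped in as the scoring function.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:  # a constant column carries no signal
        return 0.0
    return cov / (sx * sy)

def filter_select(columns, target, k):
    """Rank features by |correlation| with the target and keep the top k."""
    scores = {name: abs(pearson(col, target)) for name, col in columns.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores

# Hypothetical toy data.
columns = {
    "size": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "noise": [3.0, 1.0, 4.0, 1.0, 5.0, 9.0],
    "const": [7.0] * 6,
    "age": [6.0, 5.0, 4.0, 3.5, 2.0, 1.0],
}
target = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2]

selected, scores = filter_select(columns, target, k=2)
print(selected)
```

Because every feature is scored on its own, two redundant features (here "size" and "age") can both be kept, which is one reason filters are fast but sometimes less accurate.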

2025-02-10

🚀 Feature Selection is a crucial step in the data preprocessing pipeline for machine learning. In my latest article, I explore various techniques to choose the most relevant features from a dataset, enhancing model performance while reducing complexity. Stay tuned as I dive deeper into these approaches.

2024-12-17

Enteric Fermentation in 2022

Livestock digestion emits too much methane:
* Too many bovines in India, Pakistan, Brazil, United States, China;
* Too many sheep and pigs in China.

(The bubble sizes depend on the amount of methane emitted in 2022.)

#GreenhouseForcing #methane #emissions #climateChange #climateBreakdown #climateCollapse #dataViz #bubbleChart #dataMining #plotly #featureEngineering #featureSelection #dataDon

2024-10-13

#ml #machinelearning #datascience #dailyreport
#featureselection
I have been reading about "feature selection" for ML: the
process of eliminating candidate features from a dataset.

Main types:
- Intrinsic (implicit, also called embedded) methods:
selection happens inside model training.
- Filter methods: score features before training,
e.g. by correlation or F-test.
- Wrapper methods: search over feature subsets with a model,
e.g. forward/backward selection.

They can also be categorized as:
- Supervised: based on correlation with the target.
- Unsupervised: based on correlation among the features
themselves, without the target, e.g. PCA, t-SNE, autoencoders,
independent component analysis (ICA).
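
A small sketch of the unsupervised idea, judging features only by their correlation with each other: drop any feature that is almost a copy of one already kept. This is a plain pairwise-correlation filter, not PCA or any of the other methods listed; the column names, values, and the 0.95 threshold are assumptions.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 for a constant column."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)

def drop_correlated(columns, threshold=0.95):
    """Keep a feature only if it is not near-duplicated by a kept one."""
    kept = []
    for name in columns:
        if all(abs(pearson(columns[name], columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical toy data: height in inches duplicates height in centimetres.
height_cm = [150.0, 160.0, 170.0, 180.0, 190.0]
columns = {
    "height_cm": height_cm,
    "height_in": [h / 2.54 for h in height_cm],
    "weight_kg": [70.0, 52.0, 88.0, 61.0, 75.0],
}
kept = drop_correlated(columns)
print(kept)
```

No target variable appears anywhere in the code, which is what makes the approach unsupervised.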

Interesting methods are “Stepwise forward/backward
selection”, “Simulated Annealing (SA)” and “Genetic
Algorithms”.
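
To give a feel for the simulated annealing variant, here is a minimal sketch that toggles one feature at a time and sometimes accepts a worse subset while the temperature is high. The scoring function is a hypothetical stand-in (made-up gains minus a per-feature penalty); in a real setting it would be something like cross-validated model performance. The cooling schedule and step count are also assumptions.

```python
import math
import random

def anneal_select(features, score, steps=500, start_temp=1.0, seed=0):
    """Search feature subsets by simulated annealing; returns the best seen."""
    rng = random.Random(seed)
    current = frozenset(f for f in features if rng.random() < 0.5)
    current_score = score(current)
    best, best_score = current, current_score
    for step in range(steps):
        temp = start_temp * (1 - step / steps) + 1e-9  # linear cooling
        candidate = current ^ {rng.choice(features)}   # toggle one feature
        cand_score = score(candidate)
        delta = cand_score - current_score
        # Always accept improvements; accept worse moves with a probability
        # that shrinks as the temperature drops.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, current_score = candidate, cand_score
            if current_score > best_score:
                best, best_score = current, current_score
    return best, best_score

# Hypothetical per-feature gains; each feature costs 0.05 to include.
GAIN = {"a": 0.30, "b": 0.20, "c": 0.02, "d": 0.01}

def toy_score(subset):
    return sum(GAIN[f] for f in subset) - 0.05 * len(subset)

best, best_score = anneal_select(list(GAIN), toy_score)
print(sorted(best), round(best_score, 3))
```

Unlike stepwise selection, the search can revisit and undo earlier decisions, at the price of a random and non-exhaustive exploration.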

Links:
- Max Kuhn and Kjell Johnson. Applied Predictive Modeling.
Springer.
- feat.engineering/feature-selec

IB Teguh TM @teguhteja
2024-08-26

Discover effective feature selection strategies in machine learning. Learn how filter, wrapper, and embedded methods improve model accuracy and efficiency.

teguhteja.id/feature-selection

katch wreck @katchwreck
2023-04-11

`It is considered a non-linear approach as the mapping cannot be represented as a linear combination of the original variables as possible in techniques such as principal component analysis, which also makes it more difficult to use for classification applications`

en.wikipedia.org/wiki/Sammon_m

Jason H. Moore, Ph.D. @moorejh@mastodon.online
2022-12-17
2022-12-16

iCite: "ITERATIVE RE‐WEIGHTED COVARIATES SELECTION FOR ROBUST FEATURE SELECTION MODELLING IN THE PRESENCE OF OUTLIERS (IRCOVSEL)"
Journal of Chemometrics 2022
doi.org/10.1002/cem.3458.
#openaccess #chemometrics #featureselection

2022-12-02

📢📢📢 New #Paper: '#FeatureSelection with Distance Correlation' (arxiv.org/abs/2212.00046) - a short #PaperSummary thread

We investigate how to automatically find a small number of features that, when put into a simple #NeuralNetwork, yield good performance (e.g. for classification)

Two possible uses:
- Explain the behavior of a #BlackBox classifier
- Build a light-weight classifier from scratch
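
For readers unfamiliar with the statistic in the paper's title, below is a plain-Python sketch of the standard sample distance correlation for one-dimensional data. It is the textbook definition only, not the paper's selection procedure, and the example data are made up. Unlike Pearson correlation, it picks up non-linear dependence such as y = x².

```python
from math import sqrt

def _centered_distances(xs):
    """Pairwise |xi - xj| matrix, double-centered (row, column, grand mean)."""
    n = len(xs)
    d = [[abs(a - b) for b in xs] for a in xs]
    row_mean = [sum(r) / n for r in d]  # equals column means (d is symmetric)
    grand = sum(row_mean) / n
    return [
        [d[i][j] - row_mean[i] - row_mean[j] + grand for j in range(n)]
        for i in range(n)
    ]

def distance_correlation(xs, ys):
    """Sample distance correlation between two 1-D samples, in [0, 1]."""
    n = len(xs)
    a, b = _centered_distances(xs), _centered_distances(ys)
    dcov2 = sum(p * q for ra, rb in zip(a, b) for p, q in zip(ra, rb)) / n**2
    dvar_x = sum(p * p for r in a for p in r) / n**2
    dvar_y = sum(q * q for r in b for q in r) / n**2
    if dvar_x * dvar_y == 0:  # a constant sample has no dependence to measure
        return 0.0
    return sqrt(max(dcov2, 0.0) / sqrt(dvar_x * dvar_y))

# Made-up data: the quadratic target has zero Pearson correlation with x.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
linear = [2 * x + 1 for x in xs]
squared = [x * x for x in xs]

dcor_linear = distance_correlation(xs, linear)
dcor_squared = distance_correlation(xs, squared)
print(dcor_linear, dcor_squared)
```

A feature ranked by this score would not be discarded just because its relationship with the target is non-linear.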
