#anonymized

2025-05-22

Researchers Scrape 2 Billion #Discord Messages & Publish Them Online

The data was pulled from 3,167 servers & covers posts made between 2015 & 2024, the entire time Discord has been active.

… claim they’ve #anonymized the data, it’s hard to imagine anyone is comfortable with almost a decade of their Discord messages sitting in a public JSON file. Separately, … a Discord tool called "Searchcord" based on a different data set that shows non-anonymized chat histories
#privacy

404media.co/researchers-scrape

Todd A. Jacobs | Pragmatic Cybersecurity todd_a_jacobs@infosec.exchange
2025-01-13

#DuckDuckGo is now offering free, #anonymized access to a number of fast #AI #chatbots that won't train on your data. You currently don't get all the premium models and features of paid services, but you do get access to privacy-promoting, anonymized versions of smaller models like GPT-4o mini from #OpenAI and open-source #MoE (mixture of experts) models like Mixtral 8x7B.

Of course, for truly sensitive or classified data you should never use online services at all. Anything online carries heightened risks of human error; deliberate malfeasance; corporate espionage; legal, illegal, or extra-legal warrants; and network wiretapping. I personally trust DuckDuckGo's no-logging policies and presume their anonymization techniques are sound, but those of us in #cybersecurity know the practical limitations of such measures.

For any situation where those measures are insufficient, you'll need to run your own instance of a suitable model on a local AI engine. However, that's not really the #threatmodel for the average user looking to get basic things done. Great use cases include finding quick answers that traditional search engines aren't good at, or performing common AI tasks like summarizing or improving textual information.

The AI service provides the typical user with essential AI capabilities for free. It also takes steps to prevent for-profit entities with privacy-damaging #TOS from training on your data at whim. DuckDuckGo's approach seems perfectly suited to these basic use cases.

I laud DuckDuckGo for their ongoing commitment to privacy, and for offering this valuable addition to the AI ecosystem.

duckduckgo.com/chat

sounddrill :verified_dragon: sounddrill@infosec.exchange
2024-08-27

What are people's thoughts on "#ethicaltelemetry"?

I'm making a simple app with the purpose of releasing it on the Play store, FDroid, etc.

And I want to track:
- #Anonymized install count
- Source of install (#Source built, Bleeding Edge, #fdroid and #playstore)

(I have a way to track source built/play store/fdroid already but no way to get the #data off-device)

I'm quite new to the whole scene so I'm asking here:
- How do I do it in a way that doesn't piss people off? (ofc target market doesn't care but still)
- I also want to ensure the telemetry method works on phones as #old as possible, like #android 4.1 (API 16), if possible
- How do I #protect myself from some kid figuring out my telemetry method and spamming it with bots?

Intro screen -> Opt in/opt out(with red warning explaining how much it'll help us to have those stats) -> button at the bottom after all the checks called "Login" leading to Login.kt fragment.

I'm thinking of something like this but I still have no idea how to #collect this info
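One possible shape for this, as a rough sketch rather than a recommendation: the app generates a random ID once per install (not derived from any device identifier), sends it with the install source only if the user opted in, and the server counts each ID once while storing only a keyed hash of it. The sketch is in Python to keep it short; the names `new_install_id`, `build_ping`, `InstallCounter`, and the source labels are all hypothetical, and the transport (for example a plain HTTPS POST) is left out.

```python
import hashlib
import hmac
import json
import uuid

# Hypothetical labels for the install sources listed in the post.
ALLOWED_SOURCES = {"source_built", "bleeding_edge", "fdroid", "playstore"}


def new_install_id() -> str:
    """Random ID created once per install; not tied to any device identifier."""
    return uuid.uuid4().hex


def build_ping(install_id: str, source: str, opted_in: bool):
    """Return the JSON body for one ping, or None if the user opted out."""
    if not opted_in:
        return None
    if source not in ALLOWED_SOURCES:
        raise ValueError(f"unknown install source: {source}")
    return json.dumps({"id": install_id, "source": source}, sort_keys=True)


class InstallCounter:
    """Server-side tally that counts each install ID at most once."""

    def __init__(self, server_secret: bytes):
        self._secret = server_secret
        self._seen = set()  # keyed hashes only, never raw install IDs
        self.counts = {source: 0 for source in ALLOWED_SOURCES}

    def record(self, body: str) -> bool:
        """Count the ping if it is well formed and new; return whether it counted."""
        try:
            ping = json.loads(body)
        except ValueError:
            return False
        if not isinstance(ping, dict):
            return False
        source = ping.get("source")
        install_id = ping.get("id")
        if source not in ALLOWED_SOURCES or not isinstance(install_id, str):
            return False
        digest = hmac.new(self._secret, install_id.encode(), hashlib.sha256).hexdigest()
        if digest in self._seen:
            return False  # repeat ping from the same install
        self._seen.add(digest)
        self.counts[source] += 1
        return True
```

Deduplication like this only stops the same install from being counted twice. It does not stop a bot that makes up a fresh ID for every request, so spam protection would still need something on top, such as server-side rate limiting.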

2024-02-27

#Meta will start collecting “anonymized” data about #Quest headset usage
#facebook #privacy #surveillance #anonymized

arstechnica.com/?p=2006128

Underflow :verified: underflow@infosec.exchange
2022-11-15

87% of the population in the United States had characteristics that likely made them unique based only on 5-digit ZIP code, gender, and date of birth!

#leaked or #anonymized datasets can pose a #privacy risk even when they contain very little information.

These stats come from a 2000 study by Latanya Sweeney, viewable here (PDF): dataprivacylab.org/projects/id
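To get a feel for how this kind of uniqueness is measured, here is a minimal Python sketch that computes the share of records whose (ZIP, gender, date of birth) combination appears exactly once. The function name `unique_fraction` and the sample rows are made up for illustration; this is not Sweeney's code or data.

```python
from collections import Counter


def unique_fraction(records, keys=("zip", "gender", "dob")):
    """Fraction of records whose combination of `keys` appears exactly once."""
    if not records:
        return 0.0
    combos = Counter(tuple(r[k] for k in keys) for r in records)
    unique = sum(1 for r in records if combos[tuple(r[k] for k in keys)] == 1)
    return unique / len(records)


# Made-up rows: no names, yet half of them are already one of a kind.
people = [
    {"zip": "02138", "gender": "F", "dob": "1970-01-15"},
    {"zip": "02138", "gender": "F", "dob": "1970-01-15"},
    {"zip": "02138", "gender": "M", "dob": "1970-01-15"},
    {"zip": "02139", "gender": "F", "dob": "1982-07-04"},
]

print(unique_fraction(people))  # 0.5
```

Any record that is unique on these three fields can potentially be matched against another dataset that carries the same fields alongside a name.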

CK's Technology News CKsTechnologyNews
2021-12-30

data is rarely

Mozilla and Microsoft have been using this argument for years, but if data are truly anonymous, then there is no point in collecting them because you could not use them at all.

flowingdata.com/2021/12/29/ano
