dev @ datarequests.org

A behind-the-scenes log by the developers at Datenanfragen.de/datarequests.org. We are hacking on data protection. #gdpr #dsgvo #privacy

You can expect threads about our latest changes, technical details, and interesting things we discover.
Longer blog posts in our devlog: datarequests.org/devlog

Toots by @baltpeter (^b) and @zner0L (^z).

Contact details and legal notice: datarequests.org/contact

dev @ datarequests.org boosted:
datarequests.org (@datarequestsorg)
2024-10-25

Interested in contributing to our project? On November 06 at 17:00 (CEST), we're holding an online meetup. Whether you're already contributing to datarequests.org, interested in joining, or just want to get to know us: Our contributor meetup is meant to be a space for asking questions about participating, for learning how to contribute to datarequests.org and Tweasel, and for data protection nerds to meet and network on various issues.

datarequests.org/verein/event/

2024-09-11

Our open request database at data.tweasel.org/ had been practically unusable for a while with pages taking way too long to load.

Turns out we were getting hammered with ridiculous amounts of requests by inconsiderate LLM crawlers. :|
This should now be fixed—we are now blocking their user agents in our reverse proxy. Thanks to @ubernauten for this very helpful blog post which pointed us in the right direction: blog.uberspace.de/2024/08/bad-

More details in this issue: github.com/tweaselORG/data.twe ^b
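
For illustration, blocking by user agent boils down to a substring check on the `User-Agent` header. The following is a minimal sketch in TypeScript, not our actual reverse proxy configuration: the list of crawler names and the `statusFor` helper are assumptions made up for this example, and the post does not say which user agents were actually blocked.

```typescript
// Illustrative substrings only; the post does not list the user agents that were actually blocked.
const BLOCKED_AGENT_PATTERNS: string[] = ['GPTBot', 'ClaudeBot', 'CCBot', 'Bytespider'];

// Returns true if the User-Agent header contains one of the blocked substrings (case-insensitive).
function isBlockedUserAgent(userAgent: string | undefined): boolean {
    if (!userAgent) return false;
    const ua = userAgent.toLowerCase();
    return BLOCKED_AGENT_PATTERNS.some((pattern) => ua.includes(pattern.toLowerCase()));
}

// Hypothetical helper: the HTTP status a proxy might answer with.
function statusFor(userAgent: string | undefined): number {
    return isBlockedUserAgent(userAgent) ? 403 : 200;
}

console.log(statusFor('Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot)')); // 403
console.log(statusFor('Mozilla/5.0 (X11; Linux x86_64; rv:130.0) Gecko/20100101 Firefox/130.0')); // 200
```

Requests without a `User-Agent` header are let through in this sketch. Whether that is desirable depends on the setup.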

2024-09-11

@rugk Indeed, just as @rufposten said in chaos.social/@tracktor/1131175, we have been in contact about this already.

We've actually just started working on extending our functionality for websites this month. I'll do my best to publish a new devlog as soon as I find the time—a lot has happened since the last one. ^b

dev @ datarequests.org boosted:
2024-05-14

Android: This post covers preparing the test device and the tools (Frida, Magisk) used to analyze what data apps send. Take a look! ✌️ 👇

kuketz-blog.de/in-den-datenstr

#share #android #frida #objection #tweasel #pirogue #tls #ssl #CertificatePinning #mitmproxy #proxy #intercepting #analyse #datenschutz #sicherheit #privacy #security #dsgvo

2024-05-06

New data in our open request database!

I've just finished another monkey run on 2,358 #Android apps. That's another 70k requests from April 2024 that can be used for understanding and researching #tracking. ^b

#tweasel #privacy

2023-10-10

We have also started doing legal research, looking into relevant complaints, court submissions, decisions, rulings, DPA recommendations, and legal commentary regarding tracking. This is to inform our decisions on how we establish tracker IDs as personal data in our complaints and also to prepare for writing our complaint templates.

As always: Have a look at the blog post for the full details.
datarequests.org/devlog/twease

2023-10-10

We’re back after the summer with our fourth #tweasel devlog: datarequests.org/devlog/twease

A few highlights: We’ve been busy improving the documentation of our TrackHAR adapters to provide better reasoning on why we think properties contain certain data types. We’ve also written a script for debugging our adapters, which allows us to run them against all matching requests in our open request database.

We already announced the database in a previous toot: chaos.social/@dev_at_datareque ^b #privacy #tracking

Screenshot of the debug script being run in a terminal. It shows properties (manufacturer, model, isRooted, carrier, screenHeight, etc.) being matched to values. For example, the values for osVersion are 11 and 13, for isEmulator: true and false, for timezone: 3600000.
2023-09-04

@rufposten @taketwo This is aimed mostly at researchers and journalists. We are building a much easier platform for users to run their own analyses. This is just one of the steps on the way there, and we are excited to share it.

2023-09-04

Our open request database is online: data.tweasel.org/ \o/

We regularly run #traffic analyses on thousands of #Android and #iOS apps. As we want to enable as many people as possible to look into the inner workings of trackers, we are publishing our datasets for other researchers, activists, journalists, and anyone else who is interested in understanding #tracking. There are already 250k requests from between January 2021 and July 2023, with more to come in the future. ^b
#tweasel #privacy

Screenshot of data.tweasel.org, showing a query that returns multiple requests.

The screenshot shows seven rows, with the following columns: initiator (e.g. com.amazon.dee.app@2.2.453377.0, com.opera.app.news@11.1.2254.67011), platform (android or ios), endpointUrl (e.g. https://us-u.openx.net/w/1.0/cm, https://csi.gstatic.com/csi), content (one request has base64-encoded binary content, another has an XML document; for the others, the content is blank), and headers.
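
As a rough idea of what can be done with rows like these once exported, here is a small TypeScript sketch that counts requests per endpoint host. The `RequestRow` shape is an assumption based only on the columns visible in the screenshot, and the sample rows are made up (only the two endpoint URLs come from the screenshot). It does not talk to data.tweasel.org.

```typescript
// Hypothetical row shape, based only on the columns visible in the screenshot.
interface RequestRow {
    initiator: string;
    platform: 'android' | 'ios';
    endpointUrl: string;
    content: string | null;
}

// Counts how many requests went to each endpoint hostname.
function countByHost(rows: RequestRow[]): Record<string, number> {
    const counts: Record<string, number> = {};
    for (const row of rows) {
        const host = new URL(row.endpointUrl).hostname;
        counts[host] = (counts[host] ?? 0) + 1;
    }
    return counts;
}

// Made-up sample rows; the initiators are placeholders, not real apps.
const sampleRows: RequestRow[] = [
    { initiator: 'com.example.one@1.0.0', platform: 'android', endpointUrl: 'https://us-u.openx.net/w/1.0/cm', content: null },
    { initiator: 'com.example.two@2.3.1', platform: 'ios', endpointUrl: 'https://us-u.openx.net/w/1.0/cm', content: null },
    { initiator: 'com.example.two@2.3.1', platform: 'ios', endpointUrl: 'https://csi.gstatic.com/csi', content: null },
];

const counts = countByHost(sampleRows);
console.log(counts['us-u.openx.net']); // 2
console.log(counts['csi.gstatic.com']); // 1
```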
2023-07-31

We made a few UX improvements: You don’t have to specify the IP address on iOS anymore as we use a usbmuxd proxy. In the CLI, you don't need to specify an app ID anymore.
We also fixed a bunch of bugs.

Finally, we have a new docs site and gave a workshop at the @digitalcourage #Aktivcongress. We have also collected some new traffic data.

You can find our new docs here: docs.tweasel.org/

And as always, there's much more in the blog post: datarequests.org/devlog/twease

2023-07-31

We have just published our third #tweasel devlog. You can read it here: datarequests.org/devlog/twease

One of the major changes we have made is switching to the @httptoolkit #Frida unpinning script for bypassing certificate pinning on Android. We ran an analysis comparing its performance to the script we used before and found that it works better for our use case. As a bonus, this change allowed us to close two related issues. ^b #privacy #Android #iOS

2023-06-12

Finally, we have trackers.tweasel.org/, a wiki that explains how TrackHAR recognizes and decodes requests, and provides some sample values of real observed transmissions from our #research data. We hope that this will become a valuable resource for everyone who wants to dig deeper into #tracking.

There's much more in the blog post, so check it out if you're interested. And stay tuned for more updates!

datarequests.org/devlog/twease

Screenshot of https://trackers.tweasel.org/t/branch-io/v1/, which shows details on a tracking endpoint. There are three sections:

"Endpoint URLs" lists a series of URL regexes that are used to determine whether a request matches the adapter. One example is: /^https:\/\/api2?\.branch\.io\/v1\/install$/

"Decoding steps" has a description of how to parse requests to this endpoint: "Parse the request body as JSON. Store that in the result for the request body."

Finally, "Observed data transmissions" is data that was observed being transmitted by this tracker. Only the first property, "App ID", is visible in the screenshot. It has a "context" of "body", two paths (cd.pn and ios_bundle_id), and a few examples of observed values like com.allesklar.meinestadt and com.adobe.PSMobile.
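
The three sections in the screenshot can be read as a small pipeline: match the URL, decode the body, look up the paths. The following TypeScript sketch mimics that, but it is not TrackHAR's actual API. `getPath` and `extractAppId` are hypothetical helpers, treating the paths as plain dot-separated keys is an assumption, and the request bodies are made up (which example app ID belongs to which path is a guess).

```typescript
// Regex copied from the "Endpoint URLs" section in the screenshot.
const endpointRegex = /^https:\/\/api2?\.branch\.io\/v1\/install$/;

// Hypothetical helper, not TrackHAR's API: resolves a dot-separated path in a parsed JSON value.
function getPath(value: unknown, path: string): unknown {
    let current: unknown = value;
    for (const key of path.split('.')) {
        if (typeof current !== 'object' || current === null) return undefined;
        current = (current as Record<string, unknown>)[key];
    }
    return current;
}

// Returns the app ID if the URL matches the endpoint, otherwise undefined.
function extractAppId(url: string, body: string): unknown {
    if (!endpointRegex.test(url)) return undefined;
    // Decoding step from the screenshot: parse the request body as JSON.
    const parsed: unknown = JSON.parse(body);
    // Try the two paths listed for "App ID" in order.
    return getPath(parsed, 'cd.pn') ?? getPath(parsed, 'ios_bundle_id');
}

// Made-up request bodies for illustration.
console.log(extractAppId('https://api2.branch.io/v1/install', '{"cd":{"pn":"com.allesklar.meinestadt"}}'));
// com.allesklar.meinestadt
console.log(extractAppId('https://api.branch.io/v1/install', '{"ios_bundle_id":"com.adobe.PSMobile"}'));
// com.adobe.PSMobile
console.log(extractAppId('https://api.branch.io/v1/open', '{"ios_bundle_id":"com.adobe.PSMobile"}'));
// undefined
```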
2023-06-12

Another tool we released is TrackHAR (github.com/tweaselORG/TrackHAR), a library for detecting #tracking data transmissions from #traffic in #HAR format. It uses custom adapters (and soon, indicator matching) to handle different tracking endpoints and extract the transmitted data.

You can also use TrackHAR through our CLI. Just provide a traffic recording as a HAR file and you'll get a list of the detected tracking data transmissions, including the actual values.

Screenshot of running the tweasel detect-traffic command on a HAR file recorded from de.check24.check24. Two POST requests are shown, with a table of the detected data transmissions underneath, each with a property, context, path, and value. The first request is to app.adjust.net.in and transmitted appId, appVersion, idfa, otherIdentifiers, language, model, osName, osVersion, country, manufacturer, screenWidth, and screenHeight. The second request is to app-measurement.com and transmitted appId, appVersion, idfa, osName, and osVersion.
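
For readers who haven't looked inside a HAR file before: it is JSON with a list of entries under `log.entries`, each holding a `request` with a `method` and a `url`. The TypeScript sketch below only walks that structure to list the hosts that received POST requests. It does not use TrackHAR or the CLI, `postHosts` is a hypothetical helper, and the URL paths in the sample are made up (only the two hostnames come from the screenshot).

```typescript
// Minimal subset of the HAR structure needed for this example.
interface HarEntry {
    request: { method: string; url: string };
}
interface Har {
    log: { entries: HarEntry[] };
}

// Returns the distinct hostnames that received POST requests, in order of first appearance.
function postHosts(har: Har): string[] {
    const hosts = har.log.entries
        .filter((entry) => entry.request.method === 'POST')
        .map((entry) => new URL(entry.request.url).hostname);
    return hosts.filter((host, index) => hosts.indexOf(host) === index);
}

// Made-up sample recording; the paths are placeholders.
const har: Har = {
    log: {
        entries: [
            { request: { method: 'POST', url: 'https://app.adjust.net.in/session' } },
            { request: { method: 'GET', url: 'https://example.org/' } },
            { request: { method: 'POST', url: 'https://app-measurement.com/a' } },
            { request: { method: 'POST', url: 'https://app-measurement.com/a' } },
        ],
    },
};

console.log(postHosts(har)); // [ 'app.adjust.net.in', 'app-measurement.com' ]
```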
2023-06-12

We also released cyanoacrylate (github.com/tweaselORG/cyanoacr), a toolkit for large-scale automated #traffic #analysis of mobile apps on #Android and #iOS. It uses #mitmproxy to capture the HTTP(S) traffic of apps in HAR format and appstraction to instrument physical devices or emulators. Cyanoacrylate handles the management of certificate authorities and #WireGuard mitmproxy setup automatically.

You can use cyanoacrylate without writing any code through our CLI (github.com/tweaselORG/cli).

2023-06-12

We published our first update blog post for #tweasel. Our plan is to do these biweekly from now on.

datarequests.org/devlog/twease

A lot has happened since our last update in January. We have released a set of tools and libraries for mobile #tracking analysis.

First up: appstraction (github.com/tweaselORG/appstrac), an abstraction layer for instrumenting #Android and #iOS. It allows you to install, uninstall, start, and stop apps, and to manage emulator snapshots, the clipboard, the proxy, certificates, and more. ^b

Screenshot from the appstraction README showing an example of how to reset an Android emulator and install an app in it.

Full text:

Example usage

The following example shows how to reset an Android emulator and then install an app on it:

import { platformApi } from 'appstraction';

(async () => {
    const android = platformApi({
        platform: 'android',
        runTarget: 'emulator',
        capabilities: []
    });
    
    await android.ensureDevice();
    await android.resetDevice('<snapshot name>');
    await android.installApp('</path/to/app/files/*.apk>');
})();
2023-01-24

If you're instead interested in #Android apps on #GooglePlay, here's my earlier parse-play library: github.com/baltpeter/parse-pla

Reverse-engineering those APIs was quite the rodeo as well. I've written an in-depth blog post on that: benjamin-altpeter.de/android-t

For an idea on why we need reliable automated access to this data, have a look at our analysis of data safety labels: datarequests.org/blog/android-

We hope that others will use these libraries as well. Please reach out if you need additional features!

2023-01-24

As the library is using internal #API endpoints, I've had to resort to doing a lot of #ReverseEngineering. That was quite the rabbit hole. I've tried to document as much as possible of the arcane knowledge about Apple internals I gained along the way in these issues:

github.com/tweaselORG/parse-tu
github.com/tweaselORG/parse-tu
github.com/tweaselORG/parse-tu
github.com/tweaselORG/parse-tu

2023-01-24

Our first project is to create reusable and well-documented tools and libraries from the code (github.com/baltpeter/thesis-mo) I already wrote for my master's thesis (benjamin-altpeter.de/doc/thesi).

And we've just made our first release: parse-tunes is a library for fetching select data on #iOS apps from the #Apple App Store via undocumented internal iTunes APIs.

github.com/tweaselORG/parse-tu

Currently, it can fetch charts of the most popular apps and metadata (including privacy labels) for individual apps.

Screenshot from the parse-tunes README showing example usages for fetching top charts and app metadata.

Full text:

Fetch app top charts

The following example fetches the app IDs of the current 200 top free iPhone apps across all categories for Germany:

import { fetchTopApps, charts, countries, genres } from 'parse-tunes';

(async () => {
    const topChart = await fetchTopApps({ genre: genres.all, chart: charts.topFreeIphone, country: countries.DE });
    console.log(topChart.length); // 200
    console.log(topChart[0]); // 1186271926
})();

Fetching more app metadata in addition to the app IDs is currently not possible due to server-side limitations by the endpoint we're using. See #2 for details.

Fetch app metadata

The following example fetches the developer name and custom artwork for the Facebook app on iPhone for the German App Store in German:

import { fetchAppDetails } from 'parse-tunes';

(async () => {
    const appDetails = await fetchAppDetails({
        appId: 284882215,
        platforms: ['iphone'],
        attributes: ['artistName', 'customArtwork'],
        country: 'DE',
        language: 'de-DE',
    });
    console.log(appDetails.artistName);
    // Meta Platforms, Inc.
    console.log(appDetails.platformAttributes.ios?.customAttributes.default.default.customArtwork.url);
    // https://is5-ssl.mzstatic.com/image/thumb/Purple113/v4/45/ab/be/45abbeac-3a7e-aa86-c1c5-007c09df6d7c/Icon-Production-0-1x_U007emarketing-0-7-0-85-220.png/{w}x{h}{c}.{f}
})();
2023-01-24

Introducing #tweasel. @zner0L and I (@baltpeter) will be working on fighting #tracking in mobile #apps thanks to #NLnet funding (nlnet.nl/project/TrackingWease). Our goal is to automate complaints against tracking under the #GDPR and #ePrivacy directive.

We don't have a website yet (this is our behind-the-scenes account, after all), but all code will of course be FOSS (github.com/tweaselORG) and we'll report here.

First overview in our #FireShonks talk: media.ccc.de/v/fire-shonks-202 (DE with EN dub). ^b
