@georges RTC should actually be a great fit for this! Grouping the triples extracted from a page into a compound and annotating it with provenance metadata (source URL, scraping timestamp, etc.) is pretty much the core use case of RTC.
Ontogen could work too if you want full version history of the dataset, but it still has some limitations. I'm working on a larger update that should address these, and I hope to release it this summer.
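
To make the compound idea a bit more concrete, here is a rough sketch in plain Python of what "group a page's triples and annotate the group" could look like. This is not RTC's actual API, just an illustration of the data shape: the `Compound` class, the `compound_for_page` helper, the `urn:compound:` id scheme, and the example URL and triples are all made up for this sketch, and using the PROV terms `prov:wasDerivedFrom` and `prov:generatedAtTime` as annotation keys is my assumption.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# A triple as plain strings: (subject, predicate, object).
Triple = tuple[str, str, str]


@dataclass
class Compound:
    """Hypothetical stand-in for a triple compound: a named set of triples
    plus annotations that apply to the group as a whole."""
    id: str
    triples: set[Triple] = field(default_factory=set)
    annotations: dict[str, str] = field(default_factory=dict)


def compound_for_page(source_url: str, triples: list[Triple],
                      scraped_at: datetime) -> Compound:
    """Group all triples scraped from one page and attach provenance once,
    on the compound, instead of on every single triple."""
    return Compound(
        id=f"urn:compound:{source_url}",  # made-up id scheme
        triples=set(triples),
        annotations={
            "prov:wasDerivedFrom": source_url,
            "prov:generatedAtTime": scraped_at.isoformat(),
        },
    )


# Example data (invented for illustration).
page_triples = [
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:alice", "foaf:knows", "ex:bob"),
]
compound = compound_for_page(
    "https://example.org/page/1",
    page_triples,
    datetime(2024, 1, 1, tzinfo=timezone.utc),
)
```

The point of the design is that the source URL and timestamp are stated once per page, and every triple in the compound inherits them.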

