After struggling to get #python #PyMuPDF to work and with the deadline close, I switched to a combination of other commands.
First, the #linux #pdftohtml command, which is so much faster than PyMuPDF and packages the result much like a saved website.
Next, with #NeoVim and #RegEx, I formatted the #HTML file so it could be quickly processed with #NodeJs #cheerio and eventually saved, via #json, in #sqlite.
Is it elegant and automatic? No, but it works!
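
For anyone curious what the HTML to JSON to sqlite part of such a pipeline can look like, here is a minimal sketch. It is not the code I actually ran: it swaps NodeJs and cheerio for Python's standard library (html.parser, json, sqlite3) so it runs without installing anything. The sample HTML, the `paragraphs` table, and the names `ParagraphCollector`, `html_to_records` and `save_records` are all made up for illustration, and real pdftohtml output will need its own selectors and cleanup.

```python
import json
import sqlite3
from html.parser import HTMLParser


class ParagraphCollector(HTMLParser):
    """Collect the visible text of every <p> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._buffer = None  # None while the parser is outside a <p>

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._buffer = []
        elif tag == "br" and self._buffer is not None:
            # Treat a line break inside a paragraph as a space.
            self._buffer.append(" ")

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "p" and self._buffer is not None:
            # Collapse runs of whitespace and skip empty paragraphs.
            text = " ".join("".join(self._buffer).split())
            if text:
                self.paragraphs.append(text)
            self._buffer = None


def html_to_records(html):
    """Turn HTML into a list of JSON-serialisable dicts, one per paragraph."""
    parser = ParagraphCollector()
    parser.feed(html)
    parser.close()
    return [
        {"position": i, "text": text}
        for i, text in enumerate(parser.paragraphs, start=1)
    ]


def save_records(conn, records):
    """Store the records in an assumed `paragraphs` table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS paragraphs ("
        "position INTEGER PRIMARY KEY, text TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO paragraphs (position, text) VALUES (:position, :text)",
        records,
    )
    conn.commit()


# Made-up sample standing in for a cleaned-up pdftohtml page.
SAMPLE_HTML = """<html><body>
<p>First   line<br/>of text</p>
<p>Second <b>bold</b> paragraph</p>
<p>   </p>
</body></html>"""

records = html_to_records(SAMPLE_HTML)
payload = json.dumps(records)  # the intermediate JSON step
conn = sqlite3.connect(":memory:")  # use a file path for a real database
save_records(conn, json.loads(payload))
rows = conn.execute(
    "SELECT position, text FROM paragraphs ORDER BY position"
).fetchall()
print(rows)
```

Going through JSON in the middle keeps the parsing and the database step independent, so either half can be rerun or replaced on its own.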