Where I work, we have a fairly large archive of documents going back to the 1930s, and I want to help the archive team import them into a GPT model. We have already begun digitizing the documents into OCR'ed PDF files, so that part at least is covered.
My question is: which fully offline AI models are currently worth trying in an air-gapped environment? We need to import all of the PDF files along with their metadata (title, date, tags, etc.) so that their content is available on top of the larger general model.