An example screen used by In Codice Ratio to allow students to train the AI OCR model. Image: In Codice Ratio
Researchers in Italy, working on a project called In Codice Ratio, are using OCR and AI to begin digitising handwritten manuscripts from the Vatican Secret Archives. The challenge lies in building an OCR system that is accurate enough at recognising handwritten Latin script. The researchers have pioneered a method called jigsaw segmentation, in which a word is segmented into blocks analogous to pen strokes rather than into individual letters. They then enlisted students from 24 schools in Italy to help train the AI model used in the OCR system by tagging correctly identified characters. The model boasts a 96% identification success rate.
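To give a rough feel for what segmenting a word into stroke-like blocks (rather than letters) can look like, here is a minimal Python sketch. It is not the In Codice Ratio algorithm, whose details are not given here. It swaps in a simple, well-known stand-in: cutting a binarised word image wherever a column contains very little ink, as happens at the thin joins between pen strokes. The names `column_ink`, `split_into_blocks`, and `max_ink`, and the toy image `WORD`, are hypothetical and exist only for illustration.

```python
from typing import List, Tuple

Image = List[List[int]]  # rows of pixels: 1 = ink, 0 = background


def column_ink(image: Image) -> List[int]:
    """Count the ink pixels in each column of a binary image."""
    return [sum(column) for column in zip(*image)]


def split_into_blocks(image: Image, max_ink: int = 1) -> List[Tuple[int, int]]:
    """Return (start, end) column ranges, end exclusive, for each block.

    A block is a run of adjacent columns whose ink count exceeds max_ink.
    Columns at or below max_ink (gaps or thin joins) act as cut points.
    This threshold rule is an assumption made for this sketch.
    """
    blocks: List[Tuple[int, int]] = []
    start = None
    for x, ink in enumerate(column_ink(image)):
        if ink > max_ink:
            if start is None:
                start = x  # a new block begins at this column
        elif start is not None:
            blocks.append((start, x))  # a low-ink column closes the block
            start = None
    if start is not None:
        blocks.append((start, len(image[0])))  # block runs to the right edge
    return blocks


# Toy image: two thick strokes joined by a one-pixel-thin connector.
WORD: Image = [
    [0, 1, 1, 0, 0, 1, 1, 0],
    [0, 1, 1, 1, 1, 1, 1, 0],
    [0, 1, 1, 0, 0, 1, 1, 0],
]

print(split_into_blocks(WORD))             # cuts at the thin connector
print(split_into_blocks(WORD, max_ink=0))  # cuts only at empty columns
```

With the default threshold the thin connector counts as a cut point, so the toy word splits into two blocks. With `max_ink=0` only completely empty columns separate blocks, so the joined strokes stay together as one. In a real pipeline, the resulting blocks would then be passed to a classifier, and it is that recognition step the students' tagging helped to train.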