ERA “Ancient Script Recognition”

dc.contributor.advisor: Arandi, Samer
dc.contributor.author: Sawalha, Eman
dc.contributor.author: Abu Alrob, Ghadeer
dc.date.accessioned: 2019-02-07T07:46:25Z
dc.date.available: 2019-02-07T07:46:25Z
dc.date.issued: 2018
dc.description.abstract: Optical Character Recognition (OCR) is an image processing technique that takes an image and recognizes and extracts its text content. It depends on image preprocessing, noise reduction, and machine learning, and it needs a large amount of training data to reach the best accuracy. Tesseract is one of the open-source OCR engines: portable software that aims for the best character recognition results using machine learning and image processing techniques. Tesseract is based on an artificial neural network and needs a large number of samples for training, which is done with the tesseract command on all operating systems. The ERA project builds on these techniques to support character recognition for ancient languages. The languages ERA works with were added to Unicode in the early 21st century, in versions 5.2 and 7. After the training phase, ERA provides .traineddata files for these old languages, which can be very useful to OCR projects around the world, especially those focusing on recognition of old Arabic languages. ERA supports five ancient languages. Three of them are not yet supported by any resource: Old South Arabian, Old North Arabic (Dadanitic, Safaitic, Taymanitic, and Hismaic), and Nabataean. The other two are Greek and Romanian. ERA takes any digital image as input, provides methods for producing a clearer version of it, passes it through the Tesseract engine with the .traineddata file of the specified language, and outputs the text content of the input. ERA also lets the user assist character recognition by drawing on the image if the result is incorrect. The results of this project will be important and useful to people who are fond of archaeology. [en_US]
dc.identifier.uri: https://hdl.handle.net/20.500.11888/14114
dc.language.iso: en [en_US]
dc.title: ERA “Ancient Script Recognition” [en_US]
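
The pipeline described in the abstract (clean up the image, then run Tesseract with a language-specific .traineddata file and read back the text) might look roughly like the Python sketch below. This is not the ERA implementation, which is not shown in this record. The function names, the fixed-threshold binarization used as a stand-in for preprocessing, the language code `sarb`, and the `./tessdata` directory are all assumptions made for illustration; only the `tesseract` command-line options (`-l`, `--psm`, `--tessdata-dir`, and `stdout` as the output base) are taken from the Tesseract CLI.

```python
import shutil
import subprocess
from typing import List, Optional


def binarize(gray: List[List[int]], threshold: int = 128) -> List[List[int]]:
    """Turn a 0-255 grayscale image (list of rows) into pure black and white.

    A fixed global threshold is an assumed, simplified stand-in for the
    preprocessing and noise reduction that the abstract mentions.
    """
    return [[255 if px >= threshold else 0 for px in row] for row in gray]


def build_tesseract_command(image_path: str, lang: str,
                            tessdata_dir: Optional[str] = None,
                            psm: int = 6) -> List[str]:
    """Build a tesseract CLI call that prints the recognized text to stdout.

    `lang` must match the name of a <lang>.traineddata file; the names used
    by ERA are not given in this record, so any value here is hypothetical.
    """
    cmd = ["tesseract", image_path, "stdout", "-l", lang, "--psm", str(psm)]
    if tessdata_dir is not None:
        # Directory that contains the custom .traineddata files.
        cmd += ["--tessdata-dir", tessdata_dir]
    return cmd


def recognize(image_path: str, lang: str,
              tessdata_dir: Optional[str] = None) -> Optional[str]:
    """Run Tesseract if it is installed; return None when it is not found."""
    if shutil.which("tesseract") is None:
        return None
    result = subprocess.run(
        build_tesseract_command(image_path, lang, tessdata_dir),
        capture_output=True, text=True, check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # "sarb" and "./tessdata" are hypothetical placeholders.
    print(build_tesseract_command("inscription.png", "sarb", "./tessdata"))
```

Building the command separately from running it keeps the example usable on machines without Tesseract installed: `recognize` simply returns `None` when the binary is missing.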
Files
Original bundle
Name: ERA ملخص عربي.pdf
Size: 108.51 KB
Format: Adobe Portable Document Format
Name: ERA Presentation.pptx
Size: 4.63 MB
Format: Microsoft Powerpoint XML
License bundle
Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission