Apache Tika is an open-source toolkit for extracting text from files, and it can run OCR on PDFs and image files. This example uses a Java Maven project to work with Apache Tika.

First, add the following dependencies to pom.xml:

<dependency>
    <groupId>org.apache.tika</groupId>
    <artifactId>tika-parsers</artifactId>
    <version>1.18</version>
</dependency>
<dependency>
    <groupId>com.levigo.jbig2</groupId>
    <artifactId>levigo-jbig2-imageio</artifactId>
    <version>2.0</version>
</dependency>
<dependency>
    <groupId>com.github.jai-imageio</groupId>
    <artifactId>jai-imageio-core</artifactId>
    <version>1.4.0</version>
</dependency>
<depende...