smart-search

A search algorithm for efficient searching in PDFs


Keywords
nlp, pip, pdf-search-engine, context-search
License
MIT
Install
pip install smart-search==0.0.5

Documentation


🔄 Context-Search is a tool designed to increase efficiency during your study hours.

Usage

  1. Install the package with pip.
$ pip3 install smart-search
  • NOTE: Please keep the pickle file in the same folder as the Python script in which you use the package.

Here I use the glove.6B.zip file from Stanford's GitHub repository.
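
For orientation, the files inside glove.6B.zip are plain text: each line holds a word followed by its vector components, separated by spaces. Below is a minimal sketch of reading that format. The function name parse_glove_lines and the sample values are illustrative assumptions and are not part of smart-search.

```python
from typing import Dict, Iterable, List


def parse_glove_lines(lines: Iterable[str]) -> Dict[str, List[float]]:
    """Parse GloVe text-format lines ("word v1 v2 ... vn") into a dict."""
    vectors: Dict[str, List[float]] = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        word, values = parts[0], parts[1:]
        vectors[word] = [float(v) for v in values]
    return vectors


# Made-up three-dimensional sample; real glove.6B files use 50 to 300 dimensions.
sample = [
    "cat 0.1 0.2 0.3",
    "dog 0.1 0.25 0.28",
]
vectors = parse_glove_lines(sample)
print(vectors["cat"])
```

With a real file, an open text handle can be passed in place of the sample list, since iterating over a file yields its lines.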

Syntax

  1. Import the library.
>> import smart_search
  2. Create an object of the class smart_search.model(), here called functioncaller.
>> functioncaller = smart_search.model()
  3. To convert a PDF into a list of lists, each containing a page number and the words on that page after stop-word removal, use the built-in function getting_list_of_words(). It accepts one argument, the path to the PDF, and returns the list to be fed to the model.
>> pdf_list = functioncaller.getting_list_of_words('path to your pdf')
  4. Pass this list to the model, along with the word you want to search for, using the perform_skip() function. It accepts two arguments, the list produced by the previous function and the search word, and returns the top 5 relevant locations of that word.
>> location = functioncaller.perform_skip(pdf_list, input_word)
  5. If you want to, you can use Python's subprocess module to navigate to the page.
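
The steps above can be pictured with a small standalone sketch. It does not call smart-search and does not reproduce its internals: the stop-word set, the toy vectors, the cosine ranking, and all helper names are assumptions made for illustration. The viewer command assumes the evince PDF viewer and its --page-index option; other viewers use different flags, and the page-number convention should be checked against the viewer you use.

```python
import math
from typing import Dict, List, Sequence, Tuple

# Tiny illustrative stop-word set (an assumption, not the package's list).
STOP_WORDS = {"the", "a", "an", "is", "of", "and", "to", "in"}


def pages_to_word_lists(pages: Sequence[str]) -> List[list]:
    """Turn page texts into [page_number, words] pairs with stop words removed."""
    result = []
    for page_no, text in enumerate(pages, start=1):
        words = [w for w in text.lower().split() if w not in STOP_WORDS]
        result.append([page_no, words])
    return result


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_pages(word_lists: List[list], query: str,
               vectors: Dict[str, List[float]], top_n: int = 5) -> List[Tuple[int, float]]:
    """Score each page by its best cosine match to the query word."""
    q = vectors[query]
    scored = []
    for page_no, words in word_lists:
        best = max((cosine(q, vectors[w]) for w in words if w in vectors), default=0.0)
        scored.append((page_no, best))
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_n]


def viewer_command(pdf_path: str, page_no: int) -> List[str]:
    """Build (but do not run) a command that opens the PDF at a page in evince."""
    return ["evince", f"--page-index={page_no}", pdf_path]


# Stand-in page texts and two-dimensional toy vectors.
pages = ["The cat sat in the garden", "A treatise of tax law", "The dog and the kitten"]
toy_vectors = {
    "cat": [1.0, 0.0], "kitten": [0.9, 0.1], "dog": [0.6, 0.4],
    "garden": [0.0, 1.0], "tax": [-1.0, 0.2], "law": [-0.8, 0.3],
}

word_lists = pages_to_word_lists(pages)
ranking = rank_pages(word_lists, "cat", toy_vectors)
best_page = ranking[0][0]
command = viewer_command("notes.pdf", best_page)
print(word_lists[0], ranking, command)
```

Passing command to subprocess.run would launch the viewer. The sketch only builds the argument list, so it runs on machines without evince installed.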

LICENSE

MIT