Elasticsearch with Python

Introduction

Elasticsearch is a distributed, real-time search and analytics platform. In this tutorial, we are going to use Elasticsearch with Python.

Elasticsearch stores data and indexes it automatically, and it is accessed through a RESTful API. If you want to know more about Elasticsearch, please check our introduction to Elasticsearch.

For this tutorial we are going to create a REST API endpoint backed by Elasticsearch. The idea is that a JavaScript frontend calls this endpoint to search documents.

Should you use Elasticsearch?

Elasticsearch scales horizontally and returns results and insights very quickly. It can handle large volumes of data, and it distributes that data across nodes so that it can serve thousands of users.

Step 1: Using the Elasticsearch API

If you are already familiar with Elasticsearch, you can skip this step, which introduces how to use its API. An index is similar to a database in a relational system, and we need to create one before we start. Our index will be called “articles”. Elasticsearch exposes a REST API that has an endpoint for creating an index.

Creating indexes

Here is the curl command to create an index called “articles”:

 $ curl -X PUT localhost:9200/articles
{"acknowledged":true,"shards_acknowledged":true,"index":"articles"}

You can now retrieve the “articles” index with a GET request:

 $ curl localhost:9200/articles
{"articles":{"aliases":{},"mappings":{},"settings":{"index":{"creation_date":"1540961127798","number_of_shards":"5","number_of_replicas":"1","uuid":"bcbE1aNzS5OhlmPjRSlKNg","version":{"created":"6040299"},"provided_name":"articles"}}}}

Let’s look at some parts of the output in detail:

  • mappings: the schema of your documents
  • number_of_shards: the number of partitions that hold the data; the data is split across the nodes
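To make these fields easier to inspect, the response can also be parsed in Python. The following is a minimal sketch using only the standard library; `response_text` is a trimmed copy of the output shown above, and the variable names are illustrative.

```python
import json

# Trimmed copy of the GET /articles response shown above.
response_text = (
    '{"articles":{"aliases":{},"mappings":{},"settings":{"index":{'
    '"number_of_shards":"5","number_of_replicas":"1",'
    '"provided_name":"articles"}}}}'
)

index_info = json.loads(response_text)["articles"]
index_settings = index_info["settings"]["index"]

# The settings arrive as strings, so convert them explicitly.
shards = int(index_settings["number_of_shards"])
replicas = int(index_settings["number_of_replicas"])

print("mappings:", index_info["mappings"])
print("shards:", shards, "replicas:", replicas)
```

The mappings object is empty here because the index was created without a mapping and no documents have been added yet.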

Creating records

To create entries in the index, send a POST request to the following URL:

http://localhost:9200/articles/architecture

With the body:

{
  "title": "Unique title",
  "description": "Article description",
  "author": "Peter Thompson"
}

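As you can see, the URL path now contains “architecture”. This is the type, which is similar to a table in a relational database. Types belong to the Elasticsearch 6.x line used in this tutorial; they were deprecated in 7.0 and removed in 8.0.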

Updating records

You can also specify the ID of a document in the URL by appending it to the path, for example “/10220” or any other number you want. A POST request to a URL with an existing ID updates that document by replacing it.
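The create and update requests described above can also be built from Python. This is a minimal sketch using the standard library only; `build_document_request` and `BASE_URL` are hypothetical names, and the request is only constructed, not sent, so it runs without an Elasticsearch node.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # assumed local node, as in the curl examples


def build_document_request(index, doc_type, document, doc_id=None):
    """Build (but do not send) a POST request that creates or updates a document."""
    url = f"{BASE_URL}/{index}/{doc_type}"
    if doc_id is not None:
        # With an explicit ID, the request targets that specific document.
        url += f"/{doc_id}"
    return urllib.request.Request(
        url,
        data=json.dumps(document).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


article = {
    "title": "Unique title",
    "description": "Article description",
    "author": "Peter Thompson",
}
request = build_document_request("articles", "architecture", article, doc_id=10220)
print(request.get_method(), request.full_url)
# Sending it requires a running node: urllib.request.urlopen(request)
```

Leaving out `doc_id` produces the plain creation URL, in which case Elasticsearch generates the ID itself.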

Using the search

Elasticsearch lets you perform searches through the _search endpoint, for example:

curl "http://localhost:9200/articles/architecture/_search?q=unique"

The previous query searches all fields. If you want to limit it, use “title:unique” and it will search only the title field.
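When the search URL is assembled in code, the query string should be URL-encoded. The sketch below shows one way to do that with the standard library; `build_search_url` is a hypothetical helper, and the host and port are assumed to match the curl examples.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:9200"  # assumed local node


def build_search_url(index, doc_type, text, field=None):
    """Return a _search URL; when `field` is given, restrict the search to it."""
    q = f"{field}:{text}" if field else text
    return f"{BASE_URL}/{index}/{doc_type}/_search?" + urlencode({"q": q})


all_fields_url = build_search_url("articles", "architecture", "unique")
title_only_url = build_search_url("articles", "architecture", "unique", field="title")
print(all_fields_url)
print(title_only_url)
```

The colon in “title:unique” is percent-encoded as %3A in the second URL.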

Step 2: Installing Python libraries

For this tutorial we are going to use a library called “elasticsearch”. To install it, run:

pip install elasticsearch

Connecting to Elasticsearch

import sys
from elasticsearch import Elasticsearch

def connect_elasticsearch():
    # Creating the client does not contact the server yet.
    elastic_conn = Elasticsearch([{'host': 'localhost', 'port': 9200}])
    # ping() returns False when the node cannot be reached.
    if not elastic_conn.ping():
        print('Could not connect to Elasticsearch')
        sys.exit(1)
    return elastic_conn

Creating elastic_conn does not check the connection, so we use the ping method to verify that it is working.

Creating indexes

Every time we want to do something in Elasticsearch, we use the elastic_conn instance. The next example creates the “articles” index:

elastic_conn.indices.create(index="articles", ignore=400)

You can use the optional body parameter to send the mapping and other settings, such as the number of shards. Passing ignore=400 prevents an exception from being raised when Elasticsearch responds with status code 400, which it usually does when the index already exists.
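As an illustration of the body parameter, the sketch below defines settings and a mapping for the “articles” index. It assumes the 6.x format used in this tutorial, where the mapping is nested under the type name; the field types and shard counts are example choices, and `create_articles_index` is a hypothetical helper that takes the client as an argument.

```python
# Settings and mapping for the "articles" index (6.x style: mapping under the type name).
index_body = {
    "settings": {"number_of_shards": 1, "number_of_replicas": 0},
    "mappings": {
        "architecture": {
            "properties": {
                "title": {"type": "text"},
                "description": {"type": "text"},
                "author": {"type": "keyword"},
            }
        }
    },
}


def create_articles_index(client):
    """Create the index, ignoring the 400 returned when it already exists."""
    return client.indices.create(index="articles", body=index_body, ignore=400)
```

With a connected client, this would be called as create_articles_index(elastic_conn).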

Creating records

Once we have our elastic_conn instance we can use the index method to create records in an index.

data = {"title": "Unique 1", "author": "Linda null"}
elastic_conn.index(index="articles", doc_type="architecture", body=data)

Using the search

Now we are ready to search from Python.

query = {"query": {"match": {"title": "unique"}}}
elastic_conn.search(index="articles", body=query)
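The search call returns a dictionary in which the matching documents sit under hits.hits, each with its stored fields in `_source`. The sketch below pulls those documents out; `extract_articles` is a hypothetical helper, and `sample_response` is a hand-written example in the shape of a search response, not real output.

```python
def extract_articles(response):
    """Return the stored documents (the _source of each hit) from a search response."""
    return [hit["_source"] for hit in response["hits"]["hits"]]


# Hand-written sample in the shape of a search response; values are illustrative only.
sample_response = {
    "hits": {
        "total": 1,
        "hits": [
            {
                "_id": "1",
                "_score": 0.28,
                "_source": {"title": "Unique 1", "author": "Linda null"},
            }
        ],
    }
}

titles = [article["title"] for article in extract_articles(sample_response)]
print(titles)
```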

Step 3: Creating the REST API endpoint

You can expose the Elasticsearch “search” endpoint directly, using your application as a proxy. This is probably the easiest way to expose Elasticsearch in your current application, and also the sanest. Remember not to expose any other endpoint, since users could then manipulate the index.

For this example we will use the bottle framework:

import sys
import json
from bottle import request, route, run
from elasticsearch import Elasticsearch

def connect_elasticsearch():
    elastic_conn = Elasticsearch([{'host': 'localhost', 'port': 9200}])
    if not elastic_conn.ping():
        print('Could not connect to Elasticsearch')
        sys.exit(1)
    return elastic_conn

elastic_conn = connect_elasticsearch()

# POST is used because the query is sent in the request body.
@route('/search', method='POST')
def search():
    # Parse the JSON query sent by the frontend and forward it to Elasticsearch.
    query = json.load(request.body)
    return elastic_conn.search(index="articles", body=query)

run(host='localhost', port=8080, debug=True)
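The endpoint above forwards whatever query body the client sends. One possible way to narrow that, sketched here under assumptions, is to accept only a search term and build the query on the server. The names `build_match_query`, `ALLOWED_FIELDS` and `MAX_RESULTS` are hypothetical, as are the field list and the result cap.

```python
ALLOWED_FIELDS = {"title", "description", "author"}  # hypothetical whitelist
MAX_RESULTS = 50  # hypothetical cap on the number of hits returned


def build_match_query(text, field="title", size=10):
    """Build a match query for one allowed field, with a bounded result size."""
    if field not in ALLOWED_FIELDS:
        raise ValueError(f"searching on field {field!r} is not allowed")
    return {
        "query": {"match": {field: text}},
        "size": max(1, min(int(size), MAX_RESULTS)),
    }


query = build_match_query("unique")
print(query)
```

The resulting dictionary can be passed as the body argument of elastic_conn.search in place of the raw request body.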

Conclusion

Elasticsearch is simple to use and provides a distributed, real-time search and analytics platform. Putting it behind a REST API is straightforward and lets your application scale.