rasdapy Tutorial


Overview

What is rasdapy?

rasdapy is a client API for rasdaman that enables building and executing rasql queries with Python.

How to get rasdapy?

rasdapy can be installed with pip3 (on Ubuntu, pip3 itself is available via: sudo apt install python3-pip) from the Python Package Index (PyPI): https://pypi.org/project/rasdapy3.

In [ ]:
pip3 install rasdapy3

What do I need before using rasdapy?

Can I get this Jupyter Notebook to run on my local system?

Yes. After installing rasdaman, rasdapy3 and Jupyter Notebook, you can download it from http://tutorial.rasdaman.org/rasdapy-tutorial/rasdapy.ipynb and run it.

rasdapy in action

Import rasdapy core API

Once rasdapy is installed, you can import it into your Python script like any other library.

In [ ]:
from rasdapy.db_connector import DBConnector
from rasdapy.query_executor import QueryExecutor

Initialize connection to rasdaman Manager (rasmgr)

The DBConnector maintains the connection to rasdaman. To connect, specify the host (default: localhost) and port (default: 7001) on which rasmgr is running, as well as a valid rasdaman username with read and write permissions (default: rasadmin) and its password (default: rasadmin).

In [52]:
db_connector = DBConnector("localhost", 7001, "rasadmin", "rasadmin")
db_connector.open()
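
Before opening the connection, it can help to confirm that something is listening on the rasmgr port at all. The following is a minimal sketch using only the Python standard library. The helper name rasmgr_reachable is hypothetical and not part of rasdapy, and a successful TCP connection only shows that the port is open, not that the credentials are valid.

```python
import socket


def rasmgr_reachable(host="localhost", port=7001, timeout=2.0):
    """Return True if a TCP connection to host:port can be established.

    Hypothetical pre-flight helper, not part of rasdapy.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

Calling rasmgr_reachable("localhost", 7001) before db_connector.open() lets a script report a missing server with a clear message instead of failing inside the client library.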

Create the query executor

QueryExecutor is the interface through which rasql queries (create, insert, update, delete, etc.) are executed.

In [53]:
query_executor = QueryExecutor(db_connector)

List all the collections available in rasdaman

It is good practice to check the connection to rasdaman with this rasql query; it also shows beforehand which collections can be queried. Here, only one collection, named rgb, has been created in rasdaman.

In rasdaman databases, arrays are grouped into collections. All elements of a collection share the same array type definition. Collections form the basis for array handling, just as tables do in relational database technology.

Note: We use query_executor.execute_read() because SELECT does not require write permission in the rasdaman transaction.

In [54]:
collection_list = query_executor.execute_read("select c from RAS_COLLECTIONNAMES as c")
print(collection_list)
['rgb']

Create a new collection in rasdaman

In this demonstration we use rasdapy to create a new rasdaman collection that can be queried later. The collection will store a 2D PNG image with cell data type char.

Note: We use query_executor.execute_write() because CREATE requires write permission in the rasdaman transaction.

In [ ]:
query_executor.execute_write("create collection test_mr GreySet")

Insert data from a file into the newly created collection

The collection is empty, and we want to import data from a 2D PNG file as a multi-dimensional array (MDD). You can download the input file from: http://rasdaman.org/browser/systemtest/testcases_mandatory/test_select/testdata/mr_1.png and save it to a location on your local system that rasdaman has permission to read. In this tutorial, the file is saved as /home/rasdaman/mr.png.

Note: We use query_executor.execute_update_from_file(), which is used to INSERT into or UPDATE rasdaman collections from a file; INSERT requires write permission in the rasdaman transaction.

In [ ]:
query_executor.execute_update_from_file("insert into test_mr values decode($1)", "/home/rasdaman/mr.png")

Check the spatial domain

Once the INSERT query has succeeded, the MDD exists inside the collection test_mr. We can inspect its spatial domain (width x height) with this SELECT query:

In [58]:
sdom = query_executor.execute_read("select sdom(c) from test_mr as c")
print(sdom)
[0:255,0:210]
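
The interval bounds in a spatial domain are inclusive, so [0:255,0:210] describes 256 x 211 cells. As a minimal sketch, the printed text can be parsed into integer bounds and per-axis cell counts. The helpers parse_sdom and extents are hypothetical names, not part of rasdapy, and the sketch assumes the domain is available as text (for example via str(sdom)) and contains only fixed integer bounds.

```python
import re


def parse_sdom(text):
    """Parse a spatial domain such as "[0:255,0:210]" into [(lo, hi), ...].

    Hypothetical helper; assumes fixed integer bounds on every axis.
    """
    match = re.fullmatch(r"\s*\[(.*)\]\s*", text)
    if match is None:
        raise ValueError(f"not a spatial domain: {text!r}")
    bounds = []
    for axis in match.group(1).split(","):
        lo, hi = axis.split(":")
        bounds.append((int(lo), int(hi)))
    return bounds


def extents(bounds):
    """Number of cells per axis; both bounds of an interval are inclusive."""
    return [hi - lo + 1 for lo, hi in bounds]


domain = parse_sdom("[0:255,0:210]")
print(domain)           # [(0, 255), (0, 210)]
print(extents(domain))  # [256, 211]
```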

Calculate the average of all values in collection

We want to get the average of all pixel values in the test_mr collection.

In [59]:
result = query_executor.execute_read("select (char)avg_cells(c) from test_mr as c")
print(result)
39

Leverage the power of Numpy

Select a subset of the array in collection test_mr. This query returns raw array data that can be converted to a NumPy ndarray, after which all the usual ndarray operations are available.

In [61]:
result = query_executor.execute_read("select m[30:40 , 20:40] from test_mr as m")
numpy_array = result.to_array()
print(numpy_array.shape)
print(numpy_array)
(21, 11)
[[ 0  0  0  0  0  0  0  0  0  0  0]
 [ 0  0  0  0  0  0  0  0  0  0  0]
 [ 0  0  0  0  0  0  0  0  0  0  0]
 [ 0  0  0  0  0  0  0  0  0  0  0]
 [ 0  0  0  0  0  0  0  0  0  0  0]
 [ 0  0  0  0  0  0  0  0  0  0  0]
 [ 0  0  0  0  0  0  0  0  0  0  0]
 [ 0  0  0  0  0  0  0  0  3 12 19]
 [ 0  0  0  0  2 13 18 23 32 42 48]
 [ 0  5 21 29 33 42 47 59 56 64 66]
 [27 31 48 60 61 64 68 79 70 77 73]
 [60 56 69 76 78 78 79 84 76 83 78]
 [83 74 80 81 85 86 84 85 78 86 86]
 [90 85 86 83 86 90 89 91 82 89 95]
 [90 87 90 87 87 92 95 99 89 89 92]
 [90 91 94 94 92 90 93 91 84 76 69]
 [97 95 94 94 87 76 71 59 54 45 32]
 [97 87 77 72 61 45 30 20 17 20 19]
 [74 56 38 30 23 17 16 21 26 35 36]
 [33 19 12 22 23 27 40 44 44 53 49]
 [19 32 35 46 39 44 53 54 51 54 51]]
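
When the subset bounds come from variables rather than a fixed string, the query text can be assembled in Python. The following is a minimal sketch; build_subset_query and subset_extents are hypothetical helpers, not part of rasdapy, and the identifier check is a deliberately conservative assumption rather than the full rasql naming rule.

```python
import re

# Conservative assumption: letters, digits and underscores only.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_subset_query(collection, bounds, alias="m"):
    """Build 'select m[lo:hi,...] from <collection> as m' from integer bounds."""
    for name in (collection, alias):
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"invalid identifier: {name!r}")
    intervals = []
    for lo, hi in bounds:
        lo, hi = int(lo), int(hi)
        if lo > hi:
            raise ValueError(f"lower bound {lo} exceeds upper bound {hi}")
        intervals.append(f"{lo}:{hi}")
    subset = ",".join(intervals)
    return f"select {alias}[{subset}] from {collection} as {alias}"


def subset_extents(bounds):
    """Cells per axis in query order; interval bounds are inclusive."""
    return [int(hi) - int(lo) + 1 for lo, hi in bounds]


bounds = [(30, 40), (20, 40)]
print(build_subset_query("test_mr", bounds))  # select m[30:40,20:40] from test_mr as m
print(subset_extents(bounds))                 # [11, 21]
```

The resulting string can be passed to query_executor.execute_read(). The two cell counts, 11 and 21, are the same two extents that appear in the shape printed above.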

Display the output as a 2D image

With the matplotlib library, you can easily display the raw data from rasdaman once it has been converted to a NumPy ndarray.

In [50]:
import matplotlib.pyplot as plt
import numpy as np

result = query_executor.execute_read("select m from test_mr as m")
numpy_array = result.to_array()

# Plot the grid
plt.imshow(numpy_array)
plt.show()

Encode data and write to binary file

rasdaman supports multiple encoding formats (e.g. jpeg, png, tiff, csv, json, ...), so you can select data from rasdaman in an encoded format and write the result to a file (e.g. /tmp/output.png).

In [40]:
result = query_executor.execute_read('select encode(m[30:40 , 20:40], "png") from test_mr as m')
with open("/tmp/output.png", "wb") as binary_file:
    binary_file.write(result.data[0])
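
A quick way to sanity-check the written file is to look at its first bytes: every PNG file starts with the same 8-byte signature. The following is a minimal sketch using only the standard library; looks_like_png and file_looks_like_png are hypothetical helpers, not part of rasdapy, and they only check the signature, not whether the rest of the image is valid.

```python
# The fixed 8-byte signature that begins every PNG file.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def looks_like_png(data):
    """Return True if the bytes start with the PNG file signature."""
    return bytes(data[:8]) == PNG_SIGNATURE


def file_looks_like_png(path):
    """Read the first 8 bytes of a file and compare them to the signature."""
    with open(path, "rb") as binary_file:
        return looks_like_png(binary_file.read(8))
```

For the file written above, file_looks_like_png("/tmp/output.png") is expected to return True if the encode step produced PNG data.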

Close the connection to rasdaman

It is important to close the connection to rasdaman when your Python script has finished. This releases the connection to rasserver and allows another client to connect to that server afterwards.

In [ ]:
db_connector.close()

Conclusion

Best practice

It is recommended to follow this template in order to avoid problems with leaked transactions:

In [ ]:
from rasdapy.db_connector import DBConnector
from rasdapy.query_executor import QueryExecutor

db_connector = DBConnector("localhost", 7001, "rasadmin", "rasadmin")
query_executor = QueryExecutor(db_connector)

db_connector.open()

try:
    query_executor.execute_read("...")
    query_executor.execute_write("...")
    # ... more Python code
finally:
    db_connector.close()
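
The same open/try/finally pattern can be wrapped in a context manager so that it does not have to be repeated in every script. This is a minimal sketch, not something rasdapy provides: opened is a hypothetical helper, and FakeConnector is a stand-in used only so the example runs without a rasdaman server. It assumes nothing about the connector beyond the open() and close() methods used in this tutorial.

```python
from contextlib import contextmanager


@contextmanager
def opened(connector):
    """Open the connector, yield it, and always close it afterwards."""
    connector.open()
    try:
        yield connector
    finally:
        connector.close()


class FakeConnector:
    """Stand-in with open()/close() methods, for demonstration only."""

    def __init__(self):
        self.is_open = False

    def open(self):
        self.is_open = True

    def close(self):
        self.is_open = False


fake = FakeConnector()
try:
    with opened(fake):
        raise RuntimeError("simulated query failure")
except RuntimeError:
    pass
print(fake.is_open)  # False: close() ran despite the exception
```

With a real connection, the body of the with block would contain the query_executor calls, and the connector would be closed on exit whether or not a query raised an exception.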