Functions of the tngri Python module
The tngri Python module provides convenient access to Tengri data from Python cells. This page describes the functions available in it.
tngri.run_notebook()

Description
Runs the specified notebook from another notebook.

Usage
Runs the specified notebook from another notebook. Returns the outputs of all cells of the executed notebook in a single text block.
The function is convenient when you need to orchestrate the launch of several notebooks. It can be used to implement different kinds of logic:
- starting a notebook only when a condition is met
- starting a notebook immediately after another notebook has finished
- moving part of the calculations into a separate notebook and launching that part from other notebooks
See example
import tngri

# Run 'My Notebook' and capture the combined output of all its cells
result = tngri.run_notebook('My Notebook')
print(result)
<Output of cell 1 of My Notebook>
<Output of cell 2 of My Notebook>
done at 2025-12-12 14:09:14.497666
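For example, a notebook can be started conditionally, based on what another notebook produced. The sketch below is illustrative only: the notebook names and the 'done' marker it checks for are assumptions, not part of the tngri API.

import tngri

# Run the first notebook and capture its combined cell output
result = tngri.run_notebook('Prepare Data')    # hypothetical notebook name

# Assumption for illustration: the first notebook is expected to report 'done'
if 'done' in result:
    # Launch the dependent notebook only after the first one finished as expected
    print(tngri.run_notebook('Build Report'))  # hypothetical notebook name
else:
    print('Prepare Data did not finish as expected; skipping Build Report')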
tngri.sql()

Description
Executes the specified SQL query inside a Python cell.

Usage
This function is useful when you need to execute SQL queries directly inside a Python cell, for example inside a loop or another complex construct, without creating a separate SQL cell. Local Python variables and functions can be used when building the SQL query text.
One usage scenario is described here.
See examples
Example 1
Let’s create a table with the name taken from the table_name variable and write to it in a loop:
- the index (starting with 1)
- the word from the phrase specified in the test_phrase variable
- the result of applying the length_in_chars function to this word
In each iteration of the loop we output the index value, the added word and the result of an SQL query that counts the rows currently in the table being created.
import tngri

def length_in_chars(text):
    if len(text) == 1:
        return '1 character'
    else:
        return f'{len(text)} characters'

table_name = 'my_table'
test_phrase = 'I love Tengri'

# Create (or replace) the target table
tngri.sql(f'CREATE OR REPLACE TABLE {table_name} \
    (index INT, word VARCHAR, length VARCHAR)'
)

ind = 0
for word in test_phrase.split(' '):
    ind += 1
    # Insert the current word together with its index and length description
    tngri.sql(f"INSERT INTO {table_name} VALUES \
        ({ind}, '{word}', '{length_in_chars(word)}')"
    )
    print(f'Step: {ind}')
    print(f'Added word: "{word}"')
    # Show how many rows the table contains after this step
    print(tngri.sql(f'SELECT count(*) FROM {table_name}'))
Step: 1
Added word: "I"
shape: (1, 1)
+----------+
| column_0 |
| ---      |
| i64      |
+----------+
| 1        |
+----------+
Step: 2
Added word: "love"
shape: (1, 1)
+----------+
| column_0 |
| ---      |
| i64      |
+----------+
| 2        |
+----------+
Step: 3
Added word: "Tengri"
shape: (1, 1)
+----------+
| column_0 |
| ---      |
| i64      |
+----------+
| 3        |
+----------+
Now, in an SQL cell, let’s display the created table ordered by index:
SELECT * FROM my_table
ORDER BY index
+-------+--------+--------------+
| index | word | length |
+-------+--------+--------------+
| 1 | I | 1 character |
+-------+--------+--------------+
| 2 | love | 4 characters |
+-------+--------+--------------+
| 3 | Tengri | 6 characters |
+-------+--------+--------------+
Example 2
Let’s iteratively load data from .parquet files in S3 storage into a table, building each file path from a template.
In each iteration of the loop we output the result of a query that counts the rows in the target table.
import tngri

for i in range(1, 4):
    # Build the path of the next .parquet file to load
    file_name = f"s3://prostore/Stage/<lake_path>/{i}.parquet"
    # Append its contents to the raw_dyntest table
    tngri.sql(f"INSERT INTO raw_dyntest SELECT * FROM read_parquet('{file_name}')")
    # Show how many rows the table contains after this iteration
    print(tngri.sql("SELECT count(*) FROM raw_dyntest"))
shape: (1, 1)
+----------+
| column_0 |
| ---      |
| i64      |
+----------+
| 10000000 |
+----------+
shape: (1, 1)
+----------+
| column_0 |
| ---      |
| i64      |
+----------+
| 20000000 |
+----------+
shape: (1, 1)
+----------+
| column_0 |
| ---      |
| i64      |
+----------+
| 30000000 |
+----------+
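In the examples above, the result of tngri.sql() is passed straight to print(). The printed output suggests that the query result comes back as a DataFrame-like object, so it can also be kept in a variable and used from Python. The indexing below is an assumption (it matches Polars-style DataFrames, which the output resembles), not documented behaviour.

import tngri

# Keep the query result instead of printing it immediately
counts = tngri.sql("SELECT count(*) FROM raw_dyntest")

# Assumption: the result behaves like a Polars DataFrame, so the single
# value can be pulled out by row/column position.
total_rows = counts[0, 0]
print(f"raw_dyntest currently holds {total_rows} rows")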
tngri.upload_df()

Description
Uploads data from a DataFrame to Tengri.

Usage
Uploads data from the specified DataFrame to Tengri (into S3 storage).
Returns a string containing the name of the .parquet file to which the data was uploaded.
Loading data into Tengri using Python is described in more detail here.
See examples

- Let’s create a DataFrame and load it into Tengri:

  import tngri
  import pandas

  my_df = pandas.DataFrame(range(100))
  tngri.upload_df(my_df)

  UploadedFile(s3_path='s3://prostore/Stage/ijwsajclddxw.parquet', _client=None)

- Let’s create a DataFrame, load it into Tengri and write the name of the .parquet file to the file_name variable:

  import tngri
  import pandas

  my_df = pandas.DataFrame(range(100))
  file_name = tngri.upload_df(my_df)
  print(my_df)
  print(file_name)

       0
  0    0
  1    1
  2    2
  3    3
  4    4
  ..  ..
  95  95
  96  96
  97  97
  98  98
  99  99

  [100 rows x 1 columns]
  tdiiuetablyx.parquet
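The returned file name can then be used to read the uploaded data back. This is a minimal sketch, assuming the name returned by tngri.upload_df() can be passed to read_parquet() in the same way that the read_json() examples below use the name returned by tngri.upload_file():

import tngri
import pandas

my_df = pandas.DataFrame(range(100))
file_name = tngri.upload_df(my_df)

# Assumption: the returned .parquet file name can be interpolated into a
# query, analogous to the read_json() examples for tngri.upload_file().
print(tngri.sql(f"SELECT count(*) FROM read_parquet('{file_name}')"))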
tngri.upload_file()

Description
Uploads data from a file to Tengri.

Usage
- FilePath — path to the file to upload to Tengri
- FileName — the name for the file in Tengri

Loads the data from the file at the specified path into Tengri (into S3 storage).
Returns a string containing the path and name of the file within the S3 storage to which the data was loaded.
One of the usage scenarios is described here.
See examples

- Let’s load data into Tengri from a .json file available at a URL:

  import tngri
  import urllib.request

  urllib.request.urlretrieve(
      'https://tngri.postgrespro.ru/documentation/ru/stable/_attachments/tengri_data_types.json',
      'my_file.json'
  )
  tngri.upload_file('my_file.json')

  UploadedFile(s3_path='s3://prostore/Stage/pxfihzbonctd.json', _client=None)

  Let’s output the first 5 rows of the table by reading it from the loaded file:

  SELECT * FROM read_json("pxfihzbonctd.json") LIMIT 5

  +----------+-----------+----------+---------------------------+
  | name     | type      | category | description               |
  +----------+-----------+----------+---------------------------+
  | BIGINT   | data type | numeric  | Целые числа.              |
  +----------+-----------+----------+---------------------------+
  | BIGINT[] | data type | array    | Массивы целых чисел.      |
  +----------+-----------+----------+---------------------------+
  | BLOB     | data type | blob     | Двоичные объекты.         |
  +----------+-----------+----------+---------------------------+
  | BOOL     | data type | boolean  | Булевы значения.          |
  +----------+-----------+----------+---------------------------+
  | BOOL[]   | data type | array    | Массивы булевых значений. |
  +----------+-----------+----------+---------------------------+

- Let’s load data into Tengri from a .json file available at a URL, and save the name of the loaded file to a variable:

  import tngri
  import urllib.request

  urllib.request.urlretrieve(
      'https://tngri.postgrespro.ru/documentation/ru/stable/_attachments/tengri_data_types.json',
      'my_file.json'
  )
  file_name = tngri.upload_file('my_file.json')
  print(file_name)

  pxfihzbonctd.json

  Let’s output the first 5 rows of the table by reading it from the loaded file:

  SELECT * FROM read_json("pxfihzbonctd.json") LIMIT 5

  +----------+-----------+----------+---------------------------+
  | name     | type      | category | description               |
  +----------+-----------+----------+---------------------------+
  | BIGINT   | data type | numeric  | Целые числа.              |
  +----------+-----------+----------+---------------------------+
  | BIGINT[] | data type | array    | Массивы целых чисел.      |
  +----------+-----------+----------+---------------------------+
  | BLOB     | data type | blob     | Двоичные объекты.         |
  +----------+-----------+----------+---------------------------+
  | BOOL     | data type | boolean  | Булевы значения.          |
  +----------+-----------+----------+---------------------------+
  | BOOL[]   | data type | array    | Массивы булевых значений. |
  +----------+-----------+----------+---------------------------+
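The saved file name can also be used directly from Python by combining tngri.upload_file() with tngri.sql(). This is a minimal sketch under the same assumption as above (that the returned name can be interpolated into a read_json() call); it mirrors the SQL-cell example rather than documented Python-side behaviour.

import tngri

# Upload a local file (the path is illustrative) and keep the returned name
file_name = tngri.upload_file('my_file.json')

# Assumption: the returned name can be interpolated into a query executed
# with tngri.sql(), just like in the SQL-cell example above.
print(tngri.sql(f'SELECT count(*) FROM read_json("{file_name}")'))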
tngri.upload_s3()

Description
Uploads a file from the specified S3 bucket to Tengri.

Usage
- object — path to the file in S3 storage
- access_key — access key for S3
- secret_key — secret key for S3

The file extension can be anything; it remains the same as in the original file.
If necessary, the path and name of the uploaded file can be specified via the optional filename parameter.
An example of use is described here.
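As a quick illustration, here is a minimal sketch based only on the parameters listed above. All values are placeholders, and passing the parameters as keyword arguments is an assumption, not documented usage.

import tngri

# Minimal sketch; the bucket path, keys and target file name are placeholders.
result = tngri.upload_s3(
    object='s3://my-bucket/exports/data.parquet',  # path to the file in the source S3 storage
    access_key='<S3 access key>',
    secret_key='<S3 secret key>',
    filename='imported_data.parquet',              # optional: path and name for the file in Tengri
)
print(result)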