Python module tngri functions
The tngri Python module makes it convenient to work with data in Python cells. This page describes the functions it provides.
tngri.create_table()
Description
Creates a table and populates it with data from a DataFrame.
Usage
Creates a table with the specified name and populates it with data from the specified DataFrame.
Parameters
- data: pandas.DataFrame | polars.DataFrame
  DataFrame with the data used to create the table.
- table_name: str
  The name of the table to create, given as a string with an optional schema prefix separated by a period (for example, demo.capitals). Without a prefix, the table is created in the user's default schema.
- replace: bool, default False
  Optional. Whether to replace the table if it already exists. By default, it is not replaced.
See example
import tngri
import pandas as pd
df = pd.DataFrame(
    [
        {"Country": "Russia", "Capital": "Moscow"},
        {"Country": "Italy", "Capital": "Rome"},
    ]
)
tngri.create_table(df, 'demo.capitals', replace=True)
done at 2026-02-11 17:39:06.378942
SELECT * FROM demo.capitals
+---------+---------+
| Country | Capital |
+---------+---------+
| Russia | Moscow |
+---------+---------+
| Italy | Rome |
+---------+---------+
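The table_name convention described above (an optional schema prefix separated by a period) can be illustrated with a small helper. This is only a sketch of the naming rule; split_table_name is a hypothetical function, not part of tngri, and it does not claim to reproduce how tngri parses the name internally.

```python
def split_table_name(table_name):
    """Split 'schema.table' into (schema, table).

    Hypothetical helper: the schema is None when the name has no prefix,
    which corresponds to the user's default schema.
    """
    schema, dot, table = table_name.rpartition(".")
    if not dot:
        # No period found: the whole string is the table name.
        return None, table_name
    return schema, table


print(split_table_name("demo.capitals"))
print(split_table_name("capitals"))
```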
tngri.delete_file()
Description
Deletes a file at the specified path.
Usage
To delete all files in a directory, use this function together with tngri.list_files().
Parameters
- file: str
  The path to the file to delete.
See example
Let’s loop through all the files in the test directory and delete them:
import tngri
for file in tngri.list_files('test/'):
    print(file)
    tngri.delete_file(file)
StagedFile(path=test/my_file.parquet size=734 modified=2026-04-16 10:24:51.755000+00:00)
StagedFile(path=test/my_file_2.parquet size=734 modified=2026-04-16 10:24:56.399000+00:00)
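When only some of the files should be removed, the same loop can be wrapped in a helper that filters by file extension. The sketch below is hypothetical: delete_matching is not part of tngri, and it assumes that each listed item exposes its path as a .path attribute (suggested by the StagedFile output above, but not confirmed). The listing and deletion functions are passed in as arguments, so the sketch runs with stand-ins outside Tengri.

```python
from types import SimpleNamespace


def delete_matching(list_files, delete_file, directory, suffix):
    """Delete files in `directory` whose path ends with `suffix`.

    Returns the paths of the deleted files.
    """
    deleted = []
    for staged in list_files(directory):
        # Assumption: listed items have a `.path` attribute.
        if staged.path.endswith(suffix):
            delete_file(staged)
            deleted.append(staged.path)
    return deleted


# Stand-ins for tngri.list_files and tngri.delete_file.
fake_files = [
    SimpleNamespace(path="test/my_file.parquet"),
    SimpleNamespace(path="test/test.json"),
]
removed = []
result = delete_matching(lambda directory: fake_files, removed.append, "test/", ".parquet")
print(result)
```

In a Python cell the call would presumably look like delete_matching(tngri.list_files, tngri.delete_file, 'test/', '.parquet').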
tngri.list_files()
Description
Lists files at the specified path.
Usage
Outputs a list of files uploaded to Tengri at the specified path.
For each file, the list shows:
- File name and path
- File size
- Time of last modification
The function is convenient for viewing all files in a given directory and deleting unnecessary ones.
Parameters
- filepath: str, default ""
  The path to the directory. Defaults to an empty string (the file system root).
See example
import tngri
tngri.list_files('test/')
[StagedFile(path=test/my_file.parquet size=734 modified=2026-04-16 10:11:40.288000+00:00),
StagedFile(path=test/test.json size=6174 modified=2026-04-16 09:57:14.179000+00:00)]
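A listing like the one above can be summarized before deciding what to delete. The following sketch assumes that the returned items expose .path and .size attributes matching the fields shown in the StagedFile output; this is an assumption, and summarize_listing is a hypothetical helper, not a tngri function. Stand-in objects with the sizes from the example are used so the code runs anywhere.

```python
from types import SimpleNamespace


def summarize_listing(files):
    """Return the file count, total size and the path of the largest file."""
    # Assumption: each item has `.path` and `.size` attributes.
    total_size = sum(f.size for f in files)
    largest = max(files, key=lambda f: f.size).path if files else None
    return {"count": len(files), "total_size": total_size, "largest": largest}


# Stand-ins for the items returned by tngri.list_files('test/').
listing = [
    SimpleNamespace(path="test/my_file.parquet", size=734),
    SimpleNamespace(path="test/test.json", size=6174),
]
summary = summarize_listing(listing)
print(summary)
```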
tngri.run_notebook()
Description
Runs the specified notebook from another notebook.
Usage
Runs the specified notebook from another notebook. Returns the execution results of all cells of the launched notebook as a RunStatus object with the following fields:
- ok (bool): whether the run was successful
- output (str): output of all Python cells
- errors (str): error text
To be launched from another notebook, a notebook must first be published (Publish button).
This function is convenient for orchestrating the launch of several notebooks. It can be used to implement different logic:
- Starting a notebook on a condition
- Starting a notebook immediately after another notebook has finished
- Moving part of the calculations into a separate notebook and launching that part from other notebooks
Parameters
- notebook_id: str
  The path and name of the notebook to run.
See example
import tngri
result = tngri.run_notebook('my_folder/My Notebook')
print(result)
RunStatus(ok=True, output='Output of cell 1 of My Notebook\nOutput of cell 2 of My Notebook\ndone at 2025-12-19 14:02:02.933882', errors='')
done at 2025-12-12 14:09:14.497666
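The orchestration scenarios listed above can be sketched as a small runner that launches notebooks one after another and stops at the first failed run. It relies only on the ok and errors fields of RunStatus described above. The run_in_order helper and the notebook names are hypothetical, and a stand-in replaces tngri.run_notebook so the sketch runs outside Tengri.

```python
from types import SimpleNamespace


def run_in_order(run_notebook, notebook_ids):
    """Run notebooks sequentially and stop at the first failure.

    Returns (completed_ids, failed_id, error_text); failed_id is None
    when every notebook ran successfully.
    """
    completed = []
    for notebook_id in notebook_ids:
        status = run_notebook(notebook_id)
        if not status.ok:
            return completed, notebook_id, status.errors
        completed.append(notebook_id)
    return completed, None, ""


def fake_run_notebook(notebook_id):
    # Stand-in for tngri.run_notebook: the second notebook fails.
    failed = notebook_id == "etl/transform"
    return SimpleNamespace(ok=not failed, output="", errors="boom" if failed else "")


outcome = run_in_order(fake_run_notebook, ["etl/extract", "etl/transform", "etl/load"])
print(outcome)
```

In a Python cell, tngri.run_notebook would be passed as the first argument instead of the stand-in.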
tngri.sql()
Description
Executes the specified SQL query within a Python cell.
Usage
Executes the specified SQL query within a Python cell and returns the result as a Polars DataFrame.
This function is useful when you need to execute SQL queries directly inside a Python cell, for example inside a loop or another complex construct, without creating a separate SQL cell. The query text can use any local Python variables and functions.
One usage scenario is described here.
Parameters
- sql: str
  The text of the SQL query to execute.
See examples
# Example 1
Let’s create a table whose name is taken from the table_name variable and write the following to it in a loop:
- index (starting with 1)
- word from the phrase specified in the test_phrase variable
- the result of applying the length_in_chars function to this word
In each iteration of the loop we will output the index value, the added word and the result of the SQL query with the current number of rows in the table being created.
import tngri
def length_in_chars(text):
    if len(text) == 1:
        return '1 character'
    else:
        return f'{len(text)} characters'
table_name = 'my_table'
test_phrase = 'I love Tengri'
tngri.sql(f'CREATE OR REPLACE TABLE {table_name} \
(index INT, word VARCHAR, length VARCHAR)'
)
ind = 0
for word in test_phrase.split(' '):
    ind += 1
    print(f'Step: {ind}')
    tngri.sql(f"INSERT INTO {table_name} VALUES \
        ({ind}, '{word}', '{length_in_chars(word)}')"
    )
    print(f'Added word: "{word}"')
    print(tngri.sql(f'SELECT count(*) FROM {table_name}'))
print(f"Result table:\n{tngri.sql(f'SELECT * FROM {table_name} ORDER BY index')}")
Step: 1
Added word: "I"
shape: (1, 1)
┌───────┐
│ count │
│ --- │
│ i64 │
╞═══════╡
│ 1 │
└───────┘
Step: 2
Added word: "love"
shape: (1, 1)
┌───────┐
│ count │
│ --- │
│ i64 │
╞═══════╡
│ 2 │
└───────┘
Step: 3
Added word: "Tengri"
shape: (1, 1)
┌───────┐
│ count │
│ --- │
│ i64 │
╞═══════╡
│ 3 │
└───────┘
Result table:
shape: (3, 3)
┌───────┬────────┬──────────────┐
│ index ┆ word ┆ length │
│ --- ┆ --- ┆ --- │
│ i64 ┆ str ┆ str │
╞═══════╪════════╪══════════════╡
│ 1 ┆ I ┆ 1 character │
│ 2 ┆ love ┆ 4 characters │
│ 3 ┆ Tengri ┆ 6 characters │
└───────┴────────┴──────────────┘
Now let’s display the created table, ordered by index, in a SQL cell:
SELECT * FROM my_table
ORDER BY index
+-------+--------+--------------+
| index | word | length |
+-------+--------+--------------+
| 1 | I | 1 character |
+-------+--------+--------------+
| 2 | love | 4 characters |
+-------+--------+--------------+
| 3 | Tengri | 6 characters |
+-------+--------+--------------+
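Example 1 inserts each word into the query text inside single quotes, so a word that itself contains a single quote would break the query. A minimal sketch of one way to guard against this is shown below. It assumes that Tengri follows the standard SQL rule of doubling single quotes inside string literals; sql_string_literal is a hypothetical helper, not part of tngri.

```python
def sql_string_literal(value):
    """Render a Python value as a single-quoted SQL string literal.

    Assumption: the SQL dialect escapes a single quote by doubling it,
    as in standard SQL.
    """
    return "'" + str(value).replace("'", "''") + "'"


word = "don't"
query = f"INSERT INTO my_table VALUES (1, {sql_string_literal(word)}, '5 characters')"
print(query)
```

The resulting string could then be passed to tngri.sql() in place of the hand-quoted query text.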
# Example 2
Let’s access the function’s output (the Polars DataFrame object) via cell coordinates:
print(tngri.sql('SELECT 2*2')[0,0])
4
# Example 3
Let’s iteratively load data from .parquet files from S3 storage into a table by file path mask.
In each iteration of the loop we will output the result of the query with the number of rows in the created table.
import tngri
for i in range(1, 4):
    file_name = f"s3://prostore/Stage/<lake_path>/{i}.parquet"
    tngri.sql(f"INSERT INTO raw_dyntest SELECT * FROM read_parquet('{file_name}')")
    print(tngri.sql("SELECT count(*) FROM raw_dyntest"))
shape: (1, 1)
+----------+
│ column_0 │
│ --- │
│ i64 │
+----------+
│ 10000000 │
+----------+
shape: (1, 1)
+----------+
│ column_0 │
│ --- │
│ i64 │
+----------+
│ 20000000 │
+----------+
shape: (1, 1)
+----------+
│ column_0 │
│ --- │
│ i64 │
+----------+
│ 30000000 │
+----------+
tngri.upload_df()
Description
Uploads data from a DataFrame to Tengri.
Usage
Uploads data from the specified DataFrame to Tengri (to S3 storage).
Returns the name of the .parquet file to which the data was uploaded. In the examples below the returned value is displayed as an UploadedFile object with an s3_path field, and printing it outputs the path.
If necessary, you can specify a name (or path and name) for the uploaded file via the optional filename parameter.
For more details on loading data into Tengri using Python, see here.
Parameters
- df: pandas.DataFrame | polars.DataFrame
  DataFrame with the data to upload.
- filename: str | None, default None
  Path and name of the .parquet file to which the data will be uploaded. If not specified, a random name is assigned.
See examples
# Example 1
Create a DataFrame and load it into Tengri:
import tngri
import pandas
my_df = pandas.DataFrame(range(100))
tngri.upload_df(my_df)
UploadedFile(s3_path='s3://prostore/Stage/ijwsajclddxw.parquet', _client=None)
# Example 2
Create a DataFrame and upload it to Tengri at the given path and filename:
import tngri
import pandas
my_df = pandas.DataFrame(range(100))
tngri.upload_df(my_df, filename='test/my_file.parquet')
UploadedFile(s3_path='s3://prostore/Stage/test/my_file.parquet', _client=None)
# Example 3
Let’s create a DataFrame, load it into Tengri and store the name of the .parquet file in the file_name variable:
import tngri
import pandas
my_df = pandas.DataFrame(range(100))
file_name = tngri.upload_df(my_df)
print(my_df)
print(file_name)
0
0 0
1 1
2 2
3 3
4 4
.. ..
95 95
96 96
97 97
98 98
99 99
[100 rows x 1 columns]
s3://prostore/Stage/tcewxknvcadf.parquet
# Example 4
Create a DataFrame and insert it into an existing table. In a separate column record the insertion time. After that delete the .parquet file.
import tngri
import pandas
df = pandas.DataFrame({"id": [1,2], "name": ['smith','brown']})
res = tngri.upload_df(df, filename="test_data/file.parquet")
tngri.sql(f"""
insert into raw.test_data
select now(), * from read_parquet('{res.s3_path}')
""")
tngri.delete_file(res)
done at 2026-04-20 14:42:07.601155
Let’s output the contents of the updated table:
SELECT * FROM raw.test_data
ORDER BY upload_time
+----------------------------+----+-------+
| upload_time | id | name |
+----------------------------+----+-------+
| 2026-04-20T14:42:00.707294 | 0 | alex |
+----------------------------+----+-------+
| 2026-04-20T14:42:06.812879 | 1 | smith |
+----------------------------+----+-------+
| 2026-04-20T14:42:06.812879 | 2 | brown |
+----------------------------+----+-------+
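In Example 4 the staged file is deleted only if the INSERT succeeds. One possible refinement is to wrap the insert in try/finally so that the temporary file is removed even when the query fails. The sketch below is hypothetical: insert_via_stage is not part of tngri, and the three tngri functions are passed in as arguments and replaced with stand-ins so the code runs outside Tengri.

```python
from types import SimpleNamespace


def insert_via_stage(df, table, filename, upload_df, sql, delete_file):
    """Upload `df`, insert it into `table`, and always delete the staged file."""
    staged = upload_df(df, filename=filename)
    try:
        sql(f"insert into {table} select now(), * from read_parquet('{staged.s3_path}')")
    finally:
        # Runs whether or not the insert raised an exception.
        delete_file(staged)
    return staged.s3_path


calls = []


def fake_upload_df(df, filename=None):
    calls.append("upload")
    return SimpleNamespace(s3_path=f"s3://prostore/Stage/{filename}")


def failing_sql(query):
    calls.append("sql")
    raise RuntimeError("insert failed")


def fake_delete_file(staged):
    calls.append("delete")


rows = [{"id": 1, "name": "smith"}]  # stand-in for a DataFrame
try:
    insert_via_stage(rows, "raw.test_data", "test_data/file.parquet",
                     fake_upload_df, failing_sql, fake_delete_file)
except RuntimeError:
    pass
print(calls)
```

In a Python cell the last three arguments would be tngri.upload_df, tngri.sql and tngri.delete_file.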
tngri.upload_file()
Description
Uploads data from a file to Tengri.
Usage
Uploads data from a file at the specified path to Tengri storage (in S3).
Returns the path and name of the file inside S3 storage to which the data was uploaded.
One of the usage scenarios is described here.
Parameters
- file: str
  Path to the file to be uploaded to Tengri.
- filename: str | None, default None
  The path and name of the file in Tengri to which the data will be uploaded. If not specified, a random name is assigned.
See examples
# Example 1
Let’s load into Tengri the data from a .json file available at a URL:
import tngri
import urllib.request
urllib.request.urlretrieve(
    'https://tngri.postgrespro.ru/documentation/ru/stable/_attachments/tengri_data_types.json',
    'my_file.json'
)
tngri.upload_file('my_file.json')
UploadedFile(s3_path='s3://prostore/Stage/pxfihzbonctd.json', _client=None)
Let’s output the first 5 rows of the table by reading it from the loaded file:
SELECT * FROM read_json('pxfihzbonctd.json')
LIMIT 5
+----------+-----------+----------+---------------------------+
| name | type | category | description |
+----------+-----------+----------+---------------------------+
| BIGINT | data type | numeric | Целые числа. |
+----------+-----------+----------+---------------------------+
| BIGINT[] | data type | array | Массивы целых чисел. |
+----------+-----------+----------+---------------------------+
| BLOB | data type | blob | Двоичные объекты. |
+----------+-----------+----------+---------------------------+
| BOOL | data type | boolean | Булевы значения. |
+----------+-----------+----------+---------------------------+
| BOOL[] | data type | array | Массивы булевых значений. |
+----------+-----------+----------+---------------------------+
# Example 2
Let’s load data from a .json file available at a URL into Tengri and save the name of the uploaded file in a variable:
import tngri
import urllib.request
urllib.request.urlretrieve(
    'https://tngri.postgrespro.ru/documentation/ru/stable/_attachments/tengri_data_types.json',
    'my_file.json'
)
file_name = tngri.upload_file('my_file.json')
print(file_name)
s3://prostore/Stage/cytkcifdbszn.json
Let’s output the first 5 rows of the table by reading it from the loaded file:
SELECT * FROM read_json('cytkcifdbszn.json')
LIMIT 5
+----------+-----------+----------+---------------------------+
| name | type | category | description |
+----------+-----------+----------+---------------------------+
| BIGINT | data type | numeric | Целые числа. |
+----------+-----------+----------+---------------------------+
| BIGINT[] | data type | array | Массивы целых чисел. |
+----------+-----------+----------+---------------------------+
| BLOB | data type | blob | Двоичные объекты. |
+----------+-----------+----------+---------------------------+
| BOOL | data type | boolean | Булевы значения. |
+----------+-----------+----------+---------------------------+
| BOOL[] | data type | array | Массивы булевых значений. |
+----------+-----------+----------+---------------------------+
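In both examples the upload returns a full path such as s3://prostore/Stage/cytkcifdbszn.json, while the read_json call uses only the file name. The conversion can be sketched as a small helper. The prefix is taken from the example output above and may differ in other installations, and stage_relative_name is a hypothetical function, not part of tngri.

```python
# Prefix copied from the example output; treat it as an assumption.
STAGE_PREFIX = "s3://prostore/Stage/"


def stage_relative_name(s3_path, prefix=STAGE_PREFIX):
    """Return the part of `s3_path` that follows the stage prefix."""
    if not s3_path.startswith(prefix):
        raise ValueError(f"path does not start with {prefix!r}: {s3_path!r}")
    return s3_path[len(prefix):]


print(stage_relative_name("s3://prostore/Stage/cytkcifdbszn.json"))
print(stage_relative_name("s3://prostore/Stage/test/my_file.parquet"))
```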
# Example 3
Create a local .json file, upload it to Tengri and insert its contents into an existing table. In a separate column record the insertion time. After that, delete the file from Tengri.
To read from the .json file in list format, use the read_json_objects function with the format='array' parameter.
import tngri
localpath = 'file.json'
with open(localpath, 'w') as file:
    file.write('[{"id": 123, "name": "smith"}, {"id": 124, "name": "brown"}]')
res = tngri.upload_file(localpath, filename="test_data/file.json")
tngri.sql(f"""
insert into raw.test_data
select now(), * from read_json_objects('{res.s3_path}', format='array')
""")
tngri.delete_file(res)
done at 2026-02-11 16:34:22.378942
Let’s output the contents of the updated table:
SELECT * FROM raw.test_data
+----------------------------+------------------------------+
| upload_time | body |
+----------------------------+------------------------------+
| 2026-04-20T16:34:21.875707 | {"id": 123, "name": "smith"} |
+----------------------------+------------------------------+
| 2026-04-20T16:34:21.875707 | {"id": 124, "name": "brown"} |
+----------------------------+------------------------------+
| 2026-04-20T16:30:22.865707 | {"id": 122, "name": "alex"} |
+----------------------------+------------------------------+
tngri.upload_s3()
Description
Uploads a file from the specified S3 bucket to Tengri.
Usage
Uploads a file from the specified S3 bucket to Tengri. The file can have any extension; the uploaded file keeps the extension of the source file.
Parameters
- object: str
  Path to the file in S3 storage.
- access_key: str
  Access key for S3.
- secret_key: str
  Secret key for S3.
- filename: str | None, default None
  Path and name of the file in Tengri to which the data will be uploaded. If not specified, a random name is assigned.
An example of usage is described here.
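Since the function takes S3 credentials, one reasonable pattern is to read them from environment variables instead of writing them into the notebook. The sketch below is hypothetical in several respects: upload_from_bucket is not part of tngri, the variable names S3_ACCESS_KEY and S3_SECRET_KEY and the object path are invented for illustration, and it assumes that tngri.upload_s3 accepts the parameters listed above as keyword arguments. A stand-in replaces the real function so the code runs outside Tengri.

```python
import os


def upload_from_bucket(upload_s3, object_path, filename=None, env=os.environ):
    """Call `upload_s3` with credentials taken from environment variables."""
    # Hypothetical variable names; use whatever your environment defines.
    missing = [name for name in ("S3_ACCESS_KEY", "S3_SECRET_KEY") if name not in env]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    # Assumption: the documented parameter names work as keyword arguments.
    return upload_s3(
        object=object_path,
        access_key=env["S3_ACCESS_KEY"],
        secret_key=env["S3_SECRET_KEY"],
        filename=filename,
    )


received = {}


def fake_upload_s3(object, access_key, secret_key, filename=None):
    # Stand-in for tngri.upload_s3 that records what it was called with.
    received.update(object=object, access_key=access_key,
                    secret_key=secret_key, filename=filename)
    return f"staged:{filename}"


result = upload_from_bucket(
    fake_upload_s3,
    "my-bucket/data/events.csv",  # hypothetical object path
    filename="raw/events.csv",
    env={"S3_ACCESS_KEY": "AK", "S3_SECRET_KEY": "SK"},
)
print(result)
```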