Pack
A pack is a program that executes an analysis on a data source. Packs are executed by agents according to routines. Its purpose is to process the source, retrieve useful information about it, and send that information back to the platform.
How to create a 'Pack'?
Create a pack
Command:
qalita pack init --name my_pack
Result:
>>> qalita pack init --name my_pack
Created package folder: my_pack_pack
Created file: properties.yaml
Created file: pack_conf.json
Created file: main.py
Please update the main.py file with the required code
Created file: run.sh
Please update the run.sh file with the required commands
Created file: requirements.txt
Please update the requirements.txt file with the required packages depdencies
Created file: README.md
Please READ and update the README.md file with the description of your pack
This creates a my_pack_pack folder with the following files:
my_pack_pack
├── main.py
├── pack_conf.json
├── properties.yaml
├── README.md
├── requirements.txt
└── run.sh
The most important files are run.sh and properties.yaml. Both are mandatory: their existence is systematically checked before each pack release.
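The existence check can be approximated locally before a release. The following is a minimal sketch, not the CLI's actual implementation: the helper name missing_required_files is hypothetical, and only the two file names come from this page.

```python
from pathlib import Path

# File names taken from this page; the CLI's own check may cover more.
REQUIRED_FILES = ("run.sh", "properties.yaml")


def missing_required_files(pack_dir):
    """Return the required files that are absent from pack_dir (hypothetical helper)."""
    root = Path(pack_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_required_files("my_pack_pack")
    if missing:
        print("Missing required files:", ", ".join(missing))
    else:
        print("All required files are present.")
```

Running the script from the parent folder of my_pack_pack prints which of the two files, if any, still needs to be created.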
File | Description | Examples |
---|---|---|
main.py | Contains pack code | main.py |
pack_conf.json | Contains pack configuration | pack_conf.json |
properties.yaml | Contains pack properties | properties.yaml |
README.md | Contains pack description | README.md |
requirements.txt | Contains pack dependencies | requirements.txt |
run.sh | Is the pack's entry point | run.sh |
For templates and examples of packs, see the public packs in the GitHub repository.
Testing a pack
You can test your pack locally before publishing it on the platform.
To do this, use the Qalita CLI:
qalita pack validate -n my_pack
qalita pack run -n my_pack
If you want to run the pack locally on datasets, create a temporary source_conf.json file.
Example with a local dataset:
{
"config": {
"path": "/home/aleopold/data/heart"
},
"description": "11 clinical features for predicting heart disease events.",
"id": 6,
"name": "Heart Failure Prediction Dataset",
"owner": "aleopold",
"owner_id": 2,
"reference": false,
"sensitive": false,
"type": "file",
"validate": "valid",
"visibility": "internal"
}
Be sure to delete result files, logs, source configuration files and cache files before publishing your pack.
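This cleanup can be scripted. The sketch below assumes the pack writes the file names used on this page (logs.txt, recommendations.json, metrics.json, schemas.json, source_conf.json, target_conf.json) and that the only cache is Python's __pycache__ directories. The helper clean_pack_folder is hypothetical, so adjust the list to what your pack actually produces.

```python
import shutil
from pathlib import Path

# Assumed leftovers from a local run; extend this list for your own pack.
TEMP_FILES = (
    "logs.txt",
    "recommendations.json",
    "metrics.json",
    "schemas.json",
    "source_conf.json",
    "target_conf.json",
)


def clean_pack_folder(pack_dir):
    """Delete run leftovers from pack_dir and return the removed paths (hypothetical helper)."""
    root = Path(pack_dir)
    removed = []
    for name in TEMP_FILES:
        path = root / name
        if path.is_file():
            path.unlink()
            removed.append(path)
    # Remove Python bytecode caches at any depth.
    for cache in list(root.rglob("__pycache__")):
        if cache.is_dir():
            shutil.rmtree(cache)
            removed.append(cache)
    return removed
```

Calling clean_pack_folder("my_pack_pack") from the parent folder returns the list of paths that were deleted, which is useful to review before pushing.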
At runtime
The pack entry point is the run.sh file, located at the root of the temporary local folder created by the agent.
run.sh Example:
#!/bin/bash
python -m pip install --quiet -r requirements.txt
python main.py
The pack is given a source_conf.json file, and also a target_conf.json file if the pack is of type compare.
These files contain the config data for the source and are located next to the run.sh entry point.
source_conf.json Example:
{
"config": {
"path": "/home/lucas/desktop"
},
"description" : "Desktop files",
"id" : 1,
"name": "local_data",
"owner" : "lucas",
"type" : "file",
"reference" : false,
"sensitive": false,
"visibility": "private",
"validate" : "valid"
}
The pack is responsible for managing its own source type compatibility by checking the source type in the source_conf.json file.
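As a sketch of that compatibility check, the snippet below loads source_conf.json from the directory containing the entry point and rejects unsupported source types. The set SUPPORTED_TYPES and both function names are hypothetical. Only the "type" and "config" keys come from the examples on this page.

```python
import json
from pathlib import Path

# Hypothetical: the source types this particular pack knows how to read.
SUPPORTED_TYPES = {"file"}


def load_conf(directory, name="source_conf.json"):
    """Read a configuration file located next to run.sh."""
    with open(Path(directory) / name, encoding="utf-8") as handle:
        return json.load(handle)


def check_source_type(source_conf):
    """Return the source's "config" block, or raise if the type is unsupported."""
    source_type = source_conf.get("type")
    if source_type not in SUPPORTED_TYPES:
        raise ValueError(f"Unsupported source type: {source_type!r}")
    return source_conf["config"]
```

In main.py, a call such as check_source_type(load_conf(".")) would then give access to config["path"]. Failing early with a clear error keeps the reason visible in the logs instead of surfacing later as an unrelated exception.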
After execution
At the end of pack execution, the agent looks for the following files:
logs.txt: File uploaded to the platform so that the execution logs can be shown in the frontend.
logs.txt Example:
2023-07-21 11:51:12,688 - qalita.commands.pack - INFO - ------------- Pack Run -------------
2023-07-21 11:51:15,087 - qalita.commands.pack - INFO - CSV files found :
2023-07-21 11:51:15,222 - qalita.commands.pack - ERROR - Summarize dataset : 0%| | 0/5 [00:00<?, ?it/s]
...
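A pack written in Python can produce a logs.txt in a similar format with the standard logging module. This is only a sketch: the logger name my_pack and the helper get_pack_logger are hypothetical, and the format string is inferred from the sample lines, not taken from the Qalita source.

```python
import logging

# Inferred from the sample: "<timestamp> - <logger name> - <level> - <message>".
LOG_FORMAT = "%(asctime)s - %(name)s - %(levelname)s - %(message)s"


def get_pack_logger(name="my_pack", log_file="logs.txt"):
    """Return a logger that appends formatted records to log_file (hypothetical helper)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid duplicate handlers on repeated calls
        handler = logging.FileHandler(log_file, encoding="utf-8")
        handler.setFormatter(logging.Formatter(LOG_FORMAT))
        logger.addHandler(handler)
    return logger
```

A pack would call get_pack_logger() once at startup and then use logger.info(...) or logger.error(...) wherever it wants a line to appear in the platform's log view.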
Visible on the platform :
recommendations.json
The Recommendations file contains the recommendations given by the pack about the source.
recommendations.json Example:
[
{
"content": "Cholesterol has 172 (18.7%) zeros",
"type": "Zeros",
"scope": {
"perimeter": "column",
"value": "Cholesterol"
},
"level": "info"
},
{
...
}
...
]
The recommendations are then materialized in the pack view on the source page.
There are several recommendation levels:
- info: Information
- warning
- high
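The following is a minimal sketch of how a pack might assemble recommendations.json. It assumes the file is a JSON array of entries with the keys shown in the example. The helper names are hypothetical, and the level check uses the three levels listed here.

```python
import json

LEVELS = ("info", "warning", "high")  # levels listed on this page


def make_recommendation(content, type_, perimeter, value, level="info"):
    """Build one entry with the keys shown in the example (hypothetical helper)."""
    if level not in LEVELS:
        raise ValueError(f"Unknown level: {level!r}")
    return {
        "content": content,
        "type": type_,
        "scope": {"perimeter": perimeter, "value": value},
        "level": level,
    }


def write_json(entries, path):
    """Write the entries to path as a JSON array."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(entries, handle, indent=2)
```

A pack would collect entries in a list during its analysis and call write_json(entries, "recommendations.json") once at the end, so that the agent finds a complete file.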
metrics.json
The metrics file contains the metrics given by the pack about the source.
metrics.json Example:
[
{
"key": "types_numeric",
"value": "7",
"scope": {
"perimeter": "dataset",
"value": "Heart Failure Prediction Dataset"
}
},
{
"key": "is_unique",
"value": "0",
"scope": {
"perimeter": "column",
"value": "Age"
}
},
{
...
}
...
]
Metrics are then materialized in the pack view on the source page.
Metrics and recommendations are transmitted to the platform and then made available to the source pack execution view.
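As an illustration of producing metrics.json, the sketch below emits one is_unique metric per column. It rests on several assumptions: the file is a JSON array, values are serialized as strings as in the example, and is_unique is "1" when all values in a column are distinct. The function names are hypothetical.

```python
import json


def make_metric(key, value, perimeter, scope_value):
    """Build one metric entry; the value is stored as a string, as in the example."""
    return {
        "key": key,
        "value": str(value),
        "scope": {"perimeter": perimeter, "value": scope_value},
    }


def column_uniqueness_metrics(columns):
    """Emit an is_unique metric (1 or 0) for each column of a {name: values} mapping."""
    return [
        make_metric("is_unique", int(len(set(values)) == len(values)), "column", name)
        for name, values in columns.items()
    ]


def save_metrics(metrics, path="metrics.json"):
    """Write the metric entries to path as a JSON array."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(metrics, handle, indent=2)
```

For example, column_uniqueness_metrics({"Age": [40, 49, 40]}) yields a single entry whose value is "0", because 40 appears twice.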
schemas.json
The schemas file contains the schema of the source as detected by the pack.
schemas.json Example:
[
{
"key": "dataset",
"value": "Heart Failure Prediction Dataset",
"scope": {
"perimeter": "dataset",
"value": "Heart Failure Prediction Dataset"
}
},
{
"key": "column",
"value": "Age",
"scope": {
"perimeter": "column",
"value": "Age"
}
},
{
"key": "column",
"value": "Sex",
"scope": {
"perimeter": "column",
"value": "Sex"
}
},
....
]
The schemas are then materialized in the pack view on the source page.
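A schema file of this shape can be generated from a dataset name and its column names. The sketch below mirrors the keys of the example and is not the platform's reference implementation. The helpers build_schema and save_schema are hypothetical.

```python
import json


def build_schema(dataset_name, column_names):
    """Return one dataset entry followed by one entry per column (hypothetical helper)."""
    entries = [{
        "key": "dataset",
        "value": dataset_name,
        "scope": {"perimeter": "dataset", "value": dataset_name},
    }]
    for column in column_names:
        entries.append({
            "key": "column",
            "value": column,
            "scope": {"perimeter": "column", "value": column},
        })
    return entries


def save_schema(entries, path="schemas.json"):
    """Write the schema entries to path as a JSON array."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(entries, handle, indent=2)
```

With a tabular source, column_names would typically come from the header of the file the pack has just read.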
Publish a pack
Packs have authors, and you can only publish a pack that you authored. The author of a pack is shown on the pack page:
Viewing the author of a pack
When adding partners, their public packs will be available to you in another tab on the application's packs page.
They are also listed by qalita pack list, and you can use them just like any other pack.
However, you won't be able to modify them.
The author of these packs will be the "partner" user created when the partner was added.
To publish a pack, you must use the Qalita CLI:
- Install the Qalita CLI
pip install qalita
- Retrieve your API token from your profile page
- Connect to the platform
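agentName=admin
fileName="$HOME/.qalita/.env-$agentName"
# Create the configuration directory if it does not exist yet
mkdir -p "$(dirname "$fileName")"
echo "QALITA_AGENT_NAME=$agentName" > "$fileName"
echo "QALITA_AGENT_MODE=worker" >> "$fileName"
echo "QALITA_AGENT_ENDPOINT=http://localhost:3080/api/v1" >> "$fileName"
# Paste the API token retrieved in the previous step after the "=" sign
echo "QALITA_AGENT_TOKEN=" >> "$fileName"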
- Move to the root of the pack's parent folder
Example for a pack named my-pack:
/-- parent-folder <----- here
|-- my-pack_pack
| __init__.py
| my-pack.py
- Publish the pack
Command:
qalita pack push -n my_pack
Result:
>>> qalita pack push -n my_pack
------------- Pack Validation -------------
Pack [my_pack] validated.
------------- Pack Push -------------
Pack [my_pack] published
New pack version [1.0.0] detected. Pushing pack version
Pack [my_pack] updated successfully
Pack asset uploaded
Pack pushed !
You can then find your pack on the platform:
Qalita Pack Assistant
You can use our chatbot, Qalita Pack Assistant, to help you create your pack.
It draws on a knowledge base specific to Qalita pack creation, guides you through the process, and helps you work faster.
QALITA public packs
You can find Qalita public packs on our GitHub repository. These packs are maintained by QALITA SAS and the community. All contributions are appreciated.