Pack

A pack is a program that runs an analysis on a data source. Packs are executed by agents according to routines. A pack's purpose is to process the source, extract useful information about it, and send that information back to the platform.

How to create a 'Pack'?

Create a pack

pack-steps

qalita pack init --name my_pack

This creates a my_pack_pack folder with the following files:

my_pack_pack
├── main.py
├── pack_conf.json
├── properties.yaml
├── README.md
├── requirements.txt
└── run.sh
tip

The most important files are run.sh and properties.yaml. Both are required: before each pack release, a check systematically verifies that they exist.

File             | Description                 | Example
main.py          | Contains pack code          | main.py
pack_conf.json   | Contains pack configuration | pack_conf.json
properties.yaml  | Contains pack properties    | properties.yaml
README.md        | Contains pack description   | README.md
requirements.txt | Contains pack dependencies  | requirements.txt
run.sh           | Is the pack's entry point   | run.sh
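
The existence check mentioned in the tip above can be reproduced locally before a release. The following is a minimal sketch, not the platform's actual check: the function name missing_required_files is hypothetical, and only the two file names come from this page.

```python
from pathlib import Path

# Files this page describes as indispensable for every pack.
REQUIRED_FILES = ("run.sh", "properties.yaml")


def missing_required_files(pack_dir: str) -> list[str]:
    """Return the required files that are absent from pack_dir."""
    root = Path(pack_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]
```

Calling missing_required_files("my_pack_pack") returns an empty list when both files are present.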
info

To see templates and examples of packs, you can take inspiration from the public packs in our GitHub repository.

Testing a pack

You can test your pack locally before publishing it on the platform.

To do this, use the Qalita CLI:

qalita pack validate -n my_pack
qalita pack run -n my_pack
info

If you want to run the pack locally on a dataset, create a temporary source_conf.json file.

Example with a local dataset:

{
"config": {
"path": "/home/aleopold/data/heart"
},
"description": "11 clinical features for predicting heart disease events.",
"id": 6,
"name": "Heart Failure Prediction Dataset",
"owner": "aleopold",
"owner_id": 2,
"reference": false,
"sensitive": false,
"type": "file",
"validate": "valid",
"visibility": "internal"
}
warning

Be sure to delete result files, logs, source configuration files and cache files before publishing your pack.
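
The cleanup described in the warning above can be scripted. This is a sketch under assumptions: the list of generated files is taken from the outputs named on this page, __pycache__ is only a guess at what "cache files" covers, and clean_pack is a hypothetical helper, not part of the Qalita CLI.

```python
import shutil
from pathlib import Path

# Assumed names: output and configuration files mentioned on this page.
GENERATED_FILES = (
    "logs.txt",
    "recommendations.json",
    "metrics.json",
    "schemas.json",
    "source_conf.json",
    "target_conf.json",
)
# Assumption: Python bytecode folders are the cache files to remove.
CACHE_DIRS = ("__pycache__",)


def clean_pack(pack_dir: str) -> list[str]:
    """Delete generated files and cache folders; return what was removed."""
    root = Path(pack_dir)
    removed = []
    for name in GENERATED_FILES:
        path = root / name
        if path.is_file():
            path.unlink()
            removed.append(name)
    for name in CACHE_DIRS:
        path = root / name
        if path.is_dir():
            shutil.rmtree(path)
            removed.append(name)
    return removed
```

Adjust both tuples to whatever your pack actually writes during a local run.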

At runtime

The pack entry point is the run.sh file, located in the root path of the temporary local folder created by the agent.

run.sh Example:

#!/bin/bash
python -m pip install --quiet -r requirements.txt
python main.py

The pack is fed by a source_conf.json file and also target_conf.json if the pack is of type compare.

These files describe the source (or the target) and hold its configuration data under the config key.

These files are located next to the run.sh entry point.

source_conf.json Example:

{
"config": {
"path": "/home/lucas/desktop"
},
"description": "Desktop files",
"id": 1,
"name": "local_data",
"owner": "lucas",
"type": "file",
"reference": false,
"sensitive": false,
"visibility": "private",
"validate": "valid"
}
info

The pack is responsible for managing its own source type compatibility by checking the source type in the source_conf.json file.
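
A pack's main.py could perform this compatibility check along the following lines. This is a sketch, assuming a pack that only supports sources of type "file"; SUPPORTED_TYPES, load_conf and source_path_if_supported are hypothetical names, and only the type and config.path keys come from the examples on this page.

```python
import json

# Hypothetical: the source types this particular pack can analyse.
SUPPORTED_TYPES = {"file"}


def load_conf(path: str = "source_conf.json") -> dict:
    """Read a source (or target) configuration file located next to run.sh."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def source_path_if_supported(conf: dict, supported=SUPPORTED_TYPES) -> str:
    """Return config.path, or raise if the source type is not supported."""
    source_type = conf.get("type")
    if source_type not in supported:
        raise ValueError(f"Unsupported source type: {source_type!r}")
    return conf["config"]["path"]
```

Failing early with a clear message means the reason for the incompatibility is visible in the run output rather than surfacing as an unrelated error later.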

After execution

At the end of pack execution, the agent looks for the following files:

  • logs.txt: file collected by the agent and sent to the platform so that the logs can be displayed in the frontend.

logs.txt Example:

2023-07-21 11:51:12,688 - qalita.commands.pack - INFO - ------------- Pack Run -------------
2023-07-21 11:51:15,087 - qalita.commands.pack - INFO - CSV files found :
2023-07-21 11:51:15,222 - qalita.commands.pack - ERROR - Summarize dataset : 0%| | 0/5 [00:00<?, ?it/s]
...

Visible on the platform:

logs
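
When testing locally, it can be handy to scan logs.txt for errors before looking at the platform. The sketch below assumes every entry follows the "timestamp - logger - LEVEL - message" layout of the example above; parse_log_line and count_levels are hypothetical helpers, and lines that do not match (such as continuation lines) are skipped.

```python
import re
from collections import Counter

# Layout inferred from the logs.txt example: timestamp - logger - LEVEL - message
LINE_PATTERN = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})"
    r" - (?P<logger>\S+) - (?P<level>[A-Z]+) - (?P<message>.*)$"
)


def parse_log_line(line: str):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LINE_PATTERN.match(line.rstrip("\n"))
    return match.groupdict() if match else None


def count_levels(lines) -> Counter:
    """Count log entries per level, ignoring lines that do not match the layout."""
    parsed = (parse_log_line(line) for line in lines)
    return Counter(entry["level"] for entry in parsed if entry)
```

For example, count_levels(open("logs.txt", encoding="utf-8")) gives a quick overview of how many ERROR entries a run produced.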

  • recommendations.json

The Recommendations file contains the recommendations given by the pack about the source.

recommendations.json Example:

[
{
"content": "Cholesterol has 172 (18.7%) zeros",
"type": "Zeros",
"scope": {
"perimeter": "column",
"value": "Cholesterol"
},
"level": "info"
},
{
...
},
...
]

The recommendations are then materialized in the pack view on the source page.

reco-pack-source
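
A pack could produce this file with a few lines of Python. This is a sketch under assumptions: it treats the file as a JSON array of recommendation objects with the fields shown in the example, it only accepts the levels listed on this page (info, warning, high), and make_recommendation and write_recommendations are hypothetical helper names.

```python
import json

# Levels listed on this page.
ALLOWED_LEVELS = {"info", "warning", "high"}


def make_recommendation(content, rec_type, perimeter, value, level="info") -> dict:
    """Build one recommendation entry with the fields used in the example."""
    if level not in ALLOWED_LEVELS:
        raise ValueError(f"Unknown level: {level!r}")
    return {
        "content": content,
        "type": rec_type,
        "scope": {"perimeter": perimeter, "value": value},
        "level": level,
    }


def write_recommendations(recommendations, path="recommendations.json") -> None:
    """Write the entries next to run.sh, where the agent looks for them."""
    # Assumption: the file is a JSON array of recommendation objects.
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(recommendations, handle, indent=4)
```

Validating the level when the entry is built keeps a typo from reaching the platform.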

info

Recommendations have several levels:

  • info: information
  • warning
  • high

  • metrics.json

The metrics file contains the metrics given by the pack about the source.

metrics.json Example:

[
{
"key": "types_numeric",
"value": "7",
"scope": {
"perimeter": "dataset",
"value": "Heart Failure Prediction Dataset"
}
},
{
"key": "is_unique",
"value": "0",
"scope": {
"perimeter": "column",
"value": "Age"
}
},
{
...
},
...
]

Metrics are then materialized in the pack view on the source page.

metrics-pack-source

Metrics and recommendations are transmitted to the platform and then made available to the source pack execution view.
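
To check a metrics file locally, the entries can be built and then grouped by scope. The sketch below assumes metric values are stored as strings, as in the example above; make_metric and index_by_scope are hypothetical helpers.

```python
from collections import defaultdict


def make_metric(key, value, perimeter, scope_value) -> dict:
    """Build one metric entry; the value is stored as a string, as in the example."""
    return {
        "key": key,
        "value": str(value),
        "scope": {"perimeter": perimeter, "value": scope_value},
    }


def index_by_scope(metrics) -> dict:
    """Group metrics as {(perimeter, scope value): {key: value}}."""
    index = defaultdict(dict)
    for metric in metrics:
        scope = metric["scope"]
        index[(scope["perimeter"], scope["value"])][metric["key"]] = metric["value"]
    return dict(index)
```

Grouping by scope makes it easy to see at a glance which metrics apply to the whole dataset and which apply to a single column.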

  • schemas.json

The schemas file contains the schema of the source as detected by the pack.

schemas.json Example:

[
{
"key": "dataset",
"value": "Heart Failure Prediction Dataset",
"scope": {
"perimeter": "dataset",
"value": "Heart Failure Prediction Dataset"
}
},
{
"key": "column",
"value": "Age",
"scope": {
"perimeter": "column",
"value": "Age"
}
},
{
"key": "column",
"value": "Sex",
"scope": {
"perimeter": "column",
"value": "Sex"
}
},
...
]

The schemas are then materialized in the pack view on the source page.

schema-pack-source
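
The schemas.json structure shown above can be generated from a dataset name and its column names. This is a minimal sketch assuming exactly one "dataset" entry followed by one "column" entry per column, as in the example; build_schema is a hypothetical helper.

```python
def build_schema(dataset_name: str, columns: list[str]) -> list[dict]:
    """Return schema entries: one for the dataset, then one per column."""
    entries = [
        {
            "key": "dataset",
            "value": dataset_name,
            "scope": {"perimeter": "dataset", "value": dataset_name},
        }
    ]
    for column in columns:
        entries.append(
            {
                "key": "column",
                "value": column,
                "scope": {"perimeter": "column", "value": column},
            }
        )
    return entries
```

The resulting list can be written to schemas.json with json.dump.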

Publish a pack

Packs have authors. You can only publish a pack of which you are the author. You can see the author of a pack on the pack page:

Viewing the author of a pack

info

When you add partners, their public packs become available to you in a separate tab on the application's packs page. They are also listed by qalita pack list, and you can use them just like any other pack. However, you cannot modify them.

The author of these packs will be the "partner" user created when the partner was added.

To publish a pack, you must use the Qalita CLI:

  1. Install the Qalita CLI
pip install qalita
  2. Retrieve your API token from your profile page

profile-get-token

  3. Connect to the platform
agentName=admin
fileName="$HOME/.qalita/.env-$agentName"
mkdir -p "$(dirname "$fileName")"
echo "QALITA_AGENT_NAME=$agentName" > "$fileName"
echo "QALITA_AGENT_MODE=worker" >> "$fileName"
echo "QALITA_AGENT_ENDPOINT=http://localhost:3080/api/v1" >> "$fileName"
# Put the API token retrieved in step 2 after the equals sign
echo "QALITA_AGENT_TOKEN=" >> "$fileName"
  4. Move to the root of the pack's parent folder

Example for a pack named my-pack:

/-- parent-folder <----- here
|-- my-pack_pack
|   __init__.py
|   my-pack.py
  5. Publish the pack
qalita pack push -n my_pack
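
Before pushing, it can help to confirm that the environment file written in the "Connect to the platform" step is complete, in particular that the token is not left empty. The sketch below is an assumption-laden convenience, not part of the Qalita CLI: parse_env_file and missing_settings are hypothetical helpers, and the key names are the ones written by the shell commands above.

```python
from pathlib import Path

# Keys written by the shell commands in the "Connect to the platform" step.
REQUIRED_KEYS = (
    "QALITA_AGENT_NAME",
    "QALITA_AGENT_MODE",
    "QALITA_AGENT_ENDPOINT",
    "QALITA_AGENT_TOKEN",
)


def parse_env_file(path) -> dict:
    """Parse simple KEY=value lines, skipping blank lines and comments."""
    values = {}
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_settings(path) -> list[str]:
    """Return the required keys that are absent or have an empty value."""
    values = parse_env_file(path)
    return [key for key in REQUIRED_KEYS if not values.get(key)]
```

With the file produced by the commands above and no token filled in, missing_settings reports QALITA_AGENT_TOKEN.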

You can then find your pack on the platform:

pack-list

Qalita Pack Assistant

You can use our chatbot, Qalita Pack Assistant, to help you create your pack.

The assistant draws on a knowledge base specific to Qalita pack creation and guides you through the process.

qalita-pack-assistant

QALITA public packs

You can find Qalita public packs in our GitHub repository. These packs are maintained by QALITA SAS and the community. All contributions are appreciated.