Getting Started¶
echo-postgres is an interface for easily interacting with the performance PostgreSQL database (PerformanceDB). It was completely rewritten in version 2.0 to make it more concise and easier to use. All methods are now contained in a single class called `PerfDB`, so importing this one class is enough to use all the functionality of the package.
Much of the code implemented here is based on the approach used by the Python Reddit API Wrapper (PRAW). It's a very good example of how to implement an API wrapper in Python, and it's worth taking a look at it if you want to understand how this package works.
Usage¶
To interact with PerformanceDB using echo-postgres, you just need to create an instance of the `PerfDB` class, as shown in the example below.
```python
from echo_postgres import PerfDB

perfdb = PerfDB()
```
Tip
Usually no arguments are required, but you can define a different host, user, password, etc. at connection time. Any that are not defined fall back to the default values.
Also, one important option is `application_name`, which comes in handy when relating the SQL sessions shown in pgAdmin to the application that is using the database. The name you define here is what pgAdmin will display.
Structure¶
The PerfDB class is the only top level class in the package. It holds objects that make it easy to access different types of data stored in PerformanceDB. The structure of the class is shown below.
PerfDB
├───DataTypes
├───Attributes
├───DataSources
│ ├───Types
│ └───Instances
│ └───Attributes
├───Objects
│ ├───Types
│ ├───Models
│ │ └───Attributes
│ ├───Instances
│ │ └───Attributes
│ └───Groups
│ ├───Types
│ └───Instances
├───Alarms
│ ├───Definitions
│ └───History
├───Features
│ ├───Definitions
│ │ └───Attributes
│ └───Values
│ ├───Series
│ ├───Latest
│ ├───Aggregations
│ └───WhenAtCondition
├───CalcModels
│ ├───Types
│ └───Instances
│ └───Files
│ ├───Definitions
│ └───Values
├───Forecasts
│ ├───Models
│ │ └───Attributes
│ └───Features
│ ├───Definitions
│ └───Values
├───Ons
│ ├───Limitations
│ └───SPEs
│ └───SiteMapping
├───Ccee
│ ├───Submarkets
│ ├───PLD
│ ├───IPCA
│ ├───PPA
│ │ ├───Definitions
│ │ └───Values
│ └───TUST
├───KPIs
│ ├───Availability
│ │ ├───Categories
│ │ ├───Types
│ │ ├───Subtypes
│ │ │ └───Categories
│ │ ├───Amounts
│ │ ├───Forecasts
│ │ │ ├───Revisions
│ │ │ │ └───Assignments
│ │ │ ├───Assumptions
│ │ │ │ └───Assets
│ │ │ ├───DowntimeEvents
│ │ │ │ └───Assets
│ │ │ └───CurtailmentEvents
│ │ │ └───Assets
│ │ ├───Targets
│ │ └───Values
│ ├───StoppedAssets
│ ├───TrackerAvailability
│ │ ├───Types
│ │ ├───Amounts
│ │ └───Values
│ ├───Energy
│ │ ├───MeasurementPoints
│ │ ├───Targets
│ │ ├───Values
│ │ ├───Waterfall
│ │ └───Losses
│ │ ├───Types
│ │ ├───Values
│ │ └───Targets
│ ├───CapacityFactor
│ │ ├───Targets
│ │ └───Values
│ ├───Resource
│ │ ├───Types
│ │ └───Values
│ ├───EnergyPerformanceIndex
│ │ └───Values
│ └───PerformanceRatio
│ └───Values
├───BazefieldUsers
├───Users
│ ├───Instances
│ └───Roles
│ ├───Instances
│ └───Permissions
│ ├───Types
│ └───Values
├───Settings
├───Comments
│ ├───Events
│ └───IVCurve
│ └───Reports
│ └───Details
├───Labels
│ ├───Types
│ └───Values
├───Companies
├───Cities
├───Components
│ ├───Types
│ ├───Models
│ │ └───Attributes
│ ├───Instances
│ │ ├───Attributes
│ │ ├───History
│ │ └───Latest
│ └───Subcomponents
│ ├───Types
│ ├───Models
│ │ └───Attributes
│ └───Instances
│ ├───Attributes
│ ├───History
│ └───Latest
├───RootCauses
├───Events
│ ├───Types
│ ├───Instances
│ │ └───Comments
│ └───DetectionTypes
├───Documents
│ ├───Types
│ └───Instances
│ ├───Events
│ └───Labels
├───RawData
│ ├───Definitions
│ └───Values
├───Vibration
│ ├───Frequencies
│ ├───TimeSeries
│ └───Spectrum
├───ResourceAssessments
│ ├───Types
│ ├───Instances
│ ├───Values
│ ├───Pxx
│ └───Losses
│ ├───Types
│ └───Values
├───IVCurves
│ ├───Diagnostics
│ │ ├───Types
│ │ └───Instances
│ └───Reports
│ └───Details
├───ServiceOrders
│ ├───Notes
│ └───Status
├───Inventory
│ ├───Centers
│ ├───StorageLocations
│ │ └───AllowedTransactionTypes
│ ├───Materials
│ ├───CurrentBalances
│ ├───Transactions
│ │ ├───Types
│ │ ├───Headers
│ │ │ └───Documents
│ │ └───Items
│ ├───Withdrawals
│ │ ├───Items
│ │ └───ServiceOrders
│ └───PhysicalCounts
│ ├───Items
│ └───Documents
└───Jobs
├───Types
├───Instances
├───Steps
└───States
Each object below `PerfDB` is an instance of its own specific class, but all inherit from the base class `PerfDbRoot`. This class contains the methods common to all objects and, most importantly, defines the constructor, which should not be overridden in child classes. The constructor is responsible for instantiating the attributes common to all objects, such as the connection handler to the database.
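The composition described above can be sketched with a toy (this is an illustrative stand-in, not the package's actual code): every accessor object receives the top-level `PerfDB` instance through the common constructor and therefore shares its connection handler.

```python
class FakeConn:
    """Stand-in for the database connection handler."""

    def __init__(self, dsn: str) -> None:
        self.dsn = dsn


class PerfDbRoot:
    """Base class: the common constructor wires in the shared PerfDB instance."""

    def __init__(self, perfdb: "PerfDB") -> None:
        self._perfdb = perfdb


class Types(PerfDbRoot):
    def get_ids(self) -> dict[str, int]:
        # a real implementation would query PerformanceDB here
        return {"wtg": 1, "inverter": 2}


class Objects(PerfDbRoot):
    def __init__(self, perfdb: "PerfDB") -> None:
        super().__init__(perfdb)
        self.types = Types(perfdb)


class PerfDB:
    def __init__(self) -> None:
        self.conn = FakeConn("postgresql://...")
        self.objects = Objects(self)


perfdb = PerfDB()
print(perfdb.objects.types.get_ids())  # {'wtg': 1, 'inverter': 2}
assert perfdb.objects.types._perfdb.conn is perfdb.conn  # one shared handler
```

This is why the constructor must not be overridden: every node in the tree relies on it to receive the shared `PerfDB` reference.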
General Guidelines¶
- Polars: Since version 3.0.0, Polars is the standard DataFrame library for this package and must be preferred over Pandas in all new development. Polars is significantly faster and more memory-efficient than Pandas, and all internal queries now use `conn.read_to_polars()` by default. The key implications are:
    - All `get` methods expose an `output_type="pl.DataFrame"` option that returns a native Polars DataFrame. This is the preferred output type for new code.
    - Internally, always use `conn.read_to_polars(query)` to fetch results from the database.
    - When a method needs to return a Pandas DataFrame for backwards compatibility (`output_type="DataFrame"`), convert at the last possible step using `df.to_pandas(use_pyarrow_extension_array=True)`.
    - Avoid holding Pandas DataFrames inside a method longer than necessary. Convert to Polars as early as possible.
    - JSON/mixed-type columns should be handled with the `cast_column_to_object` utility from `echo_postgres.utils`, which parses stringified JSON into native Python objects and stores them as `pl.Object`.
- Key naming: The package maintains the same names used in the database for consistency. This means that when returning a dictionary or a DataFrame, the keys will be the same as the column names in the database, making it easier to relate the data returned by the package to the data in the database.
- Default keys: Methods that return values such as dicts or DataFrames will always use the names of the objects as keys/index. This is done to make it easier to use the returned values in other methods. The ids of those objects will be returned in the dict values or DataFrame columns as they are of secondary usage (very important to the database itself but not to the user or applications interacting with it).
- Queries: All queries should be executed within a `with` statement, like in the example below. This ensures that the connection to the database is closed after the query is executed, avoiding issues due to too many open connections.
```python
# suppose we are in a class that is a child of PerfDbRoot

# example of reading to Polars (preferred)
df = self._perfdb.conn.read_to_polars(query)

# example of executing a query
self._perfdb.conn.execute(query)
```
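To make the key-naming and default-keys guidelines concrete, here is what a `get(output_type="dict")` result is shaped like (the column names below are made up for illustration, not an actual table's schema):

```python
# illustrative output of get(output_type="dict"): object names are the keys,
# ids live inside the values, and keys match the database column names
object_types = {
    "wtg": {"object_type_id": 1, "description": "Wind turbine generator"},
    "inverter": {"object_type_id": 2, "description": "PV inverter"},
}

# names-as-keys makes it easy to feed results into other calls
selected = [name for name, row in object_types.items() if row["object_type_id"] < 2]
print(selected)  # ['wtg']
```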
Common Methods¶
Note
Most classes implement at least some of the methods below to interact with the database. Note, however, that few classes implement all of them, as not every method is applicable to every class.
- `get_ids`

    Method used to list all instances of the desired class. For example, if `perfdb.objects.types.get_ids()` is called, the ids of the possible object types are returned.

    For consistency, all `get_ids` methods are expected to return a dict containing the names as keys and the ids as values.

    There might be cases where arguments are required to filter the results. When there are multiple filter arguments, an argument called `filter_type` with possible values of `"and"` and `"or"` defines how the filters are combined. The default value is `"and"`.

- `get`

    Method used to get the values of the desired class. For example, if `perfdb.objects.types.get()` is called, all the possible object types are returned, including more data than just the ids and names. This method can therefore be seen as an extension of the `get_ids` method.

    As the return value can vary depending on the class, all `get` methods are expected to have an argument called `output_type` that defines the format of the output. The supported values are `"dict"`, `"DataFrame"` (Pandas, kept for backwards compatibility), and `"pl.DataFrame"` (Polars, preferred). New methods should default to `"pl.DataFrame"` unless there is a strong reason to do otherwise.

    Like `get_ids`, this method may require arguments to filter the results, combined via the same `filter_type` argument (`"and"` or `"or"`, defaulting to `"and"`).

- `insert`

    Method used to insert a new instance of the desired class. For example, if `perfdb.objects.types.insert()` is called, a new object type is created in PerformanceDB.

    In most cases it should have an argument called `on_conflict` to define how conflicts are handled. Its possible values are:

    - `raise`: Raises an exception. This is the default value.
    - `ignore`: Ignores the error and continues with the execution.
    - `update`: Updates the existing instance with the new values.

- `update`

    Method used to update an existing instance of the desired class. For example, if `perfdb.objects.types.update()` is called, an existing object type is updated in PerformanceDB.

- `delete`

    Method used to delete an existing instance of the desired class. For example, if `perfdb.objects.types.delete()` is called, an existing object type is deleted from PerformanceDB.

- `export_file`

    Method used to export the data of the desired class. For example, if `perfdb.objects.types.export_file()` is called, all the object types are exported to the desired file type, which is inferred from the file extension.

- `import_file`

    Method used to import data of the desired class. For example, if `perfdb.objects.types.import_file()` is called, object types are imported from the given file.
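The `on_conflict` contract can be sketched with a toy in-memory store (illustrative only; the real `insert` methods operate on PerformanceDB tables):

```python
def insert(store: dict, name: str, values: dict, on_conflict: str = "raise") -> None:
    """Toy illustration of the on_conflict behavior described above."""
    if name in store:
        if on_conflict == "raise":
            raise ValueError(f"{name!r} already exists")
        if on_conflict == "ignore":
            return  # keep the existing instance untouched
        if on_conflict == "update":
            store[name].update(values)  # overwrite with the new values
            return
        raise ValueError(f"unknown on_conflict value: {on_conflict!r}")
    store[name] = dict(values)


types = {"wtg": {"description": "wind turbine"}}
insert(types, "wtg", {"description": "WTG"}, on_conflict="ignore")  # no-op
insert(types, "wtg", {"description": "WTG"}, on_conflict="update")  # overwrites
print(types["wtg"]["description"])  # WTG
```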
Internal Utilities for Developers¶
The following utilities are available for developing new modules or refactoring existing ones. They reduce boilerplate and enforce consistency across the codebase.
BaseLookupTable¶
Base class for simple lookup/type tables that only expose `get_ids()` and `get()`. Instead of writing ~70 lines of boilerplate per module, subclasses need only a few lines.
Location: `echo_postgres/base_lookup.py`
Usage:
```python
from echo_postgres.base_lookup import BaseLookupTable


class EventTypes(BaseLookupTable):
    """Class used for handling event types. Can be accessed via `perfdb.events.types`."""

    _table_name = "event_types"
```
This automatically provides `get_ids() -> dict[str, int]` and `get(output_type=...) -> dict | DataFrame | pl.DataFrame` methods that query `performance.event_types`.
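How the class attributes drive the generated query can be sketched roughly like this (a simplified toy, not the actual implementation in `base_lookup.py`):

```python
class BaseLookupTable:
    """Toy version: class attributes parameterize the lookup queries."""

    _schema_name = "performance"
    _index_col = "name"
    _id_col = "id"
    _table_name = None  # must be set by subclasses

    @classmethod
    def _get_ids_query(cls) -> str:
        # the real class executes this through the shared connection handler
        return (
            f"SELECT {cls._index_col}, {cls._id_col} "
            f"FROM {cls._schema_name}.{cls._table_name}"
        )


class EventTypes(BaseLookupTable):
    _table_name = "event_types"


print(EventTypes._get_ids_query())
# SELECT name, id FROM performance.event_types
```

Overriding `_schema_name`, `_index_col`, or `_id_col` in a subclass changes the generated query accordingly.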
Configurable class attributes:
| Attribute | Default | Description |
|---|---|---|
| `_table_name` | (required) | Table or view name (e.g., `"event_types"`) |
| `_schema_name` | `"performance"` | Schema name |
| `_index_col` | `"name"` | Column used as index/key in results |
| `_id_col` | `"id"` | Column used as the ID in `get_ids()` |
| `_cols_schema` | `None` | Polars schema overrides dict for `read_to_polars()` |
When to use: For any module that queries a single table with `SELECT name, id` and `SELECT *`, has no `_check_get_args`, and has no extra methods beyond `get()` and `get_ids()`.
When NOT to use: For modules that have custom filtering parameters, additional methods (e.g., `insert`, `delete`), or complex query logic.
WhereClauseBuilder¶
Fluent builder for SQL WHERE clauses using `psycopg.sql` safe composition. Replaces the repetitive `_check_get_args()` boilerplate found across 60+ modules.
Location: `echo_postgres/query_utils.py`
Usage:
```python
from echo_postgres.query_utils import WhereClauseBuilder


def _check_get_args(self, descriptions, sap_ids, is_active, base_units, filter_type):
    return (
        WhereClauseBuilder(filter_type=filter_type)
        .add_any("description", descriptions)
        .add_any("sap_id", sap_ids)
        .add_eq("is_active", is_active)
        .add_any("base_unit", base_units)
        .build()
    )
```
Available methods:
| Method | SQL Pattern | Example |
|---|---|---|
| `add_in(col, values)` | `col IN (v1, v2, ...)` | `add_in("name", ["a", "b"])` |
| `add_any(col, values)` | `col = ANY(array)` | `add_any("id", [1, 2, 3])` |
| `add_eq(col, value)` | `col = value` | `add_eq("is_active", True)` |
| `add_between(col, start, end)` | `col BETWEEN start AND end` | `add_between("date", d1, d2)` |
| `add_date_range(col, start, end)` | `(col >= start AND col <= end)` | `add_date_range("date", d1, d2)` |
| `add_overlaps(s_col, e_col, start, end)` | `(s_col, e_col) OVERLAPS (start, end)` | `add_overlaps("start", "end", d1, d2)` |
| `add_regex(col, pattern)` | `col ~* pattern` | `add_regex("comment", "error.*")` |
| `add_regex_any(col, patterns)` | `col ~* ANY(ARRAY[...])` | `add_regex_any("name", ["a.*", "b.*"])` |
| `add_array_overlap(col, values)` | `col && ARRAY[v1, ...]` | `add_array_overlap("labels", ["x"])` |
| `add_is_null(col)` | `col IS NULL` | `add_is_null("deleted_at")` |
| `add_fts(col, query)` | `col @@ plainto_tsquery(lang, query)` | `add_fts("search_vector", "motor")` |
| `add_raw(condition)` | (any `sql.Composable`) | `add_raw(sql.SQL("FALSE"))` |
All methods skip the condition when the value is `None` or empty (except `add_eq` with `skip_none=False`, `add_is_null`, and `add_raw`). The `build()` method returns `sql.SQL("")` if no conditions were added. It also accepts a `prefix` parameter (default `" WHERE "`) to customize the prefix of the WHERE clause.
Tip
Prefer `add_any()` over `add_in()` for new code: PostgreSQL handles `= ANY(array)` better with prepared statement caching.
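The skip-when-empty behavior can be illustrated with a stripped-down, string-based stand-in (the real builder composes `psycopg.sql` objects rather than strings, so this is only a sketch of the contract):

```python
class MiniWhereBuilder:
    """String-based toy mirroring the skip-None/empty contract of WhereClauseBuilder."""

    def __init__(self, filter_type: str = "and") -> None:
        self._joiner = f" {filter_type.upper()} "
        self._conds: list[str] = []

    def add_eq(self, col: str, value) -> "MiniWhereBuilder":
        if value is not None:
            self._conds.append(f"{col} = {value!r}")
        return self

    def add_any(self, col: str, values) -> "MiniWhereBuilder":
        if values:  # skipped when None or empty
            self._conds.append(f"{col} = ANY({list(values)!r})")
        return self

    def build(self, prefix: str = " WHERE ") -> str:
        if not self._conds:
            return ""  # the real builder returns sql.SQL("")
        return prefix + self._joiner.join(self._conds)


clause = (
    MiniWhereBuilder(filter_type="and")
    .add_any("sap_id", None)  # skipped entirely
    .add_eq("is_active", True)
    .add_any("name", ["a", "b"])
    .build()
)
print(clause)  # " WHERE is_active = True AND name = ANY(['a', 'b'])"
```

Because skipped filters simply vanish from the clause, callers can pass all their optional arguments straight through without `if` guards.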
convert_output()¶
Standard helper to convert a Polars DataFrame to the requested output format (`dict`, `DataFrame`, or `pl.DataFrame`). Replaces the 4-line pattern that was duplicated across 140+ files.
Location: `echo_postgres/query_utils.py`
Usage:
```python
from echo_postgres.query_utils import convert_output

# in a get() method:
df = self._perfdb.conn.read_to_polars(query)
return convert_output(df, output_type)

# with a custom index column:
return convert_output(df, output_type, index_col="id")

# with multi-level index and nested dict output:
return convert_output(df, output_type, index_col=["object_model_name", "name"], nest_by_index=True)

# with list-of-dicts output:
return convert_output(df, output_type, orient="records")

# dropping id columns from dict output:
return convert_output(df, output_type, drop_id_cols=True)
```
Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
| `df` | `pl.DataFrame` | (required) | The Polars DataFrame to convert |
| `output_type` | `Literal["dict", "DataFrame", "pl.DataFrame"]` | (required) | Desired output format |
| `index_col` | `str \| list[str]` | `"name"` | Column(s) to use as the index/key. Accepts a list for MultiIndex |
| `orient` | `Literal["index", "records"]` | `"index"` | Dict orientation. `"index"` produces `{key: {col: val}}`, `"records"` produces `[{col: val}, ...]` |
| `drop_id_cols` | `bool` | `False` | If `True`, drop columns ending with `_id` before dict conversion. Only affects dict output |
| `nest_by_index` | `bool` | `False` | If `True` and `index_col` is a list, build nested dicts where each index column becomes a nesting level. Only affects dict output |
| `values_only_key` | `str \| None` | `None` | When `nest_by_index` is `True`, extract only this key from the innermost data dict |
The function also automatically converts `NaN` keys to `None` in dict output.
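The difference between the two `orient` values and the `nest_by_index` option can be shown with plain Python structures (a list of row dicts stands in for the Polars DataFrame here; the column names are made up for illustration):

```python
rows = [
    {"object_model_name": "V150", "name": "rated_power", "value": 4200},
    {"object_model_name": "V150", "name": "rotor_diameter", "value": 150},
]

# orient="index" (default): {key: {col: val}}, keyed by index_col
by_index = {r["name"]: {k: v for k, v in r.items() if k != "name"} for r in rows}
print(by_index["rated_power"])  # {'object_model_name': 'V150', 'value': 4200}

# orient="records": [{col: val}, ...] -- the rows stay a plain list
records = list(rows)

# nest_by_index=True with index_col=["object_model_name", "name"]:
# each index column becomes one nesting level
nested: dict = {}
for r in rows:
    inner = {k: v for k, v in r.items() if k not in ("object_model_name", "name")}
    nested.setdefault(r["object_model_name"], {})[r["name"]] = inner
print(nested["V150"]["rated_power"])  # {'value': 4200}
```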
Reference¶
The scope of this documentation doesn't go as far as explaining how the data is structured in the database. More details on this can be found in the PerformanceDB documentation using password p3rf0rM#nce.