CCEE IPCA¶

`CceeIpca(perfdb)` ¶

Class used for handling IPCA. Can be accessed via perfdb.ccee.ipca.

Parameters:

perfdb ¶
(PerfDB) –

Top level object carrying all functionality and the connection handler.

Source code in echo_postgres/perfdb_root.py

Python

def __init__(self, perfdb: e_pg.PerfDB) -> None:
    """Base class that all subclasses should inherit from.

    Parameters
    ----------
    perfdb : PerfDB
        Top level object carrying all functionality and the connection handler.

    """
    self._perfdb: e_pg.PerfDB = perfdb

`delete(period)` ¶

Deletes IPCA values from the database.

Parameters:

period ¶
(DateTimeRange) –

Period for which to delete the IPCA values.

Source code in echo_postgres/ccee_ipca.py

Python

@validate_call
def delete(
    self,
    period: DateTimeRange,
) -> None:
    """Deletes IPCA values from the database.

    Parameters
    ----------
    period : DateTimeRange
        Period for which to delete the IPCA values.
    """
    # defining the query
    query = [
        sql.SQL("DELETE FROM performance.ccee_ipca"),
        sql.SQL(" WHERE month >= {start} AND month <= {end}").format(
            start=sql.Literal(f"{period.start:%Y-%m-%d %H:%M:%S}"),
            end=sql.Literal(f"{period.end:%Y-%m-%d %H:%M:%S}"),
        ),
    ]

    query = sql.Composed(query)

    # deleting from the database
    self._perfdb.conn.execute(query)

    logger.debug(f"Deleted {self._perfdb.conn.rowcount} rows from performance.ccee_ipca table")

`get(period, output_type='DataFrame')` ¶

Gets IPCA values from the database.

The values are acquired automatically from IBGE and stored in the database. This means that we might have a delay of up to 10 days to get the IPCA values for the last month, as that is the time it takes for IBGE to release the data.

Parameters:

period ¶
(DateTimeRange) –

Period for which to get the IPCA values.
output_type ¶
(Literal['DataFrame', 'pl.DataFrame'], default: 'DataFrame' ) –

Output type of the data. Can be one of ["DataFrame", "pl.DataFrame"] By default "DataFrame"

Returns:

DataFrame –

In case output_type is "DataFrame", returns a pandas DataFrame with the IPCA values.
DataFrame –
In case output_type is "pl.DataFrame", returns a Polars DataFrame with the IPCA values.

It will have the columns:
- month (datetime), which represents the month of the IPCA value
- index_value (float), which represents the IPCA value for the month
- change_month (float), which represents the percentage change from the previous month
- change_ytd (float), which represents the percentage change from the beginning of the year
- change_12m (float), which represents the percentage change from the same month of the previous year (12 months ago)

Source code in echo_postgres/ccee_ipca.py

Python

@validate_call
def get(
    self,
    period: DateTimeRange,
    output_type: Literal["DataFrame", "pl.DataFrame"] = "DataFrame",
) -> pd.DataFrame | pl.DataFrame:
    """Gets IPCA values from the database.

    The values are acquired automatically from IBGE and stored in the database. This means that we might have a delay of up to 10 days to get the IPCA values for the last month, as that is the time it takes for IBGE to release the data.

    Parameters
    ----------
    period : DateTimeRange
        Period for which to get the IPCA values.
    output_type : Literal["DataFrame", "pl.DataFrame"], optional
        Output type of the data. Can be one of ["DataFrame", "pl.DataFrame"]
        By default "DataFrame"

    Returns
    -------
    pd.DataFrame
        In case output_type is "DataFrame", returns a pandas DataFrame with the IPCA values.
    pl.DataFrame
        In case output_type is "pl.DataFrame", returns a Polars DataFrame with the IPCA values.

        It will have the columns:

        - month (datetime), which represents the month of the IPCA value
        - index_value (float), which represents the IPCA value for the month
        - change_month (float), which represents the percentage change from the previous month
        - change_ytd (float), which represents the percentage change from the beginning of the year
        - change_12m (float), which represents the percentage change from the same month of the previous year (12 months ago)
    """
    # defining the query
    query = sql.SQL("SELECT * FROM v_ccee_ipca WHERE month >= {start} AND month <= {end}").format(
        start=sql.Literal(f"{period.start:%Y-%m-%d %H:%M:%S}"),
        end=sql.Literal(f"{period.end:%Y-%m-%d %H:%M:%S}"),
    )

    # getting the data
    df = self._perfdb.conn.read_to_polars(query)

    if output_type == "pl.DataFrame":
        return df

    return df.to_pandas(use_pyarrow_extension_array=True)

`insert(df, on_conflict='ignore')` ¶

Inserts IPCA values into the database.

Parameters:

df ¶
(DataFrame | DataFrame) –
DataFrame with IPCA values to be inserted. Must have the columns:
- month (datetime)
- value (float)
on_conflict ¶
(Literal['ignore', 'update'], default: 'ignore' ) –

What to do in case of conflict. Can be one of ["ignore", "update"]. By default "ignore"

Source code in echo_postgres/ccee_ipca.py

Python

@validate_call
def insert(
    self,
    df: pd.DataFrame | pl.DataFrame,
    on_conflict: Literal["ignore", "update"] = "ignore",
) -> None:
    """Inserts IPCA values into the database.

    Parameters
    ----------
    df : pd.DataFrame | pl.DataFrame
        DataFrame with IPCA values to be inserted. Must have the columns:

        - month (datetime)
        - value (float)

    on_conflict : Literal["ignore", "update"], optional
        What to do in case of conflict. Can be one of ["ignore", "update"].
        By default "ignore"
    """
    # converting to pl.DataFrame if necessary
    if isinstance(df, pd.DataFrame):
        df = pl.from_pandas(df)

    # checking columns
    required_cols = [
        "month",
        "value",
    ]

    if any(col not in df.columns for col in required_cols):
        missing_cols = [col for col in required_cols if col not in df.columns]
        raise ValueError(f"df is missing the following columns: {missing_cols}")

    # making a copy of the DataFrame
    df = df.clone()

    # getting only the wanted columns
    df = df.select(required_cols)

    # casting month to datetime to avoid errors
    df = df.with_columns(pl.col("month").cast(pl.Datetime("ms")))  # assuming month is in milliseconds, adjust if necessary

    # casting value to float to avoid errors
    df = df.with_columns(pl.col("value").cast(pl.Float64))

    # renaming value to index_value
    df = df.rename({"value": "index_value"})

    # dropping duplicates
    df = df.unique()

    # inserting
    if_exists_mapping = {
        "ignore": "append",
        "update": "update",
    }
    self._perfdb.conn.polars_to_sql(
        df=df,
        table_name="ccee_ipca",
        schema="performance",
        if_exists=if_exists_mapping[on_conflict],
    )

    logger.debug(f"Inserted {df.shape[0]} rows into ccee_ipca table")

CCEE IPCA¶

`CceeIpca(perfdb)` ¶

`perfdb` ¶

`delete(period)` ¶

`period` ¶

`get(period, output_type='DataFrame')` ¶

`period` ¶

`output_type` ¶

`insert(df, on_conflict='ignore')` ¶

`df` ¶

`on_conflict` ¶

CCEE IPCA¶

CceeIpca(perfdb) ¶

perfdb ¶

delete(period) ¶

period ¶

get(period, output_type='DataFrame') ¶

period ¶

output_type ¶

insert(df, on_conflict='ignore') ¶

df ¶

on_conflict ¶

`CceeIpca(perfdb)` ¶

`perfdb` ¶

`delete(period)` ¶

`period` ¶

`get(period, output_type='DataFrame')` ¶

`period` ¶

`output_type` ¶

`insert(df, on_conflict='ignore')` ¶

`df` ¶

`on_conflict` ¶