Vibration Spectrum¶

For more details on vibration data see this dedicated page in the reference section.

`VibrationSpectrum(perfdb)` ¶

Class for handling vibration spectrum data. Can be accessed via `perfdb.vibration.spectrum`.

Parameters:

perfdb ¶
(PerfDB) –

Top level object carrying all functionality and the connection handler.

Source code in echo_postgres/perfdb_root.py

def __init__(self, perfdb: e_pg.PerfDB) -> None:
    """Base class that all subclasses should inherit from.

    Parameters
    ----------
    perfdb : PerfDB
        Top level object carrying all functionality and the connection handler.

    """
    self._perfdb: e_pg.PerfDB = perfdb

`get(period, manufacturer, data_type='Vibration', object_names=None, unit='Order', spectrum_type='Normal', amplitude_type='Peak', sensors=None, acquisition_frequencies=None, variable_names=None, filter_type='and', output_type='DataFrame')` ¶

Gets the vibration Spectrum data.

Currently, this will process the data in two different ways depending on the manufacturer of the turbine:

- Gamesa: Time series data will be converted to a spectrum using an FFT. If envelope is selected, the Hilbert transform will be used to get the envelope of the signal before the FFT.
- GE: Spectrum data will be used directly, converting the frequency axis to be relative to the generator shaft speed.
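The Gamesa path described above can be sketched on a synthetic signal. This is a minimal illustration, not the library's implementation: the function name `amplitude_spectrum` and the test signal are assumptions made for the example. Like the source shown further down, it doubles every bin except DC; for an even number of samples the last (Nyquist) bin has no negative-frequency counterpart, so its doubled value should be read with care.

```python
import numpy as np
from scipy.fft import rfft, rfftfreq
from scipy.signal import hilbert


def amplitude_spectrum(signal, sampling_time, envelope=False):
    """Return (frequencies_hz, peak_amplitudes) for a real-valued signal.

    Hypothetical helper written for this example only.
    """
    x = np.asarray(signal, dtype=float)
    if envelope:
        # Envelope of the signal: magnitude of the analytic signal
        x = np.abs(hilbert(x))
    n = x.size
    # The FFT is unnormalized, so divide the magnitude by the number of samples
    magnitude = np.abs(rfft(x)) / n
    # Double every bin except DC to account for the negative frequencies
    magnitude[1:] *= 2
    return rfftfreq(n, d=sampling_time), magnitude


# Synthetic check: 50 Hz sine with peak amplitude 3, sampled at 1 kHz for 1 s
fs = 1000.0
t = np.arange(1000) / fs
freqs, amps = amplitude_spectrum(3.0 * np.sin(2 * np.pi * 50.0 * t), sampling_time=1 / fs)
peak_bin = int(np.argmax(amps))
print(freqs[peak_bin], amps[peak_bin])
```

Because the sine completes a whole number of cycles in the window, the largest bin sits at 50 Hz and its amplitude matches the peak amplitude of the input up to floating-point error.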

The values are two-dimensional numpy ndarrays of shape [2, N]: the first row holds the frequency (in hertz or orders) and the second row holds the amplitude.

If array is the content of one cell of the "value" column, array[0, :] gives the frequencies and array[1, :] gives the amplitudes.
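As a small illustration of that layout, the snippet below builds a made-up [2, N] array and splits it into its two rows. The numbers are invented for the example and do not come from real turbine data.

```python
import numpy as np

# Hypothetical spectrum with the same [2, N] layout as one "value" cell
array = np.array(
    [
        [0.0, 0.5, 1.0, 1.5],  # row 0: frequency axis (Hz or orders)
        [0.1, 0.8, 2.4, 0.3],  # row 1: amplitude at each frequency
    ]
)

frequencies = array[0, :]
amplitudes = array[1, :]

# Frequency at which the amplitude is largest
dominant_frequency = frequencies[np.argmax(amplitudes)]
print(dominant_frequency)
```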

Parameters:

period ¶
(DateTimeRange | list[date]) –
Can be a DateTimeRange or a list of dates.
- If DateTimeRange, will get the data for the entire range (limiting on start and end)
- If list of dates, will get the data for each date in the list.
manufacturer ¶
(Literal['Gamesa', 'GE']) –

Manufacturer of the wind turbine. Either Gamesa or GE.
data_type ¶
(Literal['Vibration'], default: 'Vibration' ) –

Type of the data to get. Can be one of ["Vibration"]
object_names ¶
(list[str] | None, default: None ) –

Names of the objects to check. If None will check for all objects. By default None
unit ¶
(Literal['Hz', 'Order'], default: 'Order' ) –

Unit of the frequency. Can be one of ["Hz", "Order"]. If Order is used, it will always be relative to the HSS/Generator Shaft speed. By default "Order"
spectrum_type ¶
(Literal['Normal', 'Envelope'], default: 'Normal' ) –

What kind of spectrum should be returned. By default "Normal"
amplitude_type ¶
(Literal['RMS', 'Peak', 'Peak-to-Peak'], default: 'Peak' ) –

Type of amplitude to return. Can be one of ["RMS", "Peak", "Peak-to-Peak"].

For GE turbines only Peak is allowed. Peak-to-Peak and RMS are not supported as the data is acquired directly as spectrum.

By default "Peak"
sensors ¶
(list[VIBRATION_GE_ALLOWED_SENSOR_NAMES | VIBRATION_GAMESA_ALLOWED_SENSOR_NAMES] | None, default: None ) –
List of the sensors to get the data for. The options are as shown below:
- GE: "Planetary", "LSS", "HSS", "Generator RS", "Generator GS", "Main Bearing", "Tower Sway Axial", "Tower Sway Transverse"
- Gamesa Vibration: "1 - Generator GS - Radial", "2 - Planetary - Axial", "3 - Main Bearing GS - Radial", "4 - HSS - Radial", "5 - Main Bearing RS - Axial", "6 - HSS - Axial", "7 - Generator RS - Axial", "8 - Generator RS - Radial"
These must be specified with the matching manufacturer and cannot be mixed. If GE is selected only GE sensors are allowed and vice versa.

By default None.
acquisition_frequencies ¶
(list[Literal['Low', 'High', 'Filter']] | None, default: None ) –

Acquisition frequency, only applicable for Gamesa turbines. By default, None
variable_names ¶
(list[Literal['Acceleration - X', 'Acceleration - Y', 'Position - X', 'Position - Y']] | None, default: None ) –

Names of the variables to filter by.
filter_type ¶
(Literal['and', 'or'], default: 'and' ) –

How to treat multiple filters. Can be one of ["and", "or"]. By default "and"
output_type ¶
(Literal['dict', 'DataFrame'], default: 'DataFrame' ) –

Output type of the data. Can be one of ["dict", "DataFrame"]. By default "DataFrame"
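The `unit` and `amplitude_type` options above boil down to two simple conversions, sketched here under stated assumptions. The helper names `hz_to_orders` and `convert_peak_amplitude` are hypothetical and exist only for this example; the arithmetic mirrors what the source shown further down does for Gamesa data (orders are frequency divided by shaft speed in revolutions per second, and the DC bin is left untouched).

```python
import numpy as np


def hz_to_orders(frequencies_hz, shaft_speed_rpm):
    """Express frequencies as multiples of the shaft rotation frequency."""
    shaft_frequency_hz = shaft_speed_rpm / 60.0
    return np.asarray(frequencies_hz, dtype=float) / shaft_frequency_hz


def convert_peak_amplitude(peak, amplitude_type="Peak"):
    """Convert single-sided peak amplitudes; the DC bin (index 0) is unchanged."""
    out = np.array(peak, dtype=float)  # work on a copy
    if amplitude_type == "Peak-to-Peak":
        out[1:] *= 2.0
    elif amplitude_type == "RMS":
        out[1:] /= np.sqrt(2.0)
    elif amplitude_type != "Peak":
        raise ValueError(f"Amplitude type {amplitude_type} is not valid")
    return out


# A shaft turning at 1500 rpm rotates 25 times per second,
# so 25 Hz is order 1 and 50 Hz is order 2
orders = hz_to_orders([0.0, 25.0, 50.0], shaft_speed_rpm=1500.0)
rms = convert_peak_amplitude([1.0, 2.0, 4.0], "RMS")
peak_to_peak = convert_peak_amplitude([1.0, 2.0, 4.0], "Peak-to-Peak")
```

Working in orders keeps shaft-related peaks at the same position on the axis even when the rotational speed changes between measurements.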

Returns:

DataFrame –

DataFrame with a MultiIndex[object_name, raw_data_name, timestamp] and columns: value, metadata. The value column contains a numpy ndarray with the spectrum (2D array with the frequency in the first dimension and the value in the second).
dict[str, dict[str, dict[datetime, dict[str, dict[str, Any]]]]] –

Dictionary in the format {object_name: {raw_data_name: {datetime: {value: value, metadata: metadata}, ...}, ...}, ...}
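To show how a result of this shape might be consumed, the sketch below builds a small stand-in DataFrame by hand and extracts the dominant peak of each spectrum. All object names, raw data names, timestamps, and values are invented; the index level names follow the description above, and the real index returned by `get` may differ.

```python
import numpy as np
import pandas as pd

# Hypothetical index, following the level names documented above
index = pd.MultiIndex.from_tuples(
    [
        ("WTG-01", "example_raw_data", pd.Timestamp("2024-01-01")),
        ("WTG-01", "example_raw_data", pd.Timestamp("2024-01-02")),
    ],
    names=["object_name", "raw_data_name", "timestamp"],
)

spectra = [
    np.array([[0.0, 1.0, 2.0], [0.05, 0.90, 0.20]]),
    np.array([[0.0, 1.0, 2.0], [0.04, 0.30, 0.70]]),
]

# An explicit object array keeps each 2D spectrum inside a single cell
value_column = np.empty(len(spectra), dtype=object)
for i, spectrum in enumerate(spectra):
    value_column[i] = spectrum

df = pd.DataFrame(
    {"value": value_column, "metadata": [{"generator_speed_rpm": 1500.0}] * 2},
    index=index,
)

# For each row, find the frequency with the largest amplitude
summary = pd.DataFrame(
    {
        "peak_frequency": df["value"].apply(lambda a: a[0, np.argmax(a[1, :])]),
        "peak_amplitude": df["value"].apply(lambda a: a[1, :].max()),
    }
)
print(summary)
```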

Source code in echo_postgres/vibration_spectrum.py

@validate_call
def get(
    self,
    period: DateTimeRange | list[date],
    manufacturer: Literal["Gamesa", "GE"],
    data_type: Literal["Vibration"] = "Vibration",
    object_names: list[str] | None = None,
    unit: Literal["Hz", "Order"] = "Order",
    spectrum_type: Literal["Normal", "Envelope"] = "Normal",
    amplitude_type: Literal["RMS", "Peak", "Peak-to-Peak"] = "Peak",
    sensors: list[VIBRATION_GE_ALLOWED_SENSOR_NAMES | VIBRATION_GAMESA_ALLOWED_SENSOR_NAMES] | None = None,
    acquisition_frequencies: list[Literal["Low", "High", "Filter"]] | None = None,
    variable_names: list[Literal["Acceleration - X", "Acceleration - Y", "Position - X", "Position - Y"]] | None = None,
    filter_type: Literal["and", "or"] = "and",
    output_type: Literal["dict", "DataFrame"] = "DataFrame",
) -> DataFrame | dict[str, dict[str, dict[datetime, dict[str, dict[str, Any]]]]]:
    """Gets the vibration Spectrum data.

    Currently, this will process the data in two different ways depending on the manufacturer of the turbine:

    - Gamesa: Time series data will be converted to spectrum using a FFT. If envelope is selected, the Hilbert transform will be used to get the envelope of the signal before the FFT.
    - GE: Spectrum data will be used directly, converting the frequency axis to be relative to the generator shaft speed.

    The values will be numpy ndarrays with two dimensions [2, :], where first dimension is the frequency (in hertz or orders) and the second is the value.

    Assuming array is a value of one row and column "value", if you want to get frequency of the array, you can do array[0, :] and for the value itself array[1, :]

    Parameters
    ----------
    period : DateTimeRange | list[date]
        Can be a DateTimeRange or a list of dates.

        - If DateTimeRange, will get the data for the entire range (limiting on start and end)
        - If list of dates, will get the data for each date in the list.

    manufacturer : Literal["Gamesa", "GE"]
        Manufacturer of the wind turbine. Either Gamesa or GE.
    data_type : Literal["Vibration"], optional
        Type of the data to get. Can be one of ["Vibration"]
    object_names : list[str] | None, optional
        Names of the objects to check. If None will check for all objects. By default None
    unit : Literal["Hz", "Order"], optional
        Unit of the frequency. Can be one of ["Hz", "Order"]. If Order is used, it will always be relative to the HSS/Generator Shaft speed.
        By default "Order"
    spectrum_type : Literal["Normal", "Envelope"], optional
        What kind of spectrum should be returned. By default "Normal"
    amplitude_type : Literal["RMS", "Peak", "Peak-to-Peak"], optional
        Type of amplitude to return. Can be one of ["RMS", "Peak", "Peak-to-Peak"].

        For GE turbines only Peak is allowed. Peak-to-Peak and RMS are not supported as the data is acquired directly as spectrum.

        By default "Peak"
    sensors : list[VIBRATION_GE_ALLOWED_SENSOR_NAMES | VIBRATION_GAMESA_ALLOWED_SENSOR_NAMES] | None
        List of the sensors to get the data for. The options are as shown below:

        - GE: "Planetary", "LSS", "HSS", "Generator RS", "Generator GS", "Main Bearing", "Tower Sway Axial", "Tower Sway Transverse"
        - Gamesa Vibration: "1 - Generator GS - Radial", "2 - Planetary - Axial", "3 - Main Bearing GS - Radial", "4 - HSS - Radial", "5 - Main Bearing RS - Axial", "6 - HSS - Axial", "7 - Generator RS - Axial", "8 - Generator RS - Radial"

        These must be specified with the matching manufacturer and cannot be mixed. If GE is selected only GE sensors are allowed and vice versa.

        By default None.
    acquisition_frequencies : list[Literal["Low", "High", "Filter"]] | None, optional
        Acquisition frequency, only applicable for Gamesa turbines. By default, None
    variable_names : list[Literal["Acceleration - X", "Acceleration - Y", "Position - X", "Position - Y"]] | None, optional
        Names of the variables to filter by.
    filter_type : Literal["and", "or"], optional
        How to treat multiple filters. Can be one of ["and", "or"]. By default "and"
    output_type : Literal["dict", "DataFrame"], optional
        Output type of the data. Can be one of ["dict", "DataFrame"]
        By default "DataFrame"

    Returns
    -------
    DataFrame
        DataFrame with a MultiIndex[object_name, raw_data_name, timestamp] and columns: value, metadata. The value column contains a numpy ndarray with the spectrum (2D array with the frequency in the first dimension and the value in the second).
    dict[str, dict[str, dict[datetime, dict[str, dict[str, Any]]]]]
        Dictionary in the format {object_name: {raw_data_name: {datetime: {value: value, metadata: metadata}, ...}, ...}, ...}

    """
    # checking arguments
    if output_type not in ["dict", "DataFrame"]:
        raise ValueError(f"output_type must be one of ['dict', 'DataFrame'], not {output_type}")
    if unit not in ["Hz", "Order"]:
        raise ValueError(f"unit must be one of Hz or Order, not {unit}")

    _, wanted_names = self._check_get_args(
        object_names=object_names,
        period=period,
        acquisition_frequencies=acquisition_frequencies,
        variable_names=variable_names,
        manufacturer=manufacturer,
        data_type=data_type,
        sensors=sensors,
        spectrum_type=spectrum_type,
        amplitude_type=amplitude_type,
        filter_type=filter_type,
    )

    match manufacturer:
        # * Gamesa -----------------------------------
        case "Gamesa":
            # getting time series
            df: DataFrame = self._perfdb.vibration.timeseries.get(
                period=period,
                object_names=object_names,
                manufacturer=manufacturer,
                data_type=data_type,
                sensors=sensors,
                acquisition_frequencies=acquisition_frequencies,
                variable_names=variable_names,
                filter_type=filter_type,
                output_type="DataFrame",
            )

            if not df.empty:
                # removing time component from time series
                df["value"] = df["value"].apply(lambda x: x[1, :])

                match spectrum_type:
                    # * Normal ---------------------------------
                    case "Normal":
                        # no need for further processing
                        pass
                    # * Envelope ---------------------------------
                    case "Envelope":
                        # performing hilbert transform to get envelope
                        df["value"] = df["value"].apply(lambda x: np.abs(hilbert(x)))
                    case _:
                        raise ValueError(f"Spectrum type {spectrum_type} is not valid")

                # getting number of samples in value
                df["n_samples"] = df["value"].apply(lambda x: x.size)
                # applying fft
                df["value"] = df["value"].apply(lambda x: np.abs(rfft(x, workers=4)))
                # normalize by number of samples
                # this is needed to get the correct amplitude as the fft is not normalized. See https://numpy.org/doc/stable/reference/routines.fft.html#module-numpy.fft for more details
                df["value"] = df[["value", "n_samples"]].apply(lambda x: x["value"] / x["n_samples"], axis=1)
                # double the magnitude to account for negative frequencies. DC component is not doubled
                # this is already the peak value
                df["value"] = df["value"].apply(lambda x: np.concatenate(([x[0]], x[1:] * 2)))
                # converting to Peak-to-Peak or RMS if necessary
                match amplitude_type:
                    case "Peak":
                        # already peak
                        pass
                    case "Peak-to-Peak":
                        # doubling the value, except for DC component
                        df["value"] = df["value"].apply(lambda x: np.concatenate(([x[0]], x[1:] * 2)))
                    case "RMS":
                        # converting to RMS by dividing by sqrt(2), except for DC component
                        df["value"] = df["value"].apply(lambda x: np.concatenate(([x[0]], x[1:] / np.sqrt(2))))
                    case _:
                        raise ValueError(f"Amplitude type {amplitude_type} is not valid")

                # getting frequencies
                df["frequencies"] = df[["n_samples", "sampling_time"]].apply(
                    lambda row: rfftfreq(n=int(row["n_samples"]), d=row["sampling_time"]),
                    axis=1,
                )
                # converting to orders in case necessary
                if unit == "Order":
                    df["frequencies"] = df[["frequencies", "metadata"]].apply(
                        lambda row: row["frequencies"] / (row["metadata"]["generator_speed_rpm"] / 60),
                        axis=1,
                    )
                # converting back to only one array with two dimensions in value column
                df["value"] = df[["value", "frequencies"]].apply(
                    lambda row: np.concatenate(
                        (
                            np.expand_dims(row["frequencies"], axis=0),
                            np.expand_dims(row["value"], axis=0),
                        ),
                        axis=0,
                    ),
                    axis=1,
                )

            # dropping unwanted columns
            df = df.drop(columns=["n_samples", "frequencies"], errors="ignore")

        # * GE ---------------------------------------
        case "GE":
            # checking which component types we need to get for the turbines
            if sensors is None:
                sensors = list(VIBRATION_CONFIG[manufacturer]["sensors"][data_type].keys())
            required_component_types = list(
                {VIBRATION_CONFIG[manufacturer]["sensors"][data_type][sensor]["ComponentType"] for sensor in sensors},
            )
            # adding Gearbox if it's not present - this is required as the gearbox holds the attributes with ratio between shafts
            if "Gearbox" not in required_component_types:
                required_component_types.append("Gearbox")
            required_component_types.sort()

            # getting the component instances for the wanted turbines and component types
            component_instances: DataFrame = self._perfdb.components.instances.history.get(
                object_names=object_names,
                component_types=required_component_types,
                period=period
                if isinstance(period, DateTimeRange)
                else None,  # only using period if it's a DateTimeRange, if it's a list of dates we just get all
                get_attributes=True,
            )
            # validating if all the required attributes are present
            self._validate_ge_component_attributes(components_df=component_instances)

            # getting raw data
            df: DataFrame = self._perfdb.rawdata.values.get(
                object_names=wanted_names["object_names"],
                raw_data_names=wanted_names["raw_data_names"],
                period=period,
                filter_type=filter_type,
            )

            # converting names
            df = self._convert_raw_names_to_sensor_names(
                df=df,
                manufacturer=manufacturer,
                spectrum_type=spectrum_type,
            )

            # converting from orders relative to its reference shaft to orders relative to the generator shaft
            df = self._convert_ge_freq_orders(spectrum_df=df, components_df=component_instances, spectrum_type=spectrum_type, unit=unit)

            # converting back to only one array with two dimensions in value column
            df["value"] = df[["value", "frequencies"]].apply(
                lambda row: np.concatenate(
                    (
                        np.expand_dims(row["frequencies"], axis=0),
                        np.expand_dims(row["value"], axis=0),
                    ),
                    axis=0,
                ),
                axis=1,
            )

            # dropping unwanted columns
            df = df.drop(columns=["frequencies"])
        case _:
            raise ValueError(f"Manufacturer {manufacturer} is not valid")

    if output_type == "DataFrame":
        return df

    # converting to dict
    result = df.to_dict(orient="index")
    final_result = {}
    for (object_name, sensor, acquisition_frequency, timestamp), values in result.items():
        if object_name not in final_result:
            final_result[object_name] = {}
        if sensor not in final_result[object_name]:
            final_result[object_name][sensor] = {}
        if acquisition_frequency not in final_result[object_name][sensor]:
            final_result[object_name][sensor][acquisition_frequency] = {}
        final_result[object_name][sensor][acquisition_frequency][timestamp] = values

    return final_result

`get_timestamps(manufacturer, data_type='Vibration', object_names=None, period=None, spectrum_type='Normal', sensors=None, acquisition_frequencies=None, variable_names=None, value_type='date', output_type='DataFrame')` ¶

Gets the timestamps/dates where there is vibration spectrum data available.

If you only want the timestamps as a list, set output_type to 'DataFrame' and call df["timestamp"].unique().tolist() on the result.
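A minimal sketch of that step on a hand-built DataFrame with the documented columns follows. The rows are invented for illustration, and the element type of the resulting list depends on the column dtype and the pandas version, so it is not asserted here.

```python
import pandas as pd

# Hypothetical result with the columns described for get_timestamps
df = pd.DataFrame(
    {
        "object_name": ["WTG-01", "WTG-01", "WTG-02"],
        "sensor": ["4 - HSS - Radial", "6 - HSS - Axial", "4 - HSS - Radial"],
        "acquisition_frequency": ["High", "High", "Low"],
        "timestamp": pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-01"]),
    }
)

# Distinct timestamps across all objects and sensors
timestamps = df["timestamp"].unique().tolist()
print(len(timestamps))
```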

Parameters:

manufacturer ¶
(Literal['Gamesa', 'GE']) –

Manufacturer of the wind turbine. Either Gamesa or GE.
data_type ¶
(Literal['Vibration'], default: 'Vibration' ) –

Type of the data to get. Can be one of ["Vibration"]
object_names ¶
(list[str] | None, default: None ) –

Names of the objects to check. If None will check for all objects. By default None
period ¶
(DateTimeRange | None, default: None ) –

Period to check. If None the entire raw_data_values table will be scanned. By default None
spectrum_type ¶
(Literal['Normal', 'Envelope'], default: 'Normal' ) –

What kind of spectrum should be returned. By default, Normal
sensors ¶
(list[VIBRATION_GE_ALLOWED_SENSOR_NAMES | VIBRATION_GAMESA_ALLOWED_SENSOR_NAMES] | None, default: None ) –
List of the sensors to get the data for. The options are as shown below:
- GE: "Planetary", "LSS", "HSS", "Generator RS", "Generator GS", "Main Bearing", "Tower Sway Axial", "Tower Sway Transverse"
- Gamesa: "1 - Generator GS - Radial", "2 - Planetary - Axial", "3 - Main Bearing GS - Radial", "4 - HSS - Radial", "5 - Main Bearing RS - Axial", "6 - HSS - Axial", "7 - Generator RS - Axial", "8 - Generator RS - Radial"
These must be specified with the matching manufacturer and cannot be mixed. If GE is selected only GE sensors are allowed and vice versa.

By default None.
acquisition_frequencies ¶
(list[Literal['Low', 'High', 'Filter']] | None, default: None ) –

Acquisition frequency, only applicable for Gamesa turbines. By default, None
variable_names ¶
(list[Literal['Acceleration - X', 'Acceleration - Y', 'Position - X', 'Position - Y']] | None, default: None ) –

Names of the variables to filter by.
value_type ¶
(Literal['timestamp', 'date'], default: 'date' ) –

If "timestamp", returns full datetimes; if "date", returns dates only (hour, minute, and second are dropped).
output_type ¶
(Literal['dict', 'DataFrame'], default: 'DataFrame' ) –

Output type of the data. Can be one of ["dict", "DataFrame"]. By default "DataFrame"

Returns:

DataFrame –

DataFrame with columns: object_name, sensor, acquisition_frequency, timestamp. Index can be ignored.
dict[str, dict[str, dict[str, list[date | datetime]]]] –

Dictionary in the format {object_name: {sensor: {acquisition_frequency: [date | datetime], ...}, ...}, ...}
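The nested dictionary format can be reproduced from a flat table with a groupby followed by a small loop. This sketch uses invented rows and `dict.setdefault` for brevity; it is an illustration of the shape, not a copy of the library code.

```python
import datetime as dt

import pandas as pd

# Hypothetical flat result, one row per available date
df = pd.DataFrame(
    {
        "object_name": ["WTG-01", "WTG-01", "WTG-02"],
        "sensor": ["4 - HSS - Radial", "4 - HSS - Radial", "6 - HSS - Axial"],
        "acquisition_frequency": ["High", "High", "Low"],
        "timestamp": [dt.date(2024, 1, 1), dt.date(2024, 1, 2), dt.date(2024, 1, 1)],
    }
)

# One list of dates per (object, sensor, acquisition frequency) combination
grouped = df.groupby(["object_name", "sensor", "acquisition_frequency"])["timestamp"].apply(list)

nested = {}
for (object_name, sensor, acquisition_frequency), values in grouped.to_dict().items():
    # setdefault creates the intermediate dictionaries on first use
    nested.setdefault(object_name, {}).setdefault(sensor, {})[acquisition_frequency] = values

print(nested["WTG-01"]["4 - HSS - Radial"]["High"])
```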

Source code in echo_postgres/vibration_spectrum.py

@validate_call
def get_timestamps(
    self,
    manufacturer: Literal["Gamesa", "GE"],
    data_type: Literal["Vibration"] = "Vibration",
    object_names: list[str] | None = None,
    period: DateTimeRange | None = None,
    spectrum_type: Literal["Normal", "Envelope"] = "Normal",
    sensors: list[VIBRATION_GE_ALLOWED_SENSOR_NAMES | VIBRATION_GAMESA_ALLOWED_SENSOR_NAMES] | None = None,
    acquisition_frequencies: list[Literal["Low", "High", "Filter"]] | None = None,
    variable_names: list[Literal["Acceleration - X", "Acceleration - Y", "Position - X", "Position - Y"]] | None = None,
    value_type: Literal["timestamp", "date"] = "date",
    output_type: Literal["dict", "DataFrame"] = "DataFrame",
) -> DataFrame | dict[str, dict[str, dict[str, list[date | datetime]]]]:
    """Gets the timestamps/dates where there is vibration spectrum data available.

    If you only want the timestamps as a list, set output_type to 'DataFrame' and call df["timestamp"].unique().tolist() on the result.

    Parameters
    ----------
    manufacturer : Literal["Gamesa", "GE"]
        Manufacturer of the wind turbine. Either Gamesa or GE.
    data_type : Literal["Vibration"], optional
        Type of the data to get. Can be one of ["Vibration"]
    object_names : list[str] | None, optional
        Names of the objects to check. If None will check for all objects. By default None
    period : DateTimeRange | None, optional
        Period to check. If None the entire raw_data_values table will be scanned. By default None
    spectrum_type : Literal["Normal", "Envelope"], optional
        What kind of spectrum should be returned. By default, Normal
    sensors : list[VIBRATION_GE_ALLOWED_SENSOR_NAMES | VIBRATION_GAMESA_ALLOWED_SENSOR_NAMES] | None
        List of the sensors to get the data for. The options are as shown below:

        - GE: "Planetary", "LSS", "HSS", "Generator RS", "Generator GS", "Main Bearing", "Tower Sway Axial", "Tower Sway Transverse"
        - Gamesa: "1 - Generator GS - Radial", "2 - Planetary - Axial", "3 - Main Bearing GS - Radial", "4 - HSS - Radial", "5 - Main Bearing RS - Axial", "6 - HSS - Axial", "7 - Generator RS - Axial", "8 - Generator RS - Radial"

        These must be specified with the matching manufacturer and cannot be mixed. If GE is selected only GE sensors are allowed and vice versa.

        By default None.
    acquisition_frequencies : list[Literal["Low", "High", "Filter"]] | None, optional
        Acquisition frequency, only applicable for Gamesa turbines. By default, None
    variable_names : list[Literal["Acceleration - X", "Acceleration - Y", "Position - X", "Position - Y"]] | None, optional
        Names of the variables to filter by.
    value_type : Literal["timestamp", "date"], optional
        If timestamp, will return timestamps as datetimes, if date will return as date (removing hour, minute, second).
    output_type : Literal["dict", "DataFrame"], optional
        Output type of the data. Can be one of ["dict", "DataFrame"]
        By default "DataFrame"

    Returns
    -------
    DataFrame
        DataFrame with columns: object_name, sensor, acquisition_frequency, timestamp. Index can be ignored.
    dict[str, dict[str, dict[str, list[date | datetime]]]]
        Dictionary in the format {object_name: {sensor: {acquisition_frequency: [date | datetime], ...}, ...}, ...}

    """
    # checking arguments
    if output_type not in ["dict", "DataFrame"]:
        raise ValueError(f"output_type must be one of ['dict', 'DataFrame'], not {output_type}")
    if value_type not in ["date", "timestamp"]:
        raise ValueError(f"value_type must be one of ['date', 'timestamp']. Got {value_type}")

    where, _ = self._check_get_args(
        object_names=object_names,
        period=period,
        acquisition_frequencies=acquisition_frequencies,
        data_type=data_type,
        variable_names=variable_names,
        manufacturer=manufacturer,
        sensors=sensors,
        spectrum_type=spectrum_type,
        amplitude_type=None,
        filter_type="and",
    )

    # building the query
    query = [
        sql.SQL("SELECT DISTINCT object_name, raw_data_name, timestamp{type_cast} FROM v_raw_data_values ").format(
            type_cast=sql.SQL("::DATE") if value_type == "date" else sql.SQL("::TIMESTAMP"),
        ),
        where,
        sql.SQL(" ORDER BY object_name, raw_data_name, timestamp"),
    ]
    query = sql.Composed(query)

    # executing the query
    with self._perfdb.conn.reconnect() as conn:
        df: DataFrame = conn.read_to_pandas(
            query,
            dtype={
                "object_name": "string[pyarrow]",
                "raw_data_name": "string[pyarrow]",
                "timestamp": "datetime64[s]",
            },
        )

    # converting names
    df = df.set_index(["object_name", "raw_data_name"])
    df = self._perfdb.vibration.spectrum._convert_raw_names_to_sensor_names(  # noqa: SLF001
        df=df,
        manufacturer=manufacturer,
        spectrum_type=spectrum_type,
    )
    df = df.reset_index(drop=False)

    if output_type == "DataFrame":
        return df

    # converting to dict
    df = df.groupby(["object_name", "sensor", "acquisition_frequency"])["timestamp"].apply(list)

    result = df.to_dict()
    final_result = {}
    for (object_name, sensor, acquisition_frequency), value in result.items():
        if object_name not in final_result:
            final_result[object_name] = {}
        if sensor not in final_result[object_name]:
            final_result[object_name][sensor] = {}
        final_result[object_name][sensor][acquisition_frequency] = value

    return final_result
