Vibration Spectrum
For more details on vibration data see this dedicated page in the reference section.
VibrationSpectrum(perfdb)
Class used for handling Vibration Spectrum. Can be accessed via perfdb.vibration.spectrum.
Parameters:
- perfdb (PerfDB) – Top level object carrying all functionality and the connection handler.
Source code in echo_postgres/perfdb_root.py
def __init__(self, perfdb: e_pg.PerfDB) -> None:
    """Base class that all subclasses should inherit from.

    Parameters
    ----------
    perfdb : PerfDB
        Top level object carrying all functionality and the connection handler.
    """
    self._perfdb: e_pg.PerfDB = perfdb
get(period, manufacturer, data_type='Vibration', object_names=None, unit='Order', spectrum_type='Normal', amplitude_type='Peak', sensors=None, acquisition_frequencies=None, variable_names=None, filter_type='and', output_type='DataFrame')
Gets the vibration Spectrum data.
Currently, this will process the data in two different ways depending on the manufacturer of the turbine:
- Gamesa: Time series data will be converted to a spectrum using an FFT. If envelope is selected, the Hilbert transform will be used to get the envelope of the signal before the FFT.
- GE: Spectrum data will be used directly, converting the frequency axis to be relative to the generator shaft speed.
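The Gamesa branch described above can be sketched on a synthetic signal. This is a minimal illustration under stated assumptions, not the library's implementation: it uses `numpy.fft` instead of the SciPy routines that appear in the source listing, it builds the analytic signal by hand for the envelope, and the helper names `analytic_envelope` and `peak_spectrum` are hypothetical.

```python
import numpy as np


def analytic_envelope(signal: np.ndarray) -> np.ndarray:
    """Envelope as the magnitude of the analytic signal (FFT-based Hilbert transform)."""
    n = signal.size
    spectrum = np.fft.fft(signal)
    # Weights that keep DC (and Nyquist for even n), double the positive
    # frequencies and zero the negative ones.
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1 : n // 2] = 2.0
    else:
        weights[1 : (n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(spectrum * weights))


def peak_spectrum(signal: np.ndarray, sampling_time: float, envelope: bool = False) -> np.ndarray:
    """Return a [2, :] array: row 0 = frequency in Hz, row 1 = peak amplitude."""
    if envelope:
        signal = analytic_envelope(signal)
    n = signal.size
    amplitude = np.abs(np.fft.rfft(signal)) / n  # the FFT itself is not normalized
    amplitude[1:] *= 2  # account for negative frequencies; DC is not doubled
    frequencies = np.fft.rfftfreq(n, d=sampling_time)
    return np.stack((frequencies, amplitude))


# Synthetic inputs (assumed values): 1 s sampled at 1 kHz.
fs = 1000.0
t = np.arange(1000) / fs

# A pure 50 Hz tone with peak amplitude 2.
tone = 2.0 * np.sin(2 * np.pi * 50 * t)
normal = peak_spectrum(tone, sampling_time=1 / fs)

# A 200 Hz carrier whose amplitude is modulated at 10 Hz.
modulated = (1 + 0.5 * np.cos(2 * np.pi * 10 * t)) * np.sin(2 * np.pi * 200 * t)
env = peak_spectrum(modulated, sampling_time=1 / fs, envelope=True)
```

In the normal spectrum the largest component sits at the tone frequency. In the envelope spectrum the carrier disappears and the largest non-DC component sits at the modulation frequency, which is why envelope analysis is used to expose repetitive modulation in a vibration signal.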
Each value is a two-dimensional numpy ndarray of shape [2, :]: the first row holds the frequency axis (in hertz or orders) and the second row holds the amplitude.
For an array taken from one row of the "value" column, array[0, :] gives the frequencies and array[1, :] gives the corresponding amplitudes.
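As a small illustration of that layout, the sketch below unpacks a stand-in array and locates its strongest non-DC component. The numbers are made up for the example and do not come from a real turbine.

```python
import numpy as np

# Hypothetical stand-in for one entry of the "value" column, shape [2, n].
spectrum = np.array(
    [
        [0.0, 0.5, 1.0, 1.5, 2.0],  # row 0: frequency axis (orders in this example)
        [0.10, 0.02, 0.85, 0.03, 0.40],  # row 1: amplitude
    ]
)

frequency = spectrum[0, :]
amplitude = spectrum[1, :]

# Skip index 0 (the DC component) when searching for the dominant peak.
strongest = int(np.argmax(amplitude[1:])) + 1
dominant_order = float(frequency[strongest])
dominant_amplitude = float(amplitude[strongest])
```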
Parameters:
- period (DateTimeRange | list[date]) – Can be a DateTimeRange or a list of dates.
    - If DateTimeRange, will get the data for the entire range (limiting on start and end)
    - If list of dates, will get the data for each date in the list.
- manufacturer (Literal['Gamesa', 'GE']) – Manufacturer of the wind turbine. Either Gamesa or GE.
- data_type (Literal['Vibration'], default: 'Vibration') – Type of the data to get. Can be one of ["Vibration"]
- object_names (list[str] | None, default: None) – Names of the objects to check. If None will check for all objects. By default None
- unit (Literal['Hz', 'Order'], default: 'Order') – Unit of the frequency. Can be one of ["Hz", "Order"]. If Order is used, it will always be relative to the HSS/Generator Shaft speed. By default "Order"
- spectrum_type (Literal['Normal', 'Envelope'], default: 'Normal') – What kind of spectrum should be returned. By default "Normal"
- amplitude_type (Literal['RMS', 'Peak', 'Peak-to-Peak'], default: 'Peak') – Type of amplitude to return. Can be one of ["RMS", "Peak", "Peak-to-Peak"].
  For GE turbines only Peak is allowed. Peak-to-Peak and RMS are not supported as the data is acquired directly as spectrum.
  By default "Peak"
- sensors (list[VIBRATION_GE_ALLOWED_SENSOR_NAMES | VIBRATION_GAMESA_ALLOWED_SENSOR_NAMES] | None, default: None) – List of the sensors to get the data for. The options are as shown below:
    - GE: "Planetary", "LSS", "HSS", "Generator RS", "Generator GS", "Main Bearing", "Tower Sway Axial", "Tower Sway Transverse"
    - Gamesa Vibration: "1 - Generator GS - Radial", "2 - Planetary - Axial", "3 - Main Bearing GS - Radial", "4 - HSS - Radial", "5 - Main Bearing RS - Axial", "6 - HSS - Axial", "7 - Generator RS - Axial", "8 - Generator RS - Radial"
  These must be specified with the matching manufacturer and cannot be mixed. If GE is selected only GE sensors are allowed and vice versa.
  By default None.
- acquisition_frequencies (list[Literal['Low', 'High', 'Filter']] | None, default: None) – Acquisition frequency, only applicable for Gamesa turbines. By default, None
- variable_names (list[Literal['Acceleration - X', 'Acceleration - Y', 'Position - X', 'Position - Y']] | None, default: None) – Names of the variables to filter by.
- filter_type (Literal['and', 'or'], default: 'and') – How to treat multiple filters. Can be one of ["and", "or"]. By default "and"
- output_type (Literal['dict', 'DataFrame'], default: 'DataFrame') – Output type of the data. Can be one of ["dict", "DataFrame"]. By default "DataFrame"
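The effect of `unit` and `amplitude_type` on a peak spectrum can be sketched with plain numpy. This is an illustration of the conversions visible in the source listing (division by the shaft rotation frequency, a factor of 2 for Peak-to-Peak, a factor of 1/sqrt(2) for RMS, DC left unchanged). The helper names `to_orders` and `convert_amplitude` and the 1500 rpm generator speed are assumptions made for the example.

```python
import numpy as np


def to_orders(frequencies_hz: np.ndarray, generator_speed_rpm: float) -> np.ndarray:
    """Express frequencies as multiples of the generator shaft rotation frequency."""
    shaft_frequency_hz = generator_speed_rpm / 60
    return frequencies_hz / shaft_frequency_hz


def convert_amplitude(peak: np.ndarray, amplitude_type: str) -> np.ndarray:
    """Convert a peak spectrum; the DC bin (index 0) is left untouched."""
    if amplitude_type == "Peak":
        return peak.copy()
    if amplitude_type == "Peak-to-Peak":
        factor = 2.0
    elif amplitude_type == "RMS":
        factor = 1 / np.sqrt(2)
    else:
        raise ValueError(f"Amplitude type {amplitude_type} is not valid")
    return np.concatenate(([peak[0]], peak[1:] * factor))


# Made-up peak spectrum in hertz.
frequencies_hz = np.array([0.0, 25.0, 50.0, 75.0])
peak = np.array([0.2, 1.0, 2.0, 0.5])

# Assumed generator speed of 1500 rpm, i.e. a 25 Hz shaft frequency.
orders = to_orders(frequencies_hz, generator_speed_rpm=1500.0)
rms = convert_amplitude(peak, "RMS")
peak_to_peak = convert_amplitude(peak, "Peak-to-Peak")
```

With a 25 Hz shaft frequency, the 50 Hz component lands on order 2, which makes spectra taken at different rotational speeds directly comparable.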
Returns:
- DataFrame – DataFrame with a MultiIndex[object_name, raw_data_name, timestamp] and columns: value, metadata. The value column contains a numpy ndarray with the spectrum (2D array with the first dimension as frequency and the second as value).
- dict[str, dict[str, dict[datetime, dict[str, dict[str, Any]]]]] – Dictionary in the format {object_name: {raw_data_name: {datetime: {value: value, metadata: metadata}, ...}, ...}, ...}
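For the dict output, the loop at the end of the source listing walks the index tuples and nests one level per index entry, unpacking them as object name, sensor, acquisition frequency and timestamp. A minimal pure-Python sketch of that nesting follows. The turbine names, timestamps and values are hypothetical, and the `rows` mapping only imitates the tuple-keyed result that `DataFrame.to_dict(orient="index")` yields for a MultiIndex.

```python
from datetime import datetime
from typing import Any

# Hypothetical rows keyed by (object_name, sensor, acquisition_frequency, timestamp).
rows: dict[tuple[str, str, str, datetime], dict[str, Any]] = {
    ("WTG-01", "4 - HSS - Radial", "High", datetime(2024, 1, 1, 12)): {
        "value": [[0.0, 1.0], [0.1, 0.9]],
        "metadata": {"generator_speed_rpm": 1500.0},
    },
    ("WTG-01", "6 - HSS - Axial", "High", datetime(2024, 1, 1, 12)): {
        "value": [[0.0, 1.0], [0.2, 0.4]],
        "metadata": {"generator_speed_rpm": 1500.0},
    },
    ("WTG-02", "4 - HSS - Radial", "Low", datetime(2024, 1, 2, 8)): {
        "value": [[0.0, 1.0], [0.3, 0.7]],
        "metadata": {"generator_speed_rpm": 1480.0},
    },
}

nested: dict[str, Any] = {}
for (object_name, sensor, acquisition_frequency, timestamp), values in rows.items():
    # setdefault creates each missing level on the way down.
    by_frequency = nested.setdefault(object_name, {}).setdefault(sensor, {}).setdefault(acquisition_frequency, {})
    by_frequency[timestamp] = values

first = nested["WTG-01"]["4 - HSS - Radial"]["High"][datetime(2024, 1, 1, 12)]
```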
Source code in echo_postgres/vibration_spectrum.py
@validate_call
def get(
    self,
    period: DateTimeRange | list[date],
    manufacturer: Literal["Gamesa", "GE"],
    data_type: Literal["Vibration"] = "Vibration",
    object_names: list[str] | None = None,
    unit: Literal["Hz", "Order"] = "Order",
    spectrum_type: Literal["Normal", "Envelope"] = "Normal",
    amplitude_type: Literal["RMS", "Peak", "Peak-to-Peak"] = "Peak",
    sensors: list[VIBRATION_GE_ALLOWED_SENSOR_NAMES | VIBRATION_GAMESA_ALLOWED_SENSOR_NAMES] | None = None,
    acquisition_frequencies: list[Literal["Low", "High", "Filter"]] | None = None,
    variable_names: list[Literal["Acceleration - X", "Acceleration - Y", "Position - X", "Position - Y"]] | None = None,
    filter_type: Literal["and", "or"] = "and",
    output_type: Literal["dict", "DataFrame"] = "DataFrame",
) -> DataFrame | dict[str, dict[str, dict[datetime, dict[str, dict[str, Any]]]]]:
    """Gets the vibration Spectrum data.

    Currently, this will process the data in two different ways depending on the manufacturer of the turbine:

    - Gamesa: Time series data will be converted to spectrum using an FFT. If envelope is selected, the Hilbert transform will be used to get the envelope of the signal before the FFT.
    - GE: Spectrum data will be used directly, converting the frequency axis to be relative to the generator shaft speed.

    The values will be numpy ndarrays with two dimensions [2, :], where first dimension is the frequency (in hertz or orders) and the second is the value.
    Assuming array is a value of one row and column "value", if you want to get frequency of the array, you can do array[0, :] and for the value itself array[1, :]

    Parameters
    ----------
    period : DateTimeRange | list[date]
        Can be a DateTimeRange or a list of dates.

        - If DateTimeRange, will get the data for the entire range (limiting on start and end)
        - If list of dates, will get the data for each date in the list.
    manufacturer : Literal["Gamesa", "GE"]
        Manufacturer of the wind turbine. Either Gamesa or GE.
    data_type : Literal["Vibration"], optional
        Type of the data to get. Can be one of ["Vibration"]
    object_names : list[str] | None, optional
        Names of the objects to check. If None will check for all objects. By default None
    unit : Literal["Hz", "Order"], optional
        Unit of the frequency. Can be one of ["Hz", "Order"]. If Order is used, it will always be relative to the HSS/Generator Shaft speed.
        By default "Order"
    spectrum_type : Literal["Normal", "Envelope"], optional
        What kind of spectrum should be returned. By default "Normal"
    amplitude_type : Literal["RMS", "Peak", "Peak-to-Peak"], optional
        Type of amplitude to return. Can be one of ["RMS", "Peak", "Peak-to-Peak"].
        For GE turbines only Peak is allowed. Peak-to-Peak and RMS are not supported as the data is acquired directly as spectrum.
        By default "Peak"
    sensors : list[VIBRATION_GE_ALLOWED_SENSOR_NAMES | VIBRATION_GAMESA_ALLOWED_SENSOR_NAMES] | None
        List of the sensors to get the data for. The options are as shown below:

        - GE: "Planetary", "LSS", "HSS", "Generator RS", "Generator GS", "Main Bearing", "Tower Sway Axial", "Tower Sway Transverse"
        - Gamesa Vibration: "1 - Generator GS - Radial", "2 - Planetary - Axial", "3 - Main Bearing GS - Radial", "4 - HSS - Radial", "5 - Main Bearing RS - Axial", "6 - HSS - Axial", "7 - Generator RS - Axial", "8 - Generator RS - Radial"

        These must be specified with the matching manufacturer and cannot be mixed. If GE is selected only GE sensors are allowed and vice versa.
        By default None.
    acquisition_frequencies : list[Literal["Low", "High", "Filter"]] | None, optional
        Acquisition frequency, only applicable for Gamesa turbines. By default, None
    variable_names : list[Literal["Acceleration - X", "Acceleration - Y", "Position - X", "Position - Y"]] | None, optional
        Names of the variables to filter by.
    filter_type : Literal["and", "or"], optional
        How to treat multiple filters. Can be one of ["and", "or"]. By default "and"
    output_type : Literal["dict", "DataFrame"], optional
        Output type of the data. Can be one of ["dict", "DataFrame"]
        By default "DataFrame"

    Returns
    -------
    DataFrame
        DataFrame with a MultiIndex[object_name, raw_data_name, timestamp] and columns: value, metadata. Value column contains a numpy ndarray with the Spectrum (2d array with first dimension as frequency and second as value).
    dict[str, dict[str, dict[datetime, dict[str, dict[str, Any]]]]]
        Dictionary in the format {object_name: {raw_data_name: {datetime: {value: value, metadata: metadata}, ...}, ...}, ...}
    """
    # checking arguments
    if output_type not in ["dict", "DataFrame"]:
        raise ValueError(f"output_type must be one of ['dict', 'DataFrame'], not {output_type}")
    if unit not in ["Hz", "Order"]:
        raise ValueError(f"unit must be one of Hz or Order, not {unit}")
    _, wanted_names = self._check_get_args(
        object_names=object_names,
        period=period,
        acquisition_frequencies=acquisition_frequencies,
        variable_names=variable_names,
        manufacturer=manufacturer,
        data_type=data_type,
        sensors=sensors,
        spectrum_type=spectrum_type,
        amplitude_type=amplitude_type,
        filter_type=filter_type,
    )
    match manufacturer:
        # * Gamesa -----------------------------------
        case "Gamesa":
            # getting time series
            df: DataFrame = self._perfdb.vibration.timeseries.get(
                period=period,
                object_names=object_names,
                manufacturer=manufacturer,
                data_type=data_type,
                sensors=sensors,
                acquisition_frequencies=acquisition_frequencies,
                variable_names=variable_names,
                filter_type=filter_type,
                output_type="DataFrame",
            )
            if not df.empty:
                # removing time component from time series
                df["value"] = df["value"].apply(lambda x: x[1, :])
                match spectrum_type:
                    # * Normal ---------------------------------
                    case "Normal":
                        # no need for further processing
                        pass
                    # * Envelope ---------------------------------
                    case "Envelope":
                        # performing hilbert transform to get envelope
                        df["value"] = df["value"].apply(lambda x: np.abs(hilbert(x)))
                    case _:
                        raise ValueError(f"Spectrum type {spectrum_type} is not valid")
                # getting number of samples in value
                df["n_samples"] = df["value"].apply(lambda x: x.size)
                # applying fft
                df["value"] = df["value"].apply(lambda x: np.abs(rfft(x, workers=4)))
                # normalize by number of samples
                # this is needed to get the correct amplitude as the fft is not normalized. See https://numpy.org/doc/stable/reference/routines.fft.html#module-numpy.fft for more details
                df["value"] = df[["value", "n_samples"]].apply(lambda x: x["value"] / x["n_samples"], axis=1)
                # double the magnitude to account for negative frequencies. DC component is not doubled
                # this is already the peak value
                df["value"] = df["value"].apply(lambda x: np.concatenate(([x[0]], x[1:] * 2)))
                # converting to Peak-to-Peak or RMS if necessary
                match amplitude_type:
                    case "Peak":
                        # already peak
                        pass
                    case "Peak-to-Peak":
                        # doubling the value, except for DC component
                        df["value"] = df["value"].apply(lambda x: np.concatenate(([x[0]], x[1:] * 2)))
                    case "RMS":
                        # converting to RMS by dividing by sqrt(2), except for DC component
                        df["value"] = df["value"].apply(lambda x: np.concatenate(([x[0]], x[1:] / np.sqrt(2))))
                    case _:
                        raise ValueError(f"Amplitude type {amplitude_type} is not valid")
                # getting frequencies
                df["frequencies"] = df[["n_samples", "sampling_time"]].apply(
                    lambda row: rfftfreq(n=int(row["n_samples"]), d=row["sampling_time"]),
                    axis=1,
                )
                # converting to orders in case necessary
                if unit == "Order":
                    df["frequencies"] = df[["frequencies", "metadata"]].apply(
                        lambda row: row["frequencies"] / (row["metadata"]["generator_speed_rpm"] / 60),
                        axis=1,
                    )
                # converting back to only one array with two dimensions in value column
                df["value"] = df[["value", "frequencies"]].apply(
                    lambda row: np.concatenate(
                        (
                            np.expand_dims(row["frequencies"], axis=0),
                            np.expand_dims(row["value"], axis=0),
                        ),
                        axis=0,
                    ),
                    axis=1,
                )
            # dropping unwanted columns
            df = df.drop(columns=["n_samples", "frequencies"], errors="ignore")
        # * GE ---------------------------------------
        case "GE":
            # checking which component types we need to get for the turbines
            if sensors is None:
                sensors = list(VIBRATION_CONFIG[manufacturer]["sensors"][data_type].keys())
            required_component_types = list(
                {VIBRATION_CONFIG[manufacturer]["sensors"][data_type][sensor]["ComponentType"] for sensor in sensors},
            )
            # adding Gearbox if it's not present - this is required as the gearbox holds the attributes with ratio between shafts
            if "Gearbox" not in required_component_types:
                required_component_types.append("Gearbox")
            required_component_types.sort()
            # getting the component instances for the wanted turbines and component types
            component_instances: DataFrame = self._perfdb.components.instances.history.get(
                object_names=object_names,
                component_types=required_component_types,
                period=period
                if isinstance(period, DateTimeRange)
                else None,  # only using period if it's a DateTimeRange, if it's a list of dates we just get all
                get_attributes=True,
            )
            # validating if all the required attributes are present
            self._validate_ge_component_attributes(components_df=component_instances)
            # getting raw data
            df: DataFrame = self._perfdb.rawdata.values.get(
                object_names=wanted_names["object_names"],
                raw_data_names=wanted_names["raw_data_names"],
                period=period,
                filter_type=filter_type,
            )
            # converting names
            df = self._convert_raw_names_to_sensor_names(
                df=df,
                manufacturer=manufacturer,
                spectrum_type=spectrum_type,
            )
            # converting from orders relative to its reference shaft to orders relative to the generator shaft
            df = self._convert_ge_freq_orders(spectrum_df=df, components_df=component_instances, spectrum_type=spectrum_type, unit=unit)
            # converting back to only one array with two dimensions in value column
            df["value"] = df[["value", "frequencies"]].apply(
                lambda row: np.concatenate(
                    (
                        np.expand_dims(row["frequencies"], axis=0),
                        np.expand_dims(row["value"], axis=0),
                    ),
                    axis=0,
                ),
                axis=1,
            )
            # dropping unwanted columns
            df = df.drop(columns=["frequencies"])
        case _:
            raise ValueError(f"Manufacturer {manufacturer} is not valid")
    if output_type == "DataFrame":
        return df
    # converting to dict
    result = df.to_dict(orient="index")
    final_result = {}
    for (object_name, sensor, acquisition_frequency, timestamp), values in result.items():
        if object_name not in final_result:
            final_result[object_name] = {}
        if sensor not in final_result[object_name]:
            final_result[object_name][sensor] = {}
        if acquisition_frequency not in final_result[object_name][sensor]:
            final_result[object_name][sensor][acquisition_frequency] = {}
        final_result[object_name][sensor][acquisition_frequency][timestamp] = values
    return final_result
get_timestamps(manufacturer, data_type='Vibration', object_names=None, period=None, spectrum_type='Normal', sensors=None, acquisition_frequencies=None, variable_names=None, value_type='date', output_type='DataFrame')
Gets the timestamps/dates where there is vibration spectrum data available.
If you only want the timestamps as a list, set output_type to 'DataFrame' and then call df["timestamp"].unique().tolist() on the result.
Parameters:
- manufacturer (Literal['Gamesa', 'GE']) – Manufacturer of the wind turbine. Either Gamesa or GE.
- data_type (Literal['Vibration'], default: 'Vibration') – Type of the data to get. Can be one of ["Vibration"]
- object_names (list[str] | None, default: None) – Names of the objects to check. If None will check for all objects. By default None
- period (DateTimeRange | None, default: None) – Period to check. If None the entire raw_data_values table will be scanned. By default None
- spectrum_type (Literal['Normal', 'Envelope'], default: 'Normal') – What kind of spectrum should be returned. By default, Normal
- sensors (list[VIBRATION_GE_ALLOWED_SENSOR_NAMES | VIBRATION_GAMESA_ALLOWED_SENSOR_NAMES] | None, default: None) – List of the sensors to get the data for. The options are as shown below:
    - GE: "Planetary", "LSS", "HSS", "Generator RS", "Generator GS", "Main Bearing", "Tower Sway Axial", "Tower Sway Transverse"
    - Gamesa: "1 - Generator GS - Radial", "2 - Planetary - Axial", "3 - Main Bearing GS - Radial", "4 - HSS - Radial", "5 - Main Bearing RS - Axial", "6 - HSS - Axial", "7 - Generator RS - Axial", "8 - Generator RS - Radial"
  These must be specified with the matching manufacturer and cannot be mixed. If GE is selected only GE sensors are allowed and vice versa.
  By default None.
- acquisition_frequencies (list[Literal['Low', 'High', 'Filter']] | None, default: None) – Acquisition frequency, only applicable for Gamesa turbines. By default, None
- variable_names (list[Literal['Acceleration - X', 'Acceleration - Y', 'Position - X', 'Position - Y']] | None, default: None) – Names of the variables to filter by.
- value_type (Literal['timestamp', 'date'], default: 'date') – If timestamp, will return timestamps as datetimes, if date will return as date (removing hour, minute, second).
- output_type (Literal['dict', 'DataFrame'], default: 'DataFrame') – Output type of the data. Can be one of ["dict", "DataFrame"]. By default "DataFrame"
Returns:
- DataFrame – DataFrame with columns: object_name, sensor, acquisition_frequency, timestamp. Index can be ignored.
- dict[str, dict[str, list[date | datetime]]] – Dictionary in the format {object_name: {sensor: {acquisition_frequency: [date | datetime], ...}, ...}, ...}
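The shape of the dict output can be illustrated without a database. The sketch below groups hypothetical availability rows into the nested format described above and also derives a flat, de-duplicated list of dates. The turbine names and dates are made up, and plain tuples stand in for the DataFrame rows.

```python
from datetime import date
from typing import Any

# Hypothetical rows: (object_name, sensor, acquisition_frequency, timestamp).
rows = [
    ("WTG-01", "4 - HSS - Radial", "High", date(2024, 1, 1)),
    ("WTG-01", "4 - HSS - Radial", "High", date(2024, 1, 2)),
    ("WTG-01", "6 - HSS - Axial", "Low", date(2024, 1, 1)),
    ("WTG-02", "4 - HSS - Radial", "High", date(2024, 1, 3)),
]

available: dict[str, Any] = {}
for object_name, sensor, acquisition_frequency, timestamp in rows:
    # One list of dates per (object, sensor, acquisition frequency) combination.
    dates = available.setdefault(object_name, {}).setdefault(sensor, {}).setdefault(acquisition_frequency, [])
    dates.append(timestamp)

# Flat list of every date that has data for at least one turbine and sensor.
all_dates = sorted({row[3] for row in rows})
```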
Source code in echo_postgres/vibration_spectrum.py
@validate_call
def get_timestamps(
    self,
    manufacturer: Literal["Gamesa", "GE"],
    data_type: Literal["Vibration"] = "Vibration",
    object_names: list[str] | None = None,
    period: DateTimeRange | None = None,
    spectrum_type: Literal["Normal", "Envelope"] = "Normal",
    sensors: list[VIBRATION_GE_ALLOWED_SENSOR_NAMES | VIBRATION_GAMESA_ALLOWED_SENSOR_NAMES] | None = None,
    acquisition_frequencies: list[Literal["Low", "High", "Filter"]] | None = None,
    variable_names: list[Literal["Acceleration - X", "Acceleration - Y", "Position - X", "Position - Y"]] | None = None,
    value_type: Literal["timestamp", "date"] = "date",
    output_type: Literal["dict", "DataFrame"] = "DataFrame",
) -> DataFrame | dict[str, dict[str, list[date | datetime]]]:
    """Gets the timestamps/dates where there is vibration spectrum data available.

    If you only want the timestamps as a list, you can set the output_type to 'DataFrame' and then do df["timestamp"].unique().tolist() on the result.

    Parameters
    ----------
    manufacturer : Literal["Gamesa", "GE"]
        Manufacturer of the wind turbine. Either Gamesa or GE.
    data_type : Literal["Vibration"], optional
        Type of the data to get. Can be one of ["Vibration"]
    object_names : list[str] | None, optional
        Names of the objects to check. If None will check for all objects. By default None
    period : DateTimeRange | None, optional
        Period to check. If None the entire raw_data_values table will be scanned. By default None
    spectrum_type : Literal["Normal", "Envelope"], optional
        What kind of spectrum should be returned. By default, Normal
    sensors : list[VIBRATION_GE_ALLOWED_SENSOR_NAMES | VIBRATION_GAMESA_ALLOWED_SENSOR_NAMES] | None
        List of the sensors to get the data for. The options are as shown below:

        - GE: "Planetary", "LSS", "HSS", "Generator RS", "Generator GS", "Main Bearing", "Tower Sway Axial", "Tower Sway Transverse"
        - Gamesa: "1 - Generator GS - Radial", "2 - Planetary - Axial", "3 - Main Bearing GS - Radial", "4 - HSS - Radial", "5 - Main Bearing RS - Axial", "6 - HSS - Axial", "7 - Generator RS - Axial", "8 - Generator RS - Radial"

        These must be specified with the matching manufacturer and cannot be mixed. If GE is selected only GE sensors are allowed and vice versa.
        By default None.
    acquisition_frequencies : list[Literal["Low", "High", "Filter"]] | None, optional
        Acquisition frequency, only applicable for Gamesa turbines. By default, None
    variable_names : list[Literal["Acceleration - X", "Acceleration - Y", "Position - X", "Position - Y"]] | None, optional
        Names of the variables to filter by.
    value_type : Literal["timestamp", "date"], optional
        If timestamp, will return timestamps as datetimes, if date will return as date (removing hour, minute, second).
    output_type : Literal["dict", "DataFrame"], optional
        Output type of the data. Can be one of ["dict", "DataFrame"]
        By default "DataFrame"

    Returns
    -------
    DataFrame
        DataFrame with columns: object_name, sensor, acquisition_frequency, timestamp. Index can be ignored.
    dict[str, dict[str, list[date | datetime]]]
        Dictionary in the format {object_name: {sensor: {acquisition_frequency: [date | datetime], ...}, ...}, ...}
    """
    # checking arguments
    if output_type not in ["dict", "DataFrame"]:
        raise ValueError(f"output_type must be one of ['dict', 'DataFrame'], not {output_type}")
    if value_type not in ["date", "timestamp"]:
        raise ValueError(f"value_type must be one of ['date', 'timestamp']. Got {value_type}")
    where, _ = self._check_get_args(
        object_names=object_names,
        period=period,
        acquisition_frequencies=acquisition_frequencies,
        data_type=data_type,
        variable_names=variable_names,
        manufacturer=manufacturer,
        sensors=sensors,
        spectrum_type=spectrum_type,
        amplitude_type=None,
        filter_type="and",
    )
    # building the query
    query = [
        sql.SQL("SELECT DISTINCT object_name, raw_data_name, timestamp{type_cast} FROM v_raw_data_values ").format(
            type_cast=sql.SQL("::DATE") if value_type == "date" else sql.SQL("::TIMESTAMP"),
        ),
        where,
        sql.SQL(" ORDER BY object_name, raw_data_name, timestamp"),
    ]
    query = sql.Composed(query)
    # executing the query
    with self._perfdb.conn.reconnect() as conn:
        df: DataFrame = conn.read_to_pandas(
            query,
            dtype={
                "object_name": "string[pyarrow]",
                "raw_data_name": "string[pyarrow]",
                "timestamp": "datetime64[s]",
            },
        )
    # converting names
    df = df.set_index(["object_name", "raw_data_name"])
    df = self._perfdb.vibration.spectrum._convert_raw_names_to_sensor_names(  # noqa: SLF001
        df=df,
        manufacturer=manufacturer,
        spectrum_type=spectrum_type,
    )
    df = df.reset_index(drop=False)
    if output_type == "DataFrame":
        return df
    # converting to dict
    df = df.groupby(["object_name", "sensor", "acquisition_frequency"])["timestamp"].apply(list)
    result = df.to_dict()
    final_result = {}
    for (object_name, sensor, acquisition_frequency), value in result.items():
        if object_name not in final_result:
            final_result[object_name] = {}
        if sensor not in final_result[object_name]:
            final_result[object_name][sensor] = {}
        final_result[object_name][sensor][acquisition_frequency] = value
    return final_result