Analysis - Correlate

This module is for correlation analysis between multiple masts. The typical format is to give the method the site and reference MetMast or RefMast objects. The method then returns a DataFrame of results. Those results can then be applied with a separate method.

For example, a correlation typically requires two lines of code: one to run the correlation between masts and one to apply that correlation to synthesize missing data:

corr_results_10min = an.analysis.correlate.masts_10_minute_by_direction(ref_mast, site_mast)
syn_10min = an.analysis.correlate.apply_10min_results_by_direction(ref_mast, site_mast, corr_results_10min)
analysis.correlate.apply_10min_results_by_direction(ref_mast, site_mast, corr_results, ref_ws_col=None, ref_dir_col=None, site_ws_col=None, splice=True)

Applies the slopes and offsets from a 10-minute correlation, binned by direction, between two met masts.

Parameters:
ref_mast: MetMast
MetMast object
site_mast: MetMast
MetMast object
corr_results: DataFrame
slope, offset, R2, uncert, points for each direction sector
ref_ws_col: string, default None (primary anemometer assumed)
Reference anemometer data to use. Extracted from MetMast.data
ref_dir_col: string, default None (primary vane assumed)
Reference wind vane data to use. Extracted from MetMast.data
site_ws_col: string, default None (primary anemometer assumed)
Site anemometer data to use. Extracted from MetMast.data
splice: Boolean, default True
If True, returns site data where available and gap-fills any missing periods between the site mast and the reference mast’s measurement period. If False, returns purely synthesized data without taking into account the measured wind speeds.
Returns:
out: time series DataFrame
predicted wind speeds at the site
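How the per-sector slopes and offsets interact with splice can be sketched in plain Python. This is an illustration only, not the library's implementation: the helper names sector_index and apply_by_direction, the convention that sector 0 is centred on north, and the use of None for missing values are all assumptions made for this example.

```python
def sector_index(direction_deg, dir_sectors=16):
    """Map a direction in degrees to a sector index (sector 0 centred on north, assumed)."""
    width = 360.0 / dir_sectors
    return int(((direction_deg + width / 2.0) % 360.0) // width)


def apply_by_direction(ref_ws, ref_dir, site_ws, results, dir_sectors=16, splice=True):
    """Predict site wind speeds record by record.

    results maps sector index -> (slope, offset); None marks a missing value.
    """
    out = []
    for ws, direction, measured in zip(ref_ws, ref_dir, site_ws):
        if splice and measured is not None:
            out.append(measured)  # keep the measurement where one exists
            continue
        if ws is None or direction is None:
            out.append(None)  # nothing to synthesize from
            continue
        slope, offset = results[sector_index(direction, dir_sectors)]
        out.append(slope * ws + offset)
    return out


# Hypothetical correlation results: identity everywhere except two sectors.
results = {s: (1.0, 0.0) for s in range(16)}
results[0] = (1.1, 0.5)  # northerly sector
results[4] = (0.9, 0.2)  # easterly sector

ref_ws = [10.0, 8.0, 6.0]
ref_dir = [355.0, 90.0, 5.0]
site_ws = [None, 7.0, None]

spliced = apply_by_direction(ref_ws, ref_dir, site_ws, results, splice=True)
synthesized = apply_by_direction(ref_ws, ref_dir, site_ws, results, splice=False)
```

With splice=True the measured 7.0 m/s record is kept and only the two gaps are filled; with splice=False all three records come from the reference mast.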
analysis.correlate.apply_daily_results_by_month(ref_mast, site_mast, corr_results, ref_ws_col=None, site_ws_col=None, splice=True)

Applies the slopes and offsets from a daily correlation, binned by month, between two met masts. If the reference or site mast does not have a daily time series, the method resamples to daily frequency, requiring 70% data coverage within each day for the day to be valid.

Parameters:
ref_mast: MetMast
MetMast object
site_mast: MetMast
MetMast object
corr_results: DataFrame
slope, offset, R2, uncert, points for each month
ref_ws_col: string, default None (primary anemometer assumed)
Reference anemometer data to use. Extracted from MetMast.data
site_ws_col: string, default None (primary anemometer assumed)
Site anemometer data to use. Extracted from MetMast.data
splice: Boolean, default True
If True, returns site data where available and gap-fills any missing periods between the site mast and the reference mast’s measurement period. If False, returns purely synthesized data without taking into account the measured wind speeds.
Returns:
out: time series DataFrame
predicted wind speeds at the site
analysis.correlate.apply_daily_results_by_month_to_mast_data(mast_data, corr_results, ref_ws_col='ref', site_ws_col='site', splice=True)

Applies the slopes and offsets from a daily correlation, binned by month, to a DataFrame of wind speed data.

Parameters:
mast_data: DataFrame
timeseries of wind speed data
corr_results: DataFrame
slope, offset, R2, uncert, points for each month
ref_ws_col: string, default ‘ref’
Reference anemometer data to use. Extracted from mast_data DataFrame.
site_ws_col: string, default ‘site’
Site anemometer data to use. Extracted from mast_data DataFrame
splice: Boolean, default True
If True, returns site data where available and gap-fills any missing periods between the site mast and the reference mast’s measurement period. If False, returns purely synthesized data without taking into account the measured wind speeds.
Returns:
out: time series DataFrame
predicted wind speeds at the site
analysis.correlate.calculate_EDF_uncertainty(data, ref_ws_col='ref', site_ws_col='site')

Calculate the EDF estimated correlation uncertainty between two wind speed columns. Assumes a correlation forced through the origin.

Parameters:
data: DataFrame
DataFrame with wind speed columns ref_ws_col and site_ws_col
ref_ws_col: string, default ‘ref’
Reference anemometer data column to use.
site_ws_col: string, default ‘site’
Site anemometer data column to use.
analysis.correlate.calculate_IEC_uncertainty(data, ref_ws_col='ref', site_ws_col='site')

Calculate the IEC correlation uncertainty between two wind speed columns.

Parameters:
data: DataFrame
DataFrame with wind speed columns ref_ws_col and site_ws_col
ref_ws_col: string, default ‘ref’
Reference anemometer data column to use.
site_ws_col: string, default ‘site’
Site anemometer data column to use.
analysis.correlate.calculate_R2(data, ref_ws_col='ref', site_ws_col='site')

Return a single R2 value between two wind speed columns.

Parameters:
data: DataFrame
DataFrame with wind speed columns ref_ws_col and site_ws_col
ref_ws_col: string, default ‘ref’
Reference anemometer data column to use.
site_ws_col: string, default ‘site’
Site anemometer data column to use.
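As a rough illustration of what a single R2 between two wind speed columns can mean, the sketch below computes the squared Pearson correlation over concurrent, non-missing pairs. That definition, the function name r_squared, and the use of None for missing values are assumptions for this example; the library may define or compute R2 differently.

```python
def r_squared(ref, site):
    """Squared Pearson correlation over concurrent, non-missing pairs (assumed definition)."""
    pairs = [(x, y) for x, y in zip(ref, site) if x is not None and y is not None]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    return sxy ** 2 / (sxx * syy)


# The pair with a missing reference value is skipped; the rest lie on site = 2 * ref.
r2_perfect = r_squared([1.0, 2.0, None, 3.0, 4.0], [2.0, 4.0, 5.0, 6.0, 8.0])

# A weaker relationship gives a lower value.
r2_noisy = r_squared([1.0, 2.0, 3.0, 4.0], [2.0, 1.0, 4.0, 3.0])
```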
analysis.correlate.masts_10_minute(ref_mast, site_mast, ref_ws_col=None, site_ws_col=None, method='ODR', force_through_origin=False)

Calculate the slope and offset between two met masts.

Parameters:
ref_mast: MetMast
MetMast object
site_mast: MetMast
MetMast object
ref_ws_col: string, default None (primary anemometer assumed)
Reference anemometer data to use. Extracted from MetMast.data
site_ws_col: string, default None (primary anemometer assumed)
Site anemometer data to use. Extracted from MetMast.data
method: string, default ‘ODR’

Correlation method to use.

  • Orthogonal distance regression: ‘ODR’
  • Ordinary least squares: ‘OLS’
  • Robust linear models: ‘RLM’
force_through_origin: boolean, default False
Force the correlation through the origin (offset equal to zero)
Returns:
out: DataFrame
slope, offset, R2, uncert, points
analysis.correlate.masts_10_minute_by_direction(ref_mast, site_mast, ref_ws_col=None, ref_dir_col=None, site_ws_col=None, site_dir_col=None, method='ODR', force_through_origin=False, dir_sectors=16)

Calculate the slope and offset, binned by direction, between two met masts.

Parameters:
ref_mast: MetMast
MetMast object
site_mast: MetMast
MetMast object
ref_ws_col: string, default None (primary anemometer assumed)
Reference anemometer data to use. Extracted from MetMast.data
ref_dir_col: string, default None (primary wind vane assumed)
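Reference wind vane data to use. Extracted from MetMast.data
site_ws_col: string, default None (primary anemometer assumed)
Site anemometer data to use. Extracted from MetMast.data
site_dir_col: string, default None (primary wind vane assumed)
Site wind vane data to use. Extracted from MetMast.data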
method: string, default ‘ODR’

Correlation method to use.

  • Orthogonal distance regression: ‘ODR’
  • Ordinary least squares: ‘OLS’
  • Robust linear models: ‘RLM’
dir_sectors: int, default 16
Number of equally spaced direction sectors
force_through_origin: boolean, default False
Force the correlation through the origin (offset equal to zero)
Returns:
out: DataFrame
slope, offset, R2, uncert, points for each direction sector
analysis.correlate.masts_daily(ref_mast, site_mast, ref_ws_col=None, site_ws_col=None, method='ODR', force_through_origin=False, minimum_recovery_rate=0.7)

Calculate the slope and offset for daily data between two met masts.

Parameters:
ref_mast: MetMast
MetMast object
site_mast: MetMast
MetMast object
ref_ws_col: string, default None (primary anemometer assumed)
Reference anemometer data to use. Extracted from MetMast.data
site_ws_col: string, default None (primary anemometer assumed)
Site anemometer data to use. Extracted from MetMast.data
method: string, default ‘ODR’

Correlation method to use.

  • Orthogonal distance regression: ‘ODR’
  • Ordinary least squares: ‘OLS’
  • Robust linear models: ‘RLM’
force_through_origin: boolean, default False
Force the correlation through the origin (offset equal to zero)
minimum_recovery_rate: float, default 0.7
Minimum recovery rate below which a resampled period is excluded. For example, by default, resampling 10-minute data to daily averages requires at least 101 valid records (70% of the 144 records in a day) for a valid daily average.
Returns:
out: DataFrame
slope, offset, R2, uncert, points
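The recovery-rate arithmetic for daily averages can be sketched as follows. This is a plain-Python illustration under stated assumptions, not the library's resampling code: the helper names min_valid_records and daily_means are hypothetical, missing records are represented as None, and the threshold is taken as the smallest whole number of records that meets the rate.

```python
import math
from collections import defaultdict
from datetime import datetime, timedelta

RECORDS_PER_DAY = 24 * 6  # 144 ten-minute records in a day


def min_valid_records(minimum_recovery_rate=0.7, records_per_day=RECORDS_PER_DAY):
    """Smallest whole number of valid records that meets the recovery rate."""
    return math.ceil(minimum_recovery_rate * records_per_day)


def daily_means(timestamps, values, minimum_recovery_rate=0.7):
    """Average valid 10-minute values per day, dropping days below the recovery rate."""
    by_day = defaultdict(list)
    for ts, value in zip(timestamps, values):
        if value is not None:
            by_day[ts.date()].append(value)
    needed = min_valid_records(minimum_recovery_rate)
    return {
        day: sum(vals) / len(vals)
        for day, vals in sorted(by_day.items())
        if len(vals) >= needed
    }


# Two days of 10-minute data: the first is complete, the second has only 100 valid records.
start = datetime(2020, 1, 1)
timestamps = [start + timedelta(minutes=10 * i) for i in range(2 * RECORDS_PER_DAY)]
values = [8.0] * 144 + [6.0] * 100 + [None] * 44
daily = daily_means(timestamps, values)
```

Here only the first day survives, because 100 valid records fall just short of the 101 needed at a 0.7 recovery rate.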
analysis.correlate.masts_daily_by_month(ref_mast, site_mast, ref_ws_col=None, site_ws_col=None, method='ODR', force_through_origin=False, minimum_recovery_rate=0.7)

Calculate the slope and offset for daily data, binned by month, between two met masts.

Parameters:
ref_mast: MetMast
MetMast object
site_mast: MetMast
MetMast object
ref_ws_col: string, default None (primary anemometer assumed)
Reference anemometer data to use. Extracted from MetMast.data
site_ws_col: string, default None (primary anemometer assumed)
Site anemometer data to use. Extracted from MetMast.data
method: string, default ‘ODR’

Correlation method to use.

  • Orthogonal distance regression: ‘ODR’
  • Ordinary least squares: ‘OLS’
  • Robust linear models: ‘RLM’
force_through_origin: boolean, default False
Force the correlation through the origin (offset equal to zero)
minimum_recovery_rate: float, default 0.7
Minimum recovery rate below which a resampled period is excluded. For example, by default, resampling 10-minute data to daily averages requires at least 101 valid records (70% of the 144 records in a day) for a valid daily average.
Returns:
out: DataFrame
slope, offset, R2, uncert, points for each month
analysis.correlate.return_correlation_data_from_masts(ref_mast, site_mast)

Return a DataFrame of reference and site data for correlations. The data are extracted from each MetMast object using the primary anemometers and wind vanes.

Parameters:
ref_mast: MetMast
Anemoi MetMast object
site_mast: MetMast
Anemoi MetMast object
Returns:

out: DataFrame with columns ref, site, and dir

analysis.correlate.valid_ws_correlation_data(data, ref_ws_col='ref', site_ws_col='site')

Perform checks on wind speed correlation data.

Parameters:
data: DataFrame
DataFrame with wind speed columns ref_ws_col and site_ws_col
ref_ws_col: string, default ‘ref’
Reference anemometer data column to use.
site_ws_col: string, default ‘site’
Site anemometer data column to use.
analysis.correlate.ws_correlation_binned_by_direction(data, ref_ws_col='ref', site_ws_col='site', ref_dir_col='dir', dir_sectors=16, method='ODR', force_through_origin=False)

Calculate the slope and offset, binned by direction, between two wind speed columns.

Parameters:
data: DataFrame
DataFrame with wind speed columns ref_ws_col and site_ws_col, and direction column ref_dir_col
ref_ws_col: string, default ‘ref’
Reference anemometer data column to use.
site_ws_col: string, default ‘site’
Site anemometer data column to use.
ref_dir_col: string, default ‘dir’
Reference wind vane data column to use.
dir_sectors: int, default 16
Number of equally spaced direction sectors
method: string, default ‘ODR’

Correlation method to use.

  • Orthogonal distance regression: ‘ODR’
  • Ordinary least squares: ‘OLS’
  • Robust linear models: ‘RLM’
force_through_origin: boolean, default False
Force the correlation through the origin (offset equal to zero)
Returns:
out: DataFrame
slope, offset, R2, uncert, points for each direction sector
analysis.correlate.ws_correlation_binned_by_month(data, ref_ws_col='ref', site_ws_col='site', method='ODR', force_through_origin=False)

Calculate the slope and offset, binned by month, between two wind speed columns.

Parameters:
data: DataFrame
DataFrame with wind speed columns ref_ws_col and site_ws_col
ref_ws_col: string, default ‘ref’
Reference anemometer data column to use.
site_ws_col: string, default ‘site’
Site anemometer data column to use.
method: string, default ‘ODR’

Correlation method to use.

  • Orthogonal distance regression: ‘ODR’
  • Ordinary least squares: ‘OLS’
  • Robust linear models: ‘RLM’
force_through_origin: boolean, default False
Force the correlation through the origin (offset equal to zero)
Returns:
out: DataFrame
slope, offset, R2, uncert, points for each month
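Binning by month amounts to grouping concurrent pairs by calendar month and fitting each group separately. The sketch below shows the idea with an ordinary least-squares line in plain Python; the helper names fit_line and fit_by_month, the dictionary output, and the use of None for missing values are assumptions for illustration and do not reflect the library's internals or its default ODR method.

```python
from collections import defaultdict
from datetime import date


def fit_line(pairs):
    """Ordinary least-squares line through (ref, site) pairs; returns (slope, offset)."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_by_month(days, ref, site):
    """Group concurrent, non-missing pairs by calendar month and fit each group."""
    groups = defaultdict(list)
    for day, x, y in zip(days, ref, site):
        if x is not None and y is not None:
            groups[day.month].append((x, y))
    results = {}
    for month, pairs in sorted(groups.items()):
        slope, offset = fit_line(pairs)
        results[month] = {"slope": slope, "offset": offset, "points": len(pairs)}
    return results


days = [date(2020, 1, 1), date(2020, 1, 2), date(2020, 1, 3),
        date(2020, 7, 1), date(2020, 7, 2), date(2020, 7, 3), date(2020, 7, 4)]
ref = [4.0, 6.0, 8.0, 5.0, 7.0, 9.0, None]
site = [8.0, 12.0, 16.0, 6.0, 8.0, 10.0, 11.0]
monthly = fit_by_month(days, ref, site)
```

January and July end up with different slopes and offsets, and the July record with a missing reference value is not counted in points.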
analysis.correlate.ws_correlation_least_squares_model(data, ref_ws_col='ref', site_ws_col='site', force_through_origin=False)

Calculate the slope and offset between two wind speed columns using ordinary least squares regression.

https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.linalg.lstsq.html

Parameters:
data: DataFrame
DataFrame with wind speed columns ref_ws_col and site_ws_col
ref_ws_col: string, default ‘ref’
Reference anemometer data column to use.
site_ws_col: string, default ‘site’
Site anemometer data column to use.
force_through_origin: boolean, default False
Force the correlation through the origin (offset equal to zero)
Returns:
out: DataFrame
slope, offset, R2, uncert, points
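For a straight line, ordinary least squares has a closed form, which makes the effect of force_through_origin easy to see. The sketch below is a dependency-free illustration, not the library's code (which, per the link above, relies on a NumPy least-squares solver); the function name least_squares_fit is hypothetical.

```python
def least_squares_fit(ref, site, force_through_origin=False):
    """Closed-form ordinary least-squares line; returns (slope, offset)."""
    if force_through_origin:
        # Minimise sum((site - slope * ref)^2) with the offset fixed at zero.
        slope = sum(x * y for x, y in zip(ref, site)) / sum(x * x for x in ref)
        return slope, 0.0
    n = len(ref)
    mean_x = sum(ref) / n
    mean_y = sum(site) / n
    sxx = sum((x - mean_x) ** 2 for x in ref)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ref, site))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Data lying exactly on site = ref + 1.
ref = [2.0, 4.0, 6.0, 8.0]
site = [3.0, 5.0, 7.0, 9.0]
free_slope, free_offset = least_squares_fit(ref, site)
origin_slope, origin_offset = least_squares_fit(ref, site, force_through_origin=True)
```

With the offset free, the fit recovers a slope of 1 and an offset of 1. Forcing the line through the origin folds the offset into the slope, which rises to 140/120, roughly 1.17.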
analysis.correlate.ws_correlation_method(data, ref_ws_col='ref', site_ws_col='site', method='ODR', force_through_origin=False)

Calculate the slope and offset, for a given correlation method, between two wind speed columns.

Parameters:
data: DataFrame
DataFrame with wind speed columns ref_ws_col and site_ws_col
ref_ws_col: string, default ‘ref’
Reference anemometer data column to use.
site_ws_col: string, default ‘site’
Site anemometer data column to use.
method: string, default ‘ODR’

Correlation method to use.

  • Orthogonal distance regression: ‘ODR’
  • Ordinary least squares: ‘OLS’
  • Robust linear models: ‘RLM’
force_through_origin: boolean, default False
Force the correlation through the origin (offset equal to zero)
Returns:
out: DataFrame
slope, offset, R2, uncert, points
analysis.correlate.ws_correlation_orthoginal_distance_model(data, ref_ws_col='ref', site_ws_col='site', force_through_origin=False)

Calculate the slope and offset between two wind speed columns using orthogonal distance regression.

https://docs.scipy.org/doc/scipy-0.18.1/reference/odr.html

Parameters:
data: DataFrame
DataFrame with wind speed columns ref_ws_col and site_ws_col
ref_ws_col: string, default ‘ref’
Reference anemometer data column to use.
site_ws_col: string, default ‘site’
Site anemometer data column to use.
force_through_origin: boolean, default False
Force the correlation through the origin (offset equal to zero)
Returns:
out: DataFrame
slope, offset, R2, uncert, points
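Orthogonal distance regression minimizes perpendicular distances to the line rather than vertical ones, so it treats the reference and site masts symmetrically. For a straight line with equal error weighting on both variables there is a closed-form solution, sketched below in plain Python. This is an illustration under that assumption, not the library's implementation, which per the link above uses scipy.odr; the function name orthogonal_fit is hypothetical.

```python
import math


def orthogonal_fit(ref, site, force_through_origin=False):
    """Closed-form orthogonal (total least-squares) line; returns (slope, offset).

    Assumes equal error weighting on both variables and a non-zero covariance.
    """
    n = len(ref)
    # Through the origin, the same formula applies to uncentred moments.
    mean_x = 0.0 if force_through_origin else sum(ref) / n
    mean_y = 0.0 if force_through_origin else sum(site) / n
    sxx = sum((x - mean_x) ** 2 for x in ref)
    syy = sum((y - mean_y) ** 2 for y in site)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ref, site))
    slope = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4.0 * sxy ** 2)) / (2.0 * sxy)
    return slope, mean_y - slope * mean_x


# Points lying exactly on site = 2 * ref.
exact_slope, exact_offset = orthogonal_fit([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])

# Swapping the roles of the two masts inverts the slope, unlike ordinary least squares.
ref = [1.0, 2.0, 3.0, 4.0, 5.0]
site = [1.2, 1.9, 3.2, 3.8, 5.1]
forward_slope, _ = orthogonal_fit(ref, site)
reverse_slope, _ = orthogonal_fit(site, ref)
```

The product of forward_slope and reverse_slope is 1 up to rounding, which is the symmetry that makes this kind of fit attractive when both masts carry measurement error.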
analysis.correlate.ws_correlation_robust_linear_model(data, ref_ws_col='ref', site_ws_col='site', force_through_origin=False)

Calculate the slope and offset between two wind speed columns using a robust linear model.

http://www.statsmodels.org/dev/rlm.html

Parameters:
data: DataFrame
DataFrame with wind speed columns ref_ws_col and site_ws_col
ref_ws_col: string, default ‘ref’
Reference anemometer data column to use.
site_ws_col: string, default ‘site’
Site anemometer data column to use.
force_through_origin: boolean, default False
Force the correlation through the origin (offset equal to zero)
Returns:
out: DataFrame
slope, offset, R2, uncert, points
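A robust linear model limits the influence of outliers by down-weighting records with large residuals. The sketch below shows one common way to do this, iteratively reweighted least squares with Huber weights and a scale based on the median absolute residual. It is a simplified illustration in plain Python, not the statsmodels implementation the library links to; the function names, the tuning constant of 1.345, and the stopping rule are assumptions for this example.

```python
import statistics


def weighted_line(xs, ys, weights):
    """Weighted least-squares line; returns (slope, offset)."""
    total = sum(weights)
    mean_x = sum(w * x for w, x in zip(weights, xs)) / total
    mean_y = sum(w * y for w, y in zip(weights, ys)) / total
    sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs))
    sxy = sum(w * (x - mean_x) * (y - mean_y) for w, x, y in zip(weights, xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def huber_fit(xs, ys, tuning=1.345, max_iterations=50, tolerance=1e-10):
    """Iteratively reweighted least squares with Huber weights (simplified sketch)."""
    slope, offset = weighted_line(xs, ys, [1.0] * len(xs))
    for _ in range(max_iterations):
        residuals = [y - (slope * x + offset) for x, y in zip(xs, ys)]
        # Robust scale estimate from the median absolute residual.
        scale = statistics.median(abs(r) for r in residuals) / 0.6745
        if scale == 0.0:
            break
        limit = tuning * scale
        # Full weight inside the limit, proportionally reduced weight outside it.
        weights = [1.0 if abs(r) <= limit else limit / abs(r) for r in residuals]
        new_slope, new_offset = weighted_line(xs, ys, weights)
        converged = (abs(new_slope - slope) < tolerance
                     and abs(new_offset - offset) < tolerance)
        slope, offset = new_slope, new_offset
        if converged:
            break
    return slope, offset


# Data near site = 2 * ref + 1 with small alternating noise and one gross outlier.
ref = [float(x) for x in range(1, 11)]
site = [2.0 * x + 1.0 + (0.1 if i % 2 else -0.1) for i, x in enumerate(ref)]
site[-1] = 60.0

ols_slope, ols_offset = weighted_line(ref, site, [1.0] * len(ref))
robust_slope, robust_offset = huber_fit(ref, site)
```

The single outlier pulls the ordinary least-squares slope well away from 2, while the reweighted fit stays close to the slope of the bulk of the data.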