Analysis - Correlate
This module performs correlation analysis between met masts. A typical correlation method takes the reference and site MetMast (or RefMast) objects and returns a DataFrame of results. A separate method then applies those results.
For example, a correlation typically requires two lines of code: one to run the correlation between the masts and one to apply that correlation to synthesize missing data:
    corr_results_10min = an.analysis.correlate.masts_10_minute_by_direction(ref_mast, site_mast)
    syn_10min = an.analysis.correlate.apply_10min_results_by_direction(ref_mast, site_mast, corr_results_10min)
-
analysis.correlate.apply_10min_results_by_direction(ref_mast, site_mast, corr_results, ref_ws_col=None, ref_dir_col=None, site_ws_col=None, splice=True)
Applies the slopes and offsets from a 10-minute correlation, binned by direction, between two met masts.
Parameters: - ref_mast: MetMast
- MetMast object
- site_mast: MetMast
- MetMast object
- corr_results: DataFrame
- slope, offset, R2, uncert, points for each direction sector
- ref_ws_col: string, default None (primary anemometer assumed)
- Reference anemometer data to use. Extracted from MetMast.data
- ref_dir_col: string, default None (primary vane assumed)
- Reference wind vane data to use. Extracted from MetMast.data
- site_ws_col: string, default None (primary anemometer assumed)
- Site anemometer data to use. Extracted from MetMast.data
- splice: Boolean, default True
- If True, returns site data where available and gap-fills any missing periods between the site mast’s and the reference mast’s measurement periods. Otherwise, returns purely synthesized data without taking the measured wind speeds into account.
Returns: - out: time series DataFrame
- predicted wind speeds at the site
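The apply step described above can be sketched with plain pandas. This is a minimal illustration, not the library's implementation: the function name `apply_by_direction`, the column names `ref`, `site` and `dir`, and the convention that sector 0 is centered on north are all assumptions.

```python
import numpy as np
import pandas as pd


def apply_by_direction(data, corr_results, dir_sectors=16, splice=True):
    """Synthesize site wind speeds from reference data, sector by sector.

    data: DataFrame with columns 'ref', 'site' and 'dir' (hypothetical names).
    corr_results: DataFrame indexed by sector number with 'slope' and 'offset'.
    """
    width = 360.0 / dir_sectors
    # Assumption: sector 0 is centered on north (0 degrees).
    has_dir = data["dir"].notna()
    sector = np.floor(((data.loc[has_dir, "dir"] + width / 2.0) % 360.0) / width).astype(int)
    # Look up each record's slope and offset from its direction sector.
    slope = sector.map(corr_results["slope"]).reindex(data.index)
    offset = sector.map(corr_results["offset"]).reindex(data.index)
    synthesized = data["ref"] * slope + offset
    if splice:
        # Keep measured site data and fill only the gaps with synthesized values.
        return data["site"].fillna(synthesized).to_frame("site")
    return synthesized.to_frame("site")


index = pd.date_range("2020-01-01", periods=4, freq="10min")
data = pd.DataFrame(
    {
        "ref": [5.0, 6.0, 7.0, 8.0],
        "site": [5.5, np.nan, 7.9, np.nan],
        "dir": [0.0, 90.0, 180.0, 270.0],
    },
    index=index,
)
# Four sectors with a different slope in each and no offset.
corr_results = pd.DataFrame({"slope": [1.0, 1.1, 1.2, 1.3], "offset": 0.0}, index=range(4))
spliced = apply_by_direction(data, corr_results, dir_sectors=4)
print(spliced)
```

With splice=True the two measured site values are kept and only the two gaps are filled from the reference mast.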
-
analysis.correlate.apply_daily_results_by_month(ref_mast, site_mast, corr_results, ref_ws_col=None, site_ws_col=None, splice=True)
Applies the slopes and offsets from a daily correlation, binned by month, between two met masts. If the reference or site mast does not have a daily time series, the method resamples to daily frequency, requiring 70% data coverage within a day for that day to be valid.
Parameters: - ref_mast: MetMast
- MetMast object
- site_mast: MetMast
- MetMast object
- corr_results: DataFrame
- slope, offset, R2, uncert, points for each month
- ref_ws_col: string, default None (primary anemometer assumed)
- Reference anemometer data to use. Extracted from MetMast.data
- site_ws_col: string, default None (primary anemometer assumed)
- Site anemometer data to use. Extracted from MetMast.data
- splice: Boolean, default True
- If True, returns site data where available and gap-fills any missing periods between the site mast’s and the reference mast’s measurement periods. Otherwise, returns purely synthesized data without taking the measured wind speeds into account.
Returns: - out: time series DataFrame
- predicted wind speeds at the site
-
analysis.correlate.apply_daily_results_by_month_to_mast_data(mast_data, corr_results, ref_ws_col='ref', site_ws_col='site', splice=True)
Applies the slopes and offsets from a daily correlation, binned by month, to a DataFrame of wind speed data.
Parameters: - mast_data: DataFrame
- timeseries of wind speed data
- corr_results: DataFrame
- slope, offset, R2, uncert, points for each month
- ref_ws_col: string, default ‘ref’
- Reference anemometer data to use. Extracted from mast_data DataFrame.
- site_ws_col: string, default ‘site’
- Site anemometer data to use. Extracted from mast_data DataFrame
- splice: Boolean, default True
- If True, returns site data where available and gap-fills any missing periods between the site mast’s and the reference mast’s measurement periods. Otherwise, returns purely synthesized data without taking the measured wind speeds into account.
Returns: - out: time series DataFrame
- predicted wind speeds at the site
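The effect of the splice flag on monthly results can be illustrated as follows. This is a rough sketch under assumptions, not the library's code: the helper name `apply_by_month` is hypothetical, and corr_results is assumed to be indexed by calendar month number (1 to 12).

```python
import numpy as np
import pandas as pd


def apply_by_month(mast_data, corr_results, ref_ws_col="ref", site_ws_col="site", splice=True):
    """Apply monthly slopes and offsets to a time-indexed wind speed DataFrame."""
    # Calendar month of every record, used to look up that month's slope and offset.
    months = pd.Series(mast_data.index.month, index=mast_data.index)
    slope = months.map(corr_results["slope"])
    offset = months.map(corr_results["offset"])
    synthesized = mast_data[ref_ws_col] * slope + offset
    if splice:
        # Measured site data take precedence; synthesized data fill the gaps.
        return mast_data[site_ws_col].fillna(synthesized).to_frame(site_ws_col)
    return synthesized.to_frame(site_ws_col)


index = pd.to_datetime(["2020-01-30", "2020-01-31", "2020-02-01", "2020-02-02"])
mast_data = pd.DataFrame(
    {"ref": [6.0, 7.0, 8.0, 9.0], "site": [6.5, np.nan, np.nan, 9.4]},
    index=index,
)
# Hypothetical results for January (1) and February (2) only.
corr_results = pd.DataFrame({"slope": [1.0, 1.1], "offset": [0.5, 0.0]}, index=[1, 2])

spliced = apply_by_month(mast_data, corr_results, splice=True)
synthesized_only = apply_by_month(mast_data, corr_results, splice=False)
print(spliced.join(synthesized_only, rsuffix="_synthesized"))
```

The last record shows the difference: the spliced series keeps the measured 9.4, while the purely synthesized series replaces it with the value predicted from the reference.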
-
analysis.correlate.calculate_EDF_uncertainty(data, ref_ws_col='ref', site_ws_col='site')
Calculate the EDF estimated correlation uncertainty between two wind speed columns. Assumes a correlation forced through the origin.
Parameters: - data: DataFrame
- DataFrame with wind speed columns ref_ws_col and site_ws_col
- ref_ws_col: string, default ‘ref’
- Reference anemometer data column to use.
- site_ws_col: string, default ‘site’
- Site anemometer data column to use.
-
analysis.correlate.calculate_IEC_uncertainty(data, ref_ws_col='ref', site_ws_col='site')
Calculate the IEC correlation uncertainty between two wind speed columns.
Parameters: - data: DataFrame
- DataFrame with wind speed columns ref_ws_col and site_ws_col
- ref_ws_col: string, default ‘ref’
- Reference anemometer data column to use.
- site_ws_col: string, default ‘site’
- Site anemometer data column to use.
-
analysis.correlate.calculate_R2(data, ref_ws_col='ref', site_ws_col='site')
Return a single R2 between two wind speed columns.
Parameters: - data: DataFrame
- DataFrame with wind speed columns ref_ws_col and site_ws_col
- ref_ws_col: string, default ‘ref’
- Reference anemometer data column to use.
- site_ws_col: string, default ‘site’
- Site anemometer data column to use.
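As an illustration of a single R2 between two columns, the sketch below assumes R2 is the squared Pearson correlation coefficient over concurrent valid records. The source does not state the exact definition used, and the helper name `r_squared` is hypothetical.

```python
import numpy as np
import pandas as pd


def r_squared(data, ref_ws_col="ref", site_ws_col="site"):
    """Squared Pearson correlation between two wind speed columns."""
    # Only records where both masts report a value can be compared.
    valid = data[[ref_ws_col, site_ws_col]].dropna()
    return valid[ref_ws_col].corr(valid[site_ws_col]) ** 2


data = pd.DataFrame(
    {
        "ref": [4.0, 6.0, 8.0, 10.0, np.nan],
        "site": [4.4, 6.6, 8.8, 11.0, 9.0],
    }
)
r2 = r_squared(data)
print(r2)
```

The last row is ignored because the reference value is missing; the remaining points lie on a straight line, so the result is 1 to within floating-point error.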
-
analysis.correlate.masts_10_minute(ref_mast, site_mast, ref_ws_col=None, site_ws_col=None, method='ODR', force_through_origin=False)
Calculate the slope and offset between two met masts.
Parameters: - ref_mast: MetMast
- MetMast object
- site_mast: MetMast
- MetMast object
- ref_ws_col: string, default None (primary anemometer assumed)
- Reference anemometer data to use. Extracted from MetMast.data
- site_ws_col: string, default None (primary anemometer assumed)
- Site anemometer data to use. Extracted from MetMast.data
- method: string, default ‘ODR’
Correlation method to use.
- Orthogonal distance regression: ‘ODR’
- Ordinary least squares: ‘OLS’
- Robust linear models: ‘RLM’
- force_through_origin: boolean, default False
- Force the correlation through the origin (offset equal to zero)
Returns: - out: DataFrame
- slope, offset, R2, uncert, points
-
analysis.correlate.masts_10_minute_by_direction(ref_mast, site_mast, ref_ws_col=None, ref_dir_col=None, site_ws_col=None, site_dir_col=None, method='ODR', force_through_origin=False, dir_sectors=16)
Calculate the slope and offset, binned by direction, between two met masts.
Parameters: - ref_mast: MetMast
- MetMast object
- site_mast: MetMast
- MetMast object
- ref_ws_col: string, default None (primary anemometer assumed)
- Reference anemometer data to use. Extracted from MetMast.data
- ref_dir_col: string, default None (primary wind vane assumed)
- Reference wind vane data to use. Extracted from MetMast.data
- site_ws_col: string, default None (primary anemometer assumed)
- Site anemometer data to use. Extracted from MetMast.data
- site_dir_col: string, default None (primary wind vane assumed)
- Site wind vane data to use. Extracted from MetMast.data
- method: string, default ‘ODR’
Correlation method to use.
- Orthogonal distance regression: ‘ODR’
- Ordinary least squares: ‘OLS’
- Robust linear models: ‘RLM’
- dir_sectors: int, default 16
- Number of equally spaced direction sectors
- force_through_origin: boolean, default False
- Force the correlation through the origin (offset equal to zero)
Returns: - out: DataFrame
- slope, offset, R2, uncert, points
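A direction-binned correlation can be sketched by grouping concurrent records into equally spaced sectors and fitting each group separately. The example below is an illustration only: it uses an ordinary least squares fit via numpy.polyfit rather than the default ‘ODR’ method, assumes sector 0 is centered on north, and the function name `correlate_by_direction` and column names `ref`, `site` and `dir` are hypothetical.

```python
import numpy as np
import pandas as pd


def correlate_by_direction(data, dir_sectors=16):
    """Fit site = slope * ref + offset separately in each direction sector."""
    width = 360.0 / dir_sectors
    valid = data.dropna(subset=["ref", "site", "dir"])
    # Assumption: sector 0 is centered on north (0 degrees).
    sector = np.floor(((valid["dir"] + width / 2.0) % 360.0) / width).astype(int)
    rows = {}
    for sector_id, group in valid.groupby(sector):
        if len(group) < 2:
            continue  # a line cannot be fitted to a single point
        slope, offset = np.polyfit(group["ref"], group["site"], 1)
        rows[sector_id] = {"slope": slope, "offset": offset, "points": len(group)}
    return pd.DataFrame.from_dict(rows, orient="index")


ref = np.array([4.0, 6.0, 8.0, 4.0, 6.0, 8.0])
data = pd.DataFrame(
    {
        "ref": ref,
        "dir": [0.0, 5.0, 355.0, 180.0, 175.0, 185.0],
        # Northerly winds speed up at the site; southerly winds slow down.
        "site": np.concatenate([1.2 * ref[:3], 0.9 * ref[3:] + 0.5]),
    }
)
results = correlate_by_direction(data)
print(results)
```

Only the northerly (0) and southerly (8) sectors contain data, so the result has two rows, each recovering the relationship used to build that sector's data.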
-
analysis.correlate.masts_daily(ref_mast, site_mast, ref_ws_col=None, site_ws_col=None, method='ODR', force_through_origin=False, minimum_recovery_rate=0.7)
Calculate the slope and offset for daily data between two met masts.
Parameters: - ref_mast: MetMast
- MetMast object
- site_mast: MetMast
- MetMast object
- ref_ws_col: string, default None (primary anemometer assumed)
- Reference anemometer data to use. Extracted from MetMast.data
- site_ws_col: string, default None (primary anemometer assumed)
- Site anemometer data to use. Extracted from MetMast.data
- method: string, default ‘ODR’
Correlation method to use.
- Orthogonal distance regression: ‘ODR’
- Ordinary least squares: ‘OLS’
- Robust linear models: ‘RLM’
- force_through_origin: boolean, default False
- Force the correlation through the origin (offset equal to zero)
- minimum_recovery_rate: float, default 0.7
- Minimum recovery rate below which resampled data are excluded. For example, by default, resampling 10-minute data to daily averages requires at least 101 of the 144 possible records for a valid daily average.
Returns: - out: DataFrame
- slope, offset, R2, uncert, points
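The recovery-rate rule can be sketched with pandas resampling. This is a minimal illustration assuming 10-minute input data (144 records per day); the helper name `resample_to_daily` and the `records_per_day` parameter are hypothetical and not part of the documented API.

```python
import numpy as np
import pandas as pd


def resample_to_daily(ws, minimum_recovery_rate=0.7, records_per_day=144):
    """Daily mean wind speed, keeping only days with enough valid records."""
    daily_mean = ws.resample("D").mean()
    daily_count = ws.resample("D").count()  # count() ignores NaN records
    # 0.7 * 144 = 100.8, so 101 valid records are needed by default.
    minimum_records = int(np.ceil(minimum_recovery_rate * records_per_day))
    return daily_mean.where(daily_count >= minimum_records)


index = pd.date_range("2020-01-01", periods=288, freq="10min")
ws = pd.Series(8.0, index=index)
# The second day keeps only its first 100 records, one short of the threshold.
ws.iloc[144 + 100:] = np.nan
daily = resample_to_daily(ws)
print(daily)
```

The first day is complete and keeps its average; the second day falls below the threshold and is returned as NaN.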
-
analysis.correlate.masts_daily_by_month(ref_mast, site_mast, ref_ws_col=None, site_ws_col=None, method='ODR', force_through_origin=False, minimum_recovery_rate=0.7)
Calculate the slope and offset for daily data, binned by month, between two met masts.
Parameters: - ref_mast: MetMast
- MetMast object
- site_mast: MetMast
- MetMast object
- ref_ws_col: string, default None (primary anemometer assumed)
- Reference anemometer data to use. Extracted from MetMast.data
- site_ws_col: string, default None (primary anemometer assumed)
- Site anemometer data to use. Extracted from MetMast.data
- method: string, default ‘ODR’
Correlation method to use.
- Orthogonal distance regression: ‘ODR’
- Ordinary least squares: ‘OLS’
- Robust linear models: ‘RLM’
- force_through_origin: boolean, default False
- Force the correlation through the origin (offset equal to zero)
- minimum_recovery_rate: float, default 0.7
- Minimum recovery rate below which resampled data are excluded. For example, by default, resampling 10-minute data to daily averages requires at least 101 of the 144 possible records for a valid daily average.
Returns: - out: DataFrame
- slope, offset, R2, uncert, points for each month
-
analysis.correlate.return_correlation_data_from_masts(ref_mast, site_mast)
Return a DataFrame of reference and site data for correlations. The data are extracted from each MetMast object using the primary anemometers and wind vanes.
Parameters: - ref_mast: MetMast
- Anemoi MetMast object
- site_mast: MetMast
- Anemoi MetMast object
Returns: out: DataFrame with columns ref, site, and dir
-
analysis.correlate.valid_ws_correlation_data(data, ref_ws_col='ref', site_ws_col='site')
Perform checks on wind speed correlation data.
Parameters: - data: DataFrame
- DataFrame with wind speed columns ref_ws_col and site_ws_col
- ref_ws_col: string, default ‘ref’
- Reference anemometer data column to use.
- site_ws_col: string, default ‘site’
- Site anemometer data column to use.
-
analysis.correlate.ws_correlation_binned_by_direction(data, ref_ws_col='ref', site_ws_col='site', ref_dir_col='dir', dir_sectors=16, method='ODR', force_through_origin=False)
Calculate the slope and offset, binned by direction, between two wind speed columns.
Parameters: - data: DataFrame
- DataFrame with wind speed columns ref_ws_col and site_ws_col, and direction column ref_dir_col
- ref_ws_col: string, default ‘ref’
- Reference anemometer data column to use.
- site_ws_col: string, default ‘site’
- Site anemometer data column to use.
- ref_dir_col: string, default ‘dir’
- Reference wind vane data column to use.
- dir_sectors: int, default 16
- Number of equally spaced direction sectors
- method: string, default ‘ODR’
Correlation method to use.
- Orthogonal distance regression: ‘ODR’
- Ordinary least squares: ‘OLS’
- Robust linear models: ‘RLM’
- force_through_origin: boolean, default False
- Force the correlation through the origin (offset equal to zero)
Returns: - out: DataFrame
- slope, offset, R2, uncert, points
-
analysis.correlate.ws_correlation_binned_by_month(data, ref_ws_col='ref', site_ws_col='site', method='ODR', force_through_origin=False)
Calculate the slope and offset, binned by month, between two wind speed columns.
Parameters: - data: DataFrame
- DataFrame with wind speed columns ref_ws_col and site_ws_col
- ref_ws_col: string, default ‘ref’
- Reference anemometer data column to use.
- site_ws_col: string, default ‘site’
- Site anemometer data column to use.
- method: string, default ‘ODR’
Correlation method to use.
- Orthogonal distance regression: ‘ODR’
- Ordinary least squares: ‘OLS’
- Robust linear models: ‘RLM’
- force_through_origin: boolean, default False
- Force the correlation through the origin (offset equal to zero)
Returns: - out: DataFrame
- slope, offset, R2, uncert, points
-
analysis.correlate.ws_correlation_least_squares_model(data, ref_ws_col='ref', site_ws_col='site', force_through_origin=False)
Calculate the slope and offset between two wind speed columns using ordinary least squares regression.
https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.linalg.lstsq.html
Parameters: - data: DataFrame
- DataFrame with wind speed columns ref_ws_col and site_ws_col
- ref_ws_col: string, default ‘ref’
- Reference anemometer data column to use.
- site_ws_col: string, default ‘site’
- Site anemometer data column to use.
- force_through_origin: boolean, default False
- Force the correlation through the origin (offset equal to zero)
Returns: - out: DataFrame
- slope, offset, R2, uncert, points
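The least squares fit referenced above can be sketched directly with numpy.linalg.lstsq. This is an illustration of the technique, not the library's code; the helper name `least_squares_fit` is hypothetical. Forcing the fit through the origin amounts to dropping the column of ones from the design matrix.

```python
import numpy as np


def least_squares_fit(ref, site, force_through_origin=False):
    """Ordinary least squares fit of site = slope * ref + offset."""
    ref = np.asarray(ref, dtype=float)
    site = np.asarray(site, dtype=float)
    if force_through_origin:
        # Single column: only the slope is estimated, the offset is fixed at zero.
        design = ref[:, np.newaxis]
    else:
        # Second column of ones lets the solver estimate the offset.
        design = np.column_stack([ref, np.ones_like(ref)])
    coefficients, _, _, _ = np.linalg.lstsq(design, site, rcond=None)
    slope = coefficients[0]
    offset = 0.0 if force_through_origin else coefficients[1]
    return slope, offset


ref = np.array([2.0, 4.0, 6.0, 8.0])
site = 1.1 * ref + 0.3
slope, offset = least_squares_fit(ref, site)
slope_origin, offset_origin = least_squares_fit(ref, site, force_through_origin=True)
print(slope, offset)
print(slope_origin, offset_origin)
```

Because the example data have a non-zero offset, the forced-through-origin slope is steeper than the unconstrained one: the slope absorbs the offset that the model is no longer allowed to fit.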
-
analysis.correlate.ws_correlation_method(data, ref_ws_col='ref', site_ws_col='site', method='ODR', force_through_origin=False)
Calculate the slope and offset, for a given correlation method, between two wind speed columns.
Parameters: - data: DataFrame
- DataFrame with wind speed columns ref_ws_col and site_ws_col
- ref_ws_col: string, default ‘ref’
- Reference anemometer data column to use.
- site_ws_col: string, default ‘site’
- Site anemometer data column to use.
- method: string, default ‘ODR’
Correlation method to use.
- Orthogonal distance regression: ‘ODR’
- Ordinary least squares: ‘OLS’
- Robust linear models: ‘RLM’
- force_through_origin: boolean, default False
- Force the correlation through the origin (offset equal to zero)
Returns: - out: DataFrame
- slope, offset, R2, uncert, points
-
analysis.correlate.ws_correlation_orthoginal_distance_model(data, ref_ws_col='ref', site_ws_col='site', force_through_origin=False)
Calculate the slope and offset between two wind speed columns using orthogonal distance regression.
https://docs.scipy.org/doc/scipy-0.18.1/reference/odr.html
Parameters: - data: DataFrame
- DataFrame with wind speed columns ref_ws_col and site_ws_col
- ref_ws_col: string, default ‘ref’
- Reference anemometer data column to use.
- site_ws_col: string, default ‘site’
- Site anemometer data column to use.
- force_through_origin: boolean, default False
- Force the correlation through the origin (offset equal to zero)
Returns: - out: DataFrame
- slope, offset, R2, uncert, points
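For a straight line with equally weighted errors in both variables, orthogonal regression has a closed-form solution, which gives a feel for what this method computes. The sketch below uses that closed form with NumPy only; it is not a call to scipy.odr and is not necessarily how the library implements the fit. The helper name `orthogonal_fit` is hypothetical.

```python
import numpy as np


def orthogonal_fit(ref, site, force_through_origin=False):
    """Fit site = slope * ref + offset by minimizing perpendicular distances."""
    ref = np.asarray(ref, dtype=float)
    site = np.asarray(site, dtype=float)
    if force_through_origin:
        ref_mean, site_mean = 0.0, 0.0
    else:
        ref_mean, site_mean = ref.mean(), site.mean()
    # Second moments about the chosen center (the origin or the data mean).
    s_xx = np.sum((ref - ref_mean) ** 2)
    s_yy = np.sum((site - site_mean) ** 2)
    s_xy = np.sum((ref - ref_mean) * (site - site_mean))
    # Closed-form slope of the line with the smallest summed squared
    # perpendicular distance; requires s_xy != 0.
    slope = (s_yy - s_xx + np.sqrt((s_yy - s_xx) ** 2 + 4.0 * s_xy ** 2)) / (2.0 * s_xy)
    offset = site_mean - slope * ref_mean
    return slope, offset


ref = np.array([1.0, 2.0, 3.0, 4.0])
site = np.array([1.2, 1.9, 3.2, 3.9])
odr_slope, odr_offset = orthogonal_fit(ref, site)
ols_slope, ols_offset = np.polyfit(ref, site, 1)
print(odr_slope, odr_offset)
print(ols_slope, ols_offset)
```

Unlike ordinary least squares, which attributes all scatter to the site measurements, the orthogonal fit treats scatter in both masts symmetrically, so the two slopes differ slightly on noisy data.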
-
analysis.correlate.ws_correlation_robust_linear_model(data, ref_ws_col='ref', site_ws_col='site', force_through_origin=False)
Calculate the slope and offset between two wind speed columns using a robust linear model.
http://www.statsmodels.org/dev/rlm.html
Parameters: - data: DataFrame
- DataFrame with wind speed columns ref_ws_col and site_ws_col
- ref_ws_col: string, default ‘ref’
- Reference anemometer data column to use.
- site_ws_col: string, default ‘site’
- Site anemometer data column to use.
- force_through_origin: boolean, default False
- Force the correlation through the origin (offset equal to zero)
Returns: - out: DataFrame
- slope, offset, R2, uncert, points
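To show what a robust linear model does differently from least squares, the sketch below implements iteratively reweighted least squares with Huber weights in NumPy. It is a simplified stand-in for illustration, not the statsmodels implementation linked above: the tuning constant 1.345, the MAD-based scale estimate, and the helper name `huber_irls_fit` are assumptions.

```python
import numpy as np


def huber_irls_fit(ref, site, tuning=1.345, iterations=50):
    """Robust fit of site = slope * ref + offset using Huber weights."""
    ref = np.asarray(ref, dtype=float)
    site = np.asarray(site, dtype=float)
    design = np.column_stack([ref, np.ones_like(ref)])
    weights = np.ones_like(site)
    for _ in range(iterations):
        # Weighted least squares: scale rows by the square root of the weights.
        root_w = np.sqrt(weights)
        coefficients, _, _, _ = np.linalg.lstsq(
            design * root_w[:, np.newaxis], site * root_w, rcond=None
        )
        residuals = site - design @ coefficients
        # Robust scale estimate: normalized median absolute deviation.
        scale = np.median(np.abs(residuals - np.median(residuals))) / 0.6745
        if scale == 0:
            break  # most points are fitted exactly; nothing left to reweight
        scaled = np.abs(residuals) / scale
        # Huber weights: 1 for small residuals, tuning / |r| for large ones.
        weights = tuning / np.maximum(scaled, tuning)
    slope, offset = coefficients
    return slope, offset


ref = np.arange(1.0, 11.0)
site = 1.1 * ref + 0.3
site[4] += 10.0  # one erroneous record
robust_slope, robust_offset = huber_irls_fit(ref, site)
ols_slope, ols_offset = np.polyfit(ref, site, 1)
print(robust_slope, robust_offset)
print(ols_slope, ols_offset)
```

The single bad record pulls the ordinary least squares line away from the underlying relationship, while the reweighting steadily reduces that record's influence on the robust fit.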