
Based on issue #127: anfrek: Gumbel

Issue References:

Issue Description:

  • Find extreme values for a given return period. This can be applied to design rainfall or design flood discharge.

Issue Discussion:

  • #156 - How can the return period of a distribution (frequency analysis) be computed without looking up a table?

Strategy:

  • Follow the Log Pearson functions from #126, as in the manual.

PREPARATION AND DATASET

In [1]:
import numpy as np
import pandas as pd
from scipy import stats
In [2]:
# sample data taken from the book
# hidrologi: Aplikasi Metode Statistik untuk Analisa Data, p. 125

_DEBIT = [
    244, 217, 285, 261, 295, 252, 275, 204, 208, 194, 256, 207, 354, 445, 
    350, 336, 328, 269, 323, 364, 247, 290, 302, 301, 284, 276, 261, 303, 
    335, 320
]

_TAHUN = list(range(1918, 1935)) + list(range(1973, 1986))

data = pd.DataFrame(
    data=np.stack([_TAHUN, _DEBIT], axis=1),
    columns=['tahun', 'debit']
)
data.tahun = pd.to_datetime(data.tahun, format='%Y')
data.set_index('tahun', inplace=True)
data
Out[2]:
debit
tahun
1918-01-01 244
1919-01-01 217
1920-01-01 285
1921-01-01 261
1922-01-01 295
1923-01-01 252
1924-01-01 275
1925-01-01 204
1926-01-01 208
1927-01-01 194
1928-01-01 256
1929-01-01 207
1930-01-01 354
1931-01-01 445
1932-01-01 350
1933-01-01 336
1934-01-01 328
1973-01-01 269
1974-01-01 323
1975-01-01 364
1976-01-01 247
1977-01-01 290
1978-01-01 302
1979-01-01 301
1980-01-01 284
1981-01-01 276
1982-01-01 261
1983-01-01 303
1984-01-01 335
1985-01-01 320

TABLES

Module hk127 uses 3 tables:

  • t_gumbel_gb: Table of $\bar{y}_N$ (yn) and $\sigma_N$ (sn) values from Table 6.2.3 Means and Standard Deviations of Reduced Extremes, p. 228. Source: Statistics of Extremes by Gumbel, E. J. (2012)
  • t_gumbel_sw: Table of $Y_n$ (yn) and $S_n$ (sn) values from Table 3.11A (relationship between the reduced-variate mean $Y_n$ and the number of data points $n$) and Table 3.11B (relationship between the reduced-variate standard deviation and the number of data points). Source: hidrologi: Aplikasi Metode Statistik untuk Analisa Data by Soewarno. (1995)
  • t_gumbel_st: Table of $Y_n$ (yn) and $S_n$ (sn) values from Table 12.1, $Y_n$ and $S_n$ Gumbel. Source: Rekayasa Statistika untuk Teknik Pengairan by Soetopo, W., Montarcih, L., Press, U. B., & Media, U. (2017).

By default, module hk127 takes $Y_n$ and $S_n$ from table t_gumbel_gb. Take care when using $Y_n$ and $S_n$ values from another source.

Note: the Gumbel book used as the source is the 1957 edition.
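
The tabulated values can also be approximated without a lookup table. The sketch below assumes the tables follow Gumbel's plotting-position construction: the reduced variates $y_i = -\ln(-\ln(i/(n+1)))$ for $i = 1, \dots, n$ are averaged to give $Y_n$, and their population standard deviation gives $S_n$. The helper name reduced_stats is hypothetical and is not part of hk127.

```python
import numpy as np

def reduced_stats(n):
    """Approximate (Yn, Sn) for sample size n from plotting positions."""
    i = np.arange(1, n + 1)
    p = i / (n + 1)                 # plotting position i / (n + 1)
    y = -np.log(-np.log(p))         # Gumbel reduced variate
    # ddof=0 (population standard deviation) is an assumption of this sketch
    return y.mean(), y.std(ddof=0)

yn, sn = reduced_stats(10)
print(round(yn, 4), round(sn, 4))
```

For n = 10 the result lands close to the tabulated 0.4952 and 0.9497, which suggests the tables can be cross-checked this way before trusting a hand-typed row.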

In [3]:
# table from Gumbel
# Statistics of Extremes by Gumbel, p. 228

# CODE: GB

_DATA_GB = [
    [0.48430, 0.90430],
    [0.49020, 0.92880],
    [0.49520, 0.94970],
    [0.49960, 0.96760],
    [0.50350, 0.98330],
    [0.50700, 0.99720],
    [0.51000, 1.00950],
    [0.51280, 1.02057],
    [0.51570, 1.03160],
    [0.51810, 1.04110],
    [0.52020, 1.04930],
    [0.52200, 1.05660],
    [0.52355, 1.06283],
    [0.52520, 1.06960],
    [0.52680, 1.07540],
    [0.52830, 1.08110],
    [0.52960, 1.08640],
    [0.53086, 1.09145],
    [0.53200, 1.09610],
    [0.53320, 1.10040],
    [0.53430, 1.10470],
    [0.53530, 1.10860],
    [0.53622, 1.11238],
    [0.53710, 1.11590],
    [0.53800, 1.11930],
    [0.53880, 1.12260],
    [0.53960, 1.12550],
    [0.54034, 1.12847],
    [0.54100, 1.13130],
    [0.54180, 1.13390],
    [0.54240, 1.13630],
    [0.54300, 1.13880],
    [0.54362, 1.14132],
    [0.54420, 1.14360],
    [0.54480, 1.14580],
    [0.54530, 1.14800],
    [0.54580, 1.14990],
    [0.54630, 1.15185],
    [0.54680, 1.15380],
    [0.54730, 1.15570],
    [0.54770, 1.15740],
    [0.54810, 1.15900],
    [0.54854, 1.16066],
    [0.54890, 1.16230],
    [0.54930, 1.16380],
    [0.54970, 1.16530],
    [0.55010, 1.16670],
    [0.55040, 1.16810],
    [0.55080, 1.16960],
    [0.55110, 1.17080],
    [0.55150, 1.17210],
    [0.55180, 1.17340],
    [0.55208, 1.17467],
    [0.55270, 1.17700],
    [0.55330, 1.17930],
    [0.55380, 1.18140],
    [0.55430, 1.18340],
    [0.55477, 1.18536],
    [0.55520, 1.18730],
    [0.55570, 1.18900],
    [0.55610, 1.19060],
    [0.55650, 1.19230],
    [0.55688, 1.19382],
    [0.55720, 1.19530],
    [0.55760, 1.19670],
    [0.55800, 1.19800],
    [0.55830, 1.19940],
    [0.55860, 1.20073],
    [0.55890, 1.20200],
    [0.55920, 1.20320],
    [0.55950, 1.20440],
    [0.55980, 1.20550],
    [0.56002, 1.20649],
    [0.56461, 1.22534],
    [0.56715, 1.23598],
    [0.56878, 1.24292],
    [0.56993, 1.24786],
    [0.57144, 1.25450],
    [0.57240, 1.25880],
    [0.57377, 1.26506],
    [0.57450, 1.26851]
]

_INDEX_GB = (
    list(range(8, 61)) +
    list(range(62, 101, 2)) +
    list(range(150, 301, 50)) +
    [400, 500, 750, 1000]
)

_COL_GB = ['yn', 'sn']

t_gumbel_gb = pd.DataFrame(
    data=_DATA_GB, index=_INDEX_GB, columns=_COL_GB
)
t_gumbel_gb
Out[3]:
yn sn
8 0.48430 0.90430
9 0.49020 0.92880
10 0.49520 0.94970
11 0.49960 0.96760
12 0.50350 0.98330
... ... ...
300 0.56993 1.24786
400 0.57144 1.25450
500 0.57240 1.25880
750 0.57377 1.26506
1000 0.57450 1.26851

81 rows × 2 columns

In [4]:
# table from Soewarno
# Tables 3.11A & 3.11B, pp. 129-130

# CODE: SW

_DATA_SW = [
    [0.4592, 0.9496],
    [0.4996, 0.9676],
    [0.5053, 0.9933],
    [0.5070, 0.9971],
    [0.5100, 1.0095],
    [0.5128, 1.0206],
    [0.5157, 1.0316],
    [0.5181, 1.0411],
    [0.5202, 1.0493],
    [0.5220, 1.0565],
    [0.5236, 1.0628],
    [0.5252, 1.0696],
    [0.5268, 1.0754],
    [0.5283, 1.0811],
    [0.5296, 1.0864],
    [0.5309, 1.0915],
    [0.5320, 1.1961],
    [0.5332, 1.1004],
    [0.5343, 1.1047],
    [0.5353, 1.1086],
    [0.5362, 1.1124],
    [0.5371, 1.1159],
    [0.5380, 1.1193],
    [0.5388, 1.1226],
    [0.5396, 1.1255],
    [0.5402, 1.1285],
    [0.5410, 1.1313],
    [0.5418, 1.1339],
    [0.5424, 1.1363],
    [0.5430, 1.1388],
    [0.5436, 1.1413],
    [0.5442, 1.1436],
    [0.5448, 1.1458],
    [0.5453, 1.1480],
    [0.5458, 1.1499],
    [0.5463, 1.1519],
    [0.5468, 1.1538],
    [0.5473, 1.1557],
    [0.5477, 1.1574],
    [0.5481, 1.1590],
    [0.5485, 1.1607],
    [0.5489, 1.1623],
    [0.5493, 1.1638],
    [0.5497, 1.1658],
    [0.5501, 1.1667],
    [0.5504, 1.1681],
    [0.5508, 1.1696],
    [0.5511, 1.1708],
    [0.5518, 1.1721],
    [0.5518, 1.1734],
    [0.5521, 1.1747],
    [0.5524, 1.1759],
    [0.5527, 1.1770],
    [0.5530, 1.1782],
    [0.5533, 1.1793],
    [0.5535, 1.1803],
    [0.5538, 1.1814],
    [0.5540, 1.1824],
    [0.5543, 1.1834],
    [0.5545, 1.1844],
    [0.5548, 1.1854],
    [0.5550, 1.1863],
    [0.5552, 1.1873],
    [0.5555, 1.1881],
    [0.5557, 1.1890],
    [0.5559, 1.1898],
    [0.5561, 1.1906],
    [0.5563, 1.1915],
    [0.5565, 1.1923],
    [0.5567, 1.1930],
    [0.5569, 1.1938],
    [0.5570, 1.1945],
    [0.5572, 1.1953],
    [0.5574, 1.1959],
    [0.5576, 1.1967],
    [0.5578, 1.1973],
    [0.5580, 1.1980],
    [0.5581, 1.1987],
    [0.5583, 1.1994],
    [0.5585, 1.2001],
    [0.5586, 1.2007],
    [0.5587, 1.2013],
    [0.5589, 1.2020],
    [0.5591, 1.2026],
    [0.5592, 1.2032],
    [0.5593, 1.2038],
    [0.5595, 1.2044],
    [0.5596, 1.2049],
    [0.5598, 1.2055],
    [0.5599, 1.2060],
    [0.5600, 1.2065],
]

_INDEX_SW = list(range(10, 101))

_COL_SW = ['yn', 'sn']

t_gumbel_sw = pd.DataFrame(
    data=_DATA_SW, index=_INDEX_SW, columns=_COL_SW
)
t_gumbel_sw
Out[4]:
yn sn
10 0.4592 0.9496
11 0.4996 0.9676
12 0.5053 0.9933
13 0.5070 0.9971
14 0.5100 1.0095
... ... ...
96 0.5595 1.2044
97 0.5596 1.2049
98 0.5598 1.2055
99 0.5599 1.2060
100 0.5600 1.2065

91 rows × 2 columns

In [5]:
# table from Soetopo, p. 98
# Table 12.1 Yn and Sn Gumbel

# CODE: ST

_DATA_ST = [
    [0.4843, 0.9043],
    [0.4902, 0.9288],
    [0.4952, 0.9497],
    [0.4996, 0.9676],
    [0.5035, 0.9833],
    [0.5070, 0.9972],
    [0.5100, 1.0095],
    [0.5128, 1.0205],
    [0.5157, 1.0316],
    [0.5181, 1.0411],
    [0.5202, 1.0493],
    [0.5220, 1.0566],
    [0.5235, 1.0628],
    [0.5252, 1.0696],
    [0.5268, 1.0754],
    [0.5283, 1.0811],
    [0.5296, 1.0864],
    [0.5309, 1.0915],
    [0.5320, 1.0961],
    [0.5332, 1.1004],
    [0.5343, 1.1047],
    [0.5353, 1.1086],
    [0.5362, 1.1124],
    [0.5371, 1.1159],
    [0.5380, 1.1193],
    [0.5388, 1.1226],
    [0.5396, 1.1255],
    [0.5402, 1.1285],
    [0.5410, 1.1313],
    [0.5418, 1.1339],
    [0.5424, 1.1363],
    [0.5430, 1.1388],
    [0.5436, 1.1413],
    [0.5442, 1.1436],
    [0.5448, 1.1458],
    [0.5453, 1.1480],
    [0.5458, 1.1499],
    [0.5463, 1.1519],
    [0.5468, 1.1538],
    [0.5473, 1.1557],
    [0.5477, 1.1574],
    [0.5481, 1.1590],
    [0.5485, 1.1607],
    [0.5489, 1.1623],
    [0.5493, 1.1638],
    [0.5497, 1.1658],
    [0.5501, 1.1667],
    [0.5504, 1.1681],
    [0.5508, 1.1696],
    [0.5511, 1.1708],
    [0.5515, 1.1721],
    [0.5518, 1.1734],
    [0.5521, 1.1747],
    [0.5524, 1.1759],
    [0.5527, 1.1770],
    [0.5530, 1.1782],
    [0.5533, 1.1793],
    [0.5535, 1.1803],
    [0.5538, 1.1814],
    [0.5540, 1.1824],
    [0.5543, 1.1834],
    [0.5545, 1.1844],
    [0.5548, 1.1854],
    [0.5550, 1.1863],
    [0.5552, 1.1873],
    [0.5555, 1.1881],
    [0.5557, 1.1890],
    [0.5559, 1.1898],
    [0.5561, 1.1906],
    [0.5563, 1.1915],
    [0.5565, 1.1923],
    [0.5567, 1.1930],
    [0.5569, 1.1938],
    [0.5570, 1.1945],
    [0.5572, 1.1953],
    [0.5574, 1.1959],
    [0.5576, 1.1967],
    [0.5578, 1.1973],
    [0.5580, 1.1980],
    [0.5581, 1.1987],
    [0.5583, 1.1994],
    [0.5585, 1.2001],
    [0.5586, 1.2007],
    [0.5587, 1.2013],
    [0.5589, 1.2020],
    [0.5591, 1.2026],
    [0.5592, 1.2032],
    [0.5593, 1.2038],
    [0.5595, 1.2044],
    [0.5596, 1.2049],
    [0.5598, 1.2055],
    [0.5599, 1.2060],
    [0.5600, 1.2065],
]

_INDEX_ST = list(range(8, 101))

_COL_ST = ['yn', 'sn']

t_gumbel_st = pd.DataFrame(
    data=_DATA_ST, index=_INDEX_ST, columns=_COL_ST
)
t_gumbel_st
Out[5]:
yn sn
8 0.4843 0.9043
9 0.4902 0.9288
10 0.4952 0.9497
11 0.4996 0.9676
12 0.5035 0.9833
... ... ...
96 0.5595 1.2044
97 0.5596 1.2049
98 0.5598 1.2055
99 0.5599 1.2060
100 0.5600 1.2065

93 rows × 2 columns

CODE

In [6]:
def _find_in_table(val, table, y_col=None, x_col=None):
    x = table.index if x_col is None else table[x_col]
    y = table.iloc[:, 0] if y_col is None else table[y_col]
    return np.interp(val, x, y)

def _find_Yn_Sn(n, table):
    yn = _find_in_table(n, table, y_col='yn')
    sn = _find_in_table(n, table, y_col='sn')
    return yn, sn
In [7]:
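def find_coef(n, source='gumbel'):
    # look up (Yn, Sn) in the table that matches the requested source
    if source.lower() == 'gumbel':
        return _find_Yn_Sn(n, t_gumbel_gb)
    if source.lower() == 'soewarno':
        return _find_Yn_Sn(n, t_gumbel_sw)
    if source.lower() == 'soetopo':
        return _find_Yn_Sn(n, t_gumbel_st)
    # an unknown source would otherwise silently return None
    raise ValueError(f'unknown source: {source!r}')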

def calc_K(n, return_period, source='gumbel', show_stat=False):
    return_period = np.array(return_period)

    if source.lower() == 'scipy':
        # todo: the formula for the probability has not been confirmed yet
        prob = 1 - 1/return_period
        # prob = 1 - np.log(return_period/(return_period-1))
        return stats.gumbel_r.ppf(prob)
    elif source.lower() == 'powell':
        return -np.sqrt(6)/np.pi *(np.euler_gamma+np.log(np.log(return_period/(return_period-1))))
    else:
        # Soewarno's book states that T >= 20 uses ln(T),
        # but its worked example does not follow that formula,
        # so only the general formula is used here.
        # if source.lower() == 'soewarno':
        #     yt = []
        #     for t in return_period:
        #         if t <= 20:
        #             yt += [-np.log(-np.log((t - 1)/t))]
        #         else:
        #             yt += [np.log(t)]
        #     yt = np.array(yt)
        # else:
        #     yt = -np.log(-np.log((return_period - 1)/return_period))

        yn, sn = find_coef(n, source=source)
        yt = -np.log(-np.log((return_period - 1)/return_period))
        K = (yt - yn) / sn

        if show_stat:
            print(f'y_n = {yn}')
            print(f's_n = {sn}')
            print(f'y_t = {yt}')
        return K

def calc_x_gumbel(x, return_period=[5], source='gumbel', show_stat=False):

    x_mean = np.mean(x)
    x_std = np.std(x, ddof=1)
    n = len(x)

    k = calc_K(n, return_period, source=source, show_stat=show_stat)

    if show_stat:
        print(f'x_mean = {x_mean:.5f}')
        print(f'x_std = {x_std:.5f}')
        print(f'k = {k}')
    
    val_x = x_mean + k * x_std
    return val_x

def freq_gumbel(
    df, col=None,
    return_period=[2, 5, 10, 20, 25, 50, 100], source='gumbel', show_stat=False,
    col_name='Gumbel', index_name='Kala Ulang'):

    col = df.columns[0] if col is None else col

    x = df[col].copy()

    arr = calc_x_gumbel(
        x, return_period=return_period, show_stat=show_stat,
        source=source
    )

    result = pd.DataFrame(
        data=arr, index=return_period, columns=[col_name]
    )

    result.index.name = index_name
    return result
In [8]:
def _calc_T(P):
    return 1 / (1-np.exp(-np.exp(-P)))

def _calc_prob_from_table(k, n, source='gumbel'):
    yn, sn = find_coef(n, source=source)
    P = k * sn + yn
    T = _calc_T(P)
    return np.around(1-1/T, 3)

def calc_prob(k, n, source='gumbel'):
    if source.lower() == 'gumbel':
        return _calc_prob_from_table(k, n, source=source)
    if source.lower() == 'soewarno':
        return _calc_prob_from_table(k, n, source=source)
    if source.lower() == 'soetopo':
        return _calc_prob_from_table(k, n, source=source)
    if source.lower() == 'scipy':
        return stats.gumbel_r.cdf(k)
    if source.lower() == 'powell':
        # this equation was found using Wolfram Alpha
        # x = e^(e^(-(π K)/sqrt(6) - p))/(e^(e^(-(π K)/sqrt(6) - p)) - 1)
        _top = np.exp(np.exp(-(np.pi*k)/np.sqrt(6)-np.euler_gamma))
        _bot = _top - 1
        T = _top / _bot
        return 1-1/T

FUNCTIONS

find_coef(n, ...) function

Function: find_coef(n, source='gumbel')

The find_coef(...) function looks up $Y_n$ and $S_n$ from the selected source for a given $n$, the number of data points.

  • Positional arguments:
    • n: number of data points.
  • Optional arguments:
    • source: source of the $Y_n$ and $S_n$ values, 'gumbel' (default). Other available sources: Soewarno ('soewarno'), Soetopo ('soetopo').

Note that the valid range of the number of data points $n$ differs between sources.

  • For 'gumbel' the range is $[8, \infty)$, but the table only goes up to $1000$.
  • For 'soewarno' the range is $[10, 100]$.
  • For 'soetopo' the range is $[8, 100]$, matching the index of t_gumbel_st.
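
The lookup relies on np.interp, so two behaviours are worth keeping in mind: values of $n$ that fall between tabulated rows are linearly interpolated, and values outside the table are clamped to the nearest end row rather than raising an error. A minimal sketch, using four rows copied from t_gumbel_gb (the variable names here are illustrative only):

```python
import numpy as np
import pandas as pd

# Four rows copied from t_gumbel_gb; only the neighbouring pairs
# (8, 9) and (60, 62) are meaningful for the queries below.
table = pd.DataFrame(
    {'yn': [0.48430, 0.49020, 0.55208, 0.55270],
     'sn': [0.90430, 0.92880, 1.17467, 1.17700]},
    index=[8, 9, 60, 62],
)

# n = 61 is not tabulated: linear interpolation between n = 60 and n = 62
yn_61 = np.interp(61, table.index, table['yn'])

# n = 5 is below the table: np.interp returns the first row's value
yn_5 = np.interp(5, table.index, table['yn'])

print(yn_61, yn_5)
```

Because of the clamping, a call such as find_coef(5) quietly returns the coefficients for the smallest tabulated $n$, so checking $n$ against the ranges listed above is left to the caller.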
In [9]:
find_coef(10)
Out[9]:
(0.4952, 0.9497)
In [10]:
find_coef(30, source='soetopo')  # uses the table from Soetopo
Out[10]:
(0.5362, 1.1124)
In [11]:
# comparison between the sources

_n = 25
source_test = ['gumbel', 'soewarno', 'soetopo']

for _source in source_test:
    print(f'Yn, Sn {_source:10}= {find_coef(_n, source=_source)}')
Yn, Sn gumbel    = (0.53086, 1.09145)
Yn, Sn soewarno  = (0.5309, 1.0915)
Yn, Sn soetopo   = (0.5309, 1.0915)

calc_K(n, return_period, ...) function

Function: calc_K(n, return_period, source='gumbel', show_stat=False)

The calc_K(...) function computes the frequency factor $K$, which is used to compute $X$ for a given return period.

  • Positional arguments:
    • n: number of data points.
    • return_period: return period. Either a scalar or array-like.
  • Optional arguments:
    • source: source of the $Y_n$ and $S_n$ values (for 'gumbel', 'soewarno', and 'soetopo'), 'gumbel' (default). Other available sources: Soewarno ('soewarno'), Soetopo ('soetopo'), the stats.gumbel_r.ppf function from SciPy ('scipy'), and Powell's method ('powell').
    • show_stat: print the statistical parameters. False (default).

Note: For the table-based sources, $K = (Y_T - Y_n)/S_n$ with $Y_T = -\ln\left(-\ln\frac{T-1}{T}\right)$. Powell's method ('powell') uses the equation $$K = - \frac{\sqrt{6}}{\pi} \left( \gamma + \ln{\ln\left({\frac{T}{T-1}}\right)}\right)$$ where $\gamma \approx 0.5772$ (np.euler_gamma) is the Euler–Mascheroni constant.
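
Powell's equation can be read as the standard Gumbel quantile standardised by the distribution's own mean ($\gamma$) and standard deviation ($\pi/\sqrt{6}$). The sketch below checks that numerically; it is an illustration rather than part of hk127. Note that calc_K(..., source='scipy') returns the quantile from stats.gumbel_r.ppf directly, without this standardisation.

```python
import numpy as np
from scipy import stats

T = np.array([2, 5, 10, 20, 25, 50, 100], dtype=float)

# Powell frequency factor, written exactly as in the equation above
k_powell = -np.sqrt(6) / np.pi * (
    np.euler_gamma + np.log(np.log(T / (T - 1)))
)

# reduced variate: quantile of the standard Gumbel (loc=0, scale=1)
y_t = stats.gumbel_r.ppf(1 - 1 / T)

# standardise with the standard Gumbel mean and standard deviation
k_from_scipy = (y_t - np.euler_gamma) / (np.pi / np.sqrt(6))

print(np.allclose(k_powell, k_from_scipy))
```

The two arrays agree, so the 'powell' source corresponds to the large-sample limit in which $Y_n \to \gamma$ and $S_n \to \pi/\sqrt{6}$.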

In [12]:
calc_K(10, 10)
Out[12]:
1.8481281744892548
In [13]:
calc_K(10, [10, 20, 50], source='soetopo')
Out[13]:
array([1.84812817, 2.60608113, 3.58717348])
In [14]:
calc_K(10, [10, 20, 50], source='soewarno', show_stat=True)
y_n = 0.4592
s_n = 0.9496
y_t = [2.25036733 2.97019525 3.90193866]
Out[14]:
array([1.8862335 , 2.64426627, 3.62546194])

calc_x_gumbel(x, ...) function

Function: calc_x_gumbel(x, return_period=[5], source='gumbel', show_stat=False)

The calc_x_gumbel(...) function computes the magnitude $X$ for the given return periods and returns the result as a numpy.array.

  • Positional arguments:
    • x: array.
  • Optional arguments:
    • return_period: return period (years). [5] (default).
    • source: source of the $K$ values, 'gumbel' (default). Other available sources: Soewarno ('soewarno'), Soetopo ('soetopo'), the stats.gumbel_r.ppf function from SciPy ('scipy'), and Powell's method ('powell').
    • show_stat: print the statistical parameters. False (default).
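
The computation behind calc_x_gumbel(...) is $X_T = \bar{x} + K s$, with $s$ the sample standard deviation (ddof=1). The sketch below spells out the steps for $T = 5$ with the notebook's discharge data; the $Y_n$ and $S_n$ values are hard-coded from the t_gumbel_gb row for $n = 30$ so that the block runs on its own.

```python
import numpy as np

debit = [
    244, 217, 285, 261, 295, 252, 275, 204, 208, 194, 256, 207, 354, 445,
    350, 336, 328, 269, 323, 364, 247, 290, 302, 301, 284, 276, 261, 303,
    335, 320,
]

x = np.asarray(debit, dtype=float)
x_mean = x.mean()
x_std = x.std(ddof=1)            # sample standard deviation

yn, sn = 0.53622, 1.11238        # t_gumbel_gb row for n = 30
T = 5
y_t = -np.log(-np.log((T - 1) / T))   # reduced variate for return period T
k = (y_t - yn) / sn                   # frequency factor
x_t = x_mean + k * x_std              # design value for return period T

print(f'k = {k:.5f}, x_t = {x_t:.3f}')
```

The result should agree with the value returned by calc_x_gumbel(data.debit) for the default return period.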
In [15]:
calc_x_gumbel(data.debit)
Out[15]:
array([334.33496634])
In [16]:
calc_x_gumbel(data.debit, show_stat=True)
y_n = 0.53622
s_n = 1.11238
y_t = [1.49993999]
x_mean = 286.20000
x_std = 55.56009
k = [0.86635861]
Out[16]:
array([334.33496634])
In [17]:
calc_x_gumbel(data.debit, return_period=[5, 10, 15, 20, 21], show_stat=True)
y_n = 0.53622
s_n = 1.11238
y_t = [1.49993999 2.25036733 2.67375209 2.97019525 3.02022654]
x_mean = 286.20000
x_std = 55.56009
k = [0.86635861 1.5409728  1.92158443 2.18807894 2.23305574]
Out[17]:
array([334.33496634, 371.81659511, 392.96341332, 407.7698733 ,
       410.2687885 ])

freq_gumbel(df, ...) function

Function: freq_gumbel(df, col=None, return_period=[2, 5, 10, 20, 25, 50, 100], source='gumbel', show_stat=False, col_name='Gumbel', index_name='Kala Ulang')

The freq_gumbel(...) function extends calc_x_gumbel(...): it accepts a pandas.DataFrame as input and returns a pandas.DataFrame.

  • Positional arguments:
    • df: pandas.DataFrame.
  • Optional arguments:
    • col: column name, None (default). If omitted, the first column of df is used as the input data.
    • return_period: return period (years), [2, 5, 10, 20, 25, 50, 100] (default).
    • source: source of the $K$ values, 'gumbel' (default). Other available sources: Soewarno ('soewarno'), Soetopo ('soetopo'), the stats.gumbel_r.ppf function from SciPy ('scipy'), and Powell's method ('powell').
    • show_stat: print the statistical parameters. False (default).
    • col_name: name of the output column, 'Gumbel' (default).
    • index_name: name of the output index, 'Kala Ulang' (default).
In [18]:
freq_gumbel(data)
Out[18]:
Gumbel
Kala Ulang
2 277.723633
5 334.334966
10 371.816595
20 407.769873
25 419.174732
50 454.307704
100 489.181259
In [19]:
freq_gumbel(data, source='soewarno', col_name='Gumbel (Soewarno)')
Out[19]:
Gumbel (Soewarno)
Kala Ulang
2 277.724784
5 334.335100
10 371.816055
20 407.768686
25 419.173341
50 454.305681
100 489.178609
In [20]:
freq_gumbel(data, 'debit', source='soetopo', col_name='Gumbel (soetopo)', show_stat=True)
y_n = 0.5362
s_n = 1.1124
y_t = [0.36651292 1.49993999 2.25036733 2.97019525 3.19853426 3.90193866
 4.60014923]
x_mean = 286.20000
x_std = 55.56009
k = [-0.15254142  0.86636101  1.54096308  2.18805758  2.39332458  3.02565503
  3.65331646]
Out[20]:
Gumbel (soetopo)
Kala Ulang
2 277.724784
5 334.335100
10 371.816055
20 407.768686
25 419.173341
50 454.305681
100 489.178609
In [21]:
_res = []

for _s in ['gumbel', 'soewarno', 'soetopo', 'powell', 'scipy', ]:
    _res += [freq_gumbel(data, 'debit', source=_s, col_name=f'Gumbel ({_s})')]

pd.concat(_res, axis=1)
Out[21]:
Gumbel (gumbel) Gumbel (soewarno) Gumbel (soetopo) Gumbel (powell) Gumbel (scipy)
Kala Ulang
2 277.723633 277.724784 277.724784 277.072351 306.563493
5 334.334966 334.335100 334.335100 326.172444 369.536808
10 371.816595 371.816055 371.816055 358.680977 411.230622
20 407.769873 407.768686 407.768686 389.863943 451.224330
25 419.174732 419.173341 419.173341 399.755596 463.910867
50 454.307704 454.305681 454.305681 430.227094 502.992082
100 489.181259 489.178609 489.178609 460.473595 541.784727

calc_prob(k, n, ...) function

Function: calc_prob(k, n, source='gumbel')

The calc_prob(...) function computes the probability from the frequency factor $K$.

  • Positional arguments:
    • k: the $K$ value (frequency factor), obtained from $K = \frac{x - \bar{x}}{s}$
    • n: number of data points.
  • Optional arguments:
    • source: source of the $Y_n$ and $S_n$ values (for 'gumbel', 'soewarno', and 'soetopo'), 'gumbel' (default). Other available sources: Soewarno ('soewarno'), Soetopo ('soetopo'), the stats.gumbel_r.cdf function from SciPy ('scipy'), and Powell's method ('powell').

Notes:

  • Table methods ('gumbel', 'soewarno', 'soetopo')

    The probability is obtained from $P=1-\frac{1}{T}$, with $T = 1/\left(1-e^{-e^{-y}}\right)$ and $y = K S_n + Y_n$. The result is rounded to three decimal places.

  • Powell's method ('powell')

    The return period $T$ (years) for Powell's method uses the following equation: $$T=\frac{e^{e^{-\left(\pi k\right)/\sqrt{6}-\gamma}}}{e^{e^{-\left(\pi k\right)/\sqrt{6}-\gamma}}-1}$$ where $\gamma \approx 0.5772$ (np.euler_gamma) is the Euler–Mascheroni constant.

    The probability is obtained from $P=1-\frac{1}{T}$

  • SciPy method ('scipy')

    The probability is obtained with the stats.gumbel_r.cdf(...) function.
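
For the table methods, calc_prob(...) is the inverse of calc_K(...). Substituting the return period used in _calc_T into $P = 1 - 1/T$ simplifies to $P = e^{-e^{-y}}$, the standard Gumbel CDF evaluated at the reduced variate $y = K S_n + Y_n$. The sketch below checks that round trip; the $Y_n$ and $S_n$ values are hard-coded from the t_gumbel_gb row for $n = 30$, and the rounding step of the original function is left out.

```python
import numpy as np

yn, sn = 0.53622, 1.11238        # t_gumbel_gb row for n = 30
T = np.array([2.0, 10.0, 100.0])

# forward: return period -> reduced variate -> frequency factor
y_t = -np.log(-np.log((T - 1) / T))
k = (y_t - yn) / sn

# backward: frequency factor -> reduced variate -> probability
y_back = k * sn + yn
p = np.exp(-np.exp(-y_back))     # same as 1 - 1/T with T from _calc_T

print(np.allclose(p, 1 - 1 / T))
```

The recovered probabilities match $1 - 1/T$, i.e. 0.5, 0.9, and 0.99 for the three return periods used here.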

In [22]:
_k = calc_K(data.size, [1.001, 1.005, 1.01, 1.05, 1.11, 1.25, 1.33, 1.43, 1.67, 2, 2.5, 3.33, 4, 5, 10, 20, 50, 100, 200, 500, 1000])
_k
Out[22]:
array([-2.21957372, -1.98183192, -1.85688162, -1.48291416, -1.23534677,
       -0.90985544, -0.78056388, -0.64718086, -0.40052   , -0.15256215,
        0.12181718,  0.44365066,  0.63798281,  0.86635861,  1.5409728 ,
        2.18807894,  3.02569145,  3.65336416,  4.2787466 ,  5.10381998,
        5.72739088])
In [23]:
calc_prob(_k, data.size)
Out[23]:
array([0.001, 0.005, 0.01 , 0.048, 0.099, 0.2  , 0.248, 0.301, 0.401,
       0.5  , 0.6  , 0.7  , 0.75 , 0.8  , 0.9  , 0.95 , 0.98 , 0.99 ,
       0.995, 0.998, 0.999])
In [24]:
calc_prob(_k, data.size, source='scipy').round(3)
Out[24]:
array([0.   , 0.001, 0.002, 0.012, 0.032, 0.083, 0.113, 0.148, 0.225,
       0.312, 0.413, 0.526, 0.59 , 0.657, 0.807, 0.894, 0.953, 0.974,
       0.986, 0.994, 0.997])
In [25]:
calc_prob(_k, data.size, source='powell').round(3)
Out[25]:
array([0.   , 0.001, 0.002, 0.023, 0.065, 0.165, 0.217, 0.276, 0.391,
       0.505, 0.619, 0.728, 0.781, 0.831, 0.925, 0.967, 0.988, 0.995,
       0.998, 0.999, 1.   ])
In [26]:
calc_prob(_k, data.size, source='soewarno')
Out[26]:
array([0.001, 0.005, 0.01 , 0.048, 0.099, 0.2  , 0.248, 0.301, 0.401,
       0.5  , 0.6  , 0.7  , 0.75 , 0.8  , 0.9  , 0.95 , 0.98 , 0.99 ,
       0.995, 0.998, 0.999])

Changelog

- 20220323 - 1.1.0 - add the index_name="Kala Ulang" argument to freq_gumbel() for naming the index
- 20220315 - 1.0.1 - Add the `calc_prob(...)` function
- 20220310 - 1.0.0 - Initial

Copyright © 2022 Taruma Sakti Megariansyah

Source code in this notebook is licensed under the MIT License. Data in this notebook is licensed under Creative Commons Attribution 4.0 International.