Rolling
=======

.. cpp:class:: pandas::Rolling

   Window operation class for rolling/expanding calculations.

Example
-------

.. code-block:: cpp

   #include
   using namespace pandas;

   // Use Rolling
   Rolling obj;
   // ... operations ...

Constructors
------------

.. list-table::
   :widths: 55 25 20
   :header-rows: 1

   * - Signature
     - Location
     - Example
   * - ``Rolling(const Series& series, size_t window, size_t min_periods = 1, bool center = false)``
     - pd_rolling.h:62
     -

Indexing / Selection
--------------------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``const Series& get_series() const``
     - const Series&
     - pd_rolling.h:93
     - :ref:`View <example-rolling-get_series-0>`
   * - ``size_t get_window() const``
     - size_t
     - pd_rolling.h:94
     -
   * - ``std::pair get_window_bounds(size_t i) const``
     - std::pair
     - pd_rolling.h:424
     -
   * - ``std::vector get_window_values(size_t start, size_t end) const``
     - std::vector
     - pd_rolling.h:500
     -

Statistics
----------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``Series count() const``
     - Series
     - pd_rolling.h:124
     - :ref:`View <example-rolling-count-1>`
   * - ``Series kurt() const``
     - Series
     - pd_rolling.h:204
     - :ref:`View <example-rolling-kurt-2>`
   * - ``Series max() const``
     - Series
     - pd_rolling.h:118
     - :ref:`View <example-rolling-max-3>`
   * - ``Series mean() const``
     - Series
     - pd_rolling.h:106
     - :ref:`View <example-rolling-mean-4>`
   * - ``Series median() const``
     - Series
     - pd_rolling.h:158
     - :ref:`View <example-rolling-median-5>`
   * - ``Series min() const``
     - Series
     - pd_rolling.h:112
     - :ref:`View <example-rolling-min-6>`
   * - ``Series quantile(double q) const``
     - Series
     - pd_rolling.h:231
     - :ref:`View <example-rolling-quantile-7>`
   * - ``Series sem(int ddof = 1) const``
     - Series
     - pd_rolling.h:252
     - :ref:`View <example-rolling-sem-8>`
   * - ``Series skew() const``
     - Series
     - pd_rolling.h:178
     - :ref:`View <example-rolling-skew-9>`
   * - ``Series std_(int ddof = 1) const``
     - Series
     - pd_rolling.h:130
     - :ref:`View <example-rolling-std_-10>`
   * - ``Series sum() const``
     - Series
     - pd_rolling.h:100
     - :ref:`View <example-rolling-sum-11>`
   * - ``Series var(int ddof = 1) const``
     - Series
     - pd_rolling.h:144
     - :ref:`View <example-rolling-var-12>`

Aggregation
-----------

..
.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``Series apply(Func&& func) const``
     - Series
     - pd_rolling.h:415
     - :ref:`View <example-rolling-apply-13>`
   * - ``Series apply_rolling(Func&& func) const``
     - Series
     - pd_rolling.h:524
     -

Sorting
-------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``Series rank() const``
     - Series
     - pd_rolling.h:274
     - :ref:`View <example-rolling-rank-14>`

Type Checking
-------------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``bool is_custom_bounds() const``
     - bool
     - pd_rolling.h:91
     -
   * - ``bool is_time_based() const``
     - bool
     - pd_rolling.h:83
     - :ref:`View <example-rolling-is_time_based-15>`

Other Methods
-------------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``Series corr(const Series& other) const``
     - Series
     - pd_rolling.h:357
     - :ref:`View <example-rolling-corr-16>`
   * - ``Series cov(const Series& other, int ddof = 1) const``
     - Series
     - pd_rolling.h:291
     - :ref:`View <example-rolling-cov-17>`
   * - ``propagate_source_index(result_series)``
     -
     - pd_rolling.h:267
     -
   * - ``propagate_source_index(result_series)``
     -
     - pd_rolling.h:349
     -
   * - ``propagate_source_index(result_series)``
     -
     - pd_rolling.h:406
     -
   * - ``propagate_source_index(result_series)``
     -
     - pd_rolling.h:540
     -
   * - ``void propagate_source_index(Series& result) const``
     - void
     - pd_rolling.h:547
     -
   * - ``void set_custom_bounds(std::vector start, std::vector end)``
     - void
     - pd_rolling.h:85
     -
   * - ``void set_time_window(std::vector timestamps, int64_t window_seconds, int closed = 0)``
     - void
     - pd_rolling.h:75
     -

Code Examples
-------------

The following examples are extracted from the test suite.

.. _example-rolling-get_series-0:

.. dropdown:: get_series (pd_test_5_all.cpp:12970)
   :class-title: example-dropdown

   ..
   .. code-block:: cpp
      :linenos:
      :lineno-start: 12960
      :emphasize-lines: 11

              pandas_tests::check(!threw, "query_bool_and_numeric.no_throw", local_fail);
              if (!threw) {
                  pandas_tests::check(result.nrows() == 1, "query_bool_and_numeric.nrows == 1 (got " + std::to_string(result.nrows()) + ")", local_fail);
              }
          }

          // === xs_level tests (Error 2) ===
          // Note: xs_level() doesn't exist yet - test will verify it after implementation

          // === get_series + unstack tests (Error 1) ===
          // Note: get_series() doesn't exist yet - test will verify it after implementation

          if (local_fail > 0) {
              std::cout << " [FAIL] : in f_test_anal_i_query_bool_unstack() : " << local_fail << " checks failed" << std::endl;
              throw std::runtime_error("f_test_anal_i_query_bool_unstack failed");
          }

          std::cout << " -> tests passed" << std::endl;
      }

      // --- cpp_f_test_zanal_a_column_width.cpp ---

.. _example-rolling-count-1:

.. dropdown:: count (pd_test_1_all.cpp:66)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 56
      :emphasize-lines: 11

          if (arr.is_na(0)) {
              std::cout << " [FAIL] : in pd_test_boolean_array_na_handling() : is_na(0) should be false" << std::endl;
              throw std::runtime_error("pd_test_boolean_array_na_handling failed: is_na(0) should be false");
          }

          if (!arr.has_na()) {
              std::cout << " [FAIL] : in pd_test_boolean_array_na_handling() : has_na() should be true" << std::endl;
              throw std::runtime_error("pd_test_boolean_array_na_handling failed: has_na() should be true");
          }

          if (arr.count() != 2) {
              std::cout << " [FAIL] : in pd_test_boolean_array_na_handling() : count() should be 2" << std::endl;
              throw std::runtime_error("pd_test_boolean_array_na_handling failed: count() should be 2");
          }

          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_boolean_array_kleene_and() {
          std::cout << "========= BooleanArray: Kleene AND ======================= ";

.. _example-rolling-kurt-2:

.. dropdown:: kurt (pd_test_1_all.cpp:4599)
   :class-title: example-dropdown

   ..
   .. code-block:: cpp
      :linenos:
      :lineno-start: 4589
      :emphasize-lines: 11

          std::cout << "========= Series skew/kurt ======================";
          pandas::Series s({1.0, 2.0, 2.0, 3.0, 9.0});

          auto skew_val = s.skew();
          bool passed = skew_val.has_value() && *skew_val > 0; // Should be right-skewed
          if (!passed) {
              std::cout << " [FAIL] : in pd_test_aggregation_series_skew_kurt() : skew should be positive" << std::endl;
              throw std::runtime_error("pd_test_aggregation_series_skew_kurt failed: skew should be positive");
          }

          auto kurt_val = s.kurt();
          passed = kurt_val.has_value();
          if (!passed) {
              std::cout << " [FAIL] : in pd_test_aggregation_series_skew_kurt() : kurt should have value" << std::endl;
              throw std::runtime_error("pd_test_aggregation_series_skew_kurt failed: kurt should have value");
          }

          // Test kurtosis alias
          auto kurt_alias = s.kurtosis();
          passed = kurt_alias.has_value() && std::abs(*kurt_alias - *kurt_val) < 0.0001;
          if (!passed) {

.. _example-rolling-max-3:

.. dropdown:: max (pd_test_1_all.cpp:771)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 761
      :emphasize-lines: 11

          pandas::CategoricalArray arr = pandas::CategoricalArray::from_codes(codes, cats, true); // ordered

          // Test min
          std::optional min_val = arr.min();
          if (!min_val.has_value() || *min_val != "low") {
              std::cout << " [FAIL] : in pd_test_categorical_array_ordered_operations() : min != 'low'" << std::endl;
              throw std::runtime_error("pd_test_categorical_array_ordered_operations failed: min != 'low'");
          }

          // Test max
          std::optional max_val = arr.max();
          if (!max_val.has_value() || *max_val != "high") {
              std::cout << " [FAIL] : in pd_test_categorical_array_ordered_operations() : max != 'high'" << std::endl;
              throw std::runtime_error("pd_test_categorical_array_ordered_operations failed: max != 'high'");
          }

          // Test unordered throws for min/max
          pandas::CategoricalArray unordered = arr.as_unordered();
          bool threw = false;
          try {
              unordered.min();

.. _example-rolling-mean-4:

..
.. dropdown:: mean (pd_test_1_all.cpp:282)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 272
      :emphasize-lines: 11

              std::optional(true),
              std::optional(true)
          });

          auto s = arr.sum();
          if (!s.has_value() || s.value() != 3) {
              std::cout << " [FAIL] : in pd_test_boolean_array_reductions() : sum should be 3" << std::endl;
              throw std::runtime_error("pd_test_boolean_array_reductions failed: sum");
          }

          auto m = arr.mean();
          if (!m.has_value() || std::abs(m.value() - 0.75) > 0.001) {
              std::cout << " [FAIL] : in pd_test_boolean_array_reductions() : mean should be 0.75" << std::endl;
              throw std::runtime_error("pd_test_boolean_array_reductions failed: mean");
          }

          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_boolean_array_dtype() {
          std::cout << "========= BooleanArray: dtype ======================= ";

.. _example-rolling-median-5:

.. dropdown:: median (pd_test_1_all.cpp:20910)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 20900
      :emphasize-lines: 11

              throw std::runtime_error("pd_test_expanding_var failed: expanding var values incorrect");
          }

          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_expanding_median() {
          std::cout << "========= Expanding median ======================";

          pandas::Series s({1.0, 2.0, 3.0, 4.0, 5.0});
          auto result = s.expanding().median();

          // Expanding median: 1, 1.5, 2, 2.5, 3
          bool passed = std::abs(result[0] - 1.0) < 0.001 &&
                        std::abs(result[1] - 1.5) < 0.001 &&
                        std::abs(result[2] - 2.0) < 0.001 &&
                        std::abs(result[3] - 2.5) < 0.001 &&
                        std::abs(result[4] - 3.0) < 0.001;
          if (!passed) {
              std::cout << " [FAIL] : in pd_test_expanding_median() : expanding median values incorrect" << std::endl;
              throw std::runtime_error("pd_test_expanding_median failed: expanding median values incorrect");

.. _example-rolling-min-6:

.. dropdown:: min (pd_test_1_all.cpp:764)
   :class-title: example-dropdown

   ..
   .. code-block:: cpp
      :linenos:
      :lineno-start: 754
      :emphasize-lines: 11

      }

      void pd_test_categorical_array_ordered_operations() {
          std::cout << "========= CategoricalArray: ordered operations (min/max) ======================= ";

          std::vector cats = {"low", "medium", "high"};
          std::vector codes = {0, 2, 1, 0, -1}; // low, high, medium, low, NA
          pandas::CategoricalArray arr = pandas::CategoricalArray::from_codes(codes, cats, true); // ordered

          // Test min
          std::optional min_val = arr.min();
          if (!min_val.has_value() || *min_val != "low") {
              std::cout << " [FAIL] : in pd_test_categorical_array_ordered_operations() : min != 'low'" << std::endl;
              throw std::runtime_error("pd_test_categorical_array_ordered_operations failed: min != 'low'");
          }

          // Test max
          std::optional max_val = arr.max();
          if (!max_val.has_value() || *max_val != "high") {
              std::cout << " [FAIL] : in pd_test_categorical_array_ordered_operations() : max != 'high'" << std::endl;
              throw std::runtime_error("pd_test_categorical_array_ordered_operations failed: max != 'high'");

.. _example-rolling-quantile-7:

.. dropdown:: quantile (pd_test_1_all.cpp:4540)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 4530
      :emphasize-lines: 11

              throw std::runtime_error("pd_test_aggregation_series_sem failed: sem value incorrect");
          }

          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_aggregation_series_quantile() {
          std::cout << "========= Series quantile =======================";
          pandas::Series s({1.0, 2.0, 3.0, 4.0, 5.0});

          auto q50 = s.quantile(0.5);
          bool passed = q50.has_value() && std::abs(*q50 - 3.0) < 0.001;
          if (!passed) {
              std::cout << " [FAIL] : in pd_test_aggregation_series_quantile() : quantile(0.5) should be 3.0" << std::endl;
              throw std::runtime_error("pd_test_aggregation_series_quantile failed: quantile(0.5) should be 3.0");
          }

          // Test q=0 and q=1
          auto q0 = s.quantile(0.0);
          passed = q0.has_value() && std::abs(*q0 - 1.0) < 0.001;
          if (!passed) {

.. _example-rolling-sem-8:

..
.. dropdown:: sem (pd_test_1_all.cpp:4525)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 4515
      :emphasize-lines: 11

      #include "../pandas/pd_dataframe.h"
      #include "../pandas/pd_series.h"

      namespace dataframe_tests {
      namespace dataframe_tests_aggregation {

      void pd_test_aggregation_series_sem() {
          std::cout << "========= Series sem ============================";
          pandas::Series s({1.0, 2.0, 3.0, 4.0, 5.0});

          auto sem_val = s.sem();
          // std(ddof=1) = sqrt(2.5), sem = sqrt(2.5)/sqrt(5) ≈ 0.707
          bool passed = sem_val.has_value() && std::abs(*sem_val - 0.707) < 0.01;
          if (!passed) {
              std::cout << " [FAIL] : in pd_test_aggregation_series_sem() : sem value incorrect" << std::endl;
              throw std::runtime_error("pd_test_aggregation_series_sem failed: sem value incorrect");
          }

          std::cout << " -> tests passed" << std::endl;
      }

.. _example-rolling-skew-9:

.. dropdown:: skew (pd_test_1_all.cpp:4592)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 4582
      :emphasize-lines: 11

              throw std::runtime_error("pd_test_aggregation_series_mode failed: multi-mode should return 2 values");
          }

          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_aggregation_series_skew_kurt() {
          std::cout << "========= Series skew/kurt ======================";
          pandas::Series s({1.0, 2.0, 2.0, 3.0, 9.0});

          auto skew_val = s.skew();
          bool passed = skew_val.has_value() && *skew_val > 0; // Should be right-skewed
          if (!passed) {
              std::cout << " [FAIL] : in pd_test_aggregation_series_skew_kurt() : skew should be positive" << std::endl;
              throw std::runtime_error("pd_test_aggregation_series_skew_kurt failed: skew should be positive");
          }

          auto kurt_val = s.kurt();
          passed = kurt_val.has_value();
          if (!passed) {
              std::cout << " [FAIL] : in pd_test_aggregation_series_skew_kurt() : kurt should have value" << std::endl;

.. _example-rolling-std_-10:

.. dropdown:: std_ (pd_test_1_all.cpp:20752)
   :class-title: example-dropdown

   ..
   .. code-block:: cpp
      :linenos:
      :lineno-start: 20742
      :emphasize-lines: 11

              throw std::runtime_error("pd_test_rolling_min_periods failed: with min_periods=1, idx 1 should be 3.0");
          }

          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_rolling_std() {
          std::cout << "========= Rolling std ===========================";

          pandas::Series s({1.0, 2.0, 3.0, 4.0, 5.0});
          auto result = s.rolling(3).std_();

          // std([1,2,3]) = 1.0 (ddof=1)
          // std([2,3,4]) = 1.0
          // std([3,4,5]) = 1.0
          bool passed = std::abs(result[2] - 1.0) < 0.001;
          if (!passed) {
              std::cout << " [FAIL] : in pd_test_rolling_std() : rolling std should be 1.0" << std::endl;
              throw std::runtime_error("pd_test_rolling_std failed: rolling std should be 1.0");
          }

.. _example-rolling-sum-11:

.. dropdown:: sum (pd_test_1_all.cpp:276)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 266
      :emphasize-lines: 11

          }

          // Test sum/mean
          pandas::BooleanArray arr({
              std::optional(true),
              std::optional(false),
              std::optional(true),
              std::optional(true)
          });

          auto s = arr.sum();
          if (!s.has_value() || s.value() != 3) {
              std::cout << " [FAIL] : in pd_test_boolean_array_reductions() : sum should be 3" << std::endl;
              throw std::runtime_error("pd_test_boolean_array_reductions failed: sum");
          }

          auto m = arr.mean();
          if (!m.has_value() || std::abs(m.value() - 0.75) > 0.001) {
              std::cout << " [FAIL] : in pd_test_boolean_array_reductions() : mean should be 0.75" << std::endl;
              throw std::runtime_error("pd_test_boolean_array_reductions failed: mean");
          }

.. _example-rolling-var-12:

.. dropdown:: var (pd_test_1_all.cpp:20890)
   :class-title: example-dropdown

   ..
   .. code-block:: cpp
      :linenos:
      :lineno-start: 20880
      :emphasize-lines: 11

              throw std::runtime_error("pd_test_expanding_std failed: expanding std values incorrect");
          }

          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_expanding_var() {
          std::cout << "========= Expanding var =========================";

          pandas::Series s({1.0, 2.0, 3.0, 4.0, 5.0});
          auto result = s.expanding().var();

          // Expanding var (ddof=1): NaN, 0.5, 1.0, 1.6667, 2.5
          bool passed = std::isnan(result[0]) &&
                        std::abs(result[1] - 0.5) < 0.001 &&
                        std::abs(result[2] - 1.0) < 0.001 &&
                        std::abs(result[3] - 1.6667) < 0.001 &&
                        std::abs(result[4] - 2.5) < 0.001;
          if (!passed) {
              std::cout << " [FAIL] : in pd_test_expanding_var() : expanding var values incorrect" << std::endl;
              throw std::runtime_error("pd_test_expanding_var failed: expanding var values incorrect");

.. _example-rolling-apply-13:

.. dropdown:: apply (pd_test_1_all.cpp:11244)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 11234
      :emphasize-lines: 11

      void pd_test_func_apply_dataframe_apply_axis0() {
          std::cout << "========= DataFrame apply axis=0 ======================";

          std::map> data = {
              {"A", {1.0, 2.0, 3.0}},
              {"B", {4.0, 5.0, 6.0}}
          };
          pandas::DataFrame df(data);

          // apply axis=0 applies function to each column
          auto result = df.apply([](const std::vector& col) {
              return std::accumulate(col.begin(), col.end(), 0.0);
          }, 0);

          bool passed = true;

          // Plan F·dtype: axis=0 reduce now returns a single "result" column
          // with the original column names ("A", "B") as the row index.
          // Sum of A: 1+2+3=6, Sum of B: 4+5+6=15
          const auto& result_col = result["result"];
          double sum_a = std::stod(result_col.get_value_str(0));

.. _example-rolling-rank-14:

.. dropdown:: rank (pd_test_1_all.cpp:6451)
   :class-title: example-dropdown

   ..
   .. code-block:: cpp
      :linenos:
      :lineno-start: 6441
      :emphasize-lines: 11

      // =====================================================================
      // Test: Rank
      // =====================================================================
      void pd_test_dataframe_rank() {
          std::cout << "========= rank =============================";

          // Test Series rank with default method (average)
          {
              std::vector data = {3.0, 1.0, 4.0, 1.0, 5.0};
              pandas::Series s(data, "test");
              auto ranked = s.rank();

              // Values: 3, 1, 4, 1, 5 -> Sorted: 1, 1, 3, 4, 5
              // Ranks (average): 1.5, 1.5, 3, 4, 5
              // Original positions: 3->3, 1->1.5, 4->4, 1->1.5, 5->5
              double r0 = std::stod(ranked.get_value_str(0)); // 3.0 -> rank 3
              double r1 = std::stod(ranked.get_value_str(1)); // 1.0 -> rank 1.5

              if (std::abs(r0 - 3.0) > 1e-10) {
                  std::cout << " [FAIL] : in pd_test_dataframe_rank() : value 3.0 should have rank 3, got " << r0 << std::endl;
                  throw std::runtime_error("pd_test_dataframe_rank failed: value 3.0 rank");

.. _example-rolling-is_time_based-15:

.. dropdown:: is_time_based (pd_test_3_all.cpp:24693)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 24683
      :emphasize-lines: 11

              numpy::datetime64("2023-01-01"),
              numpy::datetime64("2023-01-02"),
              numpy::datetime64("2023-01-03"),
              numpy::datetime64("2023-01-04"),
              numpy::datetime64("2023-01-05")
          };
          auto dti = std::make_unique(dates);
          pandas::Series s({1.0, 2.0, 3.0, 4.0, 5.0});
          s.set_index(std::move(dti));
          auto rolling = pandas::setup_time_rolling(s, "3D", "right", 1);
          if (!rolling.is_time_based())
              throw std::runtime_error("expected time-based rolling");
          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_rolling_time_no_dti() {
          std::cout << "========= setup_time_rolling no DTI =================";
          // Series without DatetimeIndex should throw
          pandas::Series s({1.0, 2.0, 3.0});
          bool threw = false;
          try {

.. _example-rolling-corr-16:

.. dropdown:: corr (pd_test_1_all.cpp:4655)
   :class-title: example-dropdown

   ..
   .. code-block:: cpp
      :linenos:
      :lineno-start: 4645
      :emphasize-lines: 11

      }

      void pd_test_aggregation_dataframe_corr() {
          std::cout << "========= DataFrame corr ========================";
          std::map> data;
          data["A"] = {1.0, 2.0, 3.0, 4.0, 5.0};
          data["B"] = {2.0, 4.0, 6.0, 8.0, 10.0}; // Perfect correlation

          pandas::DataFrame df(data);

          auto corr_df = df.corr();

          // Check dimensions
          bool passed = corr_df.nrows() == 2 && corr_df.ncols() == 2;
          if (!passed) {
              std::cout << " [FAIL] : in pd_test_aggregation_dataframe_corr() : corr should be 2x2" << std::endl;
              throw std::runtime_error("pd_test_aggregation_dataframe_corr failed: corr should be 2x2");
          }

          // Diagonal should be 1.0
          std::string aa = corr_df["A"].get_value_str(0);

.. _example-rolling-cov-17:

.. dropdown:: cov (pd_test_1_all.cpp:4690)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 4680
      :emphasize-lines: 11

          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_aggregation_dataframe_cov() {
          std::cout << "========= DataFrame cov =========================";
          std::map> data;
          data["A"] = {1.0, 2.0, 3.0};

          pandas::DataFrame df(data);

          auto cov_df = df.cov();

          // Check dimensions
          bool passed = cov_df.nrows() == 1 && cov_df.ncols() == 1;
          if (!passed) {
              std::cout << " [FAIL] : in pd_test_aggregation_dataframe_cov() : cov should be 1x1" << std::endl;
              throw std::runtime_error("pd_test_aggregation_dataframe_cov failed: cov should be 1x1");
          }

          // Var(A) = 1.0 with ddof=1
          std::string aa = cov_df["A"].get_value_str(0);
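
As a rough illustration of how the ``window`` and ``min_periods`` constructor arguments are commonly interpreted, the following standalone sketch computes a trailing rolling mean over a plain ``std::vector<double>``. It is not the code in ``pd_rolling.h``: the helper name ``rolling_mean_sketch`` is hypothetical, NaN handling is assumed to follow the pandas convention (NaN inputs are skipped, and a window with fewer than ``min_periods`` valid values yields NaN), and the ``center`` option is left out.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper, not part of pd_rolling.h.
// Position i covers the indices [i + 1 - window, i], clipped at 0.
std::vector<double> rolling_mean_sketch(const std::vector<double>& values,
                                        std::size_t window,
                                        std::size_t min_periods = 1) {
    const double nan = std::numeric_limits<double>::quiet_NaN();
    std::vector<double> out(values.size(), nan);
    if (window == 0) {
        return out;  // assumption: an empty window produces only NaN
    }
    for (std::size_t i = 0; i < values.size(); ++i) {
        // First index of the trailing window that ends at i.
        const std::size_t start = (i + 1 >= window) ? i + 1 - window : 0;
        double sum = 0.0;
        std::size_t count = 0;  // number of non-NaN observations in the window
        for (std::size_t j = start; j <= i; ++j) {
            if (!std::isnan(values[j])) {
                sum += values[j];
                ++count;
            }
        }
        // Emit a value only when enough valid observations are present.
        if (count > 0 && count >= min_periods) {
            out[i] = sum / static_cast<double>(count);
        }
    }
    return out;
}
```

Under these assumptions, ``rolling_mean_sketch({1.0, 2.0, 3.0, 4.0, 5.0}, 3, 3)`` gives NaN for the first two positions and 2.0, 3.0, 4.0 afterwards. With ``min_periods = 1`` the first two positions would instead be 1.0 and 1.5. The nested loop is O(n * window); a running sum would be cheaper but harder to follow in a short sketch.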