SeriesResampler
===============

.. cpp:class:: pandas::SeriesResampler

   Resampling class that groups a ``Series`` into fixed-frequency time bins for per-bin aggregation.

Example
-------

.. code-block:: cpp

   #include "pd_resampler.h"  // assumed: the header named in the Location column; adjust the path
   using namespace pandas;

   // Use SeriesResampler
   SeriesResampler obj;
   // ... operations ...

Constructors
------------

.. list-table::
   :widths: 55 25 20
   :header-rows: 1

   * - Signature
     - Location
     - Example
   * - ``SeriesResampler(const Series& series, const std::string& freq, const std::string& closed = "", const std::string& label = "", const std::string& origin = "epoch", int64_t offset_nanos = 0)``
     - pd_resampler.h:456
     -

Indexing / Selection
--------------------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``Series first() const``
     - Series
     - pd_resampler.h:493
     - :ref:`View <example-seriesresampler-first-0>`
   * - ``int64_t get_period_key(int64_t epoch_ns) const``
     - int64_t
     - pd_resampler.h:525
     -
   * - ``Series last() const``
     - Series
     - pd_resampler.h:494
     - :ref:`View <example-seriesresampler-last-1>`

Missing Data
------------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``Series bfill() const``
     - Series
     - pd_resampler.h:498
     - :ref:`View <example-seriesresampler-bfill-2>`
   * - ``Series ffill(int limit = -1) const``
     - Series
     - pd_resampler.h:497
     - :ref:`View <example-seriesresampler-ffill-3>`

Statistics
----------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``Series count() const``
     - Series
     - pd_resampler.h:482
     - :ref:`View <example-seriesresampler-count-4>`
   * - ``Series max() const``
     - Series
     - pd_resampler.h:475
     - :ref:`View <example-seriesresampler-max-5>`
   * - ``Series mean() const``
     - Series
     - pd_resampler.h:473
     - :ref:`View <example-seriesresampler-mean-6>`
   * - ``Series median() const``
     - Series
     - pd_resampler.h:483
     - :ref:`View <example-seriesresampler-median-7>`
   * - ``Series min() const``
     - Series
     - pd_resampler.h:474
     - :ref:`View <example-seriesresampler-min-8>`
   * - ``Series prod() const``
     - Series
     - pd_resampler.h:484
     - :ref:`View <example-seriesresampler-prod-9>`
   * - ``Series std_(int ddof = 1) const``
     - Series
     - pd_resampler.h:476
     - :ref:`View <example-seriesresampler-std_-10>`
   * - ``Series sum() const``
     - Series
     - pd_resampler.h:472
     - :ref:`View <example-seriesresampler-sum-11>`
   * - ``Series var(int ddof = 1) const``
     - Series
     - pd_resampler.h:477
     - :ref:`View <example-seriesresampler-var-12>`

Aggregation
-----------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``Series agg(const std::string& func) const``
     - Series
     - pd_resampler.h:487
     - :ref:`View <example-seriesresampler-agg-13>`
   * - ``DataFrame agg_dict(const std::vector>>& col_funcs) const``
     - DataFrame
     - pd_resampler.h:490
     -
   * - ``Series apply( const std::function&)>& cb) const``
     - Series
     - pd_resampler.h:512
     - :ref:`View <example-seriesresampler-apply-14>`

Comparison
----------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``Series nearest() const``
     - Series
     - pd_resampler.h:500
     -

Time Series
-----------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``Series asfreq() const``
     - Series
     - pd_resampler.h:499
     - :ref:`View <example-seriesresampler-asfreq-15>`

Other Methods
-------------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``void build_groups()``
     - void
     - pd_resampler.h:524
     -
   * - ``const std::vector& group_keys() const``
     - const std::vector&
     - pd_resampler.h:520
     -
   * - ``const std::unordered_map>& groups() const``
     - const std::unordered_map>&
     - pd_resampler.h:519
     - :ref:`View <example-seriesresampler-groups-16>`
   * - ``size_t ngroups() const``
     - size_t
     - pd_resampler.h:516
     - :ref:`View <example-seriesresampler-ngroups-17>`
   * - ``int64_t period_key_to_timestamp(int64_t key) const``
     - int64_t
     - pd_resampler.h:526
     -
   * - ``const Series& series() const``
     - const Series&
     - pd_resampler.h:521
     - :ref:`View <example-seriesresampler-series-18>`
   * - ``void set_result_datetime_index(Series& result) const``
     - void
     - pd_resampler.h:529
     -

Code Examples
-------------

The following examples are extracted from the test suite. Several of them call a method of the same name on another class (for example ``BooleanArray`` or a ``DataFrame`` group-by result) rather than on ``SeriesResampler``.

.. _example-seriesresampler-first-0:

.. dropdown:: first (pd_test_1_all.cpp:11616)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 11606
      :emphasize-lines: 11

      void pd_test_groupby_first_last() {
          std::cout << "========= GroupBy first/last ====================";

          std::map<std::string, std::vector<double>> data = {
              {"category", {1.0, 1.0, 2.0, 2.0}},
              {"value", {10.0, 20.0, 30.0, 40.0}}
          };
          pandas::DataFrame df(data);

          auto first_result = df.groupby("category").first();
          auto last_result = df.groupby("category").last();

          // First for group 1: 10, group 2: 30
          // Last for group 1: 20, group 2: 40
          double first1 = std::stod(first_result["value"].get_value_str(0));
          double first2 = std::stod(first_result["value"].get_value_str(1));

          bool passed = ((std::abs(first1 - 10.0) < 0.001 && std::abs(first2 - 30.0) < 0.001) ||
                         (std::abs(first1 - 30.0) < 0.001 && std::abs(first2 - 10.0) < 0.001));
          if (!passed) {

.. _example-seriesresampler-last-1:

.. dropdown:: last (pd_test_1_all.cpp:11617)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 11607
      :emphasize-lines: 11

      void pd_test_groupby_first_last() {
          std::cout << "========= GroupBy first/last ====================";

          std::map<std::string, std::vector<double>> data = {
              {"category", {1.0, 1.0, 2.0, 2.0}},
              {"value", {10.0, 20.0, 30.0, 40.0}}
          };
          pandas::DataFrame df(data);

          auto first_result = df.groupby("category").first();
          auto last_result = df.groupby("category").last();

          // First for group 1: 10, group 2: 30
          // Last for group 1: 20, group 2: 40
          double first1 = std::stod(first_result["value"].get_value_str(0));
          double first2 = std::stod(first_result["value"].get_value_str(1));

          bool passed = ((std::abs(first1 - 10.0) < 0.001 && std::abs(first2 - 30.0) < 0.001) ||
                         (std::abs(first1 - 30.0) < 0.001 && std::abs(first2 - 10.0) < 0.001));
          if (!passed) {
              std::cout << " [FAIL] : in pd_test_groupby_first_last() : first values incorrect" << std::endl;

.. _example-seriesresampler-bfill-2:

.. dropdown:: bfill (pd_test_1_all.cpp:23603)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 23593
      :emphasize-lines: 11

          std::cout << "====================================== [OK] pd_test_equals test suite ========================== " << std::endl;
          return 0;
      }

      } // namespace dataframe_tests

      // ------------------- pd_test_equals.cpp (end) -----------------------------

      // ------------------- pd_test_ffill_bfill.cpp (start) -----------------------------
      // dataframe_tests/pd_test_ffill_bfill.cpp
      // Test file for DataFrame.ffill() and DataFrame.bfill() methods

      #include
      #include
      #include
      #include
      #include
      #include "../pandas/pd_dataframe.h"

      // CRITICAL: No using namespace directives

.. _example-seriesresampler-ffill-3:

.. dropdown:: ffill (pd_test_1_all.cpp:23603)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 23593
      :emphasize-lines: 11

          std::cout << "====================================== [OK] pd_test_equals test suite ========================== " << std::endl;
          return 0;
      }

      } // namespace dataframe_tests

      // ------------------- pd_test_equals.cpp (end) -----------------------------

      // ------------------- pd_test_ffill_bfill.cpp (start) -----------------------------
      // dataframe_tests/pd_test_ffill_bfill.cpp
      // Test file for DataFrame.ffill() and DataFrame.bfill() methods

      #include
      #include
      #include
      #include
      #include
      #include "../pandas/pd_dataframe.h"

      // CRITICAL: No using namespace directives

.. _example-seriesresampler-count-4:

.. dropdown:: count (pd_test_1_all.cpp:66)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 56
      :emphasize-lines: 11

          if (arr.is_na(0)) {
              std::cout << " [FAIL] : in pd_test_boolean_array_na_handling() : is_na(0) should be false" << std::endl;
              throw std::runtime_error("pd_test_boolean_array_na_handling failed: is_na(0) should be false");
          }

          if (!arr.has_na()) {
              std::cout << " [FAIL] : in pd_test_boolean_array_na_handling() : has_na() should be true" << std::endl;
              throw std::runtime_error("pd_test_boolean_array_na_handling failed: has_na() should be true");
          }

          if (arr.count() != 2) {
              std::cout << " [FAIL] : in pd_test_boolean_array_na_handling() : count() should be 2" << std::endl;
              throw std::runtime_error("pd_test_boolean_array_na_handling failed: count() should be 2");
          }

          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_boolean_array_kleene_and() {
          std::cout << "========= BooleanArray: Kleene AND ======================= ";

.. _example-seriesresampler-max-5:

.. dropdown:: max (pd_test_1_all.cpp:771)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 761
      :emphasize-lines: 11

          pandas::CategoricalArray arr = pandas::CategoricalArray::from_codes(codes, cats, true); // ordered

          // Test min
          std::optional min_val = arr.min();
          if (!min_val.has_value() || *min_val != "low") {
              std::cout << " [FAIL] : in pd_test_categorical_array_ordered_operations() : min != 'low'" << std::endl;
              throw std::runtime_error("pd_test_categorical_array_ordered_operations failed: min != 'low'");
          }

          // Test max
          std::optional max_val = arr.max();
          if (!max_val.has_value() || *max_val != "high") {
              std::cout << " [FAIL] : in pd_test_categorical_array_ordered_operations() : max != 'high'" << std::endl;
              throw std::runtime_error("pd_test_categorical_array_ordered_operations failed: max != 'high'");
          }

          // Test unordered throws for min/max
          pandas::CategoricalArray unordered = arr.as_unordered();
          bool threw = false;
          try {
              unordered.min();

.. _example-seriesresampler-mean-6:

.. dropdown:: mean (pd_test_1_all.cpp:282)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 272
      :emphasize-lines: 11

              std::optional(true),
              std::optional(true)
          });

          auto s = arr.sum();
          if (!s.has_value() || s.value() != 3) {
              std::cout << " [FAIL] : in pd_test_boolean_array_reductions() : sum should be 3" << std::endl;
              throw std::runtime_error("pd_test_boolean_array_reductions failed: sum");
          }

          auto m = arr.mean();
          if (!m.has_value() || std::abs(m.value() - 0.75) > 0.001) {
              std::cout << " [FAIL] : in pd_test_boolean_array_reductions() : mean should be 0.75" << std::endl;
              throw std::runtime_error("pd_test_boolean_array_reductions failed: mean");
          }

          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_boolean_array_dtype() {
          std::cout << "========= BooleanArray: dtype ======================= ";

.. _example-seriesresampler-median-7:

.. dropdown:: median (pd_test_1_all.cpp:20910)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 20900
      :emphasize-lines: 11

              throw std::runtime_error("pd_test_expanding_var failed: expanding var values incorrect");
          }

          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_expanding_median() {
          std::cout << "========= Expanding median ======================";

          pandas::Series s({1.0, 2.0, 3.0, 4.0, 5.0});
          auto result = s.expanding().median();

          // Expanding median: 1, 1.5, 2, 2.5, 3
          bool passed = std::abs(result[0] - 1.0) < 0.001 &&
                        std::abs(result[1] - 1.5) < 0.001 &&
                        std::abs(result[2] - 2.0) < 0.001 &&
                        std::abs(result[3] - 2.5) < 0.001 &&
                        std::abs(result[4] - 3.0) < 0.001;
          if (!passed) {
              std::cout << " [FAIL] : in pd_test_expanding_median() : expanding median values incorrect" << std::endl;
              throw std::runtime_error("pd_test_expanding_median failed: expanding median values incorrect");

.. _example-seriesresampler-min-8:

.. dropdown:: min (pd_test_1_all.cpp:764)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 754
      :emphasize-lines: 11

      }

      void pd_test_categorical_array_ordered_operations() {
          std::cout << "========= CategoricalArray: ordered operations (min/max) ======================= ";

          std::vector cats = {"low", "medium", "high"};
          std::vector codes = {0, 2, 1, 0, -1}; // low, high, medium, low, NA
          pandas::CategoricalArray arr = pandas::CategoricalArray::from_codes(codes, cats, true); // ordered

          // Test min
          std::optional min_val = arr.min();
          if (!min_val.has_value() || *min_val != "low") {
              std::cout << " [FAIL] : in pd_test_categorical_array_ordered_operations() : min != 'low'" << std::endl;
              throw std::runtime_error("pd_test_categorical_array_ordered_operations failed: min != 'low'");
          }

          // Test max
          std::optional max_val = arr.max();
          if (!max_val.has_value() || *max_val != "high") {
              std::cout << " [FAIL] : in pd_test_categorical_array_ordered_operations() : max != 'high'" << std::endl;
              throw std::runtime_error("pd_test_categorical_array_ordered_operations failed: max != 'high'");

.. _example-seriesresampler-prod-9:

.. dropdown:: prod (pd_test_1_all.cpp:26082)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 26072
      :emphasize-lines: 11

          std::cout << "====================================== [OK] pd_test_pivot_table test suite ========================== " << std::endl;
          return 0;
      }

      } // namespace dataframe_tests

      // ------------------- pd_test_pivot_table.cpp (end) -----------------------------

      // ------------------- pd_test_prod.cpp (start) -----------------------------
      // dataframe_tests/pd_test_prod.cpp
      // Tests for DataFrame.prod() and DataFrame.prod_cols() methods

      #include
      #include
      #include
      #include
      #include "../pandas/pd_dataframe.h"

      // CRITICAL: No using namespace directives

      namespace dataframe_tests {

.. _example-seriesresampler-std_-10:

.. dropdown:: std_ (pd_test_1_all.cpp:20752)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 20742
      :emphasize-lines: 11

              throw std::runtime_error("pd_test_rolling_min_periods failed: with min_periods=1, idx 1 should be 3.0");
          }

          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_rolling_std() {
          std::cout << "========= Rolling std ===========================";

          pandas::Series s({1.0, 2.0, 3.0, 4.0, 5.0});
          auto result = s.rolling(3).std_();

          // std([1,2,3]) = 1.0 (ddof=1)
          // std([2,3,4]) = 1.0
          // std([3,4,5]) = 1.0
          bool passed = std::abs(result[2] - 1.0) < 0.001;

          if (!passed) {
              std::cout << " [FAIL] : in pd_test_rolling_std() : rolling std should be 1.0" << std::endl;
              throw std::runtime_error("pd_test_rolling_std failed: rolling std should be 1.0");
          }

.. _example-seriesresampler-sum-11:

.. dropdown:: sum (pd_test_1_all.cpp:276)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 266
      :emphasize-lines: 11

          }

          // Test sum/mean
          pandas::BooleanArray arr({
              std::optional(true),
              std::optional(false),
              std::optional(true),
              std::optional(true)
          });

          auto s = arr.sum();
          if (!s.has_value() || s.value() != 3) {
              std::cout << " [FAIL] : in pd_test_boolean_array_reductions() : sum should be 3" << std::endl;
              throw std::runtime_error("pd_test_boolean_array_reductions failed: sum");
          }

          auto m = arr.mean();
          if (!m.has_value() || std::abs(m.value() - 0.75) > 0.001) {
              std::cout << " [FAIL] : in pd_test_boolean_array_reductions() : mean should be 0.75" << std::endl;
              throw std::runtime_error("pd_test_boolean_array_reductions failed: mean");
          }

.. _example-seriesresampler-var-12:

.. dropdown:: var (pd_test_1_all.cpp:20890)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 20880
      :emphasize-lines: 11

              throw std::runtime_error("pd_test_expanding_std failed: expanding std values incorrect");
          }

          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_expanding_var() {
          std::cout << "========= Expanding var =========================";

          pandas::Series s({1.0, 2.0, 3.0, 4.0, 5.0});
          auto result = s.expanding().var();

          // Expanding var (ddof=1): NaN, 0.5, 1.0, 1.6667, 2.5
          bool passed = std::isnan(result[0]) &&
                        std::abs(result[1] - 0.5) < 0.001 &&
                        std::abs(result[2] - 1.0) < 0.001 &&
                        std::abs(result[3] - 1.6667) < 0.001 &&
                        std::abs(result[4] - 2.5) < 0.001;
          if (!passed) {
              std::cout << " [FAIL] : in pd_test_expanding_var() : expanding var values incorrect" << std::endl;
              throw std::runtime_error("pd_test_expanding_var failed: expanding var values incorrect");

.. _example-seriesresampler-agg-13:

.. dropdown:: agg (pd_test_1_all.cpp:11100)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 11090
      :emphasize-lines: 11

      }

      void pd_test_func_apply_series_agg() {
          std::cout << "========= Series agg ==================================";

          pandas::Series s({1.0, 2.0, 3.0, 4.0, 5.0}, "values");

          bool passed = true;

          // Test string-based aggregation
          auto sum_result = s.agg("sum");
          if (!sum_result.has_value() || !approx_equal(sum_result.value(), 15.0)) {
              passed = false;
              std::cout << " [FAIL] : in pd_test_func_apply_series_agg() : sum failed" << std::endl;
              throw std::runtime_error("pd_test_func_apply_series_agg failed: sum failed");
          }

          auto mean_result = s.agg("mean");
          if (!mean_result.has_value() || !approx_equal(mean_result.value(), 3.0)) {
              passed = false;
              std::cout << " [FAIL] : in pd_test_func_apply_series_agg() : mean failed" << std::endl;

.. _example-seriesresampler-apply-14:

.. dropdown:: apply (pd_test_1_all.cpp:11244)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 11234
      :emphasize-lines: 11

      void pd_test_func_apply_dataframe_apply_axis0() {
          std::cout << "========= DataFrame apply axis=0 ======================";

          std::map<std::string, std::vector<double>> data = {
              {"A", {1.0, 2.0, 3.0}},
              {"B", {4.0, 5.0, 6.0}}
          };
          pandas::DataFrame df(data);

          // apply axis=0 applies function to each column
          auto result = df.apply([](const std::vector<double>& col) {
              return std::accumulate(col.begin(), col.end(), 0.0);
          }, 0);

          bool passed = true;

          // Plan F·dtype: axis=0 reduce now returns a single "result" column
          // with the original column names ("A", "B") as the row index.
          // Sum of A: 1+2+3=6, Sum of B: 4+5+6=15
          const auto& result_col = result["result"];
          double sum_a = std::stod(result_col.get_value_str(0));

.. _example-seriesresampler-asfreq-15:

.. dropdown:: asfreq (pd_test_1_all.cpp:2869)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 2859
      :emphasize-lines: 11

          std::cout << "========= PeriodArray: asfreq ======================= ";

          // Monthly to quarterly
          pandas::PeriodArray arr_m(std::vector{
              "2024-01",
              "2024-04",
              "2024-07",
              "NaT"
          }, "M");

          auto arr_q = arr_m.asfreq("Q");
          if (arr_q.size() != 4) {
              std::cout << " [FAIL] : asfreq size should be 4" << std::endl;
              throw std::runtime_error("pd_test_period_array_asfreq failed: size");
          }
          if (arr_q.freqstr() != "Q") {
              std::cout << " [FAIL] : asfreq freqstr should be 'Q'" << std::endl;
              throw std::runtime_error("pd_test_period_array_asfreq failed: freqstr");
          }

          // Check NaT is preserved

.. _example-seriesresampler-groups-16:

.. dropdown:: groups (pd_test_2_all.cpp:20864)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 20854
      :emphasize-lines: 11

      // =====================================================================
      // Per-group expanding tests
      // =====================================================================

      void test_series_groupby_expanding_sum() {
          std::cout << " -- test_series_groupby_expanding_sum --" << std::endl;

          // Two groups: A=[1,2,3], B=[10,20]
          std::vector vals = {1.0, 10.0, 2.0, 20.0, 3.0};
          pandas::Series data(vals);
          pandas::Series groups({"A", "B", "A", "B", "A"});
          auto sgb = data.groupby(groups);

          pandas::SeriesGroupByExpandingWindow ew(sgb, 1);
          auto result = ew.sum();

          check(result.size() == 5, "size_5");
          // A group: expanding sum = 1, 3, 6
          // B group: expanding sum = 10, 30
          // Original order: [A:1, B:10, A:3, B:30, A:6]
          check(approx_eq(result[0], 1.0), "A_exp_sum_0");

.. _example-seriesresampler-ngroups-17:

.. dropdown:: ngroups (pd_test_1_all.cpp:11497)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 11487
      :emphasize-lines: 11

          // Create DataFrame with category column
          std::map<std::string, std::vector<double>> data = {
              {"category", {1.0, 1.0, 2.0, 2.0, 2.0}},
              {"value", {10.0, 20.0, 30.0, 40.0, 50.0}}
          };
          pandas::DataFrame df(data);

          // Test groupby
          auto grouped = df.groupby("category");

          bool passed = grouped.ngroups() == 2;
          if (!passed) {
              std::cout << " [FAIL] : in pd_test_groupby_basic() : ngroups should be 2" << std::endl;
              throw std::runtime_error("pd_test_groupby_basic failed: ngroups should be 2");
          }

          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_groupby_multiple_columns() {
          std::cout << "========= GroupBy multiple columns ==============";

.. _example-seriesresampler-series-18:

.. dropdown:: series (pd_test_2_all.cpp:2307)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 2297
      :emphasize-lines: 11

          std::vector index = {"a", "b", "c", "d", "e"};

          std::map<std::string, std::vector<double>> data1;
          data1["col1"] = {1.0, 2.0, 3.0, 4.0, 5.0};
          data1["col2"] = {2.0, 4.0, 6.0, 8.0, 10.0}; // Perfectly correlated with col1

          pandas::DataFrame df1(data1,
                                std::make_unique>(index));

          // Series with same index and values that correlate with df columns
          pandas::Series series({1.0, 2.0, 3.0, 4.0, 5.0});
          series.set_index(pandas::Index(index));

          pandas::Series result = df1.corrwith(series);

          bool passed = true;
          // col1 should have correlation 1.0 with series
          if (!approx_equal(result[0], 1.0)) {
              std::cout << "\n [FAIL] : Expected correlation 1.0 for col1, got " << result[0] << std::endl;
              passed = false;
          }
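None of the extracted examples exercises the period-key methods listed above (``get_period_key``, ``period_key_to_timestamp``, ``groups``), so the idea behind them may be easier to see in a standalone sketch. The code below is not the library's implementation: ``bin_key``, ``bin_start``, ``BinMean`` and ``resample_mean`` are hypothetical names, and it assumes fixed-width bins measured in nanoseconds since the epoch, closed on the left and labelled by their left edge. Unlike a full resampler, it simply omits empty bins.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical helper (not the library's get_period_key): index of the
// fixed-width bin that contains epoch_ns, using floor division.
inline std::int64_t bin_key(std::int64_t epoch_ns, std::int64_t width_ns) {
    assert(width_ns > 0);
    std::int64_t q = epoch_ns / width_ns;
    // Integer division truncates toward zero; step down when the
    // remainder is negative so pre-epoch timestamps floor correctly.
    if (epoch_ns % width_ns < 0) {
        --q;
    }
    return q;
}

// Hypothetical inverse: left edge (ns since epoch) of the bin with this key.
inline std::int64_t bin_start(std::int64_t key, std::int64_t width_ns) {
    return key * width_ns;
}

struct BinMean {
    std::int64_t start_ns;  // left edge of the bin
    double mean;            // mean of the values that fell into the bin
};

// Group row positions by bin key, then reduce each group to its mean.
// Bins that receive no rows are omitted from the result.
std::vector<BinMean> resample_mean(const std::vector<std::int64_t>& ts,
                                   const std::vector<double>& values,
                                   std::int64_t width_ns) {
    assert(ts.size() == values.size());

    // std::map keeps keys sorted, so the output is in time order.
    std::map<std::int64_t, std::vector<std::size_t>> groups;
    for (std::size_t i = 0; i < ts.size(); ++i) {
        groups[bin_key(ts[i], width_ns)].push_back(i);
    }

    std::vector<BinMean> out;
    out.reserve(groups.size());
    for (const auto& [key, rows] : groups) {
        double sum = 0.0;
        for (std::size_t r : rows) {
            sum += values[r];
        }
        out.push_back({bin_start(key, width_ns),
                       sum / static_cast<double>(rows.size())});
    }
    return out;
}
```

With a 2-second bin width, values 1, 3, 10, 20 at 0 s, 1 s, 2 s and 3 s fall into two bins starting at 0 s and 2 s, with means 2 and 15. Keeping the key computation separate from the reduction means the same grouping could feed a sum, count, first or last without being rebuilt.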